ModErn Text Analysis
META Enumerates Textual Applications
meta::index::metadata_file Class Reference

Used for reading document-level metadata for an index.

#include <metadata_file.h>

Public Member Functions

 metadata_file (const std::string &prefix)
 Opens the metadata file stored at prefix.
 
corpus::metadata get (doc_id d_id) const
 Obtains metadata for a document.
 
uint64_t size () const
 Returns the number of documents in this database.
 

Private Attributes

corpus::metadata::schema_type schema_
 The schema for this file.
 
util::disk_vector< uint64_t > index_
 The seek positions for every document in this file.
 
io::mmap_file md_db_
 The mapped file for reading metadata from.
 

Detailed Description

Used for reading document-level metadata for an index.

The following two-file format is used:

<FieldCount> is the number of user-supplied metadata fields, each of which must be present for every document. The grammar adds two to this count because the document length (integer) and the number of unique terms (integer) are always stored as metadata. The metadata names "length", "unique-terms", and "path" are reserved; any further fields are those supplied by the user.
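
To illustrate why a separate file of seek positions is useful, the following is a minimal sketch, not the library's implementation. The type toy_metadata_store and its members are hypothetical; it keeps both "files" in memory and stores each record as a NUL-terminated string, which is an assumption made only for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for the two files: `records` plays the
// role of the metadata database, `offsets` the role of the seek index.
struct toy_metadata_store
{
    std::string records;           // concatenated NUL-terminated records
    std::vector<uint64_t> offsets; // offsets[d] = start of record d

    // Appends a record and remembers the position where it starts.
    void add(const std::string& record)
    {
        offsets.push_back(records.size());
        records += record;
        records.push_back('\0');
    }

    // One seek position per document, so the count is the index length.
    uint64_t size() const
    {
        return offsets.size();
    }

    // Jumps directly to the record for document d; no scan is needed.
    // Throws std::out_of_range if d is not a valid document id.
    std::string get(uint64_t d) const
    {
        return std::string(records.c_str() + offsets.at(d));
    }
};
```

With fixed-width seek positions, looking up any document costs one index read and one jump into the record data, regardless of how long the preceding records are.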

Member Function Documentation

§ get()

corpus::metadata meta::index::metadata_file::get ( doc_id  d_id) const

Obtains metadata for a document.

The object returned is a proxy and will look up metadata upon first request. If metadata is requested multiple times from the same metadata object, it will not be re-parsed from the file.
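
The look-up-once behavior described above can be sketched as follows. This is an illustrative assumption about how such a proxy might work, not the actual corpus::metadata class; the name lazy_record and its loader callback are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical proxy: stores how to load a record, but defers the work
// until the value is first requested.
class lazy_record
{
  public:
    explicit lazy_record(std::function<std::string()> loader)
        : loader_(std::move(loader)), loaded_(false)
    {
    }

    // Returns the parsed value, invoking the loader at most once.
    const std::string& value() const
    {
        if (!loaded_)
        {
            cache_ = loader_();
            loaded_ = true;
        }
        return cache_;
    }

  private:
    std::function<std::string()> loader_;
    mutable bool loaded_;       // whether cache_ holds a parsed value
    mutable std::string cache_; // result of the first (and only) parse
};
```

In this sketch, constructing a lazy_record costs nothing beyond storing the callback, and repeated calls to value() on the same object reuse the cached result. A fresh proxy object would parse again on its first request.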

Parameters
d_id  The document id to look up metadata for
Returns
the metadata for the document

§ size()

uint64_t meta::index::metadata_file::size ( ) const
Returns
the number of documents in this database
