ModErn Text Analysis
META Enumerates Textual Applications
meta::index::disk_index::disk_index_impl Class Reference

The implementation of a disk_index.

#include <disk_index_impl.h>

Public Member Functions

void initialize_metadata ()
 Loads the metadata file.
void load_labels (uint64_t num_docs=0)
 Loads the doc labels.
void load_term_id_mapping ()
 Loads the term_id mapping.
void load_label_id_mapping ()
 Loads the label_id mapping.
void save_label_id_mapping ()
 Saves the label_id mapping.
void set_label (doc_id id, const class_label &label)
 Sets the label for a document.
uint64_t total_unique_terms () const
label_id doc_label_id (doc_id id) const
std::vector< class_label > class_labels () const

Public Attributes

friend disk_index
 Declares the disk_index interface as a friend of this implementation class.

Static Public Attributes

static const std::vector< const char * > files
 Filenames used in the index.

Private Member Functions

label_id get_label_id (const class_label &lbl)

Private Attributes

std::string index_name_
 the location of this index
util::optional< util::disk_vector< label_id > > labels_
 Maps which class a document belongs to (if any).
util::optional< metadata_file > metadata_
 Stores additional metadata for each document.
util::optional< vocabulary_map > term_id_mapping_
 Maps string terms to term_ids.
util::invertible_map< class_label, label_id > label_ids_
 Assigns an integer to each class label (used for liblinear mappings)
std::mutex mutex_
 Mutex for thread-safe operations.

Detailed Description

The implementation of a disk_index.

Member Function Documentation

§ load_labels()

void meta::index::disk_index::disk_index_impl::load_labels ( uint64_t  num_docs = 0)

Loads the doc labels.

Parameters
 num_docs  The number of documents stored in the index

§ set_label()

void meta::index::disk_index::disk_index_impl::set_label ( doc_id  id,
const class_label &  label )

Sets the label for a document.

Parameters
 id     The document id
 label  The new label

§ total_unique_terms()

uint64_t meta::index::disk_index::disk_index_impl::total_unique_terms ( ) const
Returns
 the total number of unique terms in the index.

§ doc_label_id()

label_id meta::index::disk_index::disk_index_impl::doc_label_id ( doc_id  id) const
Returns
 the label id for a given document.
Parameters
 id  The document id

§ class_labels()

std::vector< class_label > meta::index::disk_index::disk_index_impl::class_labels ( ) const
Returns
 the possible class labels for this index

§ get_label_id()

label_id meta::index::disk_index::disk_index_impl::get_label_id ( const class_label &  lbl)
Parameters
 lbl  the string class label to find the id for
Returns
 the label_id of a class_label, creating a new one if necessary
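
The "look up, or create if missing" behaviour described for get_label_id() can be illustrated with a small standalone sketch. This is not MeTA's implementation: label_table, get_or_create and label_for are hypothetical names, two std::unordered_map objects stand in for util::invertible_map, and the choice to number ids from 1 is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for MeTA's class_label and label_id types.
using class_label = std::string;
using label_id = std::uint32_t;

// Minimal two-way label table; a stand-in for util::invertible_map.
class label_table
{
  public:
    // Returns the id for lbl, assigning the next unused id on first sight.
    label_id get_or_create(const class_label& lbl)
    {
        std::lock_guard<std::mutex> lock{mutex_};
        auto it = to_id_.find(lbl);
        if (it != to_id_.end())
            return it->second;
        // Assumption: ids start at 1 and grow by one per new label.
        auto id = static_cast<label_id>(to_id_.size() + 1);
        to_id_.emplace(lbl, id);
        to_label_.emplace(id, lbl);
        return id;
    }

    // Reverse lookup; the id must have come from get_or_create().
    const class_label& label_for(label_id id) const
    {
        std::lock_guard<std::mutex> lock{mutex_};
        auto it = to_label_.find(id);
        assert(it != to_label_.end());
        return it->second;
    }

    // Number of distinct labels seen so far.
    std::size_t size() const
    {
        std::lock_guard<std::mutex> lock{mutex_};
        return to_id_.size();
    }

  private:
    mutable std::mutex mutex_;
    std::unordered_map<class_label, label_id> to_id_;
    std::unordered_map<label_id, class_label> to_label_;
};
```

Calling get_or_create() twice with the same label returns the same id, so the table only grows when an unseen label arrives.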

Member Data Documentation

§ files

const std::vector< const char * > meta::index::disk_index::disk_index_impl::files
Initial value:
= {"/docs.labels", "/labelids.mapping", "/postings.index",
"/postings.index_index", "/termids.mapping", "/termids.mapping.inverse",
"/metadata.db", "/metadata.index"}

Filenames used in the index.
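
Each filename begins with a slash, which suggests the names are meant to be appended to the index location (index_name_). The sketch below shows that reading; index_files and index_file_paths are hypothetical names introduced for illustration, and the concatenation rule is an assumption rather than something this page states.

```cpp
#include <string>
#include <vector>

// Filenames copied from the initial value of `files` shown above.
const std::vector<const char*> index_files
    = {"/docs.labels", "/labelids.mapping", "/postings.index",
       "/postings.index_index", "/termids.mapping", "/termids.mapping.inverse",
       "/metadata.db", "/metadata.index"};

// Hypothetical helper: prefixes every filename with the index directory.
// Assumes index_name has no trailing slash, since each name supplies one.
std::vector<std::string> index_file_paths(const std::string& index_name)
{
    std::vector<std::string> paths;
    paths.reserve(index_files.size());
    for (const char* file : index_files)
        paths.push_back(index_name + file);
    return paths;
}
```

A caller could walk the returned paths to check that every expected file is present before treating a directory as a valid index.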

§ labels_

util::optional<util::disk_vector<label_id> > meta::index::disk_index::disk_index_impl::labels_

Maps which class a document belongs to (if any).

Each index corresponds to a doc_id (uint64_t).
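
As a rough illustration of this layout, the sketch below keeps one label slot per document and indexes it by doc_id. It is an in-memory approximation only: std::optional and std::vector stand in for util::optional and util::disk_vector, label_store and its members are hypothetical names, and both the 32-bit label_id and the use of 0 for "no label" are assumptions.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

using doc_id = std::uint64_t;   // the text above gives doc_id as uint64_t
using label_id = std::uint32_t; // assumption: the real width may differ

// In-memory stand-in for util::optional<util::disk_vector<label_id>>.
struct label_store
{
    std::optional<std::vector<label_id>> labels;

    // Allocates one slot per document; 0 marks "no label" in this sketch.
    void load(std::uint64_t num_docs)
    {
        labels.emplace(num_docs, label_id{0});
    }

    // Stores the label for a document; throws if labels are not loaded
    // or the doc_id is out of range.
    void set(doc_id id, label_id lbl)
    {
        if (!labels)
            throw std::runtime_error{"labels not loaded"};
        labels->at(id) = lbl;
    }

    // Reads the label for a document, with the same checks as set().
    label_id get(doc_id id) const
    {
        if (!labels)
            throw std::runtime_error{"labels not loaded"};
        return labels->at(id);
    }
};
```

Keeping the container inside an optional lets the store distinguish "labels not loaded yet" from "loaded, but this document has no label".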

The documentation for this class was generated from the following files: