ModErn Text Analysis
META Enumerates Textual Applications
metadata_writer.h
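#ifndef META_INDEX_METADATA_WRITER_H_
#define META_INDEX_METADATA_WRITER_H_

#include <cstdint>
#include <fstream>
#include <mutex>
#include <string>
#include <vector>

#include "meta/config.h"
#include "meta/corpus/document.h"
#include "meta/corpus/metadata.h"
#include "meta/util/disk_vector.h"

namespace meta
{
namespace index
{

/**
 * Writes document metadata into the packed format for the index.
 */
class metadata_writer
{
  public:
    /**
     * Constructs the writer.
     */
    metadata_writer(const std::string& prefix, uint64_t num_docs,
                    corpus::metadata::schema_type schema);

    /**
     * Writes a document's metadata to the database and index.
     */
    void write(doc_id d_id, uint64_t length, uint64_t num_unique,
               const std::vector<corpus::metadata::field>& mdata);

  private:
    /// a lock for thread safety
    std::mutex lock_;

    /// the index into the database file
    util::disk_vector<uint64_t> seek_pos_;

    /// the current byte position in the database
    uint64_t byte_pos_;

    /// the output stream for the database file
    std::ofstream db_file_;

    /// the schema of the metadata we are writing
    corpus::metadata::schema_type schema_;
};
}
}
#endif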
util::disk_vector< uint64_t > seek_pos_
the index into the database file
Definition: metadata_writer.h:55
uint64_t byte_pos_
the current byte position in the database
Definition: metadata_writer.h:58
metadata_writer(const std::string &prefix, uint64_t num_docs, corpus::metadata::schema_type schema)
Constructs the writer.
Definition: metadata_writer.cpp:14
std::mutex lock_
a lock for thread safety
Definition: metadata_writer.h:52
meta
The ModErn Text Analysis toolkit is a suite of applications for natural language processing, classification, information retrieval, data mining, and other forms of text processing.
Definition: analyzer.h:25
std::ofstream db_file_
the output stream for the database file
Definition: metadata_writer.h:61
corpus::metadata::schema_type schema_
the schema of the metadata we are writing
Definition: metadata_writer.h:64
meta::index::metadata_writer
Writes document metadata into the packed format for the index.
Definition: metadata_writer.h:28
void write(doc_id d_id, uint64_t length, uint64_t num_unique, const std::vector< corpus::metadata::field > &mdata)
Writes a document's metadata to the database and index.
Definition: metadata_writer.cpp:41
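The members listed above (a lock, a seek-position index, a running byte position, and an output stream) suggest a common append-and-index pattern. The following is a minimal, self-contained sketch of that pattern only. It is not MeTA's implementation: `toy_metadata_writer`, `toy_record`, `read_record`, `flush`, and `offset_of` are hypothetical names, the seek index is an in-memory `std::vector` rather than a `util::disk_vector`, the schema and per-field metadata are omitted, and values are written as fixed-width raw integers instead of the packed format the real class uses.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical record layout for this sketch: two fixed-width integers.
struct toy_record
{
    uint64_t length;
    uint64_t num_unique;
};

// Hypothetical stand-in for illustration; not MeTA's metadata_writer.
class toy_metadata_writer
{
  public:
    toy_metadata_writer(const std::string& path, uint64_t num_docs)
        : seek_pos_(num_docs, 0),
          byte_pos_{0},
          db_file_{path, std::ios::binary | std::ios::trunc}
    {
    }

    // Appends one record and remembers the byte offset where it starts.
    // The lock lets several threads call write() in any document order.
    void write(uint64_t d_id, uint64_t length, uint64_t num_unique)
    {
        std::lock_guard<std::mutex> guard{lock_};
        assert(d_id < seek_pos_.size());
        seek_pos_[d_id] = byte_pos_;
        put(length);
        put(num_unique);
    }

    // Pushes buffered bytes to disk so that readers can see them.
    void flush()
    {
        std::lock_guard<std::mutex> guard{lock_};
        db_file_.flush();
    }

    // Returns the byte offset at which the record for d_id begins.
    uint64_t offset_of(uint64_t d_id) const
    {
        std::lock_guard<std::mutex> guard{lock_};
        return seek_pos_.at(d_id);
    }

  private:
    // Writes one raw integer and advances the running byte position.
    void put(uint64_t value)
    {
        db_file_.write(reinterpret_cast<const char*>(&value), sizeof(value));
        byte_pos_ += sizeof(value);
    }

    mutable std::mutex lock_;
    std::vector<uint64_t> seek_pos_; // in-memory stand-in for the seek index
    uint64_t byte_pos_;
    std::ofstream db_file_;
};

// Reads one record back from the file, starting at the given byte offset.
toy_record read_record(const std::string& path, uint64_t offset)
{
    std::ifstream in{path, std::ios::binary};
    in.seekg(static_cast<std::streamoff>(offset), std::ios::beg);
    toy_record rec{0, 0};
    in.read(reinterpret_cast<char*>(&rec.length), sizeof(rec.length));
    in.read(reinterpret_cast<char*>(&rec.num_unique), sizeof(rec.num_unique));
    return rec;
}
```

Because each record's starting offset is stored under its document id, records may be appended in whatever order threads finish, and a reader can still jump directly to any document's record with a single seek.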