ModErn Text Analysis
META Enumerates Textual Applications
meta::index::postings_inverter< Index > Class Template Reference

An interface for writing and merging inverted chunks of postings_data for a disk_index.

#include <postings_inverter.h>

Classes

class  producer
 The object that is fed postings_data by the index.
 

Public Types

using index_pdata_type = typename Index::index_pdata_type
 
using primary_key_type = typename index_pdata_type::primary_key_type
 
using secondary_key_type = typename index_pdata_type::secondary_key_type
 
using chunk_t = chunk< primary_key_type, secondary_key_type >
 
using postings_buffer_type = postings_buffer< primary_key_type, secondary_key_type >
 

Public Member Functions

 postings_inverter (const std::string &prefix, unsigned writers=8)
 Constructs a postings_inverter that writes to the given prefix.
 
producer make_producer (uint64_t ram_budget)
 Creates a producer for this postings_inverter.
 
uint32_t size () const
 
uint64_t final_size () const
 
void merge_chunks ()
 Merge the remaining on-disk chunks.
 
uint64_t unique_primary_keys () const
 

Private Member Functions

template<class Allocator >
void write_chunk (std::vector< postings_buffer_type, Allocator > &pdata)
 

Private Attributes

std::string prefix_
 The prefix for all chunks to be written.
 
std::atomic< uint32_t > chunk_num_ {0}
 The current chunk number.
 
std::priority_queue< chunk_t > chunks_
 Queue of chunks on disk that need to be merged.
 
std::mutex mutables_
 Mutex used for protecting the chunk queue.
 
parallel::semaphore sem_
 Semaphore used for limiting the number of threads writing to disk.
 
util::optional< uint64_t > unique_primary_keys_
 Number of unique primary keys encountered while merging.
 

Detailed Description

template<class Index>
class meta::index::postings_inverter< Index >

An interface for writing and merging inverted chunks of postings_data for a disk_index.

Constructor & Destructor Documentation

§ postings_inverter()

template<class Index >
meta::index::postings_inverter< Index >::postings_inverter ( const std::string &  prefix,
unsigned  writers = 8 
)

Constructs a postings_inverter that writes to the given prefix.

Parameters
prefix   The prefix for all chunks to be written
writers  The maximum number of allowed writing threads
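
The writer limit can be pictured as a counting semaphore that hands out a fixed number of slots. The following is a minimal sketch using only the standard library; `writer_limit` and its member functions are hypothetical names for illustration and are not the interface of `parallel::semaphore`.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a writer limit (not parallel::semaphore):
// a counting semaphore built from a mutex and a condition variable.
class writer_limit
{
  public:
    explicit writer_limit(unsigned writers) : free_{writers}
    {
        assert(writers > 0);
    }

    // Block until a writer slot is free, then take it.
    void acquire()
    {
        std::unique_lock<std::mutex> lock{mutex_};
        cond_.wait(lock, [this] { return free_ > 0; });
        --free_;
    }

    // Take a slot only if one is free right now; never blocks.
    bool try_acquire()
    {
        std::lock_guard<std::mutex> lock{mutex_};
        if (free_ == 0)
            return false;
        --free_;
        return true;
    }

    // Give a slot back and wake one waiting writer, if any.
    void release()
    {
        {
            std::lock_guard<std::mutex> lock{mutex_};
            ++free_;
        }
        cond_.notify_one();
    }

  private:
    std::mutex mutex_;
    std::condition_variable cond_;
    unsigned free_;
};
```

Under this sketch, a thread would call `acquire()` before writing a chunk and `release()` afterwards, so that at most `writers` threads are inside the disk-writing section at any moment.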

Member Function Documentation

§ make_producer()

template<class Index >
auto meta::index::postings_inverter< Index >::make_producer ( uint64_t  ram_budget)

Creates a producer for this postings_inverter.

Producers are designed to be thread-local buffers of chunks that write to disk when their buffer is full.

Parameters
ram_budget  The estimated allowed size of this thread-local buffer
Returns
a new producer
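
The buffering behaviour described above (accumulate postings locally, write out once the budget is reached) might look roughly like the sketch below. It is not the MeTA `producer` class: `sketch_producer`, its `add` function, the flush callback, and the byte-counting rule are all assumptions made for illustration, as is the choice to flush any remainder on destruction.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified producer: buffers (term, doc id) postings and
// hands the key-sorted buffer to a flush callback once an estimated byte
// count reaches the RAM budget.
class sketch_producer
{
  public:
    using postings_map = std::map<std::string, std::vector<uint64_t>>;
    using flush_fn = std::function<void(const postings_map&)>;

    sketch_producer(uint64_t ram_budget, flush_fn flush)
        : ram_budget_{ram_budget}, flush_{std::move(flush)}
    {
        assert(flush_); // a flush target is required
    }

    // Assumption: whatever is still buffered is flushed on destruction.
    ~sketch_producer()
    {
        flush_now();
    }

    void add(const std::string& term, uint64_t doc_id)
    {
        auto it = buffer_.find(term);
        if (it == buffer_.end())
        {
            it = buffer_.emplace(term, std::vector<uint64_t>{}).first;
            bytes_used_ += term.size(); // rough estimate: key bytes
        }
        it->second.push_back(doc_id);
        bytes_used_ += sizeof(uint64_t); // rough estimate: one id

        if (bytes_used_ >= ram_budget_)
            flush_now();
    }

  private:
    void flush_now()
    {
        if (buffer_.empty())
            return;
        flush_(buffer_);
        buffer_.clear();
        bytes_used_ = 0;
    }

    postings_map buffer_;
    uint64_t bytes_used_ = 0;
    uint64_t ram_budget_;
    flush_fn flush_;
};
```

Because each thread would own its own producer in this model, `add` needs no locking; only the flush callback has to coordinate with other writers.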

§ size()

template<class Index >
uint32_t meta::index::postings_inverter< Index >::size ( ) const
Returns
the number of chunks this handler has written to disk.

§ final_size()

template<class Index >
uint64_t meta::index::postings_inverter< Index >::final_size ( ) const
Returns
the size, in bytes, of the last chunk written to disk after merging.

§ unique_primary_keys()

template<class Index >
uint64_t meta::index::postings_inverter< Index >::unique_primary_keys ( ) const
Returns
the number of unique primary keys seen while merging chunks.
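
To make the relationship between chunk merging and the unique primary key count concrete, here is a small in-memory sketch. It assumes chunks are sorted by primary key and are merged pairwise, smallest first, via a priority queue; the names `posting`, `mem_chunk`, `merge_two`, and `merge_all` are hypothetical, and the real on-disk merge may proceed differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// One posting list: a primary key and its sorted secondary keys.
using posting = std::pair<std::string, std::vector<uint64_t>>;
// Hypothetical in-memory stand-in for an on-disk chunk, sorted by key.
using mem_chunk = std::vector<posting>;

// Merge two key-sorted chunks, combining lists that share a primary key.
mem_chunk merge_two(const mem_chunk& a, const mem_chunk& b)
{
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    mem_chunk out;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size())
    {
        if (a[i].first < b[j].first)
            out.push_back(a[i++]);
        else if (b[j].first < a[i].first)
            out.push_back(b[j++]);
        else
        {
            posting merged;
            merged.first = a[i].first;
            std::merge(a[i].second.begin(), a[i].second.end(),
                       b[j].second.begin(), b[j].second.end(),
                       std::back_inserter(merged.second));
            out.push_back(std::move(merged));
            ++i;
            ++j;
        }
    }
    for (; i < a.size(); ++i)
        out.push_back(a[i]);
    for (; j < b.size(); ++j)
        out.push_back(b[j]);
    return out;
}

// Orders the queue so that the smallest chunk is on top.
struct smaller_first
{
    bool operator()(const mem_chunk& x, const mem_chunk& y) const
    {
        return x.size() > y.size();
    }
};

// Repeatedly merge the two smallest chunks until one remains.
mem_chunk merge_all(std::vector<mem_chunk> chunks)
{
    std::priority_queue<mem_chunk, std::vector<mem_chunk>, smaller_first>
        queue{smaller_first{}, std::move(chunks)};
    if (queue.empty())
        return {};
    while (queue.size() > 1)
    {
        mem_chunk first = queue.top();
        queue.pop();
        mem_chunk second = queue.top();
        queue.pop();
        queue.push(merge_two(first, second));
    }
    return queue.top();
}
```

In this model, the number of entries in the chunk returned by `merge_all` plays the role of the unique primary key count, since every primary key appears exactly once after merging.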

§ write_chunk()

template<class Index >
template<class Allocator >
void meta::index::postings_inverter< Index >::write_chunk ( std::vector< postings_buffer_type, Allocator > &  pdata)
private
Parameters
pdata  The collection of postings_data objects to combine into a chunk
