ModErn Text Analysis
META Enumerates Textual Applications
meta::topics::lda_model Class Reference (abstract)

An LDA topic model base class.

#include <lda_model.h>

Inheritance diagram for meta::topics::lda_model:
meta::topics::lda_cvb, meta::topics::lda_gibbs, meta::topics::lda_scvb, meta::topics::parallel_lda_gibbs

Public Member Functions

 lda_model (std::shared_ptr< index::forward_index > idx, uint64_t num_topics)
 Constructs an lda_model over the given set of documents and with a fixed number of topics.
 
virtual ~lda_model ()=default
 Destructor.
 
virtual void run (uint64_t num_iters, double convergence)=0
 Runs the model for a given number of iterations, or until a convergence criterion is met.
 
void save_doc_topic_distributions (const std::string &filename) const
 Saves the topic proportions \(\theta_d\) for each document to the given file.
 
void save_topic_term_distributions (const std::string &filename) const
 Saves the term distributions \(\phi_j\) for each topic to the given file.
 
void save (const std::string &prefix) const
 Saves the current model to a set of files beginning with prefix: prefix.phi, prefix.theta, and prefix.terms.
 
virtual double compute_term_topic_probability (term_id term, topic_id topic) const =0
 
virtual double compute_doc_topic_probability (doc_id doc, topic_id topic) const =0
 
uint64_t num_topics () const
 

Protected Member Functions

lda_model & operator= (const lda_model &)=delete
 lda_models cannot be copy assigned.
 
 lda_model (const lda_model &)=delete
 lda_models cannot be copy constructed.
 

Protected Attributes

std::shared_ptr< index::forward_index > idx_
 The index containing the documents for the model.
 
size_t num_topics_
 The number of topics.
 
size_t num_words_
 The total number of unique words.
 

Detailed Description

An LDA topic model base class.

Required config parameters (for use with the ./lda executable):

inference = "inference-method" # gibbs, pargibbs, cvb, scvb
max-iters = 1000
alpha = 1.0
beta = 1.0
topics = 4
model-prefix = "prefix"

Optional config parameters: none.

Constructor & Destructor Documentation

meta::topics::lda_model::lda_model ( std::shared_ptr< index::forward_index >  idx,
uint64_t  num_topics 
)

Constructs an lda_model over the given set of documents and with a fixed number of topics.

Parameters
idx: The index containing the documents to use for the model
num_topics: The number of topics to find

virtual meta::topics::lda_model::~lda_model ( )
virtual, default

Destructor.

Made virtual to allow for deletion through pointer to base.
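
The effect of the virtual destructor can be illustrated with a minimal sketch. The types model_base and counting_model and the function destroy_through_base below are hypothetical stand-ins invented for this example; they are not MeTA classes. The sketch only shows that destroying a derived object through a pointer to its base runs the derived destructor when the base destructor is virtual.

```cpp
#include <memory>

// Hypothetical stand-ins for illustration; these are not the MeTA classes.
struct model_base
{
    // Virtual, so deleting through a model_base pointer is well defined.
    virtual ~model_base() = default;
    virtual void run() = 0;
};

struct counting_model : model_base
{
    explicit counting_model(int& destroyed) : destroyed_(destroyed) {}
    ~counting_model() override { ++destroyed_; }
    void run() override {}
    int& destroyed_;
};

// Destroys a derived object through a pointer to the base class and
// reports how many derived destructors ran.
int destroy_through_base()
{
    int destroyed = 0;
    {
        std::unique_ptr<model_base> m = std::make_unique<counting_model>(destroyed);
        m->run();
    } // m is destroyed here; ~counting_model() runs via the virtual destructor
    return destroyed;
}
```

Without the virtual destructor in the base, deleting the derived object through the base pointer would be undefined behavior.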

Member Function Documentation

virtual void meta::topics::lda_model::run ( uint64_t  num_iters,
double  convergence 
)
pure virtual

Runs the model for a given number of iterations, or until a convergence criterion is met.

Parameters
num_iters: The maximum allowed number of iterations
convergence: The convergence criterion (its meaning differs between subclass models)

Implemented in meta::topics::lda_cvb, meta::topics::lda_gibbs, and meta::topics::lda_scvb.
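
As a rough illustration of the "maximum iterations or convergence" contract, the sketch below stops when the absolute change in some objective falls below the threshold. This particular criterion is an assumption made for the example; each subclass defines its own meaning for convergence. The helper run_until_converged and its step callback are hypothetical and not part of MeTA.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <functional>

// Hypothetical helper, not part of MeTA. `step` performs one iteration and
// returns the current value of some objective (for example a log likelihood).
// The loop stops after `num_iters` iterations, or as soon as the absolute
// change between two consecutive values drops below `convergence`.
// Returns the number of iterations actually performed.
std::uint64_t run_until_converged(std::uint64_t num_iters, double convergence,
                                  const std::function<double()>& step)
{
    assert(convergence >= 0.0);
    double previous = 0.0;
    for (std::uint64_t iter = 1; iter <= num_iters; ++iter)
    {
        const double current = step();
        // The first iteration has no previous value to compare against.
        if (iter > 1 && std::abs(current - previous) < convergence)
            return iter;
        previous = current;
    }
    return num_iters;
}
```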

void meta::topics::lda_model::save_doc_topic_distributions ( const std::string &  filename) const

Saves the topic proportions \(\theta_d\) for each document to the given file.

Saves the distributions in a simple "human readable" plain-text format.

Parameters
filename: The file to save \(\theta\) to

void meta::topics::lda_model::save_topic_term_distributions ( const std::string &  filename) const

Saves the term distributions \(\phi_j\) for each topic to the given file.

Saves the distributions in a simple "human readable" plain-text format.

Parameters
filename: The file to save \(\phi\) to

void meta::topics::lda_model::save ( const std::string &  prefix) const

Saves the current model to a set of files beginning with prefix: prefix.phi, prefix.theta, and prefix.terms.

Parameters
prefix: The prefix for all generated files over this model

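
The naming scheme described above can be written out as a small sketch. The helper model_file_names is hypothetical and only lists the three file names the description mentions; it does not write any files and is not a MeTA function.

```cpp
#include <array>
#include <string>

// Illustrative helper (not a MeTA function): lists the three file names
// that the description of save(prefix) mentions.
std::array<std::string, 3> model_file_names(const std::string& prefix)
{
    return {{prefix + ".phi", prefix + ".theta", prefix + ".terms"}};
}
```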
virtual double meta::topics::lda_model::compute_term_topic_probability ( term_id  term,
topic_id  topic 
) const
pure virtual
Returns
the probability that the given term appears in the given topic
Parameters
term: The term we are concerned with
topic: The topic we are concerned with

Implemented in meta::topics::lda_gibbs, meta::topics::lda_cvb, and meta::topics::lda_scvb.
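
Because this function is pure virtual, each subclass supplies its own estimate. As an assumption for illustration only, the sketch below uses the smoothed count ratio commonly used with collapsed Gibbs sampling, (n(term, topic) + beta) / (n(topic) + V * beta), where V is the number of unique words. The function name and parameters are hypothetical, and this is not claimed to be what any MeTA subclass computes.

```cpp
#include <cassert>
#include <cstdint>

// Assumed estimator for illustration (not necessarily MeTA's):
//   P(term | topic) = (count(term, topic) + beta) / (count(topic) + V * beta)
// term_topic_count: times the term was assigned to the topic
// topic_count:      total assignments to the topic
// num_words:        vocabulary size V
// beta:             symmetric Dirichlet smoothing parameter
double smoothed_term_topic_probability(std::uint64_t term_topic_count,
                                       std::uint64_t topic_count,
                                       std::uint64_t num_words, double beta)
{
    assert(beta > 0.0 && num_words > 0);
    return (static_cast<double>(term_topic_count) + beta)
           / (static_cast<double>(topic_count)
              + static_cast<double>(num_words) * beta);
}
```

With beta > 0, a term that was never assigned to a topic still receives a small nonzero probability.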

virtual double meta::topics::lda_model::compute_doc_topic_probability ( doc_id  doc,
topic_id  topic 
) const
pure virtual
Returns
the probability that the given topic is picked for the given document
Parameters
doc: The document we are concerned with
topic: The topic we are concerned with

Implemented in meta::topics::lda_gibbs, meta::topics::lda_cvb, and meta::topics::lda_scvb.
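
One way a caller might combine this function with num_topics() is to find the most probable topic for a document. The sketch below uses a simplified, hypothetical interface (toy_model) with plain integer ids in place of MeTA's doc_id and topic_id, plus a table-backed test model; none of these types exist in MeTA.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the documented interface; document
// and topic ids are plain integers here rather than MeTA's id types.
class toy_model
{
  public:
    virtual ~toy_model() = default;
    virtual double compute_doc_topic_probability(std::uint64_t doc,
                                                 std::uint64_t topic) const = 0;
    virtual std::uint64_t num_topics() const = 0;
};

// A model backed by a fixed table theta[doc][topic], for demonstration.
class table_model : public toy_model
{
  public:
    explicit table_model(std::vector<std::vector<double>> theta)
        : theta_(std::move(theta)) {}

    double compute_doc_topic_probability(std::uint64_t doc,
                                         std::uint64_t topic) const override
    {
        return theta_.at(doc).at(topic);
    }

    std::uint64_t num_topics() const override
    {
        return theta_.empty() ? 0 : theta_.front().size();
    }

  private:
    std::vector<std::vector<double>> theta_;
};

// Returns the topic with the highest probability for the given document.
// Ties are resolved in favor of the lowest topic id.
std::uint64_t most_probable_topic(const toy_model& model, std::uint64_t doc)
{
    assert(model.num_topics() > 0);
    std::uint64_t best = 0;
    for (std::uint64_t t = 1; t < model.num_topics(); ++t)
    {
        if (model.compute_doc_topic_probability(doc, t)
            > model.compute_doc_topic_probability(doc, best))
            best = t;
    }
    return best;
}
```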

uint64_t meta::topics::lda_model::num_topics ( ) const
Returns
the number of topics in this model

The documentation for this class was generated from the following files: