ModErn Text Analysis META Enumerates Textual Applications
meta::sequence::crf Class Reference

Linear-chain conditional random field for POS tagging and chunking applications.

#include <crf.h>

Classes

struct  parameters
Wrapper to represent the parameters used during learning.

class  scorer
Internal class that holds scoring information for sequences under the current model.

class  tagger

class  viterbi_scorer
Scorer for performing viterbi-based tagging.

Public Member Functions

crf (const std::string &prefix)
Constructs a new CRF, storing model parameters in the given prefix.

double train (parameters params, const std::vector< sequence > &examples)
Trains a new CRF model on the given examples.

tagger make_tagger () const
Constructs a new tagging interface that references the current model.

uint64_t num_labels () const

Private Types

using double_matrix = util::dense_matrix< double >
A dense_matrix of doubles, used frequently in training and testing for holding score information under the model.

using feature_range = util::basic_range< crf_feature_id >
A range representing a set of feature functions (ids).

Private Member Functions

void initialize (const std::vector< sequence > &examples)
Initializes the CRF model based on the set of training examples.

Loads the CRF model from the files stored on disk.

void reset ()
Completely resets the model weights.

double calibrate (parameters params, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples)
Determines a good initial setting for the learning rate.

const double & obs_weight (crf_feature_id idx) const

double & obs_weight (crf_feature_id idx)

const double & trans_weight (crf_feature_id idx) const

double & trans_weight (crf_feature_id idx)

feature_range obs_range (feature_id fid) const

feature_range trans_range (label_id lbl) const

label_id observation (crf_feature_id idx) const

label_id transition (crf_feature_id idx) const

double epoch (parameters params, printing::progress &progress, uint64_t iter, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples, scorer &scorer)
Performs a single epoch of training.

double iteration (parameters params, uint64_t iter, const sequence &seq, scorer &scorer)
Performs a single iteration within a training epoch.

void gradient_observation_expectation (const sequence &seq, double gain)
Updates the model parameters based on the observation expectation part of the gradient.

void gradient_model_expectation (const sequence &seq, double gain, const scorer &scr)
Updates the model parameters based on the model expectation part of the gradient.

double l2norm () const

void rescale ()
Updates all of the weights by re-scaling by the current scale parameter, and sets the scale parameter to 1 after doing so.

Private Attributes

friend scorer

util::optional< util::disk_vector< crf_feature_id > > observation_ranges_
Represents the feature id range for a given observation: observation_ranges_[i] gives the start of a range of crf_feature_ids (indexing into the observation_weights_) that have fired for feature_id i, and observation_ranges_[i + 1] gives the end of the range.

util::optional< util::disk_vector< crf_feature_id > > transition_ranges_
Analogous to the observation range, but for transitions.

util::optional< util::disk_vector< label_id > > observations_
Represents the state that fired for a given observation feature.

util::optional< util::disk_vector< label_id > > transitions_
Represents the destination label for a given transition feature.

util::optional< util::disk_vector< double > > observation_weights_
The weights for all of the node-observation features.

util::optional< util::disk_vector< double > > transition_weights_
Weights for all of the transition features.

double scale_
the current decay factor applied to all of the weights

uint64_t num_labels_
the number of allowed labels

const std::string & prefix_
the prefix (folder) where model files are to be stored

Detailed Description

Linear-chain conditional random field for POS tagging and chunking applications.

Learned using L2-regularized stochastic gradient descent.
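
The `scale_` attribute and the `rescale()` member listed in the summary suggest that the L2 decay is applied through one shared factor rather than by touching every weight on each step. The following is a minimal sketch of that general technique, not MeTA's actual code; the type `scaled_weights` and its members `decay`, `add`, and `value` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not MeTA's code: weights are stored as raw
// values with one shared multiplier, so the true weight is scale * raw[i].
struct scaled_weights
{
    std::vector<double> raw;
    double scale = 1.0;

    explicit scaled_weights(std::size_t n) : raw(n, 0.0) {}

    // L2 decay for one SGD step: w <- (1 - eta * lambda) * w, done in O(1).
    void decay(double eta, double lambda)
    {
        assert(eta * lambda < 1.0);
        scale *= (1.0 - eta * lambda);
    }

    // Sparse gradient step on one weight: w_i <- w_i + eta * g.
    // Dividing by scale keeps the true weight equal to scale * raw[i].
    void add(std::size_t i, double eta, double g)
    {
        raw[i] += eta * g / scale;
    }

    // True value of weight i.
    double value(std::size_t i) const
    {
        return scale * raw[i];
    }

    // Fold the multiplier into the stored values and reset it to 1.
    void rescale()
    {
        for (auto& w : raw)
            w *= scale;
        scale = 1.0;
    }
};
```

The design choice here is that a decay step costs constant time regardless of the number of features, and only the features that fire in the current sequence are written to. Folding the factor back in occasionally keeps it from shrinking toward zero.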

This CRF implementation uses node-observation features only: an observation is paired with a single state, never with a transition. Feature templates therefore look like $$f(o_t, s_t)$$ and $$f(s_{t-1}, s_t)$$ only. This is done for memory efficiency and to avoid overfitting.
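
Written out, a linear-chain CRF restricted to these two templates defines the following distribution over a label sequence $$s$$ given the observations $$o$$. The weight symbols $$\lambda_k$$ and $$\mu_j$$, the name $$g_j$$ for transition features, and the normalizer $$Z(o)$$ are notation assumed here for illustration; they follow the standard formulation rather than anything stated on this page.

$$p(s \mid o) = \frac{1}{Z(o)} \exp\left( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(o_t, s_t) + \sum_{t=2}^{T} \sum_{j} \mu_j g_j(s_{t-1}, s_t) \right)$$

Here $$Z(o)$$ sums the same exponential over all possible label sequences, so that the probabilities add up to 1.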

http://homepages.inf.ed.ac.uk/csutton/publications/crftut-fnt.pdf

§ crf()

 meta::sequence::crf::crf ( const std::string & prefix )

Constructs a new CRF, storing model parameters in the given prefix.

If a crf model already exists in the given prefix, it will be loaded; otherwise, the directory will be created.

Parameters
 prefix The prefix (folder) to load/store model files

§ train()

 double meta::sequence::crf::train ( parameters params, const std::vector< sequence > & examples )

Trains a new CRF model on the given examples.

The examples are assumed to have been run through a sequence_analyzer first to generate features for every observation in every sequence.

Parameters
 params The parameters for the learning algorithm
 examples The labeled training examples
Returns
the loss for the last epoch during training

§ make_tagger()

 auto meta::sequence::crf::make_tagger ( ) const

Constructs a new tagging interface that references the current model.

Returns
a new tagging interface for this model

§ num_labels()

 uint64_t meta::sequence::crf::num_labels ( ) const
Returns
the number of labels possible under this model.

§ initialize()

 void meta::sequence::crf::initialize ( const std::vector< sequence > & examples )
private

Initializes the CRF model based on the set of training examples.

This function runs the "feature generation" portion of the training, where we try to find all state-observation and transition-observation functions that are active in the training data.

Parameters
 examples The training examples

§ calibrate()

 double meta::sequence::crf::calibrate ( parameters params, const std::vector< uint64_t > & indices, const std::vector< sequence > & examples )
private

Determines a good initial setting for the learning rate.

Based on Leon Bottou's SGD implementation.

Parameters
 params The parameters for learning
 indices The vector of shuffled indices for the random sampling
 examples The (unshuffled) training examples
Returns
The optimal t0 found by calibration, which determines the initial learning rate $$\eta$$.
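
In Bottou-style SGD, a calibrated $$t_0$$ typically enters a decaying learning-rate schedule. The sketch below assumes the common form $$\eta_t = 1 / (\lambda (t_0 + t))$$ with a regularization strength $$\lambda$$. Both the schedule and the function names `learning_rate` and `t0_for_initial_rate` are assumptions made for illustration and are not taken from this page.

```cpp
#include <cassert>
#include <cstdint>

// Assumed schedule: eta_t = 1 / (lambda * (t0 + t)).
double learning_rate(double lambda, double t0, std::uint64_t t)
{
    assert(lambda > 0.0 && t0 > 0.0);
    return 1.0 / (lambda * (t0 + static_cast<double>(t)));
}

// Under that schedule, the t0 that yields a desired initial rate eta0
// at t = 0.
double t0_for_initial_rate(double lambda, double eta0)
{
    assert(lambda > 0.0 && eta0 > 0.0);
    return 1.0 / (lambda * eta0);
}
```

Under this assumption, searching for a good initial learning rate and searching for a good $$t_0$$ are the same problem, since each determines the other once $$\lambda$$ is fixed.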

§ obs_weight() [1/2]

 const double & meta::sequence::crf::obs_weight ( crf_feature_id idx ) const
private
Parameters
 idx The internal crf model feature id
Returns
a const reference to the weight associated with this feature

§ obs_weight() [2/2]

 double & meta::sequence::crf::obs_weight ( crf_feature_id idx )
private
Parameters
 idx The internal crf model feature id
Returns
a reference to the weight associated with this feature

§ trans_weight() [1/2]

 const double & meta::sequence::crf::trans_weight ( crf_feature_id idx ) const
private
Parameters
 idx The internal crf model feature id
Returns
a const reference to the weight associated with this feature

§ trans_weight() [2/2]

 double & meta::sequence::crf::trans_weight ( crf_feature_id idx )
private
Parameters
 idx The internal crf model feature id
Returns
a reference to the weight associated with this feature

§ obs_range()

 auto meta::sequence::crf::obs_range ( feature_id fid ) const
private
Parameters
 fid The external observation feature id
Returns
a range of internal crf model feature ids for state features that are active for this observation

§ trans_range()

 auto meta::sequence::crf::trans_range ( label_id lbl ) const
private
Parameters
 lbl The label
Returns
a range of internal crf model feature ids for transitions that are active for this state

§ observation()

 label_id meta::sequence::crf::observation ( crf_feature_id idx ) const
private
Parameters
 idx The internal crf model feature id
Returns
the label associated with this state-based feature id

§ transition()

 label_id meta::sequence::crf::transition ( crf_feature_id idx ) const
private
Parameters
 idx The internal crf model feature id
Returns
the destination label associated with this transition feature id

§ epoch()

 double meta::sequence::crf::epoch ( parameters params, printing::progress & progress, uint64_t iter, const std::vector< uint64_t > & indices, const std::vector< sequence > & examples, scorer & scorer )
private

Performs a single epoch of training.

Parameters
 params The learning parameters
 progress The progress logger to use
 iter The current epoch
 indices The shuffled indices for the random sampling
 examples The (not shuffled) training examples
 scorer The scorer to re-use
Returns
the loss for this training epoch
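
The parameter list implies that the examples stay in place while a separate vector of shuffled indices sets the visiting order. The sketch below shows that pattern, assuming the epoch loss is the sum of the per-example losses; `run_epoch` and its `step` callback are hypothetical names and not part of MeTA.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical epoch driver: visits examples in the order given by indices
// and sums the loss that step returns for each visited example.
template <class Example, class Step>
double run_epoch(const std::vector<std::uint64_t>& indices,
                 const std::vector<Example>& examples, Step step)
{
    double loss = 0.0;
    for (auto idx : indices)
    {
        assert(idx < examples.size());
        loss += step(examples[idx]);
    }
    return loss;
}
```

Shuffling indices instead of the examples themselves avoids moving sequence data around between epochs.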

§ iteration()

 double meta::sequence::crf::iteration ( parameters params, uint64_t iter, const sequence & seq, scorer & scorer )
private

Performs a single iteration within a training epoch.

Parameters
 params The learning parameters
 iter The current number of total iterations ($$t$$)
 seq The sequence to use to update model parameters
 scorer The scorer to re-use
Returns
the loss associated with this single iteration within the epoch

§ gradient_observation_expectation()

 void meta::sequence::crf::gradient_observation_expectation ( const sequence & seq, double gain )
private

Updates the model parameters based on the observation expectation part of the gradient.

Parameters
 seq The sequence to use
 gain The amount to scale the weight updates by

§ gradient_model_expectation()

 void meta::sequence::crf::gradient_model_expectation ( const sequence & seq, double gain, const scorer & scr )
private

Updates the model parameters based on the model expectation part of the gradient.

Parameters
 seq The sequence to use
 gain The amount to scale the weight updates by
 scr The scorer to re-use for computing the marginal probabilities
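
Read together, the two gradient members correspond to the two terms of the usual CRF log-likelihood gradient. The notation below is assumed for illustration: $$\lambda_k$$ is the weight of observation feature $$f_k$$, $$s_t$$ is the label in the training sequence, $$S_t$$ is the label at position $$t$$ as a random variable, and $$s'$$ ranges over all labels.

$$\frac{\partial \log p(s \mid o)}{\partial \lambda_k} = \sum_{t} f_k(o_t, s_t) - \sum_{t} \sum_{s'} p(S_t = s' \mid o) \, f_k(o_t, s')$$

The first sum is the observation expectation, which only needs the labeled sequence. The second sum is the model expectation, which needs the marginal probabilities from the scorer. Transition features follow the same pattern with the pairwise marginals $$p(S_{t-1} = s'', S_t = s' \mid o)$$.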

§ l2norm()

 double meta::sequence::crf::l2norm ( ) const
private
Returns
the current squared l2 norm of the weights ($$w^T w$$)
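
A minimal sketch of computing $$w^T w$$ over two weight vectors that share one scale factor. The function `squared_l2_norm` and the assumption that each true weight equals the stored value times the scale are illustrative and not confirmed by this page.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: w^T w for weights stored as scale * raw value.
double squared_l2_norm(const std::vector<double>& obs_raw,
                       const std::vector<double>& trans_raw, double scale)
{
    // Assumption: a decay factor is expected to stay positive.
    assert(scale > 0.0);
    double sum = 0.0;
    for (double w : obs_raw)
        sum += w * w;
    for (double w : trans_raw)
        sum += w * w;
    // (scale * w)^2 = scale^2 * w^2, so the factor is applied once at the end.
    return sum * scale * scale;
}
```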

§ observation_ranges_

 util::optional< util::disk_vector< crf_feature_id > > meta::sequence::crf::observation_ranges_
private

Represents the feature id range for a given observation: observation_ranges_[i] gives the start of a range of crf_feature_ids (indexing into the observation_weights_) that have fired for feature_id i, and observation_ranges_[i + 1] gives the end of the range.

(If i is the end, then the size of observation_weights_ gives the last id.)
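
The layout described above can be sketched with a plain vector of range starts. This is a hypothetical stand-in, not MeTA's `disk_vector` or `obs_range()`: the function `id_range`, its `total` parameter (the size of the weight vector), and the reading of the end as one past the last id are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// starts[i] is the first internal id for external feature i. The next entry,
// or total when i is the last feature, is taken as one past the end.
std::pair<std::uint64_t, std::uint64_t>
id_range(const std::vector<std::uint64_t>& starts, std::uint64_t total,
         std::uint64_t i)
{
    assert(i < starts.size());
    std::uint64_t begin = starts[i];
    std::uint64_t end = (i + 1 < starts.size()) ? starts[i + 1] : total;
    return {begin, end};
}
```

With `starts = {0, 2, 5}` and six weights, feature 0 owns ids 0 and 1, feature 1 owns ids 2 through 4, and feature 2 owns id 5.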

§ transition_ranges_

 util::optional< util::disk_vector< crf_feature_id > > meta::sequence::crf::transition_ranges_
private

Analogous to the observation range, but for transitions.

transition_ranges_[i] gives the start of a range of crf_feature_ids (indexing into transition_weights_) that have fired for label_id i, and transition_ranges_[i + 1] gives the end of the range. (If i is the end, then the size of transition_weights_ gives the last id.)

§ observations_

 util::optional< util::disk_vector< label_id > > meta::sequence::crf::observations_
private

Represents the state that fired for a given observation feature.

This is a parallel vector with observation_weights_, where observations_[f] gives the label_id for the observation feature f.

§ transitions_

 util::optional< util::disk_vector< label_id > > meta::sequence::crf::transitions_
private

Represents the destination label for a given transition feature.

This is a parallel vector with transition_weights_, where transitions_[f] gives the destination for transition feature f.

§ observation_weights_

 util::optional< util::disk_vector< double > > meta::sequence::crf::observation_weights_
private

The weights for all of the node-observation features.

Indexes must be taken from the observation_ranges_ vector.
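
To show how a range vector, a parallel label vector, and a weight vector can fit together, the sketch below accumulates a per-label score from the external features active at one position. The type `toy_model` and the function `state_scores` are hypothetical, and the scoring loop is an assumption about how such a layout would typically be used, not a description of MeTA's scorer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical flat layout mirroring the attributes described on this page.
struct toy_model
{
    std::vector<std::uint64_t> range_starts; // one entry per external feature
    std::vector<std::uint64_t> labels;       // parallel to weights
    std::vector<double> weights;             // one entry per internal id
};

// For every active external feature, walk its range of internal ids and add
// each weight to the score of the label that internal feature refers to.
std::vector<double> state_scores(const toy_model& m,
                                 const std::vector<std::uint64_t>& active,
                                 std::uint64_t num_labels)
{
    assert(m.labels.size() == m.weights.size());
    std::vector<double> scores(num_labels, 0.0);
    for (auto fid : active)
    {
        assert(fid < m.range_starts.size());
        std::uint64_t begin = m.range_starts[fid];
        std::uint64_t end = (fid + 1 < m.range_starts.size())
                                ? m.range_starts[fid + 1]
                                : m.weights.size();
        for (std::uint64_t id = begin; id < end; ++id)
        {
            assert(m.labels[id] < num_labels);
            scores[m.labels[id]] += m.weights[id];
        }
    }
    return scores;
}
```

Because only (feature, label) pairs seen in training get an internal id, the weight vector stays much smaller than a dense feature-by-label table would be.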

§ transition_weights_

 util::optional< util::disk_vector< double > > meta::sequence::crf::transition_weights_
private

Weights for all of the transition features.

Indexes must be taken from the transition_ranges_ vector.

The documentation for this class was generated from the following files:
• /home/chase/projects/meta/include/meta/sequence/crf/crf.h
• /home/chase/projects/meta/src/sequence/crf/crf.cpp
• /home/chase/projects/meta/src/sequence/crf/tagger.cpp