ModErn Text Analysis
META Enumerates Textual Applications
meta::sequence::crf Class Reference

Linear-chain conditional random field for POS tagging and chunking applications.

#include <crf.h>

Classes

struct  parameters
 Wrapper to represent the parameters used during learning.
 
class  scorer
 Internal class that holds scoring information for sequences under the current model.
 
class  tagger
 
class  viterbi_scorer
 Scorer for performing viterbi-based tagging.
 

Public Member Functions

 crf (const std::string &prefix)
 Constructs a new CRF, storing model parameters in the given prefix.
 
double train (parameters params, const std::vector< sequence > &examples)
 Trains a new CRF model on the given examples.
 
tagger make_tagger () const
 Constructs a new tagging interface that references the current model.
 
uint64_t num_labels () const
 

Private Types

using double_matrix = util::dense_matrix< double >
 A dense_matrix of doubles, used frequently in training and testing for holding score information under the model.
 
using feature_range = util::basic_range< crf_feature_id >
 A range representing a set of feature functions (ids).
 

Private Member Functions

void initialize (const std::vector< sequence > &examples)
 Initializes the CRF model based on the set of training examples.
 
void load_model ()
 Loads the CRF model from the files stored on disk.
 
void reset ()
 Completely resets the model weights.
 
double calibrate (parameters params, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples)
 Determines a good initial setting for the learning rate.
 
const double & obs_weight (crf_feature_id idx) const
 
double & obs_weight (crf_feature_id idx)
 
const double & trans_weight (crf_feature_id idx) const
 
double & trans_weight (crf_feature_id idx)
 
feature_range obs_range (feature_id fid) const
 
feature_range trans_range (label_id lbl) const
 
label_id observation (crf_feature_id idx) const
 
label_id transition (crf_feature_id idx) const
 
double epoch (parameters params, printing::progress &progress, uint64_t iter, const std::vector< uint64_t > &indices, const std::vector< sequence > &examples, scorer &scorer)
 Performs a single epoch of training.
 
double iteration (parameters params, uint64_t iter, const sequence &seq, scorer &scorer)
 Performs a single iteration within a training epoch.
 
void gradient_observation_expectation (const sequence &seq, double gain)
 Updates the model parameters based on the observation expectation part of the gradient.
 
void gradient_model_expectation (const sequence &seq, double gain, const scorer &scr)
 Updates the model parameters based on the model expectation part of the gradient.
 
double l2norm () const
 
void rescale ()
 Updates all of the weights by re-scaling by the current scale parameter, and sets the scale parameter to 1 after doing so.
 

Private Attributes

friend scorer
 
util::optional< util::disk_vector< crf_feature_id > > observation_ranges_
 Represents the feature id range for a given observation: observation_ranges_[i] gives the start of a range of crf_feature_ids (indexing into the observation_weights_) that have fired for feature_id i, and observation_ranges_[i + 1] gives the end of the range.
 
util::optional< util::disk_vector< crf_feature_id > > transition_ranges_
 Analogous to the observation range, but for transitions.
 
util::optional< util::disk_vector< label_id > > observations_
 Represents the state that fired for a given observation feature.
 
util::optional< util::disk_vector< label_id > > transitions_
 Represents the destination label for a given transition feature.
 
util::optional< util::disk_vector< double > > observation_weights_
 The weights for all of the node-observation features.
 
util::optional< util::disk_vector< double > > transition_weights_
 Weights for all of the transition features.
 
double scale_
 the current decay factor applied to all of the weights
 
uint64_t num_labels_
 the number of allowed labels
 
const std::string & prefix_
 the prefix (folder) where model files are to be stored
 

Detailed Description

Linear-chain conditional random field for POS tagging and chunking applications.

Trained using L2-regularized stochastic gradient descent.

This CRF implementation uses node-observation features only. That is, feature templates take only the forms \(f(o_t, s_t)\) and \(f(s_{t-1}, s_t)\); an observation is never combined with a transition in a single feature. This is done for memory efficiency and to avoid overfitting.
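
For reference, a linear-chain CRF restricted to these two template types can be written in the standard form below. The symbols \(\lambda_k\), \(\mu_j\), \(f_k\), \(g_j\), \(T\), and \(Z(o)\) are notation introduced here for illustration; they are not names used by the library, and the treatment of the first position (where \(s_0\) is taken to be a conventional start state) is an assumption.

```latex
p(s \mid o) = \frac{1}{Z(o)} \exp\left( \sum_{t=1}^{T} \left[ \sum_{k} \lambda_k \, f_k(o_t, s_t) + \sum_{j} \mu_j \, g_j(s_{t-1}, s_t) \right] \right),
\qquad
Z(o) = \sum_{s'} \exp\left( \sum_{t=1}^{T} \left[ \sum_{k} \lambda_k \, f_k(o_t, s'_t) + \sum_{j} \mu_j \, g_j(s'_{t-1}, s'_t) \right] \right)
```

Here the \(f_k\) correspond to the \(f(o_t, s_t)\) templates and the \(g_j\) to the \(f(s_{t-1}, s_t)\) templates described above.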

See also
http://homepages.inf.ed.ac.uk/csutton/publications/crftut-fnt.pdf

Constructor & Destructor Documentation

§ crf()

meta::sequence::crf::crf ( const std::string &  prefix)

Constructs a new CRF, storing model parameters in the given prefix.

If a crf model already exists in the given prefix, it will be loaded; otherwise, the directory will be created.

Parameters
prefix  The prefix (folder) to load/store model files

Member Function Documentation

§ train()

double meta::sequence::crf::train ( parameters  params,
const std::vector< sequence > &  examples 
)

Trains a new CRF model on the given examples.

The examples are assumed to have been run through a sequence_analyzer first to generate features for every observation in every sequence.

Parameters
params  The parameters for the learning algorithm
examples  The labeled training examples
Returns
the loss for the last epoch during training

§ make_tagger()

auto meta::sequence::crf::make_tagger ( ) const

Constructs a new tagging interface that references the current model.

Returns
a new tagging interface for this model
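
The class list names a viterbi_scorer used for Viterbi-based tagging. As a rough illustration of what Viterbi decoding does in general, the sketch below finds the best label sequence from a table of per-position label scores and a table of transition scores. The `matrix` alias, the `viterbi` function, and the dense score layout are all hypothetical stand-ins for illustration; they are not the library's tagger or scorer interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for score storage: a vector of rows.
using matrix = std::vector<std::vector<double>>;

// Returns the highest-scoring label sequence.
//   state[t][y] : score of label y at position t
//   trans[x][y] : score of moving from label x to label y
std::vector<std::size_t> viterbi(const matrix& state, const matrix& trans)
{
    assert(!state.empty() && !state[0].empty());
    const std::size_t T = state.size();
    const std::size_t L = state[0].size();
    assert(trans.size() == L && trans[0].size() == L);

    // best[t][y]: best score of any path that ends in label y at position t.
    matrix best = state;
    // back[t][y]: label at position t - 1 on that best path.
    std::vector<std::vector<std::size_t>> back(
        T, std::vector<std::size_t>(L, 0));

    for (std::size_t t = 1; t < T; ++t)
    {
        for (std::size_t y = 0; y < L; ++y)
        {
            std::size_t arg = 0;
            double top = best[t - 1][0] + trans[0][y];
            for (std::size_t x = 1; x < L; ++x)
            {
                const double cand = best[t - 1][x] + trans[x][y];
                if (cand > top)
                {
                    top = cand;
                    arg = x;
                }
            }
            best[t][y] = top + state[t][y];
            back[t][y] = arg;
        }
    }

    // Pick the best final label, then follow the back-pointers.
    std::size_t last = 0;
    for (std::size_t y = 1; y < L; ++y)
        if (best[T - 1][y] > best[T - 1][last])
            last = y;

    std::vector<std::size_t> path(T);
    path[T - 1] = last;
    for (std::size_t t = T - 1; t > 0; --t)
        path[t - 1] = back[t][path[t]];
    return path;
}
```

Ties are broken in favor of the lower label index in this sketch; a real tagger may choose differently.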

§ num_labels()

uint64_t meta::sequence::crf::num_labels ( ) const
Returns
the number of labels possible under this model.

§ initialize()

void meta::sequence::crf::initialize ( const std::vector< sequence > &  examples)
private

Initializes the CRF model based on the set of training examples.

This function runs the "feature generation" portion of the training, where we try to find all state-observation and transition-observation functions that are active in the training data.

Parameters
examples  The training examples

§ calibrate()

double meta::sequence::crf::calibrate ( parameters  params,
const std::vector< uint64_t > &  indices,
const std::vector< sequence > &  examples 
)
private

Determines a good initial setting for the learning rate.

Based on Leon Bottou's SGD implementation.

Parameters
params  The parameters for learning
indices  The vector of shuffled indices for the random sampling
examples  The (unshuffled) training examples
Returns
The optimal t0 found by calibration, which determines the initial learning rate \(\eta\).
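
To make the relationship between t0 and \(\eta\) concrete, the sketch below assumes the decaying schedule found in Bottou's SGD code, \(\eta_t = \eta_0 / (1 + \lambda \eta_0 t)\), which can be rewritten as \(\eta_t = 1 / (\lambda (t_0 + t))\) with \(t_0 = 1 / (\lambda \eta_0)\). Whether calibrate() uses exactly this schedule is not stated on this page, and the names `t0_from_eta0`, `learning_rate`, and `lambda` (the regularization strength) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Converts an initial learning rate eta0 into the offset t0 of the
// schedule eta_t = 1 / (lambda * (t0 + t)).
double t0_from_eta0(double eta0, double lambda)
{
    assert(eta0 > 0.0 && lambda > 0.0);
    return 1.0 / (lambda * eta0);
}

// Learning rate after t updates under the same schedule.
double learning_rate(double t0, double lambda, std::uint64_t t)
{
    assert(t0 > 0.0 && lambda > 0.0);
    return 1.0 / (lambda * (t0 + static_cast<double>(t)));
}
```

Under this assumption a larger t0 means a smaller initial learning rate, and the rate halves once t reaches t0.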

§ obs_weight() [1/2]

const double & meta::sequence::crf::obs_weight ( crf_feature_id  idx) const
private
Parameters
idx  The internal crf model feature id
Returns
a const reference to the weight associated with this feature

§ obs_weight() [2/2]

double & meta::sequence::crf::obs_weight ( crf_feature_id  idx)
private
Parameters
idx  The internal crf model feature id
Returns
a reference to the weight associated with this feature

§ trans_weight() [1/2]

const double & meta::sequence::crf::trans_weight ( crf_feature_id  idx) const
private
Parameters
idx  The internal crf model feature id
Returns
a const reference to the weight associated with this feature

§ trans_weight() [2/2]

double & meta::sequence::crf::trans_weight ( crf_feature_id  idx)
private
Parameters
idx  The internal crf model feature id
Returns
a reference to the weight associated with this feature

§ obs_range()

auto meta::sequence::crf::obs_range ( feature_id  fid) const
private
Parameters
fid  The external observation feature id
Returns
a range of internal crf model feature ids for state features that are active for this observation

§ trans_range()

auto meta::sequence::crf::trans_range ( label_id  lbl) const
private
Parameters
lbl  The label
Returns
a range of internal crf model feature ids for transitions that are active for this state

§ observation()

label_id meta::sequence::crf::observation ( crf_feature_id  idx) const
private
Parameters
idx  The internal crf model feature id
Returns
the label associated with this state-based feature id

§ transition()

label_id meta::sequence::crf::transition ( crf_feature_id  idx) const
private
Parameters
idx  The internal crf model feature id
Returns
the destination label associated with this transition feature id

§ epoch()

double meta::sequence::crf::epoch ( parameters  params,
printing::progress &  progress,
uint64_t  iter,
const std::vector< uint64_t > &  indices,
const std::vector< sequence > &  examples,
scorer &  scorer 
)
private

Performs a single epoch of training.

Parameters
params  The learning parameters
progress  The progress logger to use
iter  The current epoch
indices  The shuffled indices for the random sampling
examples  The (not shuffled) training examples
scorer  The scorer to re-use
Returns
the loss for this training epoch

§ iteration()

double meta::sequence::crf::iteration ( parameters  params,
uint64_t  iter,
const sequence &  seq,
scorer &  scorer 
)
private

Performs a single iteration within a training epoch.

Parameters
params  The learning parameters
iter  The current number of total iterations (\(t\))
seq  The sequence to use to update model parameters
scorer  The scorer to re-use
Returns
the loss associated with this single iteration within the epoch
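
The member summary describes scale_ as a decay factor applied to all of the weights and rescale() as folding that factor back into the weights. A minimal sketch of this general technique follows: L2 decay touches only the shared scale, while sparse updates divide by it. The `scaled_weights` struct and its member names are hypothetical, and the actual update performed inside iteration() may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature model: effective weight i is scale * stored[i].
struct scaled_weights
{
    std::vector<double> stored;
    double scale = 1.0;

    // Applies L2 decay to every effective weight in O(1).
    void decay(double lr, double lambda)
    {
        assert(lr * lambda < 1.0);
        scale *= (1.0 - lr * lambda);
    }

    // Adds delta to effective weight i by dividing out the scale.
    void add(std::size_t i, double delta)
    {
        assert(i < stored.size());
        stored[i] += delta / scale;
    }

    double effective(std::size_t i) const
    {
        return scale * stored[i];
    }

    // Folds the scale into the stored weights and resets it to 1.
    void rescale()
    {
        for (auto& w : stored)
            w *= scale;
        scale = 1.0;
    }
};
```

Because the scale only shrinks, an implementation along these lines would call rescale() now and then to keep the stored values in a numerically safe range.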

§ gradient_observation_expectation()

void meta::sequence::crf::gradient_observation_expectation ( const sequence &  seq,
double  gain 
)
private

Updates the model parameters based on the observation expectation part of the gradient.

Parameters
seq  The sequence to use
gain  The amount to scale the weight updates by

§ gradient_model_expectation()

void meta::sequence::crf::gradient_model_expectation ( const sequence &  seq,
double  gain,
const scorer &  scr 
)
private

Updates the model parameters based on the model expectation part of the gradient.

Parameters
seq  The sequence to use
gain  The amount to scale the weight updates by
scr  The scorer to re-use for computing the marginal probabilities
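
The split between gradient_observation_expectation() and gradient_model_expectation() mirrors the usual form of the CRF log-likelihood gradient: observed feature counts minus counts expected under the model. The sketch below shows that idea for a single observation feature at a single position. The function names and the dense per-label weight layout are hypothetical simplifications, not the library's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// weights[y]: weight of one observation feature paired with label y
// (hypothetical dense layout, for illustration only).

// Observed part: reward the label that appears in the training data.
void add_observed(std::vector<double>& weights, std::size_t gold,
                  double gain)
{
    assert(gold < weights.size());
    weights[gold] += gain;
}

// Model part: penalize each label in proportion to its marginal
// probability under the current model.
void subtract_expected(std::vector<double>& weights,
                       const std::vector<double>& marginals, double gain)
{
    assert(weights.size() == marginals.size());
    for (std::size_t y = 0; y < weights.size(); ++y)
        weights[y] -= gain * marginals[y];
}
```

When the model already puts probability 1 on the gold label, the two parts cancel and the weights stay unchanged.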

§ l2norm()

double meta::sequence::crf::l2norm ( ) const
private
Returns
the current squared L2 norm of the weights (\(w^T w\))
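
The expression \(w^T w\) is the sum of the squared weights. A minimal sketch of that computation follows; the `squared_norm` name is hypothetical, and applying a shared scale factor to the result is an assumption about how a scaled weight vector could be handled, not a statement about l2norm() itself.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Computes (scale * w)^T (scale * w) = scale^2 * sum_i w[i]^2.
double squared_norm(const std::vector<double>& w, double scale)
{
    assert(scale > 0.0);
    const double sum
        = std::inner_product(w.begin(), w.end(), w.begin(), 0.0);
    return scale * scale * sum;
}
```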

Member Data Documentation

§ observation_ranges_

util::optional<util::disk_vector<crf_feature_id> > meta::sequence::crf::observation_ranges_
private

Represents the feature id range for a given observation: observation_ranges_[i] gives the start of a range of crf_feature_ids (indexing into the observation_weights_) that have fired for feature_id i, and observation_ranges_[i + 1] gives the end of the range.

(If i is the end, then the size of observation_weights_ gives the last id.)
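
The layout described above resembles a compressed offset index. The sketch below shows how such a lookup could work, including the special case for the last entry. The `range_for` function is hypothetical, plain `std::vector` stands in for util::disk_vector, and treating the range as half-open is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns the half-open range [begin, end) of weight indices for entry i.
// starts[i] is where the range for i begins; the range for the last
// entry ends at num_weights.
std::pair<std::uint64_t, std::uint64_t>
range_for(const std::vector<std::uint64_t>& starts,
          std::uint64_t num_weights, std::size_t i)
{
    assert(i < starts.size());
    const std::uint64_t begin = starts[i];
    const std::uint64_t end
        = (i + 1 < starts.size()) ? starts[i + 1] : num_weights;
    assert(begin <= end);
    return {begin, end};
}
```

With starts = {0, 2, 5} and six weights, entry 1 owns weights 2 through 4 and entry 2 owns weight 5.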

§ transition_ranges_

util::optional<util::disk_vector<crf_feature_id> > meta::sequence::crf::transition_ranges_
private

Analogous to the observation range, but for transitions.

transition_ranges_[i] gives the start of a range of crf_feature_ids (indexing into transition_weights_) that have fired for label_id i, and transition_ranges_[i + 1] gives the end of the range. (If i is the end, then the size of transition_weights_ gives the last id.)

§ observations_

util::optional<util::disk_vector<label_id> > meta::sequence::crf::observations_
private

Represents the state that fired for a given observation feature.

This is a parallel vector with observation_weights_, where observations_[f] gives the label_id for the observation feature f.

§ transitions_

util::optional<util::disk_vector<label_id> > meta::sequence::crf::transitions_
private

Represents the destination label for a given transition feature.

This is a parallel vector with transition_weights_, where transitions_[f] gives the destination for transition feature f.
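
To show how the range, destination, and weight vectors fit together, the sketch below expands sparse transition features into a dense label-by-label score table. The `dense_transitions` function is hypothetical, plain `std::vector` stands in for util::disk_vector, and giving unseen transitions a score of 0 is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// starts[from] : first feature index for source label `from`
// dest[f]      : destination label of feature f (parallel to weights)
// weights[f]   : weight of feature f
std::vector<std::vector<double>>
dense_transitions(const std::vector<std::uint64_t>& starts,
                  const std::vector<std::uint64_t>& dest,
                  const std::vector<double>& weights)
{
    assert(dest.size() == weights.size());
    const std::size_t n = starts.size();
    std::vector<std::vector<double>> table(
        n, std::vector<double>(n, 0.0));

    for (std::size_t from = 0; from < n; ++from)
    {
        const std::uint64_t begin = starts[from];
        // The last label's range ends at the end of the weight vector.
        const std::uint64_t end
            = (from + 1 < n) ? starts[from + 1] : weights.size();
        for (std::uint64_t f = begin; f < end; ++f)
        {
            assert(dest[f] < n);
            table[from][dest[f]] = weights[f];
        }
    }
    return table;
}
```

Only transitions that have a feature get a nonzero entry, which is why the sparse form can be much smaller than the dense table.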

§ observation_weights_

util::optional<util::disk_vector<double> > meta::sequence::crf::observation_weights_
private

The weights for all of the node-observation features.

Indexes must be taken from the observation_ranges_ vector.

§ transition_weights_

util::optional<util::disk_vector<double> > meta::sequence::crf::transition_weights_
private

Weights for all of the transition features.

Indexes must be taken from the transition_ranges_ vector.

