ModErn Text Analysis
META Enumerates Textual Applications
meta::learn Namespace Reference

Generic learning algorithms and support data structures.

Classes

class  dataset
 Represents an in-memory view of a set of documents for running learning algorithms over.
 
class  dataset_view
 A non-owning view of a dataset.
 
struct  instance
 Represents an instance in the dataset, consisting of its id and feature_vector.
 
class  l2norm_transformer
 Transformer to normalize all feature vectors to unit length.
 
class  labeled_dataset
 
class  sgd_model
 A generic stochastic gradient descent learner for binary classification or regression.
 
class  tfidf_transformer
 Transformer for converting term frequency vectors into tf-idf weight vectors.
 

Typedefs

using feature_id = term_id
 
using feature_vector = util::sparse_vector< feature_id, double >
 

Functions

void print_liblinear (std::ostream &os, const feature_vector &weights)
 
template<class TransformFunction >
void transform (dataset &dset, TransformFunction &&trans)
 Transforms the feature vectors of a dataset in place using the given transformation function.
 
void tfidf_transform (dataset &dset, index::inverted_index &idx, index::ranking_function &rnk)
 Transforms the feature vectors of a dataset in place to be tf-idf features using the given index for term statistics and ranker for tf-idf weight definitions.
 
void l2norm_transform (dataset &dset)
 Transforms the feature vectors of a dataset in place to be unit length according to their L2 norm.
 

Function Documentation

§ transform()

template<class TransformFunction >
void meta::learn::transform (dataset &dset, TransformFunction &&trans)

Transforms the feature vectors of a dataset in place using the given transformation function.

TransformFunction must have an operator() that takes a learn::instance by mutable reference and changes its feature values in-place. For example, a simple TransformFunction might be one that normalizes all of the feature vectors to be unit length.

Parameters
dset   The dataset to be transformed
trans  The transformation function to be applied to all feature_vectors in dset
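
The TransformFunction contract described above can be illustrated with a minimal, self-contained sketch. Everything in the sketch namespace is a simplified stand-in written for this example (a vector of id/weight pairs instead of util::sparse_vector, a plain vector of instances instead of learn::dataset, and a hypothetical binarize functor); it is not the MeTA implementation and only mirrors the documented shape of the call.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; these are NOT the MeTA types.
namespace sketch
{
using feature_id = std::uint64_t;
// (feature id, weight) pairs standing in for util::sparse_vector
using feature_vector = std::vector<std::pair<feature_id, double>>;

struct instance
{
    std::uint64_t id;
    feature_vector weights;
};

using dataset = std::vector<instance>;

// Applies trans to every instance in place, mirroring the documented
// contract: trans receives each instance by mutable reference.
template <class TransformFunction>
void transform(dataset& dset, TransformFunction&& trans)
{
    for (auto& inst : dset)
        trans(inst);
}

// Hypothetical TransformFunction: replaces every nonzero weight with 1.0.
struct binarize
{
    void operator()(instance& inst) const
    {
        for (auto& fw : inst.weights)
            fw.second = (fw.second != 0.0) ? 1.0 : 0.0;
    }
};
} // namespace sketch
```

Under these assumptions, calling sketch::transform(dset, sketch::binarize{}) rewrites the weights of every instance in dset without copying the dataset. A lambda taking sketch::instance& would work equally well as the second argument.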

§ tfidf_transform()

void meta::learn::tfidf_transform (dataset &dset, index::inverted_index &idx, index::ranking_function &rnk)

Transforms the feature vectors of a dataset in place to be tf-idf features using the given index for term statistics and ranker for tf-idf weight definitions.

Parameters
dset  The dataset to be transformed
idx   The inverted_index to use for term statistics like df
rnk   The ranker to use to define tf-idf weights (via its score_one())
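
To make the idea of reweighting term frequencies with index statistics concrete, here is a minimal sketch. In the real function the weight comes from the ranker's score_one(), so the formula below (tf * ln(N / df)) is only one common tf-idf definition assumed for illustration. The term_stats struct is a hypothetical stand-in for the statistics an inverted index would supply, and none of the names in the sketch namespace are part of MeTA.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; these are NOT the MeTA types.
namespace sketch
{
using feature_id = std::uint64_t;
using feature_vector = std::vector<std::pair<feature_id, double>>;

// Hypothetical container for the term statistics an index would provide.
struct term_stats
{
    std::uint64_t num_docs = 0;                              // N
    std::unordered_map<feature_id, std::uint64_t> doc_freq;  // df per term
};

// Assumed weighting for this sketch: tf * ln(N / df).
inline double tfidf_weight(double tf, std::uint64_t df, std::uint64_t num_docs)
{
    assert(df > 0 && df <= num_docs);
    return tf * std::log(static_cast<double>(num_docs)
                         / static_cast<double>(df));
}

// Rewrites each term frequency in fv as a tf-idf weight, in place.
// Throws std::out_of_range if a term has no document frequency entry.
inline void tfidf_in_place(feature_vector& fv, const term_stats& stats)
{
    for (auto& fw : fv)
        fw.second = tfidf_weight(fw.second, stats.doc_freq.at(fw.first),
                                 stats.num_docs);
}
} // namespace sketch
```

With this assumed formula, a term that occurs in every document gets weight zero, while rarer terms are scaled up relative to their raw frequency.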

§ l2norm_transform()

void meta::learn::l2norm_transform (dataset &dset)

Transforms the feature vectors of a dataset in place to be unit length according to their L2 norm.

Parameters
dset  The dataset to be transformed
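
Scaling a vector to unit L2 length means dividing every weight by the square root of the sum of squared weights. The sketch below shows that computation on a single sparse vector. The types are simplified stand-ins rather than the MeTA ones, and leaving an all-zero vector untouched is a choice made for this example, not documented library behavior.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; these are NOT the MeTA types.
namespace sketch
{
using feature_id = std::uint64_t;
using feature_vector = std::vector<std::pair<feature_id, double>>;

// Euclidean (L2) norm: square root of the sum of squared weights.
inline double l2_norm(const feature_vector& fv)
{
    double sum_sq = 0.0;
    for (const auto& fw : fv)
        sum_sq += fw.second * fw.second;
    return std::sqrt(sum_sq);
}

// Scales fv in place so that its L2 norm is 1.
// An all-zero vector is left unchanged to avoid dividing by zero
// (an assumption of this sketch).
inline void l2_normalize(feature_vector& fv)
{
    const double norm = l2_norm(fv);
    if (norm == 0.0)
        return;
    for (auto& fw : fv)
        fw.second /= norm;
}
} // namespace sketch
```

For instance, a vector with weights 3 and 4 has norm 5, so after normalization the weights are approximately 0.6 and 0.8.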