ModErn Text Analysis
META Enumerates Textual Applications
meta::classify::multiclass_dataset Class Reference
Inherits meta::learn::labeled_dataset< class_label >, which in turn inherits meta::learn::dataset.

Public Types

using class_label_map = util::invertible_map< label_type, label_id >
 
using class_label_iterator = class_label_map::iterator
 
- Public Types inherited from meta::learn::labeled_dataset< class_label >
using label_type = class_label
 
- Public Types inherited from meta::learn::dataset
using instance_type = instance
 
using const_iterator = std::vector< instance_type >::const_iterator
 
using iterator = std::vector< instance_type >::iterator
 
using size_type = std::vector< instance_type >::size_type
 

Public Member Functions

 multiclass_dataset (std::shared_ptr< index::forward_index > idx)
 Creates an in-memory dataset from a forward_index.
 
template<class DocIdContainer >
 multiclass_dataset (std::shared_ptr< index::forward_index > idx, DocIdContainer &&dcont)
 Creates an in-memory dataset from a forward_index and a list of document ids.
 
template<class ForwardIterator >
 multiclass_dataset (std::shared_ptr< index::forward_index > idx, ForwardIterator begin, ForwardIterator end)
 Creates an in-memory dataset from a forward_index and a range of doc_ids, represented as iterators.
 
template<class ForwardIterator >
 multiclass_dataset (std::shared_ptr< index::inverted_index > idx, ForwardIterator begin, ForwardIterator end)
 Creates an in-memory listing of documents from an inverted_index and a range of doc_ids, represented as iterators.
 
 multiclass_dataset (std::shared_ptr< index::inverted_index > idx)
 Creates an in-memory listing of documents from an inverted_index.
 
template<class DocIdContainer >
 multiclass_dataset (std::shared_ptr< index::inverted_index > idx, DocIdContainer &&cont)
 Creates an in-memory listing of documents from an inverted_index and a container of doc_ids.
 
template<class ForwardIterator >
 multiclass_dataset (ForwardIterator begin, ForwardIterator end, size_type total_features)
 Creates an in-memory dataset from a pair of iterators.
 
template<class ForwardIterator , class FeatureVectorFunction , class LabelFunction >
 multiclass_dataset (ForwardIterator begin, ForwardIterator end, size_type total_features, FeatureVectorFunction &&featurizer, LabelFunction &&labeller)
 Creates an in-memory dataset from a pair of iterators, a function to convert to a feature_vector and a function to obtain a label.
 
size_type total_labels () const
 
label_id label_id_for (const class_label &lbl) const
 
class_label label_for (label_id lid) const
 
class_label_iterator labels_begin () const
 
class_label_iterator labels_end () const
 
void print_liblinear (std::ostream &os, const instance_type &instance) const
 
- Public Member Functions inherited from meta::learn::labeled_dataset< class_label >
 labeled_dataset (std::shared_ptr< index::forward_index > idx, LabelFunction &&labeller)
 Creates an in-memory dataset from a forward_index.
 
 labeled_dataset (std::shared_ptr< index::forward_index > idx, DocIdContainer &&dcont, LabelFunction &&labeller)
 Creates an in-memory dataset from a forward_index, a range of document identifiers (as collection), and a LabelFunction to assign labels to document identifiers.
 
 labeled_dataset (std::shared_ptr< index::forward_index > idx, ForwardIterator begin, ForwardIterator end, LabelFunction &&labeller)
 Creates an in-memory dataset from a forward_index and a range of doc_ids, represented as iterators.
 
 labeled_dataset (std::shared_ptr< index::inverted_index > idx, ForwardIterator begin, ForwardIterator end)
 Creates an in-memory dataset from an inverted_index and a range of doc_ids, represented as iterators.
 
 labeled_dataset (ForwardIterator begin, ForwardIterator end, size_type total_features)
 Creates an in-memory dataset from a pair of iterators.
 
 labeled_dataset (ForwardIterator begin, ForwardIterator end, size_type total_features, FeatureVectorFunction &&featurizer, LabelFunction &&labeller)
 Creates an in-memory dataset from a pair of iterators, a function to convert to a feature_vector, and a function to obtain a label.
 
label_type label (const instance_type &inst) const
 
- Public Member Functions inherited from meta::learn::dataset
template<class ForwardIterator , class ProgressTrait = printing::default_progress_trait>
 dataset (std::shared_ptr< index::forward_index > idx, ForwardIterator begin, ForwardIterator end, ProgressTrait=ProgressTrait{})
 Creates an in-memory dataset from a forward_index and a range of doc_ids, represented as iterators.
 
template<class ForwardIterator , class ProgressTrait = printing::default_progress_trait>
 dataset (std::shared_ptr< index::inverted_index > idx, ForwardIterator begin, ForwardIterator end, ProgressTrait=ProgressTrait{})
 Creates an in-memory listing of documents from an inverted_index and a range of doc_ids, represented as iterators.
 
template<class ForwardIterator >
 dataset (ForwardIterator begin, ForwardIterator end, size_type total_features)
 Creates an in-memory dataset from a pair of iterators.
 
template<class ForwardIterator , class FeatureVectorFunction >
 dataset (ForwardIterator begin, ForwardIterator end, size_type total_features, FeatureVectorFunction &&featurizer)
 Creates an in-memory dataset from a pair of iterators and a function to convert to a feature_vector.
 
const_iterator begin () const
 
iterator begin ()
 
const_iterator end () const
 
iterator end ()
 
size_type size () const
 
size_type total_features () const
 
const instance_type & operator() (size_type index) const
 

Private Attributes

class_label_map label_id_mapping_
 the mapping from label <-> label_id
 

Constructor & Destructor Documentation

§ multiclass_dataset() [1/5]

meta::classify::multiclass_dataset::multiclass_dataset ( std::shared_ptr< index::forward_index > idx)
inline

Creates an in-memory dataset from a forward_index.

This loads the entire index into memory, so you should only use this constructor with small datasets.

For large datasets (where large is defined as "larger than available RAM"), use one of the constructors that takes a range (or collection) of document ids to load in just a specific section of the index.
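
One way to pick out "a specific section of the index" is to split the document id space into bounded batches and load one batch at a time. The sketch below is self-contained and does not call into the library: doc_id is a stand-in typedef and make_batches is a hypothetical helper, neither taken from this class. Each resulting half-open range would then be expanded into ids and passed to one of the range or collection constructors listed above.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in for the library's document id type (an assumption for this sketch).
using doc_id = std::uint64_t;

// Hypothetical helper: splits [0, num_docs) into consecutive half-open
// [first, last) ranges holding at most batch_size ids each.
// Returns an empty list when batch_size is 0.
std::vector<std::pair<doc_id, doc_id>> make_batches(doc_id num_docs,
                                                    doc_id batch_size)
{
    std::vector<std::pair<doc_id, doc_id>> batches;
    if (batch_size == 0)
        return batches;

    doc_id first = 0;
    while (first < num_docs)
    {
        // Clamp the final batch so it never runs past num_docs.
        doc_id len = std::min(batch_size, num_docs - first);
        batches.emplace_back(first, first + len);
        first += len;
    }
    return batches;
}
```

The batch size is a tuning knob: it should be small enough that one batch of feature vectors fits comfortably in memory.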

§ multiclass_dataset() [2/5]

template<class ForwardIterator >
meta::classify::multiclass_dataset::multiclass_dataset ( std::shared_ptr< index::inverted_index > idx,
ForwardIterator  begin,
ForwardIterator  end 
)
inline

Creates an in-memory listing of documents from an inverted_index and a range of doc_ids, represented as iterators.

Note that this constructor will not load any feature_vectors, nor any class_labels, as this is just a thin wrapper around a set of document ids. This is mainly for use with the knn classifier.

§ multiclass_dataset() [3/5]

meta::classify::multiclass_dataset::multiclass_dataset ( std::shared_ptr< index::inverted_index > idx)
inline

Creates an in-memory listing of documents from an inverted_index.

Note that this constructor will not load any feature_vectors, nor any class_labels, as this is just a thin wrapper around a set of document ids. This is mainly for use with the knn classifier.

§ multiclass_dataset() [4/5]

template<class DocIdContainer >
meta::classify::multiclass_dataset::multiclass_dataset ( std::shared_ptr< index::inverted_index > idx,
DocIdContainer &&  cont 
)
inline

Creates an in-memory listing of documents from an inverted_index and a container of doc_ids.

Note that this constructor will not load any feature_vectors, nor any class_labels, as this is just a thin wrapper around a set of document ids. This is mainly for use with the knn classifier.

§ multiclass_dataset() [5/5]

template<class ForwardIterator >
meta::classify::multiclass_dataset::multiclass_dataset ( ForwardIterator  begin,
ForwardIterator  end,
size_type  total_features 
)
inline

Creates an in-memory dataset from a pair of iterators.

The dereferenced type must have a conversion operator to a feature_vector and a conversion operator to a class_label.
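
To illustrate the requirement on the dereferenced type, here is a minimal, self-contained sketch. The feature_vector and class_label structs are simplified stand-ins, not the library's definitions, and labeled_example and collect are hypothetical names. collect only mimics the kind of per-element conversion an iterator-pair constructor could perform; it is not the library's implementation.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for the library types (assumptions for this sketch).
struct feature_vector
{
    std::vector<std::pair<std::size_t, double>> weights;
};

struct class_label
{
    std::string name;
};

// Hypothetical element type that satisfies the requirement: it offers one
// conversion operator to feature_vector and one to class_label.
struct labeled_example
{
    feature_vector features;
    std::string label;

    operator feature_vector() const { return features; }
    operator class_label() const { return class_label{label}; }
};

// Hypothetical helper: converts each element of [begin, end) into a
// (feature_vector, class_label) pair via the two conversion operators.
template <class ForwardIterator>
std::vector<std::pair<feature_vector, class_label>>
collect(ForwardIterator begin, ForwardIterator end)
{
    std::vector<std::pair<feature_vector, class_label>> out;
    for (; begin != end; ++begin)
    {
        out.emplace_back(static_cast<feature_vector>(*begin),
                         static_cast<class_label>(*begin));
    }
    return out;
}
```

When the element type cannot be given conversion operators, the overload that takes a featurizer and a labeller function is the alternative listed above.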

Member Function Documentation

§ total_labels()

size_type meta::classify::multiclass_dataset::total_labels ( ) const
inline
Returns
the number of unique labels in the dataset

§ label_id_for()

label_id meta::classify::multiclass_dataset::label_id_for ( const class_label &  lbl) const
inline
Returns
the label_id associated with this label

§ label_for()

class_label meta::classify::multiclass_dataset::label_for ( label_id  lid) const
inline
Returns
the class_label associated with this label_id
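
The three accessors above describe a two-way lookup between labels and ids. The following self-contained sketch shows that idea with two std::map members. The class_label and label_id typedefs are stand-ins, label_mapping is a hypothetical class, and numbering ids from 0 in insertion order is an arbitrary choice for this sketch; none of it is taken from util::invertible_map.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Stand-ins for the library types (assumptions for this sketch).
using class_label = std::string;
using label_id = std::uint32_t;

// Hypothetical two-way mapping between labels and ids.
class label_mapping
{
  public:
    // Registers lbl if it is new and returns its id; returns the existing
    // id if lbl was already registered.
    label_id insert(const class_label& lbl)
    {
        auto it = to_id_.find(lbl);
        if (it != to_id_.end())
            return it->second;
        auto id = static_cast<label_id>(to_id_.size());
        to_id_.emplace(lbl, id);
        to_label_.emplace(id, lbl);
        return id;
    }

    // Number of unique labels seen so far.
    std::size_t total_labels() const { return to_id_.size(); }

    // Both lookups throw std::out_of_range for unknown keys.
    label_id label_id_for(const class_label& lbl) const
    {
        return to_id_.at(lbl);
    }

    const class_label& label_for(label_id lid) const
    {
        return to_label_.at(lid);
    }

  private:
    std::map<class_label, label_id> to_id_;
    std::map<label_id, class_label> to_label_;
};
```

Keeping both directions in sync inside insert is what makes label_for(label_id_for(lbl)) return lbl again for every registered label.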

The documentation for this class was generated from the following file: