ModErn Text Analysis
META Enumerates Textual Applications
rocchio.h
1 
10 #ifndef META_INDEX_ROCCHIO_H_
11 #define META_INDEX_ROCCHIO_H_
12 
14 
15 namespace meta
16 {
17 namespace index
18 {
19 
48 class rocchio : public ranker
49 {
50  public:
52  const static util::string_view id;
53 
55  const static constexpr float default_alpha = 1.0f;
56 
58  const static constexpr float default_beta = 0.8f;
59 
61  const static constexpr uint64_t default_k = 10;
62 
67  const static constexpr uint64_t default_max_terms = 50;
68 
69  rocchio(std::shared_ptr<forward_index> fwd);
70 
71  rocchio(std::shared_ptr<forward_index> fwd,
72  std::unique_ptr<ranker>&& initial_ranker,
73  float alpha = default_alpha, float beta = default_beta,
74  uint64_t k = default_k, uint64_t max_terms = default_max_terms);
75 
76  rocchio(std::istream& in);
77 
78  void save(std::ostream& out) const override;
79 
80  std::vector<search_result>
81  rank(ranker_context& ctx, uint64_t num_results,
82  const filter_function_type& filter) override;
83 
84  private:
85  std::shared_ptr<forward_index> fwd_;
86  std::unique_ptr<ranker> initial_ranker_;
87  const float alpha_;
88  const float beta_;
89  const uint64_t k_;
90  const uint64_t max_terms_;
91 };
92 
96 template <>
97 std::unique_ptr<ranker> make_ranker<rocchio>(const cpptoml::table& global,
98  const cpptoml::table& local);
99 }
100 }
101 #endif
static const constexpr float default_beta
Default value of beta, the positive document weight parameter.
Definition: rocchio.h:58
A ranker scores a query against all the documents in an inverted index, returning a list of documents...
Definition: ranker.h:159
Stores a list of postings_stream and other relevant information for performing document-at-a-time ran...
Definition: ranker.h:103
static const constexpr uint64_t default_k
Default value for k, the number of feedback documents to retrieve.
Definition: rocchio.h:61
A non-owning reference to a string.
Definition: string_view.h:51
static const constexpr float default_alpha
Default value of alpha, the original query weight parameter.
Definition: rocchio.h:55
static const constexpr uint64_t default_max_terms
Default value for max_terms, the number of new terms to add to the new query.
Definition: rocchio.h:67
Implements the Rocchio algorithm for pseudo-relevance feedback.
Definition: rocchio.h:48
void save(std::ostream &out) const override
Saves the ranker to a stream.
Definition: rocchio.cpp:68
The ModErn Text Analysis toolkit is a suite of natural language processing, classification, information retrieval, data mining, and other applications of text processing.
Definition: analyzer.h:25
std::vector< search_result > rank(ranker_context &ctx, uint64_t num_results, const filter_function_type &filter) override
Scores a query using a document-at-a-time strategy.
Definition: rocchio.cpp:79
static const util::string_view id
Identifier for this ranker.
Definition: rocchio.h:52
std::unique_ptr< ranker > make_ranker< rocchio >(const cpptoml::table &global, const cpptoml::table &local)
Specialization of the factory method used to create rocchio rankers.
Definition: rocchio.cpp:143