java.lang.Object
  it.unimi.dsi.fastutil.ints.AbstractIntIterator
    it.unimi.dsi.mg4j.search.score.AbstractScorer
      it.unimi.dsi.mg4j.search.score.AbstractIndexScorer
        it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer
          uk.ac.gla.dcs.renaissance.mg4j.scorers.RelevanceLMScorer
public class RelevanceLMScorer
A scorer implementing the Lavrenko/Croft language modelling approach. See: Victor Lavrenko and W. Bruce Croft. Relevance based language models. In Proceedings of the 24th Annual International Conference on Research and Development in Information Retrieval, edited by W. B. Croft, D. Harper, D. H. Kraft, and J. Zobel, 120-127. New York: ACM, 2001.
This scorer implements the conditional sampling approach described in the paper above.
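For orientation, the conditional sampling estimate can be written as below. The notation follows the cited paper rather than this class's source: $w$ is a candidate word, $q_1 \dots q_k$ are the query terms, and $\mathcal{M}$ is the set of document models. That this class builds $\mathcal{M}$ from the documents passed as `rfDocuments` is an assumption based on the parameter descriptions, not something this page states.

```latex
P(w, q_1 \dots q_k) = P(w) \prod_{i=1}^{k} \sum_{M_i \in \mathcal{M}} P(M_i \mid w)\, P(q_i \mid M_i),
\qquad
P(M_i \mid w) = \frac{P(w \mid M_i)\, P(M_i)}{P(w)},
\qquad
P(w) = \sum_{M \in \mathcal{M}} P(w \mid M)\, P(M)
```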
Field Summary |
---|
Fields inherited from class it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer |
---|
currWeight, index2Weight |
Fields inherited from class it.unimi.dsi.mg4j.search.score.AbstractIndexScorer |
---|
currIndex, n |
Fields inherited from class it.unimi.dsi.mg4j.search.score.AbstractScorer |
---|
documentIterator |
Constructor Summary | |
---|---|
RelevanceLMScorer(double lambda,
IndexConfiguration[] idxCfgs,
it.unimi.dsi.mg4j.document.DocumentCollection collection,
Collection<bpiwowar.utils.Pair<Integer,Float>> rfDocuments)
Lambda is a smoothing constant used to calculate P(w|M_d). |
|
RelevanceLMScorer(IndexConfiguration[] idxCfgs,
it.unimi.dsi.mg4j.document.DocumentCollection collection,
Collection<bpiwowar.utils.Pair<Integer,Float>> judgedDocuments)
Uses the default lambda (0.6). |
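The role of lambda in P(w|M_d) can be illustrated with a minimal sketch. It assumes linear interpolation between the document estimate and the collection estimate, as in the cited paper; this page does not show how the class actually computes the value. The class name `SmoothingSketch`, the method `smoothedProbability`, and all parameter names are hypothetical and not part of `RelevanceLMScorer`.

```java
/**
 * Hypothetical sketch of linearly interpolated smoothing for P(w|M_d).
 * None of these names come from RelevanceLMScorer; they are illustrative only.
 */
class SmoothingSketch {

    /**
     * Computes lambda * tf(w,d)/|d| + (1 - lambda) * cf(w)/|C|.
     *
     * @param lambda               smoothing constant in [0, 1]
     * @param termFreqInDoc        occurrences of the term in the document
     * @param docLength            number of term occurrences in the document
     * @param termFreqInCollection occurrences of the term in the whole collection
     * @param collectionLength     number of term occurrences in the whole collection
     */
    static double smoothedProbability(double lambda,
                                      long termFreqInDoc, long docLength,
                                      long termFreqInCollection, long collectionLength) {
        if (lambda < 0.0 || lambda > 1.0) {
            throw new IllegalArgumentException("lambda must be in [0, 1]");
        }
        if (docLength <= 0 || collectionLength <= 0) {
            throw new IllegalArgumentException("lengths must be positive");
        }
        // Maximum-likelihood estimate from the document alone.
        double docEstimate = (double) termFreqInDoc / docLength;
        // Background estimate from the whole collection.
        double collectionEstimate = (double) termFreqInCollection / collectionLength;
        // Mix the two estimates; lambda weights the document side.
        return lambda * docEstimate + (1.0 - lambda) * collectionEstimate;
    }

    public static void main(String[] args) {
        // Example with the default lambda mentioned above (0.6) and made-up counts.
        System.out.println(smoothedProbability(0.6, 3, 100, 50, 10_000));
    }
}
```

Under this assumption, a term that never occurs in the document still receives a non-zero probability from the collection estimate, which is the reason for smoothing.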
Method Summary | |
---|---|
it.unimi.dsi.mg4j.search.score.Scorer |
copy()
|
void |
init(it.unimi.dsi.mg4j.index.Index[] indexes,
String[] queryTerms,
int[] queryTermIndexNumbers)
Initialises this scorer. |
void |
init(it.unimi.dsi.mg4j.index.Index[] indexes,
String[] queryTerms,
int[] queryTermIndexNumbers,
Set<Integer> documentUniverse)
Initialises this scorer. |
void |
init(it.unimi.dsi.mg4j.index.Index index,
String[] queryTerms)
If only one index is involved, this simpler initialiser can safely be used. |
void |
init(it.unimi.dsi.mg4j.index.Index index,
String[] queryTerms,
Set<Integer> documentUniverse)
If only one index is involved, this simpler initialiser can safely be used. |
double |
score()
|
double |
score(it.unimi.dsi.mg4j.index.Index index)
|
double |
score(int documentID)
Returns the score for the given document |
void |
setRFDocuments(Collection<bpiwowar.utils.Pair<Integer,Float>> judgedDocuments)
Sets a new set of relevant documents, coming from the base scorer, so that this scorer instance can process a new topic. |
boolean |
usesIntervals()
|
void |
wrap(it.unimi.dsi.mg4j.search.DocumentIterator d)
|
Methods inherited from class it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer |
---|
getWeights, setWeights |
Methods inherited from class it.unimi.dsi.mg4j.search.score.AbstractScorer |
---|
hasNext, nextDocument, nextInt, skip |
Methods inherited from class it.unimi.dsi.fastutil.ints.AbstractIntIterator |
---|
next, remove |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface it.unimi.dsi.mg4j.search.score.Scorer |
---|
getWeights, nextDocument, nextInt, setWeights |
Methods inherited from interface it.unimi.dsi.fastutil.ints.IntIterator |
---|
skip |
Methods inherited from interface java.util.Iterator |
---|
hasNext, next, remove |
Constructor Detail |
---|
public RelevanceLMScorer(IndexConfiguration[] idxCfgs, it.unimi.dsi.mg4j.document.DocumentCollection collection, Collection<bpiwowar.utils.Pair<Integer,Float>> judgedDocuments)
Parameters:
idxCfgs - the index configurations of each index, used to get some required statistics. Make sure that this array is in line with the term visitor used in wrap().
collection - the document collection (needed to extract certain statistics)
judgedDocuments - the documents used for building the relevance model
public RelevanceLMScorer(double lambda, IndexConfiguration[] idxCfgs, it.unimi.dsi.mg4j.document.DocumentCollection collection, Collection<bpiwowar.utils.Pair<Integer,Float>> rfDocuments)
Parameters:
lambda - smoothing constant
idxCfgs - used to get some needed statistics
collection - the document collection (needed to extract certain statistics)
rfDocuments - the documents used for building the relevance model
Method Detail |
---|
public void setRFDocuments(Collection<bpiwowar.utils.Pair<Integer,Float>> judgedDocuments)
Parameters:
judgedDocuments - the documents that were judged
public double score(it.unimi.dsi.mg4j.index.Index index)
Specified by:
score in interface it.unimi.dsi.mg4j.search.score.Scorer
public double score() throws IOException
Specified by:
score in interface it.unimi.dsi.mg4j.search.score.Scorer
Overrides:
score in class it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer
Throws:
IOException
public double score(int documentID) throws IOException
Parameters:
documentID - the document ID
Throws:
IOException
public boolean usesIntervals()
Specified by:
usesIntervals in interface it.unimi.dsi.mg4j.search.score.Scorer
public it.unimi.dsi.mg4j.search.score.Scorer copy()
Specified by:
copy in interface it.unimi.dsi.lang.FlyweightPrototype<it.unimi.dsi.mg4j.search.score.Scorer>
copy in interface it.unimi.dsi.mg4j.search.score.Scorer
public void wrap(it.unimi.dsi.mg4j.search.DocumentIterator d) throws IOException
Specified by:
wrap in interface it.unimi.dsi.mg4j.search.score.Scorer
Overrides:
wrap in class it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer
Throws:
IOException
public void init(it.unimi.dsi.mg4j.index.Index index, String[] queryTerms, Set<Integer> documentUniverse) throws IOException
Parameters:
index - the index
queryTerms - the query terms
Throws:
IOException
public void init(it.unimi.dsi.mg4j.index.Index index, String[] queryTerms) throws IOException
Parameters:
index - the index
queryTerms - the query terms
Throws:
IOException
public void init(it.unimi.dsi.mg4j.index.Index[] indexes, String[] queryTerms, int[] queryTermIndexNumbers) throws IOException
Parameters:
indexes - an array of indexes
queryTerms - the query terms (in MG4J, the term part of a term-index pair)
queryTermIndexNumbers - the query term index numbers (in MG4J, the index part of a term-index pair). These must be valid index numbers for the indexes array.
Throws:
IOException
public void init(it.unimi.dsi.mg4j.index.Index[] indexes, String[] queryTerms, int[] queryTermIndexNumbers, Set<Integer> documentUniverse) throws IOException
Parameters:
indexes - an array of indexes
queryTerms - the query terms (in MG4J, the term part of a term-index pair)
queryTermIndexNumbers - the query term index numbers (in MG4J, the index part of a term-index pair). These must be valid index numbers for the indexes array.
documentUniverse - the document universe to consider. If set to null, all documents and terms in the index are considered.
Throws:
IOException