RelevanceLMScorer (Quantum Information Access framework 0.0.1-SNAPSHOT API)

uk.ac.gla.dcs.renaissance.mg4j.scorers
Class RelevanceLMScorer

java.lang.Object
  it.unimi.dsi.fastutil.ints.AbstractIntIterator
      it.unimi.dsi.mg4j.search.score.AbstractScorer
          it.unimi.dsi.mg4j.search.score.AbstractIndexScorer
              it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer
                  uk.ac.gla.dcs.renaissance.mg4j.scorers.RelevanceLMScorer

All Implemented Interfaces:: it.unimi.dsi.fastutil.ints.IntIterator, it.unimi.dsi.lang.FlyweightPrototype<it.unimi.dsi.mg4j.search.score.Scorer>, it.unimi.dsi.mg4j.search.score.DelegatingScorer, it.unimi.dsi.mg4j.search.score.Scorer, Iterator<Integer>

public class RelevanceLMScorer
extends it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer
implements it.unimi.dsi.mg4j.search.score.DelegatingScorer

A scorer implementing the Lavrenko/Croft language modelling approach (relevance-based language models). See Lavrenko, Victor, and W. Bruce Croft. Relevance-based language models. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, edited by W. B. Croft, D. Harper, D. H. Kraft, and J. Zobel, 120-127. New York: ACM, 2001.

This scorer implements the conditional sampling approach described in the paper above.
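As a rough illustration of conditional sampling as described in the cited paper, the joint probability of a candidate word w and the query terms q_1..q_k is estimated as P(w) * prod_i sum_M P(q_i|M) * P(M|w), where M ranges over the document models used to build the relevance model and P(M|w) is obtained via Bayes' rule. The sketch below is not this class's actual code: the class name, method name and array layout are hypothetical, and the smoothed probabilities P(.|M) are assumed to be precomputed.

```java
// Hypothetical helper; not part of the RelevanceLMScorer API.
final class ConditionalSamplingSketch {

    /**
     * Estimates P(w, q_1..q_k) = P(w) * prod_i sum_M P(q_i|M) * P(M|w).
     *
     * @param pWGivenM pWGivenM[m] = P(w|M_m) for the candidate word w
     * @param pQGivenM pQGivenM[i][m] = P(q_i|M_m) for query term i
     * @param priorM   priorM[m] = P(M_m), assumed to sum to 1
     */
    static double jointProbability(double[] pWGivenM, double[][] pQGivenM, double[] priorM) {
        // Marginal P(w) = sum_M P(w|M) * P(M)
        double pW = 0.0;
        for (int m = 0; m < priorM.length; m++) {
            pW += pWGivenM[m] * priorM[m];
        }
        if (pW == 0.0) {
            return 0.0; // w has zero probability under every model
        }

        double joint = pW;
        for (double[] pQi : pQGivenM) {
            double sum = 0.0;
            for (int m = 0; m < priorM.length; m++) {
                // Bayes' rule: P(M|w) = P(w|M) * P(M) / P(w)
                double pMGivenW = pWGivenM[m] * priorM[m] / pW;
                sum += pQi[m] * pMGivenW;
            }
            joint *= sum;
        }
        return joint;
    }
}
```

With two equally likely models, P(w|M) = {0.2, 0.4} and a single query term with P(q|M) = {0.3, 0.6}, the marginal P(w) is 0.3, the posterior over models is {1/3, 2/3}, and the joint estimate is 0.3 * 0.5 = 0.15.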

Author:: Ingo Frommholz

Field Summary

Fields inherited from class it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer
`currWeight, index2Weight`

Fields inherited from class it.unimi.dsi.mg4j.search.score.AbstractIndexScorer
`currIndex, n`

Fields inherited from class it.unimi.dsi.mg4j.search.score.AbstractScorer
`documentIterator`

Constructor Summary
`RelevanceLMScorer(double lambda, IndexConfiguration[] idxCfgs, it.unimi.dsi.mg4j.document.DocumentCollection collection, Collection<bpiwowar.utils.Pair<Integer,Float>> rfDocuments)` Lambda is a smoothing constant used to calculate P(w|M_d).
`RelevanceLMScorer(IndexConfiguration[] idxCfgs, it.unimi.dsi.mg4j.document.DocumentCollection collection, Collection<bpiwowar.utils.Pair<Integer,Float>> judgedDocuments)` Uses a default lambda of 0.6.

Method Summary
`it.unimi.dsi.mg4j.search.score.Scorer`	`copy()`
`void`	`init(it.unimi.dsi.mg4j.index.Index[] indexes, String[] queryTerms, int[] queryTermIndexNumbers)` Initialises this scorer.
`void`	`init(it.unimi.dsi.mg4j.index.Index[] indexes, String[] queryTerms, int[] queryTermIndexNumbers, Set<Integer> documentUniverse)` Initialises this scorer.
`void`	`init(it.unimi.dsi.mg4j.index.Index index, String[] queryTerms)` If only one index is involved, this simpler init method can safely be used.
`void`	`init(it.unimi.dsi.mg4j.index.Index index, String[] queryTerms, Set<Integer> documentUniverse)` If only one index is involved, this simpler init method can safely be used.
`double`	`score()`
`double`	`score(it.unimi.dsi.mg4j.index.Index index)`
`double`	`score(int documentID)` Returns the score for the given document
`void`	`setRFDocuments(Collection<bpiwowar.utils.Pair<Integer,Float>> judgedDocuments)` To be able to process a new topic with this scorer instance, we need to set a new set of relevant documents coming from the base scorer.
`boolean`	`usesIntervals()`
`void`	`wrap(it.unimi.dsi.mg4j.search.DocumentIterator d)`

Methods inherited from class it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer
`getWeights, setWeights`

Methods inherited from class it.unimi.dsi.mg4j.search.score.AbstractScorer
`hasNext, nextDocument, nextInt, skip`

Methods inherited from class it.unimi.dsi.fastutil.ints.AbstractIntIterator
`next, remove`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Methods inherited from interface it.unimi.dsi.mg4j.search.score.Scorer
`getWeights, nextDocument, nextInt, setWeights`

Methods inherited from interface it.unimi.dsi.fastutil.ints.IntIterator
`skip`

Methods inherited from interface java.util.Iterator
`hasNext, next, remove`

Constructor Detail

RelevanceLMScorer

public RelevanceLMScorer(IndexConfiguration[] idxCfgs,
                         it.unimi.dsi.mg4j.document.DocumentCollection collection,
                         Collection<bpiwowar.utils.Pair<Integer,Float>> judgedDocuments)

Uses a default lambda of 0.6.

Parameters:: idxCfgs - the index configurations of each index, used to obtain some required statistics. Make sure that this array is consistent with the term visitor used in wrap(); collection - the document collection (needed to extract certain statistics); judgedDocuments - the documents used for building the relevance model

RelevanceLMScorer

public RelevanceLMScorer(double lambda,
                         IndexConfiguration[] idxCfgs,
                         it.unimi.dsi.mg4j.document.DocumentCollection collection,
                         Collection<bpiwowar.utils.Pair<Integer,Float>> rfDocuments)

Lambda is a smoothing constant used to calculate P(w|M_d). It can be any value between 0 and 1, inclusive.

Parameters:: lambda - smoothing constant; idxCfgs - used to get some needed statistics; collection - the document collection (needed to extract certain statistics); rfDocuments - the documents used for building the relevance model
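A common reading of such a smoothing constant, following the cited paper, is linear interpolation between the maximum-likelihood document estimate and the collection estimate: P(w|M_d) = lambda * tf(w,d)/|d| + (1 - lambda) * cf(w)/|C|. The sketch below assumes that form; whether this class weights the document estimate with lambda or with 1 - lambda is not stated here, and the class and parameter names are hypothetical.

```java
// Hypothetical helper; not part of the RelevanceLMScorer API.
final class SmoothingSketch {

    /**
     * Linearly interpolated estimate of P(w|M_d).
     * Assumption: lambda weights the document estimate, (1 - lambda) the collection estimate.
     */
    static double smoothed(double lambda, long termFreq, long docLength,
                           long collFreq, long collLength) {
        if (lambda < 0.0 || lambda > 1.0) {
            throw new IllegalArgumentException("lambda must lie in [0, 1], got " + lambda);
        }
        // Maximum-likelihood estimates; guard against empty documents or collections.
        double pDoc = docLength == 0 ? 0.0 : (double) termFreq / docLength;
        double pColl = collLength == 0 ? 0.0 : (double) collFreq / collLength;
        return lambda * pDoc + (1.0 - lambda) * pColl;
    }
}
```

For example, with lambda = 0.6, a term occurring 2 times in a 10-term document and 5 times in a 100-term collection, the estimate is 0.6 * 0.2 + 0.4 * 0.05 = 0.14.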

Method Detail

setRFDocuments

public void setRFDocuments(Collection<bpiwowar.utils.Pair<Integer,Float>> judgedDocuments)

To process a new topic with this scorer instance, a new set of relevant documents coming from the base scorer must be set. Make sure the same index and collection are used; if this is not possible, create a new scorer instance instead.

Parameters:: judgedDocuments - the documents that were judged

score

public double score(it.unimi.dsi.mg4j.index.Index index)

Specified by:: score in interface it.unimi.dsi.mg4j.search.score.Scorer

score

public double score()
             throws IOException

Specified by:: score in interface it.unimi.dsi.mg4j.search.score.Scorer
Overrides:: score in class it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer

Throws:: IOException

score

public double score(int documentID)
             throws IOException

Returns the score for the given document

Parameters:: documentID - the document ID
Returns:: the score
Throws:: IOException

usesIntervals

public boolean usesIntervals()

Specified by:: usesIntervals in interface it.unimi.dsi.mg4j.search.score.Scorer

copy

public it.unimi.dsi.mg4j.search.score.Scorer copy()

Specified by:: copy in interface it.unimi.dsi.lang.FlyweightPrototype<it.unimi.dsi.mg4j.search.score.Scorer>
Specified by:: copy in interface it.unimi.dsi.mg4j.search.score.Scorer

wrap

public void wrap(it.unimi.dsi.mg4j.search.DocumentIterator d)
          throws IOException

Specified by:: wrap in interface it.unimi.dsi.mg4j.search.score.Scorer
Overrides:: wrap in class it.unimi.dsi.mg4j.search.score.AbstractWeightedScorer

Throws:: IOException

init

public void init(it.unimi.dsi.mg4j.index.Index index,
                 String[] queryTerms,
                 Set<Integer> documentUniverse)
          throws IOException

If only one index is involved, this simpler init method can safely be used.

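Parameters:: index - the index; queryTerms - the query terms; documentUniverse - the document universe to consider. If set to null, all documents and terms in the index are considered.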
Throws:: IOException

init

public void init(it.unimi.dsi.mg4j.index.Index index,
                 String[] queryTerms)
          throws IOException

If only one index is involved, this simpler init method can safely be used.

Parameters:: index - the index; queryTerms - the query terms
Throws:: IOException

init

public void init(it.unimi.dsi.mg4j.index.Index[] indexes,
                 String[] queryTerms,
                 int[] queryTermIndexNumbers)
          throws IOException

Initialises this scorer. To be compatible with an MG4J scorer, which may serve several indexes, query terms are given as term-index pairs. The whole collection is considered here.

Parameters:: indexes - an array of indexes; queryTerms - the query terms (in MG4J, the term part of a term-index pair); queryTermIndexNumbers - the query term index numbers (in MG4J, the index part of a term-index pair). These must be valid positions in the indexes array.
Throws:: IOException
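The term-index pairing described above can be read as two parallel arrays: queryTerms[i] is looked up in indexes[queryTermIndexNumbers[i]]. The following sketch shows a sanity check a caller might run before calling init. It is an illustration only: the helper is hypothetical, and it assumes that both arrays have the same length, which the documentation above implies but does not state.

```java
// Hypothetical helper; not part of the RelevanceLMScorer API.
final class TermIndexPairCheck {

    /**
     * Verifies that every query term is paired with a valid position in the index array.
     *
     * @param numIndexes            length of the indexes array passed to init
     * @param queryTerms            the term part of each term-index pair
     * @param queryTermIndexNumbers the index part of each term-index pair
     */
    static void check(int numIndexes, String[] queryTerms, int[] queryTermIndexNumbers) {
        // Assumption: the two arrays are parallel, one entry per pair.
        if (queryTerms.length != queryTermIndexNumbers.length) {
            throw new IllegalArgumentException("each query term needs exactly one index number");
        }
        for (int i = 0; i < queryTermIndexNumbers.length; i++) {
            int idx = queryTermIndexNumbers[i];
            if (idx < 0 || idx >= numIndexes) {
                throw new IllegalArgumentException(
                        "term '" + queryTerms[i] + "' refers to invalid index number " + idx);
            }
        }
    }
}
```

A call such as `TermIndexPairCheck.check(indexes.length, queryTerms, queryTermIndexNumbers)` would then fail fast on a mismatched pair instead of surfacing later as an out-of-bounds error.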

init

public void init(it.unimi.dsi.mg4j.index.Index[] indexes,
                 String[] queryTerms,
                 int[] queryTermIndexNumbers,
                 Set<Integer> documentUniverse)
          throws IOException

Initialises this scorer. To be compatible with an MG4J scorer, which may serve several indexes, query terms are given as term-index pairs.

Parameters:: indexes - an array of indexes; queryTerms - the query terms (in MG4J, the term part of a term-index pair); queryTermIndexNumbers - the query term index numbers (in MG4J, the index part of a term-index pair). These must be valid positions in the indexes array; documentUniverse - the document universe to consider. If set to null, all documents and terms in the index are considered.
Throws:: IOException
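The null convention for documentUniverse can be pictured as a membership test in which a null set admits every document. This is only a sketch of that convention under a hypothetical helper name; how the scorer applies the universe internally is not documented here.

```java
import java.util.Set;

// Hypothetical helper; not part of the RelevanceLMScorer API.
final class DocumentUniverseSketch {

    /**
     * Returns true if the document should be considered.
     * A null universe means "no restriction", i.e. the whole collection.
     */
    static boolean inUniverse(Set<Integer> documentUniverse, int documentID) {
        return documentUniverse == null || documentUniverse.contains(documentID);
    }
}
```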
