Relevance feedback and query expansion
Query refinement
Global methods include:
• Query expansion/reformulation with a thesaurus or WordNet
• Query expansion via automatic thesaurus generation
• Techniques like spelling correction
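Thesaurus-based query expansion can be sketched in a few lines. This is only an illustration: the thesaurus below is a hand-built dictionary invented for the example, and `expand_query` is a hypothetical helper, not part of any library. A real system might draw its synonyms from WordNet or from an automatically generated thesaurus instead.

```python
# Toy thesaurus, invented for illustration only.
THESAURUS = {
    "car": ["automobile", "vehicle"],
    "cheap": ["inexpensive"],
}

def expand_query(query, thesaurus=THESAURUS):
    """Return the query terms followed by their thesaurus synonyms, without duplicates."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for synonym in thesaurus.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded

print(expand_query("cheap car rental"))
# ['cheap', 'car', 'rental', 'inexpensive', 'automobile', 'vehicle']
```

The expansion does not depend on any retrieved results, which is what makes this a global method.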
Local methods include:
• Relevance feedback
• Pseudo relevance feedback, also known as blind relevance feedback
• (Global) indirect relevance feedback
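Pseudo relevance feedback replaces the user's judgements with the assumption that the top-ranked documents are relevant. The sketch below shows one simple variant under stated assumptions: the scoring function is a toy term-count overlap standing in for a real ranking function, and the names `score` and `pseudo_relevance_feedback` are hypothetical.

```python
from collections import Counter

def score(query_terms, doc_terms):
    """Toy ranking score: how often the query terms occur in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)

def pseudo_relevance_feedback(query_terms, docs, k=2, n_new_terms=2):
    """Expand the query with frequent terms from the top-k ranked documents.

    docs is a list of token lists. The top k documents are assumed
    relevant without asking the user (the "blind" assumption).
    """
    ranked = sorted(docs, key=lambda d: score(query_terms, d), reverse=True)
    pooled = Counter()
    for doc in ranked[:k]:
        # Count only terms that are not already in the query.
        pooled.update(t for t in doc if t not in query_terms)
    new_terms = [t for t, _ in pooled.most_common(n_new_terms)]
    return list(query_terms) + new_terms

docs = [
    "jaguar car speed engine".split(),
    "jaguar car engine review".split(),
    "jaguar cat habitat jungle".split(),
]
print(pseudo_relevance_feedback(["jaguar", "car"], docs, k=2, n_new_terms=1))
# ['jaguar', 'car', 'engine']
```

If the initial ranking is poor, the assumed-relevant documents pull the query in the wrong direction, which is the main risk of the blind variant.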
The idea of relevance feedback (RF) is to involve the user in the retrieval process so as to improve the final result set.
The basic procedure is:
• The user issues a (short, simple) query.
• The system returns an initial set of retrieval results.
• The user marks some returned documents as relevant or non-relevant.
• The system computes a better representation of the information need based on the user feedback.
• The system displays a revised set of retrieval results.
Rocchio algorithm
The Rocchio algorithm is the classic algorithm for implementing relevance feedback. It models a way of incorporating relevance feedback information into the vector space model.
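In its usual form, the Rocchio update moves the query vector toward the centroid of the documents marked relevant and away from the centroid of those marked non-relevant: modified query = alpha * original query + beta * centroid(relevant) - gamma * centroid(non-relevant), with negative components set to zero. A minimal sketch follows. The default weights (1.0, 0.75, 0.15) are commonly quoted starting values and are used here as an assumption, and the three-term vocabulary is invented for the example.

```python
def centroid(vectors, dim):
    """Mean of a list of equal-length vectors; the zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return the modified query vector.

    q_m = alpha * q_0 + beta * centroid(relevant) - gamma * centroid(nonrelevant)
    Negative components are clipped to zero.
    """
    dim = len(query)
    rel_c = centroid(relevant, dim)
    non_c = centroid(nonrelevant, dim)
    return [
        max(0.0, alpha * q + beta * r - gamma * n)
        for q, r, n in zip(query, rel_c, non_c)
    ]

# Term order (hypothetical vocabulary): ["jaguar", "car", "cat"]
q0 = [1.0, 0.0, 0.0]
relevant = [[1.0, 1.0, 0.0], [1.0, 3.0, 0.0]]
nonrelevant = [[1.0, 0.0, 2.0]]
print(rocchio(q0, relevant, nonrelevant))
```

With these toy vectors the weight on "car" rises from 0 to 1.5, because both relevant documents contain it, while the negative weight that "cat" would receive is clipped to 0.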
Relevance feedback can improve both recall and precision.
Probabilistic relevance feedback
Rather than reweighting the query in a vector space, if a user has told us some relevant and non-relevant documents, then we can proceed to build a classifier.
The term weights estimated this way use only collection statistics and information about the term distribution within the documents judged relevant. They preserve no memory of the original query.
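One way to make this concrete is to estimate, for each term, how likely it is to appear in a relevant versus a non-relevant document, and to combine the two estimates into a log-odds weight. The sketch below is an illustration under assumptions: the add-half smoothing is one common choice rather than the only one, unjudged documents are treated as non-relevant, and `term_weight` and its parameter names are hypothetical.

```python
import math

def term_weight(vr_t, vr, df_t, n_docs):
    """Log-odds weight of a term estimated from relevance judgements.

    vr_t   : number of judged-relevant documents containing the term
    vr     : number of judged-relevant documents
    df_t   : number of documents in the collection containing the term
    n_docs : number of documents in the collection
    The 0.5 and 1.0 terms are add-half smoothing (an assumption) that
    keeps both probabilities strictly between 0 and 1.
    """
    # Estimated probability that the term occurs in a relevant document.
    p_t = (vr_t + 0.5) / (vr + 1.0)
    # Estimated probability that the term occurs in a non-relevant document.
    u_t = (df_t - vr_t + 0.5) / (n_docs - vr + 1.0)
    return math.log(p_t / (1.0 - p_t)) + math.log((1.0 - u_t) / u_t)

# A rare term found in 4 of 5 relevant documents gets a positive weight.
rare = term_weight(vr_t=4, vr=5, df_t=10, n_docs=1000)
# A common term found in only 1 of 5 relevant documents gets a negative weight.
common = term_weight(vr_t=1, vr=5, df_t=500, n_docs=1000)
print(rare > 0 > common)
```

The function takes only counts from the collection and from the judged documents. The original query never enters the computation, which is the sense in which the weights keep no memory of it.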
Cases where relevance feedback alone is not sufficient include:
• Misspellings.
• Cross-language information retrieval.
• Mismatch of searcher’s vocabulary versus collection vocabulary.
Relevance feedback also requires the relevant documents to be similar to each other, so that they form a single cluster the revised query can move toward.