Probabilistic Information Retrieval The document ranking problem We have a collection of documents User issues a query a list of documents needs to be returned Ranking method is core of an IR system In what order do we present documents to the user? We want the best document to be first second best second, etc Idea: Rank by probability of relevance of the document w.r.t information need P(relevant document, query
Probabilistic Information Retrieval 6 The document ranking problem ▪ We have a collection of documents ▪ User issues a query ▪ A list of documents needs to be returned ▪ Ranking method is core of an IR system: ▪ In what order do we present documents to the user? ▪ We want the “best” document to be first, second best second, etc…. ▪ Idea: Rank by probability of relevance of the document w.r.t. information need ▪ P(relevant|documenti , query)
Probabilistic Information Retrieval Probability Basics Recall a few probability basics or events a and b Bayes Rule p(a,b)=p(anb)=p(a bp(b)=p((a p(a bp(b)=p(b ap(ay P(alb)=p(bla)p(a=sp(blapla prior p(b)∑ pbx(x) Posterion x=aa Odds: O(a)=p(a)
Probabilistic Information Retrieval 7 Recall a few probability basics ▪ For events a and b: ▪ Bayes’ Rule ▪ Odds: = = = = = = = x a a p b x p x p b a p a p b p b a p a p a b p a b p b p b a p a p a b p a b p a b p b p b a p a , ( | ) ( ) ( | ) ( ) ( ) ( | ) ( ) ( | ) ( | ) ( ) ( | ) ( ) ( , ) ( ) ( | ) ( ) ( | ) ( ) 1 ( ) ( ) ( ) ( ) ( ) p a p a p a p a O a − = = Posterior Prior Probability Basics
Probabilistic Information Retrieval Probability Ranking Principle The probability ranking principle If a reference retrieval system s response to each request is a ranking of the documents in the collection in order of decreasing probability of relevance to the user who submitted the request, where the probabilities are estimated as accurately as possible on the basis of whatever data have been made available to the system for this purpose, the overall effectiveness of the system to its user will be the best that is obtainable on the basis of those data [1960s/1970s]S Robertson, W.S. Cooper, M.E. Maron van Rijsbergen (1979: 113 ); Manning Schutze(1999: 538)
Probabilistic Information Retrieval 8 The Probability Ranking Principle “If a reference retrieval system's response to each request is a ranking of the documents in the collection in order of decreasing probability of relevance to the user who submitted the request, where the probabilities are estimated as accurately as possible on the basis of whatever data have been made available to the system for this purpose, the overall effectiveness of the system to its user will be the best that is obtainable on the basis of those data.” ▪ [1960s/1970s] S. Robertson, W.S. Cooper, M.E. Maron; van Rijsbergen (1979:113); Manning & Schütze (1999:538) Probability Ranking Principle
Probabilistic Information Retrieval Probability Ranking Principle Probability ranking principle Let x be a document in the collection Let r represent relevance of a document w r t. a given(fixed) query and let Nr represent non-relevance. R=0, 1)vS NR/R Need to find p(rx)-probability that a document x is relevant. P(RIx)=p(r RP(R) p(R, P(NR)-prior probability p(x) of retrieving a(non) relevant document P(NRLx-p(xINRP(nr p(r x)+p(nr x)= p(xR),p(x NR)-probability that if a relevant(non-relevant) document is retrieved it is x
Probabilistic Information Retrieval 9 Probability Ranking Principle Let x be a document in the collection. Let R represent relevance of a document w.r.t. a given (fixed) query and let NR represent non-relevance. ( ) ( | ) ( ) ( | ) ( ) ( | ) ( ) ( | ) p x p x NR p NR p NR x p x p x R p R p R x = = p(x|R), p(x|NR) - probability that if a relevant (non-relevant) document is retrieved, it is x. Need to find p(R|x) - probability that a document x is relevant. p(R),p(NR) - prior probability of retrieving a (non) relevant document p(R | x) + p(NR | x) =1 R={0,1} vs. NR/R Probability Ranking Principle
Probabilistic Information Retrieval Probability Ranking Principle Probability ranking principle(Prp Simple case: no selection costs or other utility concerns that would differentially weight errors BayesOptimal Decision Rule x is relevant iff p(rx)>p(nrx PRP in action: Rank all documents by p(rix Theorem Using the prp is optimal, in that it minimizes the loss (Bayes risk under 1/0 loss Provable if all probabilities correct, etc. [e. g ripley 1996]
Probabilistic Information Retrieval 10 Probability Ranking Principle (PRP) ▪ Simple case: no selection costs or other utility concerns that would differentially weight errors ▪ Bayes’ Optimal Decision Rule ▪ x is relevant iff p(R|x) > p(NR|x) ▪ PRP in action: Rank all documents by p(R|x) ▪ Theorem: ▪ Using the PRP is optimal, in that it minimizes the loss (Bayes risk) under 1/0 loss ▪ Provable if all probabilities correct, etc. [e.g., Ripley 1996] Probability Ranking Principle