Filtering Clicks
• Click deviation CD(d, p) for a result d in position p:
  $CD(d, p) = O(d, p) - E(p)$
  – O(d, p): observed click frequency for a document in a rank position p over all instances of a given query
  – E(p): expected click frequency at rank p averaged across all queries
17/N
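As a rough illustration of the click deviation CD(d, p) = O(d, p) - E(p), the sketch below computes it from a toy click log. The log format (tuples of query, document, position, clicked), the function name `click_deviations`, and the choice to estimate E(p) by pooling all impressions at rank p are assumptions made for this example, not something the slide specifies.

```python
from collections import defaultdict

def click_deviations(impressions):
    """Return {(query, doc, pos): CD} for a hypothetical click log.

    impressions: iterable of (query, doc, pos, clicked) tuples.
    """
    observed = defaultdict(lambda: [0, 0])  # (query, doc, pos) -> [clicks, shown]
    expected = defaultdict(lambda: [0, 0])  # pos -> [clicks, shown], all queries pooled
    for query, doc, pos, clicked in impressions:
        for counter in (observed[(query, doc, pos)], expected[pos]):
            counter[0] += int(clicked)
            counter[1] += 1

    deviations = {}
    for (query, doc, pos), (clicks, shown) in observed.items():
        pos_clicks, pos_shown = expected[pos]
        o = clicks / shown          # O(d, p) for this query
        e = pos_clicks / pos_shown  # E(p), assumed pooled estimate
        deviations[(query, doc, pos)] = o - e
    return deviations

# Toy log: invented data for illustration only.
log = [
    ("q1", "d1", 1, True),
    ("q1", "d1", 1, True),
    ("q2", "d2", 1, False),
    ("q2", "d2", 1, True),
    ("q1", "d3", 2, False),
    ("q2", "d4", 2, True),
]
cd = click_deviations(log)
for key in sorted(cd):
    print(key, cd[key])
```

A positive deviation means the document attracted more clicks than is typical for its rank; a negative one means fewer. A filter could then keep only clicks whose deviation exceeds some threshold, which would have to be chosen separately.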
08: Evaluating Search Engines 8.1 Why Evaluate 8.2 The Evaluation Corpus 8.3 Logging 8.4 Effectiveness Metrics 8.5 Efficiency Metrics 8.6 Training, Testing, and Statistics 18/N
Effectiveness Measures
A is the set of relevant documents, B is the set of retrieved documents

                 Relevant                 Non-Relevant
Retrieved        $A \cap B$               $\overline{A} \cap B$
Not retrieved    $A \cap \overline{B}$    $\overline{A} \cap \overline{B}$

$\text{Recall} = \frac{|A \cap B|}{|A|}$
$\text{Precision} = \frac{|A \cap B|}{|B|}$
19/N
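With A as the relevant set and B as the retrieved set, recall is |A ∩ B| / |A| and precision is |A ∩ B| / |B|. The sketch below computes both with Python sets. The function name `precision_recall`, the document IDs, and the convention of returning 0.0 when a denominator is empty are assumptions made for this example.

```python
def precision_recall(relevant, retrieved):
    """Set-based precision and recall for one query."""
    hits = len(relevant & retrieved)  # |A ∩ B|
    # Returning 0.0 for an empty set is a convention assumed here.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Invented judgments and results, for illustration only.
relevant = {"d1", "d2", "d3", "d4"}   # A
retrieved = {"d2", "d3", "d5"}        # B
precision, recall = precision_recall(relevant, retrieved)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Here two of the three retrieved documents are relevant, and two of the four relevant documents were retrieved.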
Classification Errors
• False Positive (Type I error)
  – a non-relevant document is retrieved
  – $\text{Fallout} = \frac{|\overline{A} \cap B|}{|\overline{A}|}$
• False Negative (Type II error)
  – a relevant document is not retrieved
  – Recall is related to the false negatives
20/N
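The two error types can be read directly off the sets: false positives are retrieved non-relevant documents, and false negatives are relevant documents that were not retrieved. The sketch below assumes the whole collection is available as a set so that the non-relevant set can be formed. The function name `classification_errors` and the toy collection are hypothetical.

```python
def classification_errors(collection, relevant, retrieved):
    """Return (false_positives, false_negatives, fallout) for one query."""
    non_relevant = collection - relevant          # complement of A
    false_positives = retrieved & non_relevant    # Type I errors
    false_negatives = relevant - retrieved        # Type II errors
    # Fallout: fraction of non-relevant documents that were retrieved.
    # Returning 0.0 when there are no non-relevant documents is an assumed convention.
    fallout = len(false_positives) / len(non_relevant) if non_relevant else 0.0
    return false_positives, false_negatives, fallout

# Invented ten-document collection, for illustration only.
collection = {f"d{i}" for i in range(1, 11)}
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d2", "d3", "d5"}

fp, fn, fallout = classification_errors(collection, relevant, retrieved)
# Recall equals 1 minus the fraction of relevant documents that were missed.
recall = 1 - len(fn) / len(relevant)
print(sorted(fp), sorted(fn), round(fallout, 3), recall)
```

The last line of the computation makes the link between recall and false negatives explicit: every missed relevant document lowers recall by 1/|A|.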