Evaluation: Unranked Retrieval Evaluation
A combined measure: F
▪ The combined measure that assesses the precision/recall tradeoff is the F measure (weighted harmonic mean):
  $$F = \frac{1}{\alpha \frac{1}{P} + (1 - \alpha) \frac{1}{R}} = \frac{(\beta^2 + 1) P R}{\beta^2 P + R}$$
▪ People usually use the balanced F1 measure
▪ i.e., with β = 1 or α = ½
▪ Harmonic mean is a conservative average
▪ See CJ van Rijsbergen, Information Retrieval
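As a rough illustration of the F measure on this slide, the following minimal sketch computes it in both the β form and the α form. The function names `f_measure` and `f_measure_alpha` are hypothetical, chosen only for this example. Returning 0 when precision or recall is 0 is a convention assumed here, not something the slide states. The two forms agree when β² = (1 − α)/α, so β = 1 corresponds to α = ½.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F measure in the beta form: (beta^2 + 1) * P * R / (beta^2 * P + R)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Assumed convention: F is 0 when both precision and recall are 0.
        return 0.0
    return (beta_sq + 1) * precision * recall / denominator


def f_measure_alpha(precision: float, recall: float, alpha: float = 0.5) -> float:
    """F measure in the alpha form: 1 / (alpha / P + (1 - alpha) / R)."""
    if precision <= 0 or recall <= 0:
        # Assumed convention: avoid division by zero by returning 0.
        return 0.0
    return 1.0 / (alpha / precision + (1 - alpha) / recall)


if __name__ == "__main__":
    # Balanced F1: beta = 1, equivalently alpha = 1/2.
    print(f_measure(0.9, 0.1))
    print(f_measure_alpha(0.9, 0.1, alpha=0.5))
```

With P = 0.9 and R = 0.1, the balanced F1 is 0.18, far below the arithmetic mean of 0.5, which shows why the harmonic mean is called a conservative average.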
F1 and other averages
[Figure: Combined Measures. Curves for the Minimum, Maximum, Arithmetic, Geometric, and Harmonic means of precision and recall; x-axis: Precision from 0 to 100 (Recall fixed at 70%); y-axis: combined measure from 0 to 100.]
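The comparison in this figure can be approximated numerically. The sketch below, with a hypothetical helper `averages`, computes the five combined measures for a given precision with recall fixed at 70%. Values are on a 0 to 1 scale rather than the 0 to 100 scale of the plot. Treating the harmonic mean as 0 when both inputs are 0 is an assumption of this example.

```python
import math


def averages(precision: float, recall: float) -> dict:
    """Return the five combined measures compared in the figure."""
    total = precision + recall
    return {
        "minimum": min(precision, recall),
        "maximum": max(precision, recall),
        "arithmetic": total / 2,
        "geometric": math.sqrt(precision * recall),
        # Harmonic mean equals balanced F1; assumed to be 0 when P + R == 0.
        "harmonic": 2 * precision * recall / total if total > 0 else 0.0,
    }


if __name__ == "__main__":
    recall = 0.7  # recall fixed at 70%, as in the figure
    for precision in (0.1, 0.3, 0.5, 0.7, 0.9):
        row = averages(precision, recall)
        print(precision, {name: round(value, 3) for name, value in row.items()})
```

For any precision, the values satisfy minimum ≤ harmonic ≤ geometric ≤ arithmetic ≤ maximum, so the harmonic mean (F1) stays closest to the smaller of the two inputs.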