1	Fundamentals of Information Retrieval Module 3 Min-Yen KAN Evaluation Metrics*
2	References for Today: Witten, Moffat and Bell (1999), Managing Gigabytes, Chapters 3-5.
3	Evaluation Contingency Table
4	Sensitivity, specificity, positive and negative predictive value
5	Evaluation Metrics: Precision = Positive Predictive Value: “ratio of the number of relevant documents retrieved over the total number of documents retrieved”. How much extra stuff did you get? Recall = Sensitivity: “ratio of relevant documents retrieved for a given query over the number of relevant documents for that query in the database”. How much did you miss?
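The four measures named on the previous slides can all be read off a 2x2 contingency table. The following is a minimal sketch, assuming the usual cell names (true/false positives and negatives); the function name and the counts are illustrative only and do not come from the slides. It assumes every denominator is non-zero.

```python
def contingency_measures(tp, fp, fn, tn):
    """Return the four measures derived from a 2x2 contingency table.

    tp: relevant and retrieved        fp: non-relevant but retrieved
    fn: relevant but not retrieved    tn: non-relevant and not retrieved
    """
    return {
        "sensitivity": tp / (tp + fn),  # same quantity as recall
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),          # same quantity as precision
        "npv": tn / (tn + fn),
    }


# Illustrative counts (an assumption): 10 documents retrieved, 4 of them
# relevant, in a 25-document collection holding 10 relevant documents.
measures = contingency_measures(tp=4, fp=6, fn=6, tn=9)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Precision and recall ignore the true-negative cell entirely, whereas specificity and negative predictive value depend on it.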
6	P/R: an example
Rank  Decision  R@r   P@r
1     R         10%   100%
2               10%   50%
3               10%   33%
4     R         20%   50%
5     R         30%   60%
6               30%   50%
7     R         40%   57%
8               40%   50%
9               40%   44%
10              40%   40%
11              40%   36%
12    R         50%   42%
13    R         60%   46%
14    R         70%   50%
…
22    R         100%  45%
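The recall and precision columns of this example can be reproduced with a short running count. This is a sketch only: the function name is hypothetical, and the total of 10 relevant documents is inferred from the table (the first relevant hit yields 10% recall). Only the 14 ranks listed explicitly are used; the elided ranks are left out.

```python
def precision_recall_at_ranks(decisions, total_relevant):
    """decisions: booleans in rank order, True where the document is relevant.

    Returns a list of (rank, recall_at_rank, precision_at_rank) tuples.
    """
    rows = []
    hits = 0
    for rank, is_relevant in enumerate(decisions, start=1):
        hits += is_relevant  # True counts as 1, False as 0
        rows.append((rank, hits / total_relevant, hits / rank))
    return rows


# Relevant documents sit at ranks 1, 4, 5, 7, 12, 13 and 14 in the example.
relevant_ranks = {1, 4, 5, 7, 12, 13, 14}
decisions = [rank in relevant_ranks for rank in range(1, 15)]
rows = precision_recall_at_ranks(decisions, total_relevant=10)

for rank, recall, precision in rows:
    print(f"{rank:>2}  R@r={recall:.0%}  P@r={precision:.0%}")
```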
7	Precision / Recall: Interpolated precision gives a non-increasing curve, but it doesn’t factor in the size of the corpus. The previous example gives 40% precision at rank 10 on a corpus of 25 docs, and also 40% on a corpus of 2.5 M docs.
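Interpolated precision at a recall level is commonly defined as the highest precision observed at that recall level or any higher one, which is what makes the curve non-increasing. The sketch below assumes that definition; the function name is hypothetical, and the points are rebuilt from the 14 explicitly listed ranks of the earlier example (10 relevant documents in total, as inferred from its recall column).

```python
def interpolated_precision(points, recall_level):
    """Highest precision at any recall >= recall_level (0.0 if none).

    points: iterable of (recall, precision) pairs, one per rank.
    """
    return max((p for r, p in points if r >= recall_level), default=0.0)


# Rebuild (recall, precision) for ranks 1..14 of the example ranking.
relevant_ranks = {1, 4, 5, 7, 12, 13, 14}
points = []
hits = 0
for rank in range(1, 15):
    hits += rank in relevant_ranks
    points.append((hits / 10, hits / rank))

# Interpolate at recall 10%, 20%, ..., 70% (the levels these ranks reach).
curve = [(k / 10, interpolated_precision(points, k / 10)) for k in range(1, 8)]
for recall, precision in curve:
    print(f"recall {recall:.0%}: interpolated precision {precision:.0%}")
```

Nothing in this calculation refers to the number of documents in the collection, which is the limitation the slide points out.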
8	Factoring in size of a corpus: Look at how P/R or Sn/Sp (sensitivity/specificity) varies as a function of rank. Choose a number of different ranks and calculate P/R or Sn/Sp at each; these correspond to vertical lines on the graphs at right. Plot Sn vs. 1-Sp to get points for the ROC curve, then interpolate the curve.
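To see how corpus size enters through specificity, the sketch below computes (1-Sp, Sn) points at every rank cutoff for the first 14 ranks of the earlier example. The function name is hypothetical; the corpus sizes 25 and 2,500,000 are the two mentioned on the previous slide, and the count of 10 relevant documents is inferred from the example's recall column.

```python
def roc_points(decisions, total_relevant, corpus_size):
    """Return (1 - specificity, sensitivity) after each rank cutoff.

    decisions: booleans in rank order, True where the document is relevant.
    """
    total_nonrelevant = corpus_size - total_relevant
    points = [(0.0, 0.0)]  # nothing retrieved yet
    tp = fp = 0
    for is_relevant in decisions:
        if is_relevant:
            tp += 1
        else:
            fp += 1
        # 1 - Sp = FP / (FP + TN); Sn = TP / (TP + FN)
        points.append((fp / total_nonrelevant, tp / total_relevant))
    return points


relevant_ranks = {1, 4, 5, 7, 12, 13, 14}
decisions = [rank in relevant_ranks for rank in range(1, 15)]

small = roc_points(decisions, total_relevant=10, corpus_size=25)
large = roc_points(decisions, total_relevant=10, corpus_size=2_500_000)

# Same ranking, same precision at rank 10, very different 1 - Sp.
print("25 docs,    rank 10:", small[10])
print("2.5 M docs, rank 10:", large[10])
```

At rank 10 the sensitivity is identical for both corpus sizes, but the false positive rate drops from 0.4 to a few millionths on the larger corpus.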
9	ROC Curve: Look at the probability or rate of detection. What does the diagonal represent? How do we compare ROC curves versus each other?
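One common way to approach both questions is the area under the curve: the diagonal corresponds to a ranking that does no better than chance and encloses an area of 0.5, while a perfect ranking encloses 1. The sketch below uses the trapezoid rule on (1-Sp, Sn) points; the function name and the two toy curves are assumptions chosen for illustration.

```python
def area_under_curve(points):
    """Trapezoid-rule area under a curve.

    points: (1 - specificity, sensitivity) pairs sorted by the first
    coordinate, running from (0, 0) to (1, 1).
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


diagonal = [(0.0, 0.0), (1.0, 1.0)]             # chance-level ranking
perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # all relevant docs first

print(area_under_curve(diagonal))
print(area_under_curve(perfect))
```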
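10	Getting a single number: 11 pt average: average the precision at each .1 interval in recall (0.0, 0.1, …, 1.0). Precision at a recall point (% or absolute). F Measure: weighted harmonic mean of precision P and recall R, $F_b = \frac{(1 + b^2) P R}{b^2 P + R}$ (e.g., under this standard form F₃ weights recall more heavily and F₀.₅ weights precision more heavily). Area under ROC curve (Accuracy): 1 = perfect, .9 excellent, .5 worthless.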
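Two of these single-number summaries are easy to sketch in code. The F measure below assumes the standard van Rijsbergen form, in which a weight above 1 favours recall and a weight below 1 favours precision; the 11 pt average assumes interpolated precision (the highest precision at that recall level or above). Function names and the sample inputs are hypothetical.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (standard F_beta)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def eleven_point_average(points):
    """Mean interpolated precision at recall 0.0, 0.1, ..., 1.0.

    points: iterable of (recall, precision) pairs for one ranking.
    """
    points = list(points)
    total = 0.0
    for k in range(11):
        level = k / 10
        total += max((p for r, p in points if r >= level), default=0.0)
    return total / 11


# Precision 50% and recall 70%, as at rank 14 of the earlier example.
f1 = f_measure(0.5, 0.7)
f3 = f_measure(0.5, 0.7, beta=3)      # pulled towards recall
f_half = f_measure(0.5, 0.7, beta=0.5)  # pulled towards precision
print(f"F1={f1:.3f}  F3={f3:.3f}  F0.5={f_half:.3f}")

# Hypothetical two-point ranking: 50% recall at 100% precision,
# then 100% recall at 50% precision.
toy_average = eleven_point_average([(0.5, 1.0), (1.0, 0.5)])
print(f"11 pt average = {toy_average:.3f}")
```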