1
Fundamentals of Information Retrieval
  • Module 3
  • Min-Yen KAN
  • Evaluation Metrics*


2
References for Today
  • Witten, Moffat and Bell (1999). Managing Gigabytes, Chapters 3-5.
3
Evaluation Contingency Table
4
Sensitivity, specificity, positive and negative predictive value
5
Evaluation Metrics
  • Precision = Positive Predictive Value
    • “ratio of the number of relevant documents retrieved over the total number of documents retrieved”
    • how much extra stuff did you get?
  • Recall = Sensitivity
    • “ratio of relevant documents retrieved for a given query over the number of relevant documents for that query in the database”
    • how much did you miss?
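The four metrics named on the last two slides can be sketched directly from the cells of a contingency table. This is a minimal illustration only: the function name, the argument names and the counts are made up for the example, and all denominators are assumed to be non-zero.

```python
def contingency_metrics(tp, fp, fn, tn):
    """Return the four metrics from contingency-table counts.

    tp: relevant and retrieved        fp: non-relevant but retrieved
    fn: relevant but not retrieved    tn: non-relevant and not retrieved
    """
    return {
        "precision": tp / (tp + fp),    # = positive predictive value
        "recall": tp / (tp + fn),       # = sensitivity
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Illustrative counts (an assumption): 10 documents retrieved, 4 of them
# relevant, 10 relevant documents in a 25-document collection.
m = contingency_metrics(tp=4, fp=6, fn=6, tn=9)
```

Note that only specificity and negative predictive value use `tn`, the one cell that grows with the size of the collection.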
6
P/R: an example
    Rank  Decision  R@r   P@r
       1  R          10%  100%
       2             10%   50%
       3             10%   33%
       4  R          20%   50%
       5  R          30%   60%
       6             30%   50%
       7  R          40%   57%
       8             40%   50%
       9             40%   44%
      10             40%   40%
      11             40%   36%
      12  R          50%   42%
      13  R          60%   46%
      14  R          70%   50%
       …
      22  R         100%   45%
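The R@r and P@r columns above can be recomputed from the Decision column alone. The sketch below assumes 10 relevant documents in total (implied by R@1 = 10%) and covers only ranks 1 to 14, because ranks 15 to 21 are elided in the table. The function and constant names are illustrative.

```python
# Ranks marked "R" among the first 14 results of the example.
RELEVANT_RANKS = {1, 4, 5, 7, 12, 13, 14}
TOTAL_RELEVANT = 10  # assumption: implied by R@1 = 10%


def recall_precision_at_ranks(relevant_ranks, total_relevant, max_rank):
    """Return a list of (rank, recall, precision) for ranks 1..max_rank."""
    rows = []
    hits = 0
    for rank in range(1, max_rank + 1):
        if rank in relevant_ranks:
            hits += 1
        # Recall divides by all relevant documents, precision by the rank.
        rows.append((rank, hits / total_relevant, hits / rank))
    return rows


rows = recall_precision_at_ranks(RELEVANT_RANKS, TOTAL_RELEVANT, 14)
for rank, recall, precision in rows:
    print(f"{rank:>4}  {recall:>4.0%}  {precision:>4.0%}")
```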
7
Precision / Recall
  • Interpolated precision gives a non-increasing curve

  • But precision doesn’t factor in the size of the corpus
    • Previous example (top 10 results) on a corpus of 25 docs = 40% precision
    • Same ranking on a corpus of 2.5 M docs = also 40%
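One common way to obtain the interpolated curve mentioned above is to replace each precision value by the highest precision found at that rank or any later rank. The sketch below assumes that definition and applies it to ranks 1 to 14 of the earlier example only. Because the later ranks are elided there, the values are illustrative rather than the full interpolated curve.

```python
def interpolate_precision(precisions):
    """Replace each precision by the maximum precision at that rank or later."""
    interpolated = list(precisions)
    # Sweep right to left, carrying the running maximum backwards.
    for i in range(len(interpolated) - 2, -1, -1):
        interpolated[i] = max(interpolated[i], interpolated[i + 1])
    return interpolated


# Cumulative number of relevant documents at ranks 1..14 of the example.
hits_by_rank = [1, 1, 1, 2, 3, 3, 4, 4, 4, 4, 4, 5, 6, 7]
precisions = [h / r for r, h in enumerate(hits_by_rank, start=1)]
smoothed = interpolate_precision(precisions)
```

The raw values rise and fall as relevant documents arrive (33% at rank 3, 60% at rank 5), while `smoothed` never increases from one rank to the next.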



8
Factoring in size of a corpus
  • Look at how P/R or Sn/Sp varies as a function of rank:


  • Choose a number of different ranks and calculate P/R or Sn/Sp
    • Correspond to vertical lines on graphs at right
    • Plot Sn vs. 1-Sp to get points for ROC curve.  Interpolate curve.
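A minimal sketch of the procedure above: for each chosen rank cut-off, count the relevant documents retrieved so far and turn the counts into one (1-Sp, Sn) point. The ranking is the first 14 ranks of the earlier example, and the two corpus sizes are the 25 and 2.5 M documents mentioned before. The function name and the chosen cut-offs are assumptions made for illustration.

```python
def roc_points(relevant_ranks, total_relevant, corpus_size, ranks):
    """Return (1 - Sp, Sn) pairs, one per rank cut-off in `ranks`."""
    total_nonrelevant = corpus_size - total_relevant
    points = []
    for r in ranks:
        tp = sum(1 for k in relevant_ranks if k <= r)  # relevant in top r
        fp = r - tp                                    # non-relevant in top r
        sensitivity = tp / total_relevant
        false_positive_rate = fp / total_nonrelevant   # equals 1 - Sp
        points.append((false_positive_rate, sensitivity))
    return points


RELEVANT_RANKS = {1, 4, 5, 7, 12, 13, 14}  # first 14 ranks of the example
small = roc_points(RELEVANT_RANKS, 10, 25, ranks=[5, 10, 14])
large = roc_points(RELEVANT_RANKS, 10, 2_500_000, ranks=[5, 10, 14])
```

At rank 10 both corpora give Sn = 0.4, but 1-Sp is 0.4 for the 25-document corpus and nearly 0 for the 2.5 M corpus, which is how specificity brings corpus size into the picture.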


9
ROC Curve
  • Look at the probability or rate of detection (Sn) against the rate of false alarms (1-Sp)


  • What does the diagonal represent?


  • How do we compare ROC curves against each other?


10
Getting a single number
  • 11-point average
    • Average of the interpolated precision at each 0.1 interval in recall (0.0, 0.1, …, 1.0)


  • Precision at recall point (% or absolute)


  • F Measure
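    • Weighted harmonic mean of precision and recall: $F_b = \frac{(1 + b^2) P R}{b^2 P + R}$
    • (e.g., in this standard form, $F_3$ weights recall more heavily than precision; $b < 1$ weights precision more heavily)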



  • Area under ROC curve (AUC, a measure of accuracy)
    • 1 = perfect, .9 excellent, .5 worthless
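The single-number summaries on this slide can be sketched as three small functions. The sketch rests on several assumptions: the F measure uses the standard weighted harmonic mean, in which a weight above 1 gives recall more influence; the 11-point average uses interpolated precision, taken as 0 at recall levels no rank reaches; and the area under the ROC curve is approximated by the trapezoid rule with (0, 0) and (1, 1) added as end points. All function names are illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def eleven_point_average(recall_precision):
    """Mean interpolated precision at recall 0.0, 0.1, ..., 1.0.

    recall_precision: list of (recall, precision) pairs, one per rank.
    """
    total = 0.0
    for i in range(11):
        level = i / 10
        # Best precision among ranks whose recall reaches this level;
        # the small tolerance guards against floating-point round-off.
        candidates = [p for r, p in recall_precision if r >= level - 1e-9]
        total += max(candidates, default=0.0)
    return total / 11


def auc(points):
    """Area under an ROC curve given as (1 - Sp, Sn) points."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    # Sum the trapezoids between consecutive points.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
```

With no points supplied, `auc([])` is the area under the diagonal, 0.5, which matches the "worthless" end of the scale above.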