26 Oct 2004
CS 5244: DL Extended Services
R-measure
○ Normalized sum of the lengths of all suffixes of the text that are repeated in other documents
○ Q(S|T1…Tn) = length of the longest prefix of S repeated in any one document; the sum takes Q over every suffix S of the text
● Computed easily using suffix array data structure
● More effective than simple longest common substring
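
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the slide's own formula: the normalizer 2 / (l(l+1)), where l is the length of the text, is an assumption chosen so that a text repeated in full scores 1 (l(l+1)/2 is the total length of all suffixes). The function names are hypothetical, and the suffix array is built by naive sorting for clarity rather than speed.

```python
def build_suffix_array(doc):
    """Start offsets of all suffixes of doc, in lexicographic order (naive build)."""
    return sorted(range(len(doc)), key=lambda i: doc[i:])


def common_prefix_len(a, b):
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_prefix_in_doc(s, doc, sa):
    """Q for one document: length of the longest prefix of s occurring in doc."""
    # Binary search for the position where s would be inserted among the sorted suffixes.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if doc[sa[mid]:] < s:
            lo = mid + 1
        else:
            hi = mid
    # The suffix sharing the longest prefix with s is adjacent to that position.
    best = 0
    for k in (lo - 1, lo):
        if 0 <= k < len(sa):
            best = max(best, common_prefix_len(s, doc[sa[k]:]))
    return best


def r_measure(text, docs):
    """Normalized sum of Q(S|T1…Tn) over every suffix S of text."""
    l = len(text)
    if l == 0:
        return 0.0
    arrays = [(d, build_suffix_array(d)) for d in docs]
    total = 0
    for i in range(l):
        suffix = text[i:]
        # "Repeated in any one document": take the best match over all documents.
        total += max(
            (longest_prefix_in_doc(suffix, d, sa) for d, sa in arrays),
            default=0,
        )
    # Assumed normalizer: the sum of all suffix lengths, l(l+1)/2.
    return 2.0 * total / (l * (l + 1))
```

For example, `r_measure("abab", ["ab"])` sums Q values of 2, 1, 2 and 1 over the four suffixes and divides by 10, giving 0.6. A text with no characters in common with the other documents scores 0.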