Current Research Trends
• Content-based approaches [Michael03]
    • HTML body text
    • Title and headings
    • Anchor text, etc.
• Link-based approaches
    • Information about pages can be inferred from link structure
    • Users' surfing behavior can be abstracted into patterns
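
One way to make the link-based idea concrete is a random-surfer model: a user follows outgoing links most of the time and occasionally jumps to a random page, and pages are scored by how often such a surfer would land on them. The sketch below is only an illustration under that assumption, not the method of any work cited on this slide. The function name `rank_pages`, the `damping` value of 0.85, and the toy graph are all hypothetical choices made for the example.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages with a simple random-surfer model (power iteration).

    links: dict mapping each page to the list of pages it links to.
    damping: assumed probability that the surfer follows a link
             instead of jumping to a random page.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a baseline share from random jumps.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked to by A, B, and D; nothing links to D.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = rank_pages(toy_links)
```

In this toy graph the page with the most incoming links (C) ends up with the highest score and the page with none (D) with the lowest, which is the sense in which link structure alone carries information about pages.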