13 Nov 2004
WIDM 04: Lee et al. Co-training Web Block Classification
PARCELS
•PARser for Content Extraction & Layout Structure
•Goals:
–Coarse-grained classification
–Fine-grained information extraction
–Work on a variety of sources
–Open-source, reference implementation