Survey on Metadata Extraction and Indexing

Agenda

Automatic Document Metadata Extraction using Support Vector Machines

Metadata elements

Single-class vs. multi-class lines

Two-Step Extraction

Line Classification – step 1

Line Classification – step 2

Chunk identification

Summary

Automatic Identification and Organization of Index Terms for Interactive Browsing

Problems discussed

Index Identification – head sorting method

What constitutes a useful index – how to evaluate

Coherence – 3 ratings

Usefulness of index terms

Thoroughness of coverage of document

Summary

Issues

Thank You!!   =)