Understanding Documents via Concept Links
- DUC 2005 System Task
- Targeted Sentences
- Our Approach
- System Overview
- Concept Link
- Sentence Similarity
- Sentence Ranker: a modified MMR
- Evaluation
- Conclusions
Task definition in [Amigo et al., 2004]
- … topic-oriented, informative multi-document summarization, … compressed version of a set of documents …
- Topic creation instructions:
  - formulate a topic out of interesting aspects
  - "At least 25 documents must each contribute some material to the answer" of a question on the topic
Our view of the task
- A general, topic-oriented summary.
- A good DUC 2005 summary is an extract consisting of sentences that are:
  - highly representative
  - highly relevant to the topic
    - General
    - Specific: named entities are favored
  - minimally redundant
Concept Link
- A Concept Link exists between each pair of similar concepts.
- Concept similarity: maximal sense overlap (Banerjee et al., 2003)
  - consider all senses of each concept
  - extended sense Sx: synset + gloss + hypernym and meronym sets (1 level)
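As an illustration, the sketch below computes concept similarity as the maximal extended-sense overlap over all sense pairs, in the spirit of Banerjee et al. (2003). It is a minimal sketch only: it assumes NLTK's WordNet interface, and it scores an overlap simply by counting shared words, which may differ from the weighting the system actually uses.

```python
# Minimal sketch of concept similarity via maximal extended-sense overlap
# (in the spirit of Banerjee et al., 2003). Assumes NLTK's WordNet corpus
# is available (nltk.download('wordnet')); the shared-word count used to
# score an overlap is an illustrative assumption.
from nltk.corpus import wordnet as wn

def extended_sense(synset):
    """Bag of words for one sense: synset lemmas + gloss, plus the
    hypernym and meronym sets one level away, as on the slide."""
    related = ([synset] + synset.hypernyms() + synset.part_meronyms()
               + synset.member_meronyms() + synset.substance_meronyms())
    words = set()
    for s in related:
        words.update(l.name().replace('_', ' ').lower() for l in s.lemmas())
        words.update(s.definition().lower().split())
    return words

def concept_similarity(concept_a, concept_b):
    """Maximal extended-sense overlap over all sense pairs of two concepts."""
    best = 0
    for s1 in wn.synsets(concept_a):
        e1 = extended_sense(s1)
        for s2 in wn.synsets(concept_b):
            best = max(best, len(e1 & extended_sense(s2)))
    return best

# e.g. concept_similarity('conflict', 'war') should exceed
# concept_similarity('conflict', 'carpet'), licensing a concept link
# between "Falklands conflict" and "Falklands war" in the example below.
```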
Example:
1) A year ago Mr Douglas Hurd, foreign secretary, became the first UK cabinet minister to visit Argentina since the 1982 Falkland Islands conflict.
2) Today Argentina gets out the red carpet for the UK Duke of York, the first official royal visitor since the end of the Anglo-Argentine Falklands war in 1982.
Concept Links between sentences
- Sentence similarity: sum of the "strengths" of the concept links between two sentences
- Original weight of a sentence: its representative power
- Sentence ranker: a modified MMR
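The ranking stage can be sketched as follows, under stated assumptions: sentence similarity is taken as the sum of concept-link strengths between two sentences, and selection follows the usual MMR trade-off between topic relevance and redundancy with already-selected sentences. The exact modification used by the system (for instance, how the representative-power weight enters the score) is not reproduced here; `lam`, `topic_relevance`, and `link_strength` are illustrative placeholders.

```python
# Minimal sketch of the ranking stage: sentence similarity as the summed
# strength of concept links, plus greedy MMR-style selection. The lambda
# value and the relevance/strength functions are illustrative assumptions,
# not the system's actual settings.

def sentence_similarity(concepts_a, concepts_b, link_strength):
    """Sum of the strengths of all concept links between two sentences,
    each sentence given as its list of concepts."""
    return sum(link_strength(ca, cb) for ca in concepts_a for cb in concepts_b)

def mmr_rank(sentences, topic_relevance, similarity, lam=0.7, max_sents=10):
    """Greedy selection: prefer sentences relevant to the topic, penalize
    similarity to sentences already selected."""
    selected, candidates = [], list(sentences)
    while candidates and len(selected) < max_sents:
        def mmr_score(s):
            redundancy = max((similarity(s, t) for t in selected), default=0.0)
            return lam * topic_relevance(s) - (1.0 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```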
Conclusions:
A simple system featuring:
- Concept Link: a new way to compute sentence similarity
  - no chunker/parser involved
  - concepts differ from the NPs used in Lexical Chains
- Measuring sentence similarity/relatedness via Concept Links:
  - alleviates the influence of expression variation (but may involve inaccurate sense guesses)
  - outperforms the word co-occurrence approach
- Minimizing redundancy via the modified MMR:
  - no extra heuristics involved
Future work:
- Error analysis
- How to set parameters automatically
- Comparison with alternative similarity measures
- What about more knowledge (syntactic and semantic parsers, …)?
- …