1
 
2
System Architecture
3
What’s New This Year
  • Approximate matching of grammatical dependency relations for answer extraction
  • Soft matching patterns for identifying definition sentences.
    • See [Cui et al., 2004a] and [Cui et al., 2004b]
  • Exploiting definitions to answer factoid and list questions.
4
Outline
  • System architecture
  • New Experiments in TREC-13 QA Main Task
    • Approximate Dependency Relation Matching for Answer Extraction
    • Soft Matching Patterns for Definition Generation
    • Definition Sentences in Answering Topically-Related Factoid/List Questions
  • Conclusion
5
Dependency Relation Matching in QA
  • Tried before
    • The PiQASso and MIT systems have applied dependency relations in QA.
    • Both used exact matching of relations to extract answers directly.
  • Why consider dependency relations?
    • NE-based answer extraction has an upper bound of about 70% (Light et al., 2001)
      • Many NEs of the same type often appear close to each other.
    • Some questions don’t have NE-type targets.
      • E.g., What does AARP stand for?
6
Extracting Dependency Relation Triples
  • Minipar-based (Lin, 1998) dependency parsing


  • Relation triple: two anchor words and their relationship
    • E.g., <“desk”, complement, “on”> for “on the desk”.

  • Relation path: path of relations between two words
    • E.g., <“desk”, mod, complement, “floor”> for “on the desk at the fourth floor” (data structures sketched below)


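A minimal sketch of how such relation triples and paths could be represented, assuming a Minipar-style parse in which every token points to its head with a labelled relation; the Token fields and helper names below are illustrative, not Minipar's actual output format. A relation triple is simply a path of length one between two adjacent anchor words.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Token:
    text: str
    head: Optional[int]  # index of the governing token; None for the root
    rel: str             # relation to the head, e.g. "mod", "complement", "pcomp-n"

def path_to_root(tokens: List[Token], i: int) -> List[int]:
    """Token indices from i up to the root of the dependency tree."""
    path = [i]
    while tokens[path[-1]].head is not None:
        path.append(tokens[path[-1]].head)
    return path

def relation_path(tokens: List[Token], i: int, j: int) -> List[str]:
    """Relation labels on the tree path between two anchor words i and j."""
    up_i, up_j = path_to_root(tokens, i), path_to_root(tokens, j)
    lca = next(k for k in up_i if k in up_j)  # lowest common ancestor
    rels = [tokens[k].rel for k in up_i[:up_i.index(lca)]]
    rels += [tokens[k].rel for k in up_j[:up_j.index(lca)]][::-1]
    return rels
```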
7
Examples of Relation Triples
  • Q: What American revolutionary general turned over West Point to the British?
  • q1) <“General”, sub, obj, “West Point”>
  • q2) <“West Point”, mod, pcomp-n, “British”>
  • A: …… Benedict Arnold’s plot to surrender West Point to the British ……
  • s1) <“Benedict Arnold”, poss, s, obj, “West Point”>
  • s2) <“West Point”, mod, pcomp-n, “British”>


  • Such answer paths cannot be extracted by exact matching of relations.



8
Learning Relation Similarity
  • We need a measure to find the similarity between two different paths.
  • Adopt a statistical method to learn similarity from past QA pairs.


  • Training data preparation
    • Around 1,000 factoid question-answer pairs from the previous two years’ TREC QA tasks.
    • Extract all relation paths between all non-trivial words
      • 2,557 path pairs.
    • Align the paths according to identical anchor nodes (alignment sketched below).
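Assuming each path is stored as (anchor word 1, relation list, anchor word 2), the alignment step can be sketched as pairing a question path with an answer path whenever the two anchor words are identical; the data shapes here are illustrative, not the system's internal representation.

```python
from typing import Dict, List, Tuple

Path = Tuple[str, List[str], str]  # (anchor word 1, relations along the path, anchor word 2)

def align_paths(q_paths: List[Path], a_paths: List[Path]) -> List[Tuple[List[str], List[str]]]:
    """Pair question and answer paths that share both anchor words."""
    answer_index: Dict[Tuple[str, str], List[str]] = {(w1, w2): rels for w1, rels, w2 in a_paths}
    pairs = []
    for w1, rels, w2 in q_paths:
        if (w1, w2) in answer_index:
            pairs.append((rels, answer_index[(w1, w2)]))
    return pairs
```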
9
Using Mutual Information to Measure Relation Co-occurrence
  • Two relations’ similarity measured by their co-occurrences in the question and answer paths.
  • Variation of mutual information (MI)



    • α discounts the score of two relations appearing in long paths (scoring sketched below).
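The exact MI variant from the slide's formula is not reproduced here; the sketch below only illustrates the general shape: count how often a question relation and an answer relation co-occur in aligned paths, discounted by a factor α = 1/(|question path| × |answer path|) so that co-occurrences inside long paths count less. The discount definition and the log(1 + ·) smoothing are assumptions.

```python
import math
from collections import Counter
from typing import List, Tuple

def train_counts(aligned_pairs: List[Tuple[List[str], List[str]]]):
    """Collect relation counts and length-discounted co-occurrence counts."""
    co, f_q, f_a = Counter(), Counter(), Counter()
    for q_rels, a_rels in aligned_pairs:
        alpha = 1.0 / (len(q_rels) * len(a_rels))  # assumed discount for long paths
        f_q.update(q_rels)
        f_a.update(a_rels)
        for rq in q_rels:
            for ra in a_rels:
                co[(rq, ra)] += alpha
    return co, f_q, f_a

def relation_sim(rq: str, ra: str, co, f_q, f_a) -> float:
    """MI-style association between a question relation and an answer relation."""
    if f_q[rq] == 0 or f_a[ra] == 0:
        return 0.0
    return math.log(1.0 + co[(rq, ra)] / (f_q[rq] * f_a[ra]))
```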
10
Measuring Path Similarity – 1
  • We adopt two methods to compute path similarity, using different relation alignment strategies.
  • Option 1: ignore the words of those relations along the given paths – Total Path Matching.
    • A path consists of only a list of relations.
    • Relation alignment by permutation of all possibilities.
    • Adopt IBM Model 1 from statistical machine translation (sketch below):




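A sketch of Total Path Matching in the spirit of IBM Model 1, reusing relation_sim and the counts from the previous sketch: every answer-path relation may align to any question-path relation, and the path score multiplies the averaged pairwise similarities. The normalisation and combination below are assumptions, not the system's exact formula.

```python
from typing import List

def total_path_sim(q_rels: List[str], a_rels: List[str], co, f_q, f_a) -> float:
    """Model-1-style score: product over answer relations of averaged alignments."""
    if not q_rels or not a_rels:
        return 0.0
    score = 1.0
    for ra in a_rels:
        score *= sum(relation_sim(rq, ra, co, f_q, f_a) for rq in q_rels) / len(q_rels)
    return score
```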
11
Measuring Path Similarity – 2
  • Option 2: consider the words of those relations along a path – Triple Matching.
    • A path consists of a list of relations and their words.
    • Only those relations with matched words count.
    • Deliberately ignores long-distance dependency relationships (sketch below).




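Triple Matching can be sketched similarly, except that each path element carries its word as well, and only positions whose words match across the question and answer paths contribute; unmatched (typically long-distance) relations are simply skipped. The (word, relation) representation is an assumption.

```python
from typing import List, Tuple

WordRel = Tuple[str, str]  # (word at the node, relation to its head)

def triple_path_sim(q_path: List[WordRel], a_path: List[WordRel], co, f_q, f_a) -> float:
    """Score only relation pairs whose anchor words match across the two paths."""
    answer_rel = {word: rel for word, rel in a_path}
    sims = [relation_sim(rq, answer_rel[wq], co, f_q, f_a)
            for wq, rq in q_path if wq in answer_rel]
    return sum(sims) / len(sims) if sims else 0.0
```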
12
Selecting Answer Strings Statistically
  • Use the top 50 ranked sentences from the passage retrieval module for answer extraction.
  • Evaluate the path similarity for relation paths between the question target / answer candidate and other question terms.



  • Non-NE questions: evaluate all noun/verb phrases as candidates (ranking sketched below).


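Putting the pieces together, the candidate-selection step can be sketched as scoring, for each answer candidate found in the top retrieved sentences, the relation paths that link it to the remaining question terms, and keeping the highest-scoring candidate. The candidate and path data shapes here are placeholders, not the system's internal API; path_sim is any two-argument path similarity (e.g. total_path_sim with the trained counts pre-bound via functools.partial).

```python
from typing import Callable, List, Tuple

# Each candidate is paired with the (question path, sentence path) pairs that
# connect it to the other question terms.
Candidate = Tuple[str, List[Tuple[List[str], List[str]]]]

def rank_candidates(candidates: List[Candidate],
                    path_sim: Callable[[List[str], List[str]], float]) -> List[Tuple[str, float]]:
    """Rank answer candidates by the summed similarity of their relation paths."""
    scored = [(answer, sum(path_sim(q_rels, s_rels) for q_rels, s_rels in path_pairs))
              for answer, path_pairs in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)
```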
13
Discussions on Evaluation Results
  • The use of approximate relation matching outperforms our previous answer extraction technique.
    • 22% improvement over all questions.
    • 45% improvement for Non-NE questions (69 out of 230 questions).

  • The two path similarity measures do not make an obvious difference.
    • Total Path Matching performs slightly better than Triple Matching.
    • Ignoring long-distance dependencies in Triple Matching does not degrade performance, because Minipar cannot resolve long-distance dependencies reliably anyway.


14
Outline
  • System architecture
  • New Experiments in TREC-13 QA Main Task
    • Approximate Dependency Relation Matching for Answer Extraction
    • Soft Matching Patterns for Definition Generation
    • Definition Sentences in Answering Topically-Related Factoid/List Questions
  • Conclusion
15
Question Typing and Passage Retrieval for Factoid/List Q’s
  • Question typing
    • Leveraging our past question typology and rule-based question typing module.
    • Offline tagging of the whole TREC corpus using our rule-based named entity tagger.
  • Passage retrieval – on two sources:
    • Topic-relevant document set from the document retrieval module: runs NUSCHUA1 and NUSCHUA2.
    • Definition sentences for a specific topic from the definition generation module: run NUSCHUA3.
  • Question-specific wrappers on definitions.



16
Exploiting Definition Sentences to Answer Factoid/List Questions
  • Conduct passage retrieval for factoid/list questions on the definition sentences about the topic.
    • Much more efficient due to smaller search space.
    • Average accuracy of 0.50, lower than that over all topic-related documents.
      • Due to low recall – imposed cut-off for selecting definition sentences (naïve use of definitions).
      • Some sentences needed to answer factoid/list questions are not definition sentences.
17
Exploiting Definitions from External Knowledge
  • Pre-compiled wrappers for extracting specific fields of information for list questions (an illustrative wrapper is sketched below)
    • Works, product names and person titles.
    • Applied to both generated definition sentences and existing external definitions: cross-validation.
    • Achieves an F-measure of 0.81 on 8 list questions about works.

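An illustrative wrapper in the spirit of the pre-compiled extraction patterns for list questions about works: a single regular expression that pulls quoted titles out of definition sentences. The pattern and the trigger verbs are assumptions, not the system's actual wrapper.

```python
import re
from typing import Iterable, List

# Hypothetical pattern: a work title quoted after a creation verb.
WORK_TITLE = re.compile(r'(?:wrote|published|directed|recorded|composed)\s+"([^"]+)"')

def extract_works(definition_sentences: Iterable[str]) -> List[str]:
    """Collect candidate work titles from definition sentences."""
    works = set()
    for sentence in definition_sentences:
        works.update(WORK_TITLE.findall(sentence))
    return sorted(works)

# Example: extract_works(['He wrote "The Old Man and the Sea" in 1951.'])
# -> ['The Old Man and the Sea']
```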
18
Outline
  • System architecture
  • New Experiments in TREC-13 QA Main Task
    • Approximate Dependency Relation Matching for Answer Extraction
    • Soft Matching Patterns for Definition Generation
    • Definition Sentences in Answering Topically-Related Factoid/List Questions
  • Conclusion
19
Conclusion
  • Approximate relation matching for answer extraction
    • Still has a hard time dealing with difficult questions.
      • Dependency relation alignment problem – words often can’t be matched due to linguistic variations.
      • Semantic matching of words/phrases is needed alongside relation matching.
  • More effective use of topic related sentences in answering factoid/list questions.


20
Q & A