1
 
2
Passage Retrieval in Question Answering
3
Density-Based Passage Retrieval Method
  • However, density-based methods can err when …
4
Our Solution
  • Examine the relationships between words
    • Dependency relations
  • Exact match of relations for answer extraction
      • Has low recall because the same relations are often phrased differently

  • Fuzzy match of dependency relations
    • Statistical similarity of relations
5
Measuring Sentence Similarity
6
Outline
  • Extracting and Pairing Relation Paths
  • Measuring Path Match Scores
  • Learning Relation Mapping Scores
  • Evaluations
  • Conclusions
7
Outline
  • Extracting and Pairing Relation Paths
  • Measuring Path Match Scores
  • Learning Relation Mapping Scores
  • Evaluations
  • Conclusions
8
What Dependency Parsing is Like
  • Minipar (Lin, 1998) for dependency parsing
  • Dependency tree
    • Nodes: words/chunks in the sentence
    • Edges (ignoring direction): labeled by relation types
9
Extracting Relation Paths
  • Relation path
    • Vector of relations between two nodes in the tree
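As a concrete illustration of the two slides above, here is a minimal Python sketch that stores a dependency tree as labeled, undirected edges and extracts the relation path between two nodes. The sentence, relation labels, and helper name are illustrative only, not Minipar's actual output format.

    from collections import deque

    # Illustrative dependency tree for "The River Seine flows through Paris"
    # (labels are hypothetical, not Minipar's actual tag set).
    # Each edge: (word1, word2, relation_type); direction is ignored.
    edges = [
        ("flows", "Seine", "subj"),
        ("Seine", "River", "nn"),
        ("River", "The", "det"),
        ("flows", "through", "prep"),
        ("through", "Paris", "pcomp-n"),
    ]

    def relation_path(edges, source, target):
        """Return the vector of relation labels on the tree path source -> target."""
        # Build an undirected adjacency list keyed by word.
        adj = {}
        for w1, w2, rel in edges:
            adj.setdefault(w1, []).append((w2, rel))
            adj.setdefault(w2, []).append((w1, rel))
        # Breadth-first search; in a tree the first path found is the only one.
        queue = deque([(source, [])])
        seen = {source}
        while queue:
            node, path = queue.popleft()
            if node == target:
                return path
            for nxt, rel in adj.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [rel]))
        return None

    print(relation_path(edges, "Seine", "Paris"))  # ['subj', 'prep', 'pcomp-n']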
10
Paired Paths from Question and Answer
11
Outline
  • Extracting and Pairing Relation Paths
  • Measuring Path Match Scores
  • Learning Relation Mapping Scores
  • Evaluations
  • Conclusions
12
Measuring Path Match Degree
  • Employ a variation of IBM Translation Model 1
    • Path match degree (similarity) as translation probability
      • MatchScore(P_Q, P_S) → Prob(P_S | P_Q)
      • Relations as words

  • Why IBM Model 1?
    • No “word order” – bag of undirected relations
    • No need to estimate “target sentence length”
      • Relation paths are determined by the parsing tree
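Assuming the standard IBM Model 1 equation, with relations playing the role of words, the path match score would take the form

    Prob(P_S \mid P_Q) = \frac{\epsilon}{(|P_Q| + 1)^{|P_S|}} \prod_{j=1}^{|P_S|} \sum_{i=0}^{|P_Q|} t(rel_{S,j} \mid rel_{Q,i})

where t(rel_S | rel_Q) is the relation mapping score learned later, rel_{Q,0} is the empty (NULL) relation of Model 1, and the length terms are constants because the paths are fixed by the parse trees.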
13
Calculating Translation Probability (Similarity) of Paths
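A minimal Python sketch of this calculation, assuming a learned relation mapping table t(rel_S | rel_Q). The table values and names below are made up for illustration, not trained scores.

    # Hypothetical relation mapping (translation) scores t(rel_s | rel_q);
    # in the real system these are learned (MI- or EM-based), not hand-set.
    T = {
        ("subj", "subj"): 0.8,
        ("subj", "whn"): 0.1,
        ("prep", "prep"): 0.7,
        ("prep", "mod"): 0.2,
        ("pcomp-n", "pcomp-n"): 0.9,
    }
    NULL = "<null>"  # empty relation of standard IBM Model 1
    EPSILON = 1.0    # constant; path lengths are fixed by the parse trees

    def t(rel_s, rel_q, smooth=1e-4):
        return T.get((rel_s, rel_q), smooth)

    def path_match_score(path_q, path_s):
        """IBM Model 1 style Prob(P_S | P_Q) with relations playing the role of words."""
        prob = EPSILON / (len(path_q) + 1) ** len(path_s)
        for rel_s in path_s:
            prob *= sum(t(rel_s, rel_q) for rel_q in [NULL] + list(path_q))
        return prob

    # Question path vs. candidate-sentence path (illustrative relation vectors).
    print(path_match_score(["subj", "prep", "pcomp-n"], ["subj", "prep", "pcomp-n"]))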
14
Outline
  • Extracting and Pairing Relation Paths
  • Measuring Path Match Scores
  • Learning Relation Mapping Scores
  • Evaluations
  • Conclusions
15
Training and Testing
16
Approach 1: MI Based
  • Measures bipartite co-occurrences in training path pairs
    • Accounts for path length (penalizes long paths)
    • Uses frequencies to approximate mutual information
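The slide names only the ingredients, so the following Python sketch shows one plausible way such an MI-style mapping score could be computed from bipartite co-occurrence counts, down-weighting long paths; the exact weighting scheme used in the paper may differ.

    import math
    from collections import Counter

    # Training data: pairs of (question_path, sentence_path), each a list of relations.
    # The paths below are illustrative placeholders.
    train_pairs = [
        (["subj", "prep"], ["whn", "prep"]),
        (["subj"], ["subj"]),
        (["prep", "pcomp-n"], ["mod", "pcomp-n"]),
    ]

    def mi_mapping_scores(pairs):
        """PMI-style relation mapping scores from bipartite co-occurrences,
        down-weighting contributions from long paths."""
        co = Counter()      # co-occurrence mass of (rel_q, rel_s)
        cq, cs = Counter(), Counter()
        total = 0.0
        for path_q, path_s in pairs:
            # Every relation in the question path co-occurs with every relation
            # in the sentence path; long paths contribute less per relation pair.
            w = 1.0 / (len(path_q) * len(path_s))
            for rq in path_q:
                for rs in path_s:
                    co[(rq, rs)] += w
                    cq[rq] += w
                    cs[rs] += w
                    total += w
        return {
            (rq, rs): math.log((c * total) / (cq[rq] * cs[rs]))
            for (rq, rs), c in co.items()
        }

    scores = mi_mapping_scores(train_pairs)
    print(scores[("subj", "subj")])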


17
Approach 2: EM Based
  • Employ the training method of IBM Model 1
    • Relation mapping scores = word translation probabilities
    • Use GIZA to carry out the training
    • Iteratively refine the relation translation probabilities
  • Initialization – assign 1 to identical relations and a small constant otherwise
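The actual training used GIZA; the Python sketch below is only a toy from-scratch illustration of IBM Model 1 EM updates on relation path pairs, using the initialization described above (1 for identical relations, a small constant otherwise).

    from collections import defaultdict

    # Toy training pairs of (question_path, sentence_path); illustrative only.
    train_pairs = [
        (["subj", "prep"], ["whn", "prep"]),
        (["subj"], ["subj"]),
        (["subj", "prep"], ["subj", "mod"]),
    ]

    def train_model1(pairs, iterations=10, small=0.01):
        # Initialization: 1.0 for identical relations, a small constant otherwise.
        rels_q = {r for pq, _ in pairs for r in pq}
        rels_s = {r for _, ps in pairs for r in ps}
        t = {(rs, rq): (1.0 if rs == rq else small) for rq in rels_q for rs in rels_s}

        for _ in range(iterations):
            count = defaultdict(float)   # expected counts of (rel_s, rel_q)
            total_q = defaultdict(float)
            # E-step: distribute each sentence relation over the question relations.
            for path_q, path_s in pairs:
                for rs in path_s:
                    norm = sum(t[(rs, rq)] for rq in path_q)
                    for rq in path_q:
                        c = t[(rs, rq)] / norm
                        count[(rs, rq)] += c
                        total_q[rq] += c
            # M-step: re-normalize so that sum over rel_s of t(rel_s | rel_q) = 1.
            t = {(rs, rq): count[(rs, rq)] / total_q[rq]
                 for (rs, rq) in count}
        return t

    t = train_model1(train_pairs)
    print(t[("subj", "subj")])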
18
Outline
  • Extracting and Pairing Relation Paths
  • Measuring Path Match Scores
  • Learning Relation Mapping Scores
  • Evaluations
    • Can relation matching help?
    • Can fuzzy match perform better than exact match?
    • Can long questions benefit more?
    • Can relation matching work on top of query expansion?
  • Conclusions
19
Evaluation Setup
  • Training data
    • 3k corresponding path pairs from 10k QA pairs (TREC-8, 9)
  • Test data
    • 324 factoid questions from TREC-12 QA task
  • Passage retrieval on top 200 relevant documents by TREC
20
Comparison Systems
  • MITRE – baseline
    • Stemmed word overlap
    • Baseline in previous work on passage retrieval evaluation
  • SiteQ – top-performing density-based method
    • Uses a three-sentence window
  • NUS
    • Similar to SiteQ, but uses sentences as passages
  • Strict Matching of Relations
    • Simulates the strict matching used in previous work for answer selection
    • Counts the number of exactly matched paths


  • Relation matching is applied on top of MITRE and NUS
21
Evaluation Metrics
  • Mean reciprocal rank (MRR)
    • Mean of the reciprocal rank of the first correct answer in the returned list
    • Computed over the top 20 returned passages
  • Percentage of questions for which no correct answer is returned
  • Precision at the top one passage
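As a quick illustration (the rank positions below are made up), MRR over the top 20 passages can be computed as:

    def mrr_at_k(ranks_of_first_correct, k=20):
        """Mean reciprocal rank: 1/rank of the first correct passage (0 if none in top k)."""
        total = 0.0
        for rank in ranks_of_first_correct:
            if rank is not None and rank <= k:
                total += 1.0 / rank
        return total / len(ranks_of_first_correct)

    # Illustrative: first correct passage at ranks 1 and 3; none found for the third question.
    print(mrr_at_k([1, 3, None]))  # (1 + 1/3 + 0) / 3 ≈ 0.444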
22
Performance Evaluation
  • All improvements are statistically significant (p<0.001)
  • MI and EM do not make much difference given our training data
    • EM needs more training data
    • MI is more susceptible to noise, so it may not scale well
23
Performance Variation to Question Length
  • Long questions, with more paired paths, tend to improve more
    • Using the number of non-trivial question terms to approximate question length

24
Error Analysis
  • Mismatch of question terms
      • e.g. “In which city is the River Seine?”
      • Introduce question analysis
  • Paraphrasing between the question and the answer sentence
      • e.g. write the book → be the author of the book
      • Most current techniques fail to handle this
      • Finding paraphrases via dependency parsing (Lin and Pantel)
25
Performance on Top of Query Expansion
  • On top of query expansion, fuzzy relation matching brings a further 50% improvement
  • However
    • Query expansion doesn’t help much on top of a fuzzy relation matching system
    • Expansion terms do not help in pairing relation paths
26
Outline
  • Extracting and Pairing Relation Paths
  • Measuring Path Match Scores
  • Learning Relation Mapping Scores
  • Evaluations
  • Conclusions
27
Conclusions
  • Proposed a novel fuzzy relation matching method for factoid QA passage retrieval
    • Brings a dramatic 70%+ improvement over state-of-the-art systems
    • Brings a further 50% improvement over query expansion
    • Future QA systems should incorporate relations between words for better performance

  • Query expansion should be integrated with relation matching seamlessly
28
Q & A