August 17, 2005
Generic Soft Pattern Models for Definitional QA
Performance Evaluation
• Soft pattern (SP) matching outperforms hard pattern (HP) matching on all three metrics
• The bigram and PHMM models perform better than the previously proposed soft pattern method
– The previous soft pattern method is not optimized
• Manual F3 scores correlate well with automatic R3 scores
Metric   HP       Original SP          Bigram SP             PHMM SP
R3A      0.2106   0.2233 (+6.00%)      0.2303 (+9.37%)       0.2234 (+6.08%)
R3E      0.2286   0.2378 (+4.00%)      0.2553 (+11.67%)*     0.2496 (+9.18%)
F3       0.4633   0.4937 (+6.56%)**    0.5088 (+9.83%)**     0.4971 (+7.30%)**
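
The percentages in parentheses appear to be relative gains over the HP baseline. This is an assumption: the slide does not state the formula. The minimal sketch below recomputes them from the rounded scores shown here. The names `SCORES`, `SYSTEMS`, `relative_gain`, and `gains_over_hp` are illustrative and do not come from the original work. Because the inputs are rounded to four decimals, the recomputed values may differ from the reported ones by a few hundredths of a percentage point.

```python
# Scores copied from the slide; column order is HP, Original SP, Bigram SP, PHMM SP.
SYSTEMS = ["HP", "Original SP", "Bigram SP", "PHMM SP"]
SCORES = {
    "R3A": [0.2106, 0.2233, 0.2303, 0.2234],
    "R3E": [0.2286, 0.2378, 0.2553, 0.2496],
    "F3": [0.4633, 0.4937, 0.5088, 0.4971],
}


def relative_gain(score, baseline):
    """Percentage improvement of `score` over `baseline` (assumed formula)."""
    return 100.0 * (score - baseline) / baseline


def gains_over_hp(scores=SCORES):
    """Map each metric to {system name: % gain over the HP baseline}."""
    result = {}
    for metric, row in scores.items():
        baseline = row[0]  # HP is the first column
        result[metric] = {
            name: relative_gain(value, baseline)
            for name, value in zip(SYSTEMS[1:], row[1:])
        }
    return result


if __name__ == "__main__":
    for metric, gains in gains_over_hp().items():
        for name, gain in gains.items():
            print(f"{metric:4s} {name:12s} {gain:+.2f}%")
```

Under this assumption, Bigram SP shows the largest gain over HP on all three metrics, with PHMM SP second.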