August 17, 2005
Generic Soft Pattern Models for Definitional QA
13/28
Bigram Soft Pattern Model
Generation probability of a pattern instance: interpolate the slot-aware unigram prob with the bigram prob

 P(Ins) = ∏_i [ λ·P(t_i | S_i) + (1 − λ)·P(t_i | t_{i-1}) ]

Example instance with tokens “known”, “as” before the search term and “,”, “DT$” after it:
 –Slot-aware unigram probs: P(“known”|S-2), P(“as”|S-1), P(“,”|S1), P(“DT$”|S2)
 –Bigram probs: P(“as”|“known”), P(“DT$”|“,”)
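A minimal sketch of scoring one pattern instance under the interpolated model. The probability tables, the λ value, and the probability floor are illustrative placeholders, not values from the paper:

```python
import math

# Hypothetical probability tables for illustration only; in the real model
# these are estimated from training pattern instances.
slot_unigram = {  # P(token | slot position relative to the search term)
    ("known", -2): 0.30, ("as", -1): 0.40, (",", 1): 0.25, ("DT$", 2): 0.20,
}
bigram = {  # P(token | previous token)
    ("as", "known"): 0.70, ("DT$", ","): 0.50,
}

def instance_log_prob(tokens, slots, lam=0.5, floor=1e-6):
    """Log of prod_i [ lam * P(t_i | S_i) + (1 - lam) * P(t_i | t_{i-1}) ]."""
    logp = 0.0
    prev = None
    for tok, slot in zip(tokens, slots):
        p_slot = slot_unigram.get((tok, slot), floor)  # slot-aware unigram
        # The first token has no predecessor, so its bigram component
        # falls back to the floor value.
        p_big = bigram.get((tok, prev), floor)
        logp += math.log(lam * p_slot + (1 - lam) * p_big)
        prev = tok
    return logp

# Instance "known as <term> , DT$": slots -2, -1 before the term, 1, 2 after.
print(instance_log_prob(["known", "as", ",", "DT$"], [-2, -1, 1, 2]))
```

Working in log space avoids underflow when instances get long; the floor stands in for whatever smoothing the full model applies to unseen events.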
•To estimate the interpolation mixture weight λ
–Use the Expectation Maximization (EM) algorithm
•Count words and general tags separately
–Prevents the high frequency counts of general tags from overwhelming the word counts
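A sketch of how EM can estimate a two-component interpolation weight, assuming we have held-out pairs of component probabilities (slot-aware unigram, bigram) for each token; the numbers are illustrative, not from the paper:

```python
# Each pair is (P(t|S_i), P(t|t_{i-1})) for one held-out token (made-up values).
held_out = [(0.30, 0.05), (0.40, 0.70), (0.25, 0.10), (0.20, 0.50)]

def em_lambda(pairs, lam=0.5, iters=50):
    """Estimate lam maximizing sum_i log(lam*p1_i + (1-lam)*p2_i)."""
    for _ in range(iters):
        # E-step: posterior that each token came from the first component.
        posts = [lam * p1 / (lam * p1 + (1 - lam) * p2) for p1, p2 in pairs]
        # M-step: lam = expected fraction of tokens attributed to component 1.
        lam = sum(posts) / len(posts)
    return lam

print(round(em_lambda(held_out), 3))
```

Each iteration provably does not decrease the held-out likelihood, so a fixed iteration budget is a common stopping rule for a one-parameter mixture like this.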
Here we cast the pattern matching problem as a token sequence generation problem: we take the token sequence t1 … tL from the test instance and compute its probability under the model estimated from the training data. In a standard bigram model, this generation probability is the product of the bigram probabilities. Here, we use linear interpolation to smooth the bigram probabilities, which introduces two terms: the bigram probability and the slot-aware unigram probability.