A Noisy-Channel Approach to Question Answering
Authors: Echihabi and Marcu
Date: 2003
Presenters: Omer Percin Emin Yigit Koksal Ustun Ozgur
IR 2013
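Question Answering Approaches
Matching the same words does not always work:
  Question: Who is the leader of France?
  Sentence: Hadjenberg, who is the leader of France's Jewish community...
  The sentence contains every question word but does not answer the question.
Solution: some kind of transformation is needed.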
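To make the failure of word matching concrete, here is a minimal sketch of a bag-of-words overlap score applied to the example above. The function names `tokens` and `overlap_score` are hypothetical and are not part of the Echihabi and Marcu system; the tokenizer is a deliberately crude assumption.

```python
import re

def tokens(text):
    """Lowercase alphanumeric tokens; punctuation and apostrophes split words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(question, sentence):
    """Fraction of distinct question words that also occur in the sentence."""
    q = set(tokens(question))
    s = set(tokens(sentence))
    return len(q & s) / len(q) if q else 0.0

question = "Who is the leader of France?"
sentence = "Hadjenberg, who is the leader of France's Jewish community"
score = overlap_score(question, sentence)
```

Every question word appears in the sentence, so the overlap score is perfect even though the sentence says nothing about the leader of France. This is the kind of false match that motivates a transformation-based model.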
Noisy Channel Approach
IDEA: The question is a noisy version of the answer.
ANSWER -> NOISY CHANNEL MODEL -> QUESTION
QA System: Two Modules
IR Engine
  Get relevant documents and their sentences.
Answer Identification Module
  From these sentences, identify substrings S(A) as candidate answers.
  For each substring (answer candidate) and sentence, calculate the probability of obtaining the question text using a generative model.
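The candidate-identification step can be sketched as enumerating contiguous word spans of a retrieved sentence. This is only an illustration: the function name `candidate_answers`, the whitespace tokenization, the `max_len` limit, and the sample sentence are all assumptions, not details taken from the paper.

```python
def candidate_answers(sentence, max_len=3):
    """Return (start, end, text) for every contiguous span of up to max_len words.

    max_len is a hypothetical cap used here to keep the candidate set small.
    """
    words = sentence.split()
    spans = []
    for start in range(len(words)):
        last_end = min(start + max_len, len(words))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end, " ".join(words[start:end])))
    return spans

# Made-up sentence, used only to show the shape of the output.
spans = candidate_answers("James Earl Ray", max_len=2)
texts = [text for _, _, text in spans]
```

Each span would then be paired with its sentence and scored by the generative model.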
Answer Identification Module
GENERATIVE MODEL: CANDIDATE SENTENCE -> QUESTION
Components (as labeled in the diagram): HORIZONTAL CUTTER, PERMUTER, ANSWER MARKER, FERTILIZER, REPLACER
Each component contributes one probability (p1 ... p5):
P(Q|A) = p1 × p2 × p3 × p4 × p5
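The product P(Q|A) = p1 × p2 × p3 × p4 × p5 can be sketched in a few lines. The probabilities below are made-up numbers for illustration, and summing logarithms is a common numerical choice assumed here, not something the slides specify. The function name `question_given_answer` is hypothetical.

```python
import math

def question_given_answer(stage_probs):
    """Multiply the per-component probabilities, working in log space.

    Summing logs avoids underflow when many small probabilities are multiplied.
    """
    if any(p <= 0.0 for p in stage_probs):
        return 0.0  # a zero-probability component makes the whole derivation impossible
    return math.exp(sum(math.log(p) for p in stage_probs))

# Illustrative values for p1 ... p5 (not from the paper).
p = question_given_answer([0.9, 0.5, 0.8, 0.4, 0.7])
```

With these invented values the product is 0.9 × 0.5 × 0.8 × 0.4 × 0.7 = 0.1008, up to floating-point rounding.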
Model components: FERTILIZER, REPLACER, PERMUTER, CUTTER, ANSWER MARKER
Training and Testing
Use the generative model.
Training:
  Deterministic cutter
  Answer known
Testing:
  Given a question
  Exhaustive cutter (try each cut for each sentence)
  Exhaustive answer marker (try each word as a possible answer)
  Generate probabilities: P(Q | Sentence + Cut + Answer)
  Select the Sentence + Cut + Answer combination with the maximum probability
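The test-time procedure amounts to an exhaustive argmax over sentences, cuts, and marked answer words. The sketch below shows only that search loop. The names `best_answer`, `single_cut`, and `toy_score` are hypothetical, the cut representation is a placeholder, and `toy_score` is an invented stand-in for the trained model's P(Q | Sentence + Cut + Answer).

```python
def best_answer(question, sentences, enumerate_cuts, score):
    """Try every sentence, every cut, and every word as the answer; keep the best."""
    best = None
    for sentence in sentences:
        words = sentence.split()
        for cut in enumerate_cuts(words):
            for i, word in enumerate(words):
                p = score(question, words, cut, i)
                if best is None or p > best[0]:
                    best = (p, sentence, cut, word)
    return best

def single_cut(words):
    """Placeholder cutter: one cut that keeps every word as its own unit."""
    return [tuple(range(len(words)))]

def toy_score(question, words, cut, answer_index):
    """Invented score, NOT the paper's model: favors answer words absent from
    the question whose surrounding context overlaps with the question."""
    q = set(question.lower().split())
    if words[answer_index].lower() in q:
        return 0.0
    context = [w for j, w in enumerate(words) if j != answer_index]
    overlap = sum(1 for w in context if w.lower() in q)
    return (1 + overlap) / (1 + len(words))

sentences = ["Paris is the capital of France", "Berlin is the capital of Germany"]
result = best_answer("What is the capital of France", sentences, single_cut, toy_score)
```

In a real system, `score` would be the generative model and `enumerate_cuts` would produce every candidate cut of the sentence; only the outer search structure carries over from this sketch.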
Performance
Not bad compared to systems with tens of modules, using only 2 modules
Beats QA-base in MRR on TREC datasets ( vs 0.291)
QA-base was state of the art and ranked between 2nd and 7th in TREC over the last 3 years (as of 2003)
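For readers unfamiliar with the metric, MRR (mean reciprocal rank) averages 1/rank of the first correct answer over all questions, counting 0 when no correct answer is returned. A minimal sketch follows; the function name and the sample data are made up and are unrelated to the reported TREC scores.

```python
def mean_reciprocal_rank(ranked_answers, gold):
    """ranked_answers: one ranked list of answers per question.
    gold: one set of acceptable answers per question."""
    if not ranked_answers:
        return 0.0
    total = 0.0
    for answers, correct in zip(ranked_answers, gold):
        for rank, answer in enumerate(answers, start=1):
            if answer in correct:
                total += 1.0 / rank
                break  # only the first correct answer counts
    return total / len(ranked_answers)

# Three made-up questions: correct at rank 1, correct at rank 2, never correct.
ranked = [["a", "b"], ["x", "y"], ["m", "n"]]
gold = [{"a"}, {"y"}, {"z"}]
mrr = mean_reciprocal_rank(ranked, gold)
```

Here the reciprocal ranks are 1, 1/2, and 0, so the mean is 0.5.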