Published by Theodore McKinney. Modified over 9 years ago.
1
Huizhong Duan, Yunbo Cao, Chin-Yew Lin and Yong Yu
Shanghai Jiao Tong University & MSRA
ACL 2008
2008/7/9 Rick Liu
3
Question Search
Help users to search previous answers
▪ Any nice hotels in Berlin or Hamburg?
▪ How long does it take to Hamburg from Berlin?
▪ Cheap hotels in Berlin?
5
Identifying question topic & focus
▪ Question tree
▪ Determining the tree cut
Modeling question topic & focus for search
▪ Language model
6
Topic terms: BaseNP, WH-ngram
Topic profile: probability distribution of categories
Specificity: inverse of the entropy of the topic profile
Topic chain: topic terms ordered by specificity value (descending)
Topic tree
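The definitions on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category counts are made up, the function names are hypothetical, and the small constant `EPSILON` added to the entropy (to avoid dividing by zero for a term seen in a single category) is an assumption, not something stated on the slide.

```python
import math
from collections import Counter

EPSILON = 1e-3  # assumed smoothing constant; not taken from the slides


def topic_profile(category_counts):
    """Turn per-category counts of a topic term into a probability distribution."""
    total = sum(category_counts.values())
    return {cat: n / total for cat, n in category_counts.items() if n > 0}


def specificity(profile, epsilon=EPSILON):
    """Inverse of the entropy of the topic profile (higher = more specific)."""
    entropy = -sum(p * math.log(p) for p in profile.values())
    return 1.0 / (entropy + epsilon)


def topic_chain(term_category_counts):
    """Order topic terms by specificity, most specific first."""
    scores = {
        term: specificity(topic_profile(counts))
        for term, counts in term_category_counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical counts: how often each topic term appears under each category.
counts = {
    "hotels": Counter({"Travel/Germany": 3, "Travel/France": 3, "Travel/Italy": 4}),
    "hamburg": Counter({"Travel/Germany": 9, "Travel/Europe": 1}),
}
chain = topic_chain(counts)
```

With these made-up counts, "hamburg" is concentrated in one category, so its profile has lower entropy and it comes before "hotels" in the chain.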
8
M = (Γ, θ)
Γ = [C1, C2, ..., Ck], tree cut
θ = [P(C1), P(C2), ..., P(Ck)], probability parameter vector
A cut is any set of nodes in the tree that defines a partition of all the nodes
Σ i=1..k P(Ci) = 1
9
[n0, n11], [n12, n21, n22, n23], [n13, n24]
[n11, n21, n22, n23, n24]
10
Minimum Description Length
Ref: Li and Abe, 1998
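To make the MDL criterion concrete, here is a minimal sketch of scoring candidate cuts by description length. It assumes the formulation commonly attributed to Li and Abe (1998): the parameter cost is (k/2)·log|S| with k the number of classes in the cut minus one, and the data cost is the negative log-likelihood with each class's probability spread uniformly over its members. The function names, the flat list-of-classes representation, and the counts are all hypothetical, and candidate cuts are simply enumerated by hand rather than searched efficiently.

```python
import math


def description_length(cut, leaf_counts):
    """Description length (in bits) of the data under one candidate cut.

    cut: list of classes, each a list of node names; together they partition the nodes.
    leaf_counts: observed frequency of each node in the sample.
    """
    sample_size = sum(leaf_counts.values())
    free_params = len(cut) - 1
    param_dl = free_params / 2 * math.log2(sample_size)

    data_dl = 0.0
    for cls in cut:
        class_count = sum(leaf_counts.get(node, 0) for node in cls)
        if class_count == 0:
            continue  # no observations fall in this class, so it costs nothing
        # Class probability shared uniformly among the class members.
        p_node = class_count / sample_size / len(cls)
        data_dl -= class_count * math.log2(p_node)
    return param_dl + data_dl


def best_cut(candidate_cuts, leaf_counts):
    """Pick the candidate cut with the minimum description length."""
    return min(candidate_cuts, key=lambda cut: description_length(cut, leaf_counts))


# Hypothetical candidates over four nodes, from coarsest to finest.
candidates = [
    [["a", "b", "c", "d"]],
    [["a", "b"], ["c", "d"]],
    [["a"], ["b"], ["c"], ["d"]],
]
uniform = {"a": 5, "b": 5, "c": 5, "d": 5}
skewed = {"a": 18, "b": 0, "c": 1, "d": 1}

coarse_choice = best_cut(candidates, uniform)
fine_choice = best_cut(candidates, skewed)
```

When the counts are uniform, splitting buys no likelihood and only adds parameter cost, so the coarsest cut is chosen; when the counts are skewed, the finer cut pays for itself.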
12
P(q | q̃)
q: queried question
q̃: targeted question
13
Yahoo! Answers
Resolved questions
▪ travel: 314,616 items
▪ computers & internet: 210,785 items
Three fields
▪ title (the only field used)
▪ description
▪ answers
14
Employed Vector Space Model
Manual judgments: relevant / irrelevant
Baselines: VSM, LMIR
Evaluation: MAP, R-precision, MRR
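The three evaluation measures named above can be computed from binary relevance judgments as in the following sketch. It uses the standard textbook definitions; the function names and the two example rankings are hypothetical and are not data from the talk.

```python
def average_precision(ranked_relevance, total_relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, precision_sum = 0, 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_relevant if total_relevant else 0.0


def r_precision(ranked_relevance, total_relevant):
    """Precision at rank R, where R is the number of relevant items for the query."""
    if total_relevant == 0:
        return 0.0
    return sum(ranked_relevance[:total_relevant]) / total_relevant


def reciprocal_rank(ranked_relevance):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical judged rankings: (relevance of each returned item, number of relevant items).
runs = [
    ([True, False, True, False], 2),
    ([False, True, False, False], 1),
]
map_score = mean(average_precision(r, n) for r, n in runs)
r_prec = mean(r_precision(r, n) for r, n in runs)
mrr = mean(reciprocal_rank(r) for r, _ in runs)
```

MAP, R-precision, and MRR are each the per-query value averaged over all queried questions.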
18
Examine the correctness of question topics and question foci
200 queried questions => 69 questions incorrect
(a) Only have the head part (59)
(b) Incorrect order (10)
(a) explains why λ is 0.7
19
FAQ data
Community based
Jeon et al., 2005
Compared four different retrieval methods
▪ Vector space model
▪ Okapi
▪ Language model
▪ Translation-based model
Translation-based model performed the best
20
Lexical chasm
▪ Where to stay in Hamburg?
▪ The best hotel in Hamburg?
IBM model 1
Use question titles and question description as the parallel corpus
22
1) Data Structure
2) Use MDL-based Tree Cut Model to identify question topic and focus
3) A new form of language modeling for question search
4) Extensive experiments
Now only community-based
From forum sites / FAQ sites