Deep Processing for Restricted Domain QA
Yi Zhang, Universität des Saarlandes

1 Deep Processing for Restricted Domain QA
Yi Zhang, Universität des Saarlandes
yzhang@coli.uni-sb.de

2 Outline
• Why deep processing
• QA in QUETAL
• Restricted domain question answering
  - Grammar extension & lexicon acquisition
  - Robust deep processing
  - Parse disambiguation
  - Semantic answer matching

3 Why Deep?
Is shallow processing enough?
• For TREC-like QA evaluation: YES (in most cases)
• However, for restricted domain QA:
  - More complicated questions
  - Less information redundancy for data-intensive approaches
  - Domain knowledge available

4 Deep Processing Provides
• More fine-grained linguistic analysis
  - Long-distance dependencies
  - Agreement
  - …
• Semantic representation
  - MRS/RMRS

5 General Problems with Deep Processing
• Robustness
  - Lexicon
  - Compound NP
• Specificity
  - "John saw Mary"
• Efficiency (not discussed here)

6 Deep Processing
• MRS/RMRS
  - (Robust) semantic representation with underspecification
• HPSG grammars
  - LinGO ERG
  - Other grammars (German, Japanese, Modern Greek, Norwegian, Chinese, …)
• HoG
  - Hybrid shallow & deep processing architecture with a uniform semantic representation (RMRS)

7 QA in QUETAL (1)
• Hybrid shallow & deep approach
• Cross-lingual QA
• QA on
  - Texts
  - Semi-structured documents
  - Databases

8 QA in QUETAL (2)
[Architecture diagram; only its labels survive in the transcript: NLQ; IR Schema; Syntax Ana. (Dependency Parser, TAG for En/De); Q. Seman Ana. (Q-type, A-type, Q-focus); Ans. Planning & Generation; Query Planner; IR; GetData; Info Source (Texts, IE, Fact DB); Result Merge]

9 QA in QUETAL (3)
Deep processing in QUETAL:
• An HPSG grammar is used for question analysis.
• Documents are processed with relatively shallow methods.
• Answer matching is done with RMRS.

10 Restricted Domain QA
• More complicated questions
• Fewer documents, of better quality
• Domain-specific ontology available

11 Restricted Domain QA: an Example
Text: "Shanghai City Planning Exhibition Hall [LOC_1] is located to the east of the City Hall [LOC_2], …, setting off with the crystal-like Grand Theatre [LOC_3] to the west."
Question: Where is the City Hall of Shanghai?
Answer: Between Shanghai City Planning Exhibition Hall and the Grand Theatre.
[Diagram label: Domain Onto.]
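One way to read the example above is that the "between" answer follows from two opposite directional relations around the same landmark. The sketch below only illustrates that reasoning step; the relation names (`east_of`, `west_of`), the triple format and the `between` helper are assumptions made for this example, not part of the QUETAL system or of any domain ontology described in the slides.

```python
# Hypothetical relation vocabulary: each relation and its opposite direction.
OPPOSITE = {
    "east_of": "west_of",
    "west_of": "east_of",
    "north_of": "south_of",
    "south_of": "north_of",
}


def between(target, relations):
    """Return (a, b) if a and b lie on opposite sides of target, else None.

    relations is a list of (relation, subject, object) triples, read as
    "subject is <relation> object".
    """
    sides = {}
    for rel, subj, obj in relations:
        if obj == target:
            # subj is on the `rel` side of the target.
            sides.setdefault(rel, subj)
        elif subj == target:
            # The target is `rel` of obj, so obj is on the opposite side.
            sides.setdefault(OPPOSITE[rel], obj)
    for rel, a in sides.items():
        b = sides.get(OPPOSITE[rel])
        if b is not None:
            return a, b
    return None


# Relations read off the example text (LOC_1 east of LOC_2, LOC_3 to the west).
RELATIONS = [
    ("east_of", "Shanghai City Planning Exhibition Hall", "City Hall"),
    ("west_of", "Grand Theatre", "City Hall"),
]

answer = between("City Hall", RELATIONS)
```

Here `answer` holds the two landmarks on either side of the City Hall; a target with a neighbour on one side only yields `None`.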

12 Open Topics
• Grammar extension & automated lexicon acquisition
• Robust deep processing
• Semantic answer matching
• Cross-lingual QA

13 Grammar Extension
Tourism domain:
• ERG extended for "RONDANE" (tourism in a Norwegian mountain area)
  - 1.4K sentences
  - 15 words/sentence
  - coverage > 74%
• Shanghai tourist guide from http://www.shanghai.gov.cn
  - 1,600 sentences
  - 18 words/sentence

14 Test on RONDANE corpus

15 Test on RONDANE Corpus

16 Grammar Extension
• ERG lexicon

  Category   Lexicon entries   Lexicon coverage of top 10 leaf types
  Verb       2891              77%
  Noun       6873              96%
  Adj.       2505              90%

• Automating lexicon acquisition is relatively easier for nouns.

17 Automated Lexicon Acquisition
• POS tagging
• Named entity recognition
• Statistical models to find the best lexical type for an unknown noun
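The slide does not say which statistical model is meant. As a minimal sketch of the idea, assuming a naive Bayes classifier over the POS tag, the named-entity label and a word suffix, the code below picks the most probable lexical type for an unknown noun. The type names (`n_proper_le`, `n_intr_le`, `n_mass_le`), the feature set and the tiny training set are illustrative assumptions only.

```python
import math
from collections import Counter, defaultdict


class LexTypeGuesser:
    """Naive Bayes with add-one smoothing over symbolic features."""

    def __init__(self):
        self.type_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (feature list, lexical type) pairs.
        for feats, lex_type in examples:
            self.type_counts[lex_type] += 1
            for f in feats:
                self.feat_counts[lex_type][f] += 1
                self.vocab.add(f)

    def predict(self, feats):
        total = sum(self.type_counts.values())

        def log_score(lex_type):
            s = math.log(self.type_counts[lex_type] / total)
            denom = sum(self.feat_counts[lex_type].values()) + len(self.vocab)
            for f in feats:
                s += math.log((self.feat_counts[lex_type][f] + 1) / denom)
            return s

        return max(self.type_counts, key=log_score)


def features(pos_tag, ne_label, word):
    # Hypothetical feature set: POS tag, NE label, last three characters.
    return [f"pos={pos_tag}", f"ne={ne_label}", f"suffix={word[-3:].lower()}"]


# Invented training pairs; a real system would read them from the existing lexicon.
TRAIN = [
    (features("NNP", "LOC", "Shanghai"), "n_proper_le"),
    (features("NNP", "LOC", "Rondane"), "n_proper_le"),
    (features("NN", "O", "theatre"), "n_intr_le"),
    (features("NN", "O", "mountain"), "n_intr_le"),
    (features("NN", "O", "tourism"), "n_mass_le"),
]

guesser = LexTypeGuesser()
guesser.train(TRAIN)
guess = guesser.predict(features("NNP", "LOC", "Pudong"))
```

With this toy data an unseen location name is assigned the proper-noun type, because the POS and NE features dominate the unseen suffix.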

18 Robust Deep Processing
• Back off to RMRS generated by intermediate or shallow parsers (HoG architecture).
• Keep partial parse charts and the corresponding MRS fragments for semantic answer matching.
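The back-off idea can be sketched as a cascade that tries the deepest analyzer first and falls through to shallower ones when it fails. The analyzers below are toy stand-ins written for this example (they are not the HoG components), and a plain dictionary stands in for an RMRS structure.

```python
from typing import Callable, List, Optional, Tuple

Analysis = dict  # stand-in for an RMRS structure
Analyzer = Callable[[str], Optional[Analysis]]

# Toy lexicon for the hypothetical deep analyzer.
DEEP_LEXICON = {"john", "saw", "mary"}


def tokenize(sentence: str) -> List[str]:
    return sentence.lower().rstrip(".?").split()


def toy_deep(sentence: str) -> Optional[Analysis]:
    """Fails (returns None) as soon as one word is missing from the lexicon."""
    words = tokenize(sentence)
    if not all(w in DEEP_LEXICON for w in words):
        return None
    return {"preds": [f"_{w}" for w in words], "scoped": True}


def toy_shallow(sentence: str) -> Optional[Analysis]:
    """Always succeeds, but yields only a flat, unscoped list of predicates."""
    return {"preds": [f"_{w}" for w in tokenize(sentence)], "scoped": False}


def analyze_with_backoff(
    sentence: str, cascade: List[Tuple[str, Analyzer]]
) -> Tuple[str, Analysis]:
    """Return the name and result of the first analyzer that succeeds."""
    for name, analyzer in cascade:
        result = analyzer(sentence)
        if result is not None:
            return name, result
    return "none", {"preds": [], "scoped": False}


CASCADE = [("deep", toy_deep), ("shallow", toy_shallow)]

covered = analyze_with_backoff("John saw Mary.", CASCADE)
backed_off = analyze_with_backoff("John saw Pudong.", CASCADE)
```

Because every level returns the same kind of structure, the answer-matching step does not need to know which analyzer produced it.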

19 Parse Disambiguation
• Select the best parse with statistical models (Toutanova et al. 2002)
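As a rough illustration of statistical parse selection, the sketch below ranks candidate parses with a log-linear model: each parse gets a weighted feature sum, and a softmax turns the sums into probabilities. This is one common model family, not necessarily the configuration used in the cited work; the feature names and weights are invented for the example.

```python
import math


def score(features, weights):
    """Weighted sum of feature values; unknown features contribute zero."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rank_parses(candidates, weights):
    """Return (parse id, probability) pairs, most probable first.

    candidates is a list of (parse id, feature dict) pairs.
    """
    scores = [score(feats, weights) for _, feats in candidates]
    top = max(scores)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    ranked = sorted(zip(candidates, exps), key=lambda pair: pair[1], reverse=True)
    return [(parse_id, e / z) for (parse_id, _), e in ranked]


# Hypothetical rule-count features for two competing analyses of one sentence.
CANDIDATES = [
    ("parse_a", {"rule:hd-cmp": 2, "rule:adj-hd": 1}),
    ("parse_b", {"rule:hd-cmp": 1, "rule:np-adj": 2}),
]
WEIGHTS = {"rule:hd-cmp": 0.8, "rule:adj-hd": 0.3, "rule:np-adj": -0.5}

ranking = rank_parses(CANDIDATES, WEIGHTS)
best_parse = ranking[0][0]
```

In practice the weights would be estimated from a treebank of disambiguated parses rather than set by hand.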

20 Answer Matching with (R)MRS
• Semantic answer matching
  - Create semantic patterns for each question type.
    e.g. where -> locate_v(e, x1, x2)
  - Semantic distance measurement.
    e.g. pred1(x) & pred2(x) vs. pred1(x) & pred2(y)
• Query expansion
  - Synonym substitution
  - Semantic structure replacement
    e.g. give_v(e1, x1, x2, x3) => receive_v(e2, x2, x1, x3)
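The matching and expansion steps can be sketched with predications written as (name, arguments) tuples. The code below is a simplified illustration under that assumption, not the RMRS matcher itself: pattern variables start with "?", matching is strict unification, and the only expansion rule is the give/receive rewrite from the slide. A graded semantic distance would relax the strict match, for instance by penalising pairs that differ only in variable sharing.

```python
def unify(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    (p_name, p_args), (f_name, f_args) = pattern, fact
    if p_name != f_name or len(p_args) != len(f_args):
        return None
    new = dict(bindings)
    for p, f in zip(p_args, f_args):
        if p.startswith("?"):
            # A variable must keep the same value everywhere it occurs.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def match(patterns, facts, bindings=None):
    """Backtracking search for bindings that satisfy every pattern."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        return bindings
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        extended = unify(first, fact, bindings)
        if extended is not None:
            result = match(rest, facts, extended)
            if result is not None:
                return result
    return None


def expand(pattern):
    """give_v(e1, x1, x2, x3) => receive_v(e2, x2, x1, x3); others unchanged."""
    name, args = pattern
    if name == "give_v" and len(args) == 4:
        _e1, x1, x2, x3 = args
        return ("receive_v", ("?e2", x2, x1, x3))
    return pattern


# Toy document fact and a question pattern with an open third argument.
FACTS = [("receive_v", ("e7", "mary", "john", "book"))]
QUERY = [("give_v", ("?e1", "john", "mary", "?x3"))]

direct = match(QUERY, FACTS)
expanded = match([expand(p) for p in QUERY], FACTS)

# Variable sharing matters: pred1(x) & pred2(x) is not pred1(x) & pred2(y).
SHARED = [("pred1", ("?x",)), ("pred2", ("?x",))]
same_entity = match(SHARED, [("pred1", ("a",)), ("pred2", ("a",))])
different_entities = match(SHARED, [("pred1", ("a",)), ("pred2", ("b",))])
```

The direct query finds nothing because the document only contains a `receive_v` predication; after expansion the open argument `?x3` is bound to the answer.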

21 Work Plan
• Narrow my focus to one of the topics above.
• Continue the Chinese HPSG grammar development.

22 References
• Timothy Baldwin, Emily M. Bender, Dan Flickinger, Ara Kim, and Stephan Oepen. To appear. Road-testing the English Resource Grammar over the British National Corpus. In Proc. of the Fourth International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal.
• Ulrich Callmeier. 2002. PET – a platform for experimentation with efficient HPSG processing techniques. In Collaborative Language Engineering. CSLI Publications, Stanford, USA.
• Hans Uszkoreit. 2002. New chances for deep linguistic processing. In Proc. of the 19th International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan.
• Ann Copestake, Dan Flickinger, Ivan A. Sag, and Carl Pollard. 2003. Minimal recursion semantics: An introduction. Under review.
• Timothy Baldwin and Francis Bond. 2003. Learning the countability of English nouns from corpus data. In Proc. of the 41st Annual Meeting of the ACL, pages 463–70, Sapporo, Japan.
• J. Carroll and A. Fang. 2004. Automatic Acquisition of Verb Subcategorisations and their Impact on the Performance of an HPSG Parser. In IJCNLP 2004.
• Stephan Oepen, Dan Flickinger, Kristina Toutanova, and Christopher D. Manning. 2002. LinGO Redwoods: A Rich and Dynamic Treebank for HPSG. In Proc. of the First Workshop on Treebanks and Linguistic Theories (TLT2002), Sozopol, Bulgaria.
• Kristina Toutanova, Christopher D. Manning, and Stephan Oepen. 2002. Parse Ranking for a Rich HPSG Grammar. In Proc. of the First Workshop on Treebanks and Linguistic Theories (TLT2002), Sozopol, Bulgaria.
• Stephan Oepen. [incr tsdb()] - Competence and Performance Laboratory. User Manual. Technical Report. Computational Linguistics, Saarland University (in preparation).
• Robert Malouf and Gertjan van Noord. 2004. Wide coverage parsing with stochastic attribute value grammars. In IJCNLP-04 Workshop: Beyond shallow analyses - Formalisms and statistical modeling for deep analyses.
• Kristina Toutanova, Christopher D. Manning, Stuart M. Shieber, Dan Flickinger, and Stephan Oepen. 2002. Parse Disambiguation for a Rich HPSG Grammar. In Proc. of the First Workshop on Treebanks and Linguistic Theories (TLT2002), pp. 253-263, Sozopol, Bulgaria.

