ALTA Workshop’04, Macquarie University, Sydney, 8 December 2004. Luiz Augusto Sangoi Pizzato. Using a Trie-based Structure for Question Analysis.

Presentation transcript:

Using a Trie-based Structure for Question Analysis
Luiz Augusto Sangoi Pizzato
ALTA Workshop’04, Macquarie University, Sydney, 8 December 2004

Outline (2/21)
Pizzato, Luiz Augusto Sangoi. Using a Trie-based Structure for Question Analysis. In: ALTA Workshop, Macquarie University, Sydney. 8 December 2004.
- Question analysis
- Trie structure
- Question trie
- Building and retrieving using the trie
- Evaluation of the technique
- Further work

Question on question (3/21)
Our question analyser tries to answer two meta-questions:
- What is the kind of answer I have to provide? Define the expected answer type (EAT).
- What is the subject of the question? Define the question focus.

Some approaches (4/21)
EAT
- Handcrafted rules
  - Normally by the use of regular expressions (RE)
  - WordNet top concepts (Moldovan et al., 2003)
  - High quality results
- Support Vector Machines (SVM) (Zhang and Lee, 2003)
  - Good results using a large training set
Focus
- Discard the question's stopwords.
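The stopword approach to finding the focus can be illustrated with a minimal sketch. The stopword list and the function name `question_focus` are assumptions made for illustration; the slides do not say which list was used.

```python
# Hypothetical stopword list; a real system would use a much fuller one.
STOPWORDS = {"who", "what", "where", "how", "is", "are",
             "the", "a", "an", "of", "in"}

def question_focus(question):
    """Return the question tokens that are not stopwords."""
    tokens = question.strip().rstrip("?").split()
    return [tok for tok in tokens if tok.lower() not in STOPWORDS]

print(question_focus("Who is the dean of ICS?"))
```

Whatever survives the filter is treated as the focus, so the quality of the result depends entirely on the stopword list.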

Trie structure (5/21)
[Figure: a character trie in which every node holds one slot per letter (a|b|c|...|z); the two example paths spell the words "car" and "zebra".]
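A character trie like the one in the figure can be sketched as follows. This is only an illustration: the figure shows a fixed array of letter slots per node, whereas the sketch assumes a dictionary per node, and the names `TrieNode`, `insert` and `contains` are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # letter -> TrieNode
        self.is_word = False  # True if a stored word ends at this node

def insert(root, word):
    """Walk down from the root, creating one node per letter."""
    node = root
    for letter in word:
        node = node.children.setdefault(letter, TrieNode())
    node.is_word = True

def contains(root, word):
    """Return True only if the exact word was inserted."""
    node = root
    for letter in word:
        node = node.children.get(letter)
        if node is None:
            return False
    return node.is_word

root = TrieNode()
for w in ("car", "zebra"):
    insert(root, w)

print(contains(root, "car"), contains(root, "ca"))
```

Words that share a prefix share the nodes for that prefix, which is the property the question trie later exploits at the word level.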

Patterns (6/21)
Question                   Pattern                         EAT
Where is Chile?            ^ Where is !LOC $               LOC
Who is the dean of ICS?    ^ Who is the !POS of !ORG $     NAME
Who is J. Smith?           ^ Who is !NAME $                DESC
Who is J. Smith of ICS?    ^ Who is !NAME of !ORG $        DESC
How far is Athens?         ^ How far is !LOC $             NO
How tall is Sting?         ^ How tall is !NAME $           NO
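The mapping from a question to its pattern can be sketched as below, assuming the entities have already been annotated. The function name `to_pattern` and the dictionary format for the annotations are assumptions; the slides only show the resulting patterns.

```python
import re

def to_pattern(question, entities):
    """Replace each annotated entity with !TYPE and add the ^ and $ markers.

    `entities` maps an entity string to its type, e.g. {"ICS": "ORG"}.
    """
    text = question.strip().rstrip("?").strip()
    # Replace longer entities first so that nested strings do not clash.
    for phrase in sorted(entities, key=len, reverse=True):
        regex = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        text = re.sub(regex, "!" + entities[phrase], text)
    return ["^"] + text.split() + ["$"]

print(to_pattern("Who is J. Smith of ICS?", {"J. Smith": "NAME", "ICS": "ORG"}))
```

The markers `^` and `$` stand for the beginning and end of the question, as in the table.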

Question Trie (7/21)
1 ^ (boq)
  2 where
    3 is
      4 !LOC
        5 $ (eoq)
  6 who
    7 is
      8 the
        9 !POS
          10 of
            11 !ORG
              12 $ (eoq)
      13 !NAME
        14 $ (eoq)
        15 of
          16 !ORG
            17 $ (eoq)
  18 how
    19 far
      20 is
        21 !LOC
          22 $ (eoq)
    23 tall
      24 is
        25 !NAME
          26 $ (eoq)

Question Trie (8/21)
Nodes    Information (EAT, Frequency)
1        (LOC,1), (NAME,1), (DESC,2), (NUMBER,2)
2-5      (LOC,1)
6-7      (NAME,1), (DESC,2)
8-12     (NAME,1)
13       (DESC,2)
14-17    (DESC,1)
18       (NUMBER,2)
19-26    (NUMBER,1)
[Figure: the question trie from the previous slide, with nodes numbered 1 to 26.]
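The node table can be reproduced with a short sketch that inserts the six patterns word by word and counts the EAT at every node on the path. The class and function names are hypothetical, and the EAT label NUMBER follows the node table rather than the NO used in the pattern table.

```python
from collections import Counter

class QNode:
    def __init__(self):
        self.children = {}         # token -> QNode
        self.eat_freq = Counter()  # EAT -> number of patterns passing through

PATTERNS = [
    ("^ Where is !LOC $", "LOC"),
    ("^ Who is the !POS of !ORG $", "NAME"),
    ("^ Who is !NAME $", "DESC"),
    ("^ Who is !NAME of !ORG $", "DESC"),
    ("^ How far is !LOC $", "NUMBER"),
    ("^ How tall is !NAME $", "NUMBER"),
]

def build_question_trie(patterns):
    root = QNode()  # the root plays the role of '^' (node 1)
    for pattern, eat in patterns:
        node = root
        node.eat_freq[eat] += 1
        for tok in pattern.split()[1:]:  # skip the leading '^'
            key = tok if tok.startswith("!") else tok.lower()
            node = node.children.setdefault(key, QNode())
            node.eat_freq[eat] += 1
    return root

def count_nodes(node):
    return 1 + sum(count_nodes(child) for child in node.children.values())

trie = build_question_trie(PATTERNS)
print(count_nodes(trie), dict(trie.children["who"].eat_freq))
```

Because every node keeps its own frequency table, a question that stops matching part of the way down the trie can still be assigned the most frequent EAT of the last node it reached.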

Look-ahead process (9/21)
[Figure: the "who" branch of the question trie (^ (boq), who, 7 is, 13 !NAME, then either 14 $ (eoq) or 15 of, 16 !ORG, 17 $ (eoq)) matched against two token sequences:]
^ who is John Smith of Macquarie University $
^ who is Madonna $
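One possible reading of the look-ahead process is sketched below; it is an interpretation of the figure, not the author's implementation. When a token matches no literal child, the sketch assumes the entity slot (!NAME, !ORG, ...) absorbs tokens until the next token matches a child of the slot node. All names are hypothetical, and only the first slot child of a node is tried.

```python
from collections import Counter

class QNode:
    def __init__(self):
        self.children = {}
        self.eat_freq = Counter()

def add_pattern(root, pattern, eat):
    """Insert a pattern such as 'who is !NAME $'; the root stands for '^'."""
    node = root
    node.eat_freq[eat] += 1
    for tok in pattern.split():
        node = node.children.setdefault(tok, QNode())
        node.eat_freq[eat] += 1

def analyse(root, question):
    """Return (EAT, entities) for a matching question, otherwise None."""
    tokens = question.strip().rstrip("?").split() + ["$"]
    node, entities, i = root, {}, 0
    while i < len(tokens):
        literal = tokens[i].lower()
        if literal in node.children:          # ordinary word match
            node, i = node.children[literal], i + 1
            continue
        slots = [k for k in node.children if k.startswith("!")]
        if not slots:
            return None
        slot_node = node.children[slots[0]]
        j = i                                 # look ahead for the slot's end
        while j < len(tokens) and tokens[j].lower() not in slot_node.children:
            j += 1
        if j == i or j == len(tokens):
            return None
        entities[slots[0]] = " ".join(tokens[i:j])
        node, i = slot_node, j
    return node.eat_freq.most_common(1)[0][0], entities

root = QNode()
add_pattern(root, "who is !NAME $", "DESC")
add_pattern(root, "who is !NAME of !ORG $", "DESC")
add_pattern(root, "who is the !POS of !ORG $", "NAME")

print(analyse(root, "Who is John Smith of Macquarie University?"))
```

Under this reading the entity boundaries are found without any named-entity recogniser: "John Smith" ends where "of" matches node 15, and "Madonna" ends where the end-of-question marker matches node 14.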

MQ Questions (10/21)
- JustAsk logs; 4.8% natural-language (NL) questions
- […] of […] were NL questions
- […] unique NL questions
- 23% with some language problems: Why this search not word?
- Unusual language: Do u offer any scholarships 4 physiotherapy?
- Speculative questions: Will I get a job in Australia after finishing my MBA?

Training Set (11/21)
[…] JustAsk questions were randomly selected and semi-automatically tagged according to an XML-like structure.
Example (markup not preserved): Who is Luiz Pizzato ?
Total number of questions:
- Who 212
- What 208
- Where 203
- How 529
- Other types: Am I, Are there, Can I, Do you, Is there, I want, I need, Which, Does, Tell me, Why, Have you, Could you, May I, Will I, Was I, Would you, Whom

Evaluation - EAT (12/21)

Evaluation – Focus (13/21)

Question Trie without Entities (14/21)
1 ^ (boq)
  2 where
    3 is
      4 Chile
        5 $ (eoq)
  6 who
    7 is
      8 the
        9 dean
          10 of
            11 ICS
              12 $ (eoq)
      13 J.
        14 Smith
          15 $ (eoq)
          16 of
            17 ICS
              18 $ (eoq)
  19 how
    20 far
      21 is
        22 Athens
          23 $ (eoq)
    24 tall
      25 is
        26 Sting
          27 $ (eoq)

Evaluation – TREC-2003 (15/21)

Comparison with SVM (Zhang and Lee, 2003) (16/21)

Concluding remarks (17/21)
- The developed technique offers reasonable results using no linguistic resources.
Future developments
- Define guidelines for the EAT markup and review the markup of the MQ questions
- Adding POS and semantic information from WordNet may replace entity markup

Further Work (18/21)
Combine lexical and POS information
Example: Who is John Smith?
Node          POS    EAT freq
^                    NAME 1, DESC 1
Who           WP     NAME 1, DESC 1
is            VBZ    NAME 1, DESC 1
John          NNP    NAME 1
Smith         NNP    NAME 1
John Smith    NNP    NAME 1
$                    NAME 1

References (19/21)
Dell Zhang and Wee Sun Lee. 2003. Question classification using support vector machines. In Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR-03), pages 26–32. ACM Press.
Dan Moldovan, Marius Paşca, Sanda Harabagiu, and Mihai Surdeanu. 2003. Performance issues and error analysis in an open-domain question answering system. ACM Trans. Inf. Syst., 21(2):133–154.

Acknowledgments (20/21)
My supervisors:
- Dr. Diego Mollá-Aliod
- Dr. Rolf Schwitter
- Dr. Cecile Paris
