MOQA Meaning Oriented Question Answering An AQUAINT Project from ILIT.


1 MOQA Meaning Oriented Question Answering An AQUAINT Project from ILIT

2 CRL is a research department in the School of Arts and Sciences at NMSU Director: Jim Cowie Currently has a staff of 10 PhDs Mainly focuses on language engineering research Languages include – Arabic, Farsi, Turkish, Spanish, Chinese, Japanese, Korean Contact: Jim Cowie – jcowie@crl.nmsu.edu

3 Advanced-technology company in Ithaca, New York Founded in 1990 by Dr. Richard Kittredge, Dr. Tanya Korelsky, and Dr. Owen Rambow. Goal is to transform results from research in natural language processing into practical software applications. Has developed a core set of text generation tools Current focus is on expanding the range of applications for this technology, with a particular focus on the Web. Contact: Tanya Korelsky – tanya@cogentex.com

4 The Institute for Language and Information Technologies at University of Maryland Baltimore County Sergei Nirenburg, Director Opened September 2002 with a team of 3 senior personnel –Sergei Nirenburg –Stephen Beale –Marge McShane Contact: Sergei Nirenburg – sergei@cs.umbc.edu ILIT

5 Meaning-Oriented Question-Answering with Ontological Semantics
Domain: travel and meetings
– question understanding and interpretation;
– determining the answer and
– presenting the answer
Two kinds of data source
– open text (in English, Arabic and Persian)
– Structured Fact Repository containing instances of ontological entities

6 Project Tasks Design and Implementation of System Architecture Knowledge Acquisition Question Understanding Question Interpretation Answer Determination Answer Formulation Documentation; User and Evaluator Training; Testing; and System Evaluation

7 (Architecture diagram; labels as recovered from the figure)
Input: User Question in English
Output: System Response in English
Question Understanding
Basic Text Meaning Representation (TMR)
Question Interpretation:
– task context
– dialog context
– user profile
– analyst team profile
Extended TMR: adds a statement of active goals, plans and scripts in the system
Dialog and Self-Awareness-related Answer Determination (for running commentary and workflow and context-related communication)
Task-Oriented Answer Determination from Fact Database: IE from Fact Database
NL Query Generation: in English, Arabic and one of Persian, Russian, Spanish
NL Query
Answer Determination from open text:
– IR
– IE
– Production of TMRs for Textual Fillers of IE Templates
System Response in TMR
Answer Formulation and Presentation
Goal and Plan Processing Manager
Goal Attainment and Plan Execution Agenda
System Working Memory
Knowledge Sources
– FACT REPOSITORY: including instances of goals, plans, scripts
– ONTOLOGY: including goals, plans, scripts
– LEXICONS, Each Language in System: including names and phrases
Processing Modules and Intermediate Results
Using XML

8 Development Methodology Rapid Prototyping Using pre-existing components Grow the various development activities toward an integrated system Today – look at one example of each User Interaction Analysis Resources XML - TMR

9 Deliverables A QA system in the domain of travel and meetings, with a capability to search for information in open texts in three languages and in a structured, ontology-based Fact DB; an enhanced text analysis system for each of the languages; a question interpretation module that takes into account user goals and the context of the dialog; an integrated IR/IE module working on open text in three languages, on the basis of ontologically defined extraction templates;

10 Deliverables (Cont.) an ontology of about 6,500 concepts; a Fact DB of about 100,000 facts; a system for automating the acquisition of the Fact DB; a semantic lexicon for each of the languages in the system, at about 20,000 entries; a decision-making module that determines the answer(s) and system action(s) at each step of the dialog/task processing; an intuitive and intelligent multi-modal user interface, which uses natural language generation in answers and for query validation

11 Project Status
Approval to spend given from August 21st, 2002
UMBC – Ontology and English Lexicon Improvement, Development of Scripts, Meaning Based Text Analysis
CoGenTex – Interface design, Human Factors, Text Generation
NMSU – Text-preprocessing, Arabic and Farsi resources and analysis, data collection, system integration

12 Corpora Being Collected
Arabic:
– http://www.aljazirah.net
– http://news.bbc.co.uk/hi/arabic/news/
– http://www.irna.com/ar/index.shtml
English:
– http://www.cnn.com
– http://www.bbc.co.uk/worldservice/index.shtml
– http://www.irna.com/en/
Persian:
– http://www.hamshahri.net/
– http://www.bbc.co.uk/persian/index.shtml
– http://www.irna.com/pe/index.shtml

13 Ontology
An ontology is a formally and semantically defined repository of concepts and relations about the world.
– Including knowledge about events, objects, and work flow scripts
Linked to the ontology are:
– fact repositories, including facts about actual events, objects, places, personalities, etc.
– lexica, defining words in a language in ontological terms
– “onomastica”, or multilingual proper name lists

14 Structured Common Fact Repository Uniform organization for all kinds of data Support for multiple applications and tools Semantically anchored in general ontology Constantly updated; today, manually; tomorrow, semi-automatically; long-term, automatically Supports both domain knowledge and workflow specification

15 Populating the Fact Repository
Original Text
CIA Report 02 [15 October, 2002, from a source, said to be credible, in Jordan]: A man named Majed H., using a faked Jordanian passport and travel visa, traveled from Aman, Jordan to Chicago, Illinois on 12 July, 2002. Majed H. is now known to have resided in Afghanistan for two years (1996-1997) and has been identified as a member of Al-Qaeda.
Gloss
An unnamed source, who is very reliable, informed the CIA sometime between July 12, 2002 and October 15, 2002, of a travel-event by Majed H. on July 12, 2002, from Aman (Amman) Jordan to Chicago Illinois. In order to take the trip, Majed H. used a fake passport and visa issued by Jordan. Majed H. was located in Afghanistan from January 1, 1996, through December 31, 1997, and is a member of Al-Qaeda.
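The gloss turns loose date expressions in the report into explicit intervals. A minimal sketch of that normalization step follows, assuming that a year range covers whole calendar years and that the time of informing is bounded by the event date and the report date, as the gloss states. The function names are hypothetical and are not taken from the MOQA system.

```python
from datetime import date
from typing import Tuple

Interval = Tuple[date, date]


def year_range_to_interval(first_year: int, last_year: int) -> Interval:
    """Expand a year range such as (1996-1997) to full calendar-year bounds."""
    if last_year < first_year:
        raise ValueError("last_year must not precede first_year")
    return date(first_year, 1, 1), date(last_year, 12, 31)


def report_window(event_date: date, report_date: date) -> Interval:
    """Bound the time of an INFORM event: after the event, no later than the report."""
    if report_date < event_date:
        raise ValueError("a report cannot precede the event it describes")
    return event_date, report_date


# "resided in Afghanistan for two years (1996-1997)"
residence = year_range_to_interval(1996, 1997)
# Travel on 12 July 2002, reported on 15 October 2002
informed = report_window(date(2002, 7, 12), date(2002, 10, 15))

print(residence[0].isoformat(), residence[1].isoformat())  # 1996-01-01 1997-12-31
print(informed[0].isoformat(), informed[1].isoformat())    # 2002-07-12 2002-10-15
```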

16 Populating the Fact Repository (2)
Gloss
An unnamed source, who is very reliable, informed the CIA sometime between July 12, 2002 and October 15, 2002, of a travel-event by Majed H. on July 12, 2002, from Aman (Amman) Jordan to Chicago Illinois. In order to take the trip, Majed H. used a fake passport and visa issued by Jordan. Majed H. was located in Afghanistan from January 1, 1996, through December 31, 1997, and is a member of Al-Qaeda.
Facts (12 Total)
INFORM-1
  AGENT: SOCIAL-ROLE-4
  BENEFICIARY: ORGANIZATION-1
  THEME: TRAVEL-EVENT-0
  TIME: <> 07/12/2002 10/15/2002
  MODALITY-EPISTEMIC: > 0.6
……………
NATION-2
  HAS-NAME: "Jordan"
CITY-1
  HAS-NAME: "Amman"
  IN-NATION: NATION-2
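The facts on this slide read as frames: a numbered instance of an ontological concept with named slots, some of which point at other instances. The sketch below shows one way such frames could be stored and followed. The `Fact` and `FactRepository` classes are hypothetical, and reading `<>` as "between" and `>` as "greater than" is an assumption based on the gloss, not a documented notation.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Dict


@dataclass
class Fact:
    """One repository entry: a numbered instance of an ontological concept."""
    concept: str                      # e.g. "CITY"
    index: int                        # e.g. 1 in CITY-1
    slots: Dict[str, Any] = field(default_factory=dict)

    @property
    def name(self) -> str:
        return f"{self.concept}-{self.index}"


class FactRepository:
    """Hypothetical in-memory store keyed by instance name."""

    def __init__(self) -> None:
        self._facts: Dict[str, Fact] = {}

    def add(self, fact: Fact) -> Fact:
        self._facts[fact.name] = fact
        return fact

    def get(self, name: str) -> Fact:
        return self._facts[name]

    def resolve(self, name: str, slot: str) -> Any:
        """Return a slot filler; if it names a stored fact, return that fact."""
        filler = self._facts[name].slots[slot]
        if isinstance(filler, str) and filler in self._facts:
            return self._facts[filler]
        return filler


repo = FactRepository()
repo.add(Fact("NATION", 2, {"HAS-NAME": "Jordan"}))
repo.add(Fact("CITY", 1, {"HAS-NAME": "Amman", "IN-NATION": "NATION-2"}))
repo.add(Fact("INFORM", 1, {
    "AGENT": "SOCIAL-ROLE-4",
    "BENEFICIARY": "ORGANIZATION-1",
    "THEME": "TRAVEL-EVENT-0",
    # Assumed reading of "<> 07/12/2002 10/15/2002": between the two dates.
    "TIME": (date(2002, 7, 12), date(2002, 10, 15)),
    # Assumed reading of "> 0.6": epistemic modality above 0.6.
    "MODALITY-EPISTEMIC": (">", 0.6),
}))

# Follow CITY-1 -> IN-NATION -> NATION-2 and read its name.
print(repo.resolve("CITY-1", "IN-NATION").slots["HAS-NAME"])  # Jordan
```

Slot fillers that name instances not yet in the store, such as SOCIAL-ROLE-4 here, are returned as plain strings, so a partially populated repository still answers lookups.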

17 LEXICON: English lexical entry mapped to concept “EXIT”

18 LEXICON: Chinese lexical entry mapped to concept “EXIT”

19 Using Ontology to Support Retrieval Documents need to be retrieved using the language of the document The representation of queries in the system is in terms of ontological concepts and “facts” We will use the ontology to support retrieval in all three languages Current experiment uses Chinese and Spanish; ontology-language lexicons exist for these languages

20 User Specified Query

21 Mapped to Associated Concepts

22 Concepts Map to Language of Documents

23 Concept-Word Mappings

24 Further Mapping of Concepts
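Slides 19 to 24 outline a retrieval path: the user's query words are mapped to ontological concepts, the concepts are optionally extended to related concepts, and the result is mapped to words in the language of the documents. The sketch below illustrates that path under stated assumptions. The toy ontology and lexicon entries are invented for the example (only the concept name EXIT appears in the slides), and the function names are hypothetical.

```python
from typing import Dict, List, Set

# Toy ontology: concept -> related concepts (illustrative entries only).
ONTOLOGY: Dict[str, List[str]] = {
    "EXIT": ["MOTION-EVENT"],
    "MOTION-EVENT": [],
}

# Toy ontology-language lexicons: language -> word -> concept (illustrative).
LEXICONS: Dict[str, Dict[str, str]] = {
    "english": {"exit": "EXIT", "leave": "EXIT", "move": "MOTION-EVENT"},
    "spanish": {"salir": "EXIT", "mover": "MOTION-EVENT"},
}


def query_to_concepts(words: List[str], language: str) -> Set[str]:
    """Map query words to the concepts their lexicon entries point to."""
    lexicon = LEXICONS[language]
    return {lexicon[w.lower()] for w in words if w.lower() in lexicon}


def expand_concepts(concepts: Set[str], depth: int = 1) -> Set[str]:
    """Add related concepts up to `depth` links away in the ontology."""
    expanded = set(concepts)
    frontier = set(concepts)
    for _ in range(depth):
        frontier = {r for c in frontier for r in ONTOLOGY.get(c, [])} - expanded
        expanded |= frontier
    return expanded


def concepts_to_words(concepts: Set[str], language: str) -> List[str]:
    """Collect the words of one language whose entries map to the concepts."""
    return sorted(w for w, c in LEXICONS[language].items() if c in concepts)


def translate_query(words: List[str], source: str, target: str,
                    depth: int = 0) -> List[str]:
    concepts = expand_concepts(query_to_concepts(words, source), depth)
    return concepts_to_words(concepts, target)


print(translate_query(["exit"], "english", "spanish"))           # ['salir']
print(translate_query(["exit"], "english", "spanish", depth=1))  # ['mover', 'salir']
```

Keeping the query in concept form until the last step means the same representation can be rendered into any language that has an ontology-language lexicon.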

25 Generation Tasks Months 1-6 Subtask 1: First prototype of intelligent question answering user interface, involving hypertext generation (December demo) Subtask 2: Gathering of end-user feedback on the interface functionality, look-and-feel and user customization Subtask 3: Design of extensions to cover a broader collection of concepts from the ontology Milestone at next 6 months: Report on user feedback

26 MOQA User Interface Support for both natural language queries and structured queries Intuitive web-based multi-modal interface for answers Tables, text, maps, time line, and social network graphs are interconnected by hyperlinks Natural language generation used in answers and for query validation Implemented using XML-based technology Positive reviews at the kick-off from an HCI expert and program management

27 MOQA User Interface – Query Page Support for NL-based queries and structured queries Structured query validation with automatically generated NL paraphrase WYSIWYM editing of structured queries
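Validating a structured query with a natural language paraphrase means showing the user a sentence that restates what the system will search for. The sketch below uses simple string templates as a stand-in. It is not the text generation technology used in the project, and the slot names (`agent`, `source`, `destination`, `after`) are hypothetical.

```python
from typing import Dict, Optional


def paraphrase(query: Dict[str, Optional[str]]) -> str:
    """Restate a structured travel query as one English sentence."""
    parts = ["Find travel events"]
    if query.get("agent"):
        parts.append(f"by {query['agent']}")
    if query.get("source"):
        parts.append(f"from {query['source']}")
    if query.get("destination"):
        parts.append(f"to {query['destination']}")
    if query.get("after"):
        parts.append(f"after {query['after']}")
    return " ".join(parts) + "."


structured = {"agent": "Majed H.", "source": "Amman", "destination": "Chicago"}
print(paraphrase(structured))
# Find travel events by Majed H. from Amman to Chicago.
```

Empty or missing slots are skipped, so the paraphrase only mentions constraints the user actually set.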

28 MOQA User Interface – Underlying XML Queries Uses standard XML technology (e.g. XML-compliant browser, XML parsers, etc.) Supports modularity – the XML representation is viewable and exchangeable between subsystems Assures automatic validation of query instances using a query class hierarchy described by XML schemas Uses logical expressions in XML to support complex queries
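A query with logical expressions can be pictured as an XML tree whose inner nodes are operators and whose leaves are slot constraints. The sketch below builds such a tree with Python's standard `xml.etree.ElementTree` and checks it against a small class hierarchy. The element names, attribute names, query classes and slot names are all hypothetical, and the hand-written check stands in for real XML Schema validation, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Hypothetical query class hierarchy: class -> parent class.
QUERY_CLASSES: Dict[str, Optional[str]] = {
    "Query": None,
    "EventQuery": "Query",
    "TravelQuery": "EventQuery",
}


def is_a(query_class: Optional[str], ancestor: str) -> bool:
    """Walk up the hierarchy to test whether query_class descends from ancestor."""
    while query_class is not None:
        if query_class == ancestor:
            return True
        query_class = QUERY_CLASSES.get(query_class)
    return False


def build_travel_query() -> ET.Element:
    """DESTINATION is Chicago AND (SOURCE is Amman OR SOURCE is Jordan)."""
    query = ET.Element("query", {"class": "TravelQuery"})
    both = ET.SubElement(query, "and")
    ET.SubElement(both, "slot", {"name": "DESTINATION", "value": "Chicago"})
    either = ET.SubElement(both, "or")
    ET.SubElement(either, "slot", {"name": "SOURCE", "value": "Amman"})
    ET.SubElement(either, "slot", {"name": "SOURCE", "value": "Jordan"})
    return query


def validate(query: ET.Element) -> bool:
    """Accept a known query class with one well-formed logical expression."""
    if query.tag != "query" or not is_a(query.get("class"), "Query"):
        return False

    def well_formed(node: ET.Element) -> bool:
        if node.tag in ("and", "or"):
            return len(node) >= 1 and all(well_formed(child) for child in node)
        if node.tag == "slot":
            return "name" in node.attrib and "value" in node.attrib
        return False

    return len(query) == 1 and well_formed(query[0])


travel_query = build_travel_query()
print(ET.tostring(travel_query, encoding="unicode"))
print(validate(travel_query))  # True
```

Because the query is plain XML, any subsystem can parse, display or re-validate it without sharing code with the interface.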

29 MOQA User Interface – Results Page (kick-off concept demo version) Concept demo helped to perform requirements analysis Demonstrated integrated display using tables, textual summary, map and time line Illustrated filtering table data by using hyperlinks in text

30 MOQA User Interface - Details Page (concept demo) Demonstrated display of additional types of information including social networks and source document extracts Illustrated “drill-down” by hyperlinks and typical follow-up queries based on underlying ontology

31 MOQA User Interface – Research Plans Presentation of partially understood natural language queries Personalization of answer presentation both content-wise (based on user expertise) and form-wise (based on user presentation preferences) Intelligent maintenance of session history based on typical work flow and collaboration patterns within groups Interface portability between subject domains Incremental evolution based on validation by domain experts

