Ontology Development To Support AQUA Question-Answering
Richard Fikes, Jessica Jenkins, Bill MacCartney, Rob McCool, Deborah McGuinness
Knowledge Systems Laboratory, Stanford University
www.ksl.stanford.edu

KSL and the WMD Coalition
- Tools for ontology creation, evolution, and maintenance
  - Coalition teams have adopted KSL’s Ontolingua and Chimaera as a standard for ontology development, maintenance, and analysis (additionally, we use other internal tools: JTP, DQL, IW)
- Initial evaluation and ongoing support
  - KSL evaluated the initial stage of the WMD ontology using our Chimaera tools, reviewed the findings with other teams, and taught others how to use the tools themselves

KSL and the WMD Coalition
- Tools for ontology creation, evolution, and maintenance
- Initial evaluation and ongoing support
- Knowledge representation and consulting work
  - KSL is providing some core new KB development for the CNS core (Russian naval facilities and Newly Independent States facilities) and is augmenting this with extraction results
  - KSL is providing consultation on sources and merging opportunities: counter-terrorism KBs built for DARPA (HPKB, ISX-Cyladian-HORUS, SAIC-Cycorp, …), semantic web for the military (SWMU ontology tutorial), ontologies for information fusion, etc.

KSL and the WMD Coalition
- Tools for ontology creation, evolution, and maintenance
- Initial evaluation and ongoing support
- Knowledge representation and consulting work
- Knowledge extraction
  - KSL knowledge extraction tools for RNF and NIS
  - New focus on utilizing other important, useful KB sources
  - SUMO is a core ontology for ontology sharing: ~3,900 axioms; relations and sets; processes and objects; temporal, spatial, and mereological relations; agents, etc.
  - Domain ontologies: WMDs, terrorism, biological viruses, …
  - Written in SUO-KIF, a proprietary dialect of KIF
  - Published by Teknowledge under GNU public license as part of IEEE SUO working group (ontology.teknowledge.com/)

SUMO
- SUMO requires translation to be used with any reasoner
- KSL has successfully translated SUMO to plain-vanilla KIF
  - Full translation: complete semantic content retained
  - Highly portable: should be fully compatible with most FOL reasoners
  - Accurate: most test queries demonstrably answerable from translated axioms
- However, the result is not yet fully usable
  - Most test queries not answered from full axiom set in reasonable time
  - SUMO was not designed for efficient automated reasoning
  - Solution: a smarter translator, and some SUMO “brain surgery”
- Translation to DAML may provide another path…
  - The existing translation is quite lossy, but enables some query-answering
  - Further work will enable a more complete and accurate translation

DAML Versions of SUMO, WMD, and Terrorism
- DAML translations of SUMO, WMD, and 5 terrorism-related ontologies and knowledge bases (KBs) provided by Teknowledge

                     Classes  Properties  Instances  Triples  Dropped Axioms
SUMO                     577         188         68     3524            ~800
WMD                      186           9         69     1021              59
Terrorism Ontology        84           0          0      200              44
Terrorism KBs              2           1       2892     9668            2919
Total                    849         198       3029    14413            3822

DAML Versions of SUMO, WMD, and Terrorism
- A few simple translations were used by Teknowledge to translate original KIF content to DAML
  - Example: subrelation. (subrelation father parent) -> … </daml:ObjectProperty>
  - Example: KIF triples to RDF triples. (part MadridSpain Spain) -> … </rdf:Description>
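The two translation patterns above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Teknowledge's translator: the function names, the `sumo:` prefix, and the table mapping KIF meta-relations to RDF Schema predicates are all hypothetical, and the output is plain (subject, predicate, object) tuples rather than DAML/RDF XML.

```python
# Assumed mapping from KIF meta-relations to RDF / RDF Schema predicates.
# The slides only show the subrelation case; the other rows are assumptions.
META_RELATIONS = {
    "subrelation": "rdfs:subPropertyOf",
    "subclass": "rdfs:subClassOf",
    "instance": "rdf:type",
}


def parse_flat_kif(text):
    """Split a flat binary KIF statement, e.g. "(part MadridSpain Spain)"."""
    text = text.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError(f"not a parenthesized KIF statement: {text!r}")
    symbols = text[1:-1].split()
    if len(symbols) != 3 or any("(" in s or ")" in s for s in symbols):
        raise ValueError(f"only flat binary statements are handled: {text!r}")
    return symbols


def kif_to_triple(text, prefix="sumo"):
    """Turn (relation arg1 arg2) into a (subject, predicate, object) triple."""
    relation, arg1, arg2 = parse_flat_kif(text)
    # Meta-relations map to schema predicates; anything else stays a property.
    predicate = META_RELATIONS.get(relation, f"{prefix}:{relation}")
    return (f"{prefix}:{arg1}", predicate, f"{prefix}:{arg2}")


print(kif_to_triple("(subrelation father parent)"))
print(kif_to_triple("(part MadridSpain Spain)"))
```

Nested statements such as rules are deliberately rejected here, which mirrors why a translation built only from patterns like these drops so many axioms.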

DAML Versions of SUMO, WMD, and Terrorism
- Teknowledge’s DAML files provide a great starting point, but there are a few problems
  - Syntactic errors and issues with resolving references across files. These problems are easy to fix.
  - A large amount of the original KIF content was dropped in the translation to DAML. Reincorporating some of this content is trivial, but in general it is nontrivial.
  - Example (trivial): (instance part TransitiveRelation) -> TransitiveRelation … </daml:TransitiveProperty>

KIF -> DAML Example (nontrivial)
Original KIF:
  (=> (and (instance ?SUBSTANCE BiochemicalAgent)
           (possesses ?AGENT ?SUBSTANCE))
      (capability BiochemicalAttack agent ?AGENT))
Translation of capability [ternary to binary relation]:
  … </rdf:Property> … </rdf:Property> … </rdf:Property>

KIF -> DAML Example (nontrivial) cntd.
Original KIF:
  (=> (and (instance ?SUBSTANCE BiochemicalAgent)
           (possesses ?AGENT ?SUBSTANCE))
      (capability BiochemicalAttack agent ?AGENT))
Translation of “?AGENT has BiochemicalAttack capability if it possesses a BiochemicalAgent”:
  … </sumo:Capability> …
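One common way to express a ternary relation with binary properties only is to introduce a named node that stands for the combination of two of the arguments. The sketch below illustrates that idea for `capability`; it is an assumption about the general technique, not a reconstruction of the DAML markup that was on the slide. The node naming scheme and the `processType` and `role` property names are hypothetical.

```python
def reify_capability(process_type, role, holder):
    """Rewrite (capability process_type role holder) as binary triples.

    A capability node stands for "playing `role` in processes of type
    `process_type`"; the holder is then linked to that node by a single
    binary property.
    """
    node = f"{process_type}-{role}-capability"  # hypothetical naming scheme
    return [
        (holder, "capability", node),
        (node, "rdf:type", "sumo:Capability"),
        (node, "processType", process_type),  # hypothetical property name
        (node, "role", role),                 # hypothetical property name
    ]


for triple in reify_capability("BiochemicalAttack", "agent", "?AGENT"):
    print(triple)
```

The rule itself (possessing a BiochemicalAgent implies the capability) still needs a class definition or a rule language on top of these triples, which is what makes this translation nontrivial.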

Query-Answering Example 1
- “What has the capability of being the agent of a biochemical attack?”
- Query pattern: (capability ?agt Biochemical-Attack-Agent-Capability)
- Knowledge in the ontology:
  - A thing is an “Agent-With-Biochemical-Attack-Capability” if and only if it
    > Has a capability “Biochemical-Attack-Agent-Capability”, or
    > Possesses a “Biochemical-Agent”
  - An “Agent-With-Biochemical-Attack-Capability” has capability “Biochemical-Attack-Agent-Capability”
  - A “Nerve-Agent” is a “Biochemical-Agent”
  - If AGT has capability “Biochemical-Attack-Agent-Capability”, then AGT is capable of being an “agent” in a “Biochemical-Attack”
  - If C is the capability of playing role R in processes of type PT, and AGT is known to have played role R in a process of type PT, then AGT has capability C
- Knowledge from documents:
  - “Al-Qaida” is a “Foreign-Terrorist-Organization” that possesses a “Nerve-Agent”
  - “Aum-Supreme-Truth-Chemical-Attack-27-Jun-94” is a “Chemical-Attack” whose agent is “Aum-Supreme-Truth”

Query-Answering Example 1
- “What has the capability of being the agent of a biochemical attack?”
- Query pattern: (capability ?agt Biochemical-Attack-Agent-Capability)
- Answer: “Al-Qaida”
  - “Al-Qaida” is a “Foreign-Terrorist-Organization” that possesses a “Nerve-Agent” [from documents]
  - A thing is an “Agent-With-Biochemical-Attack-Capability” if and only if it
    > Has a capability “Biochemical-Attack-Agent-Capability”, or
    > Possesses a “Biochemical-Agent” [from ontology]
  - An “Agent-With-Biochemical-Attack-Capability” has capability “Biochemical-Attack-Agent-Capability” [from ontology]
  - If AGT has capability “Biochemical-Attack-Agent-Capability”, then AGT is capable of being an “agent” in a “Biochemical-Attack” [from ontology]
  - A “Nerve-Agent” is a “Biochemical-Agent” [from ontology]

Query-Answering Example 1
- “What has the capability of being the agent of a biochemical attack?”
- Query pattern: (capability ?agt Biochemical-Attack-Agent-Capability)
- Answer: “Aum-Supreme-Truth”
  - “Aum-Supreme-Truth-Chemical-Attack-27-Jun-94” is a “Chemical-Attack” whose agent is “Aum-Supreme-Truth” [from documents]
  - Playing the role “agent” in a “Biochemical-Attack” requires the capability “Biochemical-Attack-Agent-Capability” [from ontology]
  - If playing role R in a process of type PT requires capability C, and Agt plays role R in a process of type PT, then Agt has capability C [from ontology]
  - Therefore “Aum-Supreme-Truth” has capability “Biochemical-Attack-Agent-Capability”
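The two derivations in Example 1 can be traced with a small Python sketch. It hand-codes the two inference paths (possession of a biochemical agent, and having played the agent role in an attack) rather than running a general theorem prover such as JTP. The data structures and function names are hypothetical, and the subclass link from “Chemical-Attack” to “Biochemical-Attack” is an assumption that the slides leave implicit.

```python
# Class hierarchy: child -> parent.
SUBCLASS = {
    "Nerve-Agent": "Biochemical-Agent",       # stated in the ontology
    "Chemical-Attack": "Biochemical-Attack",  # ASSUMPTION: implicit on the slides
}

# Knowledge from documents.
POSSESSES = [("Al-Qaida", "Nerve-Agent")]  # (holder, class of the thing held)
EVENTS = [
    {
        "name": "Aum-Supreme-Truth-Chemical-Attack-27-Jun-94",
        "type": "Chemical-Attack",
        "agent": "Aum-Supreme-Truth",
    }
]


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS.get(cls)
    return False


def agents_with_biochemical_attack_capability():
    """Answer (capability ?agt Biochemical-Attack-Agent-Capability)."""
    answers = {}
    # Path 1: possessing a Biochemical-Agent confers the capability.
    for holder, item_class in POSSESSES:
        if is_a(item_class, "Biochemical-Agent"):
            answers.setdefault(holder, f"possesses a {item_class}")
    # Path 2: having been the agent of a Biochemical-Attack shows the capability.
    for event in EVENTS:
        if is_a(event["type"], "Biochemical-Attack"):
            answers.setdefault(event["agent"], f"was agent of {event['name']}")
    return answers


for agent, reason in sorted(agents_with_biochemical_attack_capability().items()):
    print(agent, "-", reason)
```

Keeping a short justification string per answer loosely mirrors the “[from documents]” and “[from ontology]” annotations on the slides, where every answer comes with the facts that support it.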

Query-Answering Example 2
- “Who are the agents of attacks that used the same type of weapons as “Recent-Attack-001”?”
- Query pattern:
    (type Recent-Attack-001 ?res)
    (onProperty ?res instrument)
    (hasClass ?res ?inst-type)
    (type ?attack ?res)
    (agent ?attack ?agt)
- Must-bind variables: ?agt ?attack
- Knowledge in the ontology:
  - A “Mortar-Attack” has an instrument of type “Mortar”
- Knowledge from documents:
  - “Recent-Attack-001” is a Thing that has an instrument of type “Mortar”
  - “Revolutionary-Armed-Forces-Of-Colombia-Mortar-Attack-1-Jul-00” is a “Mortar-Attack” that has agent “Revolutionary-Armed-Forces-Of-Colombia”
- Answer: “Revolutionary-Armed-Forces-Of-Colombia”
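The query pattern in Example 2 is a conjunction of statement patterns with shared variables, which can be answered by matching the patterns one at a time and carrying the variable bindings forward. The Python sketch below shows that matching step only. It is not DQL or JTP: the restriction name `Restriction-1` is hypothetical, and the fact that the Colombian attack satisfies the restriction is written in by hand, whereas a reasoner would derive it from the “Mortar-Attack” definition.

```python
# Facts in KIF argument order: (relation, arg1, arg2).
FACTS = {
    ("type", "Recent-Attack-001", "Restriction-1"),
    ("onProperty", "Restriction-1", "instrument"),
    ("hasClass", "Restriction-1", "Mortar"),
    ("type", "Revolutionary-Armed-Forces-Of-Colombia-Mortar-Attack-1-Jul-00", "Mortar-Attack"),
    # Hand-materialized inference: every Mortar-Attack satisfies Restriction-1.
    ("type", "Revolutionary-Armed-Forces-Of-Colombia-Mortar-Attack-1-Jul-00", "Restriction-1"),
    ("agent", "Revolutionary-Armed-Forces-Of-Colombia-Mortar-Attack-1-Jul-00",
     "Revolutionary-Armed-Forces-Of-Colombia"),
}

QUERY = [
    ("type", "Recent-Attack-001", "?res"),
    ("onProperty", "?res", "instrument"),
    ("hasClass", "?res", "?inst-type"),
    ("type", "?attack", "?res"),
    ("agent", "?attack", "?agt"),
]


def match(pattern, fact, bindings):
    """Extend bindings so pattern equals fact, or return None on a clash."""
    extended = dict(bindings)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):
            # Bind a fresh variable, or check an existing binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def solve(patterns, bindings=None):
    """Yield every set of bindings that satisfies all patterns."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for fact in FACTS:
        extended = match(patterns[0], fact, bindings)
        if extended is not None:
            yield from solve(patterns[1:], extended)


# Report only the must-bind variables.
for solution in solve(QUERY):
    print(solution["?attack"], "->", solution["?agt"])
```

“Recent-Attack-001” itself also matches the fourth pattern, but it has no known agent, so the last pattern filters it out and only the Colombian attack remains.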

AQUA Program Plan
- Overview of the project
  - Goal is to create a system that can answer complex questions
  - With plus-up funding, we now have an end-to-end system. It makes use of KSL’s Ontolingua Knowledge Server and Java Theorem Prover (JTP) to develop answers to queries
  - Uses SAIC and other technology to automatically populate KBs with information from new text sources
  - Uses multiple extractors from multiple sources to answer queries
    > KSL extractor
    > UMBC/NMSU extractor
    > IBM extractor

AQUA Current Plans
[Architecture diagram. Labeled elements: NL Question; CNS Test Data; CNS Data Generation; MOQA Text -> TMR Translator; IBM Text -> KIF Translator; KSL Extractor; MOQA NL -> TMR Query Processor; SAIC TMR -> KIF Mapper/Translator; KIF-Formatted Question; Ontolingua Knowledge Server / Java Theorem Prover; KIF Answer / Proof Tree; KSL-generated explanation; SAIC KIF -> TMR Mapper/Translator; MOQA TMR -> NL Answer Processor; NL Answer. Stages are lettered A through F. Legend: SAIC (A-F), MOQA, KSL, IBM, SAIC-Team.]

AQUA Initial Concept
[Pipeline diagram]
- Data flow: QUESTION -> NL Query -> Interlingua Query -> KIF Query -> KIF Answer -> Interlingua Answer -> NL Answer -> ANSWER
- Components: NMSU Query Processor; SAIC Interlingua -> KIF Translator; KSL Java Theorem Prover; SAIC KIF -> Interlingua Translator; NMSU NL Generator

Key Tasks - SAIC
- Perform translation of Onyx/UMBC extracted TMRs to KIF (Item A)
  - Align two disparate ontologies
  - Translate terms once aligned
  - Both formalized queries and extracted text need to be translated
- Develop CNS WMD ontology
- Coordinate subcontractors and develop system interfaces

Key Tasks - Onyx
- Provide formalized translation of NL queries (MOQA, item B)
- Perform extraction of CNS data into text (MOQA, item A)

Key Tasks - IBM
- Assist in relations extraction from text into the WMD ontology

KSL’s Current Activities
- JTP: hybrid reasoning for query answering
  - Includes a temporal reasoner
  - Is a DQL (DAML Query Language) server
- Knowledge Base Partitioning: enabling Q-A from large-scale KBs using parallel heterogeneous reasoners
- Inference Web: providing understandable explanations for derived query answers
- Knowledge extraction from semi-structured documents
  - Tables, lists, outlines, property-value pairs, etc.

SAIC Current Activities
- SAIC
  - In-house Ontolingua server with JTP now installed and in use in development efforts
  - Ontology is available as part of the demonstration in the demo rooms
  - Please visit the SAIC/KSL demo stand

SAIC Current Activities (cont.)
- SAIC is spearheading a federated WMD ontology development effort, assisted by Stanford KSL
- Development of the CNS ontology has begun. The ontology currently has 700 terms and is viewable in our in-house version of Ontolingua. (Demo available)

SAIC Current Activities (cont.)
- Discussions are underway with Sergei to put Onyx under subcontract to SAIC. The subcontract is to go out as soon as possible.
  - Labor division is defined and agreed to
  - Major issue: due to subcontract issues, Onyx is still not under subcontract. This affects the Q&A system development rate, as this task is on the critical path for system development.
- Distributed the ontology to KSL and IBM
  - Development of the ontology is critical in order to allow the extractors to function appropriately

WMD Ontology Creation: Initial Confederation Assignments
- Stanford/KSL: NIS-Facilities (439 terms) and Russian-Naval-Facilities (365 terms)
- IBM: MPT-Topic (771 terms)
- Xerox PARC: Missiles-Topic (765 terms)
- Teknowledge: NIS-Nuclear-Weapons-Aggregate (219 terms)
- Battelle: Nuclear-Safety-Assistance (36 terms)

Year Two Project Goals
- Complete CNS ontology development
- Participate in TREC
  - System is still immature
  - Novel approach
  - Significant potential for further development
- Refine interfaces and determine system metrics to ensure maximum performance in future system iterations

TREC participation
- SAIC is signed up for TREC participation this year
- A multi-pronged approach is possible with the current architecture
  - The SAIC/Onyx route and NL interface give the initial capability for an end-to-end system with restricted domain and range
  - Formatted queries are possible for IBM extraction
- The system will be very immature in year 1 and will likely achieve poor TREC scores, but will mature in multiple and novel directions over time

Future Plans
- Continue multi-pronged approach (running multiple extractors over a uniform KB)
- Plan further enhancements (possibly add more extractors or reasoners)
- Leverage multiple KB approach to optimize research in multi-partition reasoning
- Develop effective metrics to determine efficacy of this approach and which pathways are optimal

Future Plans (cont.)
- Work on implementing the proof-tree-to-NL MOQA interface later (the reverse of TMR to KIF)
- Transition from KIF to DAML format where possible
- Extend range and capabilities of question answering. Initial participation will be limited in terms of domain and range of questions.
