NLDB 2004
ORAKEL: A Natural Language Interface to an F-Logic Knowledge Base
  Philipp Cimiano
  Institute AIFB, University of Karlsruhe
Outline
  Aim & Scope
  System Architecture
  F-Logic in a Nutshell
  Semantic Construction Component
  Lexicon Generation Component
  Conclusion & Further Work
Aims & Scope
  Aim:
    accessing a KB via natural language
    translating natural-language questions into logical queries:
      Who owns a company?
      Who owns every company?
      Who does not own a company?
  Scope:
    consider only factoid questions
    simple syntactic structure: SVO + PP?
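The three example questions differ only in quantification and negation. The slides do not show the logical queries the system produces for them, so the following is only a minimal Python sketch of one plausible reading of each question, evaluated over a toy set of facts. All names and facts (`people`, `companies`, `owns`, and the three helper functions) are invented for illustration.

```python
# Toy ownership facts; every name here is invented for illustration.
companies = {"acme", "globex"}
people = {"alice", "bob", "carol"}
owns = {("alice", "acme"), ("alice", "globex"), ("bob", "acme")}


def owns_some_company(person):
    """'Who owns a company?' read as: there is a company c that person owns."""
    return any((person, c) in owns for c in companies)


def owns_every_company(person):
    """'Who owns every company?' read as: person owns each company c."""
    return all((person, c) in owns for c in companies)


def owns_no_company(person):
    """'Who does not own a company?' read as: there is no company person owns."""
    return not owns_some_company(person)


answers = {
    "Who owns a company?": sorted(p for p in people if owns_some_company(p)),
    "Who owns every company?": sorted(p for p in people if owns_every_company(p)),
    "Who does not own a company?": sorted(p for p in people if owns_no_company(p)),
}

for question, result in answers.items():
    print(question, result)
```

The negated question is read here with the indefinite inside the scope of the negation; other readings are possible and would need a different query.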
System Architecture
  [Architecture diagram. Components: General Lexicon; Domain Lexicon; Lexicon Generation; Parsing + Sem. Construction; F-Logic KB; Ontobroker. Data labels: NL query; logical query; answer.]
F-Logic in a Nutshell
  frames/methods: microsoft[name -> "Microsoft"; boss -> bill_gates; revenue(2002) -> ]
  subclasses: company::organization
  class-membership: microsoft:company
  queries:
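As a rough illustration of the three constructs on this slide (frames, subclass links, class membership), the following Python sketch models them with plain dictionaries. It is neither F-Logic syntax nor Ontobroker's interface: `KnowledgeBase`, `is_a`, and `instances_of` are hypothetical names, and the revenue value that is missing on the slide is simply left out.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Toy stand-in for an F-Logic KB (hypothetical, not Ontobroker)."""
    frames: dict = field(default_factory=dict)      # object -> {method: value}
    superclass: dict = field(default_factory=dict)  # class -> direct superclass
    member_of: dict = field(default_factory=dict)   # object -> direct class

    def is_a(self, obj, cls):
        """True if obj belongs to cls directly or via the subclass chain."""
        current = self.member_of.get(obj)
        while current is not None:
            if current == cls:
                return True
            current = self.superclass.get(current)
        return False

    def instances_of(self, cls):
        """All objects that are (transitively) members of cls."""
        return sorted(o for o in self.member_of if self.is_a(o, cls))


kb = KnowledgeBase()
# microsoft[name -> "Microsoft"; boss -> bill_gates]
kb.frames["microsoft"] = {"name": "Microsoft", "boss": "bill_gates"}
# company::organization
kb.superclass["company"] = "organization"
# microsoft:company
kb.member_of["microsoft"] = "company"

print(kb.is_a("microsoft", "organization"))  # membership is inherited upward
print(kb.instances_of("organization"))
print(kb.frames["microsoft"]["boss"])
```

The point of the sketch is the inheritance step: because `company` is declared a subclass of `organization`, `microsoft` also counts as an organization without that being stated explicitly.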
Semantic Construction
  compositional semantics approach to map NL questions to F-Logic queries
  relies on an LTAG-style parsing formalism developed in [Muskens 01]
  extension to accommodate ontological concepts [Cimiano & Reyle 03]
Who owns Microsoft?
  [Figure sequence over eight slides: incremental construction of the analysis of "Who owns Microsoft?" from the words "Who", "owns", and "Microsoft".]
Lexicon Generation (part I)
  person[own->company]
  use a corpus (BNC)
  parse it (LoPar)
  extract with tgrep:
    S V O (transitive)
    S V PP (intransitive + PP)
    S V O PP (transitive + PP)
    N PP
    N PP PP
  find the most appropriate synset for each argument w.r.t. WordNet [Resnik 97]
Lexicon Generation (part II)
  person[own -> company]
  own:
    transitive (45.90%): subj: obj:
    intransitive + PP (4.10%): subj: PP(by):
  take the one maximizing:
  repeat the whole process with synonyms of the most frequent synset, i.e. have and possess
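The criterion that is maximized did not survive on this slide, so it is not reproduced here. As a stand-in, the sketch below picks the frame with the highest relative frequency among the extracted occurrences, which is the kind of percentage the slide reports per frame. The observation counts, the frame labels, and the function names `relative_frequencies` and `best_frame` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical frame observations for one verb, as the extraction patterns
# might yield them; the counts are invented for illustration.
observations = (
    ["transitive"] * 9
    + ["intransitive+PP(by)"] * 1
    + ["transitive+PP(in)"] * 2
)


def relative_frequencies(frames):
    """Share of each subcategorization frame among all observations."""
    counts = Counter(frames)
    total = sum(counts.values())
    return {frame: n / total for frame, n in counts.items()}


def best_frame(frames):
    """Stand-in selection rule: the frame with the highest relative frequency."""
    freqs = relative_frequencies(frames)
    return max(freqs, key=freqs.get)


print(relative_frequencies(observations))
print(best_frame(observations))
```

A selection rule that also scores how well the argument synsets match the domain and range of the relation would replace `best_frame`; the frequency table would stay the same.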
Lexicon Generation (part III)
  person[own->company]
  own(subj: person, obj: company)
  Generate elementary trees:
    Who owns Microsoft?
    Which company does Bill Gates own?
    Which company is owned by Bill Gates?
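To make the step from a subcategorization entry to question forms concrete, here is a small Python sketch that produces the three example questions from an entry for "own". It uses string templates as a simplified stand-in for the elementary trees, which the system actually generates; `VerbEntry`, `question_patterns`, and the explicit list of verb forms are hypothetical, and using "Who" for the subject assumes the subject class is `person`.

```python
from dataclasses import dataclass


@dataclass
class VerbEntry:
    """Hypothetical lexicon entry for a binary relation realised as a verb."""
    base: str        # e.g. "own"
    third_sg: str    # e.g. "owns"
    participle: str  # e.g. "owned"
    subj_class: str  # e.g. "person"
    obj_class: str   # e.g. "company"


def question_patterns(entry, subj_name, obj_name):
    """Subject question, object question, and passive object question."""
    return [
        f"Who {entry.third_sg} {obj_name}?",
        f"Which {entry.obj_class} does {subj_name} {entry.base}?",
        f"Which {entry.obj_class} is {entry.participle} by {subj_name}?",
    ]


own = VerbEntry("own", "owns", "owned", "person", "company")
for question in question_patterns(own, "Bill Gates", "Microsoft"):
    print(question)
```

Templates of this kind cover only the surface forms; the trees also carry the semantic representation needed to build the logical query.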
Lexicon Generation (part IV)
  [Figure with labels: Microsoft; Microsoft:company; company.]
Evaluation
  5 ontologies from the DAML library:
    beer
    wine
    personal information
    general information about organizations
    university activities
  acquire an appropriate subcategorization frame for the binary relations
Results
  Columns: Ontology | #Props | Dom.+Range | Non-Composite | Correct | %
  Rows: Beer | Wine | Personal | General | University | Total
Conclusion
  ORAKEL: translating NL questions into logical form
    theoretical (parsing) framework
    lexicon generation
  differs from:
    AQUALOG (map triples to an ontology)
    FrameNet (mapping lexical/semantic representations)
    AID (use TFIDF-based similarity)
    Schema-based (map words to database columns)
Further Work
  Further applications:
    NL generation from an ontology
    mapping from syntax to ontological structures (MOSES)
  Further work:
    user feedback
    answer formulation
    real-life evaluation
    application to construct the KB
EKAW 2004 Workshop on the Application of Language and Semantic Technologies to support Knowledge Management
  Submission deadline: 4 July
  Topics: Multi-lingual systems, Information Extraction, Ontology Learning, Document Indexing, Retrieval and Browsing, Approaches to Semantic Annotation, Smart Browsing, Semantic Search, Question Answering, Enterprise Content Management
  Organizing Committee:
    Philipp Cimiano, AIFB, University of Karlsruhe, Germany
    Fabio Ciravegna, Natural Language Processing Group, University of Sheffield, UK
    Enrico Motta, Knowledge Media Institute, The Open University, UK
    Victoria Uren, Knowledge Media Institute, The Open University, UK