QUIRK: Project Progress Report
Monterey, June
Cycorp, IBM
Task decomposition
Question understanding
–English question to CycL query
Question answering
–CycL query to CycL bindings and proof trees
Answer ranking and presentation
–CycL bindings and proof trees to cognitively reasonable English rendition
Knowledge-based query decomposition
(grandfathers (PersonNamedFn "Charles Bonaparte") ?GRAND)
(and (parents (PersonNamedFn "Charles Bonaparte") ?PARENT)
     (father ?PARENT ?GRAND))
(?PARENT . "Jerome Napoleon Bonaparte")
(?GRAND . (PersonNamedFn "Jerome Bonaparte"))
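The decomposition above can be sketched in Python as a rule that rewrites one literal into a conjunction of simpler ones, followed by a naive conjunctive match against a fact base. This is only an illustration, not Cyc's inference engine: the `RULES` table, the tuple encoding of literals, and the helper names are assumptions, and person terms are flattened to plain name strings.

```python
# Toy fact base: (predicate, arg1, arg2). The two facts mirror the slide's bindings.
FACTS = {
    ("parents", "Charles Bonaparte", "Jerome Napoleon Bonaparte"),
    ("father", "Jerome Napoleon Bonaparte", "Jerome Bonaparte"),
}

# Hypothetical rule table: a predicate expands into a conjunction of literals.
RULES = {
    "grandfathers": [("parents", "?X", "?PARENT"), ("father", "?PARENT", "?Y")],
}


def is_var(term):
    return term.startswith("?")


def decompose(query):
    """Rewrite (pred x y) into the rule body, renaming ?X and ?Y to the query args."""
    pred, x, y = query
    rename = {"?X": x, "?Y": y}
    return [tuple(rename.get(t, t) for t in lit) for lit in RULES[pred]]


def match(literal, fact, bindings):
    """Unify one literal with one ground fact; return extended bindings or None."""
    if len(literal) != len(fact):
        return None
    new = dict(bindings)
    for term, value in zip(literal, fact):
        term = new.get(term, term)  # apply any existing binding
        if is_var(term):
            new[term] = value
        elif term != value:
            return None
    return new


def solve(literals, bindings=None):
    """Yield every binding set that satisfies all literals, left to right."""
    bindings = bindings or {}
    if not literals:
        yield bindings
        return
    for fact in FACTS:
        extended = match(literals[0], fact, bindings)
        if extended is not None:
            yield from solve(literals[1:], extended)


subqueries = decompose(("grandfathers", "Charles Bonaparte", "?GRAND"))
answers = list(solve(subqueries))
```

Solving the sub-queries left to right means the binding for ?PARENT found by the first literal constrains the second.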
Integration of heterogeneous sources
We have achieved integration of:
–Knowledge representation (Cyc’s Knowledge Base)
–Textual documents (TREC corpus via GuruQA search engine)
–Databases (NIMA, USGS)
CycL to GuruQA translator
Implemented query-language object data structures
–The GuruQA syntactic objects are instances of these
–They can be subclassed to target other IR engines
CycL to GuruQA translator
(numberOfInhabitants Roma)
Italy population POPULATE$)
CycL to GuruQA translator
(businessContacts @PHR(0 business business partner)) *Bill_Clinton PERSON$)
Answer selector
Implemented so that it can access Textract and other existing or future text-analysis services
Receives as input
–original CycL query
–TREC paragraph retrieved by GuruQA
Returns a list of
–CycL bindings (reified or virtual)
–their scores
Answer selector
QUERY: (employees ?ORG (PersonNamedFn "Mary Jones"))
Step 1: learns new, typed vocabulary from Textract
“Mary Jones’ office at ACME Inc has a lovely view of the bridge over the Moldava as her previous office at SUPRA Ltd did.”
(PersonNamedFn "Mary Jones") ==isa==> Person
(CompanyNamedFn "ACME Inc") ==isa==> Company
(RiverNamedFn "Moldava") ==isa==> River
(CompanyNamedFn "SUPRA Ltd") ==isa==> Company
Answer selector
QUERY: (employees ?ORG (PersonNamedFn "Mary Jones"))
Step 2: retrieves CycL interpretations from the paragraph
“Mary Jones’ office at ACME Inc has a lovely view of the bridge over the Moldava as her previous office at SUPRA Ltd did.”
(PersonNamedFn "Mary Jones")
OfficeRoom
(CompanyNamedFn "ACME Inc")
Bridge
(RiverNamedFn "Moldava")
(CompanyNamedFn "SUPRA Ltd")
Answer selector
Step 3: filter answers by predicate argument constraints
Q: (employees ?ORG (PersonNamedFn "Mary Jones"))
Argument types: Organization, Person
(CompanyNamedFn "ACME Inc")
(CompanyNamedFn "SUPRA Ltd")
Bridge
OfficeRoom
(RiverNamedFn "Moldava")
(PersonNamedFn "Mary Jones")
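A minimal Python sketch of this filtering step follows. It assumes that the first argument of employees must be an Organization and that Company is a specialization of Organization; the `GENLS` and `ARG_TYPES` tables, the typing of Bridge and OfficeRoom as Collection, and the function names are illustrative assumptions, not Cyc's actual constraint machinery.

```python
# Assumed type hierarchy: child type -> parent type.
GENLS = {"Company": "Organization"}

# Assumed argument constraints, one required type per argument position.
ARG_TYPES = {"employees": ("Organization", "Person")}

# Candidate terms with the types learned in Step 1. Treating Bridge and
# OfficeRoom as instances of Collection is an assumption made for this sketch.
CANDIDATES = {
    '(PersonNamedFn "Mary Jones")': "Person",
    "OfficeRoom": "Collection",
    '(CompanyNamedFn "ACME Inc")': "Company",
    "Bridge": "Collection",
    '(RiverNamedFn "Moldava")': "River",
    '(CompanyNamedFn "SUPRA Ltd")': "Company",
}


def is_a(term_type, required):
    """Walk up the hierarchy from term_type looking for the required type."""
    while term_type is not None:
        if term_type == required:
            return True
        term_type = GENLS.get(term_type)
    return False


def filter_candidates(predicate, arg_index, candidates):
    """Keep the candidates whose type satisfies the constraint on that argument."""
    required = ARG_TYPES[predicate][arg_index]
    return [term for term, t in candidates.items() if is_a(t, required)]


# ?ORG is the first argument (index 0) of employees.
kept = filter_candidates("employees", 0, CANDIDATES)
# kept == ['(CompanyNamedFn "ACME Inc")', '(CompanyNamedFn "SUPRA Ltd")']
```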
Answer selector
Step 4: rank retrieved bindings by their proximity, within the paragraph, to the ground terms of the CycL query
QUERY: (employees ?ORG (PersonNamedFn "Mary Jones"))
“Mary Jones’ office at ACME Inc has a lovely view of the bridge over the Moldava as her previous office at SUPRA Ltd did.”
(CompanyNamedFn "ACME Inc") => 3 words away
(CompanyNamedFn "SUPRA Ltd") => 20 words away
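The proximity ranking can be sketched in Python as the smallest token distance between any word of the ground term and any word of the candidate. The whitespace tokenizer and the function names are assumptions made for illustration; the slide does not say how distances are actually measured, although this definition reproduces its figures of 3 and 20.

```python
import string

PARAGRAPH = (
    "Mary Jones’ office at ACME Inc has a lovely view of the bridge over "
    "the Moldava as her previous office at SUPRA Ltd did."
)


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, lowercase."""
    strip_chars = string.punctuation + "’“”"
    return [word.strip(strip_chars).lower() for word in text.split()]


def phrase_positions(tokens, phrase):
    """Return the token indices covered by every occurrence of the phrase."""
    words = tokenize(phrase)
    hits = []
    for i in range(len(tokens) - len(words) + 1):
        if tokens[i:i + len(words)] == words:
            hits.extend(range(i, i + len(words)))
    return hits


def distance(paragraph, ground, candidate):
    """Smallest gap in tokens between the ground term and the candidate."""
    tokens = tokenize(paragraph)
    ground_pos = phrase_positions(tokens, ground)
    cand_pos = phrase_positions(tokens, candidate)
    if not ground_pos or not cand_pos:
        return None
    return min(abs(i - j) for i in ground_pos for j in cand_pos)


def rank(paragraph, ground, candidates):
    """Sort candidates found in the paragraph by distance, closest first."""
    scored = [(distance(paragraph, ground, c), c) for c in candidates]
    return sorted((d, c) for d, c in scored if d is not None)


ranking = rank(PARAGRAPH, "Mary Jones", ["SUPRA Ltd", "ACME Inc"])
# ranking == [(3, "ACME Inc"), (20, "SUPRA Ltd")]
```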
Answer selector
Will soon use a dependency analyzer to improve answer selection
Mary Jones works at ACME Inc.
works [A_nx0V] 1
  args: Mary_Jones [A_NXN] 0
    args:
    modifiers:
  modifiers: at [B_vxPnx] 2
    args: ACME_Inc. [A_NXN] 3
      args:
      modifiers:
Semantic Knowledge Source Integration (SKSI)
Declarative description of knowledge source via Knowledge Base edits
Declarative description of access path via Knowledge Base edits
On-line integration with Cyc’s Inference Engine
Backchaining into NIMA database
QUERY:
(locatedAtPoint-SurfaceGeographical (PlaceNamedFn "Faedis") (LatLongFn ?LAT ?LONG))
Declarative description of knowledge source via KB edits:
(implies
  (NimaGnsPred ?PLACE ?LAT ?LONG)
  (locatedAtPoint-SurfaceGeographical ?PLACE (LatLongFn ?LAT ?LONG)))
Converting CycL into NIMA SQL query
(NimaGnsPred (PlaceNamedFn "Faedis") ?LAT ?LONG)
"SELECT dd_lat,dd_long FROM gns WHERE (full_name_nd = 'Faedis')"
dd_lat | dd_long |
(1 row)
(?LAT d0) (?LONG d0)
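The translation step can be sketched in Python: ground arguments of the literal become WHERE conditions and variables become selected columns. The table and column names are taken from the slide; everything else (the `COLUMNS` mapping, the function names, flattening (PlaceNamedFn "Faedis") to the plain name, and the in-memory SQLite database) is an assumption for illustration and is not how SKSI itself is implemented. The coordinates inserted below are placeholders, not the real values for Faedis.

```python
import sqlite3

# Assumed mapping from NimaGnsPred argument positions to gns columns.
COLUMNS = ("full_name_nd", "dd_lat", "dd_long")


def to_sql(literal):
    """Translate (NimaGnsPred place lat long) into parameterized SQL."""
    pred, *args = literal
    if pred != "NimaGnsPred":
        raise ValueError(f"no SQL mapping for {pred}")
    pairs = list(zip(COLUMNS, args))
    select = [col for col, arg in pairs if arg.startswith("?")] or ["1"]
    where = [(col, arg) for col, arg in pairs if not arg.startswith("?")]
    sql = "SELECT " + ",".join(select) + " FROM gns"
    if where:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col, _ in where)
    variables = [arg for arg in args if arg.startswith("?")]
    return sql, [value for _, value in where], variables


def ask(conn, literal):
    """Run the translated query and return one binding dict per row."""
    sql, params, variables = to_sql(literal)
    return [dict(zip(variables, row)) for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gns (full_name_nd TEXT, dd_lat REAL, dd_long REAL)")
# Placeholder coordinates for the demo only.
conn.execute("INSERT INTO gns VALUES ('Faedis', 1.5, 2.5)")

sql, params, variables = to_sql(("NimaGnsPred", "Faedis", "?LAT", "?LONG"))
bindings = ask(conn, ("NimaGnsPred", "Faedis", "?LAT", "?LONG"))
```

Passing the place name as a query parameter rather than splicing it into the SQL string avoids quoting problems with names that contain apostrophes.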
NIMA integration example query
Query:
(and (placeOfDeath AntoninArtaud ?PLACE)
     (locatedAtPoint-SurfaceGeographical ?PLACE (LatLongFn ?LAT ?LONG)))
Answer:
(?PLACE . CityOfIvry-sur-SeineFrance) (?LAT d0) (?LONG d0)
QUIRK’s Blackboard: CyBlack
–CycL-to-GuruQA translator
–Paragraph retriever
–Answer Selector
–Inference Engine
are all connected to the Cyc Blackboard
More agents can be added without disturbing the existing architecture
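The extensibility claim can be illustrated with a generic blackboard sketch in Python: agents register for a kind of posting and communicate only through the board, so a new agent is added by registration alone. The class, the posting kinds, and the stub agents are all hypothetical; CyBlack's real interface is not described in these slides.

```python
from collections import defaultdict, deque


class Blackboard:
    """Minimal blackboard: agents subscribe to posting kinds and post results."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.queue = deque()
        self.log = []

    def register(self, kind, handler):
        self.handlers[kind].append(handler)

    def post(self, kind, payload):
        self.queue.append((kind, payload))

    def run(self):
        # Deliver postings in order until no agent has anything left to add.
        while self.queue:
            kind, payload = self.queue.popleft()
            self.log.append(kind)
            for handler in list(self.handlers[kind]):
                handler(self, payload)


# Stub agents standing in for the components named on the slide.
def translator(board, cycl_query):
    board.post("ir-query", f"translated({cycl_query})")


def paragraph_retriever(board, ir_query):
    board.post("paragraph", f"paragraph-for({ir_query})")


def answer_selector(board, paragraph):
    board.post("answer", f"binding-from({paragraph})")


def build_pipeline():
    board = Blackboard()
    board.register("cycl-query", translator)
    board.register("ir-query", paragraph_retriever)
    board.register("paragraph", answer_selector)
    return board


board = build_pipeline()
seen = []
# A new agent is attached without touching the existing ones.
board.register("answer", lambda b, payload: seen.append(payload))
board.post("cycl-query", "Q")
board.run()
```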
Use of external oracles
Examples from
As alternative sources of textual information
–Search engines (e.g. Google)
–U.S. Code of Federal Regulations
–SEC's EDGAR database
As providers of services
–Obtain a list of ATMs for given ZIP codes
–Find under whose name a telephone number is registered
–Find out about current weather conditions in a given geographical region
–Validate a street address
Generally: code support for inference via Web Services
Example: Google client
Implements
–(webCorpusTermDocumentCount TERM ?COUNT)
Potential for declarative representation in an ontology of expert sources
Can be used to
–improve generation of GuruQA queries
–order search by approximating relative importance/obscurity of terms
(and (isa ?WHO GermanPerson) (isa ?WHO Musician))
(isa ?WHO GermanPerson) | (isa ?WHO Musician)
Frankenstein: 0.5M | Mozart: 1.5M
KarlMarx: 290K | Sting: 1.25M
Beethoven: 113K | Beethoven: 113K
… | …
Harry Tasso: 92 | KeithGodchaux: 1280
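Ordering by term prominence can be sketched in Python with a stub in place of the live search-engine call. The counts are the figures legible on the slide; the lookup table, the function names, and the idea of intersecting the two candidate lists before ordering are assumptions for illustration only.

```python
# Document counts as read from the slide; a stub table replaces the
# live web lookup behind webCorpusTermDocumentCount.
COUNTS = {
    "Frankenstein": 500_000,
    "KarlMarx": 290_000,
    "Beethoven": 113_000,
    "Mozart": 1_500_000,
    "Sting": 1_250_000,
    "KeithGodchaux": 1_280,
}


def web_corpus_term_document_count(term):
    """Stub for (webCorpusTermDocumentCount TERM ?COUNT); unknown terms count 0."""
    return COUNTS.get(term, 0)


def order_by_prominence(terms):
    """Most frequently mentioned terms first, obscure terms last."""
    return sorted(terms, key=web_corpus_term_document_count, reverse=True)


def conjoin(first, second):
    """Terms satisfying both conjuncts, ordered by prominence."""
    return order_by_prominence(set(first) & set(second))


germans = ["Beethoven", "Frankenstein", "KarlMarx"]
musicians = ["KeithGodchaux", "Beethoven", "Sting", "Mozart"]

ordered_musicians = order_by_prominence(musicians)
both = conjoin(germans, musicians)
```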
Query interpretation tools
We have integrated from other Cycorp projects:
–Question parser with disambiguation dialog
–Scenario constructor (for queries that would be too complex to express in natural language)
Question parser and clarification dialog
Which paintings about war did Picasso create?
Question parser and clarification dialog SQ1 SQ2 SQ3
Scenario constructor