
1 QUIRK: Project Progress Report, December 3-5, 2002. Cycorp / IBM

2 Notable Progress
–Query decomposition extensions
–Argument-structure approximation
–Syntactic analysis of textual sources
–Reflexive justifications

3 Single-Literal Query Decomposition
Given the rule (Q & R & Z) ⇒ P, the query P? is decomposed into the single-literal queries Q?, R?, and Z?, each posed separately.

4 Multi-Literal Query Decomposition
Given the same rule (Q & R & Z) ⇒ P, the query P? is decomposed into the joint multi-literal query Q?, R? plus the single-literal query Z?, so that Q and R are answered together rather than independently.

5 Multi-Literal Query Decomposition
Query: (likes Bob ?X)
Rule: ((isa ?X French) & (isa ?X Movie) & (likes Amy ?X)) ⇒ (likes Bob ?X)
Decomposition: the joint multi-literal query (isa ?X French), (isa ?X Movie) plus the query (likes Amy ?X).
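
To make the mechanism concrete, here is a minimal Python sketch of this kind of decomposition: a query is unified with a rule's consequent, and the instantiated antecedent literals become the subgoals. The tuple representation, the "?"-prefixed variables, and all function names are illustrative assumptions, not Cyc's actual inference API.

# Hedged sketch: decompose a query against rules of the form
# (antecedent literals) => consequent. Not Cyc's real engine.

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def unify(x, y, env):
    # Return an extended binding environment, or None on failure.
    if env is None:
        return None
    if x == y:
        return env
    if is_var(x):
        return unify(env[x], y, env) if x in env else {**env, x: y}
    if is_var(y):
        return unify(x, env[y], env) if y in env else {**env, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):
            env = unify(a, b, env)   # None propagates via the guard above
        return env
    return None

def substitute(term, env):
    # Replace bound variables in a literal with their values.
    if is_var(term) and term in env:
        return substitute(env[term], env)
    if isinstance(term, tuple):
        return tuple(substitute(t, env) for t in term)
    return term

def decompose(query, rules):
    # Yield one list of subgoal literals per applicable rule.
    for antecedent, consequent in rules:
        env = unify(query, consequent, {})
        if env is not None:
            yield [substitute(lit, env) for lit in antecedent]

# The rule from the slide above.
rule = ([("isa", "?X", "French"),
         ("isa", "?X", "Movie"),
         ("likes", "Amy", "?X")],
        ("likes", "Bob", "?X"))

for subgoals in decompose(("likes", "Bob", "?X"), [rule]):
    print(subgoals)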

6 Examples
Joins in external DBs (NIMA, USGS, …):
–Airports in Travis County, TX
–Hospitals located in port cities
–…
Web services, e.g. IMDB:
–Actors from the '50s
As a bridge between KR formats

7 Davidsonian KR bridge
"Wellington defeated Napoleon in Waterloo."
(thereExists ?EV
  (and (isa ?EV DefeatingAnOpponent)
       (performedBy ?EV Wellington)
       (objectActedOn ?EV Napoleon)
       (eventOccursAt ?EV Waterloo)))
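
A small sketch of how this bridge could be mechanized: build the existential Davidsonian form from a shallow verb frame. The frame layout and the build_event_form helper are assumptions for illustration; only the output format follows the slide.

# Hedged sketch: emit a Davidsonian CycL event form from an event type
# plus role bindings. Helper name and frame layout are hypothetical.

def build_event_form(event_type, roles):
    literals = [f"(isa ?EV {event_type})"]
    literals += [f"({role} ?EV {filler})" for role, filler in roles.items()]
    return "(thereExists ?EV\n  (and " + "\n       ".join(literals) + "))"

# Reproduces the form on the slide.
print(build_event_form("DefeatingAnOpponent",
                       {"performedBy": "Wellington",
                        "objectActedOn": "Napoleon",
                        "eventOccursAt": "Waterloo"}))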

8 Argument-Type bridge
"John lives in a French village."
(thereExists ?V
  (and (isa ?V Village)
       (geographicalSubRegions France ?V)
       (residesInRegion John ?V)))

9 Registration of multi-literal removal modules
–At the moment, sufficiently few such modules exist that they can be defined in code.
–There are plans for declarative registration of such modules in Cyc's KB, even with run-time KB edits.

10 Arg-based query generation
(thereExists ?EV
  (and (isa ?EV AttackOnObject)
       (maleficiary ?EV Djibouti)
       (performedBy ?EV ?WHO)))
Syntactic frame: [SUBJ [VERB OBJ]]
Generated pattern: @PHR(2 PERSON$ attack *Djibouti)
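
One possible reading of this slide, sketched in Python: map the event type to a verb and the event roles to subject/object slots, then emit the Talent-style phrase pattern. The lexical tables and the treatment of the leading "2" (copied verbatim from the slide's example) are assumptions; only the output string itself is taken from the slide.

# Hedged sketch of argument-based query generation. The mapping tables
# are invented for illustration; the @PHR output format is copied from
# the slide's example.

VERB_FOR_TYPE = {"AttackOnObject": "attack"}   # assumed lexicalization

def query_to_pattern(event_type, fillers):
    # fillers: role -> ground term, or a type placeholder (e.g. PERSON$)
    # standing in for an open variable such as ?WHO.
    verb = VERB_FOR_TYPE[event_type]
    subj = fillers.get("performedBy", "?")
    obj = fillers.get("maleficiary", "?")
    return f"@PHR(2 {subj} {verb} *{obj})"   # "2" copied from the slide

print(query_to_pattern("AttackOnObject",
                       {"performedBy": "PERSON$",
                        "maleficiary": "Djibouti"}))
# -> @PHR(2 PERSON$ attack *Djibouti)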

11 Secretary
Input:
–A CycL query such as (president France ?WHO)
–A textual paragraph
Output: a ranked list of CycL terms that
–represent entities mentioned in the paragraph
–are type-appropriate as substitutions for the free variables in the query (here, ?WHO : Person)
Three types of Secretary are described below.

12 Secretary 1
–Use IBM's Talent system to learn new lexical entries
–Tag the paragraph with lexical mappings
–Select type-appropriate CycL tags
–Rank them by proximity to the query focus, as determined by the recorded positions in the paragraph of all the ground terms in the CycL query
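
The proximity ranking can be pictured with a few lines of Python: each type-appropriate candidate is scored by its distance to the nearest recorded position of a query ground term. The data layout and names are hypothetical.

# Hedged sketch of Secretary 1's proximity ranking.

def rank_by_proximity(candidates, ground_positions):
    # candidates: (cyc_term, token_position) pairs from the tagged
    # paragraph; ground_positions: token positions of the query's
    # ground terms (the query focus).
    def distance(pos):
        return min(abs(pos - g) for g in ground_positions)
    return sorted(candidates, key=lambda c: distance(c[1]))

# For (president France ?WHO), with "France" found at token 12:
tagged = [("CityOfParisFrance", 30), ("JacquesChirac", 14)]
print(rank_by_proximity(tagged, [12]))   # JacquesChirac ranks first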

13 Secretary 2
–Use IBM's Talent system to learn new lexical entries
–Use the output of UPenn's dependency parser to generate a set of CycL interpretations of the paragraph
–Select the "best" interpretation
–Return the CycL entity standing in the appropriate relationship to the query's predicate

14 Secretary 3
–Use IBM's Talent system to learn new lexical entries
–Use the output of UPenn's dependency parser to generate a set of CycL interpretations of the paragraph
–Select the "best" interpretation and turn it into a virtual assertion in Cyc's KB
–Ask the original query against the KB so obtained
–Return all answers
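
The virtual-assertion step might look like the following sketch: the best interpretation is added to a scratch store and the original query is answered against it by pattern matching. Everything here is illustrative; Cyc's actual KB and ask machinery are far richer.

# Hedged sketch of Secretary 3's ask-against-virtual-assertion step.

def ask(query, kb):
    # Match a single-literal query with one free variable against kb.
    var = next(t for t in query if t.startswith("?"))
    for fact in kb:
        if len(fact) == len(query) and all(
                q == f or q == var for q, f in zip(query, fact)):
            yield fact[query.index(var)]

kb = set()
kb.add(("president", "France", "JacquesChirac"))   # virtual assertion
print(list(ask(("president", "France", "?WHO"), kb)))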

15 General observations
Secretary 2 and 3 have better precision than Secretary 1, but much lower recall
–possibly due to the non-verb-like nature of many Cyc predicates; need to check whether the same holds true of multi-literal events
The linear proximity of Secretary 1 is almost as good as the argument-based analysis of Secretary 2 and 3

16 Introspective Justifications

18 Dialog evaluation
–Basic knowledge representation was performed for each of the topics
–Used the KRAKEN GUI for interpretation of questions
–Used KRAKEN NL generation for reporting answers to the analyst

19 KRAKEN GUI
Example question: "Which paintings about war did Picasso create?"

20 Contextual vs. Keyhole approach
Several questions are asked simultaneously:
–"Need background data on the Cuban dissident Elizardo Sanchez to include birth data, education, work ethics, organization affiliations to name a few."
The analyst is happy with a summary of all facts known about an entity of interest.

21 Lessons learned
–Analysts like to ask questions and see answers in context
–The single-question/single-answer approach could be extended to:
  –a dossier about entity X
  –a preliminary dialog on desired properties of the dossier, inferred from properties of entity X
–Justifications become interesting only if answers are sufficiently surprising

22 Definitional Questions Evaluation
Expectations:
–large answer set
–both redundancy AND irrelevance
–opportunity for structuring the answer set by salient features of the question focus
Actual experience:
–limited answer set
–mostly redundancy

23 Original Plan
–Use appositives to learn the type of an entity: "Massimo Cacciari" → "Venice Mayor"
–Use Cyc to:
  –understand the type (a kind of elected official)
  –generate a list of questions salient for the type: When was he elected? What is his party affiliation? …
–Answer the salient questions from textual sources
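
A toy sketch of the type-driven question generation in this plan; the template table is an assumption standing in for Cyc's actual salience knowledge.

# Hedged sketch: salient questions generated from an entity's type.

SALIENT_QUESTIONS = {   # assumed template table
    "ElectedOfficial": ["When was {e} elected?",
                        "What is {e}'s party affiliation?"],
}

def salient_questions(entity, entity_type):
    return [q.format(e=entity)
            for q in SALIENT_QUESTIONS.get(entity_type, [])]

print(salient_questions("Massimo Cacciari", "ElectedOfficial"))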

24 Revised Plan
–Use syntactic analysis to extract appositives and relevant VPs
–Cluster the strings so extracted
–Return one string from each cluster, ranked by the size of the cluster
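
One way the clustering step could be realized, assuming token-overlap (Jaccard) similarity with an arbitrary threshold:

# Hedged sketch of the revised plan: greedy clustering of extracted
# strings by token overlap; clusters ranked by size, one representative
# returned per cluster. The 0.5 threshold is an assumption.

def cluster_strings(strings, threshold=0.5):
    clusters = []
    for s in strings:
        toks = set(s.lower().split())
        for cluster in clusters:
            rep = set(cluster[0].lower().split())
            if len(toks & rep) / len(toks | rep) >= threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])
    return sorted(clusters, key=len, reverse=True)

extracted = ["Venice Mayor", "mayor of Venice", "a philosopher"]
for cluster in cluster_strings(extracted):
    print(len(cluster), cluster[0])   # cluster size, representative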

25 Lessons learned
–Punctuation and function words are crucial
–Textual sources don't always support an analysis by "salient features"
–Semantic analysis is not necessarily useful if the end result is expected to be a string that can be easily interpreted by the analyst

