Ontology Mapping with Cyc doug foxvog 14 July
Ontology Mapping
  Completed Mapping Projects
  Mapping Vocabulary
  CycL Syntax and Terminology
  Importation
  Importation Suggests Additional Knowledge
  Mapping to Less Expressive Ontology
  Summary
Ontology Mapping at Cycorp
  Several projects have linked external ontologies, taxonomies, and term code sets to Cyc.
  Non-proprietary projects included the mapping and/or importation of:
    – NAICS and UNSPSC – over 10,000 classes (industries, products)
    – FIPS ,400 individuals (geopolitical entities)
    – MeSH – over 250 classes (anatomical parts)
    – SNOMED – over 500 classes (anatomical parts)
    – SENSUS – over 200 terms (mathematical & logical terms)
    – OpenDirectory – over 10,000 terms (website types by topic)
    – WordNet ,000+ terms (general)
    – 1997 CIA World Factbook – over 5900 facts (geopolitical)
  There have been no non-proprietary ontology exports.
Mapping Vocabulary
  #$synonymousExternalConcept
  #$overlappingExternalConcept
  #$extConceptOverlapsColAndReln
  These predicates specify a mapping by relating a Cyc term to a term in another ontology:
    (#$synonymousExternalConcept ?THING ?ONTOLOGY ?STRING)
    (#$extConceptOverlapsColAndReln ?CLASS ?PRED ?ONTOLOGY ?STRING)
Mapping Vocabulary
  #$MeaningInSystemFn
  Used to express the meaning of a term in an external ontology that has not (yet) been defined in Cyc:
    (#$MeaningInSystemFn #$GalwayCityOntology "quayStreetGalway")
Contexts in Cyc
  A context (#$Microtheory) is a logical world in which formulae are true, false, or unknown.
  Assertions true in a context need not be true outside that context.
  Each context is defined as a subcontext (using #$genlMt) of one or more contexts.
  Cyc defines a partial ordering of contexts via #$genlMt (a transitive predicate).
  There is a universal context that is a #$genlMt of all others.
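The inheritance of assertions along #$genlMt can be sketched in a few lines of Python. This is only an illustration of the partial ordering described on this slide, not Cyc's implementation: the microtheory names and assertions are hypothetical toy data, and "BaseKB" stands in for the universal context.

```python
# Hypothetical toy data: direct genlMt links (subcontext -> more general contexts).
GENL_MT = {
    "GalwayCityMt": {"IrelandGeographyMt"},
    "IrelandGeographyMt": {"WorldGeographyMt"},
    "WorldGeographyMt": {"BaseKB"},
    "BaseKB": set(),  # stand-in for the universal context
}

# Assertions stored in the context where they were made (also hypothetical).
ASSERTIONS = {
    "BaseKB": {"(isa Individual Collection)"},
    "WorldGeographyMt": {"(isa Ireland Country)"},
    "GalwayCityMt": {"(isa QuayStreet Street)"},
}


def all_genl_mts(mt):
    """Return mt plus every context reachable through genlMt (transitive closure)."""
    seen = set()
    stack = [mt]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(GENL_MT.get(current, ()))
    return seen


def visible_in(mt):
    """Assertions true in mt: its own plus those inherited from its genlMts."""
    result = set()
    for context in all_genl_mts(mt):
        result |= ASSERTIONS.get(context, set())
    return result
```

An assertion made in a general context is visible in every subcontext, while an assertion made in "GalwayCityMt" is not visible from "WorldGeographyMt".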
Context of Imported Assertions
  Imported assertions are often made in a context specific to the external ontology.
  This context is placed as a subcontext (using #$genlMt) of the context(s) for the type of concept that the external ontology is about.
  Mapping assertions are made in a separate context if the linkage is to be maintained.
  Imported assertions are raised to a more general context if appropriate.
Importation of an Ontology
  Create or match terms in Cyc to terms in the ontology
    – Initial mapping
  Translate definitional assertions to CycL
  Translate non-definitional assertions to CycL
  Search for matching Cyc concepts
  Merge with matching Cyc concepts
    – Refine mapping
  Assisted refinement of information
Importation of an Ontology
  Specialized tools are built for importing massive databases and taxonomies (e.g., NAICS -- North American Industry Classification System)
  “Slurping” tools are used for standard databases
  Shell/editor scripts are used for tabular data
  These tools provide a first pass -- generating classes, relations, individuals, & mappings, and expressing relationships specified in the source.
  Phrase parsing for (proposed) names
  #$isa and #$genls assertions connect to the Cyc ontology
  Second pass: generality level, suggested KE.
Importation of an Ontology
  If an ontology being imported has a notion that is not in Cyc, a term for that notion is normally added to the Cyc ontology.
  Functional terms, using #$MeaningInSystemFn, may be created if there are no obvious direct mappings, so that assertions in the external ontology can still be represented.
  A functional term definition may be used:
    – All x such that y
    – #$SubcollectionOfWithRelation[From|To][Type]Fn
  As Cyc terms with the correct meanings are created (or discovered):
    – The functional term is normally replaced.
    – A #$synonymousExternalConcept assertion is made to express the mapping.
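The placeholder-then-replace workflow described here might look like the following Python sketch. It is a simplified model under stated assumptions, not a Cycorp tool: the mapping table, the function names `cyc_term_for` and `record_mapping`, and the Cyc constants `#$CityOfGalwayIreland` and `#$QuayStreetGalway` are hypothetical.

```python
# Hypothetical mapping table: (ontology, external term string) -> Cyc term,
# mirroring (#$synonymousExternalConcept ?THING ?ONTOLOGY ?STRING).
SYNONYMOUS = {
    ("GalwayCityOntology", "galwayCity"): "#$CityOfGalwayIreland",
}


def cyc_term_for(ontology, external_name):
    """Return the mapped Cyc term, or a MeaningInSystemFn placeholder if none exists yet."""
    mapped = SYNONYMOUS.get((ontology, external_name))
    if mapped is not None:
        return mapped
    return f'(#$MeaningInSystemFn #${ontology} "{external_name}")'


def record_mapping(ontology, external_name, cyc_term):
    """Register a created or discovered Cyc term and return the mapping assertion.

    Later lookups return the Cyc term instead of the functional placeholder.
    """
    SYNONYMOUS[(ontology, external_name)] = cyc_term
    return f'(#$synonymousExternalConcept {cyc_term} #${ontology} "{external_name}")'
```

Until a proper Cyc term exists, imported assertions can be written against the functional term; once `record_mapping` is called, the same lookup yields the real constant.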
Special Cases
  #$MeaningInSystemFn term is maintained or replaced with #$overlappingExternalConcept
  Predicate with different argument order
    – #$genlInverse (2x) for binary predicate
    – Use rule for higher-order predicate
    – New meta-predicate if it becomes common
  Attribute/Relation/Class mismatch
    – Use rule or meta-predicate
  Term not interesting for Cyc
    – Poorly defined or motivated term
  Special vocabulary for mapping DB schema
Importation Second Pass
  Redundancy Removal
    – Assertion subsumed by existing more general assertion
  Generality Level Checking
    – Should the assertion be more general or more specific?
    – Options given of term replacements
  Intermediate Class Suggestion
    – Multiple sibling classes with similar assertions
  Suggested Knowledge Entering
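As a rough illustration of the redundancy-removal step, the Python sketch below drops an imported class-level assertion when the same assertion already holds for a superclass. The triple representation (predicate, class, value), the class hierarchy, and the predicate names are all hypothetical simplifications; the actual second-pass tools are not described in enough detail here to reproduce.

```python
# Hypothetical direct subclass links (#$genls).
GENLS = {
    "Dog": {"Mammal"},
    "Mammal": {"Animal"},
    "Animal": set(),
}


def all_genls(cls):
    """cls plus all of its superclasses (transitive closure of genls)."""
    seen, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(GENLS.get(c, ()))
    return seen


def is_redundant(imported, existing):
    """True if the same (pred, value) is already asserted of cls or a superclass."""
    pred, cls, value = imported
    return any((pred, sup, value) in existing for sup in all_genls(cls))


def remove_redundant(imported_assertions, existing):
    """Keep only the imported assertions that add information."""
    return [a for a in imported_assertions if not is_redundant(a, existing)]
```

For example, if the knowledge base already states a property of every "Mammal", an imported assertion stating the same property of every "Dog" is subsumed and can be dropped.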
Importation Suggests Additional Knowledge
  #$requiredArg1Pred and #$requiredArg2Pred assertions suggest roles that should be asserted for instances of given types.
  A number of other predicates are also used to suggest, more or less firmly, other assertions that should be made for instances of certain classes.
  During the suggestion phase, the imported concept is more firmly linked to the Cyc ontology.
KE Suggestion Predicates
  keStrongSuggestion
  keGenlsStrongConsiderationPreds
  keGenlsStrongSuggestionPreds
  keWeakSuggestionPreds
  keGenlsStrongSuggestionInverse
  keRequirementPreds
  keStrongConsideration
  keRequirementTernaryPreds
  keStrongSuggestionPreds
  keGenlsWeakSuggestionPreds
  keStrongSuggestionInverse
  keWeakSuggestionInverse
  keWeakSuggestion
  keGenlsWeakSuggestionInverse
  keInducedStrongSuggestionPreds
  keSuggestionApplies
  keInteractionRequirement
  keStrongConsiderationPreds
  keInteractionStrongSuggestion
  keRequirement
  keNeighborSuggestion
  kePlausibleConsideration
  kePredArgStrongSuggestionInverse
  kePredArgWeakSuggestionPreds
  kePredArgWeakSuggestionInverse
  kePredArgStrongSuggestionPreds
  relationExistsAll
  keStrongConsiderationInverse
  relationAllExists
Mapping Assertions to a Less Expressive Ontology
  Assertions that cannot be expressed in the less expressive ontology need not be ignored when generalized forms of such assertions can be generated.
  Can generalize relations
  Can generalize/specialize arguments
  Useful meta-predicates for specializing arguments are #$transitiveViaArg and #$transitiveViaArgInverse.
Inferred Mapping
  Ontology 1 sentence to map:
    (residesInRegion GeorgeWBush NorthwestWashington)
  Ontology 2 cannot express this since it does not have NorthwestWashington, but it has DC:
    (synonymousExternalConcept (TerritoryFn CityOfWashingtonDC) Ontology2 "washingtonDCUSA")
    (synonymousExternalConcept residesInRegion Ontology2 "resides")
    (synonymousExternalConcept GeorgeWBush Ontology2 "georgeWalkerBush")
Inferred Mapping (cont.)
  Cyc knows that #$residesInRegion is transitive via #$geographicalSubRegions:
    (transitiveViaArgInverse residesInRegion 2 geographicalSubRegions)
  Cyc knows that Northwest DC is part of DC:
    (geographicalSubRegions (TerritoryFn CityOfWashingtonDC) NorthwestWashington)
  Cyc concludes something directly mappable:
    (residesInRegion GeorgeWBush (TerritoryFn CityOfWashingtonDC))
  And translates:
    georgeWalkerBush[resides->> washingtonDCUSA].
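The two Inferred Mapping slides can be traced end to end with a small Python sketch. It is a toy model of the reasoning, not Cyc's inference engine: assertions are plain tuples, only the second argument of a binary predicate is generalized, and the function names `weakened_forms` and `export` are hypothetical. The facts and external names are the ones given on the slides.

```python
# Facts from the slides, written as (predicate, arg1, arg2) tuples.
FACTS = {
    ("residesInRegion", "GeorgeWBush", "NorthwestWashington"),
    ("geographicalSubRegions", "(TerritoryFn CityOfWashingtonDC)", "NorthwestWashington"),
}

# (predicate, argument position, linking predicate), as in the meta-assertion on the slide.
TRANSITIVE_VIA_ARG_INVERSE = {("residesInRegion", 2, "geographicalSubRegions")}

# Cyc term -> Ontology 2 name, from the synonymousExternalConcept assertions.
EXTERNAL_NAME = {
    "(TerritoryFn CityOfWashingtonDC)": "washingtonDCUSA",
    "residesInRegion": "resides",
    "GeorgeWBush": "georgeWalkerBush",
}


def weakened_forms(fact, facts=FACTS):
    """Return the fact followed by every weaker form obtained by generalizing arg 2."""
    forms, seen, queue = [fact], {fact}, [fact]
    while queue:
        pred, arg1, arg2 = queue.pop(0)
        for rule_pred, argnum, link in TRANSITIVE_VIA_ARG_INVERSE:
            if rule_pred != pred or argnum != 2:
                continue
            for link_pred, whole, part in facts:
                if link_pred == link and part == arg2:
                    new = (pred, arg1, whole)
                    if new not in seen:
                        seen.add(new)
                        forms.append(new)
                        queue.append(new)
    return forms


def export(fact):
    """Translate the most specific form whose terms all have Ontology 2 names."""
    for pred, arg1, arg2 in weakened_forms(fact):
        if all(term in EXTERNAL_NAME for term in (pred, arg1, arg2)):
            return f"{EXTERNAL_NAME[arg1]}[{EXTERNAL_NAME[pred]}->> {EXTERNAL_NAME[arg2]}]."
    return None
```

Calling `export(("residesInRegion", "GeorgeWBush", "NorthwestWashington"))` skips the original sentence, because NorthwestWashington has no Ontology 2 name, and translates the weakened one.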
Summary
  New classes, relations, and individuals are created to merge in a foreign ontology.
  Unwanted terms are functionally linked.
  Special forms are used for irregular mappings.
  Tools suggest new knowledge to enter.
  Imported assertions have their own context.
  On export, assertions are generalized (weakened) as necessary.
(and (isa ?QUESTION RequestingInformation)
     (topicOf ?QUESTION
              (SubcollectionOfWithRelationToTypeFn OntologyMerging informationTerminal CycTheCollection))
     (startsRelativeToEndOf ?QUESTION (MinutesDuration 15) Now))
Questions?
CycL Syntax
  Lisp-format syntax
    – (relation ?ARG1 ... ?ARGN)
    – If relation is a predicate, the value is true or false.
    – If relation is a function, the value is an instance of the function’s result class.
  Constant syntax: #$Name (can be displayed without “#$”)
  Names have an initial letter followed by letters, numbers, ‘-’, and/or ‘_’
  Variable syntax: ?VARIABLE
CycL Syntax Conventions
  Name starts with a lower-case letter for predicates, otherwise with a capital letter
  Function name ends in "Fn" (and starts with a capital letter)
  No syntactic distinction between classes and individuals
  Multi-word names have initial caps for all interior words
  Names, incl. variable names, should imply semantics
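The naming rules on the two syntax slides are regular enough to check mechanically. The Python sketch below is one possible reading of them, not an official CycL lexer: the function name `classify` is hypothetical, and requiring upper-case variable names is an assumption based on the ?VARIABLE example rather than a stated rule.

```python
import re

# Constant names: an initial letter, then letters, digits, '-' or '_';
# the "#$" prefix is optional because it can be omitted in display.
CONSTANT_RE = re.compile(r"^(?:#\$)?([A-Za-z][A-Za-z0-9_-]*)$")

# Variables: '?' followed by a name (upper case is assumed from the slide's example).
VARIABLE_RE = re.compile(r"^\?[A-Z][A-Z0-9_-]*$")


def classify(token):
    """Classify a CycL token using the naming conventions from the slides."""
    if VARIABLE_RE.match(token):
        return "variable"
    match = CONSTANT_RE.match(token)
    if not match:
        return "invalid"
    name = match.group(1)
    if name[0].islower():
        return "predicate"       # predicates start with a lower-case letter
    if name.endswith("Fn"):
        return "function"        # functions are capitalized and end in "Fn"
    return "other constant"      # classes and individuals are not distinguished
```

Because the conventions do not distinguish classes from individuals, `#$Microtheory` and `#$DouglasFoxvog` both fall into the last category.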
CycL Terminology
  #$Collection = class, something that has instances
  #$Individual
    – non-set/class (incl. relations, strings, numbers)
  #$FirstOrderCollection (“concept”)
    – Class whose instances are only #$Individuals
  #$isa = instance of
  #$genls = subclass of (“is-a”)
  #$genlPreds = subrelation (#$genlInverse ...)
  #$Microtheory -- context for assertions
    – logical world in which a formula is true
  #$genlMt = all assertions in the second (more general) context lift to the first context
  Operators are built-in, code-supported #$Predicates:
    – #$and #$or #$not #$implies #$arity #$thereExists
    – #$assertedSentence #$knownSentence #$arg1Isa
    – #$interArg[Isa/Genl/Reln/Format]N-M #$resultIsa
CycL Examples
  Sentence:
    (#$isa #$Individual #$Collection)
  Rule:
    (#$implies
      (#$and (#$isa ?INSTANCE ?CLASS)
             (#$genls ?CLASS ?SUPERCLASS))
      (#$isa ?INSTANCE ?SUPERCLASS))
  Function:
    (#$comment (#$BodyPartFn #$DouglasFoxvog #$Nose) "Doug Foxvog’s nose")
    (#$comment (#$BodyPartFn #$DouglasFoxvog (#$LeftFn #$Ear)) "Doug Foxvog’s left ear")
  Argument Type Specification:
    (#$isa #$BodyPartFn #$BinaryFunction)
    (#$arg1Isa #$BodyPartFn #$Animal)
    (#$arg2Isa #$BodyPartFn #$UniqueAnatomicalPartType)
    (#$resultIsa #$BodyPartFn #$AnimalBodyPart)
    (#$resultIsaArg #$BodyPartFn 2)
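To show how the rule and the argument type specification work together, here is a minimal Python sketch of an argument check for #$BodyPartFn. It is an illustration under assumptions, not how Cyc enforces well-formedness: the toy `ISA` and `GENLS` facts are hypothetical, the function names are invented, and only the #$arg1Isa and #$arg2Isa constraints come from the slide.

```python
# Hypothetical toy KB: direct instance-of and subclass-of links.
ISA = {
    "DouglasFoxvog": {"Person"},
    "Nose": {"UniqueAnatomicalPartType"},
}
GENLS = {
    "Person": {"Animal"},
}

# Argument constraints from the slide: (#$arg1Isa ...) and (#$arg2Isa ...).
ARG_ISA = {"BodyPartFn": {1: "Animal", 2: "UniqueAnatomicalPartType"}}


def superclasses(cls):
    """cls plus everything reachable through genls."""
    seen, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(GENLS.get(c, ()))
    return seen


def is_instance(term, cls):
    """The rule from the slide: isa combined with genls implies isa."""
    return any(cls in superclasses(direct) for direct in ISA.get(term, ()))


def well_formed(function, *args):
    """Check the arity and each argument against the function's argIsa constraints."""
    constraints = ARG_ISA[function]
    return len(args) == len(constraints) and all(
        is_instance(arg, constraints[i]) for i, arg in enumerate(args, start=1)
    )
```

With this toy data, `well_formed("BodyPartFn", "DouglasFoxvog", "Nose")` holds because DouglasFoxvog is a Person and Person is a subclass of Animal, while swapping the arguments fails the check.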