Flarenet-Silt workshop on Ontology and Lexicon September-19 th -2009, Pisa Division of semantic labor over vocabulary and ontology layers Piek Vossen,

Slides:



Advertisements
Similar presentations
TWO STEP EQUATIONS 1. SOLVE FOR X 2. DO THE ADDITION STEP FIRST
Advertisements

By David J Smith. If the worlds population was 100 people...
1 Copyright © 2010, Elsevier Inc. All rights Reserved Fig 2.1 Chapter 2.
OLIF V2 Gr. Thurmair April OLIF April 2000 OLIF: Overview Rationale Principles Entries Descriptions Header Examples Status.
By D. Fisher Geometric Transformations. Reflection, Rotation, or Translation 1.
COSMO and the Defining Vocabulary: Next Steps The Foundation Ontology as a Conceptual Defining Vocabulary Patrick Cassidy Ontology Summit 2007 April 23,
Exploiting ebXML Registry Semantics in the eHealth Domain*
Business Transaction Management Software for Application Coordination 1 Business Processes and Coordination.
Building Wordnets Piek Vossen, Irion Technologies.
FP7, Information Day Call 5, Luxembourg, May 11-12, 2009 KYOTO (ICT ) Yielding Ontologies for Transition-Based Organization FP7: Intelligent Content.
Division of semantic labor in the Global WordNet Grid Piek Vossen, VU University Amsterdam German Rigau, University of the Basque Country 5 th Global Wordnet.
Maritime Knowledge Base Semantic Application Semantic Exchange Workshop February 17th, 2009 Eric Freese Semantic Web, XML & Geospatial Technologist Copyright.
Multilinguality & Semantic Search Eelco Mossel (University of Hamburg) Review Meeting, January 2008, Zürich.
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Title Subtitle.
0 - 0.
DIVIDING INTEGERS 1. IF THE SIGNS ARE THE SAME THE ANSWER IS POSITIVE 2. IF THE SIGNS ARE DIFFERENT THE ANSWER IS NEGATIVE.
MULTIPLYING MONOMIALS TIMES POLYNOMIALS (DISTRIBUTIVE PROPERTY)
FACTORING Think Distributive property backwards Work down, Show all steps ax + ay = a(x + y)
FACTORING ax2 + bx + c Think “unfoil” Work down, Show all steps.
Addition Facts
Year 6 mental test 5 second questions
Limitations of the relational model 1. 2 Overview application areas for which the relational model is inadequate - reasons drawbacks of relational DBMSs.
1 ICS-FORTH & Univ. of Crete SeLene November 15, 2002 A View Definition Language for the Semantic Web Maganaraki Aimilia.
|epcc| NeSC Workshop Open Issues in Grid Scheduling Ali Anjomshoaa EPCC, University of Edinburgh Tuesday, 21 October 2003 Overview of a Grid Scheduling.
Copyright 2006 Digital Enterprise Research Institute. All rights reserved. MarcOnt Initiative Tools for collaborative ontology development.
Around the World AdditionSubtraction MultiplicationDivision AdditionSubtraction MultiplicationDivision.
1/ 26 AGROVOC and the OWL Web Ontology Language: the Agriculture Ontology Service - Concept Server OWL model NKOS workshop Alicante,
ZMQS ZMQS
George Anadiotis, Spyros Kotoulas and Ronny Siebes VU University Amsterdam.
Knowledge Extraction from Technical Documents Knowledge Extraction from Technical Documents *With first class-support for Feature Modeling Rehan Rauf,
Configuration management
© 2011 TIBCO Software Inc. All Rights Reserved. Confidential and Proprietary. Towards a Model-Based Characterization of Data and Services Integration Paul.
ABC Technology Project
1 Evaluations in information retrieval. 2 Evaluations in information retrieval: summary The following gives an overview of approaches that are applied.
Squares and Square Root WALK. Solve each problem REVIEW:
1 ISWC-2003 Sanibel Island, FL IMG, University of Manchester Jeff Z. Pan 1 and Ian Horrocks 1,2 {pan | 1 Information Management.
Lets play bingo!!. Calculate: MEAN Calculate: MEDIAN
CH-4 Ontologies, Querying and Data Integration. Introduction to RDF(S) RDF stands for Resource Description Framework. RDF is a standard for describing.
OBJECT-ORIENTEDNESS. What is Object-Orientedness? model system as a collection of interacting objects O-O Modelling  O-O Programming ► similar conceptual.
Chapter 5 Test Review Sections 5-1 through 5-4.
GG Consulting, LLC I-SUITE. Source: TEA SHARS Frequently asked questions 2.
© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Use Case: Populating Business Objects.
Co-funded by the European Union Semantic CMS Community Designing Semantic CMS – Part I Copyright IKS Consortium 1 Lecturer Organization Date of presentation.
Addition 1’s to 20.
25 seconds left…...
Test B, 100 Subtraction Facts
Week 1.
We will resume in: 25 Minutes.
How Cells Obtain Energy from Food
McGraw-Hill©The McGraw-Hill Companies, Inc., 2001 Chapter 16 Integrated Services Digital Network (ISDN)
Enrichment of Library Authority Files by Linked Open Data Sources
Balancing Lexicographic and Ontological Considerations in Ontology Development First International Workshop on Ontological Analysis Trento, IT July,
Of 27 lecture 7: owl - introduction. of 27 ece 627, winter ‘132 OWL a glimpse OWL – Web Ontology Language describes classes, properties and relations.
歐盟科研架構計畫之人文及社會科學領域 (EU-FP7 SSH) 計畫徵求 說明會 國立中山大學 - 歐盟科研架構計畫之人文及社會科學國家聯絡據點 KYOTO (ICT ) Yielding Ontologies for Transition-Based Organization.
The Semantic Web – WEEK 5: RDF Schema + Ontologies The “Layer Cake” Model – [From Rector & Horrocks Semantic Web cuurse]
Of 39 lecture 2: ontology - basics. of 39 ontology a branch of metaphysics relating to the nature and relations of being a particular theory about the.
INF 384 C, Spring 2009 Ontologies Knowledge representation to support computer reasoning.
Coastal Atlas Interoperability - Ontologies (Advanced topics that we did not get to in detail) Luis Bermudez Stephanie Watson Marine Metadata Interoperability.
SKOS. Ontologies Metadata –Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies –Provide.
Metadata Schema for CERIF Andrei Lopatenko Vienna University of Technology
Knowledge Representation. Keywordsquick way for agents to locate potentially useful information Thesaurimore structured approach than keywords, arranging.
Semantic Interoperability in GIS N. L. Sarda Suman Somavarapu.
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
Linked Open Data Dataset from Related Documents Petya Osenova and Kiril Simov IICT-BAS LDL-2016, LREC, Portoroz.
KYOTO (ICT ) Knowledge Yielding Ontologies for Transition-Based Organization Intelligent Content and Semantics The First KYOTO Workshop February.
ece 627 intelligent web: ontology and beyond
ece 720 intelligent web: ontology and beyond
Presentation transcript:

Flarenet-Silt workshop on Ontology and Lexicon September-19 th -2009, Pisa Division of semantic labor over vocabulary and ontology layers Piek Vossen, VU University Amsterdam

Flarenet-Silt Workshop, September 19th 2009, Pisa 2 Modeling knowledge in a domain Knowledge needs to be divided over different lexical and ontological layers because: –the volume of terms and concepts is too large –the terms are linguistically too diverse Division of knowledge over different layers implies: –Precisely define the relations between lexical and ontological layers –Precisely define the inferencing based on the distributed knowledge layers

Flarenet-Silt Workshop, September 19th 2009, Pisa 3 Repositories for Kyoto project on the environment domain Term database: 500,000 terms per 1,000 documents per language Open data project: –DBPedia: 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films, 20,000 companies. The knowledge base consists of 274 million pieces of information (RDF triples). –GeoNames: 8 million geographical names and consists of 6.5 million unique features whereof 2.2 million populated places and 1.8 million alternate names. Domain thesauri and taxonomies: Species 2000: 2,1 million species Wordnets for 7 languages: about 50,000 to 120,000 synsets per language Ontologies: SUMO, DOLCE, SIMPLE

Domain T TV TV V T TT Species Domain Domain Domain Kyoto Knowledge Base Ontology Base concepts Wn DBPediaTerms 500K 2,100K DOLCE/OntoWordnet Terms 500K Species 2,100K Domain V

Flarenet-Silt Workshop, September 19th 2009, Pisa 5 Species in the ontology - Implies to store 2.1 million species twice in the ontology.

Flarenet-Silt Workshop, September 19th 2009, Pisa 6 Should all knowledge be stored in the central ontology? Vocabularies are too large for full inferencing with current reasoners Vocabularies are linguistically too diverse to be represented in an ontology Inferencing capabilities of formal ontologies is not needed for all levels of knowledge

Flarenet-Silt Workshop, September 19th 2009, Pisa 7 Division of linguistic labor principle Putnam 1975: –No need to know all the necessary and sufficient properties to determine if something is "gold" –Assume that there is a way to determine these properties and that domain experts know how to recognize instances of these concepts. –Speakers can still use the word "gold" and communicate useful information

Flarenet-Silt Workshop, September 19th 2009, Pisa 8 Division of semantic labor principle Digital version of Putnam (1975): –Computer does not need to have all the necessary and sufficient properties to determine if something is a "European tree frog" –Computer assumes that there is a way to determine this and that domain experts (people) know how to recognize instances of these concepts. –Computers can still reason with semantics and do useful stuff with textual data

Flarenet-Silt Workshop, September 19th 2009, Pisa 9 What does the computer need to know? Distinction between rigid and non-rigid (Welty & Guarino 2002): –being a "cat" is essential to individual's existence and therefore rigid –being a "pet" is a temporarily role and therefore non- rigid; a cat can become a pet and stop being a pet without ceasing to exist –Felix is born as a cat and will always be a cat, but during some period Felix can become a pet and stop being a pet while it continuous to exist All 2.1 million species are rigid concepts

Flarenet-Silt Workshop, September 19th 2009, Pisa 10 What does the computer need to know? Roles and processes in documents have more information value than the defining properties of species: –Species defined in terms of physical properties already known to expert; –Roles such as "invasive species", "migration species", "threatened species" express THE important properties of instances of species Roles are typically the terms we learn from the text not the species!

Flarenet-Silt Workshop, September 19th 2009, Pisa 11 Ontology relations based on Dolce Perdurants EndurantsQualities Organism PhysicalObject Role OrganismRole BreederRoleMigratorRoleBreededRole Events BreedProcess MigrateProcess HasOffspring hasRole hasNotPreCondition hasPostCondition playRole

Flarenet-Silt Workshop, September 19th 2009, Pisa 12 Wordnet-ontology-relations Rigid synsets: –Synset:Endurant; Synset:Perdurant; Synset:Quality: –sc_equivalenceOf (= relation in WN-SUMO) or sc_subclassOf (+ relation in WN-SUMO) Non-rigid synsets: –Synset:Role; Synset:Endurant –sc_domainOf: range of ontology types that restricts a role –sc_playRole: role that is being played Rigidity can be detected automatically (Rudify, 80% precision, IAG 80%) and is stored in wordnets as attributes to synsets

Flarenet-Silt Workshop, September 19th 2009, Pisa 13 Lexicalization of process-related concepts {obstruct, obturate, impede, occlude, jam, block, close up}Verb, English -> sc_equivalenceOf ObstructionPerdurant {obstruction, obstructor, obstructer, impediment, impedimenta}Noun, English -> sc_domainOf PhysicalObject -> sc_playRole ObstructingRole {migration birds}Noun, English -> sc_domainOf Bird -> sc_playRole MigratorRole {migration}Verb, English -> sc_ equivalenceOf MigrationProcess {migration area}Noun, English -> sc_domainOf PhysicalObject -> sc_ playRole TargetRole

Flarenet-Silt Workshop, September 19th 2009, Pisa 14 Lexicalization of process-related concepts {create, produce, make}Verb, English -> sc_ equivalenceOf ConstructionProcess {artifact, artefact}Noun, English -> sc_domainOf PhysicalObject -> sc_playRole ConstructedRole {kunststof}Noun, Dutch // lit. artifact substance -> sc_domainOf AmountOfMatter -> sc_playRole ConstructedRole {meat}Noun, English -> sc_domainOf Cow, Sheep, Pig -> sc_playRole EatenRole {,, }Noun, Chinese -> sc_domainOf Cow, Sheep, Pig, Rat, Mole, Monkey -> sc_playRole EatenRole { غذاء, لحم, طعام}Noun, Arabic -> sc_domainOf Cow, Sheep -> sc_playRole EatenRole

Flarenet-Silt Workshop, September 19th 2009, Pisa 15 Division of labor in knowledge sources Eleutherodactylus augusti Eleutherodactylus Leptodactylidae Anura Amphibia Chordata Animalia Eleutherodactylus atrabracus barking frog frog:1, toad:1, toad frog:1, anuran:1, batrachian:1, salientian:1 amphibian:3 vertebrate:1,craniate:1 chordate:1 animal:1 Base Concept 2.1 million species100,000 synsets1,000 types endurant physical-object organism endemic frog endangered frog poisonous frog alien frog 500,000 terms Skos database Wordnet-LMFOntology-OWL-DL Term database

Flarenet-Silt Workshop, September 19th 2009, Pisa 16 How to make inferences? Sparql queries to large Virtuoso databases: Aligned Species 2000, DBPedia Sql queries to term database Graph matching on wordnets Reasoning on a small ontology

Flarenet-Silt Workshop, September 19th 2009, Pisa 17 Relations in the ontology Highways in the Humber Estuary obstruct the migration of birds. // endurants// perdurants (subclass, Road, PhysicalObject)(subclass, ObstructionPerdurant, Perdurant) (subclass, Organism, PhysicalObject)(hasRole, ObstructionPerdurant, ObstructingRole) (playedBy, ObstructingRole, PhysicalObject) // roles(subclass, MigrationProcess, Process) (subclass, LocationRole, Role)(hasRole, MigrationProcess, MigratorRole) (subclass, MigratorRole, Role)(hasRole, MigrationProcess, MigrationTargetRole) (subclass, MigrationTargetRole, Role)(playedBy, MigratorRole, Organism) (subclass, ConstructorRole, Role) (subclass, ConstructedRole, Role) (subclass, ObstructingRole, Role) (subclass, ObstructedRole, Role)

Flarenet-Silt Workshop, September 19th 2009, Pisa 18 Instantiation of the ontology Highways in the Humber Estuary obstruct the migration of birds. (instanceOf, 0, Location) <!obstruction (instanceOf, 1, Road)(instanceHasRole, 3, 5) //involves obstructing role (instanceOf, 2, Organism)(instanceHasRole, 3, 6) //involves obstructed role (instanceOf, 3, ObstructionPerdurant)(instanceHasRole, 3, 8) //takes place in location (instanceOf, 4, MigrationProcess)(instancePlay, 1, 5) //highways play this obstructing role (instanceOf, 5, ObstructingRole)(instancePlay, 2, 6) //birds play this obstructed role (instanceOf, 6, ObstructedRole)(instancePlay, 0, 8) //Humber Estuary plays LocationRole (instanceOf, 7, MigratorRole) <!migration (instanceOf, 8, LocationRole)(instanceHasRole, 4, 7) //involves a migrator role (instanceHasRole, 4, 9) //involves target location (instanceHasRole, 4, 10) //has LocationRole (instancePlay, 2, 7) //birds play this migrator role (instancePlay, 0, 8) //Humber Estuary plays location role

Flarenet-Silt Workshop, September 19th 2009, Pisa 19 Ontology relations (DOLCE) Endurant (objects), Perdurant (processes), Quality subClassOf, equivalentTo, generic-constituent relations: Endurant:Endurant, Perdurant:Perdurant, Quality:Quality Role hierarchy below endurant: –OrganismRole -> BreedingRole –MigrationRole -> BirdMigrationRole playedBy relation: Role:Endurant hasRole relation: Perdurant:Role instanceOf: Instance:Endurant/Perdurant instancePlay: Instance:Role

Flarenet-Silt Workshop, September 19th 2009, Pisa 20 How to integrate the data? Species 2000 vocabulary: 2,171,281 concepts in MySql database with parent relations: –Kingdom -> Class -> Order -> Family -> Genus -> Species -> Infra species –Animalia -> Chordata -> Amphibia -> Anura -> Leptodactylidae - > Eleutherodactylus -> Eleutherodactylus augusti Converted to SKOS format Aligned with DBPedia for language labels Aligned with Wordnet using vocabulary and relation mappings Published in Virtuoso, accessed with SPARQL queries

Flarenet-Silt Workshop, September 19th 2009, Pisa 21 How to integrate data? Extending language labels using DBPedia Language Species 2000DBPedia extension English 69,045834,821 Spanish 1,731358,499 Italian 17,552215,511 Dutch 5,397185,437 Chinese 58,77483,756 Japanese 4,625139,754

Flarenet-Silt Workshop, September 19th 2009, Pisa 22 Vocabulary match with Wordnet synsets If polysemous then SSI-Dijkstra weighting of senses based on the hyperonym chain Results still to be evaluated: –Animalia (animal:1)-> Chordata (chordate:1) - > Amphibia (amphibian:3) -> Anura -> Leptodactylidae -> Eleutherodactylus -> Eleutherodactylus augusti (barking frog:1) How to integrate data? Alignment Species 2000 with wordnet

Flarenet-Silt Workshop, September 19th 2009, Pisa 23 Word-sense-disambiguation is applied to terms in KAF (Kyoto Annotation Format) Term hierarchy is extracted from KAF: –land:5 grassland:1 -> biome:1 woodland:1 -> biome:1 cropland urban land Results still to be evaluated: SemEval2010 How to integrate data? Alignment of terms with wordnet

Flarenet-Silt Workshop, September 19th 2009, Pisa 24 Should all knowledge be stored in the central ontology? Vocabularies are too large for full inferencing Vocabularies are linguistically too diverse to be represented in an ontology Inferencing capabilities of formal ontologies is not needed for all levels of knowledge A model of division of labor (along the lines of Putnam 1975) in which knowledge is stored in 3 layers: –SKOS vocabularies and term databases –wordnet (WN-LMF) –ontology (OWL-DL), Each layer supports different types of inferencing ranging from Sparql queries, graph algorithms to reasoning. Mapping relations that support the division of labour and different types of inferencing and that allow for the encoding of language- specific lexicalizations and restrictions.