1 Foundations IV: Ontology Evolution and Knowledge Management Class Session 8 Deborah McGuinness and Joanne Luciano With Peter Fox and Li Ding CSCI-6962-01.

1 Foundations IV: Ontology Evolution and Knowledge Management
Class Session 8
Deborah McGuinness and Joanne Luciano, with Peter Fox and Li Ding
CSCI-6962-01
October 25, 2010

2 Review Ontology Evolution Environment (Chimaera) Reading Assignment Any comments, questions?

3 Semantic Web Methodology and Technology Development Process
Establish and improve a well-defined methodology vision for Semantic Technology based application development
Leverage controlled vocabularies, etc.
Diagram labels: Use Case; Small Team, mixed skills; Analysis; Adopt Technology Approach; Leverage Technology Infrastructure; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy; Use Tools; Science/Expert Review & Iteration; Develop model/ontology

4 Some Current Motivating Trends
More applications are depending on background ontologies for: Site Organization, Query expansion, Integrity checking, …
Systems are increasingly hybrid, thus requiring integration with many other systems
There are an increasing number of existing vocabularies / taxonomies / ontologies that are official or de facto standards
Applications are becoming more long-lived, thus requiring evolution and maintenance
…

5 Approach for today
Introduce one early ontology evolution environment from 2000 (including historical motivation and capabilities)
Discuss its strengths and weaknesses
Follow with a group discussion of where this area is evolving and should evolve today

6 Motivation: Ontology Integration Trends
Integrated in most search applications
–General search from 2000: Yahoo, Lycos, Xift, …
–General search today: Yahoo, keyword advertising and more
–Scientific search today: e.g. Noesis
Core component of E-Commerce apps (Amazon, eBay, Virtual Vineyards, REI, etc.)
Integrated in configuration applications (Dell, PROSE, …, Sun, SAP, Trilogy, …)

7 Motivation: Ontology Evolution
Controlled vocabularies abound (SIC-codes, UN/SPSC, RosettaNet, OpenDirectory, …)
Distributed ownership/maintenance
Larger scale
–Open Directory in 2000: >23.5K editors, ~250K categories, 1.65M sites
–Open Directory now (arguably less mainstream): over 81K editors, 590K directories, and >4.6M sites
Becoming more complicated: moving to classes and slots (and value restrictions, enumerated sets, cardinality)

8 Motivation: Science Ontologies Today
Growing awareness and consensus on science taxonomies/ontologies (SWEET, ChemML, …)
Growing interest in community ownership/maintenance
Editors are less often trained in computer science, thus tools need to be aimed at broader audiences
Domain ontologies are growing more complicated: GO, BioPortal, …
Domain specific environments are starting to emerge

9 Chimaera – A Merging and Diagnostic Ontology Environment
Web-based tool utilizing the KSL Ontolingua platform that supports:
–merging multiple ontologies found in distributed environments
–analysis of single or multiple ontologies
–attention focus in problematic areas
–simple browsing and mixed initiative editing

10 Historical Setting
Large government sponsored project
Broad ontology needs: CIA World Fact Book, terrorist information, biological and chemical knowledge, weapon knowledge, …
Number of ontologies approaching 100
Large sets of facts, mined from natural language text
Complicated question set
Distributed work force (many without much training in computer science)
Time pressure

11 Historical KB Analysis Task
Review KBs that:
– Were developed using differing standards
– May be syntactically but not semantically validated
– May use differing modeling representations
– May have different purposes
Produce KB logs (in interactive environments) that:
– Identify provable problems
– Suggest possible problems in style and/or modeling
– Are extensible by being user programmable
End users are humans, but without extensive training

12 The (General) Need For KB Analysis
Large-scale knowledge repositories will necessarily contain KBs produced by multiple authors in multiple settings
KBs for applications will typically be built by assembling and extending multiple modular KBs from repositories that may not be consistent
KBs developed by multiple authors will frequently
– Express overlapping knowledge in different, possibly contradictory ways
– Use differing assumptions and styles
For such KBs to be used as building blocks, they must be reviewed for appropriateness and “correctness”
That is, they must be analyzed

17 Historical KB Merging Needs
One large reasoning task using many large scale KBs
KBs for applications were required to extend existing ontologies (CYC and others)
Overlapping ontologies and instance data
For such KBs to be used together as building blocks, their representational differences must be reconciled

18 The KB Merging Task
Combine KBs that:
– Were developed independently (by multiple authors)
– Express overlapping knowledge in a common domain
– Use differing representations and vocabularies
Produce a merged KB with a non-redundant, coherent, and unified vocabulary, content, and representation

19 How KB Merging Tools Can Help
–Combine input KBs with name clashes
    Treat each input KB as a separate name space
–Support merging of classes and relations
    Replace all occurrences by the merged class or relation
    Test for logical consistency of merge (e.g. instances/subclasses of multiple disjoint classes)
    Actively look for inconsistent extensions
–Match vocabulary
    Find name clashes, subsumed names, synonyms, ...
–Focus attention
    Portions of KB where new relationships are likely to be needed
    E.g., sibling subclasses from multiple input KBs
–Derive relationships among classes and relations
    Disjointness, equivalence, subsumption, inconsistency, ...
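
One of the checks named on this slide, testing a merge for subclasses of multiple disjoint classes, can be sketched in a few lines. This is only an illustration in Python, not Chimaera's implementation: the dictionary-based KB layout, the "kb1:"/"kb2:" name-space prefixes, the class names, and both function names are assumptions made for the example.

```python
from itertools import combinations


def ancestors(cls, parents):
    """Return cls plus every superclass reachable through `parents`."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def disjointness_violations(parents, disjoint_pairs):
    """List (cls, a, b) where cls falls under both a and b, yet a and b are disjoint."""
    disjoint = {frozenset(pair) for pair in disjoint_pairs}
    violations = []
    for cls in sorted(parents):
        supers = sorted(ancestors(cls, parents))
        for a, b in combinations(supers, 2):
            if frozenset((a, b)) in disjoint:
                violations.append((cls, a, b))
    return violations


# Hypothetical input KBs, each kept in its own name space.
kb1_parents = {"kb1:Truck": ["kb1:LandVehicle"]}
kb2_parents = {"kb2:AmphibiousTruck": ["kb2:Truck", "kb2:WaterVehicle"]}
disjoint_pairs = [("kb1:LandVehicle", "kb2:WaterVehicle")]

# Merging kb2:Truck into kb1:Truck replaces all occurrences of the old name.
rename = {"kb2:Truck": "kb1:Truck"}
merged_parents = {}
for kb in (kb1_parents, kb2_parents):
    for cls, supers in kb.items():
        merged = merged_parents.setdefault(rename.get(cls, cls), [])
        merged.extend(rename.get(s, s) for s in supers)

violations = disjointness_violations(merged_parents, disjoint_pairs)
print(violations)
```

Neither input KB contains a violation on its own; the problem only appears once the two Truck classes are merged, which is why the test is run on the merged result.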

24 Merging Tools
Merging can be arbitrarily difficult
–KBs can differ in basic representational design
–May require extensive negotiation among authors
Tools can significantly accelerate major steps
KB merging using conventional editing tools is
–Difficult
–Labor intensive
–Error prone
Hypothesis: tools specifically designed to support KB merging can significantly
–Speed up the merging process
–Make a broader user set productive
–Improve the quality of the resulting KB

26 Chimaera Usage
HPKB program: analyze diverse KBs, support KR novices as well as experts
Cleaning semi-automatically generated KBs
Browsing and merging multiple controlled vocabularies (e.g., internal vocabularies and UN/SPSC (standard products and services codes))
Reviewing internal vocabularies

27 OIE Intro and Demo

28 Instance Data Evaluation on the Semantic Web
Jiao Tao, Li Ding, Deborah L. McGuinness
Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, NY, USA

29 Data Evaluation on the SW: State of the Art
On the semantic web, instance data often accounts for orders of magnitude more data than ontologies (Ding & Finin 2006).
However, most data evaluation work (Rocha et al. 1998) focuses on ontology evaluation, i.e., checking whether the ontologies correctly describe the domain of interest.
There is very little, if any, work on evaluating the conformance between ontologies and instance data.

30 Instance Data Evaluation on the Semantic Web
Diagram (ontologies O and instance data D on the Web):
1. Create domain knowledge as ontologies O (semantic expectations)
2. Acquire ontologies
3. Instantiate ontologies as instance data D
4. Publish D
Questions: No syntax errors? Does D follow the semantic expectations in O?
Semantic expectation mismatches: (i) Logical inconsistencies (ii) Integrity issues

31 Generic Evaluation Process (GEP)
Load instance data D
–Is loading failing?
Parse instance data D
–Is D syntactically correct?
Load referenced ontologies O = {O_1, O_2, …}, where O_i defines the terms used by D
–Is O_i reachable?
Inspect logical inconsistencies in D
–Is O_i logically consistent?
–Merge all consistent referenced ontologies into O'
–Is D+O' logically consistent?
Inspect integrity issues in D
–Compute DC = INF(D,O'), which includes all triples in D and O', and all inferred sub-class/sub-property relations
–Is there any integrity issue in D?
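
The step DC = INF(D,O') can be illustrated with a small sketch. The slide only says that DC holds all triples of D and O' plus the inferred sub-class/sub-property relations, so the code below computes exactly that and nothing more; the tuple-based triple representation, the function name `inf`, and the `ex:` terms are assumptions made for this example, not the TW OIE implementation.

```python
SUBCLASS = "rdfs:subClassOf"
SUBPROPERTY = "rdfs:subPropertyOf"


def inf(data, ontology):
    """Return all triples of D and O' plus the transitive sub-class/sub-property links."""
    closure = set(data) | set(ontology)
    for predicate in (SUBCLASS, SUBPROPERTY):
        changed = True
        while changed:
            changed = False
            # Snapshot the current links so the set can grow while we scan them.
            links = [(s, o) for s, p, o in closure if p == predicate]
            for s, mid in links:
                for start, o in links:
                    if mid == start and (s, predicate, o) not in closure:
                        closure.add((s, predicate, o))
                        changed = True
    return closure


# Hypothetical merged ontology O' and instance data D.
ontology = {
    ("ex:RedWine", SUBCLASS, "ex:Wine"),
    ("ex:Wine", SUBCLASS, "ex:Drink"),
    ("ex:hasVintner", SUBPROPERTY, "ex:hasMaker"),
}
data = {("ex:w1", "rdf:type", "ex:RedWine")}

dc = inf(data, ontology)
print(sorted(dc - data - ontology))
```

The only triple added here is the inferred link from ex:RedWine to ex:Drink; the integrity checks are then run against DC rather than against D alone.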

32 Integrity Issues
Unexpected Individual Type (UIT) Issue
–rdfs:domain
–rdfs:range
–owl:allValuesFrom
Redundant Individual Type (RIT) Issue
Non-specific Individual Type (NSIT) Issue
Missing Property Value (MPV) Issue
–owl:cardinality
–owl:minCardinality
Excessive Property Value (EPV) Issue
–owl:cardinality
–owl:maxCardinality
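
As an illustration of the first category, here is a sketch of a UIT check driven by rdfs:domain. It assumes a UIT issue means that the subject of a property has no declared type equal to, or a subclass of, the class the property's domain expects; that reading, the data layout, the function names, and the `ex:` terms are assumptions for this example rather than the definition used by the authors.

```python
RDF_TYPE = "rdf:type"


def superclasses(cls, subclass_of):
    """Return cls plus all of its superclasses."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def unexpected_type_issues(triples, domains, subclass_of):
    """Report (subject, property, expected class) where no declared type fits the domain."""
    declared = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            declared.setdefault(s, set()).add(o)
    issues = []
    for s, p, o in sorted(triples):
        expected = domains.get(p)
        if expected is None:
            continue  # no rdfs:domain expectation for this property
        types = declared.get(s, set())
        if not any(expected in superclasses(t, subclass_of) for t in types):
            issues.append((s, p, expected))
    return issues


# Hypothetical expectations: ex:hasMaker has rdfs:domain ex:Wine.
domains = {"ex:hasMaker": "ex:Wine"}
subclass_of = {"ex:RedWine": ["ex:Wine"]}
triples = {
    ("ex:w1", RDF_TYPE, "ex:RedWine"),
    ("ex:w1", "ex:hasMaker", "ex:winery1"),
    ("ex:c1", RDF_TYPE, "ex:Cheese"),
    ("ex:c1", "ex:hasMaker", "ex:dairy1"),
}

uit_issues = unexpected_type_issues(triples, domains, subclass_of)
print(uit_issues)
```

A standard RDFS reasoner would instead infer that ex:c1 is a wine; treating the domain as an expectation to be checked is what turns it into an integrity test.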

33 Example: MPV
Make sure all instances of wine have a Maker
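
The wine example can be sketched as a closed-world check. The deck points to SPARQL-based detection, but no query survives in this transcript, so the plain-Python stand-in below is an assumption throughout: the function name, the `ex:` terms, and the choice to count distinct value names as distinct individuals are illustrative only. It also assumes that types have already been inferred, so that every wine carries the type ex:Wine.

```python
RDF_TYPE = "rdf:type"


def missing_property_values(triples, cls, prop, min_count=1):
    """Return instances of cls that have fewer than min_count values for prop."""
    instances = sorted({s for s, p, o in triples if p == RDF_TYPE and o == cls})
    issues = []
    for individual in instances:
        values = {o for s, p, o in triples if s == individual and p == prop}
        if len(values) < min_count:
            issues.append(individual)
    return issues


# Hypothetical instance data: ex:w2 is a wine with no maker recorded.
triples = {
    ("ex:w1", RDF_TYPE, "ex:Wine"),
    ("ex:w1", "ex:hasMaker", "ex:winery1"),
    ("ex:w2", RDF_TYPE, "ex:Wine"),
}

mpv_issues = missing_property_values(triples, "ex:Wine", "ex:hasMaker")
print(mpv_issues)
```

Under the open-world assumption, an OWL reasoner would not report ex:w2 as inconsistent, because the maker may simply be unstated; flagging it requires reading the cardinality restriction as a completeness expectation on the published data.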

34 SPARQL Solutions for Integrity Issue Detection

35 Implementation and Evaluation
Demo: TW OIE Service http://onto.rpi.edu/demo/oie/
Comparative experiment results

36 References
T. Berners-Lee, J. Hendler, and O. Lassila, The Semantic Web: A New Form of Web Content that Is Meaningful to Computers Will Unleash a Revolution of New Possibilities, Scientific American, pp. 34-43, 2001.
J. Davies, M. Lytras, and A. Sheth, Semantic-Web-Based Knowledge Management, IEEE Internet Computing, Vol. 11, No. 5, pp. 14-16, 2007.
L. Ding and T. Finin, Characterizing the Semantic Web on the Web, ISWC, pp. 242-257, 2006.
R. A. Rocha, S. M. Huff, P. J. Haug, D. A. Evans, and B. E. Bray, Evaluation of a Semantic Data Model for Chest Radiology: Application of a New Methodology, Methods of Information in Medicine, Vol. 37, No. 4-5, pp. 477-490, 1998.
D. L. McGuinness and P. Pinheiro da Silva, Explaining Answers from the Semantic Web: The Inference Web Approach, Journal of Web Semantics, Vol. 1, No. 4, pp. 397-413, 2004.

37 Ontology Evaluation Presentation

38 Ontology Evaluation: Methods and Metrics

40 The Problem: Ontology Elephants

41 The Problem

42 Objective: Evolve to Science and Engineering Discipline for Ontology

43 Approach:

44 Research Plan: (1) Identify use cases

45 Research Plan: (2) Develop Metrics

46 Examples

47 Example (1): BioPAX - Lack of Fluency

48 Example (2): Habitat Lite: Correctness and Completeness

49 Example (3): Enable Influenza Research (proposed construction and subsequent evaluation)

51 Impact

52 Why MITRE

53 Additional notes on specific slides

54 Background References

55 Background Prior Technical Approaches

56 Background Philosophical and Methodological Approaches

57 Discussion in its time
Ontologies are becoming more central to applications; they are larger, more distributed, and longer-lived
Environmental support (in particular merging and diagnostic support) is more critical for the broader user base
Chimaera provides merging and diagnostic support for ontologies in many formats
It improves performance over existing tools
It has been used by people of various training backgrounds in government and commercial applications and is available for use.
http://www.ksl.Stanford.EDU/software/chimaera/ - movie, tutorial, papers, link to live system, etc.

58 Discussion Today
Chimaera addressed merging and diagnostic tasks directly
It aimed at focusing human attention where humans would make updates
It aimed to make users function at a higher level of training than what they had
Evolved to 3 industrial systems: VerticalNet, Cisco, Sandpiper
TW Ontology Instance Evaluator: next generation of diagnostics. Same general approach for tests; different technological foundation; focused on some other issues such as open world / closed world issues

59 OntologyBuilder

60 Configuration http://www.research.att.com/sw/tools/classic/tm/ijcai-95-with-scenario.html

65 Drexel Univ. IHI – Inst. For Healthcare Informatics

66 Ontology Creation and Maintenance Environment Needs
Diagnostics/Explanation (Chimaera, CLASSIC, …)
Merging and Difference (Chimaera, Prompt, Ontolingua, …)
Translators/Dumping (Ontolingua, …)
Distributed Multi-User Collaboration (OntologyBuilder, …)
Versioning (OntologyBuilder, …)
Scalability, Reliability, Performance, Availability (Shoe, OntologyBuilder, …)
Security (viewing, updates, abstraction, authoritative sources, …)
Ontology Library systems (Ontolingua, DAML, PlanetOnt, …)
Business needs: internationalization, compatibility with standards (XML, …)
Provenance support (languages and environments)

67 Prompt Today
http://protege.cim3.net/cgi-bin/wiki.pl?Prompt
compare versions of the same ontology
map one ontology to another
move frames between included and including project
merge two ontologies into one
extract a part of an ontology

68 TW OIE
In many ways, a next generation diagnostic environment
Aimed at instance evaluation
Uses a very different approach to reasoning and checking
We will consider this and related efforts in our discussion of priorities for the environments you would design today

69 Today’s Environment
Discussion of historical tools and requirements in today’s environment
–Consider technological changes
–Social issues (more distributed, more collaborative, …)
–Design considerations for long-lived systems may lead to social and modeling conventions, including naming (e.g. “has-…”) and separating out information that is likely to change over time, …

70 Evolving Environment Needs - Facilitated Discussion
Diagnostics/Explanation
Merging and Difference
Translators/Dumping
Distributed Multi-User Collaboration
Versioning
Scalability, Reliability, Performance, Availability
Security
Ontology Library systems
Business needs
Provenance support

71 Extras

