1
Ontology Evolution Mark A. Musen Stanford University
2
Ontology Design Criteria (after Gruber)
Clarity: Definitions should be objective and complete
Coherence: The ontology should sanction those inferences consistent with the definitions
Extendibility: An ontology should anticipate future uses
Minimal encoding bias: No assumptions about knowledge representation
Minimal ontological commitment
3
Trade-offs in Ontology Design Minimizing ontological commitment requires specifying a weak theory Making definitions precise requires increasing ontological commitment Anticipating various uses of the ontology may require increasing the number of concepts represented Making an ontology maximally general may make it useless for any specific application
4
Common problems when people build ontologies Classes are not defined at useful levels of abstraction (e.g., is it necessary to distinguish among mammals or terriers?) Class definitions are overloaded (e.g., is it helpful to have the class red bicycle?) Hierarchical relationships are not uniformly taxonomic (e.g., amino acid is a subclass of protein) The world (or our perception of it) changes
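The third problem above (mixing "kind of" with "part of" links) becomes easier to spot when every link records its relation type. What follows is only a minimal sketch under that assumption; the relation labels `is_a` and `part_of`, the `LINKS` list, and the `ancestors` helper are all hypothetical and do not come from the slides.

```python
# Hypothetical toy hierarchy: each link is (child, relation, parent).
LINKS = [
    ("terrier", "is_a", "dog"),
    ("dog", "is_a", "mammal"),
    ("amino acid", "part_of", "protein"),  # a part of a protein, not a kind of protein
]


def ancestors(term, links, relation="is_a"):
    """Return every term reachable from `term` by following only `relation` links."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, rel, parent in links:
            if child == current and rel == relation and parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found
```

With the links typed this way, a taxonomic query for "amino acid" no longer returns "protein", while a query over `part_of` still does.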
5
The world does change! What happened to the ether? To phlogiston? What happened to diseases such as dropsy, consumption, neurasthenia, “gay lymph node syndrome”? What happened to HTLV III? When did scurvy become a curable disease? When did the central dogma of biology first break down? When did Poland begin to exist?
6
Suggested Upper Merged Ontology (SUMO)
7
Part of the CYC Upper Ontology
8
A Parable: Protocol-Based Advisories
11
Protocol-Based Advisories
12
The ONCOCIN system (ca. 1986)
13
ONCOCIN: Object structure drives inference
14
OPAL first elicited the overall algorithm for the protocol
15
Clicking on “VAM” in the graph brought up a form for entering the constituent drugs
16
We construed ONCOCIN’s PSM as Episodic Skeletal Plan Refinement (ESPR) 1. Planning entities form skeletal plan 2. Task-level actions modify customary execution of the plan 3. Input data predicate actions
17
PROTÉGÉ-1 (ca. 1987) A meta-level knowledge-entry system for generating knowledge-entry systems like OPAL Assumed an ontology of the ESPR method (but we didn’t call it that, since no one other than Barry knew about ontologies in 1987) Demonstrated in domains of oncology and hypertension clinical trials, allowing rapid generation of custom-tailored knowledge-entry tools
18
Attributes of a Planning Entity
20
What was PROTÉGÉ-1 doing? The system started with an ontology of the kinds of data on which the ESPR method operates Developers subclassed the entities in that ESPR ontology to define the domain entities that relate to skeletal planning in a particular application area (e.g., oncology, hypertension) The system used the subclasses to generate UIs –For entry of instance-level knowledge (e.g., that of particular clinical protocols) –For creating the electronic spreadsheet for interacting with clinical users
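The subclass-then-generate idea can be illustrated with a small sketch. This is not PROTÉGÉ-1's actual implementation; the class names, attributes, and the `form_fields` helper are hypothetical stand-ins, assuming only that a method-ontology class can be subclassed for a domain and that a data-entry form can be derived from the class definition.

```python
from dataclasses import dataclass, fields


@dataclass
class PlanningEntity:
    """Hypothetical stand-in for a class in the ESPR method ontology."""
    name: str = ""
    duration_days: int = 0


@dataclass
class Chemotherapy(PlanningEntity):
    """Hypothetical oncology subclass that adds a domain-specific attribute."""
    drugs: str = ""


def form_fields(entity_class):
    """Derive (label, value type) pairs for a knowledge-entry form from a class."""
    # The value type is read from each attribute's default value.
    return [(f.name, type(f.default).__name__) for f in fields(entity_class)]
```

Calling `form_fields(Chemotherapy)` yields the inherited method-level attributes followed by the domain-specific one, which is the sense in which anything said about the domain is tied to the method ontology.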
21
PROTÉGÉ-1 asked users to subclass the ESPR method ontology 1. Planning entities form skeletal plan 2. Task-level actions modify customary execution of the plan 3. Input data predicate actions
22
“Ontology development as subclassing” was not sustainable Subclassing entities in the ESPR “method ontology” did ensure that anything we said about the domain immediately had an operational semantics There were lots of things that we wanted to say about the domain unrelated to the ESPR method
23
The Parable Continues Therapy Helper, ca. 1992
24
Mapping domain ontologies to problem-solving methods ESPR Domain Ontology (e.g., clinical data, treatment history) Method Input Ontology (e.g., skeletal plan, input data) Method Output Ontology (e.g., fully formed plan) Therapy Helper: Protocol-Based Care for HIV/AIDS
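One way to picture the mapping step is as an explicit table from method-input slots to domain terms. The sketch below is illustrative only: the slot names, the `MAPPING` table, and `to_method_input` are assumptions made for this example, not part of Therapy Helper or any Protégé API.

```python
# Hypothetical mapping: method-input slot -> domain-ontology attribute.
MAPPING = {
    "input_data": "clinical_data",
    "plan_history": "treatment_history",
}


def to_method_input(domain_record, mapping):
    """Re-express a domain-level record in the vocabulary of the method's input ontology."""
    missing = [src for src in mapping.values() if src not in domain_record]
    if missing:
        raise KeyError(f"domain record lacks: {missing}")
    return {slot: domain_record[src] for slot, src in mapping.items()}
```

Keeping the mapping separate from both ontologies means the domain ontology can describe things the method never uses, which subclassing the method ontology did not allow.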
27
ESPR Method Ontology
28
T-Helper Application Ontology
29
Mapping domain ontologies to problem-solving methods ESPR Domain Ontology (e.g., clinical data, treatment history) Method Input Ontology (e.g., skeletal plan, input data) Method Output Ontology (e.g., fully formed plan) Therapy Helper: Protocol-Based Care for HIV/AIDS
30
EON: Middleware that abstracts from T-Helper The debut of Protégé/Win
31
Protégé/Win KA tool
32
The EON ontology continued to evolve Support for concurrent actions Coordination of processes Data abstraction from the primary inputs (electronic medical record) Temporal data abstraction from primary data Contextualization of actions into “scenarios of care”
33
All the ontology changes took place at the “macro” level Major shifts in distinctions made about the world (e.g., stereotypic “scenarios”) Major new capabilities of underlying systems (e.g., ability to drive reasoning from large numbers of automatically acquired data) While all this was happening: Countless changes in small, individual modeling decisions
34
In the real world, ontologies change all the time The number of distinctions that we can make about the world is practically infinite We have to start somewhere! We constantly must make new distinctions because –Our needs change –Our view of reality changes –We finally get around to it …
35
Porphyry’s depiction of Aristotle’s Categories
Supreme genus: SUBSTANCE
Differentiae: material / immaterial
Subordinate genera: BODY / SPIRIT
Differentiae: animate / inanimate
Subordinate genera: LIVING / MINERAL
Differentiae: sensitive / insensitive
Proximate genera: ANIMAL / PLANT
Differentiae: rational / irrational
Species: HUMAN / BEAST
Individuals: Socrates, Plato, Aristotle, …
36
Locus of control for group ontology development Centralized –As in the NCI Thesaurus Decentralized –As in the Open Directory Project
37
NCI Enterprise Vocabulary Services 1997: R. Klausner, Director NCI, wanted a “science management system” Know about everything funded by NCI Goals and results – “bench to bedside” –Thereby improve and speed translation of research Approach: 1. Create integrative terminology 2. Evolve terminology scope from supporting grants management to supporting science 3. Build Web-accessible infrastructure – caCORE
38
The NCI Thesaurus
39
NCI Thesaurus Guidelines Develop content model Leverage existing sources as appropriate –MeSH, VA NDF-RT, MedDRA … Develop unique content where needed –Cancer genes, gene products, cancer diagnoses, drugs, chemotherapies, molecular abnormalities etc., and relationships among them Link to other standards using URLs where possible –OMIM, Swissprot, GO
40
NCI uses a Centralized Process
42
Open Directory Project Started in 1998 as a volunteer effort to develop an open-content directory of Web pages In its first year, 4500 editors had indexed 100K Web sites By July 2005, 69K editors had indexed 4.6M sites using 580K categories On average, between 9K and 10K volunteer editors are working on ODP at any given time
43
Dimensions for Ontology Change Management Central vs. Decentralized control Continuous editing vs. Periodic archiving Curation vs. No curation Monitored editing vs. Nonmonitored editing
44
Monitored editing in Protégé
45
History of Changes is Stored in a “Change Ontology”
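Storing history as structured change records, rather than as free-text notes, means the log can be queried and replayed. The following is a minimal sketch of that idea only; the `Change` record, its `kind` values, and `replay` are hypothetical and do not reproduce the actual Protégé change ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One hypothetical change record: what was done, to which class, by whom."""
    kind: str            # "create_class", "rename_class", or "delete_class"
    target: str
    author: str
    new_name: str = ""   # used only by "rename_class"


def replay(changes):
    """Rebuild the current set of class names by applying changes in order."""
    classes = set()
    for change in changes:
        if change.kind == "create_class":
            classes.add(change.target)
        elif change.kind == "rename_class":
            classes.discard(change.target)
            classes.add(change.new_name)
        elif change.kind == "delete_class":
            classes.discard(change.target)
        else:
            raise ValueError(f"unknown change kind: {change.kind}")
    return classes
```

Because each record names its author, the same log can also answer questions such as "which changes did a given editor make?" with a simple filter.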
46
Workflow for Change Management
47
The Goal: To Streamline NCI’s Cumbersome Process
48
Why Most Ontologies Stagnate It is tedious to evaluate ontological soundness by inspection It is impossible to evaluate ontological coverage by inspection It is often plain difficult to determine what an ontology is good for by inspection
49
A Portion of the OBO Library
50
Ontologies are not like journal articles It is difficult to judge methodological soundness simply by inspection We may wish to use an ontology even though some portions –Are not well designed –Make distinctions that are different from those that we might want
51
Ontologies are not like journal articles II The utility of ontologies –Depends on the task –May be highly subjective The expertise and biases of reviewers may vary widely with respect to different portions of an ontology Users should want the opinions of more than 2–3 hand-selected reviewers Peer review needs to scale to the entire user community
53
Solution Snapshot
54
In an “open” rating system: Anyone can annotate an ontology to say anything that one would like Users can “rate the raters” to express preferences for those reviewers whom they trust A “web of trust” may allow users to create transitive trust relationships to filter unwanted reviews
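A transitive "web of trust" filter can be sketched as a bounded graph search: a user sees reviews only from reviewers reachable through a limited number of trust links. This is an illustrative sketch under that assumption; the function names, the `max_hops` parameter, and the review record layout are hypothetical.

```python
from collections import deque


def trusted_reviewers(trust, user, max_hops=2):
    """Return everyone reachable from `user` within `max_hops` trust links."""
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not extend trust beyond the hop limit
        for other in trust.get(person, ()):
            if other not in seen:
                seen.add(other)
                queue.append((other, hops + 1))
    seen.discard(user)
    return seen


def filter_reviews(reviews, trust, user, max_hops=2):
    """Keep only reviews written by reviewers the user trusts, directly or transitively."""
    allowed = trusted_reviewers(trust, user, max_hops)
    return [review for review in reviews if review["reviewer"] in allowed]
```

The hop limit is a design choice: a small value keeps the filter strict, while a larger one admits more reviewers at the cost of weaker trust.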
55
Possible Review Criteria What is the level of user support? What documentation is available? What is the granularity of the ontology content in specific areas? How well does the ontology cover a particular domain? In what applications has the ontology been used successfully? Where has it failed?
56
An ontology of “marginalia” would go a long way
57
Trade-offs in Ontology Design Minimizing ontological commitment requires specifying a weak theory Making definitions precise requires increasing ontological commitment Anticipating various uses of the ontology may require increasing the number of concepts represented Making an ontology maximally general may make it useless for any specific application
58
The National Center for Biomedical Ontology One of three National Centers for Biomedical Computing launched by US NIH in 2005 Collaboration of Stanford, Berkeley, Buffalo, Mayo, Victoria, UCSF, Oregon, and Cambridge Primary goal is to make ontologies accessible and usable Research will develop technologies for ontology indexing, alignment, and peer review
59
The Center will move us beyond individual, one-off ontologies and one-off tools to: Integrated ontology libraries in cyberspace Meta-data standards for ontology annotation Comprehensive methods for ontology indexing and retrieval Easy-to-use portals for ontology access, annotation, and peer review End-user platforms for putting ontologies to use for –Data annotation –Decision support –Natural-language processing –Information retrieval –And applications that we have not yet thought of!