2
Ontology: A Guide for the Intelligence Analyst
Barry Smith
3
Problem of ensuring sensible cooperation in a massively interdisciplinary community
concept, type, instance, model, representation, data
4
What do these mean?
‘conceptual data model’; ‘semantic knowledge model’; ‘reference information model’; ‘an ontology is a specification of a conceptualization’
5
help is on the way ...
6
National Center for Ontological Research (NCOR)
8
ECOR Partner Institutions
Laboratory for Applied Ontology, Trento/Rome; Center for Theoretical and Applied Ontology, Turin; Foundational Ontology Group, University of Leeds; JCOR – Japanese Center for Ontological Research
9
Ontologies (tech.): standardized classification systems which enable data from different sources to be combined
Ontology (phil.): the theory of being
10
The need: strong general-purpose classification hierarchies, created by domain specialists and thoroughly tested in real use cases, to help us navigate through oceans of data
11
Good ontologies should be
intelligible to human beings; computationally useful; capable of being glued together
12
The actuality (too often)
myriad special purpose ‘light’ ontologies, prepared by ontology engineers and deposited in internet ‘repositories’ or ‘registries’ which only create NEW oceans of data
13
Schemaweb ontologies (http://www.w3.org/)
MusicBrainz Metadata Vocabulary; Musical Baton Vocabulary; Beer Ontology; Kissology
14
‘Lite’ ontologies often do not generalize …
repeat work already done by others; are not gluable together; offer no roadmap for progressive improvement; reproduce the very problems of communication which ontology was designed to solve
15
Ontology (science): the empirical study of how to build humanly useful and computationally tractable representations of entities and of the relations between them; evidence-based terminology research
16
Why NCOR?
NCOR will advance ontology as science; develop measures of quality for ontologies to establish best practices
17
Why NCOR?
NCOR will provide coordination and support for investigators working in ontology and its applications; engage in outreach endeavors designed to foster the goals of high quality ontology in both theory and practice; advance ontology education
18
National Center for Biomedical Ontology
$18.8 million NIH Roadmap Center: Stanford Medical Informatics; University of San Francisco Medical Center; Berkeley Drosophila Genome Project; Cambridge University Department of Genetics; The Mayo Clinic; University at Buffalo Department of Philosophy
19
From chromosome to disease
20
… legacy of Human Genome Project
genomics, transcriptomics, proteomics, reactomics, metabonomics, phenomics, behavioromics, connectomics, toxicopharmacogenomics, bibliomics
22
need for semantic annotation of data
where in the body? what kind of disease process? dir.niehs.nih.gov/microarray/datamining/
23
Whoops: 54M already! Compare with 3M in Dec 2004 and 12M in June 2005, when I did this.
25
natural language labels
to make the data cognitively accessible to human beings (dir.niehs.nih.gov/microarray/datamining/)
26
compare: legends for maps
27
ontologies are legends for data
dir.niehs.nih.gov/microarray/datamining/
28
compare: legends for cartoons
29
legends: help human beings use and understand complex representations of reality; help human beings create useful complex representations of reality; help computers process complex representations of reality
30
computationally tractable legends
help human beings find things in very large complex representations of reality
32
ontologies are legends for images
33
what lesion? what brain function?
34
which period? which architectural style? which type of building?
35
ontologies are legends for mathematical equations
x_i = vector of measurements of gene i; k = the state of the gene (as “on” or “off”); θ_i = set of parameters of the Gaussian model ...
36
ontologies are legends for word lists
...and the Computer's View: name, education, CV, private, work. © 2006 Adam Pease, Articulate Software. Slide inspired by Frank van Harmelen
37
The Idea
[diagram: databases GlyProt, MouseEcotope, DiabetInGene, GluChem; shared term: sphingolipid transporter activity]
38
annotation using common ontologies yields integration of databases
[diagram: databases MouseEcotope, GlyProt, DiabetInGene, GluChem; shared term: Holliday junction helicase complex]
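A minimal sketch of how annotation with a common ontology can link otherwise separate databases. The database names and the two ontology terms come from the slides; the record identifiers, and which database uses which term, are invented for illustration and are not taken from the presentation.

```python
from collections import defaultdict

# Each tuple is (database, record id, ontology term).
# Record ids and the database-to-term pairings are hypothetical.
ANNOTATIONS = [
    ("GlyProt", "gp-001", "sphingolipid transporter activity"),
    ("DiabetInGene", "dg-17", "sphingolipid transporter activity"),
    ("GluChem", "gc-10", "sphingolipid transporter activity"),
    ("MouseEcotope", "me-42", "Holliday junction helicase complex"),
    ("GluChem", "gc-9", "Holliday junction helicase complex"),
]


def index_by_term(annotations):
    """Group (database, record id) pairs under the ontology term they share."""
    index = defaultdict(list)
    for database, record_id, term in annotations:
        index[term].append((database, record_id))
    return dict(index)


def databases_linked_by(term, annotations):
    """Return the sorted names of databases holding a record annotated with term."""
    return sorted({db for db, _, t in annotations if t == term})


if __name__ == "__main__":
    for term, records in index_by_term(ANNOTATIONS).items():
        print(term, "->", records)
```

Because every database uses the same term for the same type, a lookup on the term retrieves records from all of them with no pairwise mapping between database schemas.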
39
Glue-ability / integration
rests on the existence of a common benchmark called ‘reality’: the ontologies we want to glue together are representations of what exists in the world, not of what exists in the heads of different groups of people
40
truth is correspondence to reality
41
simple representations can be true
42
there are true cartoons
43
a cartoon can be a veridical representation of reality
44
a network diagram can be a veridical representation of reality
47
pathway maps are representations of complexes of types
48
maps may be correct by reflecting topology, rather than geometry
49
an image can be a veridical representation of reality
a labeled image can be a more useful veridical representation of reality
50
an image labelled with computationally tractable labels can be an even more useful veridical representation of reality
51
annotations help us to find images
52
annotations using common ontologies can yield integration of image data
53
and link image databases together
[diagram: Gazetteer, GlyProt, CIA Factbook, GluChem; image label: ruins of Hadrami mosque]
54
if you’re going to semantically annotate piles of data, better work out how to do it right from the start
55
two kinds of annotations
56
names of types
57
names of instances
58
instances vs. types (dir.niehs.nih.gov/microarray/datamining/)
59
instances vs. types: types (dir.niehs.nih.gov/microarray/datamining/)
60
instances
61
molecular images and radiographic images are representations of instances
62
First basic distinction
type vs. instance (science text vs. diary) (human being vs. Tom Cruise)
63
For ontologies it is generalizations that are important = ontologies are about types, kinds
64
Inventory vs. Catalog: Two kinds of representational artifact
Databases represent instances; ontologies represent types
65
Catalog vs. inventory
A 515287 DC3300 Dust Collector Fan
B 521683 Gilmer Belt
C 521682 Motor Drive Belt
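The catalog/inventory distinction can be sketched in code. The three catalog entries are the ones listed on the slide; the inventory items and their serial numbers are hypothetical, added only to show instances that refer to catalog types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """A type: what kind of part this is."""
    part_number: str
    description: str


@dataclass(frozen=True)
class InventoryItem:
    """An instance: one physical part on the shelf."""
    serial: str        # hypothetical serial number, not from the slides
    part_number: str   # points to the type in the catalog


CATALOG = {
    "515287": CatalogEntry("515287", "DC3300 Dust Collector Fan"),
    "521683": CatalogEntry("521683", "Gilmer Belt"),
    "521682": CatalogEntry("521682", "Motor Drive Belt"),
}

INVENTORY = [
    InventoryItem("SN-0001", "521683"),
    InventoryItem("SN-0002", "521683"),
    InventoryItem("SN-0003", "515287"),
]


def count_by_type(inventory, catalog):
    """Count how many instances of each catalog type are in the inventory."""
    counts = {part_number: 0 for part_number in catalog}
    for item in inventory:
        if item.part_number not in catalog:
            raise KeyError(f"{item.serial} refers to an unknown type {item.part_number}")
        counts[item.part_number] += 1
    return counts
```

The catalog stays the same whether the shelf holds two Gilmer belts or two hundred; only the inventory changes. That is the sense in which an ontology plays the role of the catalog and a database the role of the inventory.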
66
Catalog vs. inventory
67
Catalog of types/Types
68
Ontology types Instances
69
Ontology = A Representation of types
70
An ontology is a representation of types
We learn about types in reality from looking at the results of scientific experiments in the form of scientific theories. Experiments relate to what is particular; science describes what is general.
71
[diagram: hierarchy of types (object, organism, animal, cat, mammal, siamese, frog) above their instances]
72
Ontologies are here
73
or here
74
ontologies represent general structures in reality (leg)
75
Ontologies do not represent concepts in people’s heads
76
They represent types in reality
77
which provide the benchmark for integration
78
My job here
Not tools: Leo Obrst, Chris Welty
Not instances: Werner Ceusters
Ontology content: the types in reality
79
How to build an ontology
create an initial top-level classification of your domain = ~50 most common types of entities
arrange the corresponding terms into an informal is_a hierarchy according to this universality principle: A is_a B means every instance of A is an instance of B
fill in missing terms to give a complete hierarchy
move on to populate the lower levels of the hierarchy
annotate your data
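The universality principle can be checked mechanically once some instance data is available. The sketch below is illustrative only. The type names are borrowed from the earlier organism/animal/cat slide, the parent-child ordering is one plausible reading of that slide, and the instance identifiers are invented. It also assumes the hierarchy contains no cycles.

```python
# Informal is_a hierarchy stored as child type -> parent type.
# The ordering is an assumed reading of the slide, not stated in it.
IS_A = {
    "organism": "object",
    "animal": "organism",
    "mammal": "animal",
    "frog": "animal",
    "cat": "mammal",
    "siamese": "cat",
}

# Hypothetical instances recorded for each type.
EXTENSION = {
    "siamese": {"cat-1"},
    "cat": {"cat-1", "cat-2"},
    "mammal": {"cat-1", "cat-2"},
    "frog": {"frog-1"},
    "animal": {"cat-1", "cat-2", "frog-1"},
    "organism": {"cat-1", "cat-2", "frog-1"},
    "object": {"cat-1", "cat-2", "frog-1"},
}


def ancestors(type_name, is_a=IS_A):
    """Walk up the hierarchy and return every more general type, nearest first."""
    chain = []
    while type_name in is_a:
        type_name = is_a[type_name]
        chain.append(type_name)
    return chain


def violations(is_a, extension):
    """Return asserted (A, B) pairs where some instance of A is not an instance of B."""
    return [
        (child, parent)
        for child, parent in is_a.items()
        if not extension.get(child, set()) <= extension.get(parent, set())
    ]


if __name__ == "__main__":
    print(ancestors("siamese"))
    print(violations(IS_A, EXTENSION))
```

An empty result from `violations` means no recorded instance contradicts the asserted is_a links. It does not prove the hierarchy correct, since the principle quantifies over all instances and not only the ones that happen to be recorded.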
80
Example domain: threat, vulnerability
Eric Little
81
Example domain: The ontology of documents
Hernando de Soto
82
valuable work on ‘documents’ in the context of XML, etc.
e.g. Bob Glushko: “A document is a purposeful and self-contained collection of information.” This focuses on information content, but there is more than information here
83
transactional documents
passport, contract, tax form, bill of lading, shipping authorization, plane ticket, visa
84
the legal powers of documents
the social interactions in which they play a role; the institutional systems to which they belong; provenance (original, copy, forgery ...)
85
document vs. attachments
signatures, seals, stamps ...
87
anchoring documents to reality
88
Countersignatures
89
document template vs. filled-in document
document vs. the piece of paper (or other physical carrier) upon which it is written/printed, ...
91
Standardized documents
filled in completely/partially, correctly/incorrectly, validly/invalidly
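One way to picture the template/filled-in distinction, and two of the three dimensions above, is the following sketch. The form name, field names and per-field checks are hypothetical. Validity in the legal sense (who signed, under what authority) is deliberately left out, since it depends on more than the content of the fields.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Template:
    """A document template: named fields, each with a correctness check."""
    name: str
    fields: Dict[str, Callable[[str], bool]]


@dataclass
class FilledDocument:
    """One filled-in document made from a template."""
    template: Template
    values: Dict[str, str] = field(default_factory=dict)

    def is_complete(self) -> bool:
        """True when every template field has a non-empty value."""
        return all(self.values.get(name, "").strip() for name in self.template.fields)

    def incorrect_fields(self):
        """Names of filled-in fields whose values fail the template's check."""
        return sorted(
            name
            for name, check in self.template.fields.items()
            if name in self.values and not check(self.values[name])
        )


# Hypothetical template, for illustration only.
VISA_FORM = Template(
    "visa application",
    {"surname": str.isalpha, "passport_number": str.isalnum},
)

partial = FilledDocument(VISA_FORM, {"surname": "Smith"})
wrong = FilledDocument(VISA_FORM, {"surname": "Sm1th", "passport_number": "X123"})
```

Here `partial` is filled in partially but correctly as far as it goes, while `wrong` is filled in completely but incorrectly, which shows that the two dimensions vary independently.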
92
from the Shiprock Navajo Fair, New Mexico, September 30-October 1, 2005
93
Standardized documents
allow networking across time, across space (individuals linked through document systems); improve the flow of communications; allow standardized transactions
94
Documents are artifacts
analogous to organizations, rules, prices, debts, claims and obligations ...
95
John Searle, The Construction of Social Reality
claims and obligations are brought into existence by the performance of speech acts
96
appointings, marryings, promisings
change the world. We perform a speech act ... the world changes, instantaneously
97
The de Soto thesis: document systems are mechanisms for creating the institutional orders of modern societies
98
stock and share certificates create capital
marriage licenses create bonds of matrimony; statutes of incorporation create companies; title deeds create property rights and property owners; insurance certificates create insurance coverage
99
Identity documents create identity
and thereby create the possibility of identity theft. What is the ontology of identity? What is the epistemology of identity (the technologies of identification)?
100
What you can do with a document
sign it, stamp it, witness it, fill it in, revise it, nullify it, deliver it (de facto, de jure) ...
101
types of document systems
types of document acts; types of document systems; types of document pathways
102
if you’re going to semantically annotate piles of data, better work out how to do it right from the start
103
Tomorrow: The problems, and a strategy for the future