1
SCEC Ontology Development
Tom Russ
Hans Chalupsky, Stefan Decker, Yolanda Gil, Jihie Kim, Varun Ratnakar
University of Southern California Information Sciences Institute
2
Outline
  Background
    SCEC Goals
    Ontology Basics
    Semantic Interoperability
  Examples
    Weather
    Seismology
    Building Computational Pathways
  Ontology Development
    SCEC Ontology Development
    Gene Ontology Development
    Fundamental Ontologies?
  Big Questions
3
Goals: SCEC/IT Project
4
What is an Ontology?
  An Ontology is a framework for representing shared conceptualizations of knowledge
  An Ontology provides:
    Definitions for objects and relations in the domain
    Shared vocabulary and common structure for modeling domain knowledge
    Domain model/theory that captures common knowledge about the domain
5
Syntactic Interoperability
  Current technologies focus on syntactic interoperability
    Standardized protocols, ports (TCP/IP, FTP, HTTP, SOAP, etc.)
    Standardized data formats (properly lined up bits, arrays, …)
    Standardized description languages (XML, WSDL, …)
    Standardized remote invocation mechanisms (RMI, CORBA, …)
  Interoperation possible as long as exchanged messages/data are syntactically correct (something a compiler could check)
6
Semantic Interoperability
  Claim: Full E-Science needs semantic interoperability
  Computer needs to understand what the bits mean (e.g., is this number a wavelength or a frequency, of what wave, in what context?)
  Facilitates stronger semantic integrity or compatibility checks
  Facilitates integration (automatic transformations and translations)
  Facilitates more informed search engines
7
Semantic Interoperability Story
  SCEC Java code for Community Velocity Model
    Inputs: longitude and latitude
    Output: Vs30 (m/s)
    Connection technology: Java serialization
  In other words:
    Ship the bits for two double precision floating point values through a network connection
    Make sure you send longitude first!
      – Non-standard convention for geography
      – Probably based on X-Y convention instead
  Better: More structured input
    Latitude=34.15 Longitude=-117.58
    Explicit identification of parameters
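The "more structured input" idea can be sketched in a few lines of Python. This is only an illustration, not the SCEC code: the `SiteLocation` class and `parse_site` helper are hypothetical names, and the range checks are an added assumption. The point is that named fields make the order of the two numbers irrelevant, and a swapped pair is caught instead of silently accepted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteLocation:
    """Geographic point with explicitly named, range-checked fields (degrees)."""
    latitude: float
    longitude: float

    def __post_init__(self):
        # Range checks are an assumption of this sketch, not part of the slide.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


def parse_site(text: str) -> SiteLocation:
    """Parse 'Latitude=34.15 Longitude=-117.58'; the field order does not matter."""
    fields = dict(item.split("=", 1) for item in text.split())
    return SiteLocation(latitude=float(fields["Latitude"]),
                        longitude=float(fields["Longitude"]))


# Same site, fields given in the opposite order.
site = parse_site("Longitude=-117.58 Latitude=34.15")
```

With two bare doubles, sending latitude first would go unnoticed; here `SiteLocation(-117.58, 34.15)` raises a `ValueError` because -117.58 is not a valid latitude.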
8
Ontologizing a Domain such as “Weather”
9
Identify Relevant Domain Concepts
  Conditions for Joint Tasks (from: CJCSM 3500.04A 9/13/96, p. 3-11)
10
Weather Specification in English (from: CJCSM 3500.04A 9/13/96, p. 3-11)
  C 1.3.1.3 Weather
    Definition: current weather (next 24 hours).
    Descriptors: clear, partly cloudy, overcast, precipitating, stormy
  C 1.3.1.3.1 Air Temperature
    Definition: atmospheric temperature at ground level
    Descriptors:
      Hot (> 85° F)
      Temperate (40° to 85° F)
      Cold (10° to 39° F)
      Very Cold (< 10° F)
11
Formalizing Domain Concepts
  A knowledge-based system about “Weather” must know things like these:
  Terms
    hot, humid, windy, ...
  Definitions
    cold = (10° to 39° F)
  Relationships
    cold and windy may overlap
    cold and hot are disjoint
    cold and very cold are disjoint!
  Rules
    IF heavy rain lasts 2 days THEN muddy terrain and excessive runoff (probability 0.9)
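The definitions and disjointness relationships above can be made concrete with a small Python sketch. The predicate names and the `descriptors` helper are hypothetical. The temperature bounds follow the air-temperature descriptors in the specification, with one assumption: readings between 39° and 40° F, which the English text leaves uncovered, are treated as cold.

```python
# Air-temperature descriptors as predicates over degrees Fahrenheit.
# Assumption: the gap between 39 and 40 in the English text is assigned to "cold".
PREDICATES = {
    "very cold": lambda t: t < 10,
    "cold": lambda t: 10 <= t < 40,
    "temperate": lambda t: 40 <= t <= 85,
    "hot": lambda t: t > 85,
}


def descriptors(temp_f: float) -> list:
    """Return every temperature descriptor that applies to temp_f."""
    return [name for name, applies in PREDICATES.items() if applies(temp_f)]


def are_disjoint(a: str, b: str, samples) -> bool:
    """Check on sample temperatures that descriptors a and b never hold together."""
    return not any(PREDICATES[a](t) and PREDICATES[b](t) for t in samples)


# Sample temperatures from -50 to 149.5 degrees F in half-degree steps.
samples = [x / 2 for x in range(-100, 300)]
cold_vs_very_cold = are_disjoint("cold", "very cold", samples)
```

A descriptor such as "windy" would be defined over a different quantity (wind speed), which is why it can overlap with "cold" while the temperature descriptors cannot overlap with each other.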
12
Earthquake Hazard Analog
NEHRP Soil Types

Soil Type  Description  Vs (m/s)                Rock Types
A          Hard Rock    > 1500                  Unweathered igneous intrusive
B          Rock         760 - 1500, 750 - 1500  Volcanics, most Mesozoic bedrock, some Franciscan bedrock
C          Soft Rock    360 - 760, 350 - 750    Some Quaternary and Tertiary sands, sandstones and mudstones. Some Franciscan melange & serpentinite
D          Stiff Soil   180 - 360, 200 - 350    Some Quaternary muds, sands, gravels, silts and mud
E          Soft Soil    < 180, < 200            Water-saturated mud and artificial fill
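Like the temperature descriptors, the soil types amount to a definition by numeric range. A minimal Python sketch follows, using the first Vs range listed for each type (1500, 760, 360 and 180 m/s). The function name is hypothetical, and assigning a value that falls exactly on a boundary to the stiffer class is an assumption, since the table does not say which class owns the boundary.

```python
def nehrp_soil_type(vs30: float) -> str:
    """Map a shear-wave velocity Vs30 (m/s) to a NEHRP soil type letter A-E."""
    if vs30 <= 0:
        raise ValueError(f"Vs30 must be positive, got {vs30}")
    if vs30 > 1500:
        return "A"  # Hard Rock
    if vs30 >= 760:
        return "B"  # Rock
    if vs30 >= 360:
        return "C"  # Soft Rock
    if vs30 >= 180:
        return "D"  # Stiff Soil
    return "E"      # Soft Soil


# Classify a few illustrative velocities (made-up values, not measured data).
examples = {v: nehrp_soil_type(v) for v in (2000.0, 900.0, 500.0, 250.0, 150.0)}
```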
13
PowerLoom: Hypocenter vs. Epicenter
  The epicenter is the point on the surface directly above the hypocenter.
  “Directly above”, more formally:
    The latitude and longitude of the epicenter and hypocenter are the same.
    The epicenter depth is zero.

;; The hypocenter is a function from an earthquake source to a location.
(deffunction source-hypocenter ((?s earthquake-source)) :-> (?h location)
  :documentation "The 3D point where the rupture started.")

;; The epicenter shares latitude and longitude with the hypocenter
;; and has a depth of zero.
(deffunction source-epicenter ((?s earthquake-source)) :-> (?e location)
  :documentation "The point on the earth's surface directly above the hypocenter"
  :axioms (=> (earthquake-source ?s)
              (and (= (latitude-of (source-hypocenter ?s))
                      (latitude-of (source-epicenter ?s)))
                   (= (longitude-of (source-hypocenter ?s))
                      (longitude-of (source-epicenter ?s)))
                   (= (depth-of (source-epicenter ?s))
                      (units 0 "m")))))
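For readers unfamiliar with PowerLoom syntax, the same constraint can be sketched procedurally in Python. This is not a translation of the logic (PowerLoom states an axiom to reason with, while this code computes a value). The `Location` class and `epicenter_of` function are hypothetical names, and the coordinates are illustrative.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Location:
    latitude: float   # degrees
    longitude: float  # degrees
    depth_m: float    # metres below the surface


def epicenter_of(hypocenter: Location) -> Location:
    """Same latitude and longitude as the hypocenter, with depth set to zero."""
    return replace(hypocenter, depth_m=0.0)


# Illustrative hypocenter (made-up coordinates) and its derived epicenter.
hypo = Location(latitude=34.2, longitude=-118.5, depth_m=18000.0)
epi = epicenter_of(hypo)
```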
14
PowerLoom
  Knowledge representation & reasoning system
  Uses definitions specified in a formal logic
    First order predicate calculus
    Expressive: We can say what we need to
  Inference via logical deductions
  Support for units and dimensions
  Browsing tool: Ontosaurus
15
Ontosaurus
  Diagrams and images aid domain familiarization
  Display of formal information and rules
  Navigation Tools and Control Panel
  Domain facts
  Textual documentation
16
Graphical View: Fault Hierarchy
17
Plan: Building Computational Pathways
  Simple scenario to illustrate how a user would define computational pathways
  Behind the scenes, DOCKER uses descriptions of components, their I/O requirements and their constraints to:
    detect errors in user’s input
    suggest additional steps needed to make the pathway work
    make educated guesses about how components selected by the user may be connected to one another
18
Compute PGA for an Address Using These Components
  [Diagram: components and their data labels]
  Geocoder: Address, Lat/long
  Earthquake Forecast Model (USGS-02): Lat/long, Time Span, Fault-type, Magnitude
  Community Velocity Model: Lat/long, Vs30
  Distance Computation: Lat/long1, Lat/long2, Distance
  Attenuation Relationship (Field-2000): Fault-type, Magnitude, Vs30, Distance, PGA
  Attenuation Relationship (Campbell-02): Fault-type, Magnitude, Site Type, Distance, PGA
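The kind of checking attributed to DOCKER (detecting missing inputs and suggesting additional steps) can be sketched as follows. This is not DOCKER's implementation. The split of each component's labels into inputs and outputs is an assumed reading of the diagram, the function names are hypothetical, and plain string matching on data names stands in for the ontology-based descriptions the slides refer to.

```python
# component name -> (input data names, output data names); assumed from the diagram labels
COMPONENTS = {
    "Geocoder": ({"Address"}, {"Lat/long"}),
    "Earthquake Forecast Model (USGS-02)": ({"Lat/long", "Time Span"},
                                            {"Fault-type", "Magnitude"}),
    "Community Velocity Model": ({"Lat/long"}, {"Vs30"}),
    "Distance Computation": ({"Lat/long1", "Lat/long2"}, {"Distance"}),
    "Attenuation Relationship (Field-2000)": (
        {"Fault-type", "Magnitude", "Vs30", "Distance"}, {"PGA"}),
}


def missing_inputs(selected, user_data):
    """Return {component: sorted missing inputs} for the selected components.

    An input counts as available if the user supplies it or any selected
    component lists it as an output (execution order is ignored here).
    """
    available = set(user_data)
    for name in selected:
        available |= COMPONENTS[name][1]
    report = {}
    for name in selected:
        missing = COMPONENTS[name][0] - available
        if missing:
            report[name] = sorted(missing)
    return report


def suggest_producers(data_name):
    """Suggest components whose outputs include the given data name."""
    return sorted(n for n, (_, outs) in COMPONENTS.items() if data_name in outs)


# The user picked only two components and supplied an address.
report = missing_inputs(["Geocoder", "Attenuation Relationship (Field-2000)"],
                        {"Address"})
vs30_sources = suggest_producers("Vs30")
```

Even with every component selected, this name-based check still reports `Lat/long1` and `Lat/long2` as missing for the distance computation, because nothing produces data under exactly those names. Resolving that kind of mismatch takes knowledge of what the data mean, which is the role the ontology plays.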
19
Some Data Paths Connect Easily
  [Diagram: Geocoder, Earthquake Forecast Model (USGS-02), Community Velocity Model, Distance Computation and Attenuation Relationship (Field-2000), with data labels Address, Lat/long, Time Span, Fault-type, Magnitude, Vs30, Lat/long1, Lat/long2, Distance and PGA]
20
Others Require Transformation
  [Diagram: Geocoder, Earthquake Forecast Model (USGS-02), Community Velocity Model, Distance Computation and Attenuation Relationship (Field-2000), with data labels Address, Lat/long, Time Span, Fault-type, Magnitude, Vs30, Lat/long1, Lat/long2, Distance and PGA]
21
Developing Ontologies
22
SCEC Ontology Development
  Task-driven
    Particular application
    Modeled on domain inferences & reasoning
  Small team of Computer Scientists
    Seismology - Tom Russ
    Models - Jihie Kim, Varun Ratnakar, Tom Russ
  Small group of Domain Experts
    Ned Field and Tom Jordan
  Future
    Development and curation by domain experts
    Requires methodology
    Requires tools
23
Capture Inference in Ontology
  Ned Field’s markup of fault parameter data
  Computation and checking of properties
  Definitions of Terms
24
The Gene Ontology (GO)
  Had a successful jumpstart
  Done by biologists, not knowledge engineers
  Developed by a wide, distributed community
  Focused on specific aspects of genomics
    FlyBase, yeast, mouse
  Used 24/7 from day 1
  Accepted widely by the community
  Extended based on use requirements of a wide community
  Quite large (30-40K terms)
25
Jumpstart of GO: Key Decisions (1)
  Limited scope
    Limit the domain, though it could have included many more areas
      – Did not let anyone else in until they got somewhere
      – Added new groups incrementally (10)
    3 related areas
  Open (no licenses), use open standards
  Involve the community
  Had to develop own software
    Control over own code
  KISS: keep it simple, stupid
    – E.g., only two relations
    Transitivity
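Why a small set of transitive relations keeps things simple can be shown with a short Python sketch. The slide does not name the two relations, so this assumes they are is-a and part-of. The term names are placeholders, not real GO terms, and the `ancestors` helper is hypothetical.

```python
# (child, relation, parent) triples; term names are placeholders, not GO terms.
EDGES = {
    ("term_c", "is_a", "term_b"),
    ("term_b", "is_a", "term_a"),
    ("term_d", "part_of", "term_b"),
}


def ancestors(term: str, relation: str) -> set:
    """All terms reachable from `term` by following `relation` transitively."""
    parents = {}
    for child, rel, parent in EDGES:
        if rel == relation:
            parents.setdefault(child, set()).add(parent)
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# term_c is_a term_b and term_b is_a term_a, so term_c is_a term_a follows.
is_a_ancestors = ancestors("term_c", "is_a")
```

Because the relation is transitive, an annotation attached to `term_c` can be found by a query on `term_a` without anyone recording that link explicitly.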
26
Key Decisions (2)
  Use it from the beginning
    If you wait to have the ontology finished before using it, you will never get there
    Errors would only be discovered through use
    Set things up so that you are OK when you have to fix those errors (entire chunks of ontology had to be entirely redone)
    Minimized change impacts: most changes are limited to relations, which in practice does not impact the annotations
  Face-to-face meetings 3-4 times a year
  Satisfied a need for DB users that wanted to ask complex queries (1 query to all DBs)
  Establish migration path
27
Key Decisions (3)
  Requests are resolved either:
    Immediately
    Over email, if closure can be reached within 2-3 days
      – No voting, only consensus
    On the agenda for the next meeting
  Attribution was important
    Learned that from FlyBase
    Both GO content and annotations are annotated with attribution
  Unique identifiers within GO
    The term can change as a lexical string, but with no change in meaning and thus no change in identifier
    If the definition changes, the identifier changes, even if the GO string does not
  Small number of relations
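The identifier policy can be illustrated with a toy registry in Python. This is a sketch of the rule as stated on the slide, not GO's actual software. The `TermRegistry` class, its method names and the `T:0000001` identifier format are all made up, and keeping the old entry after a redefinition is an assumption, since the slide does not say what happens to it.

```python
import itertools


class TermRegistry:
    """Toy registry; the 'T:0000001' identifier format is made up for this sketch."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.terms = {}  # identifier -> {"name": ..., "definition": ...}

    def add(self, name, definition):
        term_id = f"T:{next(self._counter):07d}"
        self.terms[term_id] = {"name": name, "definition": definition}
        return term_id

    def rename(self, term_id, new_name):
        """Lexical change only: the meaning is unchanged, so the identifier is kept."""
        self.terms[term_id]["name"] = new_name
        return term_id

    def redefine(self, term_id, new_definition):
        """Meaning changes: issue a new identifier even though the name stays the same.

        Assumption: the old entry is left in place under its old identifier.
        """
        name = self.terms[term_id]["name"]
        return self.add(name, new_definition)


registry = TermRegistry()
original = registry.add("example term", "first definition")
same_id = registry.rename(original, "example term (renamed)")
new_id = registry.redefine(original, "revised definition")
```

Annotations that point at `original` keep their meaning after the rename, which is what lets the wording of a term evolve without invalidating existing data.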
28
Fundamental Ontologies
  What is out there? Not much.
  Ontolingua (Stanford University) has a number of small component ontologies
    – Designed as components
    – Not tied to applications
  DAML is working on fundamental physics ontologies (Jerry Hobbs, SRI International, ISI, Ken Forbus, others)
    – Time
    – Space
  We would like input from GEON!
29
Some BIG Questions (from Gene Ontology Workshop)
  How do you get started?
  How to ensure the community will accept it (use it)?
  How do you (can you?) represent alternative views?
  What is the process to contribute to it?
  What is the process to make changes to it?
  What happens when there is an update?
  How is it implemented? What tools?
  How is it managed? Who does what, when, where, why?