GEON AHM, April 16-18, SDSC. CYBERINFRASTRUCTURE FOR THE GEOSCIENCES. Towards Semantic Mediation for GEON: Facilitating Scientific Data Integration using Knowledge Representation

Presentation transcript:

Towards Semantic Mediation for GEON: Facilitating Scientific Data Integration using Knowledge Representation
Bertram Ludäscher, Data and Knowledge Systems, San Diego Supercomputer Center, U.C. San Diego

Acknowledgements
“Smart” Geologic Map Prototype: Kai Lin, Data and Knowledge Systems, San Diego Supercomputer Center
Geo-Knowledge-Engineer: Boyan Brodaric, Natural Resources Canada
... and many GEONites: Dogan, Krishna, ..., State Geologic Surveys, Chaitan, Ilya, Michalis, Ashraf, ... (upcoming demo)
GEON Metamorphism Equation: Geoscientists + Computer Scientists → Igneous Geoinformaticists +/- Energy

GEON and “Semantic” Data Integration
[Figure labels: Rocky Mountains; Midatlantic Region]

What is Knowledge Representation?
Relating Theory to the World via Formal Models
Source: John F. Sowa, Knowledge Representation: Logical, Philosophical, and Computational Foundations
“All models are wrong, but some are useful!”

What is (an) “Ontology”??? (... what CS graduate students need to know ...)
1. Ontology as a philosophical discipline
2. Ontology as an informal conceptual system
3. Ontology as a formal semantic account
4. Ontology as a specification of a “conceptualization”
5. Ontology as a representation of a conceptual system via a logical theory
5.1 characterized by specific formal properties
5.2 characterized only by its specific purposes
6. Ontology as the vocabulary used by a logical theory
7. Ontology as a (meta-level) specification of a logical theory
[Guarino’95]

What is an Ontology? (CSE-291 cont’d ;-)
Given a logical language L ...
– ... a conceptualization is a set of models of L which describes the admittable (intended) interpretations of its non-logical symbols (the vocabulary)
– ... an ontology is a (possibly incomplete) axiomatization of a conceptualization.
[Figure labels: conceptualization C(L); ontology; set of all models M(L); logic theories] [Guarino96]
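
One way to write out the nesting suggested by the figure labels, as a hedged sketch: the symbols O (the ontology's set of axioms) and Mod(O) (the models satisfying those axioms) are notation introduced here for illustration and do not appear on the slide.

```latex
\[
  C(L) \;\subseteq\; \mathrm{Mod}(O) \;\subseteq\; M(L)
\]
```

Read this way, every intended model in C(L) satisfies the axioms O, but because the axiomatization may be incomplete, Mod(O) can also contain unintended models. The ontology approximates the conceptualization rather than pinning it down exactly.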

Problem: Scientific Data Integration ... from Questions to Queries ...
Questions:
– What are the distribution and U/Pb zircon ages of A-type plutons in VA?
– How about their 3-D geometry?
– How does it relate to host rock structures?
Information Integration across sources:
– Geologic Map (Virginia)
– GeoChemical
– GeoPhysical (gravity contours)
– GeoChronologic (Concordia)
– Foliation Map (structure DB)
“Complex Multiple-Worlds” Mediation
[Figure labels: raw data; Data modeling; Database mediation; Knowledge Representation: ontologies, concept spaces; domain knowledge]

Got Glue? Which one? What for?
XML (common syntax)
– flexible (semistructured) data model
– used at all levels: data / metadata exchange, message exchange (SOAP), schemas & data types (XML Schema), Semantic Web & web ontologies (RDF(S), OWL), ...
Grid infrastructure (system interoperation)
– distributed computing and data management
– web services
Controlled Vocabularies (“joins”)
– data level: joins across different data sets
– but meta-data and ontologies (concept names, relationship names, ...) are also data!
Integrated View Definitions (mediated views/virtual databases)
– declarative specification of “integration logic”: XQuery, Datalog, ...
Thesauri (translator for retrieving related information)
– synonyms, broader/narrower terms, e.g., UMLS (meta-thesaurus, “ontology”)
Taxonomies (classification)
– shared vocabulary, concept hierarchy (is-a)
Ontologies (classification + additional semantics):
– formal specification of a conceptualization, shared meaning
– facilitates “smart querying”, semantic mediation

Information Integration Challenges
System aspects: “Grid” Middleware
– distributed data & computing
– Web Services, WSDL/SOAP, OGSA, …
– sources = functions, files, data sets, …
Syntax & Structure: (XML-Based) Data Mediators
– wrapping, restructuring
– (XML) queries and views
– sources = (XML) databases
Semantics: Model-Based/Semantic Mediators
– conceptual models and declarative views
– Knowledge Representation: ontologies, description logics (RDF(S), OWL, ...)
– sources = knowledge bases (DB+CMs+ICs)
=> reconciling S4 heterogeneities (Syntax, Structure, Semantics, System aspects)
=> “gluing” together multiple data sources
=> bridging information and knowledge gaps computationally

Standard (XML-Based) Mediator Architecture
USER/Client: Query Q(G(S1, ..., Sk))
MEDIATOR: Integrated Global (XML) View G, with Integrated View Definition G(..) ← S1(..) … Sk(..)
(XML) Queries & Results flow between the mediator and the wrappers
Sources S1, S2, ..., Sk: each behind a Wrapper that exports an (XML) View
wrappers implemented as web services
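
The architecture on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`make_wrapper`, `global_view`, `query`), the field names, and the sample records are all hypothetical, the global view is taken to be a simple tagged union of the wrapped sources, and plain dictionaries stand in for XML.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Wrapper = Callable[[], Iterable[Record]]  # a wrapper exports one source as records


def make_wrapper(rows: List[Record], rename: Dict[str, str]) -> Wrapper:
    """Wrap a source: rename its local fields to the names used by the global view."""
    def export() -> Iterable[Record]:
        for row in rows:
            yield {rename.get(key, key): value for key, value in row.items()}
    return export


def global_view(wrappers: Dict[str, Wrapper]) -> Iterable[Record]:
    """Integrated view definition G(..) <- S1(..) ... Sk(..), here a tagged union."""
    for source_name, export in wrappers.items():
        for record in export():
            yield {"source": source_name, **record}


def query(view: Iterable[Record], predicate: Callable[[Record], bool]) -> List[Record]:
    """Client query Q evaluated against the global view, i.e. Q(G(S1, ..., Sk))."""
    return [record for record in view if predicate(record)]


# Hypothetical sample sources; names and values are placeholders, not real data.
s1 = make_wrapper([{"FMATN": "Formation A", "PERIOD": "Tertiary"}],
                  {"FMATN": "formation", "PERIOD": "age"})
s2 = make_wrapper([{"NAME": "Formation B", "TIME_UNIT": "Cretaceous"}],
                  {"NAME": "formation", "TIME_UNIT": "age"})

result = query(global_view({"S1": s1, "S2": s2}),
               lambda record: record["age"] == "Cretaceous")
print(result)
```

This sketch materializes the whole view before filtering. A real mediator would more likely decompose the client query and push sub-queries down to the wrappers.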

XML-Based vs. Semantic Mediation
XML-Based Mediation:
– Integrated-DTD := XQuery(Src1-DTD, ...)
– XML Models, XML Elements, (XML) Objects
– Structural Constraints (DTDs): A = (B*|C),D  B = ...
– Parent, Child, Sibling, ...
– No Semantics / Domain Constraints
Semantic Mediation:
– Integrated-CM := CM-QL(Src1-CM, ...)
– Conceptual Models: Classes, Relations, is-a, has-a, ... (figure: classes C1, C2, C3 and relation R)
– Semantics, Constraints in Logic: IF … THEN …
– “Glue Maps”: ontologies, concept spaces
– CM ~ {Descr. Logic, ER, UML, RDF(S), …}
– CM-QL ~ {F-Logic, …}
Raw Data: , ,2,140,29,Tertiary,Trc,CHINLE FORMATION,59,57

GEON Framework for Interoperability in the Geosciences
Systems level: GEON Grid ...
– enable sharing of data and tools via grid services
– based on Open Grid Services Architecture (OGSA)
– acquisition of cluster endpoints and initial deployment at some sites underway, including SDSC, UTEP, VT, ...
Syntactic and schema level: data integration via (meta)data standards (often XML-based)
– database mediators create integrated virtual databases => dynamic creation and automatic update of data warehouses
Semantic level: data integration via “semantic” mediation
– situating 4-D data in context: spatio-temporal, thematic, process contexts can be represented as “concept spaces”
– specifically: use of ontologies and logic-based knowledge representation
– development guided/driven by specific scientific data integration problems

Towards Shared Conceptualizations: High-level Domain Ontology & Standard Data Model
Source: NADAM Team (Boyan Brodaric et al.)
Adoption of a standard (meta)data model => wrap data sets into unified virtual views

Towards Shared Conceptualizations: Data Contextualization via Concept Spaces

Towards Knowledge Sharing: Rock-type “Ontology”
[Figure labels: Composition; Genesis; Fabric; Texture]

Getting Formal: Source Contextualization & Ontology Refinement in Logic
Biomedical Informatics Research Network

Show formations where AGE = ‘Paleozoic’ (without age ontology)
Show formations where AGE = ‘Paleozoic’ (with age ontology)
[Figure labels: domain knowledge; Knowledge representation; AGE ONTOLOGY; Nevada]

Querying with Multiple Classifications/Ontologies: Age, Composition, Texture, Fabric, Genesis

What to do with the “KR Glue”?
Conceptual-level information, concept spaces, ontologies, and other KR techniques for ...
– ... smart data discovery
– ... browsing and querying by themes, disciplines, ...
– ... defining virtual/mediated databases at conceptual level
– ... support “plugging together” of “data and information experiments” into Scientific Workflows (a.k.a. Analytical Pipelines in the SEEK ITR)
– ... smarter user interfaces: is “find felsic sedimentary rocks” a meaningful (satisfiable) query?
– ...
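
The satisfiability question in the last bullet can be illustrated with a toy check. This is a minimal sketch, not the talk's method: the is-a links and the disjointness axiom below are hypothetical and chosen only to make the example work, so they should not be read as a statement about rock classification.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical toy ontology: child -> parent (is-a). Not taken from the talk.
IS_A: Dict[str, str] = {
    "felsic rock": "igneous rock",
    "mafic rock": "igneous rock",
    "igneous rock": "rock",
    "sedimentary rock": "rock",
    "sandstone": "sedimentary rock",
}

# Hypothetical disjointness axioms: no instance may belong to both concepts.
DISJOINT: Set[FrozenSet[str]] = {frozenset({"igneous rock", "sedimentary rock"})}


def ancestors_or_self(concept: str) -> Set[str]:
    """Return the concept together with everything it is-a, transitively."""
    seen = {concept}
    while concept in IS_A:
        concept = IS_A[concept]
        seen.add(concept)
    return seen


def satisfiable(concepts: Iterable[str]) -> bool:
    """A conjunction of concepts is unsatisfiable if it implies two disjoint concepts."""
    implied: Set[str] = set()
    for concept in concepts:
        implied |= ancestors_or_self(concept)
    return not any(pair <= implied for pair in DISJOINT)


print(satisfiable(["felsic rock", "sedimentary rock"]))
print(satisfiable(["felsic rock", "igneous rock"]))
```

Under these assumed axioms the first query is rejected before any source is contacted, which is the kind of feedback a smarter user interface could give.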

Some enabling operations on “ontology data”
[Figure label: Composition]
Concept expansion: what else to look for when asking for ‘Mafic’
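
Concept expansion can be sketched as a traversal of the narrower-term links below the requested concept. The hierarchy fragment and the function name `expand` are hypothetical, introduced only for illustration. The actual composition ontology from the talk is not reproduced in this transcript.

```python
from collections import deque
from typing import Dict, List

# Hypothetical fragment of a composition hierarchy: concept -> narrower concepts.
NARROWER: Dict[str, List[str]] = {
    "Composition": ["Felsic", "Mafic"],
    "Mafic": ["Basaltic", "Gabbroic"],
    "Basaltic": ["Tholeiitic"],
}


def expand(concept: str) -> List[str]:
    """Return the concept plus all transitively narrower concepts, breadth-first."""
    order = [concept]
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for child in NARROWER.get(current, []):
            if child not in seen:  # guard against revisiting shared sub-concepts
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


print(expand("Mafic"))  # ['Mafic', 'Basaltic', 'Gabbroic', 'Tholeiitic']
```

A query for ‘Mafic’ would then match records annotated with any of the returned terms, not just the literal string.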

Some enabling operations on “ontology data”
[Figure label: Composition]
Generalization: finding data that is “like” X and Y
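
One plausible reading of generalization, sketched under assumptions: find the most specific concept that subsumes both X and Y, then return the data classified anywhere under it. The hierarchy, the record layout, and the function names (`path_to_root`, `generalize`, `like`) are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical is-a links: concept -> broader concept.
PARENT: Dict[str, str] = {
    "Basaltic": "Mafic",
    "Gabbroic": "Mafic",
    "Granitic": "Felsic",
    "Mafic": "Composition",
    "Felsic": "Composition",
}


def path_to_root(concept: str) -> List[str]:
    """Return the concept followed by its successively broader concepts."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def generalize(x: str, y: str) -> Optional[str]:
    """Return the most specific concept that subsumes both x and y, if any."""
    broader_than_y = set(path_to_root(y))
    for candidate in path_to_root(x):
        if candidate in broader_than_y:
            return candidate
    return None


def like(records: List[dict], x: str, y: str) -> List[dict]:
    """Select records whose concept falls under the generalization of x and y."""
    target = generalize(x, y)
    return [r for r in records if target in path_to_root(r["composition"])]


# Placeholder records for illustration only.
records = [
    {"id": 1, "composition": "Basaltic"},
    {"id": 2, "composition": "Gabbroic"},
    {"id": 3, "composition": "Granitic"},
]
print(generalize("Basaltic", "Gabbroic"))
print([r["id"] for r in like(records, "Basaltic", "Gabbroic")])
```

The further apart X and Y sit in the hierarchy, the broader the generalization becomes, so a practical system might cap how far up it is allowed to climb.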

Towards Knowledge Sharing: Rock-type Ontology
[Figure labels: Composition; Genesis; Fabric; Texture]

DEMO ... do NOT click this ...

Architecture of Integrated Geologic Map Prototype System
Components (figure labels):
– HTTP Server (Java Server Page)
– MapServer (Minnesota)
– Mediator (Java application)
– Database (Arizona), Database (Montana)
– Map Definition: local layer, remote layer, local layer
– Global Ontology Definitions: Rock classification, Geologic age
– request / response

Data Source Wrapping and Integration
Sources: Arizona, Colorado, Utah, Nevada, Wyoming, New Mexico, Montana East, Idaho, Montana West
Wrapped views: seven sources expose Formation and Age; two expose Formation, Age, Composition, Fabric, and Texture
Source column names shown in the figure: ABBREV, PERIOD, NAME, TYPE, TIME_UNIT, FMATN, FORMATION, LITHOLOGY, AGE
Example values: andesitic sandstone; Livingston formation; Tertiary-Cretaceous
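
Wrapping sources with differing column names into one view can be sketched as a per-source column mapping. The column names below appear on the slide, but which source uses which column is not recoverable from this transcript, so the assignments, the source identifiers, and the function name `wrap` are assumptions made for illustration.

```python
from typing import Dict, Optional

# Attributes of the unified view, as listed on the slide.
UNIFIED = ("formation", "age", "composition", "fabric", "texture")

# Hypothetical mappings: unified attribute -> local column name, per source.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "source_a": {"formation": "ABBREV", "age": "PERIOD"},
    "source_b": {"formation": "NAME", "age": "TIME_UNIT"},
    "source_c": {"formation": "FMATN", "age": "PERIOD"},
}


def wrap(source: str, row: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Translate one local row into the unified view; unmapped attributes become None."""
    mapping = MAPPINGS[source]
    return {
        attribute: row.get(mapping[attribute]) if attribute in mapping else None
        for attribute in UNIFIED
    }


unified_row = wrap("source_b",
                   {"NAME": "Livingston formation", "TIME_UNIT": "Tertiary-Cretaceous"})
print(unified_row)
```

Keeping the mapping as data rather than code means a new source can be added by writing one more dictionary entry.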

Ontology-Enabled Query Processing
User: “Show formations from Cenozoic!”
Query Rewriting with the Age Ontology (Cenozoic: Quaternary, Tertiary):
select FORMATION where AGE=“Tertiary” or AGE=“Quaternary”
Sources: Arizona, Montana West
[Figure residue: source table fragments with columns ABBREV, PERIOD, FORMATION, LITHOLOGY and rows such as Tkgm/Tertiary, Q/Quaternary, Qg/Quaternary, Twp/Tertiary, Twl/Tertiary]
Color Definition, then Map Rendering
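
The rewriting step on this slide can be sketched as follows. The age ontology fragment (Cenozoic with sub-periods Tertiary and Quaternary) and the target query string come from the slide. The function names `leaves` and `rewrite` and the `age_column` parameter are hypothetical, and the output uses the slide's informal query notation rather than real SQL.

```python
from typing import Dict, List

# Age ontology fragment from the slide: period -> sub-periods.
AGE_ONTOLOGY: Dict[str, List[str]] = {"Cenozoic": ["Tertiary", "Quaternary"]}


def leaves(term: str) -> List[str]:
    """Replace a term by its most specific sub-terms; a term without children stays as is."""
    children = AGE_ONTOLOGY.get(term)
    if not children:
        return [term]
    result: List[str] = []
    for child in children:
        result.extend(leaves(child))
    return result


def rewrite(attribute: str, term: str, age_column: str = "AGE") -> str:
    """Rewrite a query on a broad age term into a disjunction over its sub-terms."""
    conditions = " or ".join(f'{age_column}="{sub}"' for sub in leaves(term))
    return f"select {attribute} where {conditions}"


print(rewrite("FORMATION", "Cenozoic"))
# select FORMATION where AGE="Tertiary" or AGE="Quaternary"
```

Rewriting to the most specific terms only would miss records annotated directly with the broad term or with a compound value, so a fuller version might keep the original term in the disjunction as well. The `age_column` parameter is there because, as the wrapping slide suggests, sources may name their age column differently.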

Integration Challenges
MANY!
– non-available or non-interoperable data
– “Dirty data”, no controlled vocabularies
– Many different controlled vocabularies! (“clean data”)
– What is entailed by a vocabulary?
=> Formal Ontologies
=> Extensible Ontologies

What’s next?
YOU!
GEON-SCI:
– Science questions waiting to be turned into queries!
GEON-KR Working Group activities
– guided (if not driven) by geoscientists
– marry KR technologies to standards (W3C, Semantic Web: RDF, OWL, ...)
– collect GEON-able KR resources (data models, controlled vocabularies, ontologies, ...)
GEON-DEV:
– generalize and merge current KR/semantic mediation architecture with standard Grid architecture
– building systems