Published by Catherine Fisher. Modified over 9 years ago.
1
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES www.geongrid.org
GEON IT Advances:
⁃ Data Integration
⁃ GEON Workbench
⁃ Scientific Workflows
Bertram Ludäscher, Kai Lin, Ilkay Altintas, Efrat Jaeger
San Diego Supercomputer Center, University of California, San Diego
2
The Problem: Scientific Data Integration
or: … from Questions to Queries …
3
Information Integration Challenges: S^4 Heterogeneities
Systems Integration
– platforms, devices, data & service distribution, APIs, protocols, …
Grid middleware technologies
+ e.g. single sign-on, platform independence, transparent use of remote resources, …
Syntax & Structure
– heterogeneous data formats (one for each tool...)
– heterogeneous data models (RDBs, ORDBs, OODBs, XMLDBs, flat files, …)
– heterogeneous schemas (one for each DB...)
Database mediation technologies
+ XML-based data exchange, integrated views, transparent query rewriting, …
Semantics
– fuzzy metadata, terminology, “hidden” semantics, implicit assumptions, …
Knowledge representation & semantic mediation technologies
+ “smart” data discovery & integration
+ e.g. ask about X (‘mafic’); find data about Y (‘diorite’); be happy anyways!
4
Information Integration Challenges: S^5 Heterogeneities
Synthesis of analysis pipelines, integrated apps & data products, …
– How to make use of these wonderful things & put them together to solve a scientist’s problem?
Scientific Problem Solving Environments
GEON Portal and Workbench (“scientist’s view”)
+ ontology-enhanced data registration, discovery, manipulation
+ creation and registration of new data products from existing ones, …
GEON Scientific Workflow System (“engineer’s view”)
+ for designing, re-engineering, deploying analysis pipelines and scientific workflows; a tool to make new tools …
+ e.g., creation of new datasets from existing ones, dataset registration, …
5
Ontology-Enabled Application Example: Geologic Map Integration
Show formations where AGE = ‘Paleozoic’ (without age ontology)
Show formations where AGE = ‘Paleozoic’ (with age ontology)
+/- a few hundred million years
[Figure labels: domain knowledge; Knowledge representation; AGE ONTOLOGY; Nevada]
6
Querying by Geologic Age …
7
Querying by Geologic Age: Result
8
Querying by Chemical Composition … (GSC)
9
Querying by Chemical Composition: Results
DO know: It’s NOT there!
DON’T know! (not registered)
Note the fine differences in shades of gray: OK, we’ve got to work on the color coding ;-)
10
Querying w/ British Rock Classification (BRC)
Uses a GSC-BRC inter-ontology articulation mapping
11
British Rock Classification Query: Results
Uses a GSC-BRC inter-ontology articulation mapping
12
The Query: Show sedimentary rocks
The Puzzle: Find the 17 differences in the results…
but first: what states are we looking at?
13
Sedimentary Rocks: BGS Ontology
14
Sedimentary Rocks: GSC Ontology
15
Need for Knowledge-enabled Integration
A geologist analyzing chemical data from a pluton finds no recognizable correlation between variables.
– What possible scenarios can he examine to understand this heterogeneity?
Measured ages also show a scatter.
– What is the significance of the observed spread in measured time?
[Figure labels: GeolAgeDB; GeoChemDB; DataTables]
Knowledge Representation Research: concept maps & ontologies, process maps & ontologies, semantic types, … to facilitate (even) “smarter” tools
16
A Prerequisite: Resource Registration
(1a) Register ontologies
– geologic age; rock classifications (GSC, BGS), seismology; …
(1b) Optionally: register inter-ontology articulations
– e.g. between the GSC ontology and the BGS ontology
(2a) Item-level dataset registration
– ADN metadata; other controlled vocabularies & ontologies (e.g. geologic age timescale (USGS), SWEET (NASA), …)
(2b) Item-detail registration
– e.g. associate values in a column with a concept
(3) Use ontology-based query UI / application
– e.g. query by geologic age and chemical composition
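The registration steps can be pictured with a small in-memory catalog. This is only a sketch under stated assumptions: the class `Catalog`, its method names, and the tiny age hierarchy are hypothetical illustrations, not the GEON Catalog's actual interface. Only the step numbers and the file name `myShapeFiles.zip` come from the slides.

```python
class Catalog:
    """Toy stand-in for a resource catalog (hypothetical, not the GEON API)."""

    def __init__(self):
        self.ontologies = {}    # ontology name -> {concept: [sub-concepts]}
        self.datasets = {}      # dataset name -> metadata dict
        self.associations = []  # (dataset, column, value, ontology, concept)

    def register_ontology(self, name, hierarchy):
        # Step (1a): make an ontology known to the catalog.
        self.ontologies[name] = hierarchy

    def register_dataset(self, name, metadata):
        # Step (2a): item-level registration with descriptive metadata.
        self.datasets[name] = metadata

    def associate(self, dataset, column, value, ontology, concept):
        # Step (2b): tie a value found in a column to an ontology concept.
        if dataset not in self.datasets:
            raise KeyError(f"unregistered dataset: {dataset}")
        if ontology not in self.ontologies:
            raise KeyError(f"unregistered ontology: {ontology}")
        self.associations.append((dataset, column, value, ontology, concept))

    def _descendants(self, ontology, concept):
        # The concept itself plus everything below it in the hierarchy.
        found = {concept}
        for child in self.ontologies[ontology].get(concept, []):
            found |= self._descendants(ontology, child)
        return found

    def find_datasets(self, ontology, concept):
        # Step (3): concept-based discovery over the registered associations.
        wanted = self._descendants(ontology, concept)
        return sorted({d for d, _, _, o, c in self.associations
                       if o == ontology and c in wanted})


catalog = Catalog()
catalog.register_ontology("geologicAge", {"Cenozoic": ["Tertiary", "Quaternary"]})
catalog.register_dataset("myShapeFiles.zip", {"format": "shapefile"})
catalog.associate("myShapeFiles.zip", "PERIOD", "Q", "geologicAge", "Quaternary")
print(catalog.find_datasets("geologicAge", "Cenozoic"))
```

A query for the broader concept finds the dataset even though the dataset only mentions the narrower one, which is why ontology registration has to come before dataset registration.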
17
Demonstration Preview
NOTE: A technology demonstration, not a content demonstration (vocabulary, ontology, maps, …)
1. Ontology Registration (geologicAge.owl)
2. Dataset Registration (myShapeFiles.zip)
3. Item-Level Association (of 1 and 2)
4. GEONsearch: metadata, spatial, temporal, concept-based
5. GEONworkbench: use of workspace, e.g. composing new maps from existing ones
… resume with GEON workflow overview
18
GEONmiddleware Demonstration Preview
[Figure: architecture diagram with the following labels]
Inputs: myOntology.owl, myDataset.foo, metadata
User Access (via Portal)
External services: Gazetteer, DLESE, …; Geologic Age, Chronos, …
GEONsearch: search condition(s): spatial, temporal, concept; Log
GEONworkbench: GEON Workspace (user); user actions: add, delete, manipulate
GEON Catalog; ResourceRegistration; SRB
Client Access (via web services): other distributed apps (Kepler, DLESE, …)
19
Dataset to Ontology Registration (Item-level)
[Figure labels: Domain Knowledge; Ontologies; Arizona]
20
GEON Search: Concept-based Querying
Portal Demonstration
21
Scientific Problem Solving Environments
GEON Portal and Workbench (“scientist’s view”)
– previous demonstration
– a workbench for using existing/integrated tools
Kepler Workflow System (“engineer’s view”)
– for (semi-)automating “scientific workflows” and “analysis pipelines”
– a tool for making and deploying new tools
– some features:
  + … low-level plumbing to high-level conceptual flows …
  + connect reusable components (“actors”, “boxes”) to form apps
  + abstraction via nesting of subworkflows into composite actors
  + deploy automated workflows on the Grid and/or with custom UIs
– demonstrations available (“Kepler2Go-1. ” CD for Summer Institute)
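The idea of connecting reusable actors and nesting subworkflows into composite actors can be illustrated with a few lines of Python. This is a conceptual sketch only: Kepler itself is a Java system, and the classes `Actor` and `CompositeActor` below, along with the `fire` method as written here, are hypothetical simplifications rather than Kepler's API.

```python
class Actor:
    """A reusable component: consumes one input token, produces one output token."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def fire(self, token):
        return self.fn(token)


class CompositeActor(Actor):
    """A subworkflow that looks like a single actor from the outside."""

    def __init__(self, name, actors):
        super().__init__(name, None)
        self.actors = list(actors)

    def fire(self, token):
        # Pass the token through the nested actors in order (a linear pipeline).
        for actor in self.actors:
            token = actor.fire(token)
        return token


# Three simple actors, two of them nested into a composite "prepare" step.
parse = Actor("parse", lambda text: [float(x) for x in text.split(",")])
scale = Actor("scale", lambda values: [v * 2 for v in values])
total = Actor("total", sum)

prepare = CompositeActor("prepare", [parse, scale])
workflow = CompositeActor("workflow", [prepare, total])
print(workflow.fire("1,2,3"))
```

Because a composite exposes the same `fire` interface as a plain actor, it can be dropped into a larger workflow without the outer level knowing what is inside it. Real workflow systems also support branching and multiple ports, which this linear sketch leaves out.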
22
A Kepler Scientific Workflow
[Figure labels: component (actor) libraries; canvas for design and execution monitoring; inline documentation]
23
Sample GEON Dataset Extraction & Processing
Translating query XML response to web service XML input format.
[Figure labels: worldImage; XML SOAP response; Look Inside]
24
GEON Dataset Registration
[Figure label: Annotation form]
25
GEON Dataset Registration
[Figure labels: validation; Registering ADN metadata; Metadata display]
26
Putting it all together …
27
GEON Workflows & KEPLER
[Figure label: HPC workflow]
http://kepler-project.org
28
Using Kepler for Geological Data Integration Workflows
Ilkay Altintas
presenting joint GEON work of: Efrat Jaeger, Bertram Ludäscher, Kai Lin, Ashraf Memon
San Diego Supercomputer Center, University of California, San Diego
29
Some Requirements for a Scientific Workflow System (1/2)
…it should work… (No kidding!)
USER REQUIREMENTS:
Design tools, especially for non-expert users
Ease of use: a fairly simple user interface, with more complex features hidden in the background
Reusable generic features
– Generic enough to serve different communities but specific enough to serve one domain (e.g. geosciences)
Extensibility for the expert user: almost a visual programming interface
Registration and publication of data products and “process products” (=workflows); provenance
30
Some Requirements for a Scientific Workflow System (2/2)
TECHNICAL REQUIREMENTS:
Error detection and recovery from failure
– Logging information for each workflow
Allow data-intensive and compute-intensive tasks (maybe at the same time)
– HPC+X (from Dr. Berman’s last GSM talk)
Allow status checks and on-the-fly updates
Visualization…
Semantics and metadata…
Certification, trust, security…
Ask the experts in this room
31
Kepler is…
… a scientific workflow system
… a cross-project collaboration
New contributing partners:
– Cheminformatics: Resurgence (Kim Baldridge et al.)
– Life Sciences: EOL (Mark Miller et al.)
– Data Mining: SKIDL (Tony Fountain et al.)
– Neuroinformatics: BIRN (coming…)
… an emerging open source tool for “scientific discovery workflows”
Kepler 1.0 alpha release Summer Institute
32
Some Recent Actor Additions
– Generic WS Invocation
– CommandLine Execution
– File Transfer
– Globus Job Execution
– SRB Access
– SQL Queries
– Queries & Transformations
– Browser-based user interface
– Real-time data streaming
– SMTP-based messaging
33
Web Services Actors (WS Harvester)
[Figure: four numbered steps, 1 to 4]
“Minute-made” (MM) WS-based application integration
Similarly: MM workflow design & sharing w/o implemented components
34
GEON Contributions to Kepler
System demonstration
- Using Kepler Features
GEON workflows in detail
- Dataset Registration Model
- Processing Datasets on the Fly and Registering with the GEONworkbench
35
Conclusions
Evolving system – GEON is a significant contributor
– Plans for new generic and project-specific extensions
Second alpha release available as CD
– Installers for Windows, Linux, MacOSX
– Daily version tests and JWS installer generation
User manuals and developer documentation are coming soon!
More: next week during the Summer Institute …
Kepler project website: http://kepler-project.org
Thanks!
36
GEON IT Advances:
⁃ Data Integration
⁃ GEON Workbench
⁃ Scientific Workflows
Bertram Ludäscher, Kai Lin, Ilkay Altintas, Efrat Jaeger
San Diego Supercomputer Center, UC San Diego
END
37
Related Publications
Semantic Data Registration and Integration
– On Integrating Scientific Resources through Semantic Registration, S. Bowers, K. Lin, and B. Ludäscher, 16th International Conference on Scientific and Statistical Database Management (SSDBM'04), 21-23 June 2004, Santorini Island, Greece.
– A System for Semantic Integration of Geologic Maps via Ontologies, K. Lin and B. Ludäscher. In Semantic Web Technologies for Searching and Retrieving Scientific Data (SCISW), Sanibel Island, Florida, 2003.
– Towards a Generic Framework for Semantic Registration of Scientific Data, S. Bowers and B. Ludäscher. In Semantic Web Technologies for Searching and Retrieving Scientific Data (SCISW), Sanibel Island, Florida, 2003.
– The Role of XML in Mediated Data Integration Systems with Examples from Geological (Map) Data Interoperability, B. Brodaric, B. Ludäscher, and K. Lin. In Geological Society of America (GSA) Annual Meeting, volume 35(6), November 2003.
– Semantic Mediation Services in Geologic Data Integration: A Case Study from the GEON Grid, K. Lin, B. Ludäscher, B. Brodaric, D. Seber, C. Baru, and K. A. Sinha. In Geological Society of America (GSA) Annual Meeting, volume 35(6), November 2003.
Query Planning and Rewriting
– Processing First-Order Queries under Limited Access Patterns, A. Nash and B. Ludäscher, Proc. 23rd ACM Symposium on Principles of Database Systems (PODS'04), Paris, France, June 2004.
– Processing Unions of Conjunctive Queries with Negation under Limited Access Patterns, A. Nash and B. Ludäscher, 9th Intl. Conference on Extending Database Technology (EDBT'04), Heraklion, Crete, Greece, March 2004, LNCS 2992.
– Web Service Composition Through Declarative Queries: The Case of Conjunctive Queries with Union and Negation, B. Ludäscher and A. Nash. Research abstract (poster), 20th Intl. Conference on Data Engineering (ICDE'04), Boston, IEEE Computer Society, April 2004.
38
Related Publications
Scientific Workflows
– Kepler: An Extensible System for Design and Execution of Scientific Workflows, I. Altintas, C. Berkley, E. Jaeger, M. Jones, B. Ludäscher, S. Mock, 16th International Conference on Scientific and Statistical Database Management (SSDBM'04), 21-23 June 2004, Santorini Island, Greece.
– Kepler: Towards a Grid-Enabled System for Scientific Workflows, I. Altintas, C. Berkley, E. Jaeger, M. Jones, B. Ludäscher, S. Mock, Workflow in Grid Systems (GGF10), Berlin, March 9th, 2004.
– An Ontology-Driven Framework for Data Transformation in Scientific Workflows, S. Bowers and B. Ludäscher, Intl. Workshop on Data Integration in the Life Sciences (DILS'04), March 25-26, 2004, Leipzig, Germany, LNCS 2994.
– A Web Service Composition and Deployment Framework for Scientific Workflows, I. Altintas, E. Jaeger, K. Lin, B. Ludäscher, A. Memon. In the 2nd Intl. Conference on Web Services (ICWS), San Diego, California, July 2004.
39
Additional Material (for questions etc)
40
Multi-Hierarchical Rock Classification System (GSC)
… a target ontology (after conversion to OWL) for geologic map registration …
[Figure labels: Composition; Genesis; Fabric; Texture]
41
Inside Ontology-Enabled Map Integration
User: “Show formations from Cenozoic!”
Query Rewriting, using the Age Ontology (Cenozoic: Quaternary, Tertiary):
select FORMATION where AGE=“Tertiary” or AGE=“Quaternary”
Sources shown: Arizona, Montana West
[Figure: source tables with columns ABBREV, PERIOD and FORMATION, PERIOD, LITHOLOGY; sample rows Tkgm (Tertiary), Q (Quaternary), Qg (Quaternary), Twp (Tertiary), Twl (Tertiary); results feed Color Definition and Map Rendering]
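The rewriting step on this slide can be sketched in a few lines of Python. The two-level hierarchy and the sample formation codes are taken from the slide. The function names and the extra "Kx"/Cretaceous row are hypothetical, added only to show that non-matching rows are filtered out. This is a minimal illustration, not the GEON mediator's implementation.

```python
# Age ontology fragment from the slide: Cenozoic has Tertiary and Quaternary below it.
AGE_ONTOLOGY = {"Cenozoic": ["Tertiary", "Quaternary"]}


def expand(concept, ontology):
    """Return the leaf concepts below `concept`, or [concept] if it has none."""
    children = ontology.get(concept, [])
    if not children:
        return [concept]
    leaves = []
    for child in children:
        leaves.extend(expand(child, ontology))
    return leaves


def rewrite_query(concept, ontology):
    """Rewrite a query on a broad concept into a disjunction over stored terms."""
    condition = " or ".join(f'AGE="{term}"' for term in expand(concept, ontology))
    return f"select FORMATION where {condition}"


def select_formations(rows, concept, ontology):
    """Evaluate the rewritten query against rows with FORMATION and PERIOD fields."""
    wanted = set(expand(concept, ontology))
    return [row["FORMATION"] for row in rows if row["PERIOD"] in wanted]


rows = [
    {"FORMATION": "Qg", "PERIOD": "Quaternary"},
    {"FORMATION": "Twp", "PERIOD": "Tertiary"},
    {"FORMATION": "Twl", "PERIOD": "Tertiary"},
    {"FORMATION": "Kx", "PERIOD": "Cretaceous"},  # hypothetical non-matching row
]
print(rewrite_query("Cenozoic", AGE_ONTOLOGY))
print(select_formations(rows, "Cenozoic", AGE_ONTOLOGY))
```

Without the ontology, a literal match on "Cenozoic" would return nothing here, because the source tables only store the finer-grained period names.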
42
Data Source Wrapping and Integration
Sources: Arizona, Colorado, Utah, Nevada, Wyoming, New Mexico, Montana East, Idaho, Montana West
Wrapped source views: Formation…, Age…
Integrated view: Formation…, Age…, Composition…, Fabric…, Texture…
[Figure: source schemas use differing column names, including ABBREV, PERIOD, NAME, TYPE, TIME_UNIT, FMATN, FORMATION, LITHOLOGY, AGE; sample values: andesitic sandstone; Livingston formation; Tertiary-Cretaceous]
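Wrapping amounts to renaming each source's columns into the shared view. The sketch below assumes three hypothetical sources (`source_a` to `source_c`). The column names and the "Livingston formation" row come from the slide, but which state uses which schema is not recoverable from it, so those assignments and the other sample rows are illustrative assumptions.

```python
# Per-source column mappings into the integrated schema (assignments are assumptions).
WRAPPERS = {
    "source_a": {"ABBREV": "Formation", "PERIOD": "Age"},
    "source_b": {"FMATN": "Formation", "PERIOD": "Age"},
    "source_c": {"FORMATION": "Formation", "AGE": "Age"},
}


def wrap(source, row):
    """Rename a source row's columns to the integrated schema; drop unmapped columns."""
    mapping = WRAPPERS[source]
    wrapped = {"Source": source}
    for column, value in row.items():
        if column in mapping:
            wrapped[mapping[column]] = value
    return wrapped


def integrated_view(tables):
    """Union of all wrapped rows; `tables` maps a source name to its list of rows."""
    return [wrap(source, row) for source, rows in tables.items() for row in rows]


tables = {
    "source_a": [{"ABBREV": "Tkgm", "PERIOD": "Tertiary"}],
    "source_b": [{"FMATN": "Qg", "PERIOD": "Quaternary"}],
    "source_c": [{"FORMATION": "Livingston formation",
                  "LITHOLOGY": "andesitic sandstone",
                  "AGE": "Tertiary-Cretaceous"}],
}
for row in integrated_view(tables):
    print(row)
```

Once every source answers to the same `Formation` and `Age` fields, a single query can run across all of them. Mapping source columns such as LITHOLOGY onto Composition, Fabric, or Texture would need the rock classification ontology and is left out here.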
43
Gravity Modeling Design Workflow
Idea: Comparing observed & synthetic gravity models
Steps:
– Extracting and merging gravity depths from heterogeneous data sources for a Lat/Lon bounding box (databases, web services).
– Projecting and interpolating data sources into the same coordinate systems.
– Differencing observed and synthetic models.
– Displaying differential raster image.
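Two of these steps, merging by bounding box and differencing, are simple enough to sketch. The code assumes points are (lat, lon, value) tuples, that each source is a callable returning such points, and that both grids have already been projected and interpolated to the same shape. All function names are hypothetical, and the projection and display steps are omitted.

```python
def in_bbox(point, bbox):
    """True if a (lat, lon, value) point lies inside (min_lat, min_lon, max_lat, max_lon)."""
    lat, lon, _ = point
    min_lat, min_lon, max_lat, max_lon = bbox
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def extract_and_merge(sources, bbox):
    """Call each source (a database or web-service stand-in) and keep points in the box."""
    merged = []
    for fetch in sources:
        merged.extend(point for point in fetch() if in_bbox(point, bbox))
    return merged


def difference(observed, synthetic):
    """Cell-by-cell observed minus synthetic for two grids of identical shape."""
    same_rows = len(observed) == len(synthetic)
    if not same_rows or any(len(o) != len(s) for o, s in zip(observed, synthetic)):
        raise ValueError("grids must have the same shape")
    return [[o - s for o, s in zip(orow, srow)]
            for orow, srow in zip(observed, synthetic)]


# Two stand-in sources with made-up sample points.
db_source = lambda: [(35.0, -110.0, 1.5), (50.0, -110.0, 9.9)]
ws_source = lambda: [(36.0, -111.0, 2.5)]
points = extract_and_merge([db_source, ws_source], (30.0, -115.0, 40.0, -105.0))
print(points)
print(difference([[1.0, 2.0]], [[0.5, 0.5]]))
```

The shape check in `difference` is the reason the projection and interpolation step has to come first: subtraction is only meaningful once both models sit on the same grid.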
44
Grid Interpolation
Interpolating queried gravity data on the grid and displaying it using a color schema.
Currently IDW interpolation algorithm supported. Future plans: Minimum Curvature, TIN, Kriging and Spline.
Output: either ascii x,y,z,p or ESRI ascii grid format.
Display: using global mapper service.
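For readers unfamiliar with IDW (inverse distance weighting), here is a minimal pure-Python sketch together with an ESRI ASCII grid writer. It is a textbook version of the technique, not the GEON service's code. The function names, the default power of 2, and the choice to sample at cell centers are assumptions made for illustration.

```python
import math


def idw(points, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (px, py, value) samples."""
    numerator = denominator = 0.0
    for px, py, value in points:
        distance = math.hypot(x - px, y - py)
        if distance == 0.0:
            return value  # exactly on a sample: return it and avoid dividing by zero
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def idw_grid(points, xll, yll, cellsize, ncols, nrows, power=2.0):
    """Interpolate at cell centers; rows run north to south as in ESRI ASCII grids."""
    grid = []
    for r in range(nrows):
        y = yll + (nrows - r - 0.5) * cellsize
        grid.append([idw(points, xll + (c + 0.5) * cellsize, y, power)
                     for c in range(ncols)])
    return grid


def to_esri_ascii(grid, xll, yll, cellsize, nodata=-9999):
    """Serialize a grid using the standard six-line ESRI ASCII grid header."""
    header = [
        f"ncols {len(grid[0])}",
        f"nrows {len(grid)}",
        f"xllcorner {xll}",
        f"yllcorner {yll}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    body = [" ".join(f"{value:.3f}" for value in row) for row in grid]
    return "\n".join(header + body) + "\n"


samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
grid = idw_grid(samples, 0.0, 0.0, 1.0, ncols=2, nrows=1)
print(to_esri_ascii(grid, 0.0, 0.0, 1.0))
```

Each estimate is a weighted average of the samples, with weights falling off as 1/distance^power, so nearby measurements dominate. The methods listed as future plans (kriging, splines, and so on) replace this fixed weighting with a fitted model.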
45
Gravity Modeling Design Workflow