Data R&D Issues for GTL Data and Knowledge Systems San Diego Supercomputer Center University of California, San Diego Bertram Ludäscher


1 Data R&D Issues for GTL Data and Knowledge Systems San Diego Supercomputer Center University of California, San Diego Bertram Ludäscher ludaesch@sdsc.edu

2 Data R&D Issues for GTL

GTL data management infrastructure
  - Service-oriented Data Grids for
      - Seamless data sharing (volume, distribution, access restrictions, …)
      - Capabilities for data integration (mediators/warehouses), digital library functions, knowledge-based (“semantic”) extensions (e.g. ontologies), and archival capabilities

Data analysis and knowledge-enabling infrastructure
  - Analytical Pipelines (“Scientific Workflows”)
      - Rapid design and prototyping, handling of complex data & task semantics, large volume, scientific workflow as a first-class product, validation, execution, monitoring, sharing, archiving
      - How to go from a scientist’s abstract (conceptual) workflow to a data grid execution plan?
  - New Model Management and Knowledge Representation Technologies:
      - Closing the gap between data management (DBMS’s, data grids), knowledge-based systems (desktop-oriented, rule-based systems), and analysis and modeling systems
      - Mapping between numerous formalisms at the syntactic, structural, and semantic level (terminological, process-semantics, …)
      - “Gluing” together models and formalisms across different levels: from genes to proteins to molecular machines to microbial communities … (compare: pnp transistors, boolean circuits, assembly language, high-level PLs, declarative QLs, …) → abstraction & elaboration mechanisms
      - Data exploration and hypothesis generation tools (KNOW-ME, SKIDL, SEEK AMS, …)

Computational facilities
  - Use of high-end networked facilities a la TeraGrid

Opportunities (and challenges!) in leveraging related efforts:
  - NIH BIRN, …, NSF Cyberinfrastructure (ITRs GEON, GriPhyN, SCEC, SEEK, …), UK e-Science, …
  - Standardization (OGSA, KR/Semantic Web technologies, e.g., ontology languages (OWL), inference mechanisms, …), scientific workflow standards, … → interoperable, open source tools
  - One size/standard fits all? Probably not: data-intensive vs computation-intensive vs “semantics-intensive” (capturing implicit domain knowledge, hidden assumptions, …)
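Slide 2 asks how to get from a scientist's abstract (conceptual) workflow to a data grid execution plan. The sketch below is one minimal way to picture that step, not the approach of any GTL or SEEK system: the abstract workflow is a dependency graph, and a plan is a dependency-respecting ordering of its steps, each bound to a resource. All step names and resource names are hypothetical, chosen only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical abstract workflow: each step maps to the set of steps
# whose outputs it needs. None of these names come from the slides.
abstract_workflow = {
    "fetch_sequences": set(),
    "align": {"fetch_sequences"},
    "build_model": {"align"},
    "validate": {"build_model"},
    "archive": {"validate", "align"},
}

# Hypothetical binding of conceptual steps to grid resources.
resource_binding = {
    "fetch_sequences": "storage-node",
    "align": "compute-cluster",
    "build_model": "compute-cluster",
    "validate": "compute-cluster",
    "archive": "archive-node",
}


def execution_plan(workflow, binding):
    """Return (step, resource) pairs in an order that respects dependencies."""
    order = TopologicalSorter(workflow).static_order()
    # Steps without a binding are flagged rather than silently dropped.
    return [(step, binding.get(step, "unassigned")) for step in order]


plan = execution_plan(abstract_workflow, resource_binding)
for step, resource in plan:
    print(f"{step} -> {resource}")
```

A real plan would also have to deal with data placement, access restrictions, and monitoring, which the slide lists separately; this sketch covers only ordering and binding. `TopologicalSorter` raises `graphlib.CycleError` if the workflow contains a cyclic dependency, which gives a basic form of validation.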

3 Bonus Material (beyond 1 slide limit ;-) starts here …

4 Up & Down: Abstraction & Elaboration Mechanisms
[Figure layers: Knowledge Mgmt, Information Mgmt, Data Management]
How to punch through the technology barriers?
Data Grids vs Digital Libraries vs DBMS’s vs Knowledge-Based Analysis & Modeling Systems

5 Biomedical Informatics Research Network

6 Biomedical Informatics Research Network http://nbirn.net
Getting Formal: Source Contextualization & Ontology Refinement in Logic

7 Scientific Data Integration... Questions to Queries... (GeoSciences Network)
Questions: What is the distribution and U/Pb zircon ages of A-type plutons in VA? How about their 3-D geometry? How does it relate to host rock structures?
[Figure: Information Integration via “Complex Multiple-Worlds” Mediation. Raw data sources: Geologic Map (Virginia), GeoChemical, GeoPhysical (gravity contours), GeoChronologic (Concordia), Foliation Map (structure DB). Labels: domain knowledge; Knowledge Representation: ontologies, concept spaces; Database mediation; Data modeling; raw data]
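The "questions to queries" idea on slide 7 can be illustrated with a toy mediator: each source keeps its own schema, a wrapper translates it into a shared schema, and the scientist's question becomes one query over the mediated view. This is a sketch under heavy assumptions, not GEON's mediator: the records, field names, identifiers, and ages below are all invented placeholders, not real geologic data.

```python
# Invented toy records standing in for two autonomous sources.
# Field names, identifiers, and ages are placeholders only.
geologic_map = [
    {"unit_id": "P1", "rock_type": "A-type pluton", "state": "VA"},
    {"unit_id": "P2", "rock_type": "granite gneiss", "state": "VA"},
    {"unit_id": "P3", "rock_type": "A-type pluton", "state": "NC"},
]
geochronology = [
    {"sample_of": "P1", "method": "U/Pb zircon", "age_ma": 700.0},
    {"sample_of": "P3", "method": "U/Pb zircon", "age_ma": 650.0},
    {"sample_of": "P2", "method": "K/Ar", "age_ma": 300.0},
]


def wrap_map(rec):
    """Wrapper: translate a map record into the shared (mediated) schema."""
    return {"id": rec["unit_id"], "rock_type": rec["rock_type"], "region": rec["state"]}


def wrap_geochron(rec):
    """Wrapper: translate a dating record into the shared (mediated) schema."""
    return {"id": rec["sample_of"], "method": rec["method"], "age_ma": rec["age_ma"]}


def mediated_query(rock_type, region, method):
    """Answer one question by joining both wrapped sources on the shared id."""
    ages = {}
    for rec in map(wrap_geochron, geochronology):
        if rec["method"] == method:
            ages.setdefault(rec["id"], []).append(rec["age_ma"])
    results = []
    for rec in map(wrap_map, geologic_map):
        if rec["rock_type"] == rock_type and rec["region"] == region:
            for age in ages.get(rec["id"], []):
                results.append((rec["id"], age))
    return results


answer = mediated_query("A-type pluton", "VA", "U/Pb zircon")
print(answer)
```

The wrappers here only rename fields. The slide's point is that real mediation also needs domain knowledge (ontologies, concept spaces) to relate terms that do not match by name, which this sketch leaves out.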

8 Geologic Map Integration: Geo & IT/CS meet
[Figure labels: domain knowledge; Knowledge representation; AGE ONTOLOGY; Nevada]
GEON Metamorphism Equation: Geoscientists + Computer Scientists, Igneous Geoinformaticists, +/- Energy, +/- a few hundred million years

9 SEEK Project Overview
  - Large collaborative NSF/ITR project: UNM, UCSB, UCSD (SDSC), UKansas, ...
  - “Analysis & Modeling System” to design, execute, reproduce/refine scientific workflows in the ecology and biodiversity domains.

