Semantic Metadata for Scientific Data Access and Management Richard M. Keller, Ph.D. Group Lead for Information Sharing & Integration Intelligent Systems.

1 Semantic Metadata for Scientific Data Access and Management
Richard M. Keller, Ph.D., Group Lead for Information Sharing & Integration
Intelligent Systems Division, NASA Ames Research Center
rkeller@arc.nasa.gov
http://sciencedesk.arc.nasa.gov/scidesk/
February 17, 2005, ROSES Workshop

2 Focus of Work
Scientific data management, not data analysis
Computational infrastructure related to storing, locating, searching, integrating, and sharing scientific data

3 Specific Problems
Integrating heterogeneous scientific data from multiple sources
Searching/finding relevant scientific data
Organizing/indexing data for rapid, intuitive access

4 Culprit: Inadequate Metadata
Metadata is typically limited to essentials only (e.g., data format, instrument, date)
– inadequate for extensive indexing, precise searching
Each data repository defines its own metadata, using its own terminology and data dictionary
– difficult to search across repositories
– difficult to integrate and combine datasets
No common frame of reference for cross-repository comparison

5 Common Approach
To facilitate storage, retrieval, integration, and comprehension of scientific data: capture the semantic metadata that provides a rich context for each data product

6 What is “semantic metadata”?
Semantic metadata: information relating to the context in which the scientific data are generated and used
– how?
– when?
– where?
– why?
– who?

7 Early Microbial Ecosystems Investigation
Collection of microbial mats in the field
Greenhouse incubator: microbial mat (algae)
Trace gas production and consumption under “Early Earth” conditions
Detailed studies of mat biogeochemistry: monitoring, analysis, experimentation
Geographically dispersed team of collaborators: B. Bebout, D. Des Marais, T. Hoehler, et al. (Code SSX)

8 Semantic Context Surrounding Mat “4b” (“Semantic Network”)
[Diagram] Node labels: Spring Beach; Brad Bebout; Greenhouse; O2 Microsensor; O2 Concentration; HBC-2; Microbial culture; Culture prep B notes for Lee; Culture recipe; Mary Hogan; Electron Microscope
[Diagram] Link labels: collected-at, collected-by, stored-in, has-measurement, measured-with, has-culture, cultivated-by, has-recipe, imaged-with, has-image
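One way to picture the semantic network around Mat “4b” is as a list of (subject, link, object) triples. The sketch below is illustrative only: the link and node names come from the slide, but which link joins which pair of nodes is a reading of the flattened diagram, and the helper names `build_index` and `context` are hypothetical.

```python
from collections import defaultdict

# (subject, link, object) triples. The pairings are an assumption made
# from the flattened diagram text, not a confirmed copy of the slide.
TRIPLES = [
    ("Mat 4b", "collected-at", "Spring Beach"),
    ("Mat 4b", "collected-by", "Brad Bebout"),
    ("Mat 4b", "stored-in", "Greenhouse"),
    ("Mat 4b", "has-measurement", "O2 Concentration"),
    ("O2 Concentration", "measured-with", "O2 Microsensor"),
    ("Mat 4b", "has-culture", "HBC-2"),
    ("HBC-2", "cultivated-by", "Mary Hogan"),
    ("HBC-2", "has-recipe", "Culture recipe"),
]


def build_index(triples):
    """Map each subject to its outgoing (link, object) pairs."""
    index = defaultdict(list)
    for subject, link, obj in triples:
        index[subject].append((link, obj))
    return index


def context(index, node, depth=2):
    """Collect every triple reachable from `node` within `depth` hops."""
    found = []
    visited = {node}
    frontier = [node]
    for _ in range(depth):
        next_frontier = []
        for subject in frontier:
            for link, obj in index.get(subject, []):
                found.append((subject, link, obj))
                if obj not in visited:
                    visited.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return found


if __name__ == "__main__":
    for subject, link, obj in context(build_index(TRIPLES), "Mat 4b"):
        print(f"{subject} --{link}--> {obj}")
```

Walking two hops out from the sample recovers who collected it and where, as well as the instrument behind its measurement and the person who cultivated its culture.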

9 Semantic Network Structure
Nodes: key information resources or organizational structures (describing people, places, measurements, hypotheses), e.g., culture, photo, measurement, site, instrument, sample, hypothesis
Links: relationships among resources (e.g., “measured by”, “supports hypothesis”)
Attributes: properties of resources (metadata), e.g., date, size, format
Attached files: electronic products associated with resources (e.g., datasets, images, documents)
Ontology: specifies the types of nodes, attributes, and links defined for scientific investigation
Rules: add/modify nodes, links, and attributes in the network
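The pieces named on this slide (nodes, attributes, attached files, links, ontology, rules) can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the SemanticOrganizer implementation: the class names, the `ONTOLOGY` triples, and the inverse link `used-for` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    node_type: str                                   # e.g. "sample", "site"
    attributes: dict = field(default_factory=dict)   # metadata properties
    files: list = field(default_factory=list)        # attached file names
    links: list = field(default_factory=list)        # (link_name, target_name)


# Hypothetical ontology: the allowed (source type, link, target type) triples.
ONTOLOGY = {
    ("sample", "collected-at", "site"),
    ("sample", "has-measurement", "measurement"),
    ("measurement", "measured-with", "instrument"),
    ("instrument", "used-for", "measurement"),  # assumed inverse link
}


class Network:
    def __init__(self):
        self.nodes = {}

    def add_node(self, node):
        self.nodes[node.name] = node
        return node

    def link(self, source, link_name, target):
        """Add a link only if the ontology permits it for these node types."""
        src, dst = self.nodes[source], self.nodes[target]
        if (src.node_type, link_name, dst.node_type) not in ONTOLOGY:
            raise ValueError(
                f"{src.node_type} --{link_name}--> {dst.node_type} is not in the ontology"
            )
        if (link_name, target) not in src.links:
            src.links.append((link_name, target))


def apply_inverse_rule(net):
    """Rule: every measured-with link gets a used-for link in the other direction."""
    for node in list(net.nodes.values()):
        for link_name, target in list(node.links):
            if link_name == "measured-with":
                net.link(target, "used-for", node.name)
```

Checking each link against the ontology keeps the network consistent, and the rule shows how new links can be derived from existing ones rather than entered by hand.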

10 Scientific Data Collection Ontology (partial)
[Diagram] Scientific Information Nodes: DNA sequence, image, document, culture, person, sample, photographic image, SEM image, experiment, project, measurement, site, equipment, camera, gas chromatograph, stub, O2 microsensor, N2 microsensor, SEM, O2 concentration, N2 concentration, spectrometer, spectrograph, chromatogram, micrograph, researcher, lab tech, other
[Diagram] Links: cultivated-from, cultivated-by, has-genetic-sequence, pictured-in
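An ontology like this one is, at its core, a type hierarchy that can be queried. The sketch below assumes a parent/child arrangement that seems plausible from the labels (for example, “SEM image” under “image”); the original diagram's actual structure is not recoverable from the transcript, so the `PARENT` table is an assumption, and the function names are hypothetical.

```python
# Assumed child -> parent relationships; the grouping is a guess based on
# the type names, not a transcription of the original diagram.
ROOT = "scientific information node"
PARENT = {
    "image": ROOT,
    "equipment": ROOT,
    "measurement": ROOT,
    "person": ROOT,
    "photographic image": "image",
    "SEM image": "image",
    "camera": "equipment",
    "gas chromatograph": "equipment",
    "O2 microsensor": "equipment",
    "N2 microsensor": "equipment",
    "O2 concentration": "measurement",
    "N2 concentration": "measurement",
    "researcher": "person",
    "lab tech": "person",
}


def ancestors(node_type):
    """Return the chain of parent types, nearest first."""
    chain = []
    while node_type in PARENT:
        node_type = PARENT[node_type]
        chain.append(node_type)
    return chain


def is_a(node_type, candidate):
    """True if node_type equals candidate or descends from it."""
    return node_type == candidate or candidate in ancestors(node_type)


def subtypes(candidate):
    """List every type that descends from candidate."""
    return [t for t in PARENT if t != candidate and is_a(t, candidate)]
```

With a hierarchy in place, a search for “image” can also return SEM images and photographic images without the user naming each subtype.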

11 Benefits of Semantic Metadata Approach
Semantic context provides a unifying framework for integrating data across data collections
Sophisticated “semantic search” methods allow retrieval based on semantic relationships among data
Intuitive data indexing, access, and organization schemes derive from semantic data models
Formal semantic representation enables automated inference about the data
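Two of these benefits, semantic search and automated inference, can be illustrated with a few lines of code over (subject, link, object) triples. This is a sketch under assumptions: “Mat 9x”, “Other Site”, “Photo 9x”, and the inferred link `originates-from` are hypothetical, and the search is a simple two-step link traversal rather than any particular system's query method.

```python
# Triples mixing names from the slides with hypothetical ones
# ("Mat 9x", "Other Site", "Photo 9x" are invented for contrast).
TRIPLES = [
    ("Mat 4b", "collected-at", "Spring Beach"),
    ("Mat 4b", "has-image", "SprM4b excised"),
    ("Mat 4b", "has-culture", "HBC-2"),
    ("Mat 9x", "collected-at", "Other Site"),
    ("Mat 9x", "has-image", "Photo 9x"),
]


def subjects(triples, link, obj):
    """All subjects that point at `obj` through `link`."""
    return {s for s, l, o in triples if l == link and o == obj}


def objects(triples, subj, link):
    """All objects that `subj` points at through `link`."""
    return {o for s, l, o in triples if s == subj and l == link}


def images_from_site(triples, site):
    """Semantic search: images of any sample collected at `site`."""
    result = set()
    for sample in subjects(triples, "collected-at", site):
        result |= objects(triples, sample, "has-image")
    return result


def infer_culture_origin(triples):
    """Inference: a culture originates from the site of its source sample.

    The link name "originates-from" is hypothetical.
    """
    inferred = []
    for sample, link, culture in triples:
        if link == "has-culture":
            for site in sorted(objects(triples, sample, "collected-at")):
                inferred.append((culture, "originates-from", site))
    return inferred
```

The image is found through its relationship to the sample and the site, even though no metadata on the image itself mentions the site; the inference step derives a fact that nobody entered directly.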

12 Challenge
The semantic metadata approach has been applied to small, PI-maintained data repositories
A tremendous volume of earth and space science data is stored in huge, curated data repositories maintained by NASA, USGS, ESA, universities, and others
How to translate semantic metadata ideas to operate on the scale of large data repositories?
Seeking collaborators!

13 SemanticOrganizer System (Mat Sample: Spring-M4-b)

14 Photo: SprM4b excised

15 What is ScienceOrganizer?
A Web-based collaborative knowledge management tool for distributed teams of scientific investigators
Facilitates information sharing, integration, correlation
A project information repository / digital library: users upload/download heterogeneous project information products (images, datasets, documents, and various types of scientific records describing samples, field sites, measurements, instruments, etc.)
Features cross-linkage: enables rapid access to interrelated information; permits linking data and observations to scientific hypotheses
Supports inference capabilities: permits formal reasoning about the repository contents
A “project archive” system: tracks history of project team’s fieldwork, labwork, and associated data collection activities

16 ScienceOrganizer Users
ARC Microbial Ecosystems Group: field & lab science, experiments, data analysis.
NAI Ecogenomics Focus Group: cross-discipline collaboration, data analysis.
ARC Electron Microscopy Lab: electron microscopy image archiving, sample cataloging.
MARTE Mission: analog Mars drilling mission, support for remote science data acquisition, storage, and access.
JSC Astrobiology Institute for the Study of Biomarkers: electron microscopy image archive, sample collection, cataloging, and storage; support for education & outreach.
NIH/NASA Malaria Control Study: African malaria study; data collection and archiving.
ASU/NSF Desert Microbial Survey (NSF): microbial survey; provides publicly accessible repository.
Mobile Agents Demonstration Project: analog Mars surface exploration, support for remote science data acquisition, storage, and access.
Astrobionics Technology Integration: technology infusion program.

