SEEK: Science Environment for Ecological Knowledge

Presentation transcript:

SEEK: Science Environment for Ecological Knowledge
• EcoGrid
    • Ecological, biodiversity and environmental data
    • Computational access
    • Standardized, open Grid interfaces (OGSA compliant)
• Analysis and Modeling System
    • Modeling scientific workflows
• Semantic Mediation System
    • “Smart” data discovery
    • Knowledge-based data integration
    • Knowledge-based analysis integration
• Knowledge Representation
    • Ontologies for describing ecology

SEEK EcoGrid
• Integrate diverse data networks from ecology, biodiversity, and environmental sciences
• Grid-standardized interfaces
• Metadata-mediated data access (EML)
    • Query
    • Read
    • Write
    • Authentication
    • Authorization
    • Replication
• Computational access
    • Pre-defined analytical services
    • On-the-fly analytical services

EcoGrid Services
• Query
    • Search metadata and data, return result sets with ID
• Read
    • Retrieve data objects by ID
• Authentication
    • Verify user identity
• Authorization
    • Record allowable interactions
• Write
    • Write data objects by ID
• Replication
    • Mirror objects for backup and efficiency
• Computation
    • Execute models and simulations from AMS on various nodes
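The services listed on this slide can be sketched as a single node interface. The following is a minimal in-memory illustration, not the actual EcoGrid API: the class name `EcoGridNode`, its method names, and the toy password check are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class EcoGridNode:
    """Hypothetical in-memory stand-in for one EcoGrid node (not the real API)."""
    name: str
    objects: dict = field(default_factory=dict)  # object ID -> (metadata dict, data bytes)
    acl: dict = field(default_factory=dict)      # user -> set of allowed actions
    users: dict = field(default_factory=dict)    # user -> password (toy authentication only)

    def authenticate(self, user, password):
        """Authentication: verify user identity."""
        return self.users.get(user) == password

    def authorize(self, user, action):
        """Authorization: check the recorded allowable interactions."""
        return action in self.acl.get(user, set())

    def write(self, user, object_id, metadata, data):
        """Write: store a data object under its ID."""
        if not self.authorize(user, "write"):
            raise PermissionError(f"{user} may not write to {self.name}")
        self.objects[object_id] = (metadata, data)

    def query(self, keyword):
        """Query: search metadata and return the IDs of matching objects."""
        kw = keyword.lower()
        return sorted(
            oid for oid, (meta, _) in self.objects.items()
            if any(kw in str(value).lower() for value in meta.values())
        )

    def read(self, object_id):
        """Read: retrieve a data object by ID."""
        return self.objects[object_id][1]

    def replicate_to(self, other):
        """Replication: mirror every object onto another node."""
        other.objects.update(self.objects)
```

A client would typically call `query` first to obtain IDs and then `read` each ID, which mirrors the Query/Read split on the slide. Computation is left out of the sketch because the slide gives no detail on how AMS models are dispatched to nodes.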

EcoGrid client interactions
• Modes of interaction
    • Client-server
    • Fully distributed
    • Peer-to-peer
• EcoGrid Registry
    • Node discovery
    • Service discovery
• Aggregation services
    • Centralized access
    • Reliability
    • Data preservation
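The registry's two roles, node discovery and service discovery, can be illustrated with a small lookup table. This is only a sketch under assumptions: `EcoGridRegistry`, its methods, and the endpoint strings passed to it are hypothetical and do not describe the real registry interface.

```python
from collections import defaultdict


class EcoGridRegistry:
    """Hypothetical registry mapping node names to endpoints and advertised services."""

    def __init__(self):
        self._endpoints = {}               # node name -> endpoint string
        self._services = defaultdict(set)  # service name -> names of nodes offering it

    def register(self, node, endpoint, services):
        """Record a node, where to reach it, and which services it offers."""
        self._endpoints[node] = endpoint
        for service in services:
            self._services[service].add(node)

    def discover_nodes(self):
        """Node discovery: list every registered node."""
        return sorted(self._endpoints)

    def discover_service(self, service):
        """Service discovery: endpoints of the nodes that advertise a service."""
        nodes = sorted(self._services.get(service, ()))
        return {node: self._endpoints[node] for node in nodes}
```

For example, after registering one node with `["query", "read"]` and another with `["query", "read", "write"]`, `discover_service("write")` returns only the second node's endpoint.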

Building the EcoGrid
[Figure: EcoGrid node diagram. Node types: Metacat node, SRB node, DiGIR node, VegBank node, Xanthoria node, legacy system. Labeled sites: AND, LUQ, HBR, NTL, VCR.]
• LTER Network (24)
• Natural History Collections (>> 20)
• Organization of Biological Field Stations (180)
• UC Natural Reserve System (36)
• Partnership for Interdisciplinary Studies of Coastal Oceans (4)
• Multi-agency Rocky Intertidal Network (60)

Semantics for Science
• Ontologies provide domain context
• Link directly to data via EML, and to analytical workflows
• Use logic engines for discovery and integration

[Figure: records from an Access file and an Excel file are combined into integrated data.]
Source records:
• Sample 1, lat, long, presence
• Sample 2, lat, long, presence
• Sample 3, lat, long, absence
Source attributes: Elevation (m), Vegetation cover type, Mean annual temperature (C)
Integrated data:
• P, juniper, 2200m, 16C
• P, pinyon, 2320m, 14C
• A, creosote, 1535m, 22C

Ecological ontologies
• What was measured (e.g., biomass)
• Type of measurement (e.g., Energy)
• Context of measurement (e.g., Psychotria limonensis)
• How it was measured (e.g., dry weight)
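The four facets on this slide can be read as fields of one annotation record. A minimal sketch, assuming a hypothetical `MeasurementAnnotation` type whose field names are invented here; the values are the slide's own examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasurementAnnotation:
    """Hypothetical record holding the four facets named on the slide."""
    entity: str            # what was measured
    measurement_type: str  # type of measurement
    context: str           # context of measurement
    method: str            # how it was measured

    def matches(self, **facets):
        """True if every given facet equals this annotation's value (case-insensitive)."""
        return all(
            getattr(self, name).lower() == str(value).lower()
            for name, value in facets.items()
        )


# The slide's example values, gathered into one annotation.
example = MeasurementAnnotation(
    entity="biomass",
    measurement_type="Energy",
    context="Psychotria limonensis",
    method="dry weight",
)
```

Keeping the facets separate lets a search ask for, say, every dataset whose `entity` is biomass regardless of method.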

Semantic Type Labeling
• Label data with semantic types
• Label inputs and outputs of analytical components with semantic types
• Use reasoning engines to generate transformation steps
    • Beware analytical constraints
• Use reasoning engine to discover relevant components
[Figure labels: Data, Ontology, Workflow Components]
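Component discovery by semantic type can be illustrated with a toy is-a hierarchy: a component is relevant to a dataset when the dataset's type equals or specializes the component's input type. Everything in this sketch is assumed for illustration, including the type names, the component names, and the simple ancestor walk, which stands in for a real reasoning engine.

```python
# Hypothetical toy ontology: child type -> parent type (is-a links).
IS_A = {
    "DryWeightBiomass": "Biomass",
    "Biomass": "Measurement",
    "SpeciesPresence": "Occurrence",
    "Occurrence": "Measurement",
}

# Hypothetical components: name -> (input semantic type, output semantic type).
COMPONENTS = {
    "SumBiomass": ("Biomass", "Biomass"),
    "RangeMap": ("Occurrence", "Map"),
    "Summarize": ("Measurement", "Summary"),
}


def ancestors(sem_type):
    """Return the type itself followed by all of its is-a ancestors."""
    chain = [sem_type]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def is_compatible(data_type, input_type):
    """A dataset fits an input if its type equals or specializes the input type."""
    return input_type in ancestors(data_type)


def discover_components(data_type):
    """Names of components whose input type accepts the given data type."""
    return sorted(
        name for name, (input_type, _) in COMPONENTS.items()
        if is_compatible(data_type, input_type)
    )
```

With these assumed labels, `discover_components("DryWeightBiomass")` returns `["SumBiomass", "Summarize"]`: the dataset is a kind of Biomass and a kind of Measurement, but not an Occurrence.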

Scientific workflows
• EML provides semi-automated data binding
• Scientific workflows represent knowledge about the process; AMS captures this knowledge

SEEK Analysis and Modeling System
• Ontologies provide domain context
• Link directly to data via EML, and to analytical workflows via MoML
• Use logic engines for:
    • Discovery of data and analytical components
    • Integration of those components
• Implementation
    • Design tool based on Ptolemy
    • Direct access to EcoGrid data within design tool
    • Individual workflow components execute as OGSA services

Analysis and Modeling system
[Figure: GARP prediction workflow for a native range and an invasion area. Slide from D. Pennington.]
Data sources:
• DiGIR: (a) Species presence & absence points (native range, invasion area)
• SRB: (b) Environmental layers (native range, invasion area)
Step labels: EcoGrid Query, Layer Integration, Sample, + A1, + A2, + A3, Data Calculation, Map, Validation, User
Artifacts:
• (c) Integrated layers (native range, invasion area)
• (d) Training sample, Test sample
• (e) GARP rule set
• (f) Native range prediction map, Invasion area prediction map
• (g) Model quality parameter
Scientific workflows represent knowledge about the process; AMS captures this knowledge
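One way to read the lettered artifacts (a) through (g) in this figure is as a dependency graph that fixes the order in which workflow steps can run. The edges below are an assumption inferred from the labels, since the flattened figure does not show its arrows, and the artifact names are invented for this sketch. It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical dependency edges: artifact -> artifacts it is derived from.
# These edges are assumed; the original figure's arrows did not survive extraction.
DEPENDS_ON = {
    "c_integrated_layers": {"a_presence_absence_points", "b_environmental_layers"},
    "d_training_sample": {"c_integrated_layers"},
    "d_test_sample": {"c_integrated_layers"},
    "e_garp_rule_set": {"d_training_sample"},
    "f_prediction_map": {"e_garp_rule_set", "c_integrated_layers"},
    "g_model_quality": {"f_prediction_map", "d_test_sample"},
}


def execution_order(depends_on):
    """Return the artifacts in an order where every input precedes its outputs."""
    return list(TopologicalSorter(depends_on).static_order())


order = execution_order(DEPENDS_ON)
```

Any valid order starts with the two source datasets and ends with the model quality parameter; the relative order of independent artifacts, such as the training and test samples, is not fixed.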

Aims of EcoGrid
• Which, Where, How, Who ????
• Share Data and Information
• Relate Data from multiple projects/groups
• Crosswalks across data structures
• Develop Eco-related Finding Aids for Data
• Global User: Authenticate and Authorize
• Provide an infrastructure for “Archivable Collection-building” for SEEK scientists
• Facilitate the A&M layer and the SMS layer

Challenges of EcoGrid
• Data & User Diversity
    • datasets & scientists
    • themes, methods, units, structures
    • Small data sizes but high complexity: metadata
• Multiple Data Organizations
    • Biodiversity Surveys
    • Population data
    • GIS, Satellite Images, Weather Data, …
    • Ontologies & Taxonomies
• Data Discovery: No single place to find
• Data Entropy: rapid decline of information on data
• Autonomy with Centralized access
• Leverage Computational Grid work

Existing services
• Metacat: syntactic and semantic metadata querying/inserting/updating/deleting; user registration/authentication; data replication; data/metadata versioning; supports any XML-based metadata
• Xanthoria: common-schema mediator (currently 8 sites); metadata query/insert/update/delete for any XML schema to underlying metadatabase (SQL, native XML)

Existing Systems
• DiGIR: querying arbitrary XML-describable resources (underlying data sources can be any type: RDB, XMLDB)
• ClimDB: integrating diverse-format climate data (using wrapping at the data source). Access through web; common schema identified beforehand; tabular description
• HyperLTER: summary ontology stored as metadata for images; image extraction, geographic subsetting, and band-level subsetting; integration with MODIS images, hyperspectral images, TM images, air photos, …

Existing Systems
• VegBank: three databases: co-occurrence records, a concept-driven species taxonomic database, and community classification. Distributed VegBank, querying by plots. Query/insert/update/annotate across three diverse databases that are described using XML
• SRB: access to distributed data; querying based on syntactic, semantic, and user-defined (arbitrary relational) metadata; annotations for data; operations on data; extraction of metadata; ingest, bulk ingest, delete, and update of data/metadata