Ecoinformatics Workshop Summary SEEK, LTER Network Main Office University of New Mexico Albuquerque, NM.

Topics covered
1. Grid networks – EcoGrid
2. Workflow systems – Kepler / Ptolemy II
3. Metadata compilers – Morpho
4. Databases – MySQL, Metacat, DBDesigner
5. QA/QC – SAS, S-Plus, Access, Excel
6. Interactive & dynamic web sites – Dreamweaver

SEEK Overview

Grid Networks

SEEK EcoGrid
Goal: standardize interfaces (using web and grid services)
–We have standardized data via EML
–Integrate diverse data networks from ecology, biodiversity, and environmental sciences
Grid-standardized interfaces
–Uniform interface to: Metacat, SRB, DiGIR, Xanthoria, etc.
–Anyone can implement these interfaces
–Hides complexity of underlying systems
Metadata-mediated data access
–Supports multiple metadata standards
–EML, Darwin Core as foci
Computational services
–Pre-defined analytical services
–On-the-fly analytical services

EcoGrid Node

EcoGrid client interactions
Modes of interaction
–Client-server
–Fully distributed
–Peer-to-peer
EcoGrid Registry
–Node discovery
–Service discovery
Aggregation services
–Centralized access
–Reliability
–Data preservation

Kepler: scientific workflows
–EML provides semi-automated data binding.
–Scientific workflows represent knowledge about the process; Kepler captures this knowledge.

Kepler: ecological modeling

Lotka-Volterra Predator Prey Model

Running the model
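
The slides run this model as a Kepler workflow, which is not reproduced here. As a rough stand-alone illustration, the classic Lotka-Volterra equations, dx/dt = αx − βxy for prey and dy/dt = δxy − γy for predators, can be integrated in a few lines of Python. The parameter values, starting populations, step size, and the function names `lotka_volterra` and `simulate` below are assumptions chosen only for the sketch.

```python
def lotka_volterra(prey, pred, alpha, beta, delta, gamma):
    """Return (d prey/dt, d pred/dt) for the classic Lotka-Volterra model."""
    d_prey = alpha * prey - beta * prey * pred
    d_pred = delta * prey * pred - gamma * pred
    return d_prey, d_pred


def simulate(prey0, pred0, params, dt=0.01, steps=5000):
    """Integrate with fixed-step 4th-order Runge-Kutta.

    Returns a list of (time, prey, predator) tuples, including the start.
    """
    x, y = prey0, pred0
    out = [(0.0, x, y)]
    for i in range(1, steps + 1):
        k1 = lotka_volterra(x, y, *params)
        k2 = lotka_volterra(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], *params)
        k3 = lotka_volterra(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], *params)
        k4 = lotka_volterra(x + dt * k3[0], y + dt * k3[1], *params)
        x += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        out.append((i * dt, x, y))
    return out


# Hypothetical parameters: alpha, beta, delta, gamma (not taken from the slides).
PARAMS = (1.0, 0.1, 0.02, 0.5)
trajectory = simulate(prey0=40.0, pred0=9.0, params=PARAMS)

# Report the range each population cycles through.
print("prey range:", min(p for _, p, _ in trajectory), max(p for _, p, _ in trajectory))
print("predator range:", min(q for _, _, q in trajectory), max(q for _, _, q in trajectory))
```

With these equations the two populations oscillate around the equilibrium x = γ/δ, y = α/β, which is the cycling behavior a predator-prey workflow is meant to show.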

Elk/Wolf Predator Prey Model

Running the model

Metadata: what are they, and why should they be created?

Metadata Example
In front of you are two tuna cans. How do you decide which one to buy?

Metadata Example
Metadata helps you decide which one to get!

Ecological Metadata Language
Adopted by the LTER Information Management
Metadata specification developed by the ecology discipline for the ecology discipline
Based on prior work of Ecological Society of America and others (Michener et al., 1997)
Seven years in development – 14 versions
–EML
Implemented as an XML Schema
Supports four separate modules
–Dataset
–Citation
–Software
–Protocol

Associated Metadata Data Set Data Table XML files

Morpho
–provides a way for ecologists to share data by defining a common structure to document their data
–uses an XML format to create the common structure.
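
To make the idea of "a common XML structure" concrete, here is a small sketch that writes a metadata record with Python's standard `xml.etree.ElementTree`. The element names are loosely modeled on EML but heavily simplified, so the output is not a schema-valid EML document and is not what Morpho itself produces. The title, creator name, and column definitions are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_metadata(title, creator, table_name, columns):
    """Build a simplified, EML-like metadata record (not schema-valid EML)."""
    root = ET.Element("dataset")
    ET.SubElement(root, "title").text = title
    creator_el = ET.SubElement(root, "creator")
    ET.SubElement(creator_el, "individualName").text = creator

    # One data table, described column by column.
    table = ET.SubElement(root, "dataTable")
    ET.SubElement(table, "entityName").text = table_name
    attributes = ET.SubElement(table, "attributeList")
    for name, definition in columns:
        attr = ET.SubElement(attributes, "attribute")
        ET.SubElement(attr, "attributeName").text = name
        ET.SubElement(attr, "attributeDefinition").text = definition
    return root


# Hypothetical values; the column definitions are guesses for illustration.
record = build_metadata(
    title="Example vegetation survey",
    creator="A. Researcher",
    table_name="observations",
    columns=[
        ("DATE", "Date of the field visit"),
        ("SITE", "Site code"),
        ("SPECIES", "Species code"),
    ],
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because every record follows the same element layout, another researcher (or a program) can find the title, creator, and column descriptions without knowing anything else about the data set.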

Morpho - tree editor

Morpho – entering metadata
Again, choose from the earlier entries or from another data package, or enter new information.

Morpho - metadata
Once data are uploaded to Morpho, you can edit the data or the metadata.
This is the window that appears when you press Finish in the Morpho wizard.

Databases
–Small scale & on local computer – Access
–Bigger & on server – MySQL

Example - why use a database?
Coordinate field data collection and data entry forms

DATE SITE WEB PLOT QD SPECIES OBS COVER HEIGHT COUNT PHEN COMMENTS
2/3/1999 FPC 1 E 1 ERPU V NA
2/3/1999 FPC 1 E 1 ERPU V NA
2/3/1999 FPC 1 E 1 GUSA V NA
2/3/1999 FPC 1 E 1 GUSA V NA
2/3/1999 FPC 1 E 1 GUSA V NA

Database example
Divide into 4 tables:
–Location table
–Species table
–Visit table
–Observation table

DATE SITE WEB PLOT QD SPECIES OBS COVER HEIGHT COUNT PHEN COMMENTS
2/3/1999 FPC 1 E 1 ERPU V NA
2/3/1999 FPC 1 E 1 ERPU V NA
2/3/1999 FPC 1 E 1 GUSA V NA
2/3/1999 FPC 1 E 1 GUSA V NA
2/3/1999 FPC 1 E 1 GUSA V NA

Database example Location Visit Observation Species
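
The four-table split can be sketched with Python's built-in `sqlite3` module. The slides name the tables but do not show their columns, so the assignment of columns to tables, the key names, and the name `count_value` are assumptions. Only the first six fields of each example row are loaded, because the remaining values cannot be matched to columns reliably in this transcript.

```python
import sqlite3

# Assumed layout: which flat-file column goes in which table is a guess.
SCHEMA = """
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,
    site TEXT NOT NULL,
    web INTEGER NOT NULL,
    plot TEXT NOT NULL,
    qd INTEGER NOT NULL,
    UNIQUE (site, web, plot, qd)
);
CREATE TABLE species (
    species_code TEXT PRIMARY KEY
);
CREATE TABLE visit (
    visit_id INTEGER PRIMARY KEY,
    visit_date TEXT NOT NULL,
    location_id INTEGER NOT NULL REFERENCES location(location_id),
    UNIQUE (visit_date, location_id)
);
CREATE TABLE observation (
    observation_id INTEGER PRIMARY KEY,
    visit_id INTEGER NOT NULL REFERENCES visit(visit_id),
    species_code TEXT NOT NULL REFERENCES species(species_code),
    obs INTEGER, cover REAL, height REAL, count_value INTEGER,
    phen TEXT, comments TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

# DATE, SITE, WEB, PLOT, QD, SPECIES from the example rows (dates kept as written).
flat_rows = [
    ("2/3/1999", "FPC", 1, "E", 1, "ERPU"),
    ("2/3/1999", "FPC", 1, "E", 1, "ERPU"),
    ("2/3/1999", "FPC", 1, "E", 1, "GUSA"),
    ("2/3/1999", "FPC", 1, "E", 1, "GUSA"),
    ("2/3/1999", "FPC", 1, "E", 1, "GUSA"),
]

for date, site, web, plot, qd, species in flat_rows:
    # Repeated location, species, and visit values are stored only once.
    conn.execute(
        "INSERT OR IGNORE INTO location (site, web, plot, qd) VALUES (?, ?, ?, ?)",
        (site, web, plot, qd),
    )
    location_id = conn.execute(
        "SELECT location_id FROM location WHERE site=? AND web=? AND plot=? AND qd=?",
        (site, web, plot, qd),
    ).fetchone()[0]
    conn.execute("INSERT OR IGNORE INTO species (species_code) VALUES (?)", (species,))
    conn.execute(
        "INSERT OR IGNORE INTO visit (visit_date, location_id) VALUES (?, ?)",
        (date, location_id),
    )
    visit_id = conn.execute(
        "SELECT visit_id FROM visit WHERE visit_date=? AND location_id=?",
        (date, location_id),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO observation (visit_id, species_code) VALUES (?, ?)",
        (visit_id, species),
    )
conn.commit()

# A join rebuilds the flat view whenever it is needed.
flat_view = conn.execute(
    """
    SELECT v.visit_date, l.site, l.web, l.plot, l.qd, o.species_code
    FROM observation AS o
    JOIN visit AS v ON v.visit_id = o.visit_id
    JOIN location AS l ON l.location_id = v.location_id
    ORDER BY o.observation_id
    """
).fetchall()
```

The five flat rows collapse to one location, one visit, and two species, with five observation rows pointing at them, so a correction to a site code or date is made in one place.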

Database example

QA/QC
QC
–Designing data sheets
–Data entry using
  Validation rules
  Filters
  Lookup tables
–Validate entered data
  Double entry
  Prior data
  Filters
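
The slides apply these checks in tools such as Access and Excel. As a language-neutral illustration, the sketch below shows a lookup-table check, a range rule, and a double-entry comparison in plain Python. The 0 to 100 range for COVER, the contents of the lookup table, and the function names are assumptions.

```python
# Lookup table of accepted species codes (only the two codes that appear
# in the example data; a real list would be longer).
SPECIES_LOOKUP = {"ERPU", "GUSA"}


def validate_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    if record.get("SPECIES") not in SPECIES_LOOKUP:
        problems.append(f"unknown species code: {record.get('SPECIES')!r}")
    cover = record.get("COVER")
    # Assumed rule: cover is a percentage between 0 and 100.
    if cover is not None and not 0 <= cover <= 100:
        problems.append(f"cover out of range: {cover}")
    return problems


def compare_double_entry(first_pass, second_pass):
    """Compare two independent entries of the same data sheets, field by field.

    Returns (row index, field, first value, second value) for each mismatch.
    """
    mismatches = []
    for row, (a, b) in enumerate(zip(first_pass, second_pass)):
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((row, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical records keyed in twice by different people.
entry_1 = [{"SPECIES": "ERPU", "COVER": 12.0}, {"SPECIES": "GUSA", "COVER": 5.0}]
entry_2 = [{"SPECIES": "ERPU", "COVER": 12.0}, {"SPECIES": "GUSA", "COVER": 50.0}]

for row, field, first, second in compare_double_entry(entry_1, entry_2):
    print(f"row {row}: {field} differs ({first} vs {second})")
```

A mismatch only says that the two entries disagree; the paper data sheet decides which value is right.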

QA/QC
QA
–Graphics
  Box plots
  Scatterplots
  Normal probability plots
–Formal statistical methods
  Grubbs' test (Edwards 2000)

QA/QC
The goal of QA is NOT to eliminate outliers! Rather, we wish to detect unusual & extreme values.

[Figure: distribution with the mean µ and the bounds µ − 3σ and µ + 3σ marked]
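
One simple way to detect values outside µ ± 3σ is sketched below with Python's `statistics` module. This is a plain three-sigma screen, not Grubbs' test, and the function name `flag_extreme` and the sample values are assumptions. In keeping with the goal stated above, values are flagged for review and never deleted.

```python
import statistics


def flag_extreme(values, k=3.0):
    """Return (index, value) pairs lying more than k sample SDs from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return []  # all values identical, nothing can be extreme
    return [(i, v) for i, v in enumerate(values) if abs(v - mean) > k * sd]


# Hypothetical measurements with one suspicious entry at the end.
heights = [10, 11, 12, 10, 11, 12, 10, 11, 12, 10,
           11, 12, 10, 11, 12, 10, 11, 12, 10, 11, 60]

for index, value in flag_extreme(heights):
    print(f"check record {index}: value {value} is unusually far from the mean")
```

Because the extreme value itself inflates the mean and standard deviation, a screen like this works best on reasonably large samples and alongside the graphical checks listed earlier.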

What did I learn?
Know your subject.
Have a plan. A little time spent planning in advance will save a lot of headache (and time, money, and missed opportunities) later.
Unorganized data can quickly wall you off from the increasingly collaborative and computerized research world.