knb.ecoinformatics.org: LTER EML Best Practices. Data Discovery in the Biological Sciences, 7-9 February 2005. Mark Servilla, LTER Network Office, University of New Mexico.

Presentation transcript:

knb.ecoinformatics.org LTER EML Best Practices Data Discovery in the Biological Sciences 7-9 February 2005 Mark Servilla LTER Network Office University of New Mexico, Albuquerque

Agenda
– Goals & Motivation
– LTER Metadata Tiers
– Recommended EML elements
– Additional Recommendations

Goals & Motivation
Why do we need an “EML Best Practices” document?
Guidelines to achieve the following goals:
– Maximize interoperability of LTER EML documents to facilitate data synthesis
– Minimize heterogeneity of LTER EML documents to simplify development and re-use of software tools and style sheets
– Identify useful subsets of EML to support specific functionality tiers targeted by the LTER NIS Advisory Committee (NISAC)
– Provide guidance to sites in their initial implementation of EML, and a roadmap for improving their implementation to achieve higher functionality

LTER Tiered Trajectory for Metadata

Increasing Levels of Completeness: Identification, Discovery, Evaluation, Access, Integration, Semantic Use

Level 1 - Identification
Description – Minimum content for adequate data set discovery
Major Elements Added:
– Title
– Creator
– Contact
– Publisher
– Publication Date
– Keywords
– Abstract
– Dataset/distribution (i.e. URL for dataset information)
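
The Level 1 element list can be turned into a simple completeness check. The following is a minimal sketch, not an LTER or Metacat tool: the function name `missing_level1`, the path table, and the sample document are all assumptions made for illustration, and the sample is not validated against the EML schema.

```python
import xml.etree.ElementTree as ET

# Level 1 elements from the slide, mapped to EML dataset paths (assumed mapping).
LEVEL1_PATHS = {
    "Title": "dataset/title",
    "Creator": "dataset/creator",
    "Contact": "dataset/contact",
    "Publisher": "dataset/publisher",
    "Publication Date": "dataset/pubDate",
    "Keywords": "dataset/keywordSet/keyword",
    "Abstract": "dataset/abstract",
    "Dataset/distribution": "dataset/distribution",
}

def missing_level1(eml_text):
    """Return the names of the Level 1 elements absent from an EML document."""
    root = ET.fromstring(eml_text)
    return [name for name, path in LEVEL1_PATHS.items() if root.find(path) is None]

# Hypothetical, deliberately incomplete document (no publisher, no distribution).
SAMPLE = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.0.1"
    packageId="knb-lter-fls.1.1" system="FLS">
  <dataset>
    <title>Hypothetical arthropod monitoring data set</title>
    <creator><organizationName>FLS LTER</organizationName></creator>
    <pubDate>2005</pubDate>
    <abstract><para>Placeholder abstract.</para></abstract>
    <keywordSet><keyword>arthropods</keyword></keywordSet>
    <contact><organizationName>FLS LTER</organizationName></contact>
  </dataset>
</eml:eml>"""

print(missing_level1(SAMPLE))
```

A site could run a check like this over its documents before contributing them, treating an empty result as "Level 1 reached".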

Level 1 Code Example
<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.0.1" xmlns:xsi=" xsi:schemaLocation="eml://ecoinformatics.org/eml packageId="knb-lter-fls.1.1" system="FLS" scope="system">
FLS-1 Arthropods Long-term Ground Arthropod Monitoring Dataset at Silver City, NM USA from 1998 to

Level 1 Code Example cont.
John Ecologist FLS LTER Department of Ecology University of New Mexico PO Box 1234 Albuquerque NM (505)

Level 2 - Discovery
Description – Level 1 content, plus coverage information to support targeted searches
Major Elements Added:
– Geographic Coverage
– Taxonomic Coverage
– Temporal Coverage
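
To illustrate how geographic coverage supports a targeted search, here is a minimal sketch that reads a bounding box from a `geographicCoverage` element and tests it against a query box. The coordinates are illustrative values, not taken from the slides, and `bounding_box` and `overlaps` are hypothetical helper names. The test ignores boxes that cross the antimeridian.

```python
import xml.etree.ElementTree as ET

# Illustrative coverage fragment; the coordinate values are assumptions.
COVERAGE = """<geographicCoverage>
  <geographicDescription>Silver City, NM USA</geographicDescription>
  <boundingCoordinates>
    <westBoundingCoordinate>-108.4</westBoundingCoordinate>
    <eastBoundingCoordinate>-108.2</eastBoundingCoordinate>
    <northBoundingCoordinate>32.9</northBoundingCoordinate>
    <southBoundingCoordinate>32.7</southBoundingCoordinate>
  </boundingCoordinates>
</geographicCoverage>"""

def bounding_box(coverage_xml):
    """Return (west, east, south, north) in decimal degrees."""
    box = ET.fromstring(coverage_xml).find("boundingCoordinates")
    sides = ("west", "east", "south", "north")
    return tuple(float(box.findtext(side + "BoundingCoordinate")) for side in sides)

def overlaps(a, b):
    """True when two (west, east, south, north) boxes intersect."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]

query = (-109.05, -103.0, 31.33, 37.0)  # rough box around New Mexico
print(overlaps(bounding_box(COVERAGE), query))
```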

Level 2 Code Example
Silver City, NM USA meter...

Level 2 Code Example cont.
Orthopteran insects (grasshoppers) were identified using the 2004 BigKey to Orthoptera
Kingdom Animalia
Phylum Arthropoda

Level 3 - Evaluation
Description – Level 2 content, plus data set details to enable end-user evaluation of the methodology and data entities
Major Elements Added:
– Intellectual Rights
– Project
– Methods
– Data Table/Entity Group
– Data Table/Attributes (constrained by current version of EML)

Level 3 Code Example
The dataset is released to the public and may be used for academic or commercial purposes subject to the following restrictions: LTER will make every effort possible to control and document the quality of the data it publishes. Data are made available "as is"...

Level 3 Code Example cont.
... Fictitious LTER Site (FLS) permanent monitoring program Dr. Eva Scientist addr-1 principalInvestigator
The FLS basic monitoring program consists of monitoring of arthropod populations, plant net primary productivity, and bird populations. Monitoring takes place at 3 sites, 4 times a year. Climate parameters are continuously measured at all stations.

Level 3 Code Example cont.
FLS Protocol for Surveying Ground Arthropods has been... FLS Protocol for Surveying Ground Arthropods pers This protocol is being used by FLS arthropod... Ecology...

Level 3 Code Example cont.
SBE MicroCAT 37-SM (S/N 1790); manufacturer: Sea-Bird Electronics (model: 37-SM MicroCAT); parameter: Conductivity (accuracy: S/m, readability: S/m, range: 0 to 7 S/m); last calibration: Feb 28, 2001
SBE MicroCAT 37-SM (S/N 1790); manufacturer: Sea-Bird Electronics (model: 37-SM MicroCAT); parameter: Pressure (water) (accuracy: 0.2m, readability: m, range: 0 to 20m); last calibration: Feb 28, 2001
SBE MicroCAT 37-SM (S/N 1790); manufacturer: Sea-Bird Electronics (model: 37-SM MicroCAT); parameter: Temperature (water) (accuracy: 0.002C, readability: C, range: -5 to 35C); last calibration: Feb 28,

Level 3 Code Example cont.
... arthro_hab Habitat description for the sampling locations
temp Water Temperature float celsius real NaN value not recorded or invalid...

Level 3 Code Example cont.
cond Conductivity measured with SeaBird Electronics CTD-911 float siemensPerMeter real

Level 3 Code Example cont.
... <unit id="siemensPerMeter" name="siemensPerMeter" unitType="conductance" parentSI="siemen" multiplierToSI="1"> electrical conductance of a solution (conductivity)...

Level 4 - Access
Description – Level 3 content plus data access details to support automated data retrieval
Major Elements Added:
– Access
– Physical

Level 4 Code Example
PUBLIC read uid=fls,o=LTER,dc=ecoinformatics,dc=org all
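
The access values in this example can be read as two allow rules: the site principal gets `all`, and the public gets `read`. The sketch below evaluates such rules in a deliberately simplified way. It is not Metacat's actual algorithm: it ignores deny rules and rule order, treats `all` as implying every permission, and compares principals case-insensitively. The `authSystem` and `order` attribute values and the helper name `allowed` are assumptions.

```python
import xml.etree.ElementTree as ET

# Access block rebuilt from the slide's values; attribute values are assumed.
ACCESS = """<access authSystem="knb" order="allowFirst">
  <allow>
    <principal>uid=fls,o=LTER,dc=ecoinformatics,dc=org</principal>
    <permission>all</permission>
  </allow>
  <allow>
    <principal>public</principal>
    <permission>read</permission>
  </allow>
</access>"""

def allowed(access_xml, principal, permission):
    """Simplified check: does any allow rule grant `permission` to `principal`?"""
    root = ET.fromstring(access_xml)
    for rule in root.findall("allow"):
        principals = {p.text.strip().lower() for p in rule.findall("principal")}
        permissions = {p.text.strip().lower() for p in rule.findall("permission")}
        # "public" matches every user; "all" covers every permission.
        matches_user = bool({principal.lower(), "public"} & principals)
        matches_perm = bool({permission.lower(), "all"} & permissions)
        if matches_user and matches_perm:
            return True
    return False

print(allowed(ACCESS, "uid=someone,o=LTER,dc=ecoinformatics,dc=org", "read"))
```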

Level 4 Code Example
... flslter column,

Level 5 - Integration
Description – Level 4 content plus complete attribute and quality control details to support computer-assisted data integration and re-sampling. Integration-level metadata should support computer-mediated access and processing of data, and therefore requires that all aspects of the data package be fully described.
Major Elements Added:
– Attribute List (full descriptions)
– Measurement Scale
– Units
– Constraint
– Quality Control
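
As a rough illustration of what "computer-mediated processing" gains from full attribute descriptions, the sketch below uses attribute metadata (name, numeric type, unit, missing value code) to parse raw data rows. The attribute table is a hand-written stand-in for what would be read from an EML attribute list. The `temp` and `cond` names echo the Level 3 examples, while the missing value code for `cond`, the sample rows, and the `read_rows` helper are assumptions.

```python
import csv
import io

# Hand-written stand-in for an EML attribute list (assumed structure).
ATTRIBUTES = [
    {"name": "temp", "type": float, "unit": "celsius", "missing": "NaN"},
    {"name": "cond", "type": float, "unit": "siemensPerMeter", "missing": "NaN"},
]

def read_rows(text, attributes):
    """Parse comma-separated rows, mapping missing value codes to None."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        row = {}
        for attr, raw in zip(attributes, record):
            raw = raw.strip()
            row[attr["name"]] = None if raw == attr["missing"] else attr["type"](raw)
        rows.append(row)
    return rows

print(read_rows("12.5,3.1\nNaN,2.9\n", ATTRIBUTES))
```

Without the documented missing value code, the `NaN` entry would silently become a floating-point not-a-number instead of an explicit gap.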

Level 5 Code Example
... pkarthro_taxa dbo.arthro_taxa.taxon arthro_taxa.taxonNotNull dbo.arthro_taxa.taxon...

Level 5 Code Example cont.
Passage of clouds during a profile reduces the incident radiation, and leads to erroneous estimates of Kd. Variation of incident irradiance was described in two ways (before binning): 1) the coefficient of variation (cv) over the 10m depth interval, and 2) difference...

Level 6 - Semantic
Description – Level 5 content plus semantic information (currently under development by SEEK, and may require extension to the EML schema)

Additional Recommendations
– packageID and Metacat document naming convention: Metacat, and by extension the Metacat harvester, rely on numerical data set ids and revision numbers for document management and synchronization. packageId attributes for EML contributed to the KNB Metacat should be formed as follows: knb-lter-[site].[dataset number].[revision] (Scope, UniqueID, Revision#), e.g. knb-lter-sev
– LDAP access control in Metacat: Metacat access control format conforms to the LDAP Distinguished Name concept: uid=FLS,o=lter,dc=ecoinformatics,dc=org
– Organizational citation: The “Organization” field on the Metacat query results page is populated using the first eml:eml/dataset/creator/organizationName element in the document, so it is recommended that for LTER-contributed data sets the LTER site be included as the first creator: Sevilleta LTER
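
These three recommendations are mechanical enough to sketch in code. The following is a minimal illustration under stated assumptions: site codes are taken to be three lowercase letters, the helper names `parse_package_id`, `site_dn`, and `organization` are hypothetical, and the sample document is invented. It is not how Metacat itself implements any of this.

```python
import re
import xml.etree.ElementTree as ET

# knb-lter-[site].[dataset number].[revision]; three-letter site code assumed.
PACKAGE_ID = re.compile(r"^(knb-lter-[a-z]{3})\.(\d+)\.(\d+)$")

def parse_package_id(package_id):
    """Split a packageId into (scope, unique id, revision), or None if malformed."""
    match = PACKAGE_ID.match(package_id)
    if match is None:
        return None
    scope, number, revision = match.groups()
    return scope, int(number), int(revision)

def site_dn(site):
    """Build the LDAP Distinguished Name for a site, in the slide's format."""
    return f"uid={site.upper()},o=lter,dc=ecoinformatics,dc=org"

def organization(eml_text):
    """Return the first dataset/creator/organizationName in document order."""
    return ET.fromstring(eml_text).findtext("dataset/creator/organizationName")

# Hypothetical document with the LTER site listed as the first creator.
SAMPLE = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.0.1"
    packageId="knb-lter-fls.1.1" system="FLS">
  <dataset>
    <title>Hypothetical data set</title>
    <creator><organizationName>Sevilleta LTER</organizationName></creator>
    <creator><organizationName>Department of Ecology</organizationName></creator>
  </dataset>
</eml:eml>"""

print(parse_package_id("knb-lter-fls.1.1"))
print(site_dn("fls"))
print(organization(SAMPLE))
```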

Access to EML Best Practices

Credits
– James Brunt (LNO)
– Corinna Gries (CAP)
– Jeanine McGann (LNO)
– Margaret O’Brien (SBC)
– Ken Ramsey (JRN)
– Wade Sheldon (GCE)

Acknowledgements
This material is based upon work supported by:
– The National Science Foundation under Grant Numbers , , , , , and
– The National Center for Ecological Analysis and Synthesis, a Center funded by NSF (Grant Number ), the University of California, and the UC Santa Barbara campus.
– The Andrew W. Mellon Foundation.
PBI Collaborators: NCEAS, University of New Mexico (Long Term Ecological Research Network Office), San Diego Supercomputer Center, University of Kansas (Center for Biodiversity Research)
Kepler contributors: SEEK, Ptolemy II, SDM/SciDAC, GEON