Building Integrated Data Streams for Large-Scale Paleoclimatology & Biogeography
CDSCO, Neotoma DB (www.neotomadb.org), C4P

Presentation transcript:

Building Integrated Data Streams for Large-Scale Paleoclimatology & Biogeography
Jack Williams, Simon Goring (UW-Madison)
CDSCO, Neotoma DB

Many Big Questions require assembly of individual records into larger networks
- Do global temperatures lead or lag CO2 during deglaciations? [Figure: global temperatures & CO2, 22 ka to 0 ka; Shakun et al. (2012) Nature]
- How far and fast can species migrate when climates change? [Figure: spruce distributions (% spruce pollen, with ice extent), last glacial maximum to present, in panels at 21,000, 15,000, 11,000, 7,000 and Modern; Williams et al. (2004) Ecological Monographs]
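As an illustration of what "assembly of individual records into larger networks" can mean in practice, here is a minimal sketch that groups samples from separate sites into shared time slices. The slice ages are taken from the map panels on the slide; the site names, sample values, the 1,000-year tolerance, and the function names are all assumptions made for this example, not data or methods from the cited studies.

```python
from collections import defaultdict

# Time slices in years before present, taken from the map panels on the slide.
TIME_SLICES = [21000, 15000, 11000, 7000, 0]


def nearest_slice(age, slices=TIME_SLICES, tolerance=1000):
    """Return the slice closest to `age`, or None if it is farther than `tolerance` years."""
    best = min(slices, key=lambda s: abs(s - age))
    return best if abs(best - age) <= tolerance else None


def assemble(records, tolerance=1000):
    """Group (site, age, percent) samples by time slice and average them per site."""
    grouped = defaultdict(lambda: defaultdict(list))
    for site, age, percent in records:
        time_slice = nearest_slice(age, tolerance=tolerance)
        if time_slice is not None:  # samples far from every slice are left out
            grouped[time_slice][site].append(percent)
    return {
        time_slice: {site: sum(vals) / len(vals) for site, vals in sites.items()}
        for time_slice, sites in grouped.items()
    }


# Hypothetical samples: (site name, age in yr BP, spruce pollen percent).
records = [
    ("Site A", 20800, 12.0),
    ("Site A", 21300, 8.0),
    ("Site B", 14900, 30.0),
    ("Site B", 9200, 5.0),   # more than 1,000 years from any slice, so dropped
    ("Site C", 300, 22.0),
]

network = assemble(records)
print(network)
```

Each site contributes at most one averaged value per slice, which is the form a mapped synthesis needs. The sketch only works if every record already carries a comparable age and taxon name, which is the kind of standardization the community repositories provide.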

Paleoecological Data: Key characteristics
- ‘Long Tail’: collected in the field by small scientific teams
- Workers vary with respect to data management expertise, capacity, and interest
- Highly valuable: specimens & samples collected decades ago are still analyzed
- Scientific expertise distributed by proxy type, region, time period, and/or taxonomic group

Community Data Repositories have emerged to tackle these bigger questions
Examples: Neotoma DB, Paleobiology DB (paleobiodb.org)
Key Characteristics:
- Open Data
- Curated by Community
- Standardized Taxonomy
- Time: Age Controls and Age Models
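To make the "Age Controls and Age Models" item concrete, here is a minimal sketch of one simple kind of age model: linear interpolation between dated depths. This is a swapped-in textbook technique chosen for brevity, not necessarily the model type used for any given record in these repositories. The function name and the control points are hypothetical.

```python
from bisect import bisect_right


def linear_age_model(controls):
    """Build a depth -> age function by linear interpolation between age controls.

    `controls` is a list of (depth_cm, age_yr_bp) pairs with distinct depths.
    """
    points = sorted(controls)
    depths = [d for d, _ in points]
    ages = [a for _, a in points]
    if len(points) < 2:
        raise ValueError("need at least two age controls")
    if len(set(depths)) != len(depths):
        raise ValueError("age controls must have distinct depths")

    def age_at(depth):
        # No extrapolation: only depths bracketed by age controls get an age.
        if depth < depths[0] or depth > depths[-1]:
            raise ValueError("depth outside the range of the age controls")
        # Index of the control just below the sample, clamped at the deepest control.
        i = min(bisect_right(depths, depth), len(depths) - 1)
        d0, d1 = depths[i - 1], depths[i]
        a0, a1 = ages[i - 1], ages[i]
        return a0 + (a1 - a0) * (depth - d0) / (d1 - d0)

    return age_at


# Hypothetical age controls: core top plus two dated levels.
age_at = linear_age_model([(0, -50), (100, 2000), (300, 11000)])
print(age_at(50), age_at(200))
```

Storing the age controls alongside the resulting ages, rather than the ages alone, lets a later user rebuild the chronology with a different model.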

PALEOBIOLOGICAL DATA CONSORTIUM
Groupings shown: COMMUNITY, GEODATA, OPEN-SOURCE, BIODATA
Members shown: Paleobiology DB, NOW DB, Continental Scientific Drilling Office (CDSCO), Digimorph, NOAA Paleoclimatology, DarwinCore, iDigPaleo, MorphoBank, Neotoma DB, VertNet, Early Career Members-at-Large, ROpenSci, GBIF/BISON, STEPPE, Open Geospatial Consortium, Integrated Earth Data Alliance, iDigBio, C4P
Goals:
- Share best practices & protocols
- Build compatibility between geo- & bioinformatics

Neotoma Paleoecology Database: Design Concepts
- Spatiotemporal database: species occurrences & abundances in space and time
- Age controls and age models stored
- Centralized IT and Distributed Scientific Governance. Neotoma is composed of several constituent databases (e.g. North American Pollen Database, FAUNMAP)
- Open data accessible via Explorer, APIs, and R (neotoma)
- Broad user community: Paleoecologists, ecosystem modellers, paleoclimatologists, biogeographers, educators, …
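The design concepts above can be sketched as a small data structure: occurrences nested in dated samples, samples nested in datasets that carry a location and the name of a constituent database. This is only an illustration of the idea of a spatiotemporal occurrence store. The class names, field names, and example values are assumptions for this sketch and do not reproduce the actual Neotoma schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Occurrence:
    taxon: str
    value: float
    units: str  # e.g. "percent" or "count"


@dataclass
class Sample:
    depth_cm: float
    age_yr_bp: float  # assigned by the stored age model
    occurrences: list = field(default_factory=list)


@dataclass
class Dataset:
    site_name: str
    latitude: float
    longitude: float
    constituent_db: str  # which constituent database contributed the record
    samples: list = field(default_factory=list)


def find_occurrences(datasets, taxon, age_min, age_max, bbox):
    """Return (site, age, value, units) for a taxon inside an age range and bounding box.

    `bbox` is (west, south, east, north) in decimal degrees.
    """
    west, south, east, north = bbox
    hits = []
    for ds in datasets:
        if not (west <= ds.longitude <= east and south <= ds.latitude <= north):
            continue
        for sample in ds.samples:
            if age_min <= sample.age_yr_bp <= age_max:
                for occ in sample.occurrences:
                    if occ.taxon == taxon:
                        hits.append((ds.site_name, sample.age_yr_bp, occ.value, occ.units))
    return hits


# Hypothetical records from two constituent databases.
datasets = [
    Dataset("Lake X", 45.0, -90.0, "pollen database", [
        Sample(120.0, 11000.0, [Occurrence("Picea", 35.0, "percent")]),
        Sample(20.0, 500.0, [Occurrence("Picea", 4.0, "percent")]),
    ]),
    Dataset("Cave Y", 30.0, -100.0, "vertebrate database", [
        Sample(50.0, 10500.0, [Occurrence("Neotoma", 3.0, "count")]),
    ]),
]

hits = find_occurrences(datasets, "Picea", 10000, 12000, (-95.0, 40.0, -85.0, 50.0))
print(hits)
```

Because every occurrence carries a place, an age, and a taxon, the same query shape serves pollen, vertebrates, or any other proxy held by a constituent database.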

Neotoma Domain
Time: Late Neogene (~last 5 million years). Most records: yrs
Space: North American to Global
Datasets: Plants & pollen, Vertebrates, Ostracodes, Diatoms, Insects, Testate Amoebae, Physical Sedimentology
[Figure: Temporal Domains of Paleoecological Databases; Brewer et al TREE]

Neotoma as Boundary Organization
Data Users: Paleoecologists, Ecosystem Modelers, Biogeographers, Paleoclimatologists
Paleoecologists: Pollen, Vertebrates, Insects, Diatoms, Ostracodes, Amoebae, Packrat Middens
Informatics & Computer Scientists: IEDA, GeoWS, Open Core
Exchanged: Best Practices, Shared Protocols, Data, New Questions

Paleodata Workflows: State of Field
1. Cores Collected
2. Cores Split, Sampled, Logged
3. Proxies Measured by PIs
4. Papers Written
5. Data & Metadata Assembled
6. Data Deposited (Journals, NOAA-Paleo, Neotoma, etc.)
7. Data Synthesized into Regional-Global Studies
8. Metadata gaps discovered
9. New Analyses
Consequences:
- Variably documented data
- Challenging project management
- Multiple inefficiencies, sources of data friction
- Synthetic research hard at anything beyond site scale

Key Need: Integrated Data Workflows
1. Cores Collected, Tagged with IGSNs, Metadata Logged In Field
2. Cores Split, Sampled, Logged, Samples Tagged with IGSNs, Data Stored in Common Data Structures (Open Core Data)
3. Proxies Measured by PIs, Data Stored in Common Data Formats
4. Papers Written, Embargoed Data Passed to Community Data Repositories (e.g. Neotoma)
5. Data & Metadata Assembled
6. Paper Published, Embargo Lifted from Repository
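The six steps above can be sketched as a record that moves through ordered stages, with the repository holding the data under embargo from step 4 and exposing it only at step 6. This is a minimal sketch under stated assumptions: the class and stage names are invented for the example, and the identifier is a placeholder, not a real IGSN.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    # One stage per workflow step on the slide; the names are hypothetical.
    COLLECTED = 1
    SPLIT_AND_SAMPLED = 2
    PROXIES_MEASURED = 3
    DEPOSITED_UNDER_EMBARGO = 4
    METADATA_ASSEMBLED = 5
    PUBLISHED = 6


@dataclass
class CoreRecord:
    igsn: str  # sample identifier assigned in the field
    stage: Stage = Stage.COLLECTED

    def advance(self):
        """Move the record to the next workflow step and return the new stage."""
        if self.stage is Stage.PUBLISHED:
            raise ValueError("record is already published")
        self.stage = Stage(self.stage + 1)
        return self.stage

    @property
    def in_repository(self):
        # The repository receives the data at step 4, before publication.
        return self.stage >= Stage.DEPOSITED_UNDER_EMBARGO

    @property
    def publicly_visible(self):
        # The embargo lifts only when the paper is published.
        return self.stage is Stage.PUBLISHED


core = CoreRecord(igsn="IGSN-PLACEHOLDER-0001")  # placeholder, not a real IGSN
history = [core.stage.name]
while core.stage is not Stage.PUBLISHED:
    history.append(core.advance().name)
print(history)
```

The point of separating `in_repository` from `publicly_visible` is that data and metadata reach the repository while the project is still active, so the documentation is captured early and nothing has to be reassembled after the paper is out.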

Current & Future Neotoma Activities
1. Data Uploads
2. Partnership with LacCore/CDSCO et al. to establish common standards & linked data flows
3. neotoma R: establishing data models, integration with R packages
4. API development, user-driven
5. New tools for data visualization & exploration

This talk represents the work of many
Neotoma PIs & Developers: Eric C. Grimm, Russ Graham, Mike Anderson, Allan Ashworth, Brian Bills, Jessica Blois, Bob Booth, Ed Davis, Don Charles, Simon Goring, Steve Jackson, Alison Smith, Jack Williams
C4P Steering Committee: Kerstin Lehnert, David Anderson, Doug Fils, Leslie Hsu, Chris Jenkins, Anders Noren, Tom Olsewski, Dena Smith, Mark Uhen, Jack Williams
NSF-Geoinformatics, NSF-Earth Cube