Community-Supported Data Repositories in Paleoecology and Paleoclimatology: The ‘Middle Tail’ between Geoscientific Users and Geoinformatics


1 Community-Supported Data Repositories in Paleoecology and Paleoclimatology: The ‘Middle Tail’ between Geoscientific Users and Geoinformatics
Neotoma DB (www.neotomadb.org) and C4P
Jack Williams, Allan Ashworth, Brian Bills, Jessica Blois, Don Charles, Simon Goring, Russ Graham, Eric Grimm, Alison Smith, & Mark Uhen
Part I: Building the Middle Tail: Community-Led Data Repositories
Part II: Interconnecting the Middle Tail: Cyberinfrastructure for the Paleogeosciences

2 Many Big Questions require assembly of individual paleorecords into larger networks
- Do global temperatures lead or lag CO2 during deglaciations?
  [Figure: Global temperatures & CO2, 22 ka to 0 ka. Shakun et al. (2012) Nature]
- How far and fast can species migrate when climates change?
  [Figure: Spruce distributions, last glacial maximum to present; map panels labeled 21,000, 15,000, 11,000, 7,000, and Modern; legend: % Spruce Pollen, Ice, No Data. Williams et al. (2004) Ecological Monographs]

3 Paleoecological Data: Key characteristics
- ‘Long Tail’: collected in the field by small scientific teams
- Scientists vary with respect to data management expertise, capacity, and interest
- Highly valuable: specimens & samples collected decades ago are still analyzed
- Distributed scientific expertise: by proxy type, region, time period, and/or taxonomic group
[Figure: “Big Data” vs. “Long Tail”; axis labels: Datasets, Data Size]

4 Solution: Community-Led Data Repositories (COLDARs) as ‘middle tail’ for long-tail data
Key Characteristics
- Open Data
- Curated by Community
- Added Value by serving community-specific needs (e.g. age models, taxonomy)
Repositories shown: Neotoma DB (www.neotomadb.org), Paleobiology DB (paleobiodb.org)

5 Moving up the Value Chain: Generic Depositories vs. Community-Led Repositories
“… data have no value or meaning in isolation; they exist within a knowledge infrastructure — an ecology of people, practices, technologies, institutions, material objects, and relationships.” - C.L. Borgman
[Figure, modified from K. Lehnert: value chain from small data to BIG DATA, spanning Generic Depositories and Community-Led Repositories. Stage labels: findable, accessible, interoperable, re-usable. Attribute labels: identification, persistence; authorization, protocols; context, provenance; harmonized, community governance & input]

6 Neotoma Paleoecology Database: Community-Led Repository for Quaternary and Pliocene Data
Design Concepts
- Spatiotemporal Database: species occurrences & abundances in space & time
- Age Controls and Age Models stored
- Centralized IT and Distributed Scientific Governance
- Neotoma composed of several constituent databases (e.g. North American Pollen Database, FAUNMAP)
- Open Data: accessible via Explorer, APIs, and the neotoma R package
- Broad User Community: paleoecologists, ecosystem modellers, paleoclimatologists, biogeographers, educators, …
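Slide 6 lists APIs as one of the routes to Neotoma's open data. The following is a minimal sketch of what a programmatic site search could look like. The base URL, the `sitename` and `limit` parameters, and the example search term are assumptions that do not appear in the slides; check them against the current Neotoma API documentation before relying on them. The helper names `build_sites_query` and `fetch_sites` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: endpoint and parameter names are modeled on the public
# Neotoma API (v2.0); the slides do not specify them.
BASE_URL = "https://api.neotomadb.org/v2.0/data/sites"


def build_sites_query(sitename, limit=10):
    """Return a URL that searches for sites by name (no network access)."""
    params = {"sitename": sitename, "limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_sites(sitename, limit=10, timeout=30):
    """Request the query URL and decode the JSON body (needs network access)."""
    with urlopen(build_sites_query(sitename, limit), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # "Devils Lake" is only an illustrative search term.
    print(build_sites_query("Devils Lake"))
```

Keeping URL construction separate from the network call makes the query easy to inspect or log before any request is sent.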

7 Neotoma Domain
- Time: Late Neogene (~last 5 million years); most records: 10^4 to 10^5 yrs
- Space: North American to Global
- Paleoecological Data: plants & pollen, vertebrates, ostracodes, diatoms, insects, testate amoebae, physical sedimentology
[Figure: Temporal Domains of Paleoecological Databases. Brewer et al. 2012 TREE]

8 Neotoma Uploads, Citations, and Usage
Last updated: July 2015
[Figures: Recent uploads to Neotoma; Pubs Citing Neotoma & Constituent DBs]
2014 Usage Statistics
- Neotoma Explorer: 1,918 unique users
- Neotoma APIs: 1,562 unique users
- Neotoma APIs: 241,469 requests

9 Neotoma Software Ecosystem
[Diagram centered on Neotoma DB; legend: Exists / In Development]
Functions: Data Preparation & Submission; Data Search & Retrieval; Data Exploration & Visualization; Data Archival
Components: Tilia; Data Submission Web Application; Neotoma Explorer; APIs; neotoma (R); Ice Age Mapper; Niche Viewer; Stratigraphic Diagrams; Downloadable Database Snapshots

10 Neotoma Governance (Proposed)
[Organization chart]
Bodies: Neotoma Leadership Council; Executive Team; Developer Team; Data Stewards; Users & Informaticists; Paleobiological Data Consortium; Training Workshops
Executive Team: Grimm, Williams + 1 more
Developer Team: Bills (lead), Anderson, Buckland, Davis, Goring, Grimm, Roth, Williams
Data types: Amoebae, Diatoms, Insects, Middens, Pollen, Plant Macros, Vertebrates, Biomarkers, Isotopes, Taphonomy, Ostracodes
Name groups, in the order extracted from the chart:
- Graham, Blois, Davis, Barnosky, Colburn, Etnier, Jacisin, Maguire, Milideo, Smith, Warren
- Josh Miller, Russ Graham
- Grimm, Williams, Bills + 1 Developer & 3 Data Stewards
- Bob Booth
- Betancourt, Holmgren, Latorre, Rylander
- Ashworth, Buckland, Punel
- Alison Smith, Brandon Curry
- Don Charles, Sonja Hausmann
- Bob Booth
- Suzanne Pilaar Birch, Chris Widja
- Jon Nichols
- Grimm, Bradshaw, Giesecke, Williams, Goring, Evans, Fletcher, Hopf, Markgraf, McGeever, Mitchell

11 Next Challenge: Organizing and Interconnecting the Middle Tail
C4P CINERGI Catalog: 224 Databases, 23 with geologic time metadata
http://pivots2.azurewebsites.net/c4p.html#pv-file-selection

12 EarthCube RCN: Cyberinfrastructure for Paleobioscience (C4P)
Goals
- Build new partnerships and collaborations among geoscientists and technologists
- Survey and catalog existing resources
- Share news of the latest advances in cyberscience and paleogeoinformatics
- Facilitate development of common standards and semantic frameworks

13 EarthCube RCN: Cyberinfrastructure for Paleobioscience (C4P)
C4P Activities
- Webinars & YouTube Channel: https://www.youtube.com/user/cyber4paleo
- CINERGI Catalog of paleoresources (databases, software, etc.): http://earthcube.org/content/cinergi-c4p-resource-viewer
- Paleobiology Workshop (May 2014)
- Geochronology Workshop (Oct 2014)
- Early Career Workshops: GSA 2014, 2015
- New Initiatives: Paleobiological Data Consortium (Neotoma/PBDB/…), PBDB-iDigBio, Open Core Data (CDSCO/IEDA/Neotoma/…)

14 PALEOBIOLOGICAL DATA CONSORTIUM
[Diagram; group labels: COMMUNITY, GEODATA, OPEN-SOURCE, BIODATA]
Members shown: Paleobiology DB, NOW DB, Continental Scientific Drilling Office (CDSCO), Digimorph, NOAA Paleoclimatology, DarwinCore, iDigPaleo, MorphoBank, Neotoma DB, VertNet, Early Career Members-at-Large, ROpenSci, GBIF/BISON, STEPPE, Open Geospatial Consortium, Integrated Earth Data Alliance, iDigBio, C4P
Aims
- Share best practices & protocols
- Build compatibility between geo- & bioinformatics

15 Current & Future Neotoma, C4P, & PDC Activities
1. Data Uploads (Neotoma; e.g. MIOMAP, Mexican Quaternary Mammal DB; ongoing)
2. All Hands Neotoma Workshop at AGU (Neotoma; Dec 2015)
3. One-Stop Queries for Neotoma & Paleobio DBs (Harmonized APIs & R packages) (PDC; ongoing)
4. Hackathon for Paleobiological Data (C4P; Summer 2016, invitations TBD!)
5. New tools for data visualization & exploration (Neotoma Taxa Mapper & Niche Viewer)

16 Sounds great! What’s in it for me?
1. Interested in using Neotoma to archive your data and make it available to others?
   - Catch me after session
   - Talk to a Data Steward
   - WebEx training for new Stewards
2. Interested in using Neotoma & other paleobio resources?
   - Neotoma Explorer walkthrough exercise: http://serc.carleton.edu/neotoma/activities.html
   - neotoma (R) paper (Goring et al. 2015 Open Quaternary)
   - User workshops: ESA2016, IBS2017
   - Hackathon Summer 2016
3. Interested in integrating your resource (software/DBs) with Neotoma & other paleobio resources?
   - Catch me after session
   - Hackathon Summer 2016

17 This talk represents the work of many
Neotoma PIs & Developers: Eric C. Grimm, Russ Graham, Mike Anderson, Allan Ashworth, Brian Bills, Jessica Blois, Bob Booth, Ed Davis, Don Charles, Simon Goring, Steve Jackson, Alison Smith, Jack Williams
C4P RCN Steering Committee: Kerstin Lehnert, David Anderson, Doug Fils, Leslie Hsu, Chris Jenkins, Anders Noren, Tom Olsewski, Dena Smith, Mark Uhen, Jack Williams
Paleobiological Data Consortium: Mark Uhen, Jack Williams, Brian Bills, Jessica Blois, Ed Davis, Simon Goring, Russ Graham, Michael McClennen, Shanan Peters, Alison Smith
Logos shown: Neotoma DB, C4P, Paleobio Data Consortium, NSF-Geoinformatics, NSF-Earth Cube

