Biological Oceanography Scientific Domain Ed DeLong MIT Department of Biological Engineering Department of Civil and Environmental Engineering DataSpace.

Presentation transcript:

Biological Oceanography Scientific Domain. Ed DeLong, MIT Department of Biological Engineering, Department of Civil and Environmental Engineering. DataSpace

BIOLOGICAL OCEANOGRAPHY: Coupling of physical & biological oceanographic processes; Comparative ecosystem analysis; Biodiversity, biomass and productivity; C-N-P cycling and energy flow; Production, consumption of greenhouse gases: climate; Measurement, modeling and experiments with microbial communities in the sea; Education, training and knowledge exchange

Scope & Scale: Challenges in Biological Oceanography (Genomes to Biomes…). Oceanographic sampling approaches in the context of scales. Microbial and sampling scales, based on Dickey (1991) and Allen (2000): Ricardo Letelier

ADVANCED INSTRUMENTATION: Continuous, autonomous collection of 4D physical, chemical and bio-optical datasets

2 Eddies 1 frontal system Sub-mesoscale features? Higher Chla below cyclone DCM constant Patchy distribution of small particles Advection/local production of small particles in the Z e

Further specialization: Marine Metagenomics. Traditional microbiology and microbial genome sequencing studies rely on cultivated cultures. Marine metagenomics: DNA sequences of microbial assemblages from the environment. Metagenomic data is used by scientists across multiple disciplines, e.g.: Biological engineering & biotechnology; Genomics and computational biology; Ecology and environmental science; Climate: relationship between marine microbes & the ocean's carbon cycle, productivity, greenhouse gases

[Figure: H179_454DNA_vs_Pelagibacter; panels at 25 m, 75 m and 125 m]

2nd Gen Sequencing Platforms
Platform: AB | FLX/titan. | ILLUMINA
Cost per run: ~$50 | <$12K | <$5K
Bases read/run: 72 Kbp; 100 Mbp; 500 Mbp; >2 Gbp; >200 Gbp !!!
Bases per read: >36 (> Paired end reads)
Reads per run: 96 reads/run | 400K reads/run | 20M reads/run
$ per Mbp: $694 | $120 | $7
AB3730 work equivalent: - | 100x AB3730/dy | 300x AB3730/dy
Errors: Diverse (cloning bias) | Homopolymeric runs | Diverse (base subn.)
Run time: 1 hour | 6.5 hours | 2-14 days*
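The "$ per Mbp" figures on this slide appear to follow from dividing cost per run by megabases read per run. The short Python sketch below checks that arithmetic for the first two platforms. It assumes the ~$50 run is paired with 72 Kbp and the <$12K run with 100 Mbp; that pairing is an inference from the slide's ordering, not something the slide states. The function name `cost_per_mbp` is hypothetical.

```python
def cost_per_mbp(cost_per_run_usd: float, bases_per_run_bp: float) -> float:
    """Return sequencing cost in USD per megabase (1 Mbp = 1e6 bp)."""
    return cost_per_run_usd / (bases_per_run_bp / 1e6)


# Assumed pairings taken from the slide's ordering (approximate values).
ab3730 = cost_per_mbp(50, 72_000)           # ~$50 per run, 72 Kbp per run
flx = cost_per_mbp(12_000, 100_000_000)     # <$12K per run, 100 Mbp per run

print(round(ab3730), round(flx))
```

Under these assumptions the results round to the slide's $694 and $120 per Mbp. Since the slide gives its run costs as approximations or upper bounds, the match is a consistency check rather than an exact derivation.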

Biological Oceanography Data Challenges: Wide variety and heterogeneity of data types (Oceanographic cruise data; Oceanographic time series data; Laboratory & field experiments; Remote sensing datasets; Data from gliders, AUVs & moorings; Genomics, metagenomics, gene expression data; Numerical simulations & synthesis products). Distributed data (multi-institution & researchers). Need to balance PI, project & public data accessibility. Data visualization & analysis needs. Long term archiving requirements.

Why do biological oceanographers need DataSpace?

UH  MIT  OSU  UCSC  WHOI  MBARI

DataSpace partners: MIT-OSU. Oceanographic Science Partners: Ed DeLong (MIT) & Ricardo Letelier (OSU). Library IT Partners: MacKenzie Smith (MIT) & Terry Reese (OSU). DeLong and Letelier are Co-PIs on three major projects: Center for Microbial Oceanography: Research and Education (C-MORE); Microbial Oceanography of Oxygen Minimum Zones (MOOMZ); Microbial diversity and activity in seasonal hypoxic waters (MI-LOCO)

Existing Data Portal: currently a distributed approach, consisting of weblinks to individually managed heterogeneous datasets.

Where is the data now? (Oceanographic data) Biological and Chemical Oceanography Data Management Office (BCO-DMO) database

Where is the data now? (Genomic/metagenomic) Public databases: NCBI (National Center for Biotechnology Information) and CAMERA (Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis). In-house databases …

Why do biological oceanographers need DataSpace? Data access, storage and search are not centralized. Large heterogeneous datasets. Complex data management/sharing requirements. Data shared across multiple institutions & investigators. Long term requirements (2017). Need for cross-investigator, cross-institution and cross-project search. Currently lots of data is "lost", i.e., not utilizable.

Why do biological oceanographers need DataSpace? How many autonomous surveys, cruises, mooring datasets, hydrocasts and deckboard experiments had chlorophyll concentrations higher than X? Of those data, how many had light levels and oxygen concentrations corresponding to Y and Z? Of those data, how many have corresponding microbial community taxonomic composition and gene content data? (retrieve) What is the relationship between light, chlorophyll, oxygen and microbial community taxonomic composition and gene content, across all datasets? How do taxa and gene content relate to oxygen levels and the balance of production and consumption? To greenhouse gas (GHG) levels? Are there specific gene proxies that predict oxygen or GHG levels? Note: centralized data access, search and storage will also drive the way we (scientists) ask our questions, and collect and annotate our data. = A collaboration between scientists, IT, curators and database managers.
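The chained questions above amount to successive filters over a single, searchable collection of sample records. The Python sketch below illustrates that idea only; it is not a description of how DataSpace is implemented. The `Sample` record, its field names, the thresholds and the four example records are all invented placeholders.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    dataset: str            # hypothetical source label (cruise, mooring, ...)
    chlorophyll: float      # placeholder units
    light: float            # placeholder units
    oxygen: float           # placeholder units
    has_taxonomy: bool      # taxonomic composition data available?
    has_gene_content: bool  # gene content data available?


def above_chlorophyll(samples, x):
    """Step 1: keep samples with chlorophyll concentration higher than X."""
    return [s for s in samples if s.chlorophyll > x]


def in_light_and_oxygen(samples, light_range, oxygen_range):
    """Step 2: keep samples whose light and oxygen fall in ranges Y and Z."""
    (l_lo, l_hi), (o_lo, o_hi) = light_range, oxygen_range
    return [s for s in samples
            if l_lo <= s.light <= l_hi and o_lo <= s.oxygen <= o_hi]


def with_community_data(samples):
    """Step 3: keep samples that also have taxonomy and gene content data."""
    return [s for s in samples if s.has_taxonomy and s.has_gene_content]


# Invented example records, for illustration only.
samples = [
    Sample("cruise_A", 0.8, 120.0, 210.0, True, True),
    Sample("mooring_B", 0.2, 100.0, 200.0, True, True),
    Sample("hydrocast_C", 0.9, 150.0, 40.0, False, False),
    Sample("cruise_D", 1.1, 110.0, 205.0, True, False),
]

step1 = above_chlorophyll(samples, x=0.5)
step2 = in_light_and_oxygen(step1, (50.0, 200.0), (180.0, 250.0))
step3 = with_community_data(step2)
print(len(step1), len(step2), len(step3))
```

Each step narrows the previous result, which is only possible when records from different investigators and institutions share one searchable store with consistent annotations.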

The DataSpace Project & Biological Oceanography: Provide infrastructure for digital archiving & preservation at appropriate scales matching scope/complexity of data. Enable more integrated intra- & inter-project collaborations, analyses, data encoding, documentation, sharing, visualizing, and preservation. Establish standards & best practices to capture, express, encode and publish the policies related to archived data. Enable new discoveries by facilitating access, search and storage of large, complex heterogeneous datasets.

The DataSpace Project & Biological Oceanography. GENOMES to BIOMES: Community genomic and transcriptomic data; Community metabolism; Ecosystem functions; Community composition and interactions