Deepcarbon.net Xiaogang (Marshall) Ma, Yu Chen, Han Wang, John Erickson, Patrick West, Peter Fox Tetherless World Constellation Rensselaer Polytechnic.

Presentation transcript:

Deep Carbon Virtual Observatory: Leveraging Data Science to Facilitate Earth Science Research
Xiaogang (Marshall) Ma, Yu Chen, Han Wang, John Erickson, Patrick West, Peter Fox
Tetherless World Constellation, Rensselaer Polytechnic Institute
deepcarbon.net

Outline
– Deep Carbon Virtual Observatory
– Data Management, Publication and Citation
– Provenance of Research
– Era of Science 2.0

Deep Carbon Virtual Observatory
A vision of the DCVO (Fox et al., 2014):
– A conceptual model of the interplay between data, people, publications, instruments, models, organizations, etc.
– Identify, annotate and link all key entities, agents and activities
– A repository for datasets and associated metadata
– Unique and powerful data and metadata visualization for dissemination of information
– Collaboration tools for scientific efforts
– An integrated portal for diverse content and applications

Data Management
data work
Image courtesy Randy Glasbergen

Data Management Plan
DCO Open Access and Data Policies – Data Management Plan
– A formal document that outlines what you will do with your data during your research and after you complete it
Resources and tools that help create DMPs:
– DCC Data Management Plans
– NSF Data Management Plan Requirements
– DMPTool
– DCC DMPOnline

Data Publication
Data as first-class products of research
– e.g., NSF bio-sketches can include data publications (NSF, 2012)
Image from j4h.net

“All data necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science.”
“…authors are required to make materials, data and associated protocols promptly available to readers without undue qualifications.”
“…authors must make materials, data, and associated protocols available to readers.”
“…it is a condition of publication that authors make available the data and research materials supporting the results in the article.”
“…require authors to make all data underlying the findings described in their manuscript fully available without restriction…”
“Earth and space science data should be widely accessible in multiple formats and long-term preservation of data is an integral responsibility of scientists and sponsoring institutions.”
“…support the principle that research data should be made freely available to all researchers…”
“…recommends depositing data that correspond to journal articles in reliable data repositories…”

Ways of data publication (Strasser, 2014)
– Data as supplemental material of a paper
– Standalone data
– Data paper: data + descriptive ‘data paper’
Examples:
– Standalone data journals: Nature Scientific Data, Geoscience Data Journal, Ecological Archives, Data in Brief
– Journals that publish data papers: Earth and Space Science, GigaScience, F1000 Research, Internet Archaeology

What does a DCO data publication look like?

Data Citation
Data Citation Index
– Indexes the world's leading data repositories
– Connects datasets to related refereed literature indexed in the Web of Science™
– Provides efficient access to data across subjects and regions

Data interoperability
Ma et al., Nature Geoscience (2011)
Interoperable: “Data should be discoverable, accessible, decodable, understandable and usable, and data sharing should be legal and ethical for all participants.”
Original image from:

Provenance of research
Provenance documentation
– Linking a range of observations and model outputs, research activities, people and organizations involved in the production of scientific findings with the supporting data sets and methods used to generate them
Provenance enables the traceability, reproducibility, explanation, verification, and validation of scientific findings.
Image from nature.com
Ma et al., Nature Climate Change (2014)

We extended the IPython Notebook to enable automatic provenance capture during a scientific workflow (Di Stefano et al., 2014)
IPython Notebook: a web-based interactive computational environment
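The idea of automatic provenance capture can be illustrated with a minimal Python sketch. This is not the IPython Notebook extension described on the slide; it is only an assumed, simplified stand-in in which a decorator records what each workflow step used and generated. The names `capture_provenance`, `PROVENANCE_LOG`, `normalize` and `mean` are hypothetical.

```python
import functools
from datetime import datetime, timezone

# Hypothetical in-memory provenance store; a real system would persist this.
PROVENANCE_LOG = []


def capture_provenance(func):
    """Record the inputs, output and timing of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = datetime.now(timezone.utc).isoformat()
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "activity": func.__name__,                 # the step that ran
            "used": {"args": args, "kwargs": kwargs},  # what it consumed
            "generated": result,                       # what it produced
            "started_at": started,
            "ended_at": datetime.now(timezone.utc).isoformat(),
        })
        return result
    return wrapper


@capture_provenance
def normalize(values):
    """Scale values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


@capture_provenance
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# A two-step workflow; each call is logged without extra work by the user.
shares = normalize([2, 3, 5])
average = mean(shares)
```

Because the output of `normalize` is the very object passed to `mean`, the log is enough to trace the second result back to the first step, which is the kind of traceability the provenance slide refers to.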

Era of Science 2.0
Science 2.0
– New practices of scientists who post raw experimental results, nascent theories, claims of discovery and draft papers on the Web for others to see and comment on
– Social scholarship: Reconsidering scholarly practices in the age of social media
(Waldrop, 2008; Greenhow and Gleason, 2014)
Practice

altmetric.com
– already a product used by NPG, Springer, PNAS, Wiley, etc.
This Altmetric score means that the article is:
– in the 99th percentile (ranked 184th) of the 81,261 tracked articles of a similar age in all journals
– in the 92nd percentile (ranked 69th) of the 983 tracked articles of a similar age in Nature
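The percentiles quoted above are consistent with a simple rank-based reading, sketched here in Python. The formula is an assumption for illustration (the share of tracked articles ranked below this one, rounded down); Altmetric's actual calculation is not given in the slides, and `rank_percentile` is a hypothetical helper.

```python
def rank_percentile(rank: int, total: int) -> int:
    """Whole-number percentile: share of articles ranked below `rank`.

    Assumed formula, not Altmetric's documented method.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) * 100 // total


# Figures taken from the slide.
all_journals = rank_percentile(184, 81_261)  # ranked 184th of 81,261
nature_only = rank_percentile(69, 983)       # ranked 69th of 983
print(all_journals, nature_only)
```

Under this assumption the two values come out as 99 and 92, matching the figures on the slide.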

Summary
Data Science is making DCO a more open, more collaborative, and more productive community
eScience: the digital or electronic facilitation of science
Are you ready?
Image courtesy BGS © NERC

Thank you!