Ecosystem Status Report: collaborating with IPython Notebooks

Presentation transcript:

NOAA's Northeast Shelf Ecosystem Status Report: collaborating with IPython Notebooks for reproducibility. July 2013. ECO-OP is supported by NSF Grant #0955649. PIs: Peter Fox (RPI) and Andrew Maffei (WHOI). NEFSC Collaborators: Jon Hare and Mike Fogarty. Software programmer: Massimo Di Stefano. Informatics and metadata: Stace Beaulieu, stace@whoi.edu. This lightning talk is for the demo at the rear of this room. I’ll give a brief overview here, and then please come see me or ?Mike for more info during the demo. SAY TITLE, SAY PIs and Massimo.

Adopting a provenance model for a collaborative report. What is provenance? Lineage, or the history of a data or information product, including how it was processed, who processed it, and where it is stored. Another title for our latest work could be SAY TITLE. Provenance refers to the history of a data product, including how it was processed, who processed it, and where it is stored or archived. Data provenance, from Wikipedia: “Scientific research is generally held to be of good provenance when it is documented in detail sufficient to allow reproducibility.[24] Scientific workflows assist scientists and programmers with tracking their data through all transformations, analyses, and interpretations.”

Use Case: Northeast Shelf Large Marine Ecosystem, Ecosystem Status Report. Goal: “traceability, repeatability, explanation, verification, and validation” for ecosystem data and information products in the NEFSC Ecosystem Status Report (ESR). Our use case is READ TITLE. RE-ITERATE GOALS.

Page from 2009 ESR: Section on Climate Forcing. Figures are available for download as PDF or image files, but without access to data or metadata. Note: NOAA directive for ISO 19115 metadata, which includes lineage. This is a page from the 2009 report. No need to read the details; just note that the report provides figures and text, in this case for climate indices, with the North Atlantic Oscillation (NAO) at upper left. READ RIGHT.

Software design to track data provenance. The output of the data pipeline is the figure for NAO from the report. Our software captures this plus all the metadata describing data processing. But provenance is also who did it, when, and where?
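
To make the idea of capturing a product together with its processing metadata concrete, here is a minimal sketch in Python. It is not the ECO-OP software: the function names, the record fields, the agent name, the file path, and the input values are all hypothetical, chosen only to show one way the "how, who, when, where" of a processing step could be recorded next to its output.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(values):
    """Return a hex digest that fingerprints a sequence of numbers."""
    payload = json.dumps(list(values)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_with_provenance(step_name, func, inputs, agent, location):
    """Run one processing step and return (output, provenance record)."""
    started = datetime.now(timezone.utc).isoformat()
    output = func(inputs)
    record = {
        "step": step_name,                    # how it was processed
        "agent": agent,                       # who processed it
        "location": location,                 # where the product is stored
        "started_at": started,                # when processing began (UTC)
        "input_sha256": sha256_of(inputs),    # fingerprint of the input data
        "output_sha256": sha256_of(output),   # fingerprint of the result
    }
    return output, record


def anomaly(series):
    """Subtract the mean so values are expressed as anomalies."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


# Made-up index values, for illustration only (not real NAO data).
index_values = [1.0, 2.0, 3.0]
result, prov = run_with_provenance(
    "anomaly",
    anomaly,
    index_values,
    agent="example-analyst",            # hypothetical agent name
    location="data/nao_anomaly.csv",    # hypothetical storage path
)
print(result)
print(json.dumps(prov, indent=2))
```

Hashing inputs and outputs is one possible design choice: it lets a later reader check that the data behind a figure are the same data the record describes.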

PROV Data Model and PROV-O ontology. http://www.w3.org/TR/prov-dm/ (W3C Recommendation, 30 April 2013). Core Structures (types and relations). Entity may be a single data product, or a chapter containing several data products. READ RED. Activity is the data processing that generated the figure or chapter of the report, and Agent is the person who provided the figure or chapter to the report. Workflow provenance (e.g., how to put together the collaborative report).
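
The three core PROV types and the relations between them can be sketched as plain subject/predicate/object statements. The property names (prov:wasGeneratedBy, prov:wasAssociatedWith, prov:wasAttributedTo, prov:hadMember) and classes come from PROV-O; the `ex:` identifiers and the `objects` helper are hypothetical and stand in for whatever names a real report would use. This is an illustration of the model, not the project's actual encoding.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical identifiers for one figure, its chapter, the processing
# step that produced the figure, and the person who contributed it.
figure = "ex:nao_figure"
chapter = "ex:climate_forcing_chapter"
processing = "ex:plot_nao_index"
contributor = "ex:contributing_scientist"

statements = [
    # Types: the three PROV core structures (a Collection is an Entity).
    Triple(figure, "rdf:type", "prov:Entity"),
    Triple(chapter, "rdf:type", "prov:Collection"),
    Triple(processing, "rdf:type", "prov:Activity"),
    Triple(contributor, "rdf:type", "prov:Agent"),
    # Relations between them.
    Triple(figure, "prov:wasGeneratedBy", processing),
    Triple(processing, "prov:wasAssociatedWith", contributor),
    Triple(figure, "prov:wasAttributedTo", contributor),
    # A chapter modeled as a collection that contains the figure.
    Triple(chapter, "prov:hadMember", figure),
]


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [
        t.obj for t in statements
        if t.subject == subject and t.predicate == predicate
    ]


# Trace the figure back to the activity and the agent behind it.
for activity in objects(figure, "prov:wasGeneratedBy"):
    agents = objects(activity, "prov:wasAssociatedWith")
    print(figure, "was generated by", activity, "run by", agents)
```

Walking the statements from an entity back to its activity and agent is the kind of lineage question the model is meant to answer.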

Screenshot of IPython Notebook used to track both data and workflow provenance. http://ipython.org/ Code in Python, MATLAB, R, other. Here I am showing a screenshot of an IPython Notebook that will output the climate forcing chapter of the 2013 Ecosystem Status Report. Provenance is tracked for each data product because the acquisition, processing, and plotting may all be conducted within this one environment.

Screenshot of IPython Notebook used to track both data and workflow provenance. http://ipython.org/ Notebook can be shared, or output as script, HTML, PDF, other. READ RED. The code may also include an output to PDF or to HTML for the final report.
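
A notebook file is a JSON document, which is what makes exporting it as a script straightforward. The sketch below shows the idea with only the standard library. It assumes the version 4 notebook layout (a top-level "cells" list), which postdates this talk; in practice the nbconvert tool does this job, and the `notebook_to_script` function and the tiny example notebook here are hypothetical.

```python
import json


def notebook_to_script(notebook_json):
    """Join the source of every code cell into one script string."""
    nb = json.loads(notebook_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):
            # Cell source may be stored as a list of lines.
            source = "".join(source)
        chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"


# A tiny made-up notebook with one markdown cell and two code cells.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Climate forcing"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["values = [1, 2, 3]\n", "total = sum(values)"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": "print(total)"},
    ],
})

script = notebook_to_script(example)
print(script)
```

Because the exported script contains exactly the code that was run, sharing the notebook shares the workflow along with the results.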

PDF output of IPython Notebook with clickable links to data and code. As an example of this work, which is being conducted in parallel to this year’s ESR, if you click on ‘data’ …

Screenshot of CSV file at GitHub. Access not only to the data that are plotted, but also to provenance metadata for reproducibility. READ RED.

Data provenance: from environmental data (left) to marine ecosystem indicator (right)