AN ORGANISATION FOR A NATIONAL EARTH SCIENCE INFRASTRUCTURE PROGRAM Virtual Geophysics Laboratory (VGL): Scientific workflows Exploiting the Cloud Josh.

Slides:



Advertisements
Similar presentations
MINERALS DOWN UNDER Using Spatial Data Infrastructure (SDI) to: enable interoperable data exchange & usage to deliver corporate decisions Ryan Fraser 19.
Advertisements

Ryan Fraser (CSIRO), Lesley Wyborn (GA), Richard Chopping (GA), Terry Rankine (CSIRO), Robert Woodcock (CSIRO) MINERALS DOWN UNDER Virtual Geophysics Laboratory.
SAN DIEGO SUPERCOMPUTER CENTER Choonhan Youn Viswanath Nandigam, Nancy Wilkins-Diehr, Chaitan Baru San Diego Supercomputer Center, University of California,
Integrated Scientific Workflow Management for the Emulab Network Testbed Eric Eide, Leigh Stoller, Tim Stack, Juliana Freire, and Jay Lepreau and Jay Lepreau.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Overview of Atmosphere.
Virtual Geophysics Laboratory (VGL) VGL v1.1 Launch Ryan Fraser, Terry Rankine, Joshua Vote, Lesley Wyborn, Ben Evans, Robert Woodcock February 2013 CSIRO.
Virtual Geophysics Laboratory (VGL) VGL v1.2 NeCTAR Project Close R.Fraser, T.Rankine, J.Vote, L.Wyborn, B.Evans, R.Woodcock, C.Kemp July 2013 CSIRO |
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop Discovery Environment Overview.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 18 Slide 1 Software Reuse 2.
Customized cloud platform for computing on your terms !
SICSA student induction day, 2009Slide 1 Social Simulation Tutorial Session 6: Introduction to grids and cloud computing International Symposium on Grid.
WPS Application Patterns at the Workshop “Models For Scientific Exploitation Of EO Data” ESRIN, October 2012 Albert Remke & Daniel Nüst 52°North Initiative.
EC Grant Agreement no GEOSS Interoperability for Weather Ocean and Water Enhancing the GEOSS Infrastructure for all the Stakeholders.
User requirements for and concerns about a European e-Infrastructure Steven Newhouse, Director.
Virtual Geophysics Laboratory Exploiting the Cloud and Empowering Geophysicists Ryan Fraser, Terry Rankine, Lesley Wyborn, Joshua Vote, Ben Evans. Presented.
, Implementing GIS for Expanded Data Accessibility and Discoverability ASDC Introduction The Atmospheric Science Data Center (ASDC) at NASA Langley Research.
Ryan Fraser, Nicholas Carr Connecting RDSI, NeCTAR and ANDS via Provenance - VHIRL.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Overview of Atmosphere.
Flexibility and user-friendliness of grid portals: the PROGRESS approach Michal Kosiedowski
material assembled from the web pages at
National Earth Science Infrastructure Program AuScope Limited Headquarters School of Earth Sciences University of Melbourne Victoria 3010 Tel
Introduction to Apache OODT Yang Li Mar 9, What is OODT Object Oriented Data Technology Science data management Archiving Systems that span scientific.
Linking AuScope to the broader minerals industry value chain Jonathan Law, Robert Woodcock, Ryan Fraser, Terry Rankine, Guillaume Duclaux.
AN ORGANISATION FOR A NATIONAL EARTH SCIENCE INFRASTRUCTURE PROGRAM The Spatial Information Services Stack – infrastructure for the AuScope Community Earth.
Virtual Geophysics Laboratory Scientific workflows exploiting the cloud Ryan Fraser, Terry Rankine, Lesley Wyborn, Joshua Vote, Ben Evans... Presented.
Address Maps and Apps for State and Local Governments
National Spatial Data Infrastructure The Spatial Information Services Stack Dr Robert Woodcock.
IPlant cyberifrastructure to support ecological modeling Presented at the Species Distribution Modeling Group at the American Museum of Natural History.
Virtual Laboratories VGL and Friends R.Fraser, T.Rankine, J.Vote, R.Woodcock AuScope Grid Roadshow 2014 CSIRO | MINERAL RESOURCES FLAGSHIP.
Digital Earth Communities GEOSS Interoperability for Weather Ocean and Water GEOSS Common Infrastructure Evolution Roberto Cossu ESA
Future Directions MINERALS DOWN UNDER SISS – Spatial Information Services Stack Ryan Fraser| Project Lead 20 th March 2012.
Lessons from SEEGrid/AuScope Grid Bruce Simons GeoScience Victoria.
AuScope Spatial Data Infrastructure Supporting Earth Science Dr Robert Woodcock CSIRO.
Interoperability Grids, Clouds and Collaboratories Ruth Pordes Executive Director Open Science Grid, Fermilab.
Technical Session Virtual Geophysics Laboratory MINERAL RESOURCES Josh Vote | Software Developer September 2014.
Using Biological Cyberinfrastructure Scaling Science and People: Applications in Data Storage, HPC, Cloud Analysis, and Bioinformatics Training Scaling.
240-Current Research Easily Extensible Systems, Octave, Input Formats, SOA.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
Portable Infrastructure for the Metafor Metadata System Charlotte Pascoe 1, Gerry Devine 2 1 NCAS-BADC, 2 NCAS-CMS University of Reading PIMMS provides.
The Astronomy challenge: How can workflow preservation help? Susana Sánchez, Jose Enrique Ruíz, Lourdes Verdes-Montenegro, Julian Garrido, Juan de Dios.
Interoperability from the e-Science Perspective Yannis Ioannidis Univ. Of Athens and ATHENA Research Center
Virtual Geophysics Laboratory (VGL) VGL v1.2 NeCTAR Project Close Ryan Fraser, Terry Rankine, Joshua Vote, Lesley Wyborn, Ben Evans, Robert Woodcock July.
Mantid Stakeholder Review Nick Draper 01/11/2007.
AN ORGANISATION FOR A NATIONAL EARTH SCIENCE INFRASTRUCTURE PROGRAM Virtual Geophysics Laboratory (VGL): Scientific workflows Exploiting the Cloud Josh.
AN ORGANISATION FOR A NATIONAL EARTH SCIENCE INFRASTRUCTURE PROGRAM The NCRIS AuScope Community Earth Model Bruce Simons.
| nectar.org.au NECTAR TRAINING Module 2 Virtual Laboratories and eResearch Tools.
Mike Hildreth DASPOS Update Mike Hildreth representing the DASPOS project 1.
Esri UC 2015 | Technical Workshop | Community Parcels Chris Buscaglia.
Web Technologies Lecture 13 Introduction to cloud computing.
AN ORGANISATION FOR A NATIONAL EARTH SCIENCE INFRASTRUCTURE PROGRAM “Building Clients for the AuScope Spatial Information Services Stack (SiSS)” AuScope.
Globus.org/genomics Globus Galaxies Science Gateways as a Service Ravi K Madduri, University of Chicago and Argonne National Laboratory
Project number: ENVRI and the Grid Wouter Los 20/02/20161.
EUROPEAN UNION Polish Infrastructure for Supporting Computational Science in the European Research Space The Capabilities of the GridSpace2 Experiment.
AN ORGANISATION FOR A NATIONAL EARTH SCIENCE INFRASTRUCTURE PROGRAM AuScope Grid Architecture “Where does your architecture fit in with the big picture?”
Esri UC 2015 | Technical Workshop | Community Addresses Chris Buscaglia.
Software tools for digital LLRF system integration at CERN 04/11/2015 LLRF15, Software tools2 Andy Butterworth Tom Levens, Andrey Pashnin, Anthony Rey.
Nci.org.au © National Computational Infrastructure 2016 Virtual Laboratories in Australia Lesley Wyborn (NCI) With contributions from:
Enabling Digital Earth by focussing on ‘accessibility’ rather than ‘delivery’. Ryan Fraser CSIRO.
Esri UC 2014 | Technical Workshop | Address Maps and Apps for State and Local Government Allison Muise Nikki Golding Scott Oppmann.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Overview of Atmosphere.
Research and Service Support Resources for EO data exploitation RSS Team, ESRIN, 23/01/2013 Requirements for a Federated Infrastructure.
CyVerse Workshop Discovery Environment Overview. Welcome to the Discovery Environment A Simple Interface to Hundreds of Bioinformatics Apps, Powerful.
EGI-InSPIRE EGI-InSPIRE RI The European Grid Infrastructure Steven Newhouse Director, EGI.eu Project Director, EGI-InSPIRE 29/06/2016CoreGrid.
Accessing the VI-SEEM infrastructure
Pasquale Pagano (CNR-ISTI) Project technical director
Tools and Services Workshop
Tools and Services Workshop Overview of Atmosphere
A Web-enabled Approach for generating data processors
VHIRL: Virtual Hazard Impact and Risk Laboratory – “A reuse story”
Virtual Geophysics Laboratory (VGL): Exploiting the Cloud and HPC
Presentation transcript:

AN ORGANISATION FOR A NATIONAL EARTH SCIENCE INFRASTRUCTURE PROGRAM Virtual Geophysics Laboratory (VGL): Scientific workflows Exploiting the Cloud Josh Vote, Ryan Fraser, Terry Rankine Ben Evans, Michael Chapman Lesley Wyborn, Richard Chopping CSIRO Workshop on Workflows, High-Throughput Imaging, Visualisation and Accelerated Computing. October 11-14, 2011

Scientific workflow – Virtual Geophysics Laboratory (VGL) Scientific Workflow Engine (or Virtual Laboratory) Automates and massively expands Geophysicists computational capacity via the Cloud –Amazon – EC2 / S3 (and others using this interface) –OpenStack Collaboration between CSIRO, GA, NCI, Monash, UQ, and ANU VGL is just a pretty face –User Driven GUI –Leverages data providers and cloud technologies to do all the heavy lifting Non-Restrictive Open-source 2

AuScope Grid and SISS AuScope Grid: Charged with delivering software infrastructure to enable Geoscience Community –Establishing governance and sustainability of services –Delivering solutions to research and government organisations –Interoperability Spatial Information Services Stack (SISS) An open-source, open-standards SDI Achieves Interoperable data exchange Standardises on 3 components – Format, Content, Tools

V(what)GL VEGL – Virtual Exploration Geophysics Laboratory –One primary science collaboration –One primary workflow –One collection of geophysical data sets VGL - Virtual Geophysics Laboratory –Collaboration with multiple partners –Supporting multiple workflows –New data sets –New data types –New Use – Not just exploration. Done

Geophysics (as seen by a software dev) Geophysics is taking physical measurements… –Magnetism over an area –Acceleration due to gravity over an area …Applying lots of mathematics… …to infer the structure of what is under the surface… It is not geology –Samples are never taken, only measurements 5 Apply Mathematics Raw data But -- Where is all the gold?

Our Geophysics Problem Measurements coming from the field are ‘raw’ –Varying spatial reference systems –Noisy –Artifacts from collection process This data needs processing –From raw data to a data product Data products are valuable –They will be re-used and referenced repeatedly Processing is a time consuming process –Made worse by a purely manual workflow 6

The Past Compile raw data using proprietary FORTRAN –Also use software – Intrepid Transform to a regular grid using more software –MATLAB, Intrepid, ER Mapper, ESRI ArcGIS, QGIS Crop data spatially to suit final data product –eg: everything in Victoria Transform data into a file format that can be read by proprietary scientific code. –This is usually done with some handwritten python or c –There is no version control, code is often rewritten / redone Upload data to HPC –Manually enter input parameters/start job 7

Hardcopy of data SSH Client MATLAB Intrepid 8 Let’s map it out… Transform to a regular grid Crop data to area of interest Reformat data for processing Upload data to NCI Configure job and start processing Download results Get handed field data Visualise data

There seems to be a problem… Reproducibility – there is none What was inputted into your model? What transformations occurred? It’s a manual process Time consuming Error prone Expensive Licensing costs 9

Our solution Virtual Geophysics Laboratory Reproducibility –All input data is saved and then published with the final data product VGL automates portions of the workflow –Allowing scientists to focus on science Built entirely on open source tools –No licensing costs 10

11

Data Discovery 12

Data Selection 13

Script Builder 14

Job Monitoring 15

Published provenance records 16

17 From this… Hardcopy of data SSH Client MATLAB Intrepid Transform to a regular grid Crop data to area of interest Reformat data for processing Upload data to NCI Configure job and start processing Download results Get handed field data Visualise data

…to this Virtual Geophysics Laboratory Build “science” from existing libraries Run job Collect and publish results Discover raw data Select spatial bounds

VGL - Summary VEGL has been built for a Geophysics workflow –Its concepts can be re-used for other scientific workflows Is in the process of being deployed at GA –It can produce actual scientific data products Is capable of integrating with any SISS data provider –Or any provider that understands the OGC standards It’s built from many ‘generic’ components that can be repurposed Is just a pretty face –The power lies with the underlying services –These services are accessed using standardised protocols 19

Future Work VGL - Virtual Geophysics Laboratory –Collaboration with multiple partners –Supporting multiple workflows –New data sets –New data types –New Use – Not just exploration. Exploiting the generic –Modularising the workflow for general scientific usage Repurposing for other use cases – nature hazards, climate prediction, etc Commercial uptake Integration with other VLs to achieve ultimate aim… 20

Sustainable Energy Policy Societal Need Energy Exploration Integrated Virtual Laboratory Fishery adaptation V. Lab Integrated Virtual Labs Virtual Geophysical Laboratory Virtual Core Laboratory Virtual Geodesy Laboratory Virtual Climate Laboratory Virtual Fisheries Laboratory Virtual Laboratories GeophysicsBorehole dataGeodesy Climate Modelling Fisheries Monitoring Virtual Libraries Processing Services Data Middleware Processing Services Data Middleware Processing Services Data Middleware Processing Services Data Middleware Processing Services Data Middleware Modelling & analytic tools Virtual Libraries to Laboratories 21

Thank you and for more information: CSIRO Earth Science & Resource Engineering Josh Vote Software Developer Phone: Web: