EOSDIS Approach to Data Services in the Cloud

Presentation transcript:

EOSDIS Approach to Data Services in the Cloud

Data Transformation Services in the Cloud
- Subsetting: Variable, Spatial, Temporal
- Reformatting: shapefile, etc.
- Regridding / Reprojection / Orthorectification
- Stitching / Mosaicking
- Dataset-Specific Preprocessing
  - Despeckling for Synthetic Aperture Radar
  - Geophysical Retrievals
  - Etc.
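As a rough illustration of the three subsetting axes named on this slide (variable, spatial, temporal), here is a minimal pure-Python sketch. The `Granule` record, the `subset` function, and the sample values are hypothetical and exist only for this example; an actual service would more likely operate on netCDF or similar files through a library such as xarray.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Granule:
    """One hypothetical observation: a time, a location, and named variables."""
    time: datetime
    lat: float
    lon: float
    variables: dict


def subset(granules, names, bbox, start, end):
    """Keep only the requested variables, bounding box, and time window.

    bbox is (west, south, east, north) in degrees; the time window is
    half-open, so start <= time < end.
    """
    west, south, east, north = bbox
    result = []
    for g in granules:
        in_box = west <= g.lon <= east and south <= g.lat <= north  # spatial
        in_time = start <= g.time < end                             # temporal
        if in_box and in_time:
            kept = {k: v for k, v in g.variables.items() if k in names}  # variable
            result.append(Granule(g.time, g.lat, g.lon, kept))
    return result


# Made-up sample data, for illustration only.
granules = [
    Granule(datetime(2018, 1, 1), 10.0, 20.0, {"sst": 290.1, "wind": 5.2}),
    Granule(datetime(2018, 1, 2), 50.0, 20.0, {"sst": 280.4, "wind": 7.9}),
    Granule(datetime(2018, 2, 1), 12.0, 22.0, {"sst": 291.3, "wind": 4.1}),
]
january_tropics = subset(
    granules,
    names={"sst"},
    bbox=(0.0, 0.0, 30.0, 30.0),
    start=datetime(2018, 1, 1),
    end=datetime(2018, 2, 1),
)
```

Only the first record survives all three filters: the second falls outside the bounding box and the third outside the time window.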

What Makes Cloud Different?
What’s New? | So What?
Data egress costs money | Subsetting saves money
Data processing costs money | We have to watch costs of transformations
Processing faster does not cost more | Transformations that used to be orders may be streamable (synchronous)
Transformation code is easily shared via containers or machine images | Users can transform at their own speed
Data are stored in Web Object Storage, not filesystems | Current tools (may) need to be adapted to read the input data
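Two of the points above lend themselves to simple arithmetic: egress is billed by volume, so subsetting reduces the bill, and compute is billed by instance-hours, so running more instances for less time costs about the same. The sketch below assumes flat per-GB and per-instance-hour prices; both rates are hypothetical placeholders, not actual cloud pricing, and the function names are invented for this example.

```python
# Hypothetical prices, for illustration only; real cloud pricing varies.
ASSUMED_EGRESS_USD_PER_GB = 0.09
ASSUMED_USD_PER_INSTANCE_HOUR = 0.10


def egress_cost(size_gb, usd_per_gb=ASSUMED_EGRESS_USD_PER_GB):
    """Cost of moving size_gb out of the cloud at a flat per-GB rate."""
    return size_gb * usd_per_gb


def subsetting_savings(full_gb, subset_fraction,
                       usd_per_gb=ASSUMED_EGRESS_USD_PER_GB):
    """Egress cost avoided when only subset_fraction of the data is sent."""
    full = egress_cost(full_gb, usd_per_gb)
    reduced = egress_cost(full_gb * subset_fraction, usd_per_gb)
    return full - reduced


def processing_cost(instances, hours,
                    usd_per_instance_hour=ASSUMED_USD_PER_INSTANCE_HOUR):
    """Compute cost under simple instance-hour billing."""
    return instances * hours * usd_per_instance_hour


# Sending 5% of a 100 GB collection instead of all of it.
saved = subsetting_savings(full_gb=100, subset_fraction=0.05)

# One instance for ten hours versus ten instances for one hour.
slow = processing_cost(instances=1, hours=10)
fast = processing_cost(instances=10, hours=1)
```

Under these assumptions `slow` and `fast` come out equal, which is why a transformation that once had to be placed as an order can be a candidate for synchronous delivery.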

User Interaction Patterns
- Synchronous streaming (subsetting 1 file): request, then data stream (100100110001010111...)
- Synchronous staging (preprocessing 1 file to Analysis-Ready Data): request, then data “handle”
- Asynchronous staging (aggregating many files): request, then data “handle”
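The difference between streaming a result and receiving a data "handle" can be sketched with standard-library Python. Everything here is a hypothetical stand-in: `stream_subset` plays the role of a synchronous streaming subsetter, and `StagingService` plays the role of an asynchronous staging service whose handle is just a UUID string. Neither reflects an actual EOSDIS interface.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


def stream_subset(data, start, stop, chunk_size=4):
    """Synchronous streaming: yield the requested byte range chunk by chunk."""
    for offset in range(start, stop, chunk_size):
        yield data[offset:min(offset + chunk_size, stop)]


class StagingService:
    """Asynchronous staging: submit() returns a handle immediately and the
    aggregated result is fetched later with that handle."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._jobs = {}

    @staticmethod
    def _aggregate(files):
        # Stand-in for stitching or aggregating many files.
        return b"".join(files)

    def submit(self, files):
        handle = str(uuid.uuid4())
        self._jobs[handle] = self._pool.submit(self._aggregate, files)
        return handle

    def is_ready(self, handle):
        return self._jobs[handle].done()

    def fetch(self, handle):
        # Blocks until the job has finished.
        return self._jobs[handle].result()

    def close(self):
        self._pool.shutdown()


# Pattern 1: the response body is the subset itself.
file_bytes = b"0123456789abcdef"
streamed = b"".join(stream_subset(file_bytes, 2, 12))

# Pattern 3: the response is a handle; the data are retrieved afterwards.
service = StagingService()
handle = service.submit([b"granule-1;", b"granule-2;", b"granule-3"])
staged = service.fetch(handle)
service.close()
```

Synchronous staging sits between the two: the caller waits for the job to finish but still receives a handle rather than the bytes.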

How Reuse Can Work
- Source Code
- Package Installation (conda, homebrew, …)
- Python module
- Container
- Amazon Machine Image (AMI)
- Service

Reuse Targets
- Legacy Source
  - gdal: the core of virtually every Geographic Information System
  - nco (netCDF* Operators): fast netCDF preprocessing and analysis
  - Sentinel Application Platform (SNAP): easy-to-use Synthetic Aperture Radar and other processing
  - Open Geospatial Consortium Services
- Recent Packages
  - Python: pandas, xarray, scikit-learn…
  - R: ?
- Future: Analysis-Ready Data processing components and chains
*netCDF = Network Common Data Form

Managing Cost
- Egress vs. Processing vs. Storage
- “Easy” Calls:
  - Promote subsetting and other data reduction
  - Promote analysis “in place”
- Harder tradeoffs
  - How much to do for the user?
  - How much to cache?
- New Tasks
  - Developing the most cost-effective data transformation capabilities
  - Monitoring ongoing expenditures vs. budget
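The "how much to cache?" question and the budget-monitoring task can each be reduced to a simple comparison, at least as a first approximation. The sketch below is one possible framing, not a method given in the presentation: it assumes flat storage pricing, a known recompute cost per request, and straight-line spending across the budget period. All function names, rates, and thresholds are hypothetical.

```python
def should_cache(recompute_usd, expected_requests, size_gb,
                 storage_usd_per_gb_month, months):
    """Cache a transformed product when storing it for the period is cheaper
    than recomputing it for every expected request."""
    storage_cost = size_gb * storage_usd_per_gb_month * months
    recompute_cost = recompute_usd * expected_requests
    return storage_cost < recompute_cost


def budget_status(spent_usd, budget_usd, elapsed_fraction):
    """Compare spending so far with a straight-line share of the budget.

    elapsed_fraction is the portion of the budget period that has passed,
    between 0.0 and 1.0.
    """
    expected_so_far = budget_usd * elapsed_fraction
    if spent_usd > budget_usd:
        return "over budget"
    if spent_usd > expected_so_far:
        return "ahead of plan"
    return "on track"


# A popular 20 GB product that costs $0.50 to regenerate: worth caching.
popular = should_cache(recompute_usd=0.50, expected_requests=10, size_gb=20,
                       storage_usd_per_gb_month=0.023, months=1)

# A rarely requested 500 GB product that is cheap to regenerate: not worth it.
rare = should_cache(recompute_usd=0.01, expected_requests=2, size_gb=500,
                    storage_usd_per_gb_month=0.023, months=6)

# Halfway through the period with 60% of the budget spent.
status = budget_status(spent_usd=600, budget_usd=1000, elapsed_fraction=0.5)
```

A fuller model would also account for the egress saved by serving smaller cached products and for the latency users experience when a product has to be recomputed.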

Interfaces: User vs. Application
[Diagram labels: Python, pandas, xarray, netcdf, zarr]

Interface Convergence in Jupyter
[Diagram labels: Python, pandas, xarray, netcdf, zarr]

User-Application Interface Convergence in Jupyter

Search - Analysis Convergence
[Diagram labels: Analyze, Download, Analyze, Analyze]