
Cloud optimized preprocessing and Data Transformation Demonstration


1 Cloud optimized preprocessing and Data Transformation Demonstration
Kate Crombie+, Anne Raugh*, Stirling Algermissen#, Costin Radulescu#
November 29, 2017
PDS Management Council Meeting, Tucson, AZ
+Indigo Information Services, *PDS Small Bodies, #Jet Propulsion Laboratory
© All rights reserved.

2 Today’s demonstration
Transform PDS-released PDS3 data into PDS4 and generate a PDS4 bundle using a Cloud deployment (on Amazon’s AWS), showcasing various data locality characteristics. The deployment diagram is on the next slide.
Data to be transformed (PDS3) will already be present in the Cloud: no cost for upload, storage costs only.
Transformed data (PDS4) will remain in the Cloud: the tools needed to work on the transformed data are deployed to the Cloud, so that only results, reports, checksums, and the like are downloaded, keeping egress/download costs to a minimum.
The deployment will scale up to complete the transformation within a short period of time, and then provide a number of “Data User Nodes” that may be used to manipulate the data.
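As an illustration of downloading only small artifacts such as checksums, the following is a minimal sketch of building an MD5 checksum manifest next to the data, so that the manifest rather than the data itself is what leaves the Cloud. The function names and the manifest line format are assumptions made for this example; they are not taken from APPS or the PDS tools.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_checksum_manifest(root):
    """Build one '<md5>  <relative path>' line per file under root (hypothetical format)."""
    root = Path(root)
    files = sorted(p for p in root.rglob("*") if p.is_file())
    return "".join(
        f"{md5_of_file(p)}  {p.relative_to(root).as_posix()}\n" for p in files
    )
```

Run on a Data User Node against the working space, this would produce a text file of a few kilobytes per thousand products, which is the kind of artifact the slide proposes downloading instead of the transformed data.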

3 Data Transformation Service
[Deployment diagram, physical view. Labels: PDS Node; PI; Upload data & Transformation templates; Deploy tools; AWS Cloud; Transformation Service; Data User node; Validate; Working space (Elastic File Storage); Archive (PDS); Data (PDS3); Transformation Template; Data (PDS4); Reports; Cloud Storage. Legend: APPS Pipeline, PDS Discipline.]

4 Data Transformation Flow
Transformation Service: a workflow is constructed and deployed in the form of a business process model (BPMN 2.0 standard).

5 Demo
Today’s demo: Stirling Algermissen will demonstrate the use of APPS, CWS, and PDS tools to process a large PDS3 volume into a PDS4 bundle on the Amazon Cloud using multiple (~50) compute nodes. The data volume will be close to 2 TB, and the transformation from PDS3 to PDS4 will use Apache Velocity templates. Once the transformation is complete, a Cloud machine working environment set up with PDS4 tools can be used to validate the output PDS4 bundle. That working environment can also include a registry and search service. A pipeline similar to the one demoed today is also used by the Mastcam Stereo Analysis and Mosaics PDART to reprocess its imaging data.
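To make the template-driven step concrete, here is a minimal sketch of the idea: read keyword/value pairs from a PDS3 label and substitute them into a PDS4-style XML template. It uses Python's `string.Template` as a stand-in for Apache Velocity, which is a Java library with its own syntax. The template content, the function names, and the choice of keywords are assumptions for illustration only; the output is a simplified fragment, not a schema-valid PDS4 label.

```python
import re
from string import Template
from xml.sax.saxutils import escape

# Hypothetical, heavily simplified PDS4-style fragment (a real label has many more elements).
PDS4_TEMPLATE = Template("""<Product_Observational>
  <Identification_Area>
    <logical_identifier>urn:nasa:pds:$bundle:$collection:$product_id</logical_identifier>
    <title>$title</title>
  </Identification_Area>
</Product_Observational>
""")

# Matches single-line "KEYWORD = VALUE" statements only.
_KEYWORD_LINE = re.compile(r"^\s*([A-Z][A-Z0-9_:^]*)\s*=\s*(.+?)\s*$")


def parse_pds3_label(text):
    """Collect simple KEYWORD = VALUE pairs; multi-line values and nested objects are ignored."""
    values = {}
    for line in text.splitlines():
        if line.strip() == "END":
            break
        match = _KEYWORD_LINE.match(line)
        if match:
            key, value = match.groups()
            values.setdefault(key, value.strip('"'))
    return values


def to_pds4_fragment(pds3_text, bundle, collection):
    """Fill the template from a PDS3 label; bundle and collection are supplied by the caller."""
    label = parse_pds3_label(pds3_text)
    return PDS4_TEMPLATE.substitute(
        bundle=bundle,
        collection=collection,
        product_id=escape(label["PRODUCT_ID"].lower()),
        title=escape(label.get("DATA_SET_ID", "untitled")),
    )
```

Because each product's label is transformed independently of the others, work of this shape can be divided among many compute nodes, which is what allows the deployment to scale out for the duration of the transformation.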

6 Backup

7 APPS Refresher
The AMMOS-PDS Pipeline Service (APPS) has two major components: PDS Label Analysis for Interactive Design (PLAID) and the APPS Pipeline. The APPS Pipeline provides transformation (via Apache Velocity templates), validation (via PDS’ Vtool), reporting (via Apache CouchDB), and PDS4 bundle building (via BPMN process orchestration). The orchestration uses AMMOS’ Common Workflow Service (CWS), which supports Cloud deployment (Amazon’s AWS).

