Requirements for EO Data Processing Farms ESA Workshop “Models for Scientific Exploitation of EO Data” ESA ESRIN, Oct 11-12, 2012 Stephan Kiemle German.

Presentation transcript:

Requirements for EO Data Processing Farms ESA Workshop “Models for Scientific Exploitation of EO Data” ESA ESRIN, Oct 11-12, 2012 Stephan Kiemle German Remote Sensing Data Center DFD German Aerospace Center DLR

Evolution of EO PGS Processing Facilities
Dedicated Facility:
- single mission
- dedicated hardware
- tight coupling, static scheduling
- predictable performance
- expensive investment, housing, operating
- no flexibility
- e.g. 1st generation ENVISAT PAF
Shared Facility:
- multi-mission
- shared hardware
- static deployment, dynamic scheduling
- controlled performance
- usage rate reduces costs
- growth and renewal still difficult
- e.g. ESA MMFI
Virtualized Facility
Chart 2 > Req. for EO Data Processing Farms > Stephan Kiemle > Models for Scientific Exploitation of EO Data

Virtualized Processing Facilities
- multi-purpose
- independent hardware
- dynamic deployment and scheduling
- dynamic performance
- initial + continuous renewal investment, pay per use
- scaling with low impact on applications
Sounds good! But …
[Figure: service models Infrastructure as a Service, Platform as a Service and Software as a Service, built from the components Control, VM, Processor and Host; labels "Processor = #CPU + #Mbit/s", "Processor = #request" and "Processor = #VCPU + #MB/day"]

Scientific Exploitation Use Case: Reprocess Tbyte of Archived EO Data
[Figure: processor instances running in parallel, each with "in", "processing" and "out" stages; labels p and b]

Example – Large Scale Reprocessing
Sentinel-5 Precursor L1b-L2 reprocessing, 1 year:
- n = 14.3 * 365 = 5220 products
- s = 50 GB per L1b product
- r = 2 % in-to-out ratio
- t = 2400 s processing time per product (21 MB/s): moderate
                 Local Facility   LAN Cloud     WAN Cloud
b (bandwidth)    1 Gbit/s         500 Mbit/s    100 Mbit/s
p (nodes)        6.9              3.9           1.6
Duration         24 days          50 days       248 days
=> joint analysis of processing and data management required
- execute processing algorithms where the data is
- cross-distribute data archiving
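The slide does not state how p and the durations were derived. The following is a minimal sketch of one simple pipelining model that comes close to the figures above. It assumes (these are assumptions, not statements from the slide) that 1 GB = 1024 MB, that input transfers are serialised on the link so the total duration is n times the input transfer time, and that p is the number of nodes needed to keep the link busy. The names `Scenario` and `estimate` are hypothetical.

```python
from dataclasses import dataclass

MB_PER_GB = 1024      # assumption: binary gigabytes
MB_PER_MBIT = 1 / 8   # 1 Mbit = 0.125 MB


@dataclass
class Scenario:
    n: int       # number of products
    s_gb: float  # input size per product in GB
    r: float     # in-to-out ratio (output size / input size)
    t: float     # processing time per product in seconds


def estimate(sc: Scenario, b_mbit_s: float) -> tuple[float, float]:
    """Return (nodes needed to saturate the link, total duration in days)."""
    rate_mb_s = b_mbit_s * MB_PER_MBIT
    t_in = sc.s_gb * MB_PER_GB / rate_mb_s  # input transfer time per product
    t_out = sc.r * t_in                     # output transfer time per product
    # While one node fetches its input, the others process and write back.
    nodes = (t_in + sc.t + t_out) / t_in
    # Assumed link-bound case: input transfers run back to back.
    days = sc.n * t_in / 86400
    return nodes, days


if __name__ == "__main__":
    s5p = Scenario(n=5220, s_gb=50, r=0.02, t=2400)
    for label, b in [("Local Facility", 1000), ("LAN Cloud", 500), ("WAN Cloud", 100)]:
        nodes, days = estimate(s5p, b)
        print(f"{label}: p = {nodes:.1f} nodes, {days:.1f} days")
```

Under these assumptions the sketch gives roughly 6.9, 3.9 and 1.6 nodes and about 25, 49 and 247 days, which is close to, but not identical with, the values on the slide.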

Example – Small Scale Analysis
Hypothetical data analysis scenario:
- n = 1000 products
- s = 1 GB per input product
- r = 10 % in-to-out ratio
- t = 300 s processing time per product (3.4 MB/s): complex
                 Local Facility   LAN Cloud     WAN Cloud
b (bandwidth)    1 Gbit/s         500 Mbit/s    100 Mbit/s
p (nodes)        (38) 10          19.4          4.8
Duration         8.7 hours        4.6 hours     22.7 hours
=> processing complexity versus data volume determines distribution
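In this scenario the local facility is limited by its node count (10 nodes instead of the 38 that the link could feed), while the cloud variants are limited by bandwidth. The sketch below illustrates that trade-off by taking the larger of a link-bound and a node-bound duration. The model and the function name `duration_hours` are assumptions made for illustration (1 GB = 1024 MB, input transfers serialised on the link); the slide does not give its formula, and the results only approximate the figures above.

```python
def duration_hours(n: int, s_mb: float, r: float, t: float,
                   rate_mb_s: float, nodes: float) -> float:
    """Estimated campaign duration in hours for a fixed number of nodes."""
    t_in = s_mb / rate_mb_s   # input transfer time per product (s)
    t_out = r * t_in          # output transfer time per product (s)
    # Link-bound: inputs are fetched one after another over the shared link.
    link_bound = n * t_in
    # Node-bound: each node transfers, processes and writes back in sequence.
    node_bound = n * (t_in + t + t_out) / nodes
    return max(link_bound, node_bound) / 3600


if __name__ == "__main__":
    common = dict(n=1000, s_mb=1024, r=0.10, t=300)
    # Rates in MB/s: 1 Gbit/s = 125, 500 Mbit/s = 62.5, 100 Mbit/s = 12.5
    print("Local Facility:", round(duration_hours(**common, rate_mb_s=125.0, nodes=10), 1), "h")
    print("LAN Cloud:", round(duration_hours(**common, rate_mb_s=62.5, nodes=19.4), 1), "h")
    print("WAN Cloud:", round(duration_hours(**common, rate_mb_s=12.5, nodes=4.8), 1), "h")
```

With few nodes and a fast link the node-bound term dominates; with a slow link, adding nodes no longer shortens the run.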

Requirements for EO Data Processing Farms
Processing performance versus I/O rate
- Dynamically balance distributed processing, taking into account
  - number of CPUs, RAM, disk cache allocated
  - other local resources (e.g. embedded DBs, log files)
  - actual transfer rates for inputs, auxiliary data, outputs
Coordination
- Define procedures and guidelines for use
- Reconcile conflicts between projects
- Accounting
- Monitoring and control
Privacy/security/availability
- Clear separation of production environment and other “scientific” environments

Consequences for Processing and Data Management
- Individual analysis for best system approach (local, farm, private cloud, …)
  - data rates, processing level/complexity
  - project characteristics, processing strategies
- Algorithms encapsulated in deployable processors/processing systems
  - Data processors shall dynamically use CPUs, RAM, disk cache as allocated
  - Establish/extend standards for algorithm integration and processor deployment
- Bulk product transfer capabilities, pipelining/streaming for input data set provision and output data set repatriation
- Evolve archives to data lifecycle centers
  - layered data sets for tailored access performance
  - defined consolidation/migration capacities (LTDP context)
  - new primary data access interfaces: geodata, time series

“GeoFarm” for Scientific EO Data Exploitation at DLR Oberpfaffenhofen
- 2 Blade Centers (Dell), total 672 cores Opteron, 3.3 TB RAM, interconnected with 10Gb/s Ethernet
- 288 TB SAN storage, connected with 4 GB/s Fiber-Channel
- Virtualized using Citrix XenServer 6 (advanced edition)
- Separated pools for production network/normal infrastructure
Usage examples:
- Project scope: ENVISAT/MERIS data reprocessing for CCI Fire using CATENA
- Continuous operational: O3M-SAF NRT, offline and re-processing
Ongoing definitions:
- use scenario and application procedure
- monitoring
- accounting, cost calculation and sharing

Conclusions
- Evolution of processing facilities towards virtualization
- Different scientific EO exploitation use cases require different distributed computation models, depending on input data size, processing complexity/strategy and network bandwidth
- Requirements in context of EO data processing farms:
  - processors need to become deployable in standard environment and dynamically use allocated resources
  - bulk input data provision using elaborated data management principles and technologies
- DLR operates a virtualized EO processing infrastructure “GeoFarm”

Thank you! Questions?
Stephan Kiemle
Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR)
German Aerospace Center
Earth Observation Center | German Remote Sensing Data Center
Oberpfaffenhofen Wessling | Germany