Page 1: The ESA Earth Observation Payload Data Long Term Storage Activities
Gian Maria Pinna (1), Francesco Ferrante (2)
(1) ESA-ESRIN, Via G. Galilei, CP 64, 00044 Frascati, Italy
(2) SERCO Italy, Via Sciadonna 24/26, 00044 Frascati, Italy
PV2009, 1-3 December 2009

Page 2: Content
– ESA PDGS and MultiMission Facility Infrastructure
– Storage Technologies: Hierarchical Storage Management, Disk-based archives, Distributed File System
– Data Exchange among archives
– Conclusions

Page 3: PDGS Archives Harmonization
– The ESA PDGS approach was to develop a common infrastructure named MMFI (MultiMission Facility Infrastructure)
– The MMFI was developed following the OAIS (Open Archival Information System) Reference Model, a CCSDS/ISO standard
[Diagram: PDGS functional blocks (Ingestion, Storage, Data Management, Order Processing, Consumer Access, Administration) connecting Data Producers to Data Consumers through archived/output products, metadata & browses, orders and Interactive User Services]
– Harmonization of the design, technologies and operational procedures used in the PDGS
– Requires adaptation for each mission operated
– Implementation by re-use and configuration proved to be very well suited for the harmonization and cost-effectiveness of long term data preservation

Page 4: MMFI Goals
The MMFI project at ESA aims at:
– Evolving the ESA EO ground segment toward a harmonized, multi-mission capable infrastructure
– Maximizing the reuse of existing operational elements in a coherent standard architecture (based on the OAIS-RM)
– Standardizing the MMFI external and internal interfaces

Page 5: MMFI Abstract Model
[Diagram: MMFI abstract model – a central infrastructure (Data Library, Request Handling, Archive, Site Cache and Local Inventory, Online Data Access) surrounded by Ingestion System(s), Processing Systems (IPFs) under Production Control, Dissemination and Circulation System(s), and the MMMC/MMOHS functions for Monitoring & Alarm, Monitoring and Control, Operating, Logging and Reporting]

Page 6: Hierarchical Storage Management
Long Term Data Preservation is normally achieved using multiple tiers of storage media:
– Tape storage is the basic technology used to decrease the cost of storage
– The need for faster access to the data, both in reading and writing, requires a disk-based storage layer dimensioned according to the required I/O performance
– It is also common in certain architectures that further levels of storage, either disk- or tape-based, are used to increase performance and reduce the overall storage cost
The migration among the different levels of storage, from the faster (and more expensive) to the slower (and cheaper) levels, is normally managed by a hierarchical storage management system (a simple illustrative sketch of such a migration policy follows below).
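The migration logic described above can be pictured with a short, hypothetical sketch. The tier names, thresholds and file attributes below are illustrative assumptions, not part of the MMFI design:

```python
# Illustrative sketch of HSM-style tier migration (hypothetical policy,
# not the MMFI implementation).
from dataclasses import dataclass

@dataclass
class ArchivedFile:
    name: str
    size_gb: float
    days_since_last_access: int
    tier: str = "disk"          # files are ingested on the fast disk tier

def migrate(files, disk_capacity_gb: float, max_age_days: int = 30):
    """Move cold or excess data from the disk tier down to tape.

    Assumed policy, for illustration only:
      1. anything not accessed for `max_age_days` goes to tape;
      2. if the disk tier is still over capacity, the coldest files are
         migrated until disk usage fits again.
    """
    # Rule 1: age-based migration
    for f in files:
        if f.tier == "disk" and f.days_since_last_access > max_age_days:
            f.tier = "tape"

    # Rule 2: capacity-based migration, coldest first
    on_disk = sorted((f for f in files if f.tier == "disk"),
                     key=lambda f: f.days_since_last_access, reverse=True)
    used = sum(f.size_gb for f in on_disk)
    for f in on_disk:
        if used <= disk_capacity_gb:
            break
        f.tier = "tape"
        used -= f.size_gb
    return files

# Example usage (hypothetical file names):
files = [ArchivedFile("ERS2_SAR_0001", 0.8, 120),
         ArchivedFile("ASAR_IMS_0002", 1.5, 2)]
migrate(files, disk_capacity_gb=2.0)
print([(f.name, f.tier) for f in files])
# The 120-day-old file is moved to tape; the recently accessed one stays on disk.
```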

Page 7: EO Archive Management System
– 1996: start of the study of an ESA EO Archive Management System
– EADS SatSTORE + ADIC AMASS identified to manage the ESA EO AMS:
  – SatSTORE => data management and client ACLs
  – AMASS => HSM (archive library and archive cache)

Page 8: AMASS (why?)
– Advantages for slow tape drives (with respect to the technologies available at the time)
– Advantages for concurrent read operations from different tapes (concurrent reading from the same tape was managed by SatSTORE)
– Read/write performed by: 4 x blocks = 4 x ( / )
– Tape fine-tuning to specify the tape block size (256 KB) to achieve good performance

Page 9: AMASS (vs tape technologies)
– DLT4000/7000: average speed 1.5/4 MB/s, average access time 65 s, 20/40 GB capacity
– T9940B: average speed 29 MB/s, average access time 50 s, 200 GB capacity
– T10KB: average speed 120 MB/s, average access time 49 s, 1 TB capacity
(a rough full-tape timing estimate follows below)
[Images: DLT7000, T9940B and T10KB (next generation) drives]
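As a quick sanity check on these figures, the time to stream a full tape can be derived from the capacity and average speed quoted above. This is illustrative arithmetic only, ignoring repositioning, access time and compression:

```python
# Rough full-tape streaming time from the figures on the slide above
# (illustrative arithmetic; real durations depend on repositioning,
# compression and file layout).
drives = {
    "DLT7000": {"speed_mb_s": 4,   "capacity_gb": 40},
    "T9940B":  {"speed_mb_s": 29,  "capacity_gb": 200},
    "T10KB":   {"speed_mb_s": 120, "capacity_gb": 1000},
}

for name, d in drives.items():
    hours = d["capacity_gb"] * 1024 / d["speed_mb_s"] / 3600
    print(f"{name}: ~{hours:.1f} h to read a full tape")

# T10KB: roughly 2.4 h per full 1 TB cartridge, versus ~2 h for a 200 GB
# T9940B cartridge -- capacity and speed grew together, so full-tape times
# stayed broadly similar while per-file access got much faster.
```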

Page 10: AMASS (what next?)
End Of Life (EOL) for:
– AMASS
– Sun StorageTek 'PowderHorn' libraries – Q
– Sun StorageTek T9940B
Block-read is not efficient with high I/O speed tape drives.
Block-read is not efficient with tape drives that have fast repositioning and loading.
T10KB not supported by AMASS (nor by PowderHorn libraries)

Page 11: SAMFS (next MMFI HSM)
– Supports storage tiering
– Supports multiple copies of the same file
– Supports new library/tape drive technologies
– File-based and block-based reads (run-time configuration per file/directory)
– Data handling based on regular expressions on paths and files (see the conceptual sketch below)
– Tapes can be read without SAMFS
– Enhanced web-based management portal
– Open source code
– Agreement with Sun on a wide/global 40 PB license for 10 ESA MMFI sites
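The per-file/per-directory policies listed above can be pictured with a small conceptual sketch. This is not SAM-FS configuration syntax; the path patterns, copy counts and media pool names are purely illustrative assumptions:

```python
# Conceptual model of HSM archive policies matched by path regular expressions
# (illustrative only -- real SAM-FS policies live in its own configuration
# files, not in Python).
import re
from typing import NamedTuple

class ArchivePolicy(NamedTuple):
    path_pattern: str   # regular expression on the file path
    copies: int         # number of tape copies to create
    media: tuple        # which media pool each copy goes to (hypothetical names)

POLICIES = [
    ArchivePolicy(r"^/archive/ERS/.*\.E1$", copies=2, media=("T10KB", "T10KB-offsite")),
    ArchivePolicy(r"^/archive/ENVISAT/.*",  copies=2, media=("T10KB", "LTO4")),
    ArchivePolicy(r".*",                    copies=1, media=("T10KB",)),   # default
]

def policy_for(path: str) -> ArchivePolicy:
    """Return the first policy whose pattern matches the file path."""
    for p in POLICIES:
        if re.match(p.path_pattern, path):
            return p
    raise ValueError(f"no policy matches {path}")

print(policy_for("/archive/ENVISAT/ASAR/2008/product.N1"))
# -> the ENVISAT rule applies: two copies, on two different media pools.
```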

Page 12: SAMFS (vs tape drives & libraries I)
The new HSM opened the possibility to renew the HW technologies:
– Sun StorageTek SL8500, located at ESA ESRIN (Frascati, Italy)
– Sun StorageTek SL3000, located in Matera (I), Maspalomas (S), Kiruna (SE), Farnborough (UK), Oberpfaffenhofen (D)
– Sun StorageTek SL1400, located in Tromsoe (N) and Sodankyla (Fi)
– 50 T10KB drives spread across the ESA centres

Page 13: SAMFS (vs tape drives & libraries II)
SL8500:
– Up to 7 libraries inter-connected
– from 1448 to slots
– 64 tape drives
– Drive support (any combination): T10000B, T10000A, T9840D, T9840C, LTO 4, LTO 3
– No downtime for library management/maintenance
SL3000:
– from 200 to >5000 slots
– 56 tape drives
– Drive support: T10000B, T10000A, T9840D, T9840C, LTO 4, LTO 3
– No downtime for the main library management/maintenance tasks

Page 14: Disk-based Archives
– Long term data preservation together with a modern HSM can support the whole operational infrastructure and speed up operations
– T10KB drive speed is ~120 MB/s; with 6 drives the storage needs to serve ~720 MB/s (see the sizing sketch below)
– A mid-range/enterprise storage system is needed to give the archive such I/O:
  – Disk-array architecture is important
  – Disk type (FC, SATA, SAS) is important
  – Number of disks is critical
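A back-of-the-envelope sizing of the disk layer that must feed the tape drives, based on the figures above. The per-disk sustained throughput and the RAID/contention factor are illustrative assumptions, not measured values:

```python
# Back-of-the-envelope sizing of the disk cache that feeds the tape drives.
# Per-disk throughput and RAID overhead are assumed values for illustration;
# real figures depend on the array, disk type and workload.
TAPE_DRIVE_SPEED_MB_S = 120      # T10KB, from the slide
N_TAPE_DRIVES = 6
PER_DISK_MB_S = 60               # assumed sustained streaming rate per disk
RAID_OVERHEAD = 1.3              # assumed factor for parity/contention

required_mb_s = TAPE_DRIVE_SPEED_MB_S * N_TAPE_DRIVES               # ~720 MB/s
disks_needed = -(-required_mb_s * RAID_OVERHEAD // PER_DISK_MB_S)   # ceiling division

print(f"required aggregate throughput: {required_mb_s} MB/s")
print(f"disks needed (assumed {PER_DISK_MB_S} MB/s each): {int(disks_needed)}")
# With these assumptions ~16 disks are needed just to keep 6 T10KB drives
# streaming, before any margin for concurrent ingest and dissemination.
```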

Page 15: Disk-based Archives (synergy in MMFI I)
MMFI shared caches:
– Central Cache
– AMS disk archive
– HSM cache
– MMFI On-Line Archive
MMFE caches:
– GFE Workspace
– PD cache
– PFD Workspace
Access via NAS/NFS and SAN/distributed file-system

Page 16: Disk-based Archives (synergy in MMFI II)
MMFI caches:
– HSM cache
– MMFI On-Line Archive
– MMFI cache (Central Cache, AMS disk archive, MMFE caches)

Page 17: Disk-based Archives (Lustre)
– Scalability: individual nodes, cluster size and disk storage are all scalable ( > clients, #PB of shared storage)
– Performance: Lustre delivers over 240 GB/s aggregate in production deployments; on single non-aggregated transfers, Lustre currently offers single-node performance of 2 GB/s client throughput (max) and 2.5 GB/s OSS throughput (max)
– POSIX compliance
– High availability: shared storage partitions for OSS targets (OSTs) and a shared storage partition for the MDS target (MDT)
– Security: accepting TCP connections only from privileged ports is optional; group membership handling is server-based; POSIX ACLs are supported
– Open source

Page 18: Exchange of data between archives
– ESA VPN between EO archives
– Full duplex links
– Same data archived in more than one centre
– The current infrastructure has centres that upload (Stations) and centres that download (PACs) the same products
– The GÉANT topology does not allow high-speed traffic with standard BitTorrent clients
– Using a modified client (< 1 MB 'network block size'), ~ FTP nominal throughput is achieved (see the illustrative calculation below)
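One way to see why standard clients struggle on such links is the bandwidth-delay product: on a long-distance path, many blocks must be in flight at once to keep the pipe full. The link rate, round-trip time and block size below are illustrative assumptions, not measurements of the ESA VPN:

```python
# Illustrative bandwidth-delay-product calculation for a wide-area archive link.
# The link rate, RTT and block size are assumed example values, not ESA figures.
LINK_MBIT_S = 1000        # assumed 1 Gbit/s path between two archive centres
RTT_MS = 40               # assumed round-trip time across the European backbone
BLOCK_KB = 256            # assumed transfer block ("network block") size

bdp_bytes = LINK_MBIT_S * 1e6 / 8 * RTT_MS / 1e3   # bytes in flight to fill the link
blocks_in_flight = bdp_bytes / (BLOCK_KB * 1024)

print(f"bandwidth-delay product: {bdp_bytes / 1e6:.1f} MB")
print(f"blocks that must be outstanding: {blocks_in_flight:.0f}")
# With these numbers ~5 MB must be in flight, i.e. ~19 outstanding 256 KB
# blocks: a client that keeps too few (or too small) blocks outstanding per
# peer cannot fill the link, which is why the block handling was tuned.
```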

Page 19: Conclusions
– The MMFI has been in operation at ESA for several years
– It has proven to be very effective, on one side in improving the performance of the PDGS of all missions, and on the other in decreasing the overall operational costs for the ESA EO payload data exploitation
– Constant evolution to:
  – implement new mission requirements
  – improve the overall performance
  – decrease the operational costs to a more affordable level
  – This is normally done by upgrading specific MMFI components or further standardizing and harmonizing the infrastructure
– The archive function is one of the most important for the MMFI and will require continuous monitoring and evolution toward better and cheaper architectures and systems