HEPiX Storage Task Force
Roger Jones, Lancaster
CHEP06, Mumbai, February 2006

Mandate
– Examine the current LHC experiment computing models.
– Attempt to determine the data volumes, access patterns and required data security for the various classes of data, as a function of Tier and of time.
– Consider the current storage technologies, their prices in various geographical regions and their suitability for various classes of data storage.
– Attempt to map the required storage capacities to suitable technologies.
– Formulate a plan to implement the required storage in a timely fashion.

Membership
– Roger Jones, Lancaster, ATLAS
– Andrew Sansum, RAL
– Bernd Panzer / Helge Meinhard, CERN
– David Stickland (latterly), CMS
– Peter Malzacher, GSI Tier-2, ALICE
– Andrei Maslennikov, CASPUR
– Jos van Wezel, GridKA, HEPiX
– Vincenzo Vagnoni, Bologna, LHCb
– Luca dell’Agnello
– Kors Bos, NIKHEF (by invitation)
Thanks to all members!

Degree of Success
Assessment of the computing models
– RJ shoulders the blame for this area!
– Computing TDRs help; see the many talks at this conference
– Estimates of contention etc. are rough; toy simulations are exactly that, and we need to improve this area beyond the lifetime of the task force
Disk
– Thorough discussion of disk issues
– Recommendations, prices etc.
Archival media
– Less complete discussion
– Final report at the April HEPiX/GDB meeting in Rome
Procurement
– Useful guidelines to help Tier-1 and Tier-2 procurement

Outcome
Interim document available through the GDB
Current high-level recommendations
– It is recommended that a better information exchange mechanism be established between (HEP) centres to mutually improve purchase procedures.
– An annual review should be made of the storage technologies and prices, and a report made publicly available.
– Particular further study of archival media is required, and tests should be made of the emerging new technologies.
– A similar regular report is required for CPU purchases. This is motivated by the many Tier-2 centres now making large purchases.
– People should note that the lead time from announcement to effective deployment of new technologies is up to a year.
– It is noted that the computing models assume that archived data is available at the time of attempted processing. This implies that the software layer allows pre-staging and pinning of data (a minimal illustration follows below).
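What "pre-staging and pinning" might look like from the experiment software side is sketched below. This is a minimal, hypothetical illustration: the StorageClient class and its method names are invented for the sketch and do not correspond to any particular SRM, dCache or CASTOR interface.

# Hypothetical sketch only: StorageClient and its methods are invented for
# illustration; real mass-storage systems expose their own interfaces.
import time

class StorageClient:
    """Toy stand-in for a tape-backed mass-storage system."""

    def request_stage(self, path):
        # Would ask the MSS to recall 'path' from tape to the disk cache.
        print(f"stage requested: {path}")

    def is_online(self, path):
        # Would query the MSS; here we pretend the recall has completed.
        return True

    def pin(self, path, lifetime_s):
        # Would protect 'path' from cache eviction for lifetime_s seconds.
        print(f"pinned {path} for {lifetime_s} s")

def prestage_and_pin(client, files, pin_lifetime_s=24 * 3600, poll_s=60):
    """Recall all inputs from tape before processing starts, then pin them
    so they cannot be evicted from disk while the job runs."""
    for f in files:
        client.request_stage(f)            # asynchronous tape recall
    pending = set(files)
    while pending:
        for f in [p for p in pending if client.is_online(p)]:
            client.pin(f, pin_lifetime_s)
            pending.discard(f)
        if pending:
            time.sleep(poll_s)             # recalls still in flight

prestage_and_pin(StorageClient(), ["/archive/run123/raw.001", "/archive/run123/raw.002"])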

Inputs
Informed by the C-TDRs and computing model documents
– Have tried to estimate contention etc., but this requires much more detailed simulation work
– Have looked at data classes and associated storage/access requirements, but this could be taken further. E.g. the models often provide redundancy on disk, but some sites assume they still need to back disk up to tape in all cases
– Have included bandwidths to MSS from the LHC4 exercise, but more detail would be good

Storage Classes
1) Tape, archive, possibly offline (vault), access > 2 days, 100 MB/s
2) Tape, online in library, access > 1 hour, 400 MB/s
3) Disk, any type, as a cache in front of tape
4) Disk, SATA type, optimised for large files, sequential read-only IO
5) Disk, SCSI/FC type, optimised for small files, read/write random IO
6) Disk, high speed and reliability, RAID 1 or 6 (catalogues, home directories etc.)

Disk
Two common disk types
– SCSI/FibreChannel: higher speed and throughput, slightly longer lifetime (~4 years), more expensive
– SATA (II): cheaper, available in storage arrays, lifetime >3 years (judging by warranties!)
RAID5 gives fair data security
– Could still have 10 TB per PB unavailable on any given day
RAID6 looks more secure
– Some good initial experiences
– Care needed with drive and other support
Interconnects
– Today: SATA (300 MB/s), good for disk-to-server, point-to-point links; Fibre Channel (400 MB/s), a high-speed IO interconnect/fabric
– Soon (2006): Serial Attached SCSI (SAS, multiple 300 MB/s links); InfiniBand (IBA, 900 MB/s)

Architectures
Direct Attached Storage
– Disk is directly attached to the CPU
– Cheap, but administration is costly
Network Attached Storage
– File servers on an Ethernet network
– Access by file-based protocols
– Slightly more expensive, but a smaller number of dedicated nodes
– Storage in a box: servers have internal disks
– Storage out of the box: fibre- or SCSI-connected
Storage Area Networks
– Block, not file, transport
– Flexible and redundant paths, but expensive

Disk Data Access
Access rates
– 50 streams per RAID group, at roughly 2 MB/s per stream, on a 1 Gbit interface
– Double this for SCSI
Can be impaired by
– The software interface/SRM
– Non-optimal hardware configuration (CPU, kernel, network interfaces)
Recommend provisioning 2 x the nominal interface capacity for reads and 3 x nominal for writes (a back-of-the-envelope check follows below)
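A quick back-of-the-envelope check of those access-rate figures (a sketch using only the numbers quoted above):

# Sanity check of the quoted access rates (illustrative only).
streams_per_raid_group = 50          # streams per RAID group, as quoted
mb_per_stream = 2                    # MB/s per stream on a 1 Gbit interface

aggregate_mb_s = streams_per_raid_group * mb_per_stream    # 100 MB/s
aggregate_gbit_s = aggregate_mb_s * 8 / 1000               # 0.8 Gbit/s

print(f"aggregate: {aggregate_mb_s} MB/s = {aggregate_gbit_s} Gbit/s")
# -> 100 MB/s, i.e. ~0.8 Gbit/s: a single 1 Gbit interface is essentially
#    saturated, which is why 2x (read) / 3x (write) the nominal interface
#    capacity is recommended above.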

Disk Recommendations
Storage in a box (DAS/NAS: disks together with server logic in a single enclosure)
– Most storage for a fixed cost
– More experience with large SATA + PCI RAID deployments is desirable
– More expensive solutions may require less labour or be more reliable (experiences differ)
– High-quality support may be the deciding factor
Recommendation
– Sites should declare the products they have in use; a possible central place would be the central repository set up at hepix.org
– Where possible, experience with trial systems should be shared (Tier-1s and CERN have a big role here)

Procurement Guidelines
– These come from H. Meinhard
– Many useful suggestions for procurement
– May need to be adapted to local rules

Disk Prices
– DAS/NAS: storage in a box (disks together with server logic in a single enclosure): … € per usable 10 TB
– SAN/S: SATA-based storage systems with a high-speed interconnect: … € per usable 10 TB
– SAN/F: FibreChannel/SCSI-based storage systems with a high-speed interconnect: ~55000 € per usable 10 TB
These numbers are reassuringly close to those from the PASTA reviews, but it should be noted that there is a spread due to geography and other circumstances.
Evolution (raw disks)
– Expect a Moore's-Law density increase of 1.6x per year between 2006 and 2010
– Also consider the effect of an increase of only 1.4x per year
– Cost reduction of 30-40% per annum
(A small projection sketch follows below.)
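What those evolution assumptions imply by 2010 can be sketched with the growth factors quoted above; using the SAN/F figure as the 2006 baseline is an assumption made purely for illustration.

# Project the quoted growth assumptions out to 2010 (illustrative only).
years = 4                              # 2006 -> 2010

for density_growth in (1.6, 1.4):      # areal density factor per year
    print(f"density factor over {years} years at {density_growth}/year: "
          f"{density_growth ** years:.1f}x")
# 1.6/year -> ~6.6x ; 1.4/year -> ~3.8x

baseline_eur_per_10tb = 55_000         # SAN/F 2006 price quoted above (assumed baseline)
for annual_cost_drop in (0.30, 0.40):  # quoted 30-40% per annum reduction
    price_2010 = baseline_eur_per_10tb * (1 - annual_cost_drop) ** years
    print(f"{annual_cost_drop:.0%}/year reduction -> ~{price_2010:,.0f} EUR per usable 10 TB in 2010")
# 30%/year -> ~13,200 EUR ; 40%/year -> ~7,100 EUR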

Tape and Archival
This area is ongoing and needs more work
– Procurements are less frequent
Disk systems approach active tape system costs by ~2008
Note that the computing models generally only assume archive copies at the production site
Initial price indications are similar to the LCG planning projections (a rough cost sketch combining these figures follows below)
– 40 CHF/TB for the medium
– A drive + server with 25 MB/s effective scheduled bandwidth is 15-35 kCHF
– Effective throughput is much lower for chaotic usage
– A 6000-slot silo is ~500 kCHF
New possibilities include spin-on-demand disk etc.
– Needs study by the T0 and T1s; this should start now
– Would be brave to change immediately
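A rough sketch of how those unit costs combine for a tape archive of a given size and scheduled bandwidth. All unit prices are the figures quoted above; the cartridge capacity, the choice of the mid-point drive price, and the example archive size and bandwidth are assumptions made for illustration only.

# Rough tape-archive cost model built from the unit prices quoted above (sketch only).
import math

def tape_archive_cost_chf(archive_tb, bandwidth_mb_s,
                          media_chf_per_tb=40,       # quoted medium cost
                          drive_server_chf=25_000,   # assumed mid-point of the 15-35 kCHF range
                          drive_mb_s=25,             # effective scheduled bandwidth per drive + server
                          silo_slots=6000, silo_chf=500_000,
                          tb_per_cartridge=0.4):     # assumption: ~400 GB cartridges
    cartridges = math.ceil(archive_tb / tb_per_cartridge)
    silos = math.ceil(cartridges / silo_slots)
    drives = math.ceil(bandwidth_mb_s / drive_mb_s)
    return (archive_tb * media_chf_per_tb      # media
            + drives * drive_server_chf        # drives + servers
            + silos * silo_chf)                # library slots

# Hypothetical example: a 1 PB archive served at 200 MB/s scheduled bandwidth.
print(f"{tape_archive_cost_chf(1000, 200):,.0f} CHF")   # -> 740,000 CHF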

Plans
The group is now giving more consideration to archival
– Need to do more on archival media
– General need for more discussion of storage classes
– More detail to be added on computing-model operational details
Final report in April
Further task forces needed every year or so