Storage Task Force: Intermediate Pre-Report

History
– 04/05: the GridKa Technical Advisory Board needs storage numbers; assemble a team of experts.
– 05/05: at HEPiX
  – many sites indicate similar questions
  – what hardware is needed for which profile
  – the HEPiX community, the GDB chair and the LCG project leader agree on a task force
– First (telephone) meeting on 17/6/05
  – since then: one face-to-face meeting, several phone conferences, numerous e-mails

Mandate
– Examine the current LHC experiment computing models.
– Attempt to determine the data volumes, access patterns and required data security for the various classes of data, as a function of Tier and of time.
– Consider the current storage technologies, their prices in various geographical regions and their suitability for the various classes of data storage.
– Attempt to map the required storage capacities to suitable technologies.
– Formulate a plan to implement the required storage in a timely fashion.
– Report to the GDB and/or HEPiX.

Members
Roger Jones (convener, ATLAS)
Technical experts:
– Luca dell'Agnello (CNAF)
– Martin Gasthuber (DESY)
– Andrei Maslennikov (CASPUR)
– Helge Meinhard (CERN)
– Andrew Sansum (RAL)
– Jos van Wezel (FZK)
Experiment experts:
– Peter Malzacher (ALICE)
– Vincenzo Vagnoni (LHCb)
– n.n. (CMS)
Report due in the 2nd week of October at HEPiX (SLAC).

Methods used
Define hardware solutions
– a storage building block with certain capacities and capabilities
Perform simple trend analysis
– costs, densities, CPU vs. I/O throughput (a rough cost-trend sketch follows this slide)
Follow the storage/data path in the computing models
– ATLAS and CMS (and ALICE) used at the moment
– assume the input rate is fixed (scaling?)
– estimate inter-T1 data flow
– estimate data flow to T2s
– attempt to estimate file and array/tape contention
Define storage classes
– based on storage use in the models
– reliability, throughput (=> cost function)
Use the CERN tender for DAS storage as the basis for an example set of requirements.
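The "simple trend analysis" of costs and densities can be illustrated with a minimal sketch. The starting price and the yearly decline factor below are hypothetical placeholders, not task-force figures; only the idea of extrapolating a per-TB price forward in time comes from the slide.

```python
# Minimal sketch of a price/capacity trend extrapolation.
# All numbers below are illustrative assumptions, not task-force results.

START_YEAR = 2005
EUR_PER_TB_2005 = 2000.0      # assumed disk price at the start year (EUR/TB)
YEARLY_PRICE_FACTOR = 0.7     # assumed price drop: each year costs 70% of the previous


def eur_per_tb(year: int) -> float:
    """Extrapolated disk price for a given year under the assumed trend."""
    return EUR_PER_TB_2005 * YEARLY_PRICE_FACTOR ** (year - START_YEAR)


def cost_of_block(capacity_tb: float, year: int) -> float:
    """Cost of one storage building block of the given capacity, bought in 'year'."""
    return capacity_tb * eur_per_tb(year)


if __name__ == "__main__":
    for year in range(2005, 2011):
        print(f"{year}: {eur_per_tb(year):7.0f} EUR/TB, "
              f"10 TB block ~ {cost_of_block(10, year):8.0f} EUR")
```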

From the storage models (figure; source: Roger Jones)

Trying to deliver
Type of storage fitted to the specific requirements
– access patterns
– applications involved
– data classes
Amount of storage at time t0 to t0+4 years
– what is needed when
– growth of the data sets is ?? (a projection under an assumed growth rate is sketched after this slide)
(Non-exhaustive) list of hardware
– via a web site (fed with recent talks)
– experiences
– needs maintenance
Checklist for disk storage at T2 (and T1)
– I/O rates via connections (Ethernet, SCSI, InfiniBand)
– availability types (RAID)
– management, maintenance and replacement costs
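The "amount of storage at t0 to t0+4 years" item can be read as a simple compound-growth projection. The starting capacity and yearly growth rate below are hypothetical, since the slide itself marks the real growth of the data sets as unknown.

```python
# Minimal capacity projection sketch for t0 .. t0+4 years.
# Starting capacity and yearly growth are illustrative assumptions;
# the slide explicitly marks the real growth of the data sets as unknown.

T0_CAPACITY_TB = 500.0        # assumed capacity required at t0
YEARLY_GROWTH = 0.5           # assumed 50% growth of the data sets per year

for dt in range(0, 5):        # t0 .. t0+4 years
    needed = T0_CAPACITY_TB * (1.0 + YEARLY_GROWTH) ** dt
    print(f"t0+{dt}y: ~{needed:6.0f} TB needed")
```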

For T1 only
Tape access and throughput specifications
– involve disk storage for caching
– relate to the amount of disk (cache/data ratio)
CMS and ATLAS seem to have different tape access predictions.
Tape contention for
– raw data
– reconstruction
  ATLAS: 1500 MB/s in 2008, 4000 MB/s in 2010, summed over all T1s (rough per-site arithmetic is sketched below)
– T2 access
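As a rough aid to reading the quoted numbers, the sketch below divides the aggregate ATLAS tape rates over an assumed number of Tier-1 sites and derives a disk cache size from an assumed cache/data ratio. The site count, the ratio and the tape data volume are illustrative assumptions; only the aggregate rates come from the slide.

```python
# Rough arithmetic sketch for Tier-1 tape throughput and disk cache sizing.
# The aggregate rates come from the slide (ATLAS, all T1s); the number of
# T1 sites, the cache/data ratio and the tape-resident volume are assumptions.

N_TIER1_SITES = 10                           # assumed number of ATLAS Tier-1 sites
AGGREGATE_MB_S = {2008: 1500, 2010: 4000}    # ATLAS tape rate, all T1s (from the slide)

CACHE_TO_DATA_RATIO = 0.1     # assumed: disk cache = 10% of the data kept on tape
TAPE_DATA_TB = 2000           # assumed tape-resident data per T1 (TB)

for year, total in AGGREGATE_MB_S.items():
    per_site = total / N_TIER1_SITES
    print(f"{year}: ~{per_site:.0f} MB/s tape throughput per T1 site")

print(f"Disk cache for {TAPE_DATA_TB} TB on tape at ratio "
      f"{CACHE_TO_DATA_RATIO}: ~{TAPE_DATA_TB * CACHE_TO_DATA_RATIO:.0f} TB")
```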

Conclusions
– Disk size is the leading factor. I/O throughput is of minor importance (observed ~1 MB/s per CPU), but the rates during production are not known (see the sizing sketch below).
– The DAS storage model is cheaper, but there is less experience with it.
– The cache-to-data ratio is not known.
– A yearly assessment of hardware is probably needed (especially important for sites that buy infrequently).
– The experiment models differ on several points: more analysis is needed.
– Disk server scaling is difficult because the network-to-disk ratio is not useful.
– The analysis access pattern is not well known but will hit databases most; this should be revisited in a year.
– Non-POSIX access requires more CPU.
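The observed ~1 MB/s per CPU can be turned into a back-of-the-envelope disk-server sizing. The client count and the per-server network limit below are assumptions for illustration only; only the per-CPU rate is taken from the slide.

```python
# Back-of-the-envelope disk server sizing from the observed ~1 MB/s per CPU.
# Client count and per-server network limit are illustrative assumptions.

MB_S_PER_CPU = 1.0            # observed average rate per CPU (from the slide)
CLIENT_CPUS = 1000            # assumed worker-node CPUs hitting the storage
SERVER_NET_MB_S = 100         # assumed usable network bandwidth per disk server
                              # (roughly a single gigabit link)

aggregate_mb_s = MB_S_PER_CPU * CLIENT_CPUS
servers_needed = -(-aggregate_mb_s // SERVER_NET_MB_S)   # ceiling division

print(f"Aggregate demand: {aggregate_mb_s:.0f} MB/s")
print(f"Disk servers needed at {SERVER_NET_MB_S} MB/s each: {servers_needed:.0f}")
```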

What's in it for GridKa
Alignment of expert views on hardware
– realization of very different local circumstances
– a survey of estimated costs returned similar numbers:
  SAN is roughly a factor 1.5 more expensive than DAS/storage-in-a-box; tape: 620 EUR/TB (cost sketch below)
Better understanding of experiment requirements
– an in-house workshop with experts may be worthwhile
– did not lead to altered or reduced storage requirements
DAS/storage-in-a-box will not work for everyone
– large centres buy both; there are success and failure stories
– T2s are probably different
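A minimal sketch of how the quoted cost figures can be compared. Only the factor 1.5 and the 620 EUR/TB for tape are taken from the slide; the DAS baseline price and the capacities are hypothetical.

```python
# Sketch comparing storage cost options using the figures quoted on the slide:
# SAN ~ 1.5x the price of DAS/storage-in-a-box, tape at 620 EUR/TB.
# The DAS baseline price and the capacities are illustrative assumptions.

DAS_EUR_PER_TB = 1000.0                 # assumed DAS/storage-in-a-box price
SAN_EUR_PER_TB = 1.5 * DAS_EUR_PER_TB   # factor 1.5 from the survey
TAPE_EUR_PER_TB = 620.0                 # from the survey


def cost(capacity_tb: float, eur_per_tb: float) -> float:
    return capacity_tb * eur_per_tb


if __name__ == "__main__":
    disk_tb, tape_tb = 300, 1000        # assumed capacities
    print(f"DAS  {disk_tb} TB: {cost(disk_tb, DAS_EUR_PER_TB):9.0f} EUR")
    print(f"SAN  {disk_tb} TB: {cost(disk_tb, SAN_EUR_PER_TB):9.0f} EUR")
    print(f"Tape {tape_tb} TB: {cost(tape_tb, TAPE_EUR_PER_TB):9.0f} EUR")
```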

DAS versus SAN connected

GridKa storage preview
Next year: expansion by 325 TB
– use the expansion as disk cache to tape via dCache
– re-use reliable storage from the existing SAN pool
– the on-line disk of the LHC experiments will move to the cache pool
Storage in a box (DAS/NAS)
– Cons:
  – hidden costs? maintenance is difficult to estimate: the failure rate is multiplied by the recovery/service time (see the sketch below)
  – not adaptable to different architectures/uses
  – needs more rack space and power (est. 20%)
– Pro: price
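The maintenance concern ("failure rate multiplied by recovery/service time") can be made concrete with a small sketch. The failure rate, recovery time, box count and per-box capacity below are hypothetical; only the reasoning pattern comes from the slide.

```python
# Sketch of the maintenance argument against storage-in-a-box:
# expected unavailable capacity grows with failure rate x recovery time.
# Failure rate, recovery time, box count and capacity are assumptions.

N_BOXES = 50                        # assumed number of storage-in-a-box servers
TB_PER_BOX = 6.5                    # assumed capacity per box (~325 TB total)
FAILURES_PER_BOX_PER_YEAR = 2.0     # assumed failure rate (disk, PSU, ...)
RECOVERY_DAYS = 2.0                 # assumed recovery/service time per failure

downtime_fraction = FAILURES_PER_BOX_PER_YEAR * RECOVERY_DAYS / 365.0
unavailable_tb = downtime_fraction * N_BOXES * TB_PER_BOX

print(f"Average fraction of time a box is down: {downtime_fraction:.1%}")
print(f"Expected capacity unavailable on average: {unavailable_tb:.1f} TB "
      f"of {N_BOXES * TB_PER_BOX:.0f} TB")
```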