Storage issues for end-user analysis
Bernd Panzer-Steindel, CERN/IT
08 July 2008

High level Data Flow (diagram): T0/T1/T2/T3 data paths between the Online DAQ, tape migration and tape read, the data disk pools, MC and data import/export, reprocessing, calibration and alignment, T2/T3 analysis production, file copy/merge steps, scratch space, batch and interactive nodes, and end-user analysis.

Tentative 'requirements' for end-user analysis storage

- Storage capacity of 1-2 TB per user (assume 1.5 TB) for Ntuples, data samples, logfiles, ...
- Reliable storage, 'server-mirroring': 99.9% availability → 4 times per year unavailable for 4 h each
- No tape access → too many small files
- Some backup possibility (archive?): 5% changes per day (75 GB/d) + 4-month retention time = 9 TB of backup space per user (a worked example follows after this list)
- Quota system
- Easy accessibility from batch and interactive worker nodes and the notebook → POSIX access type → distributed file system
- World-wide access
- High file read/write performance
- User identity and security
- ...
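As a sanity check on the backup figure above, here is a minimal Python sketch of the arithmetic, using only the per-user numbers quoted on the slide (1.5 TB of space, 5% daily change, 4-month retention); the 30-day month used to turn the retention period into days is an assumption.

    # Back-of-the-envelope check of the per-user backup estimate.
    # Figures from the slide: 1.5 TB per user, 5% of it changing per day,
    # 4 months of retention; 30-day months are an assumption here.

    capacity_tb = 1.5            # assumed per-user storage (slide: 1-2 TB, take 1.5 TB)
    daily_change_fraction = 0.05
    retention_days = 4 * 30      # 4-month retention, ~120 days

    daily_change_tb = capacity_tb * daily_change_fraction   # 0.075 TB = 75 GB/day
    backup_space_tb = daily_change_tb * retention_days      # ~9 TB per user

    print(f"daily change : {daily_change_tb * 1000:.0f} GB/day")
    print(f"backup space : {backup_space_tb:.0f} TB per user")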

Cost and Technology Scenarios

Estimated 500 users at CERN → 750 TB of analysis disk storage, plus backup and archive?

(diagram) Hardware investments over 2 years, 'guestimates', for the options considered: 1.5 TB USB disks, Amazon Cloud Storage, TSM backup, HSM backup/archive (including a disk-only HSM variant), AFS, commercial filers (Isilon, BlueArc, Exanet, DataDirect) and distributed file system implementations (NFS4, Lustre, xrootd, ...). The quoted figures range from 0.4 MCHF to 7 MCHF, on top of differences in software operation, support and functionality.
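The site-level totals follow from simple scaling of the per-user figures; the sketch below repeats that arithmetic (500 users, 1.5 TB of disk each, 9 TB of backup each). Treat it as an illustration of the slide's numbers, not an official capacity plan.

    # Site-level totals implied by the per-user figures above; simple scaling.

    users = 500                 # estimated analysis users at CERN (slide)
    per_user_disk_tb = 1.5      # assumed per-user analysis space
    per_user_backup_tb = 9.0    # per-user backup space from the previous sketch

    total_disk_tb = users * per_user_disk_tb      # 750 TB of analysis disk
    total_backup_tb = users * per_user_backup_tb  # 4500 TB if every user fills the backup quota

    print(f"analysis disk : {total_disk_tb:.0f} TB")
    print(f"backup space  : {total_backup_tb / 1000:.1f} PB (upper bound)")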

Questions

- Where are these 'extra' resources coming from?
- Is there only one unique storage instance per user world-wide?
  - What about users working at different sites?
  - Do they have multiple end-user storage instances?
  - How is data transferred between instances?
- The difference between 'home-directory' storage and end-user analysis space is small: analysis tools/programs and the data itself must be accessed at the same time.
- Who decides which user gets how much space, and where? → experiment-specific policies
- What is the data flow model?
  - Notebook disk + site-local file system + global file system
  - Notebook disk + site-local scratch + cloud storage
  - Global file system only
  - ... more combinations ...
- Notebook issues: OS support, virtual analysis infrastructure, network connectivity (= data 'gas station')
- ... many more questions ...

Is there common interest in solving this problem? Is there a need/interest to create a working group to investigate in more detail? Experiments, sites?