LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Jean-Yves Nief CC-IN2P3, Lyon An example of a data access model in a Tier 1.

Presentation transcript:

LCG Phase 2 Planning Meeting - Friday July 30th, 2004
Jean-Yves Nief, CC-IN2P3, Lyon
An example of a data access model in a Tier 1

Overview of CC-IN2P3 (I)
CC-IN2P3 has been the mirror site of SLAC for BaBar since November 2001:
– real data.
– simulation data.
(total = 220 TB)
It provides the infrastructure needed by the end users to analyze these data.
Open to all BaBar physicists.

Overview of CC-IN2P3 (II)
2 types of data available:
– Objectivity format (commercial OO database): being phased out.
– ROOT format (ROOT I/O, accessed via SLAC's Xrootd).
Hardware:
– 200 GB tapes (type: 9940).
– 20 tape drives (r/w rate = 20 MB/s).
– 20 Sun servers.
– 30 TB of disk (disk/tape ratio = 15%; effectively ~30% if rarely accessed data are ignored), used both as permanent space and as staging cache.

BaBar at CC-IN2P3 in 2004
~20% of the available CPU (out of a total of ~1000 CPUs).
Hundreds of users' jobs running in parallel.
« Distant access » to the Objectivity and ROOT files from the batch worker (BW):
– random access to the files: only the objects needed by the client are transferred to the BW (~kB per request).
– hundreds of connections per server.
– thousands of requests per second.
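The "only the objects needed" point is what ROOT's tree I/O gives almost for free over Xrootd: reading a single branch pulls only that branch's baskets across the network. Below is a minimal ROOT macro sketch of such a sparse read; the server name, file path, tree name and branch name are hypothetical placeholders, not BaBar's actual layout.

// read_sparse.C -- ROOT macro sketch; names below are made-up examples.
#include "TFile.h"
#include "TTree.h"
#include <iostream>

void read_sparse()
{
   // "root://" URLs are handled by the xrootd client instead of local I/O.
   TFile *f = TFile::Open("root://xrootd.example.in2p3.fr//babar/data/T1.root");
   if (!f || f->IsZombie()) return;

   TTree *t = 0;
   f->GetObject("events", t);           // hypothetical tree name
   if (!t) { f->Close(); return; }

   // Disable everything, then re-enable only the branch we care about:
   // only that branch's baskets are requested from the data server.
   t->SetBranchStatus("*", false);
   t->SetBranchStatus("nTracks", true); // hypothetical branch name

   Int_t nTracks = 0;
   t->SetBranchAddress("nTracks", &nTracks);

   Long64_t n = t->GetEntries();
   for (Long64_t i = 0; i < n; ++i) {
      t->GetEntry(i);                   // transfers only the enabled branch
   }
   std::cout << "read " << n << " entries" << std::endl;
   f->Close();
}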

Data access model
[Diagram: a client asks the master servers (Xrootd / Objy master daemon) where T1.root is and is redirected to one of the data servers (Xrootd / Objy slave daemon), each with local disks and HPSS behind them, which then serves the file.]
– (1) + (2): dynamic load balancing.
– (4) + (5): dynamic staging.
– (6): random access to the data.
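The flow in the diagram (pick the least-loaded server, stage from the MSS if the file is not on disk, then serve random reads) can be summarized in a toy model. The sketch below is a plain C++ illustration of that logic under stated assumptions; it is not the real Xrootd/olbd protocol or API, and all names are invented.

// toy_redirector.cpp -- illustrative model of the redirect + staging flow,
// NOT the real Xrootd implementation (compile with C++11 or later).
#include <iostream>
#include <set>
#include <string>
#include <vector>

struct DataServer {
    std::string name;
    int load;                          // number of clients currently attached
    std::set<std::string> diskCache;   // files already staged on local disk

    // (4) + (5): dynamic staging -- fetch the file from the MSS (HPSS)
    // into the local disk cache if it is not there yet.
    void stageIfNeeded(const std::string &file) {
        if (!diskCache.count(file)) {
            std::cout << name << ": staging " << file << " from HPSS\n";
            diskCache.insert(file);
        }
    }

    // (6): random access -- serve only the byte ranges the client asks for.
    void read(const std::string &file, long offset, long nbytes) {
        std::cout << name << ": read " << nbytes << " bytes of "
                  << file << " at offset " << offset << "\n";
    }
};

struct Master {
    std::vector<DataServer> servers;

    // (1) + (2): dynamic load balancing -- redirect the client to the
    // least-loaded server; any server may hold (or stage) any file.
    DataServer &redirect(const std::string &file) {
        DataServer *best = &servers.front();
        for (auto &s : servers)
            if (s.load < best->load) best = &s;
        ++best->load;
        std::cout << "master: redirecting request for " << file
                  << " to " << best->name << "\n";
        return *best;
    }
};

int main() {
    Master master;
    master.servers = {{"server01", 0, {}}, {"server02", 0, {}}, {"server03", 0, {}}};

    // A client asks the master where T1.root is and gets redirected (1)+(2);
    // the chosen server stages the file if needed (4)+(5); the client then
    // reads only the objects it needs (6).
    DataServer &s = master.redirect("T1.root");
    s.stageIfNeeded("T1.root");
    s.read("T1.root", 1048576, 4096);
    return 0;
}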

Dynamic staging
– Average file size: 500 MB.
– Average staging time: 120 s.
– When the system was overloaded (before the dynamic load balancing era): multi-minute delays, with only 200 jobs running.
– Up to 10k files staged from tape to disk cache per day (150k staging requests per month!).
– Max of 4 TB from tape to disk cache per day.
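A quick back-of-the-envelope check of these figures, using only the numbers quoted on the hardware and staging slides, is written out below; the interpretation of where the 120 s goes (mount, positioning, queueing) is an inference, not something stated in the talk.

// staging_arithmetic.cpp -- back-of-the-envelope numbers from the slides.
#include <iostream>

int main() {
    const double driveRateMBs  = 20.0;    // slide: 20 MB/s per 9940 drive
    const int    nDrives       = 20;      // slide: 20 tape drives
    const double fileSizeMB    = 500.0;   // slide: average file size
    const double filesPerDay   = 10000.0; // slide: up to 10k files/day

    // Pure transfer time for one file on one drive: 500 MB / 20 MB/s = 25 s,
    // so most of the quoted 120 s average is mount/positioning and queueing.
    std::cout << "transfer time per file: " << fileSizeMB / driveRateMBs << " s\n";

    // Aggregate drive bandwidth: 20 * 20 MB/s = 400 MB/s, i.e. ~34.5 TB/day
    // as an upper bound, well above the observed 4 TB/day peak, so the drives
    // themselves are not the only limit.
    const double aggMBs = nDrives * driveRateMBs;
    std::cout << "aggregate drive bandwidth: " << aggMBs << " MB/s ("
              << aggMBs * 86400 / 1e6 << " TB/day upper bound)\n";

    // 10k files/day * 500 MB = 5 TB/day, consistent with the 4 TB/day peak.
    std::cout << "10k files/day corresponds to "
              << filesPerDay * fileSizeMB / 1e6 << " TB/day\n";
    return 0;
}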

Dynamic load balancing
Up and running since December 2003 for Objectivity (previously, a given file could only be staged on one particular server).
– no more delayed jobs (even with 450 jobs in parallel).
– more efficient management of the disk cache (the entire disk space is seen as a single file system).
– fault tolerance in case of server crashes.

Pros …
– Mass Storage System (MSS) usage completely transparent to the end user.
– No cache space management by the user.
– Extremely fault tolerant (server crashes, maintenance work).
– Highly scalable, and the entire disk space is used efficiently.
– On the admin side: you can choose your favourite MSS and your favourite staging protocol (SLAC: pftp, Lyon: RFIO, …).
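To illustrate the "choose your favourite staging protocol" point, here is a hypothetical wrapper in the spirit of a site-specific stage command: the data server calls one external command to bring a file from the MSS into its disk cache, and which command is used is a site choice. The command lines (rfcp, the pftp_get.sh placeholder) and paths are assumptions for illustration, not the actual CC-IN2P3 or SLAC configuration.

// stage_wrapper.cpp -- hypothetical site-specific staging helper.
#include <cstdlib>
#include <iostream>
#include <string>

int stageFile(const std::string &site, const std::string &mssPath,
              const std::string &cachePath) {
    std::string cmd;
    if (site == "lyon") {
        // RFIO copy (illustrative; actual rfcp usage is site-dependent).
        cmd = "rfcp " + mssPath + " " + cachePath;
    } else if (site == "slac") {
        // pftp-based transfer (placeholder script; pftp is normally scripted).
        cmd = "pftp_get.sh " + mssPath + " " + cachePath;
    } else {
        std::cerr << "unknown site: " << site << "\n";
        return 1;
    }
    std::cout << "staging with: " << cmd << "\n";
    return std::system(cmd.c_str());
}

int main() {
    // Hypothetical paths; in production the data server would call this
    // automatically when a requested file is missing from the disk cache.
    return stageFile("lyon", "/hpss/in2p3.fr/babar/T1.root",
                     "/cache/babar/T1.root");
}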

… and cons
The entire machinery relies on many different components (especially an MSS).
In case of very high demand on the client side, the response time can become really slow, depending also on:
– the number of data sets available.
– having a good data structure.

Data structure: the fear factor
The performance of a data access model also depends on the data structure.
Deep copies vs « pointer » files (files only containing pointers to other files)?
Deep copies:
– duplicated data.
– OK in a « full disk » scenario.
– OK if used with an MSS.
« Pointer » files:
– no data duplication.
– OK in a « full disk » scenario.
– potentially very stressful on the MSS (VERY BAD).
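One way to see why pointer files can hammer the MSS: a pointer skim stores (parent file, entry) references, so reading it back may require staging every parent file it points to, whereas a deep copy is a single self-contained file. The types, file names and counts below are a made-up illustration of that bookkeeping, not BaBar's actual event store.

// skim_cost.cpp -- made-up illustration of deep-copy vs pointer-file access cost.
#include <iostream>
#include <set>
#include <string>
#include <vector>

// A pointer skim keeps references into other files instead of the event data.
struct EventRef {
    std::string parentFile;  // file that really holds the event
    long entry;              // entry number inside that file
};

int main() {
    // Hypothetical skim of 6 selected events scattered over 4 parent files.
    std::vector<EventRef> pointerSkim = {
        {"run100.root", 17}, {"run100.root", 523},
        {"run207.root", 42}, {"run311.root", 8},
        {"run311.root", 999}, {"run452.root", 3}};

    // Reading the pointer skim may require staging each distinct parent file
    // from tape; reading a deep copy requires staging exactly one file.
    std::set<std::string> parentsToStage;
    for (const auto &ref : pointerSkim)
        parentsToStage.insert(ref.parentFile);

    std::cout << "pointer skim: up to " << parentsToStage.size()
              << " tape stages for " << pointerSkim.size() << " events\n";
    std::cout << "deep copy   : 1 tape stage for the same events\n";
    return 0;
}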

What about other experiments?
Xrootd is well adapted to users' jobs that use ROOT to analyze a large dataset:
– it is being included in the official version of ROOT.
– it is already set up in Lyon and being used or tested by other groups: D0, EUSO and INDRA.
– access to files stored in HPSS is transparent.
– no need to manage the disk space.

Summary
– Storage and data access is the main challenge.
– A good disk/tape ratio is hard to find: it depends on many factors (users, number of tape drives, etc.).
– Xrootd provides lots of interesting features for remote data access.
– Extremely robust (a great achievement for a distributed system).