1
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Jean-Yves Nief CC-IN2P3, Lyon An example of a data access model in a Tier 1
2
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Overview of BaBar @ CC-IN2P3 (I) CC-IN2P3: mirror site of SLAC for BaBar since November 2001: – real data. – simulation data. (total = 220 TB) Provides the infrastructure needed by the end users to analyze these data. Open to all BaBar physicists.
3
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Overview of BaBar @ CC-IN2P3 (II) 2 types of data available: – Objectivity format (commercial OO database): being phased out. – ROOT format (ROOT I/O: Xrootd developed @ SLAC). Hardware: – 200 GB tapes (type: 9940), used as permanent storage. – 20 tape drives (r/w rate = 20 MB/s). – 20 Sun servers. – 30 TB of disks, used as cache (disk/tape ratio = 15%; effectively ~30% if rarely accessed data are ignored).
4
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 BaBar usage @ CC-IN2P3 2002 – 2004: ~20% of the available CPU (out of a total of ~1000 CPUs). Up to 450-500 user jobs running in parallel. « Remote access » to the Objy and ROOT files from the batch worker (BW): random access to the files: only the objects needed by the client are transferred to the BW (~kB per request); hundreds of connections per server; thousands of requests per second.
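This remote random-access pattern can be pictured with a short ROOT macro. It is a minimal sketch, not BaBar code: the server URL, file path, tree name and branch name are hypothetical, and it assumes an xrootd daemon serving the file; because only the enabled branch's baskets travel over the network, each request stays at the ~kB level.

```cpp
// Minimal sketch (hypothetical server, file, tree and branch names):
// open a file through xrootd and read a single branch, so that only
// the objects actually needed are transferred to the batch worker.
#include "TFile.h"
#include "TTree.h"
#include <iostream>
#include <memory>

void read_remote()
{
   // root:// URL handled by the xrootd daemon on the data server
   std::unique_ptr<TFile> f(TFile::Open("root://xrootd.example.org//babar/T1.root"));
   if (!f || f->IsZombie()) { std::cerr << "cannot open file\n"; return; }

   TTree *events = nullptr;
   f->GetObject("Events", events);          // "Events" is an assumed tree name
   if (!events) return;

   events->SetBranchStatus("*", false);     // disable all branches...
   events->SetBranchStatus("energy", true); // ...then enable only the one we need

   Double_t energy = 0;
   events->SetBranchAddress("energy", &energy);
   for (Long64_t i = 0; i < events->GetEntries(); ++i)
      events->GetEntry(i);                  // fetches only the "energy" baskets

   std::cout << "read " << events->GetEntries() << " entries" << std::endl;
}
```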
5
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Data access model [Diagram: a client asks the master servers (Xrootd / Objy master daemon) which data server holds a file, e.g. T1.root; the master redirects the client to a data server (Xrootd / Objy slave daemon) backed by disks and HPSS; if needed the file is staged from HPSS to disk, then accessed by the client. (1) + (2): dynamic load balancing. (4) + (5): dynamic staging. (6): random access to the data.]
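For reference, a sketch of how the master / slave roles in this diagram would be declared in an xrootd configuration file. It uses present-day directive names (all.role, all.manager, oss.localroot, oss.stagecmd), not the 2004-era olbd syntax, and the host names and staging script are hypothetical.

```
# Redirector (master) host: answers "which server has T1.root ?" and redirects the client
all.role manager
all.manager redirector.example.org:1213

# Data server (slave) hosts: serve files from local disk,
# staging them from the MSS (HPSS here) on a cache miss
all.role server
oss.localroot /xrootd/cache
oss.stagecmd /usr/local/bin/stage_from_hpss.sh   # hypothetical RFIO/pftp staging wrapper
```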
6
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Dynamic staging Average file size: 500 MB. Average staging time: 120 s. When the system was overloaded (before the dynamic load balancing era): 10-15 min delays (with only 200 jobs). Up to 10k files staged from tape to disk cache per day (150k staging requests per month!). Maximum of 4 TB from tape to disk cache per day.
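A quick back-of-the-envelope check on these peaks, using only the averages quoted above (so indicative, not measured):

```
peak staging volume  ≈ 10,000 files/day × 500 MB/file ≈ 5 TB/day
  (consistent with the observed maximum of ~4 TB/day)
aggregate tape bandwidth = 20 drives × 20 MB/s = 400 MB/s ≈ 34 TB/day
  → peak staging uses only ~10-15% of the nominal drive bandwidth
transfer part of the latency: 500 MB / 20 MB/s = 25 s,
  the rest of the 120 s average being mount, seek and queueing time
```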
7
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Dynamic load balancing Up and running since December 2003 for Objectivity (before, a file could only be staged on a given server). No more delayed jobs (even with 450 jobs in parallel). More efficient management of the disk cache (the entire disk space is seen as a single file system). Fault tolerance in case of server crashes.
8
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Pros … Mass Storage System (MSS) usage completely transparent to the end user. No cache space management by the user. Extremely fault tolerant (in case of server crashes or during maintenance work). Highly scalable + the entire disk space is used efficiently. On the admin side: you can choose your favourite MSS and your favourite staging protocol (SLAC: pftp, Lyon: RFIO, …).
9
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 … and cons The entire machinery relies on many different components (especially an MSS). In case of very high demand on the client side, response time can be very slow. Performance also depends on: – the number of data sets available. – a good data structure.
10
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Data structure: the fear factor The performance of a data access model also depends on the data structure. Deep copies vs « pointer » files (containing only pointers to other files)? Deep copies: – duplicated data. – OK in a « full disk » scenario. – OK if used with an MSS. « Pointer » files: – no data duplication. – OK in a « full disk » scenario. – potentially very stressful on the MSS (VERY BAD).
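The two options can be pictured with a short ROOT sketch (illustrative only, not the BaBar collection format; the input tree and the nTracks selection are hypothetical): a deep copy rewrites the selected events into a new file, while a « pointer » file stores only entry numbers, so every later read goes back to the original files and hence, potentially, to the MSS.

```cpp
// Illustrative sketch: the same event selection written as a deep copy
// and as a "pointer" (entry-list) file. Tree and cut are hypothetical.
#include "TFile.h"
#include "TTree.h"
#include "TEntryList.h"
#include "TDirectory.h"

void make_skims(TTree *events)
{
   // Deep copy: selected events are rewritten, so data are duplicated,
   // but later reads only touch the new (small) file.
   TFile deep("skim_deep.root", "RECREATE");
   TTree *skim = events->CopyTree("nTracks > 2");
   skim->Write();

   // "Pointer" file: only the entry numbers of the selected events are
   // stored; nothing is duplicated, but later reads go back to the
   // original files (and possibly to the MSS behind them).
   events->Draw(">>sel", "nTracks > 2", "entrylist");
   auto *sel = static_cast<TEntryList*>(gDirectory->Get("sel"));
   TFile ptrs("skim_pointers.root", "RECREATE");
   if (sel) sel->Write("sel");
}
```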
11
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 What about other experiments ? Xrootd is well adapted to user jobs using ROOT to analyze a large dataset. Being included in the official ROOT release. Already set up in Lyon and being used or tested by other groups: D0, EUSO and INDRA. Transparent access to files stored in HPSS. No need to manage the disk space.
12
LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Summary Storage and data access is the main challenge. A good disk/tape ratio is hard to find: it depends on many factors (users, number of tape drives, etc.). Xrootd provides many interesting features for remote data access. Extremely robust (a great achievement for a distributed system).