1
Grid Data Management: Simulations of LCG 2008
Caitriana Nicholson
University of Glasgow
CHEP 2006, Mumbai
2
Outline
Introduction
–What will LHC data analysis and management be like in 2008?
The OptorSim grid simulator
OptorSim architecture
Experimental setup
Results
Conclusions
3
Introduction
LHC raw data rate of ~15 PB/year
LCG to provide the data storage and computing infrastructure
Actual analysis behaviour still unknown
–Use simulation to investigate behaviour
–Investigate dynamic data replication
4
OptorSim
OptorSim is a grid simulator with a focus on data management
Developed as part of EDG WP2
–Thanks to all members of the Optimisation Team: David Cameron, Ruben Carvajal-Schiaffino, Paul Millar, Kurt Stockinger, Floriano Zini
Based on the EDG architecture
Used to examine automated decisions about replica placement and deletion
http://edg-wp2.web.cern.ch/edg-wp2/optimization/optorsim.html
5
Architecture
Sites with a Computing Element (CE) and/or Storage Element (SE)
Replica Optimiser decides replications for its site
Resource Broker schedules jobs
Replica Catalogue maps logical to physical filenames
Replica Manager controls and registers replications
6
Algorithms
Job scheduling
–Details not covered in this talk
–"QueueAccessCost" scheduler used in these results
Data replication
–No replication
–Simple replication: "always replicate, delete existing files if necessary"
  Least Recently Used (LRU)
  Least Frequently Used (LFU)
  (deletion choice sketched below)
–Economic model: "replicate only if profitable"
  Sites "buy" and "sell" files using an auction mechanism
  Files deleted if less valuable than the new file
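As a rough illustration of the two cache-style deletion strategies (the class and method names here are invented for the sketch, not taken from OptorSim, which is itself written in Java): when a new replica must be created on a full SE, LRU evicts the replica with the oldest last access, while LFU evicts the one read the fewest times.

```java
import java.util.Comparator;
import java.util.Map;

// Per-replica bookkeeping needed by both strategies (illustrative only).
class FileStats {
    long lastAccessTime;  // simulation time of the most recent read
    long accessCount;     // number of reads so far
    FileStats(long lastAccessTime, long accessCount) {
        this.lastAccessTime = lastAccessTime;
        this.accessCount = accessCount;
    }
}

class DeletionPolicies {
    // LRU: delete the replica whose last access is oldest.
    static String lruVictim(Map<String, FileStats> replicas) {
        return replicas.entrySet().stream()
                .min(Comparator.comparingLong(e -> e.getValue().lastAccessTime))
                .map(Map.Entry::getKey)
                .orElseThrow(IllegalStateException::new);
    }

    // LFU: delete the replica that has been read the fewest times.
    static String lfuVictim(Map<String, FileStats> replicas) {
        return replicas.entrySet().stream()
                .min(Comparator.comparingLong(e -> e.getValue().accessCount))
                .map(Map.Entry::getKey)
                .orElseThrow(IllegalStateException::new);
    }
}
```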
7
Experimental Setup - Jobs & Files
Job types based on the experiments' computing models
"Dataset" for each experiment ~1 year's AOD
2 GB files
Placed at CERN and Tier-1s at the start
See the experiment computing TDRs for more details

Job         Event size (kB)   Total no. of files   Files per job
alice-pp    50                25000                25
alice-hi    250               12500                125
atlas       100               100000               50
cms         50                37500                25
lhcb-small  75                37500                38
lhcb-big    75                37500                375
8
Experimental Setup - Storage Resources
CERN & T1 site capacities from the LCG TDR
"Canonical" T2 capacity of 197 TB each (18.8 PB / 95 sites)
Storage metric D = (average SE size) / (total dataset size)
Memory limitations -> T2 SE sizes scaled down to 500 GB (worked example of D below)
–Allows file deletion to start quickly
–Disadvantage of small D
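To make the metric concrete, here is a worked example using numbers from these slides (the scaled-down 500 GB T2 SEs, and the atlas dataset of 100000 files of 2 GB each from the jobs & files slide). The code is purely illustrative, not part of OptorSim:

```java
// Worked example of the storage metric D = (average SE size) / (total dataset size),
// using the scaled-down setup described in these slides.
public class StorageMetricExample {
    public static void main(String[] args) {
        double averageSeSizeGB = 500.0;          // scaled-down T2 SE size
        double datasetSizeGB = 100_000 * 2.0;    // atlas: 100000 files x 2 GB = 200 TB
        double d = averageSeSizeGB / datasetSizeGB;
        System.out.printf("D = %.5f%n", d);      // prints "D = 0.00250"
    }
}
```

The small value illustrates the stated disadvantage: each SE can hold only a tiny fraction of the dataset, so replicas are evicted aggressively.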
9
Experimental Setup - Computing & Network
Most (chaotic) analysis jobs run at T2s
–T1s not given a CE, except those running LHCb jobs
–CERN Analysis Facility with CE of 7840 kSI2k
–T2s with averaged CE of 645 kSI2k each (61.3 MSI2k / 95 sites)
Network based on NREN topologies
–Sites connected to the closest router
–Default of 155 Mbps if a published value not available
10
Network Topology
11
Parameters
Job scheduler "QueueAccessCost"
–Combines data location and queue information (sketched below)
Sequential access pattern
1000 jobs per simulation
Site policies set according to the LCG Memorandum of Understanding
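A hedged sketch of what a "QueueAccessCost"-style decision looks like: each candidate site is scored by the estimated access cost of the new job's files plus the access cost of the jobs already queued there, and the cheapest site wins. The cost function, class names and method names below are placeholders; OptorSim's real estimate also weights remote accesses by network conditions between sites.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Site {
    Set<String> localFiles = new HashSet<>();
    List<List<String>> queuedJobs = new ArrayList<>();  // each job = its input file list

    // Placeholder cost: number of files that would have to cross the network.
    double accessCost(List<String> jobFiles) {
        return jobFiles.stream().filter(f -> !localFiles.contains(f)).count();
    }

    // Cost of the new job plus the cost of everything already in the queue.
    double queueAccessCost(List<String> newJobFiles) {
        double queued = queuedJobs.stream().mapToDouble(this::accessCost).sum();
        return accessCost(newJobFiles) + queued;
    }
}

class QueueAccessCostScheduler {
    // Send the job to the site where (job cost + queue cost) is smallest.
    static Site schedule(List<Site> sites, List<String> jobFiles) {
        return sites.stream()
                .min(Comparator.comparingDouble(s -> s.queueAccessCost(jobFiles)))
                .orElseThrow(IllegalStateException::new);
    }
}
```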
12
Evaluation Metrics
Different grid users will have different evaluation criteria
Used in these summary results:
–Mean job time: average time taken for a job to run, from scheduling to completion
–Effective Network Usage (ENU) = (file requests which use network resources) / (total number of file requests)
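Both metrics reduce to a few lines of code. This sketch follows the definitions above; the names are illustrative, not OptorSim's:

```java
// The two summary metrics, written out from the slide's definitions.
class EvaluationMetrics {
    // Average time from scheduling to completion, over all finished jobs.
    static double meanJobTime(long[] jobDurations) {
        long total = 0;
        for (long d : jobDurations) total += d;
        return (double) total / jobDurations.length;
    }

    // Fraction of file requests that used network resources (remote reads or
    // replications) rather than being satisfied by a local replica.
    static double effectiveNetworkUsage(long networkFileRequests, long totalFileRequests) {
        return (double) networkFileRequests / totalFileRequests;
    }
}
```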
13
Results: Data Replication
Performance of the algorithms measured with varying D
D varied by reducing the dataset size
20-25% gain in mean job time as D approaches a realistic value
14
Results: Data Replication
ENU shows a similar gain
Allows clearer distinction between strategies
15
Results: Data Replication
Number of jobs increased to 4000
Mean job time increases linearly
Relative improvement as D increases will hold for higher numbers of jobs
Realistic number of jobs is >O(10000)
16
Results: Site Policies
Vary site policies:
–All Job Types: sites accept jobs from any VO
–One Job Type: sites accept jobs from one VO
–Mixed: default
All Job Types is ~60% faster than One Job Type
17
Results: Site Policies
All Job Types also gives ~25% lower ENU than the other policies
An egalitarian approach benefits all grid users
18
Results: Access Patterns
Sequential access likely for many HEP applications
Zipf-like access will also occur
–Some files accessed frequently, many infrequently (sketched below)
Replication gives a performance gain of ~75% when a Zipf access pattern is used
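For reference, a minimal way to generate the two access patterns being compared. The Zipf exponent alpha = 1.0 in the usage comment is an assumption for illustration; the slides do not give a value:

```java
import java.util.Random;

class AccessPatterns {
    // Sequential: each job reads its files in order, 0, 1, 2, ...
    static int[] sequential(int filesPerJob) {
        int[] seq = new int[filesPerJob];
        for (int i = 0; i < filesPerJob; i++) seq[i] = i;
        return seq;
    }

    // Zipf-like: file of rank k is requested with probability proportional
    // to 1/k^alpha, so a few files are hot and most are rarely touched.
    // Sampling inverts the cumulative distribution with a binary search.
    // Example usage: zipf(50, 100000, 1.0, new Random(42))
    static int[] zipf(int filesPerJob, int datasetSize, double alpha, Random rng) {
        double[] cdf = new double[datasetSize];
        double norm = 0.0;
        for (int k = 1; k <= datasetSize; k++) {
            norm += 1.0 / Math.pow(k, alpha);
            cdf[k - 1] = norm;
        }
        int[] seq = new int[filesPerJob];
        for (int i = 0; i < filesPerJob; i++) {
            double u = rng.nextDouble() * norm;
            int lo = 0, hi = datasetSize - 1;
            while (lo < hi) {  // find first index whose cdf is >= u
                int mid = (lo + hi) / 2;
                if (cdf[mid] < u) lo = mid + 1; else hi = mid;
            }
            seq[i] = lo;
        }
        return seq;
    }
}
```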
19
Results: Access Patterns
ENU also ~75% lower with Zipf access
Any Zipf-like element makes replication highly desirable
Size of the efficiency gain depends on the streaming model, etc.
20
Conclusions
OptorSim used to simulate LCG in 2008
Dynamic data replication reduces the running time of simulated grid jobs:
–20% reduction with sequential access
–75% reduction with Zipf-like access
–Similar reductions in network usage
Little difference between replication strategies
–Simpler LRU and LFU are 20-30% faster than the economic model
A site policy which allows all experiments to share resources gives the most effective grid use
21
Replica Optimiser Architecture
Access Mediator (AM) - contacts replica optimisers to locate the cheapest copies of files and makes them available locally
Storage Broker (SB) - manages the files stored in an SE, trying to maximise profit for the finite amount of storage space available (sketched below)
P2P Mediator (P2PM) - establishes and maintains P2P communication between grid sites
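A simplified sketch of the Storage Broker's "replicate only if profitable" decision. The class name, method name, and the file-value bookkeeping are invented for illustration; the real economic model estimates a file's value from its access history and sets prices through auctions between sites:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class StorageBroker {
    private final Map<String, Double> replicaValues = new HashMap<>();
    private final int capacityInFiles;

    StorageBroker(int capacityInFiles) {
        this.capacityInFiles = capacityInFiles;
    }

    // Replicate only if profitable: if the SE is full, the new file must be
    // worth more than the least valuable replica, which is then deleted.
    boolean considerReplication(String newFile, double newFileValue) {
        if (replicaValues.size() < capacityInFiles) {
            replicaValues.put(newFile, newFileValue);  // free space: always accept
            return true;
        }
        Map.Entry<String, Double> cheapest = Collections.min(
                replicaValues.entrySet(), Map.Entry.comparingByValue());
        if (newFileValue <= cheapest.getValue()) {
            return false;  // keeping the current replicas is more profitable
        }
        replicaValues.remove(cheapest.getKey());       // delete the old replica
        replicaValues.put(newFile, newFileValue);
        return true;
    }
}
```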
22
GridPP: Executive Summary
Tony Doyle