CMS T1/T2 Estimates
Dave Newbold (Dave.Newbold@cern.ch), pre-GDB Meeting, 5/12/06
Presentation transcript

Slide 1: CMS T1/T2 Estimates
- CMS perspective:
  - Part of a wider process of resource estimation
  - Top-down Computing Model -> real per-site estimates
  - More detail exists than is presented in the Megatable
- Original process:
  - CMS had a significant resource shortfall (esp. at T1)
  - To respect pledges -> ad hoc descoping of the Computing Model
- After new LHC planning:
  - New top-down planning roughly matches the overall pledged resource
  - Allow resource requirements at T1 centres to float a little
  - Establish a self-consistent balance of resources
- Outputs:
  - Transfer capacity estimates between centres
  - New guidance on the balance of resources at T1/T2

Slide 2: Inputs: CMS Model
- Data rates, event sizes (cross-check sketch below)
  - Trigger rate: ~300 Hz (450 MB/s)
  - Sim-to-real ratio is 1:1 (though not all full simulation)
  - RAW (sim): 1.5 (2.0) MB/evt; RECO (sim): 250 (400) kB/evt
  - All AOD is 50 kB/evt
- Data placement
  - RAW/RECO: one copy across all T1s, disk1tape1
  - Sim RAW/RECO: one copy across all T1s, on tape with 10% disk cache
    (How is this expressed in the diskXtapeY formalism? Is this formalism in fact appropriate for resource questions…?)
  - AOD: one copy at each T1, disk1tape1
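As a quick cross-check of the quoted 450 MB/s, here is a minimal sketch multiplying the trigger rate by the per-event sizes above; the assumption that the 450 MB/s figure refers to the RAW stream alone is mine, not stated on the slide.

```python
# Quick cross-check of the quoted T0 output rate against the per-event sizes.
# Assumption (mine): the 450 MB/s figure corresponds to the RAW stream only.

trigger_rate_hz = 300     # ~300 Hz trigger rate
raw_mb_per_evt = 1.5      # RAW event size (real data), MB/evt
reco_mb_per_evt = 0.25    # RECO event size, MB/evt
aod_mb_per_evt = 0.05     # AOD event size, MB/evt

raw_only_rate = trigger_rate_hz * raw_mb_per_evt                     # 450 MB/s
all_tiers_rate = trigger_rate_hz * (raw_mb_per_evt + reco_mb_per_evt
                                    + aod_mb_per_evt)                # 540 MB/s

print(f"RAW-only rate:     {raw_only_rate:.0f} MB/s")
print(f"RAW+RECO+AOD rate: {all_tiers_rate:.0f} MB/s")
```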

Slide 3: Inputs: LHC, Centres
- 2008 LHC assumptions (see the volume sketch below)
  - 92 days of 'running' (does not include long MD periods)
  - 50% efficiency during 'running'
  - Practical implication: the T0 is 100% busy for this time
  - Input smoothing at T0 required; assume queue < few days
  - T0 output rate is flat during 'running' (straight from T0 capacity)
  - More precise input welcomed and would be useful
    (not expected to have strong effects on most of the estimates)
- Efficiencies, overheads, etc.
  - Assume 70% T1/T2 disk fill factor (the 30% included in the experiment requirement)
  - Assume 100% tape fill factor (i.e. any overhead owned by the centre)
  - T1 CPU efficiency back to 75% / 85% (chaotic / scheduled)
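A small, illustrative sketch of what these assumptions imply for the 2008 data volume, combined with the rates on slide 2; the totals are back-of-the-envelope estimates, not numbers from the talk.

```python
# Back-of-the-envelope 2008 data volume from the slide's running assumptions
# plus the 300 Hz / 450 MB/s figures from slide 2. Illustrative only.

running_days = 92       # days of 'running' (excluding long MD periods)
efficiency = 0.50       # 50% efficiency during 'running'
trigger_rate_hz = 300
t0_raw_rate_mb_s = 450.0

live_seconds = running_days * 86400 * efficiency        # ~4.0e6 s
events = trigger_rate_hz * live_seconds                 # ~1.2e9 events
raw_volume_tb = t0_raw_rate_mb_s * live_seconds / 1e6   # MB -> TB, ~1800 TB

print(f"Live time:  {live_seconds:.2e} s")
print(f"Events:     {events:.2e}")
print(f"RAW volume: {raw_volume_tb:.0f} TB")
```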

Slide 4: Centre Roles: T1
- T1 storage requirements:
  - Curation of assigned fraction of RAW
    (assigned raw data fractions: 1st fundamental input to the T1/T2 process)
  - Storage of corresponding RECO / MC from associated T2 centres
    (association of T1/T2: 2nd fundamental input to the T1/T2 process)
  - Hosting of the entire AOD (rough size sketch below)
- T1 processing requirements:
  - Re-reconstruction: RAW -> RECO -> AOD
  - Skimming; group and end-user bulk analysis of all data tiers
  - Calibration, alignment, detector studies, etc.
- T1 connections
  - T0 -> T1: prompt RAW/RECO from T0 (to tape)
  - T1 <-> T1: replication of new AOD version / hot data
  - T1 -> T2; T2 -> T1 (see below)
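As referenced above, a rough sketch of what hosting one full AOD copy could mean in volume, combining the slide 2 event sizes with the slide 3 running assumptions; it ignores multiple AOD versions, overheads and the 70% disk fill factor, so it is an illustration rather than a Megatable figure.

```python
# Rough size of one full AOD copy, which each T1 hosts in its entirety.
# Assumptions: ~1.2e9 real events in 2008 (slide 3), sim:real = 1:1 and
# AOD = 50 kB/evt for both real and simulated data (slide 2).

real_events = 300 * 92 * 86400 * 0.50    # trigger rate x live seconds
sim_events = real_events                 # 1:1 sim-to-real ratio
aod_kb_per_evt = 50

aod_copy_tb = (real_events + sim_events) * aod_kb_per_evt / 1e9   # kB -> TB
print(f"One full AOD copy: ~{aod_copy_tb:.0f} TB per T1")          # ~120 TB
```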

Slide 5: Centre Roles: T2
- T2 storage requirements:
  - Caching of T1 data for analysis; no custodial function
  - Working space for analysis groups, MC production
- T2 processing requirements:
  - Analysis / MC production only
  - Assume the ratio of analysis:MC is constant across T2s
- T1 -> T2 dataflow:
  - AOD: comes from any T1 in principle, often from the associated T1
    (centres without a 'local' T1 can usefully share the load)
  - RECO: must come from the defined T1 holding that sample
  - Implies full T1 -> T2 many-to-many interconnection
    (a natural consequence of the storage-efficient computing model)
- T2 -> T1 dataflow:
  - MC data always goes to the associated T1

Slide 6: T1/T2 Associations
- NB: these are working assumptions in some cases
- Stream "allocation" ~ available storage at the centre

Slide 7: Centre Roles: CERN CAF / T1
- CAF functionality
  - Provides a short-latency analysis centre for critical tasks
  - e.g. detector studies, DQM, express analysis, etc.
  - All data available in principle
- T1 functionality
  - CERN will act as the associated T1 for the RDMS / Ukraine T2s
  - Note: not a full T1 load, since no T1 processing, no RECO serving
  - There is the possibility to carry out more general T1 functions,
    e.g. as a second source of some RECO in case of overload
  - Reserve this T1 functionality to ensure flexibility
    (same spirit as the CAF concept)
- CERN non-T0 connections
  - Specific CERN -> T2 connection to associated centres
  - Generic CERN -> T2 connection for service of unique MC data, etc.
  - T1 <-> CERN connection for new-AOD exchange

Slide 8: Transfer Rates
- Calculating data flows (see the sketch below)
  - T0 -> T1: from data rates and running period
    (rate is constant during running, zero otherwise)
  - T1 <-> T1: from total AOD size and replication period (currently 14 days)
    (high rate, short duty cycle, so OPN capacity can be shared;
    the short replication period is driven by the disk required for multiple AOD copies)
  - T1 -> T2: from T2 capacity and refresh period at the T2 (currently 30 days)
    (this gives the average rate only, not a realistic use pattern)
  - T2 -> T1: from total MC per centre per year
- Peak versus average (T1 -> T2)
  - Worst-case peak for a T1 is the sum of T2 transfer capacities,
    weighted by the data fraction at that T1
  - Realistically, aim for: average_rate < T1_capacity < peak_rate
  - The difference between peak and average is uniformly a factor 3-4
  - Better information on T2 connection speeds will be needed
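A minimal sketch of the average-rate recipes listed above. Only the formulas (a data volume divided by its replication or refresh period) follow the slide; the input volumes are hypothetical placeholders, not CMS Megatable numbers.

```python
# Average-rate recipes from slide 8. Only the formulas follow the talk;
# the input volumes below are hypothetical placeholders.

def avg_rate_mb_s(volume_tb: float, period_days: float) -> float:
    """Average rate needed to move volume_tb within period_days."""
    return volume_tb * 1e6 / (period_days * 86400)   # TB -> MB, days -> s

# T1 <-> T1: one full AOD copy replicated within the 14-day replication period
aod_copy_tb = 120            # placeholder (cf. the AOD sketch above)
print(f"T1<->T1 average: {avg_rate_mb_s(aod_copy_tb, 14):.0f} MB/s")

# T1 -> T2: refresh the T2 disk cache within the 30-day refresh period
t2_disk_tb = 200             # placeholder T2 disk capacity
print(f"T1->T2  average: {avg_rate_mb_s(t2_disk_tb, 30):.0f} MB/s")

# T2 -> T1: total simulated data produced at the T2 over a year
t2_mc_tb_per_year = 300      # placeholder annual MC volume
print(f"T2->T1  average: {avg_rate_mb_s(t2_mc_tb_per_year, 365):.0f} MB/s")
```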

Slide 9: Outputs: Rates
- Units are MB/s
- These are raw rates: no catch-up (x2?), no overhead (x2?); see the sketch below
  - Potentially some large factor to be added
  - A common understanding is needed
- FNAL T2-out-avg is around 50% US, 50% external
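The x2 catch-up and x2 overhead factors are explicitly question-marked on the slide; the sketch below only illustrates how such factors would compound on a raw Megatable rate if both were applied.

```python
# How the tentative headroom factors would compound on a raw Megatable rate.
# Both x2 factors are question-marked on the slide; all values here are
# illustrative, not agreed numbers.

raw_rate_mb_s = 100.0     # example raw (average) rate
catchup_factor = 2.0      # headroom to catch up after downtime ("x2?")
overhead_factor = 2.0     # protocol / operational overhead ("x2?")

provisioned = raw_rate_mb_s * catchup_factor * overhead_factor
print(f"Provisioned rate: {provisioned:.0f} MB/s")   # 4x the raw rate
```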

Slide 10: Outputs: Capacities
- "Resource" from a simple estimate of relative unit costs
  - CPU : Disk : Tape at 0.5 : 1.5 : 0.3 (a la B. Panzer); see the sketch below
- Clearly some fine-tuning left to do
  - But this is a step towards a reasonably balanced model
- Total is consistent with the top-down input to the CRRB, by construction
- Storage classes are still under study
  - Megatable totals are firm, but the diskXtapeY categories are not
  - This may be site-dependent (also, details of cache)
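A minimal sketch of the cost-weighted "resource" figure of merit. The 0.5 : 1.5 : 0.3 weights are from the slide; the per-unit basis (kSI2K of CPU, TB of disk and tape) and the example capacities are assumptions for illustration only.

```python
# Cost-weighted "resource" figure of merit. The 0.5 : 1.5 : 0.3 weights are
# from the slide; the per-unit basis (kSI2K of CPU, TB of disk, TB of tape)
# and the example capacities are assumptions for illustration only.

WEIGHTS = {"cpu": 0.5, "disk": 1.5, "tape": 0.3}

def resource(cpu_ksi2k: float, disk_tb: float, tape_tb: float) -> float:
    """Single cost-weighted number for comparing overall site size."""
    return (WEIGHTS["cpu"] * cpu_ksi2k
            + WEIGHTS["disk"] * disk_tb
            + WEIGHTS["tape"] * tape_tb)

# Hypothetical T1-scale capacity
print(f"Weighted resource: {resource(cpu_ksi2k=2000, disk_tb=1000, tape_tb=2000):.0f}")
```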

Slide 11: e.g. RAL Storage Planning

Slide 12: Comments / Next Steps?
- T1/T2 process:
  - Has been productive and useful; exposed many issues
- What other information is useful for sites?
  - Internal dataflow estimates for centres (-> cache sizes, etc.)
  - Assumptions on storage classes, etc.
  - Similar model estimates for 2007 / 2009+
  - Documentation of assumed CPU capacities at centres
- What does CMS need?
  - Feedback from sites (not overloaded with this so far)
  - Understanding of site ramp-up plans, resource balance, network capacity
  - Input on realistic LHC schedule, running conditions, etc.
  - Feedback from providers on network requirements
- Goal: a detailed self-consistent model for 2007/8
  - Based upon real / guaranteed centre and network capacities…
  - Gives at least an outline for ramp-up at sites and the global experiment
  - Much work left to do…

Slide 13: Backup: Rate Details

Slide 14: Backup: Capacity Details

