1
CMS-HI Offline Computing
Charles F. Maguire, Vanderbilt University
For the CMS-HI US Institutions
May 14, 2008
Draft Version 4 on May 13 at 2:30 PM CDT
2
CMS-HI at the LHC for High Density QCD with Heavy Ions
LHC: New Energy Frontier for Relativistic Heavy Ion Physics
- Quantum Chromodynamics at extreme conditions (density, temperature, …)
- Pb+Pb collisions at 5.5 TeV, nearly thirty times the Au+Au energy at RHIC
- Expecting a longer-lived Quark Gluon Plasma state, accompanied by much enhanced yields of hard probes with high mass and/or transverse momentum
CMS: Excels as a Heavy Ion Collision Detector at the LHC
- Sophisticated high-level trigger for selecting rare, important events at a rapid rate
- Best momentum resolution and tracking granularity
- Large-acceptance tracking and calorimetry --> proven jet finding in HI events
3
CMS-HI in the US
10 Participating Institutions
- Colorado, Iowa, Kansas, LANL, Maryland, Minnesota, MIT, UC Davis, UIC, Vanderbilt
- Projected to contribute ~60 FTEs (Ph.D.s and students) as of 2012
- MIT as lead US institution, with Boleslaw Wyslouch as Project Manager
US-CMS-HI Tasks (Research Management Plan in 2007)
- Completion of the HLT CPU farm upgrade for HI events at CMS (MIT)
- Construction of the Zero Degree Calorimeter at CMS (Kansas and Iowa)
- Development of the CMS-HI compute center in the US (task force established)
  - Task force recommended that the Vanderbilt University group lead the proposal composition
  - Also want to retain the expertise developed by the MIT HI group at their CMS Tier2 site
CMS-HI Compute Proposal to DOE is Due This Month
- Will be reviewed by ESnet managers, and also by external consultants
4
Basis of Compute Model for CMS-HI
Heavy Ion Data Operations for CMS at the LHC
- Heavy ion collisions expected in 2009, the second year of running for the LHC
- Heavy ion running takes place during a 1-month period (~10^6 seconds of beam time)
- At the design DAQ bandwidth the CMS detector will write 225 TBytes of raw data per heavy ion running period, plus ~75 TBytes of support files (see the rate sketch below)
- Raw data will likely stay resident on CERN Tier0 disks for a few days at most, while transfers take place to the CMS-HI compute center in the US
  - Possibility of 300 TBytes of disk dedicated to HI data (NSF REDDnet project)
- Raw data will not be reconstructed at the Tier0, but will be written to a write-only (emergency archive) tape system before deletion from the Tier0 disks
Projected Data Volumes (optimistic scenario)
- Assumes that we don't need 400 TBytes of disk space immediately
- TBytes in first year of HI operations
- TBytes in second year of HI operations
- 300 TBytes nominal size achieved in third year of HI operations
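As a cross-check, here is a back-of-the-envelope sketch of the average export rate implied by the volumes quoted above. The ~10^6 s of beam time and the 225 + 75 TByte figures come from this slide; everything else is plain arithmetic and is not stated in the presentation.

```python
# Back-of-the-envelope rates implied by the CMS-HI data volumes quoted above.
# Inputs: ~10^6 s of heavy-ion beam time and 225 + 75 TBytes per running period (from the slide).

TB = 1e12          # bytes per TByte (decimal convention)
live_time_s = 1e6  # approximate heavy-ion beam time per running period (seconds)

raw_tb = 225.0     # raw data per running period (TBytes)
support_tb = 75.0  # support files per running period (TBytes)
total_bytes = (raw_tb + support_tb) * TB

avg_rate_MBps = total_bytes / live_time_s / 1e6      # MBytes/s averaged over live time
avg_rate_Gbps = total_bytes * 8 / live_time_s / 1e9  # Gbit/s averaged over live time

print(f"Average rate over {live_time_s:.0e} s live time: "
      f"{avg_rate_MBps:.0f} MB/s (~{avg_rate_Gbps:.1f} Gbit/s)")
```

This yields roughly 300 MB/s (about 2.4 Gbit/s) sustained over the live time, which is the scale the transfer options on the next slide have to accommodate.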
5
Data Transport Options for CMS-HI
Following ESnet-NP 2008 Workshop Recommendations
CMS-HEP Raw Data Transport from CERN to the FNAL Tier1
- Uses LHCnet to cross the Atlantic, with LHCnet terminating at the Starlight hub
- ESnet transport from Starlight into the FNAL Tier1 center
- Links are rated at 10 Gbps
CMS-HI Raw Data Transport from CERN to the US Compute Center
- Network topology has not been established at this time
- Vanderbilt is establishing a 10 Gbps path to SoX-Atlanta by the end of 2008
Network Options (DOE requires a non-LHCnet backup plan; see the rate sketch below)
1. Use LHCnet to Starlight during the one month when HI beams are being collided; transport the data from Starlight to the Vanderbilt compute center via ESnet/Internet2; transfer links would still be rated at 10 Gbps so the data can be transferred within ~1 month
2. Do not use LHCnet, but use other trans-Atlantic links supported by NSF, with links rated at 10 Gbps such that the data are transferred over ~1 month
3. Install a 300 TByte disk buffer at CERN and use non-LHCnet trans-Atlantic links to transfer the data over 4 months (US-ALICE model) at ~2.5 Gbps
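A minimal sketch comparing the sustained trans-Atlantic rates needed for the ~300 TByte yearly volume under the 1-month and 4-month transfer windows above. The volumes, windows, and link ratings come from the slides; the headroom factors are derived here and are not stated in the presentation.

```python
# Sustained rates needed to ship the ~300 TByte yearly CMS-HI volume across the
# Atlantic within the two transfer windows discussed above, compared with the
# link ratings quoted on the slide.

def required_gbps(volume_tbytes: float, window_days: float) -> float:
    """Average rate (Gbit/s) needed to move volume_tbytes within window_days."""
    bits = volume_tbytes * 1e12 * 8
    seconds = window_days * 24 * 3600
    return bits / seconds / 1e9

scenarios = [
    # (description, volume in TBytes, window in days, quoted link rating in Gbps)
    ("Transfer during/after the 1-month HI run", 300, 30, 10.0),
    ("300 TB buffer at CERN, drain over 4 months", 300, 120, 2.5),
]

for name, volume, days, link_gbps in scenarios:
    need = required_gbps(volume, days)
    print(f"{name}: need ~{need:.2f} Gbps average; "
          f"link rated {link_gbps} Gbps (~{link_gbps / need:.0f}x headroom)")
```

Both quoted link ratings work out to roughly a factor of ten above the raw average rate, i.e. comparable headroom for protocol overhead, sharing, and retransfers in either scenario.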
6
Vanderbilt to SoX Using Managed Service, Connecting 10G to ACCRE Only
(Network diagram: Vanderbilt Information Technology Services)
7
Data Transport Issues for CMS-HI
Following ESnet-NP 2008 Workshop Recommendations
To Use LHCnet or Not To Use LHCnet
- Use of LHCnet to the US, following the CMS-HEP path, is the simplest approach
- A separate trans-Atlantic link would require dedicated CMS-HI certifications
- DOE requires that a non-LHCnet plan be discussed in the CMS-HI compute proposal
Issues With the Use of LHCnet by CMS-HI
- ESnet officials believe that LHCnet is already fully subscribed by FNAL
- The HI month was supposed to be used for getting the final sets of HEP data transferred
- FNAL was quoted as having only 5% "headroom" left in its use of LHCnet
- The HI data volume is 10% of the HEP data volume
Issues With the Non-Use of LHCnet by CMS-HI
- Non-use of LHCnet would be a new, unverified path for data out of the LHC
- CERN computer management would have to approve (same for US-ALICE)
- Installing a 300 TByte disk buffer system at CERN (ESnet recommendation) would also have to be approved and integrated into the CERN Tier0 operations
8
Tape Archive for CMS-HI? Raw Data and Processed Data for Each Year
CMS-HI Raw Data Tape Archive: FNAL or Vanderbilt?
- Original suggestion was that the raw data archive be at Vanderbilt
- Vanderbilt already has a tape robot system that can be upgraded with 1.5 more PBytes
- Possibly more economical to archive the raw (and processed) data at FNAL?
  - Savings result from not needing a person at VU dedicated to tape service
  - Cost savings translate into more CPUs for the CMS-HI compute center
Issues for the Tape Archive Decision
- Is there a true cost savings ($135K over 5 years) as anticipated? [see slide 10]
- What specific new burdens would be assumed at FNAL?
  - Serving raw data twice per year to Vanderbilt for reconstruction
  - Receiving processed data annually for archiving into the tape system
- Is the tape archive decision coupled with the use/non-use of LHCnet?
- Administrative issue of transferring funds from DOE-NP to DOE-HEP
9
Data Processing for CMS-HI
Real Data Processing Model
- Two reconstruction passes per year, taking 4 months each
- Two subsequent analysis passes per year, taking 2 months each
- Analyzed data placed at selected data depots (MIT, UIC, Vanderbilt)
- A CMS-HI Tier2 center is also forecast to exist in South Korea
- All CMS-HI institutions access AOD files via OSG job submissions
Real Data Output File Production Model
- Normal reco output is 20% of the raw data input
- Reco output with diagnostics is 400% of the raw data input (only for initial runs)
CPU Processing Requirement (see the sizing sketch below)
- Extensive (several months) studies with MC HI events done at the MIT Tier2
- MC studies done with the current CMSSW framework
- Conclusion that ~2900 CPU cores will be required for a nominal year of running
  - Cores at SpecInt2K 1600 equivalent (obsolete unit; actual benchmark testing will be done)
- Ramp-up to the nominal year will require fewer cores, since there will be fewer complex events
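A minimal sizing sketch of the arithmetic behind a farm estimate like the ~2900 cores quoted above. Only the 4-month reconstruction window is taken from the slide; the event count, per-event CPU time, and scheduling efficiency are hypothetical placeholders, not the inputs used in the MIT Tier2 MC studies.

```python
# Sketch of how a reconstruction farm size can be estimated.
# HYPOTHETICAL placeholders below; the real estimate came from the MIT Tier2 MC studies.

n_events = 5.0e7          # hypothetical: heavy-ion events recorded per running period
cpu_s_per_event = 300.0   # hypothetical: reco CPU seconds per event on a SpecInt2K-1600 core
pass_window_days = 120    # one reconstruction pass must finish within ~4 months (from the slide)
efficiency = 0.80         # hypothetical: fraction of wall time the cores do useful work

wall_seconds = pass_window_days * 24 * 3600
cores_needed = n_events * cpu_s_per_event / (wall_seconds * efficiency)

print(f"One reconstruction pass: ~{cores_needed:.0f} cores "
      f"({n_events:.1e} events x {cpu_s_per_event:.0f} s/event over {pass_window_days} days)")
```

With these made-up inputs a single reconstruction pass already needs on the order of 2000 cores; the quoted ~2900-core requirement additionally reflects the actual event sample, measured per-event times, and the analysis passes.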
10
Implementation of the CMS-HI Compute Center
Three Scenarios Being Considered
1. Raw Data + Output Production Archived on Tape at FNAL
- Cost transfer of $35K/year to FNAL obtains 2.23 PBytes of tape over 5 years
- Five-year sequence: TBytes of tape
- Cost savings estimated at $135K over 5 years (see the derived comparison below)
- After 5 years there will be 3000 CPU cores available at Vanderbilt
2. Raw Data + Output Production Archived on Tape at VU
- Costs will include tapes and a dedicated FTE to support the additional tape load
- After 5 years there will be 2480 CPU cores available at Vanderbilt
3. Tape Archive at VU and a 300 TByte Buffer Disk at CERN
- Buffer disk at CERN allows raw data to be transferred over 4 months instead of 1 month
- Would need consultation with the CERN Tier0 managers
- Suggested by ESnet managers as a fallback if the trans-Atlantic network is not fast enough
- After 5 years there will be 2120 CPU cores available at Vanderbilt
- Possibly these 300 TBytes could be funded by a separate proposal, e.g. NSF REDDnet
NOTE: None of these scenarios includes costs for network transfers
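A small derived comparison of the three scenarios. The core counts and the $135K savings figure come from the slides; the implied cost per core is computed here and is not stated in the presentation.

```python
# Derived comparison of the three compute-center scenarios above.
# Core counts and the $135K savings are from the slides; the cost-per-core is derived.

scenarios = {
    "1: tape archive at FNAL": 3000,
    "2: tape archive at VU": 2480,
    "3: tape at VU + 300 TB CERN buffer": 2120,
}

savings_1_vs_2_usd = 135_000  # estimated 5-year savings of scenario 1 over scenario 2 (slide)
extra_cores = scenarios["1: tape archive at FNAL"] - scenarios["2: tape archive at VU"]

print(f"Scenario 1 provides {extra_cores} more cores than scenario 2 after 5 years")
print(f"Implied marginal cost: ~${savings_1_vs_2_usd / extra_cores:.0f} per core (derived)")
```

The 520-core difference against the quoted $135K savings implies roughly $260 per core, which is the trade-off the tape-archive decision on slide 8 is weighing.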
11
Backup Slides
1) VU network topologies ( )
2) Gunther's spreadsheet (to be added)
12
(Backup slide; figure only, no text extracted)
13
Add ORNL or Starlight later using Dark Fiber (unmanaged)
14
LHCNet?? Fermilab??
*Would need to have peering or a layer 2 VLAN setup.
15
(Backup slide; no text extracted)