Published by Judith Bösch. Modified over 6 years ago.
Near Real Time Reconstruction of PHENIX Run7 Minimum Bias Data From RHIC
Project Goals
- Reconstruct 10% of the PHENIX minimum bias data from RHIC Run7 (Spring 2007, 200 GeV) in near real time at the Vanderbilt University computer farm
- Envisioned as a demonstration project to PHENIX, CMS, VU, and DOE
- Speed the process of deriving physics information from the detector’s raw data
  - Currently about one year elapses between end of data taking and end of reconstruction
  - Run7 data taken by PHENIX may not be analyzed in time for QM’08 (February 2008)
- In future Vanderbilt could be reconstructing ~1/3 of PHENIX raw data in real time, with ~1/3 analyzed at CCF and CCJ (PHENIX computer centers in France and Japan), and ~1/3 at the RHIC Computer Facility (RCF) ===> more time and more CPUs for analyses!

Project Requirements
- Large computer farm with ~50 TBytes of disk space, 200 CPUs
- Fast and automated file transport to and from the Vanderbilt computer farm
- Close coordination among PHENIX, Vanderbilt, and the RHIC Computer Facility
- Dedicated scientific staff at Vanderbilt (1 faculty, 1 PD, 3 students)
- Most important facet is complete automation and comprehensive monitoring tools

June 15, 2007, CMS-HI at CERN
[Diagram: Run7 data flow between RCF and the Vanderbilt farm. Labels recovered from the figure:]
- RCF: RHIC computing facility
- Vanderbilt Farm: 1600 CPUs, 80 TBytes disk; 45 TB and 200 CPUs available for Run7 reconstruction
- PRDFs (raw data files): GridFTP to VU at 30 MBytes/sec, 770 GBytes per cycle
- Dedicated GridFTP server "Firebird": 4.4 TB of buffer disk space
- Internal transfers: FDT at 45 MB/s (in both directions)
- Reconstruction: PRDFs --> nanoDSTs, 200 jobs/cycle, 18 hours/job
- nanoDSTs (reco output): GridFTP to RCF at 23 MB/sec
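The rates and volumes on this slide imply a per-cycle transfer budget, which can be estimated as in the sketch below. It is illustrative only, not part of the original talk: it assumes decimal units (1 GByte = 1000 MBytes) and sustained transfer at the quoted rates, and it takes the output size as 70% of the input, the figure reported on the "Actual Experience" slide. The function and constant names are made up for the example.

```python
# Rough per-cycle transfer budget from the numbers quoted on the slides.
# Assumptions (not from the talk): decimal units, sustained quoted rates.

def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb gigabytes at rate_mb_per_s megabytes/second."""
    seconds = size_gb * 1000.0 / rate_mb_per_s
    return seconds / 3600.0

INPUT_GB_PER_CYCLE = 770.0   # PRDFs shipped to Vanderbilt per cycle
RATE_TO_VU_MB_S = 30.0       # GridFTP, RCF -> Vanderbilt
RATE_TO_RCF_MB_S = 23.0      # GridFTP, Vanderbilt -> RCF
OUTPUT_FRACTION = 0.70       # observed nanoDST size relative to input

input_hours = transfer_hours(INPUT_GB_PER_CYCLE, RATE_TO_VU_MB_S)
output_hours = transfer_hours(INPUT_GB_PER_CYCLE * OUTPUT_FRACTION, RATE_TO_RCF_MB_S)

print(f"input transfer:  {input_hours:.1f} h per cycle")
print(f"output transfer: {output_hours:.1f} h per cycle")
```

Under these assumptions the input transfer takes about 7.1 hours and the output transfer about 6.5 hours per cycle, so both fit inside the 18-hour reconstruction time only if the links stay close to their quoted rates.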
Actual Experience April - June 2007
Raw Data File Transport (PHENIX DAQ -> Vanderbilt)
- Automated GridFTP tools in use since 2005 (to CCJ, ORNL, VU, CCF)
- Works very well and is fast enough for current needs (30 MBytes/second)
- Can (and should) be pushed harder
- New FDT scripts for internal transfers can run at 100 MBytes/second
  - FDT is a Java-based technology supported out of CERN
  - Throttled back to 45 MBytes/second to minimize impact elsewhere on the farm
- May receive 37 TB of raw data (Au+Au, 200 GeV) containing 325M events by the end of June

Reconstruction Cycle
- Larger than expected output size: 70% of input size, where PHENIX had predicted 35%
- Transport time of the output limits the compute cycle to 200 jobs, to avoid pileup
- Cycle is 18 hours (jobs are 12 CPU-hours, but the new quad CPUs are only 75% efficient [memory bus] !!)
- Weekends and holidays are slack times --> all 200 jobs start running right away
- Weekdays are busier, so there can be a 12 hour wait for the last job to start

Reconstructed Output File Transport (Vanderbilt -> RCF)
- Most fragile component, due to disk I/O competition at RCF with other PHENIX users
- PHENIX is severely short on disk space this year, but should be better off next fiscal year
- Developed “fault tolerant” GridFTP scripts which recover if transfers fail mid-stream
- Large demand to view files arriving at RCF (4 new subsystems installed for Run7)
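The slides do not show how the "fault tolerant" GridFTP scripts work. One common way to recover from transfers that fail mid-stream is to retry with an increasing delay, sketched below. This is a generic illustration under stated assumptions, not the PHENIX scripts: `transfer_with_retry` and its parameters are hypothetical names, and the `transfer` callable stands in for whatever command actually moves a file.

```python
import time
from typing import Callable

def transfer_with_retry(
    transfer: Callable[[], bool],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call transfer() until it returns True; return the number of attempts used.

    transfer is a hypothetical callable that returns True on success and
    returns False or raises OSError on failure. The delay doubles after each
    failed attempt. Raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            if transfer():
                return attempt
        except OSError:
            pass  # treat an I/O error like any other failed attempt
        if attempt < max_attempts:
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise RuntimeError(f"transfer failed after {max_attempts} attempts")


if __name__ == "__main__":
    # Simulated transfer that fails twice, then succeeds.
    outcomes = iter([False, False, True])
    used = transfer_with_retry(lambda: next(outcomes), sleep=lambda s: None)
    print(f"succeeded after {used} attempts")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. A production script would also need to verify the received file (for example by size or checksum) before reporting success.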