Post-DC2/Rome Production
Kaushik De, Mark Sosebee
University of Texas at Arlington
U.S. Grid Phone Meeting, July 13, 2005
July 11, 2005 Kaushik De
Overview
Why restart managed production?
- ~450 people attended the Rome meeting, with ~100 talks, many based on DC2/Rome data (though some still used DC1 or ATLFAST data).
- A growing number of physicists are looking at the data every day.
- Since Rome, many physics groups need data urgently and are starting private production because "the grid is not available": SUSY needs a background sample, top needs a large dataset, Higgs...
- Private production of common data samples is wasteful: many samples are repeated, never used, or contain mistakes.
- Private production is not the correct model for 2008: we do not have infinite computing resources, and we will need quality control.
- The grid needs to be available for ATLAS, just like the components of the detector or the core software, as a service to the collaboration.
Production Proposal
- The plan is being developed jointly with physics coordination, software release management, and prodsys development.
- Ian Hinchliffe & Davide Costanzo presented the physics plan June 21st.
- KD is organizing distributed production for all of ATLAS; this talk was presented during the phone meeting July 11th.
General assumptions about the grid:
- Production needs to help with testing of the new prodsys.
- Production must allow for the shutdowns required to upgrade middleware (OSG, ARC, LCG) and services (RLS, DQ, DB).
First proposal: restart low-level production mid-July.
- July/August: validate release 10.0.4 (first on OSG, later NG/LCG).
- September: test the new prodsys.
- October: 1M-event production to provide data for the Physics Workshop.
July 2005 Proposal
- Finish up the Rome pile-up sample on LCG.
- Archive/clean up files on all grids, after checking with the physics groups and making a general announcement with 2 weeks' notice:
  - Archive and delete all DC1 and DC2 files?
  - Archive and delete the Rome simul files?
- Upgrade middleware/fabric:
  - OSG: started end of June, ready by mid-July (Yuri Smirnov is already testing U.S. sites with the Rome top sample).
  - ARC? LCG?
- Prodsys/grid testing: started.
- 10.0.4 validation:
  - Xin Zhao has started installations on OSG.
  - Pavel Nevski is defining jobs.
August 2005 Proposal
- Start production of some urgent physics samples:
  - Use DC2 software.
  - Get the list from the physics groups.
- Validation of 10.0.x.
- Stage in the input files needed for September.
- Prodsys integration and scaling tests.
- Grid infrastructure tests of the new fabric (top sample).
- RTT set-up and testing (for nightly validation).
- DDM testing.
- Complete deployment of 10.0.4/10.0.x on all grids.
Production Samples
Physics groups will define 3 major samples:
- Sample A: quick validation or debugging of software; scale 10^5 events, 15 datasets.
- Sample B: validation of major releases; scale 10^6 events, 25 datasets.
- Sample C: production sample; scale 10^7 events (same as DC2 or Rome), 50 datasets.
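The three sample scales above can be summarized with a quick back-of-the-envelope calculation. The sketch below is a Python illustration (the structure and names are mine, not part of the original plan) that derives the average events per dataset implied by each sample scale.

```python
# A minimal sketch (structure and names are mine, not from the talk):
# tabulate the three sample scales quoted above and derive the average
# number of events per dataset that each scale implies.

samples = {
    # name: (total events, number of datasets, stated purpose)
    "A": (10**5, 15, "quick validation / debugging of software"),
    "B": (10**6, 25, "validation of major releases"),
    "C": (10**7, 50, "production sample (same scale as DC2/Rome)"),
}

for name, (events, datasets, purpose) in samples.items():
    per_dataset = events // datasets
    print(f"Sample {name}: {events:>10,} events in {datasets} datasets "
          f"(~{per_dataset:,} events/dataset): {purpose}")
```

This gives roughly 6,700 events per dataset for sample A, 40,000 for sample B, and 200,000 for sample C.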
September: Validation
Sep. 12-26, 2005
- Goal: test production readiness (validate prodsys).
- Use sample A (a throw-away sample) with Rome release 10.0.1.
- Start RTT nightly tests with 10^4 events.
- Start grid & prodsys tests with 10^5 events.
- Steps: evgen, simul, digi, and reco.
- Scale: 2000 jobs (50 events each) in 2 weeks (<1% of the Rome rate).
Sep. 26-Oct. 3, 2005
- Goal: test production readiness (validate the new release).
- Use sample A with release 10.0.4, same steps.
- Grid, prodsys & DDM (pre-production) tests with 10^5 events.
- Scale: 2000 jobs in 1 week (~2% of the Rome rate).
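The job counts quoted on this and the following slides follow directly from the sample size and the number of events generated per job. A quick sanity check (the helper function is hypothetical, introduced here only for illustration):

```python
def jobs_needed(total_events: int, events_per_job: int) -> int:
    """Grid jobs needed to produce total_events at events_per_job each
    (assumes the sample size divides evenly into jobs)."""
    assert total_events % events_per_job == 0
    return total_events // events_per_job

# Sample A validation runs: 10^5 events at 50 events per job
print(jobs_needed(10**5, 50))   # -> 2000, the job count quoted above

# The later proposals quote 100 events per job:
print(jobs_needed(10**6, 100))  # -> 10,000 jobs for a sample B run
print(jobs_needed(10**7, 100))  # -> 100,000 jobs for a sample C run
```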
Computer Commissioning Proposal
Oct. 3-17, 2005
- Goal: production for the Physics Workshop scheduled for the end of October.
  - Prodsys ready? (Rod?) DDM ready? (Miguel?) Contingency plan?
- Use sample B with release 10.0.x.
- Steps: evgen, simul, digi, reco, tag (concatenation).
- Scale: 10,000 jobs with 100 events each in 2 weeks (~10-20% of the peak Rome rate; sample size is ~15% of the total Rome sample).
- Note: this is the end of the line for release 10.
Release 11 CSC Proposal
October/November
- RTT run every night to discover problems (Davide?).
- 1 week after any bug-fix release, run Sample A on the grid: scale 1000 jobs (100 events each), all steps if possible, typically in 2-5 days.
- 1 week after any stable release, run Sample B on the grid: scale 10k jobs, all steps, typically in 1-2 weeks (still <10% of the peak Rome production rate).
November/December
- Goal: generate the background sample for the blind challenge; test prodsys.
- Use sample C with a stable/tested release 11.0.x.
- Steps: evgen, simul, digi, reco, tag (concatenation).
- Scale: 100k jobs with 100 events each (should exceed the Rome rate; sample size approx. the same as Rome).
2006 CSC Proposal
Early 2006
- Goal: blind challenge (physics algorithm tune-up, test of the analysis model).
- Mix background data with blind samples (not done on the grid, to protect content; possibly run at a few Tier 2 sites).
- Continue to run Samples A and B on the grid for every release.
- Calibration/alignment test with Release 12 requires Sample C-scale production (equivalent to Rome).
Appendix: Definition of Sample A
~100k events, 10k per sample:
1. Min bias
2. Z->ee
3. Z->mumu
4. Z->tautau, forced to large pT so that the missing ET is large
5. H->gamgam (130 GeV)
6. b-tagging sample 1
7. b-tagging sample 2
8. top
9. J1
10. J4
11. J8
Appendix: Definition of Samples B, C
Sample B: 1M events, at least 25k per sample
- Includes all the sets from sample A, plus additional physics samples.
Sample C: 10M events, at least 100k per sample
- Includes all the sets from sample A, at 500k events each, plus additional physics samples.