CDF Offline Production Farms Stephen Wolbers for the CDF Production Farms Group May 30, 2001

Outline
– Farms Status, Size, Capacity
– Processing History
  – Commissioning
  – 1x8, 36x36
– Ramp up to full data-taking rates
– Monte Carlo Production Possibilities

CDF Run 2 Farms
– 48 (50) PIII/500 duals running Linux, 512 MB memory, 42 GB disk, 100 Mbit Ethernet
– 40 PIII/800 duals running Linux, 512 MB memory, 50 GB disk, 100 Mbit Ethernet
– SGI O2200 I/O node + SGI O2000 I/O node; disk on both, tape drives on the O2000
– Cisco 6509 switch to connect it all together
– 64 PIII/1 GHz duals have been purchased and will be in production in July/August
– Total: 152 duals, equivalent to 240 duals (500 MHz); see the sketch below
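The quoted 240-dual (500 MHz) equivalent is reproducible if one assumes performance scales linearly with clock speed; a minimal sketch of that arithmetic (the scaling assumption is mine, not stated on the slide):

```python
# Sketch: reproduce the "240 duals (500 MHz)" equivalent, assuming
# CPU performance scales linearly with clock speed (my assumption).
nodes = [
    (48, 500),    # 48 PIII/500 duals
    (40, 800),    # 40 PIII/800 duals
    (64, 1000),   # 64 PIII/1 GHz duals (in production July/August)
]
total_duals = sum(count for count, _ in nodes)
equiv_500_duals = sum(count * mhz / 500.0 for count, mhz in nodes)
print(total_duals, equiv_500_duals)   # 152 duals, 240.0 PIII/500-equivalent duals
```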

Run II CDF PC Farm

Total Capacity
– Assume 5 s/event on a PIII/500 machine
  – Full code, fully optimized, Run 2 luminosity
– At 100% utilization: events/s = 240 × 2 / 5 = 96 Hz (see the sketch below)
– Even at a more realistic utilization it will be easy to keep up with new data at peak rates.
– In addition there will be reprocessing and Monte Carlo.
– More machines can and probably will be purchased in FY02.
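The 96 Hz figure follows directly from the farm size and the assumed time per event; a short sketch of the arithmetic, with a utilization parameter added only to illustrate the "more realistic utilization" remark (the 75% value is illustrative, not from the slide):

```python
# Sketch of the capacity estimate: 240 PIII/500-equivalent duals,
# 2 CPUs per dual, 5 s per event (full code, Run 2 luminosity).
EQUIV_DUALS = 240
CPUS_PER_DUAL = 2
SEC_PER_EVENT = 5.0

def event_rate_hz(utilization=1.0):
    """Events per second the farm can reconstruct at a given utilization."""
    return EQUIV_DUALS * CPUS_PER_DUAL * utilization / SEC_PER_EVENT

print(event_rate_hz())       # 96.0 Hz at 100% utilization
print(event_rate_hz(0.75))   # 72.0 Hz at an illustrative 75% utilization
```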

Processing History
Commissioning Run, first pass: 3.11.0g
– Run 4 weeks after the data were collected
– 9.8 Million events, 730 GB input, 1080 GB output
– 179 code crashes
– CPU/event = 1.2 seconds (PIII/500)

Processing History
Commissioning Data, second pass
– Run 12 weeks after data-taking
– 9.9 Million events, 914 GB input, 1.5 TB output
– 32 code crashes
– CPU/event = 1.5 seconds (PIII/500)

Processing History
1x8 Data: _int1_patch
– Run in quasi-real time, as the data were taken
– 2.5 Million events, 400 GB input, 498 GB output
– 0 crashes (exception handling was off)
– 4 streams were processed (a, b, c, d)
– CPU/event = 0.5 seconds (PIII/500)

Processing History
36x36 Pass 1: c ProductionExe
– Ran about 1 week after the data were taken
– 5.1 Million events, 1.2 TB input, 1.6 TB output
– 15 code crashes
– CPU/event = 1.0 seconds (PIII/500)

First Pass of 36x36 Production
[plot; annotations: "Scheduled hardware work during accelerator downtime", "75 Hz"]

Processing History
36x36 Pass 2: ProductionExe
– Started May 20
– CPU/event = 2.3 seconds (PIII/500)

Processing Summary
– Commissioning, pass 1: 3.11.0g, 9.8 Million events, 1.2 sec/event
– Commissioning, pass 2: 9.9 Million events, 1.5 sec/event
– 1x8: 2.5 Million events, 0.5 sec/event
– 36x36, pass 1: 5.1 Million events, 1.0 sec/event
– 36x36, pass 2: 2.3 sec/event
– TOTAL: 32.5 Million events
– TOTAL CPU: 44.7 × 10^6 CPU seconds = 2.3 days on the current farm (see the sketch below)
– CPU time/event is affected by luminosity, trigger mix, and code
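A sketch that rebuilds the totals from the per-pass numbers above. The 36x36 pass-2 event count is inferred from the 32.5 million total, and the "current farm" is taken to be the 48 PIII/500 plus 40 PIII/800 duals (i.e. before the new 1 GHz machines); both are my reading of the slides rather than explicit statements:

```python
# Sketch: rebuild the Processing Summary totals from the per-pass numbers.
passes = {                       # events (millions), CPU sec/event (PIII/500)
    "Commissioning pass 1": (9.8, 1.2),
    "Commissioning pass 2": (9.9, 1.5),
    "1x8":                  (2.5, 0.5),
    "36x36 pass 1":         (5.1, 1.0),
    "36x36 pass 2":         (32.5 - 27.3, 2.3),   # ~5.2M events, inferred
}
total_events_M = sum(ev for ev, _ in passes.values())
total_cpu_sec = sum(ev * 1e6 * sec for ev, sec in passes.values())

# "Current farm": 48 PIII/500 + 40 PIII/800 duals, scaled to PIII/500 units.
current_farm_cpus = (48 + 40 * 800 / 500) * 2     # ~224 equivalent CPUs
days = total_cpu_sec / current_farm_cpus / 86400

print(f"{total_events_M:.1f}M events, {total_cpu_sec/1e6:.1f}e6 CPU s, ~{days:.1f} days")
# ~32.5M events, ~44.9e6 CPU s, ~2.3 days (slide quotes 44.7e6; difference is rounding)
```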

Issues for Ramp-up to Full Data-Taking Rates
– Hardware
– Software
  – Expressline
– External to farms
– People

Hardware
– Plenty of CPU
– Disk and networking: OK
– Tape drives and tape I/O: part of the data handling (DH) system
– Overall: OK

Software
– Farm software: mostly ready. Need to run with multiple input and output streams. Essentially ready, no problems expected.
– Expressline: a portion of the farm will be dedicated to expressline at all times.

External to Farms
The farms rely on many things:
– ProductionExe
– Calibrations
– Data Handling System
– Networks
– Dataset definitions
The farms group will continue to work with CDF offline, the data handling group, CDF L3, etc. to ensure timely production.

People
Production Coordinators (shifters)
– Working well (thanks!)
– Debugging, checking output, checking logs, checking status, running jobs, etc.
– Work and responsibilities will ramp up as the data-taking rate increases.
– Look at fnpcc.fnal.gov to see the status.

How Much Monte Carlo?
– The original farm design document (CDF 4810) envisioned part of the farm for Monte Carlo generation and reconstruction; assume this is about ¼ of the farm, on average.
– This would allow about 8 Hz of Monte Carlo, assuming simulation + reconstruction takes about 15 s/event on a PIII/500 (about 700,000 events/day); see the sketch below.
  – This compares to a 28 Hz average data-taking rate for CDF.
  – Degrade by some factor for efficiency.
  – Will be reduced if time/event increases → roughly 500,000 events/day.
– Needs CDF physicist support: effort to coordinate and verify that the simulation is correct for CDF.
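A quick check of the Monte Carlo estimate, reusing the 240-dual-equivalent capacity from the earlier slide; the variable names are illustrative:

```python
# Sketch of the Monte Carlo capacity estimate: ~1/4 of the farm,
# 15 s/event for simulation + reconstruction on a PIII/500.
FARM_CPUS = 240 * 2          # PIII/500-equivalent CPUs (earlier slide)
MC_FRACTION = 0.25           # average share of the farm devoted to MC
SEC_PER_MC_EVENT = 15.0

mc_rate_hz = FARM_CPUS * MC_FRACTION / SEC_PER_MC_EVENT
print(f"{mc_rate_hz:.0f} Hz, {mc_rate_hz * 86400:,.0f} events/day")
# -> 8 Hz, ~691,200 events/day (rounded to 700,000 on the slide)
```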

Summary
– Farm CPU capacity is in place.
– Farm infrastructure is in place.
– Effort needs to go into smoother operations, better communication, quicker startup, better quality control, and better documentation as we head toward full-rate data-taking.
– All of CDF offline has had a big role in making the farms work.