

Slide 1: UTA MC Production Farm & Grid Computing Activities
Jae Yu, UT Arlington
DØRACE Workshop, Feb. 12, 2002
Outline:
- UTA DØMC Farm
- MCFARM job control and packaging software
- What has been happening?
- Conclusion

Slide 2: UTA DØ Monte Carlo Farm
- UTA operates 2 Linux MC farms: HEP and CSE
- HEP farm: 6x566 MHz and 36x866 MHz processors, 3 file servers (250 GB), one job server, 8 mm tape drive
- CSE farm: 10x866 MHz processors, 1 file server (20 GB), 1 job server
- Exploring the option of adding a third farm (ACS, 36x866 MHz)
- Control software (job submission, load balancing, archiving, bookkeeping, job execution control, etc.) developed entirely at UTA by Drew Meyer
- Scalable: started with 7 processors, 52 processors at present

Slide 3: MCFARM – UTA Farm Control Software
- MCFARM is a specialized batch system for Pythia, Isajet, D0g, D0sim, D0reco, and recoanalyze
- Can be adapted for other sites and experiments with relatively minor changes
- Reliable: error recovery and checkpoint system; it knows how to handle and recover from typical error conditions (see the sketch below)
- Robust: even if several nodes crash, production can continue
- Interfaced to SAM and a bookkeeping package; easily exports production status to a WWW page
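
To make the checkpoint/recovery idea concrete, here is a minimal sketch, not the actual MCFARM code: a stage is skipped if a checkpoint file already marks it done and is otherwise retried a few times. The function, command, and file names are hypothetical.

```python
import os
import subprocess

def run_stage(stage_cmd, checkpoint_file, max_retries=3):
    """Run one production stage, skipping it if a checkpoint already marks it done.

    Illustrative only: MCFARM's real recovery logic is more elaborate.
    """
    if os.path.exists(checkpoint_file):        # stage already completed in an earlier attempt
        return
    for attempt in range(1, max_retries + 1):
        result = subprocess.run(stage_cmd)     # e.g. ["d0sim", "job.cards"] (hypothetical)
        if result.returncode == 0:
            open(checkpoint_file, "w").close() # record success so a later crash skips this stage
            return
        print(f"stage failed (attempt {attempt}), retrying")
    raise RuntimeError(f"stage {stage_cmd[0]} failed after {max_retries} attempts")
```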

Slide 4: UTA cluster of Linux farms and current expansion plans (diagram)
- HEP farm at UTA: 1 supervisor, 3 file servers, 21 workers, 8 mm tape drive
- CSE farm at UTA: 1 supervisor, 1 file server, 10 workers
- ACS farm at UTA (planned)
- UTA www server and UTA analysis server (300 GB)
- Connected to SAM mass storage at FNAL via bb_ftp (ACS link planned)

Slide 5: Farm node layout
- Main server (job manager): can read and write to all other nodes; contains the executables and the job archive
- Execution node (worker): mounts its home directory on the main server; can read and write to the file server disk
- File server: mounts /home on the main server; its disk stores min-bias and generator files and is readable and writable by all nodes
- The HEP and CSE farms share the same layout, differing only in the number of nodes involved and in the export software
- The flexible layout allows for a simple expansion process
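
As an illustration of the shared layout, the sketch below expresses the node roles as data and builds either farm from the same template; the mount points, field names, and helper function are hypothetical and not part of MCFARM.

```python
# Illustrative description of the shared farm layout (hypothetical paths and fields).
LAYOUT = {
    "main_server": {"exports": "/home", "holds": ["executables", "job archive"]},
    "file_server": {"exports": "/data", "holds": ["min-bias files", "generator files"]},
    "worker":      {"mounts": {"/home": "main_server", "/data": "file_server"}},
}

def make_farm(n_file_servers, n_workers):
    """Both UTA farms share LAYOUT; only the node counts differ."""
    return {"main_server": 1, "file_servers": n_file_servers,
            "workers": n_workers, "layout": LAYOUT}

hep_farm = make_farm(n_file_servers=3, n_workers=21)
cse_farm = make_farm(n_file_servers=1, n_workers=10)
```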

Slide 6: DØ Monte Carlo Production Chain (diagram)
Generator job (Pythia, Isajet, …) → DØgstar (DØ GEANT) → DØsim (detector response, combined with underlying events prepared in advance) → DØreco (reconstruction) → RecoA (root tuple) → SAM storage at FNAL
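
A minimal sketch of how such a chain can be driven, assuming hypothetical executable names, card files, and options; the real MCFARM job scripts differ.

```python
import subprocess

# Hypothetical stage commands approximating the production chain above.
CHAIN = [
    ["pythia",  "generator.cards"],                     # event generation
    ["d0gstar", "geant.cards"],                         # DØ GEANT detector simulation
    ["d0sim",   "--minbias", "underlying_events.dat"],  # detector response + underlying events
    ["d0reco",  "reco.cards"],                          # reconstruction
    ["recoanalyze", "recoA.cards"],                     # produce the RecoA root tuple
]

def run_chain():
    for cmd in CHAIN:
        subprocess.run(cmd, check=True)   # abort the chain if any stage fails
    # final step (not shown): declare and store the output files in SAM at FNAL
```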

Slide 7: UTA MC farm software daemons and their control (diagram)
- Daemons: root daemon, distribute daemon, execute daemon, gather daemon, monitor daemon, bookkeeper, lock manager
- Resources: job archive, cache disk, SAM, remote machine, WWW

Slide 8: Job Life Cycle (diagram)
- Distribute queue → distributer daemon → execute queue → executer daemon → gatherer queue → gatherer daemon → cache, SAM, archive
- Error queue for jobs that fail
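
For illustration, the sketch below mimics this queue-based life cycle with directories acting as queues and one pass of each daemon moving jobs along the chain; the directory names and the daemon loop are hypothetical, not the actual MCFARM daemons.

```python
import shutil
from pathlib import Path

# Hypothetical queue directories; each daemon moves job directories along the chain.
QUEUES = ["distribute", "execute", "gather", "done", "error"]

def setup(base="mcfarm_queues"):
    for q in QUEUES:
        Path(base, q).mkdir(parents=True, exist_ok=True)
    return Path(base)

def daemon_pass(base, src, dst, work):
    """One pass of a daemon: process every job in `src`, move it to `dst` (or `error`)."""
    for job in Path(base, src).iterdir():
        try:
            work(job)                                    # e.g. stage inputs, run the job, or archive it
            shutil.move(str(job), str(Path(base, dst)))
        except Exception:
            shutil.move(str(job), str(Path(base, "error")))

# Example usage (work functions are placeholders for the real daemon actions):
# base = setup()
# daemon_pass(base, "distribute", "execute", work=stage_inputs)   # distributer daemon
# daemon_pass(base, "execute",    "gather",  work=run_job)        # executer daemon
# daemon_pass(base, "gather",     "done",    work=store_in_sam)   # gatherer daemon
```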

Slide 9: Mcp10 production, Oct. 2001 to present (plots of jobs done, recoA files in SAM, and reco events in SAM)

Slide 10: What's been happening for Grid?
- Investigating network bandwidth capacity at UTA
  - Conducting tests using normal FTP and bbftp (see the throughput sketch after this list)
  - The UTA farm will be put on a gigabit bandwidth link
- Would like to leverage our extensive experience with job packaging and control
  - Would like to interface farm control to more generic Grid tools
  - A design document for such a higher-level interface has been submitted to the DØGrid group for review
- Expand to include the ACS farm
- Exploit the SAM station setup and exercise remote reconstruction
  - Proposed to the displaced-vertex group to reconstruct their special data set
  - More complex than originally anticipated due to DB transport
- Upgrade the HEP farm server
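
A minimal sketch of the kind of bandwidth test described above: time an arbitrary transfer command and convert the elapsed time into a throughput figure. The command line passed in (normal FTP, bbftp, etc.) and the file path are assumptions; no specific bbftp options are implied.

```python
import subprocess
import time
from pathlib import Path

def measure_throughput(transfer_cmd, local_file):
    """Run a file-transfer command and report the achieved rate in MB/s.

    `transfer_cmd` is whatever FTP/bbftp command line the site uses (assumption);
    `local_file` is the file being shipped, used only to obtain its size.
    """
    size_mb = Path(local_file).stat().st_size / 1e6
    start = time.time()
    subprocess.run(transfer_cmd, check=True)
    elapsed = time.time() - start
    print(f"transferred {size_mb:.1f} MB in {elapsed:.1f} s -> {size_mb / elapsed:.2f} MB/s")
```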

Slide 11: Conclusions
- The UTA farm has been very successful
  - The internally developed UTA MCFARM software is solid and robust
  - The MC production is very efficient
- We plan to use our farms for data reprocessing, not only MC production
- We would like to leverage our extensive experience of running an MC production farm
  - We believe we can contribute significantly to higher-level user interfaces and job packaging