BaBar and the GRID
Roger Barlow for Fergus Wilson. GridPP 13, 5th July 2005, Durham.
Outline
Personnel. Current BaBar Computing Model: Monte Carlo, Data Reconstruction, User Analyses. Projections of required resources. BaBar GRID effort and planning: Monte Carlo, User Analysis.
BaBar GRID Personnel (2.5 FTEs)
James Werner (Manchester, GridPP funded). Giuliano Castelli (RAL, GridPP funded). Chris Brew (RAL, 50% GRID). Roger Barlow (Manchester, BaBar GRID PI). Fergus Wilson (RAL).
We do not have an infinite number of monkeys… our goals are therefore constrained.
BaBar Computing Model – Monte Carlo
Monte Carlo is generated at ~25 sites around the world. Database-driven production. ~20 KBytes per event. ~10 seconds per event. 2.8 billion events generated last year. 99.5% efficient. Need million events per week. MC datasets (ROOT files) are merged and sent to SLAC. MC datasets are distributed from SLAC to any Tier 1/2/3 that wants them.
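As a rough cross-check of what these per-event figures imply for a year of production, the arithmetic can be sketched as below. The inputs are the numbers quoted on the slide. Taking 1 KByte as 1000 bytes and assuming a continuously busy CPU are assumptions made here, and the 99.5% efficiency is ignored.

```python
# Figures quoted on the slide.
EVENTS_PER_YEAR = 2.8e9        # events generated last year
BYTES_PER_EVENT = 20e3         # ~20 KBytes per event (1 KByte = 1000 bytes assumed)
CPU_SECONDS_PER_EVENT = 10     # ~10 seconds per event

SECONDS_PER_YEAR = 365 * 24 * 3600


def mc_storage_terabytes(events, bytes_per_event=BYTES_PER_EVENT):
    """Total size of the generated events in terabytes (10**12 bytes)."""
    return events * bytes_per_event / 1e12


def mc_cpu_years(events, seconds_per_event=CPU_SECONDS_PER_EVENT):
    """CPU time needed, expressed in years of one continuously busy CPU."""
    return events * seconds_per_event / SECONDS_PER_YEAR


storage_tb = mc_storage_terabytes(EVENTS_PER_YEAR)
cpu_years = mc_cpu_years(EVENTS_PER_YEAR)
print(f"storage ~ {storage_tb:.0f} TB, CPU ~ {cpu_years:.0f} CPU-years")
```

Under these assumptions, last year's production corresponds to roughly 56 TB of output and somewhat under 900 CPU-years of processing.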
BaBar Computing Model - Data
10 MBytes/sec to tape at SLAC. Reconstructed at Padova (1.5 fb⁻¹/day). Skimmed into datasets at Karlsruhe. Skimmed datasets (ROOT files) sent to SLAC. Datasets are distributed from SLAC to any Tier 1/2/3 that wants them. An analysis can be run on a laptop.
BaBar Computing Model – User Analysis
Location of datasets provided by a mySQL/Oracle database. Data/Monte Carlo datasets accessed via an Xrootd file server (load-balancing, fault-tolerant, disk or tape interface). Conditions accessed from the proprietary Objectivity database.
[Diagram: User Code, Xrootd, Objectivity, Files and mySQL at a Tier 1/2/3 site.]
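The lookup step described here (ask a database where a dataset's files are, then read them through Xrootd) might look something like the sketch below. It is illustrative only: the BaBar bookkeeping schema is not given in the talk, so the `dataset_files` table, the dataset names, the file paths and the server name `xrootd.example.org` are all hypothetical, and an in-memory SQLite database stands in for mySQL/Oracle.

```python
import sqlite3


def make_catalogue():
    """Build an in-memory stand-in for the dataset-location database.

    The table layout and its contents are invented for illustration.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE dataset_files (dataset TEXT, path TEXT)")
    db.executemany(
        "INSERT INTO dataset_files VALUES (?, ?)",
        [
            ("example-skim", "/store/example-skim/file1.root"),
            ("example-skim", "/store/example-skim/file2.root"),
            ("other-skim", "/store/other-skim/file1.root"),
        ],
    )
    return db


def xrootd_urls(db, dataset, server):
    """Return root:// URLs for every file recorded for the given dataset."""
    rows = db.execute(
        "SELECT path FROM dataset_files WHERE dataset = ? ORDER BY path",
        (dataset,),
    ).fetchall()
    # An xrootd URL has the form root://<host>//<absolute path>; the stored
    # paths already start with "/", which yields the double slash.
    return [f"root://{server}/{path}" for (path,) in rows]


urls = xrootd_urls(make_catalogue(), "example-skim", "xrootd.example.org")
for url in urls:
    print(url)
```

User code would then open each URL through the Xrootd client rather than a local file path, which is what lets the same analysis run at any site that holds the dataset.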
Current Status at RAL Tier 1
RAL imports data and Monte Carlo every night. RAL has the full data and Monte Carlo for 4 out of 15 of the Analysis Working Groups. All disk and tape are full. Importing has stopped. We will have to delete our backups of the data. Moving to a disk/tape staging system, but it is unlikely to keep up with demand. CPU is underused at the moment.
BaBar Projections
Bottom-up planning driven by luminosity: double the dataset by 2006 (500 fb⁻¹); quadruple the dataset by 2008 (1000 fb⁻¹).
BaBar Monte Carlo on the GRID
We have already produced 30 million Monte Carlo events on the GRID at Bristol/RAL/Manchester/RHUL (in 2004, using Globus). Now using LCG at RAL: software is installed at sites via an RPM (provided by the BaBar Italian GRID groups); job submission/control is from RAL. 1.2 million events per week during June. This is 7.5% of BaBar weekly production (during a slow period). Will aim to soak up 25% of our Tier 1 allocation with SP, as requested by GridPP. Should do 3-6 million per week at RAL.
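The two quoted figures (1.2 million events per week, 7.5% of weekly production) fix the implied BaBar-wide rate, and from that the share that a 3-6 million per week target would represent. A small sketch of that arithmetic follows. It assumes the total weekly rate stays at its June level, which the slide itself flags as a slow period, so the resulting shares are probably upper estimates.

```python
GRID_EVENTS_PER_WEEK = 1.2e6   # LCG production at RAL during June (from the slide)
GRID_FRACTION = 0.075          # quoted share of BaBar weekly production

# Implied BaBar-wide weekly production during the same period.
total_per_week = GRID_EVENTS_PER_WEEK / GRID_FRACTION


def share_of_production(events_per_week, total=total_per_week):
    """Fraction of total weekly production a given rate represents."""
    return events_per_week / total


low = share_of_production(3e6)
high = share_of_production(6e6)
print(f"implied total: {total_per_week / 1e6:.0f} million events per week")
print(f"3-6 million per week would be {low:.1%} to {high:.1%} of that")
```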
BaBar Monte Carlo on the GRID – Tier 2
We are merging the QMUL, Birmingham and Bristol BaBar farms: 240 slow (866 MHz) CPUs. We will set up regional Objectivity servers that can be accessed over the WAN. This means Objectivity is not needed at every Tier site. We need a large, stable Tier 2 if we are to roll this out beyond RAL. We don't have the manpower to develop the MC and manage lots of small sites.
BaBar GRID Data Analysis
We now have a standard generic initialisation script for all GRID sites. It sets up the BaBar environment; sets up xrootd/Objectivity; identifies what software releases are available; identifies what conditions are available; identifies what collections of datasets are available; and identifies whether the site is set up and/or validated for Monte Carlo production.
BaBar GRID Data Analysis
Prototype job submission system (EasyGrid): interfaces to the mySQL database to identify required datasets and allocates them to jobs. Submits jobs. Resubmits jobs when they fail. Resubmits jobs when they fail again. Monitors progress. Retrieves output (usually ROOT files). We have analysed 60 million events this way, with jobs submitted from Manchester to RAL.
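The submit, resubmit and retrieve cycle listed above can be illustrated with a short sketch. This is not EasyGrid's code or interface, neither of which is shown in the talk: `run_with_resubmission`, the `submit` callback and the toy `flaky_submit` are hypothetical names used only to show the control flow of retrying failed jobs a bounded number of times.

```python
def run_with_resubmission(submit, datasets, max_attempts=3):
    """Submit one job per dataset, resubmitting failures up to max_attempts.

    `submit` is any callable taking a dataset name and returning
    (succeeded, output). Returns (outputs by dataset, datasets still failing).
    """
    results = {}
    pending = list(datasets)
    for _attempt in range(max_attempts):
        still_failing = []
        for dataset in pending:
            succeeded, output = submit(dataset)
            if succeeded:
                results[dataset] = output
            else:
                still_failing.append(dataset)
        pending = still_failing
        if not pending:
            break
    return results, pending


# Toy stand-in for a grid submission: dataset "A" works first time,
# every other dataset fails once and then succeeds.
attempts = {}


def flaky_submit(dataset):
    attempts[dataset] = attempts.get(dataset, 0) + 1
    succeeded = dataset == "A" or attempts[dataset] >= 2
    return succeeded, (f"{dataset}.root" if succeeded else None)


done, failed = run_with_resubmission(flaky_submit, ["A", "B"])
print(done, failed)
```

Bounding the number of attempts matters in practice: a job that fails for a site-wide reason would otherwise be resubmitted indefinitely.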
BaBar GRID Data Analysis
The data analysis works if you know that the data exists at a particular site. Datasets are not static: MC is always being generated. Billions of events. Millions of files. Thousands (currently 36,000) of collections of datasets (arranged by processing release and physics process). The challenge will be to interrogate sites about their available data, allocate jobs according to available data and site resources, and monitor it all. First step: the local mySQL database that identifies the locally available datasets will shortly also know about the availability of datasets at every other site. It can then form the backend of an RLS.
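To make "allocate jobs according to available data and site resources" concrete, here is one possible greedy scheme, sketched under stated assumptions. The talk does not describe an allocation algorithm, so the function `allocate_jobs`, the catalogue layout (site name mapped to the set of collections it holds), the free-slot counts and the collection names are all invented for illustration.

```python
def allocate_jobs(requested, site_catalogue, free_slots):
    """Assign each requested collection to a site that holds it.

    Among sites holding the collection with at least one free job slot,
    pick the one with the most free slots (ties broken alphabetically).
    Returns (collection -> site, collections that could not be placed).
    """
    slots = dict(free_slots)  # copy so the caller's counts are untouched
    assignment = {}
    unplaced = []
    for collection in requested:
        candidates = [
            site
            for site, held in site_catalogue.items()
            if collection in held and slots.get(site, 0) > 0
        ]
        if not candidates:
            unplaced.append(collection)
            continue
        # max() returns the first maximal element, so sorting first makes
        # the tie-break alphabetical.
        best = max(sorted(candidates), key=lambda site: slots[site])
        assignment[collection] = best
        slots[best] -= 1
    return assignment, unplaced


catalogue = {"RAL": {"c1", "c2"}, "Manchester": {"c2", "c3"}}
free = {"RAL": 1, "Manchester": 2}
assignment, unplaced = allocate_jobs(["c1", "c2", "c3", "c4"], catalogue, free)
print(assignment, unplaced)
```

In this toy run `c4` is held nowhere and is reported back as unplaced, which is the case a cross-site catalogue is meant to detect before any job is submitted.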
Conclusion
We are already doing Monte Carlo production on the GRID. We have met all our deliverables. We will start major production at RAL. We need some large Tier 2 sites if this is to go anywhere in the UK.
We are already doing Data Analysis on the GRID. We have met all our deliverables. Concentrate on sites with BaBar infrastructure and local datasets. Provide WAN-accessible servers. We have a prototype data analysis GRID interface. Still many GRID issues to be tackled before allowing normal people near it.
BUT…the GRID still has to prove it can provide a production-quality service on the time scale of running experiments.