
1 GridPP10 Meeting, CERN, June 3rd 2004
BaBarGrid
Roger Barlow, Manchester University
1: Simulation
2: Data Distribution: The SRB
3: Distributed Analysis

2 1: Grid-based simulation (Fergus Wilson + Co.)
- Using existing UK farms (80 CPUs); a dedicated process at RAL merges output and sends it to SLAC.
- Uses VDT Globus rather than LCG. Why? Installation difficulty and reliability/stability problems with LCG.
- VDT Globus is a subset of LCG: running on an LCG system is perfectly possible (in principle).
- US groups talk of using GRID3. VDT Globus is also a subset of GRID3, but GRID3 and LCG differ. Mistake to rely on LCG features?
- A hedged sketch of plain-Globus submission follows below.
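The slide commits to plain VDT Globus rather than LCG for submission. As a hedged illustration only, this is roughly what driving the Globus 2 command-line tools from Python looks like; the gatekeeper contact string and executable path are invented placeholders, not real BaBarGrid values.

```python
# Hedged sketch: submitting a batch job through the VDT Globus 2
# command-line tools from Python. globus-job-submit prints a job contact
# URL that can later be polled with globus-job-status. The gatekeeper
# contact and executable path are hypothetical placeholders.
import subprocess

GATEKEEPER = "gridgate.example.ac.uk/jobmanager-pbs"  # hypothetical site

def submit(executable, *args):
    """Submit a job and return its Globus job contact URL."""
    result = subprocess.run(
        ["globus-job-submit", GATEKEEPER, executable, *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

job_url = submit("/opt/babar/bin/run_sim.sh", "--events", "1000")  # invented script
print("submitted:", job_url)  # poll later with: globus-job-status <job_url>
```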

3 Current situation
- 5 million events in official production since 7th March. Best week (so far!): 1.6 million events.
- Now producing at RHUL & Bristol; Manchester & Liverpool in ~2 weeks; then QMUL & Brunel. Four farms will produce 3-4 million events a week.
- Sites are cooperative (they need to install the BaBar Conditions Database, which uses Objectivity).
- The major problem has been firewalls: a complicated interaction with all the communication and ports, and identifying the source has been hard. A port-range sketch follows below.
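Since the firewall/ports interaction is flagged as the major problem, here is a hedged sketch of the standard mitigation: pinning Globus's ephemeral callback ports to a fixed range via GLOBUS_TCP_PORT_RANGE, so the site firewall only needs one known opening. The range and gatekeeper host are illustrative, not values any BaBar site actually used.

```python
# GLOBUS_TCP_PORT_RANGE restricts the ephemeral ports Globus opens for
# job-manager callbacks; the firewall can then allow just this range.
import os
import subprocess

env = dict(os.environ)
env["GLOBUS_TCP_PORT_RANGE"] = "40000,40100"  # must match the firewall rules

# Connectivity test: run /bin/hostname on the remote gatekeeper (invented host).
subprocess.run(
    ["globus-job-run", "gridgate.example.ac.uk/jobmanager-fork", "/bin/hostname"],
    env=env, check=True,
)
```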

4 What the others are doing
- The Italians and Germans are going the full-blown LCG route.
- Objectivity database access goes through networked AMS servers (need roughly 1 server per ~30 processes).
- Otherwise they assume the BaBar environment is available at remote hosts.
- Our approaches will converge one day. Meanwhile, they will try sending jobs to RAL, and we will try sending jobs to Ferrara.

5 Future
- Keep production running.
- Test an LCG interface (RAL? Ferrara? Manchester Tier 2?) when we have the manpower. It will give more functionality and stability in the long term.
- Smooth and streamline the process.

6 2: Data Distribution and the SRB
SLAC/BaBar
Richard P. Mount, SLAC, May 20, 2004
These slides stolen (with permission) from a PPDG talk.

7 SLAC-BaBar Computing Fabric
- Clients: 1500 dual-CPU Linux and 900 single-CPU Sun/Solaris boxes, running the Objectivity/DB object database plus HEP-specific ROOT software (Xrootd).
- IP network (Cisco) between clients and disk servers.
- Disk servers: 120 dual/quad-CPU Sun/Solaris machines, 400 TB of Sun FibreChannel RAID arrays.
- IP network (Cisco) between disk servers and the mass-storage layer: HPSS plus SLAC enhancements to the Objectivity and ROOT server code.
- Tape servers: 25 dual-CPU Sun/Solaris machines, 40 STK 9940B and 6 STK 9840A drives, 6 STK Powderhorn silos; over 1 PB of data.
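For orientation, a minimal hedged sketch of what the "HEP-specific ROOT software (Xrootd)" layer means for a client: files on the disk servers are read over the root:// protocol. The server name, file path, and tree name below are invented for illustration; only TFile.Open over root:// is the standard ROOT API.

```python
# Hedged sketch: a client reading an event file served by Xrootd via PyROOT.
import ROOT

f = ROOT.TFile.Open("root://xrootd.example.edu//store/babar/run42.root")
tree = f.Get("events")               # hypothetical tree name
print("entries:", tree.GetEntries())
f.Close()
```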

8 BaBar Tier-A Centers
- A component of the Fall 2000 BaBar Computing Model.
- They offer resources at the disposal of BaBar; each provides tens of percent of the total BaBar computing/analysis need.
- 50% of the BaBar computing investment was in Europe in 2002 and 2003:
  - CCIN2P3, Lyon, France: in operation for 3+ years
  - RAL, UK: in operation for 2+ years
  - INFN-Padova, Italy: in operation for 2 years
  - GridKA, Karlsruhe, Germany: in operation for 1 year

9 SLAC-PPDG Grid Team
- Richard Mount (10%): PI
- Bob Cowles: strategy and security
- Adil Hasan (50%): BaBar data management
- Andy Hanushevsky (20%): Xrootd, security ...
- Matteo Melani (80%): new hire
- Wilko Kroeger (100%): SRB data distribution
- Booker Bense: Grid software installation
- Post doc: BaBar - OSG

10 Network/Grid Traffic
[Traffic plot; no further detail survives in the transcript.]

11 SLAC-BaBar-OSG
BaBar-US has been:
- very successful in deploying Grid data distribution (SRB, US-Europe); a hedged transfer sketch follows below;
- far behind BaBar-Europe in deploying Grid job execution (in production for simulation).
SLAC-BaBar-OSG plan:
- Focus on achieving massive simulation production in the US within 12 months.
- Make 1000 SLAC processors part of OSG.
- Run BaBar simulation on SLAC and non-SLAC OSG resources.
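Since SRB data distribution is the piece already working in production, here is a hedged sketch of a transfer using the SDSC SRB client tools (the "Scommands"; Sinit, Sput, Sls, and Sexit are the real utilities), driven from Python. The collection path is a made-up example, not BaBar's actual layout.

```python
# Hedged sketch of an SRB upload using the standard Scommands.
import subprocess

def srb(*cmd):
    subprocess.run(list(cmd), check=True)

srb("Sinit")                                                # authenticate, open session
srb("Sput", "run42.root", "/home/babar.srb/sp/run42.root")  # upload (path invented)
srb("Sls", "/home/babar.srb/sp")                            # verify the file arrived
srb("Sexit")                                                # close the session
```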

12 3: Distributed Analysis
At GridPP9:
- Good news: basic Grid job submission system deployed and working (Alibaba / Gsub) with the GANGA portal. A submission sketch follows below.
- Bad news: low take-up, because users were uninterested and reliability was poor.
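For readers who have not seen the GANGA side, a minimal sketch in the style of the Ganga Python interface, only to show the shape of a submission; the Alibaba/Gsub layer itself was BaBar-specific and is not reproduced here. The script path is a hypothetical placeholder, and Job, Executable, and LCG are names provided by a Ganga session (run this inside the ganga shell, not plain Python).

```python
# Hedged Ganga-style submission sketch (names provided by the Ganga session).
j = Job()
j.application = Executable(exe="/u/babar/analysis.sh")  # hypothetical script
j.backend = LCG()         # grid backend; a local backend works for testing
j.submit()
print(j.id, j.status)     # e.g. "submitted", later "running"/"completed"
```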

13 Since then...
Mike:
- Give talk at IoP parallel session
- Write abstract (accepted) for All Hands meeting
- Write thesis
- No real progress
Alessandra:
- Move to Tier 2 system manager post
James:
- Starts June 14th
- Attended GridPP10 meeting
Roger:
- Submit Proforma 3
- Complete quarterly progress report
- Revise Proforma 3
- Advertise and recruit replacement post
- Negotiate on revised Proforma 3
- Write abstract (pending) for CHEP
- Submit JeSRP-1
- Write contribution for J Phys G Grid article
Janusz:
- Improve portal
- Develop web-based version

14 Future two-point plan (1)
- James to review/revise/relaunch the job submission system.
- Work with the UK Grid/SP team (short term) and the Italian/German LCG system (long term).
- Improve reliability through a core team of users on a development system.

15 Future two-point plan (2)
- Drive Grid usage through incentive:
  - RAL CPUs are very heavily loaded by BaBar; slow turnaround means stressed users.
  - Make significant CPU resources available to BaBar users only through the Grid: some of the new Tier 1/A resources, and all the Tier 2 (Manchester) resources.
- And watch Grid certificate take-up grow!

16 Final Word
Our problems (challenges) today will be your problems (challenges) tomorrow.

