A BaBar InterGrid Testbed
  Proposal for discussion
  Robin Middleton/Roger Barlow
  Rome: October 2001
Background
  BaBar exists
  Already many millions of events, many terabytes of data
  “Not data challenges but challenging data”
  Data increasing faster than SLAC computer center can buy computers
  Must use distributed computing model
  Tier A sites: SLAC and IN2P3, RAL…
  Tier B sites at universities
  Users will want to use such sites in a unified environment, ‘as if’ they were working at SLAC
  Solution: the Grid
Many aspects
  1. Data distribution
     Smart network copying, tape archiving, etc.
  2. Data management
     Multiple copies, reprocessed data, selection of data, full and DST formats, with and without Objectivity
  3. Job submission
     “Run this job on this data” without specifying details
  This proposal tackles Number 3
The sites
  SLAC, IN2P3, RAL, MAN, ?, ?, ?, ?
Use Case
  User prepares binary as usual
  Specify data to run on with metadata description
  Press ‘Go’ button
  Wait for output
Behind the scenes
  Data split into many jobs running on many nodes (possibly at several sites)
  Take jobs to the data: transfer binaries (if necessary), control files, environment
  Run jobs, monitor, restart(?) failures…
  Collate output files
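The "behind the scenes" steps can be illustrated with a minimal Python sketch. It is not part of the proposal: the function names, the file-to-site mapping, and the retry limit are all hypothetical, and a real system would submit jobs through Grid middleware rather than call a local function. The sketch groups input files by the site that holds them (taking jobs to the data), splits each group into jobs, retries a failed job, and collates the outputs.

```python
from collections import defaultdict


def split_into_jobs(file_sites, max_files_per_job):
    """Group files by hosting site, then chunk each group into jobs.

    file_sites is a hypothetical mapping {file name: site name}.
    """
    by_site = defaultdict(list)
    for filename, site in sorted(file_sites.items()):
        by_site[site].append(filename)

    jobs = []
    for site in sorted(by_site):
        files = by_site[site]
        for start in range(0, len(files), max_files_per_job):
            jobs.append({"site": site,
                         "files": files[start:start + max_files_per_job]})
    return jobs


def run_with_retries(job, run, max_attempts=3):
    """Run one job, restarting it if it fails (assumed retry policy)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run(job)
        except RuntimeError:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt


def collate(outputs):
    """Merge the per-job output lists into one list, in job order."""
    merged = []
    for output in outputs:
        merged.extend(output)
    return merged
```

Grouping by site before chunking means no job ever needs data from two sites, which is the simplest way to express "take jobs to the data"; whether failures should be restarted automatically is left open in the slide, so the retry helper is only one possible policy.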
Issues
  Mutual recognition of certificates, accounts
  Common environment at sites: DLLs, databases
  How to match jobs to nodes (‘Want ads’)
  What to specify
  Application to: reprocessing, MC production, user analysis
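One way to picture the "want ads" question is a toy matchmaker in which each job and each node publishes a small advertisement and a job is placed on the first node whose ad satisfies it. The sketch below is an assumption-laden illustration, not the proposal's design: the ad fields (`dataset`, `release`, `free_slots`) and the first-fit policy are hypothetical, and deciding which attributes really belong in an ad is exactly the open "what to specify" issue.

```python
def matches(job_ad, node_ad):
    """A node suits a job if it holds the data, runs the same software
    release (a stand-in for 'common environment'), and has a free slot."""
    return (job_ad["dataset"] in node_ad["datasets"]
            and node_ad["release"] == job_ad["release"]
            and node_ad["free_slots"] > 0)


def match_jobs(job_ads, node_ads):
    """First-fit matching: returns {job id: node name or None}."""
    # Track remaining slots separately so the input ads are not modified.
    slots = {node["name"]: node["free_slots"] for node in node_ads}
    assignment = {}
    for job in job_ads:
        assignment[job["id"]] = None  # stays None if no node qualifies
        for node in node_ads:
            candidate = dict(node, free_slots=slots[node["name"]])
            if matches(job, candidate):
                assignment[job["id"]] = node["name"]
                slots[node["name"]] -= 1
                break
    return assignment
```

A job left with `None` has no suitable node at present; a fuller system would queue it or trigger a data transfer, which falls under the data distribution and data management aspects rather than job submission.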
Who’s doing what
  Work within grids is proceeding, e.g. Manchester, RAL within GridPP
  BaBar computing effort specifically assigned to the problem
  UK-French discussion ongoing
  Aim of this InterGrid project is to bring all this together in one homogeneous system
Targets
  Proof-of-principle demonstrator in 12 months
  leading to
  Production quality system becoming part of BaBar way of life in 1-3 years