Distributed Data Bases & GRID Interactive Access


Distributed Data Bases & GRID Interactive Access. Chicago, June 1st 2002. René Brun, CERN.

How Much Data is Involved? [Chart: Level-1 rate (Hz) versus event size (bytes) for LEP, UA1, NA49, STAR, H1, ZEUS, CDF, CDF II, KLOE, HERA-B, ALICE, LHCb, ATLAS and CMS; the 10^6 Hz level is compared to 1 billion people surfing the Web.] The LHC experiments combine a high Level-1 trigger rate (1 MHz), a high number of channels, high bandwidth (500 Gbit/s) and a high data archive rate (5 PetaBytes/year, i.e. 10 Gbit/s into the data base).

ALICE Event/100: Front view of a simulated event with only 1/100 of the expected multiplicity.

ALICE Event/100: Side view of a simulated event with only 1/100 of the expected multiplicity. Estimated size of one raw event: 40 MBytes. Simulated event with hits: 1.5 GBytes. Time to simulate one event: 24 hours. After L3, the DAQ will generate 1.25 GigaBytes/second, i.e. 2 PetaBytes/year.

LHC Computing - a Multi-Tier Model. CERN is Tier 0 (and also a Tier 1); Tier 1 centres 'X', 'Y', 'Z' (RAL, IN2P3, FNAL, BNL, FZK?, ...); Tier 2 sites (Lab a, Uni b, Lab c, ..., Uni n); then the department and desktop level. Link bandwidths range from 2.5 Gbps and 622 Mbps at the top of the hierarchy down to 155 Mbps towards the Tier 2 sites. The organising software is the "Grid middleware", giving "transparent" user access to applications and all data.

ROOT + RDBMS Model: the event store (trees and histograms) lives in ROOT files, while the run/file catalogue, calibrations and geometries live in a relational database such as Oracle or MySQL.
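A minimal sketch of this hybrid access pattern, assuming a hypothetical MySQL catalogue table runcatalog(run, filename); the table, column and branch names are illustrative, not part of the slide:
// Hybrid ROOT + RDBMS access: catalogue in MySQL, event data in ROOT files.
{
   TSQLServer *db = TSQLServer::Connect("mysql://dbhost/alicecat", "reader", "pass");
   TSQLResult *res = db->Query("SELECT filename FROM runcatalog WHERE run = 3418");
   TSQLRow *row = res->Next();
   const char *fname = row->GetField(0);   // e.g. "r3418_01-01.root"
   TFile *f = TFile::Open(fname);          // event store: a ROOT file
   TTree *T = (TTree*) f->Get("T");        // event tree stored in the file
   T->Draw("fPx");                         // histogram a branch
   delete row; delete res; delete db;
}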

Memory <--> Tree: each call to T.Fill() appends the current contents of the in-memory branch variables as a new entry of the tree; T.GetEntry(6) reads the entry with serial number 6 back into memory.
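A minimal fill-and-read sketch of this mechanism (file, tree and branch names are illustrative):
// Fill a tree with one float branch, then read an entry back into memory.
{
   Float_t px;
   TFile f("tree0.root", "RECREATE");
   TTree T("T", "demo tree");
   T.Branch("px", &px, "px/F");        // branch bound to the px variable
   for (Int_t i = 0; i < 18; i++) {    // 18 entries, as in the slide sketch
      px = gRandom->Gaus(0, 1);
      T.Fill();                        // append current px as a new entry
   }
   T.GetEntry(6);                      // read entry with serial number 6
   printf("entry 6: px = %f\n", px);
   T.Write();
}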

Tree Friends: trees with the same number of entries can be attached to each other as friends; fetching entry #8 of the main tree tr also fetches entry #8 of each friend. The trees can carry different access rights, e.g. public read for the main tree and user write for a friend.

Tree Friends: processing time is independent of the number of friends, unlike table joins in an RDBMS. Friends allow different access levels on the same logical data set: collaboration-wide public read, analysis-group protected, user private.
Root > TFile f1("tree1.root");
Root > tree.AddFriend("tree2", "tree2.root");
Root > tree.AddFriend("tree3", "tree3.root");
Root > tree.Draw("x:a", "k<c");
Root > tree.Draw("x:tree2.x", "sqrt(p)<b");
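A sketch of how a user could produce such a private friend tree with matching entry serial numbers, assuming the public tree in tree1.root is called "tree" (names and the derived quantity are illustrative):
// Create a friend tree "tree3" with one extra branch per entry of the public tree;
// the entry ordering must match the public tree one-to-one.
{
   TFile fin("tree1.root");
   TTree *tree = (TTree*) fin.Get("tree");
   Float_t x;
   TFile fout("tree3.root", "RECREATE");
   TTree tree3("tree3", "user private friend");
   tree3.Branch("x", &x, "x/F");
   Long64_t n = tree->GetEntries();
   for (Long64_t i = 0; i < n; i++) {
      tree->GetEntry(i);
      x = 2*i;                 // placeholder for a user-computed derived quantity
      tree3.Fill();            // keeps the same entry serial numbers as "tree"
   }
   tree3.Write();
}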

Chains of Trees: a TChain is a collection of trees spread over several files that behaves like a single tree.
TChain ch("T");
ch.Add("f0.root");
ch.Add("f1.root");
ch.Add("f2.root");
ch.Add("f3.root");
ch.Add("f4.root");
ch.Add("f5.root");
ch.Add("f6.root");
ch.Add("f7.root");
ch.GetEntry(28);              // binary search in the table of per-file entry offsets: slot 4, local entry 2, i.e. T.GetEntry(2) in f4.root
ch.GetEntryWithIndex(12,567); // binary search in the chain index
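A sketch of the two lookups shown above, assuming each tree carries "run" and "event" branches on which an index can be built (the branch names are an assumption):
// Access a chain of 8 files as one tree, by global entry number and by index.
{
   TChain ch("T");
   for (Int_t i = 0; i < 8; i++)
      ch.Add(Form("f%d.root", i));   // f0.root ... f7.root
   ch.GetEntry(28);                  // resolved to a local entry in one of the files
   ch.BuildIndex("run", "event");    // index on (run, event); names assumed
   ch.GetEntryWithIndex(12, 567);    // entry with run==12 && event==567
}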

[Screenshot: the ROOT browser showing the 8 branches of T; the branch Electrons has 8 leaves, and a double-click on a leaf histograms it.]

The Tree Viewer & Analyzer: a very powerful class supporting complex cuts, event lists, 1-d, 2-d and 3-d views, and parallelism.
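The same operations can also be driven from the command line; a short sketch with illustrative file, tree and branch names:
// Command-line equivalents of the viewer: cuts, event lists, 1-d/2-d/3-d views.
{
   TFile f("events.root");
   TTree *T = (TTree*) f.Get("T");
   T->Draw("px");                                         // 1-d histogram, no cut
   T->Draw("px:py", "pz > 0 && sqrt(px*px+py*py) < 3");   // 2-d view with a complex cut
   T->Draw("px:py:pz");                                   // 3-d view
   T->Draw(">>elist", "pz > 0");                          // fill an event list
   TEventList *elist = (TEventList*) gDirectory->Get("elist");
   T->SetEventList(elist);                                // later Draw calls use only these entries
   T->Draw("px");
}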

ALICE Data Challenges. [Diagram: raw data from the DAQ and simulated data (GEANT3, GEANT4, FLUKA via AliRoot) pass through ROOT I/O into CASTOR at the CERN Tier 0/Tier 1, with a File Catalogue and Performance Monitoring, and are distributed over the GRID to regional Tier 1 and Tier 2 centres.]

ALICE Data Challenge III. Yearly data challenges of increasing complexity and size are needed to reach 1.25 GB/s. ADC III showed excellent system stability during 3 months:
DATE throughput: 550 MB/s (max), 350 MB/s (ALICE-like)
DATE+ROOT+CASTOR throughput: 120 MB/s max, 85 MB/s average
2200 runs, 2x10^7 events, 86 hours, 54 TB in the DATE run
500 TB through the DAQ, 200 TB through DAQ+ROOT I/O, 110 TB in CASTOR
10^5 files larger than 1 GB in CASTOR and in the MetaData DB
HP SMPs: a cost-effective alternative to inexpensive disk servers
Online monitoring tools developed [plot: MB/s writing to local disk and migration to tape]

ALICE GRID resources: 1000 physicists, 37 people GRID-aware, 21 institutions (Yerevan, CERN, Saclay, Lyon, Dubna, Capetown ZA, Birmingham, Cagliari, NIKHEF, GSI, Catania, Bologna, Torino, Padova, IRB, Kolkata India, OSU/OSC, LBL/NERSC, Merida, Bari). http://www.to.infn.it/activities/experiments/alice-grid

The CORE GRID functionality of the ALICE GRID (http://alien.cern.ch):
File Catalogue as a global file system on a relational DB, with a TAG Catalogue as extension
Secure authentication
Interface to Globus available
Central Queue Manager ("pull" vs "push" model)
Interface to the EDG Resource Broker available
Monitoring infrastructure
Automatic software installation with AliKit

ALIEN File Catalogue: a global file system on top of a relational database, with a secure authentication service independent of the underlying database, a central task queue, an API and services (file transport, synchronisation), built on a Perl5 / SOAP architecture. Files, commands (job specifications), job input and output, tags and even binary package tar files are stored in the catalogue. [Diagram: the logical directory hierarchy as seen by ALICE users (e.g. /cern.ch/user/..., /simulation/2001-01/V3.05/..., per-job stdin/stdout/stderr, per-run data sets r3418_01-01.ds ...), the Tier 1 and local storage behind it, and the underlying database tables used for bookkeeping and authentication.]

GRID and Interactive systems. So far, GRID middleware has been designed to support batch services such as large-scale simulation or reconstruction. These services will no doubt work; they require agreements between the major labs, the Tier 1 centres and the production managers. The more interesting potential of the GRID lies in interactive systems for data analysis, the area where most physicists spend their time.

DataGrid & PROOF: bring the KB to the PB and not the PB to the KB. [Diagram: the selection parameters and the analysis procedure (Proc.C) are shipped by PROOF to the local and remote databases (DB 1 ... DB 6), steered by a Tag DB, an RDB and the available CPU, instead of moving the data to the user.]

Parallel ROOT Facility. The PROOF system allows: parallel execution of scripts; parallel analysis of trees in a set of files; parallel analysis of objects in a set of files; all on clusters of heterogeneous machines. Its design goals are transparency, scalability and adaptability. A prototype was developed in 1997 as a proof of concept (only for simple queries resulting in 1-D histograms).

Parallel Script Execution. On the local PC:
$ root
root [0] .x ana.C
root [1] gROOT->Proof("remote")
root [2] gProof->Exec(".x ana.C")
The first command runs ana.C locally; the next two connect to the remote PROOF cluster and execute the same script there in parallel. On the cluster a master server reads the configuration file (#proof.conf, containing "slave node1" ... "slave node4") and starts one slave server per node; each slave runs ana.C against its local *.root files via TFile/TNetFile and ships stdout and the output objects back to the client.

Workflow For Tree Analysis. A Tree->Draw() issued on the client triggers initialisation on the master and on every slave. The master acts as a packet generator: each slave repeatedly calls GetNextPacket() and receives a (first entry, number of entries) packet to process, e.g. (0,100), (100,100), (200,100), (300,40), (340,100), (440,50), (490,100), (590,60). Faster slaves simply ask for packets more often, so the load balances itself. When the packets are exhausted each slave returns its results with SendObject(histo) and waits for the next command; the master adds the histograms and displays them.
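An illustrative sketch of such a packet generator on the master side (not the actual PROOF implementation; packet size and total entry count are assumed parameters, e.g. PacketGenerator gen(650, 100)):
// Hands out (first entry, n entries) work units until the data set is exhausted;
// faster slaves call GetNextPacket() more often and thus receive more packets.
struct Packet { Long64_t first, n; };

class PacketGenerator {
public:
   PacketGenerator(Long64_t total, Long64_t size) : fNext(0), fTotal(total), fSize(size) {}
   Bool_t GetNextPacket(Packet &p) {
      if (fNext >= fTotal) return kFALSE;   // nothing left: the slave sends back its results
      Long64_t rest = fTotal - fNext;
      p.first = fNext;
      p.n     = (fSize < rest) ? fSize : rest;
      fNext  += p.n;
      return kTRUE;
   }
private:
   Long64_t fNext, fTotal, fSize;
};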

Running a PROOF Job:
// Analyze TChains in parallel
gROOT->Proof();
TChain *chain = new TChain("AOD");
chain->Add("lfn://alien.cern.ch/alice/prod2002/file1");
. . .
chain->Process("myselector.C++");
// Analyze generic data sets in parallel
gROOT->Proof();
TDSet *objset = new TDSet("MyEvent", "*", "/events");
objset->Add("lfn://alien.cern.ch/alice/prod2002/file1");
. . .
objset->Add(set2003);
objset->Process("myselector.C");
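The myselector.C passed to Process() above is a TSelector; below is a minimal skeleton written against the present-day TSelector interface, assuming the tree has a "px" branch (class, histogram and branch names are illustrative):
// MySelector.C -- book a histogram, fill it per entry, let PROOF merge the output.
#include "TSelector.h"
#include "TTree.h"
#include "TH1F.h"

class MySelector : public TSelector {
public:
   TTree  *fChain;   // tree/chain being processed on this slave
   Float_t fPx;      // bound to the (assumed) "px" branch
   TH1F   *fHpx;     // output histogram, merged by PROOF

   MySelector() : fChain(0), fPx(0), fHpx(0) {}
   void   SlaveBegin(TTree *) {
      fHpx = new TH1F("hpx", "px distribution", 100, -4, 4);
      fOutput->Add(fHpx);                      // register for merging on the master
   }
   void   Init(TTree *tree) {
      fChain = tree;
      fChain->SetBranchAddress("px", &fPx);
   }
   Bool_t Process(Long64_t entry) {            // called for every entry of each packet
      fChain->GetEntry(entry);
      fHpx->Fill(fPx);
      return kTRUE;
   }
   void   Terminate() {                        // runs on the client with the merged output
      TH1F *h = (TH1F*) fOutput->FindObject("hpx");
      if (h) h->Draw();
   }
   ClassDef(MySelector,0)
};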

Different PROOF Scenarios – Static, stand-alone. This scheme assumes: no third-party grid tools; a remote cluster containing the data files of interest; PROOF binaries and libraries installed on the cluster; PROOF daemon startup via (x)inetd; per-user or per-group authentication set up by the cluster owner; a static, basic PROOF config file. In this scheme the user knows his data sets are on the specified cluster. From his client he initiates a PROOF session on the cluster; the master server reads the config file and fires up as many slaves as it describes. The user issues queries to analyse the data in parallel and enjoys near real-time response on large queries. Pros: easy to set up. Cons: not flexible under changing cluster configurations, resource availability, authentication, etc.

Different PROOF Scenarios – Dynamic, PROOF in Control. This scheme assumes: a grid resource broker, file catalog, meta data catalog and possibly a replication manager; PROOF binaries and libraries installed on the cluster; PROOF daemon startup via (x)inetd; grid authentication. In this scheme the user queries a metadata catalog to obtain the set of required files (LFNs); the system then asks the resource broker where best to run given that set of LFNs and initiates a PROOF session on the designated cluster. On the cluster the slaves are created by querying the (local) resource broker and the LFNs are converted to PFNs; the query is then performed. Pros: uses grid tools for resource and data discovery; grid authentication. Cons: requires preinstalled PROOF daemons; the user must be authorized to access the resources.

Different PROOF Scenarios – Dynamic, AliEn in Control. This scheme assumes: AliEn as resource broker and grid environment (taking care of authentication, possibly via Globus); the AliEn file catalog, meta data catalog and replication manager. In this scheme the user queries a metadata catalog to obtain the set of required files (LFNs), then hands the PROOF master/slave creation over to AliEn via an AliEn job. AliEn finds the best resources, copies the PROOF executables and starts the PROOF master; the master then connects back to the ROOT client on a specified port (the callback port was passed as an argument to the AliEn job). The slave servers are in turn started via the same mechanism. Once the connections have been set up the system proceeds as in the previous scenario. Pros: uses AliEn for resource and data discovery; no pre-installation of PROOF binaries; can run on any AliEn-supported cluster; fully dynamic. Cons: no guaranteed direct response due to the absence of dedicated "interactive" queues.

Different PROOF Scenarios – Dynamic, Condor in Control. This scheme assumes: Condor as resource broker and grid environment (taking care of authentication, possibly via Globus); a grid file catalog, meta data catalog and replication manager. This scheme is basically the same as the previous AliEn-based one, except that Condor manages free resources: as soon as a slave node is reclaimed by its owner, Condor kills or suspends the slave job. Before either of those events Condor sends a signal to the master so that it can restart the slave somewhere else and/or reschedule that slave's work on the other slaves. Pros: uses grid tools for resource and data discovery; no pre-installation of PROOF binaries; can run on any Condor pool; no specific authentication; fully dynamic. Cons: no guaranteed direct response due to the absence of dedicated "interactive" queues; slaves can come and go.

TGrid Class – Abstract Interface to AliEn:
class TGrid : public TObject {
public:
   virtual Int_t        AddFile(const char *lfn, const char *pfn) = 0;
   virtual Int_t        DeleteFile(const char *lfn) = 0;
   virtual TGridResult *GetPhysicalFileNames(const char *lfn) = 0;
   virtual Int_t        AddAttribute(const char *lfn, const char *attrname, const char *attrval) = 0;
   virtual Int_t        DeleteAttribute(const char *lfn, const char *attrname) = 0;
   virtual TGridResult *GetAttributes(const char *lfn) = 0;
   virtual void         Close(Option_t *option = "") = 0;
   virtual TGridResult *Query(const char *query) = 0;
   static TGrid *Connect(const char *grid, const char *uid = 0, const char *pw = 0);
   ClassDef(TGrid,0)   // ABC defining interface to GRID services
};

Running PROOF Using AliEn:
TGrid *alien = TGrid::Connect("alien");
TGridResult *res;
res = alien->Query("lfn:///alice/simulation/2001-04/V0.6*.root");
TDSet *treeset = new TDSet("TTree", "AOD");
treeset->Add(res);
gROOT->Proof(res);   // use files in result set to find remote nodes
treeset->Process("myselector.C");
// plot/save objects produced in myselector.C
. . .

DataGrid & ROOT. [Diagram of the interplay between ROOT and the grid services: selection parameters go from ROOT's RDB / TAG DB to the Grid resource broker, which consults the cost evaluator, performance monitor, MDS, network weather service, replica catalog and performance logs to choose the best places; the LFNs of the selected events are used, with grid authentication, logging and monitoring and the replica manager/catalog, to spawn PROOF tasks; the PROOF loop produces output LFNs, the ROOT RDB is updated and the results are sent back.]