Presentation transcript:

1 PROOF & GRID Update
Fons Rademakers

2 Parallel ROOT Facility
The PROOF system allows:
- parallel execution of scripts
- parallel analysis of trees in a set of files
- parallel analysis of objects in a set of files
on clusters of heterogeneous machines.
Its design goals are: transparency, scalability, adaptability.
A prototype was developed in 1997 as a proof of concept (only for simple queries resulting in 1D histograms).

3 Parallel Script Execution
[Slide diagram: a local PC running ROOT connects to a remote PROOF cluster. One proof process acts as the master server, the proof processes on node1-node4 act as slave servers, each reading its local *.root files via TFile/TNetFile; the stdout/objects produced by ana.C are returned to the local session.]
Command sequence on the local PC:
$ root
root [0] .x ana.C
root [1] gROOT->Proof("remote")
root [2] gProof->Exec(".x ana.C")
Cluster configuration file:
#proof.conf
slave node1
slave node2
slave node3
slave node4
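Here ana.C stands for an ordinary ROOT macro. As an illustration only, a minimal sketch of such a macro (the file name "data.root", tree name "AOD" and branch name "pt" are invented) that can be run locally with .x ana.C or pushed to the cluster with gProof->Exec(".x ana.C"):

// ana.C -- hypothetical example of a ROOT macro; file, tree and branch
// names are assumptions for illustration.
void ana()
{
   TFile *f = TFile::Open("data.root");
   if (!f || f->IsZombie()) return;

   // Fill a simple 1D histogram from one branch of the tree.
   TTree *t = (TTree *) f->Get("AOD");
   TH1F  *h = new TH1F("hpt", "Transverse momentum", 100, 0, 10);
   if (t) t->Draw("pt >> hpt");

   h->Draw();
}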

4 Running a PROOF Job
// Analyse a set of TTree objects
gROOT->Proof();
TDSet *treeset = new TDSet("TTree", "AOD");
treeset->Add("lfn:/alien.cern.ch/alice/prod2002/file1");
...
gProof->Process(treeset, "myselector.C");

// Analyse a set of arbitrary objects
gROOT->Proof();
TDSet *objset = new TDSet("MyEvent", "*", "/events");
objset->Add("lfn:/alien.cern.ch/alice/prod2002/file1");
...
objset->Add(set2003);
gProof->Process(objset, "myselector.C");
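gProof->Process(treeset, "myselector.C") refers to a user-provided selector. As an illustration, a minimal hypothetical skeleton of such a selector is sketched below; the histogram name and the filling logic are invented, and the exact TSelector method signatures vary between ROOT versions:

// myselector.C -- hypothetical PROOF selector skeleton. Begin()/Terminate()
// run on the client; SlaveBegin()/Process()/SlaveTerminate() run on each slave.
#include <TSelector.h>
#include <TTree.h>
#include <TH1F.h>

class myselector : public TSelector {
public:
   TTree *fChain;   // tree being processed (set via Init by PROOF)
   TH1F  *fHist;    // histogram filled on the slaves

   myselector() : fChain(0), fHist(0) { }

   void   Begin(TTree *) { }
   void   SlaveBegin(TTree *) {
      fHist = new TH1F("hpt", "pt", 100, 0, 10);
      fOutput->Add(fHist);           // merged and sent back to the client
   }
   void   Init(TTree *tree) { fChain = tree; }
   Bool_t Process(Long64_t entry) {
      fChain->GetEntry(entry);
      // fHist->Fill(...);           // fill from the event data here
      return kTRUE;
   }
   void   SlaveTerminate() { }
   void   Terminate() { if (fHist) fHist->Draw(); }

   ClassDef(myselector, 0);
};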

5 Different PROOF Scenarios – Static, Stand-alone
This scheme assumes:
- no third party grid tools
- a remote cluster containing the data files of interest
- PROOF binaries and libraries installed on the cluster
- PROOF daemon startup via (x)inetd
- per user or group authentication set up by the cluster owner
- a static, basic PROOF config file (see the sketch after this slide)
In this scheme the user knows his data sets are on the specified cluster. From his client he initiates a PROOF session on the cluster. The master server reads the config file and fires up as many slaves as described there. The user issues queries to analyse the data in parallel and enjoys near real-time response even on large queries.
Pros: easy to set up.
Cons: not flexible under changing cluster configurations, resource availability, authentication, etc.
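To make the "static config file plus (x)inetd startup" concrete, here is a minimal sketch; the host names, install paths and the proofd port/arguments are assumptions based on common conventions, not taken from the slides:

# proof.conf -- static PROOF configuration file (the "slave" lines follow the
# format shown on slide 3; the "master" line is assumed here for completeness)
master proofmaster.example.org
slave  node1
slave  node2
slave  node3
slave  node4

# /etc/xinetd.d/proofd -- assumed xinetd entry starting the PROOF daemon on
# demand (1093 is the conventional proofd port; adjust paths to the local install)
service proofd
{
    type        = UNLISTED
    port        = 1093
    socket_type = stream
    wait        = no
    user        = root
    server      = /usr/local/root/bin/proofd
    server_args = /usr/local/root
    disable     = no
}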

6 Different PROOF Scenarios – Dynamic, PROOF in Control
This scheme assumes:
- a grid resource broker, file catalog, meta data catalog, and possibly a replication manager
- PROOF binaries and libraries installed on the cluster
- PROOF daemon startup via (x)inetd
- grid authentication
In this scheme the user queries a metadata catalog to obtain the set of required files (LFNs); the system then asks the resource broker where best to run depending on that set of LFNs, and initiates a PROOF session on the designated cluster. On the cluster the slaves are created by querying the (local) resource broker, and the LFNs are converted to PFNs (see the sketch below). The query is then performed.
Pros: uses grid tools for resource and data discovery; grid authentication.
Cons: requires preinstalled PROOF daemons; the user must be authorized to access the resources.
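The LFN to PFN step can be illustrated with the abstract TGrid interface shown on slide 9; the following is only a sketch (the grid URL and LFN are placeholders, error handling is omitted, and the TGridResult accessors are not spelled out):

// Hypothetical sketch: a slave resolving an LFN into its physical replicas
// through the TGrid interface declared on slide 9.
TGrid *grid = TGrid::Connect("alien://alien.cern.ch");
TGridResult *pfns = grid->GetPhysicalFileNames("lfn:/alice/prod2002/file1");
// pick the replica closest to this slave and open it as a normal ROOT file,
// e.g. with TFile::Open(...), then run the query on the local data
grid->Close();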

7 Different PROOF Scenarios – Dynamic, AliEn in Control
This scheme assumes:
- AliEn as resource broker and grid environment (taking care of authentication, possibly via Globus)
- the AliEn file catalog, meta data catalog, and replication manager
In this scheme the user queries a metadata catalog to obtain the set of required files (LFNs) and then hands the PROOF master/slave creation over to AliEn via an AliEn job. AliEn finds the best resources, copies the PROOF executables and starts the PROOF master; the master then connects back to the ROOT client on a specified port (the callback port was passed as an argument to the AliEn job). In turn the slave servers are started via the same mechanism. Once the connections have been set up, the system proceeds as in scenario 2.
Pros: uses AliEn for resource and data discovery; no pre-installation of PROOF binaries; can run on any AliEn supported cluster; fully dynamic.
Cons: no guaranteed direct response due to the absence of dedicated "interactive" queues.
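On the client side the callback mechanism can be pictured roughly as in the sketch below, using ROOT's TServerSocket; the port number is arbitrary and the handshake details are assumptions (the real logic lives inside the PROOF classes):

// Hypothetical client-side sketch: open a callback port, pass it to the
// AliEn job, and wait for the remote PROOF master to connect back.
TServerSocket *ss = new TServerSocket(9090, kTRUE);   // 9090 is an assumed port
// ... submit the AliEn job here, with 9090 given as the callback port ...
TSocket *master = ss->Accept();    // blocks until the master calls back
// from here on the client talks to the master as in the other scenarios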

8 Different PROOF Scenarios – Dynamic, Condor in Control
This scheme assumes:
- Condor as resource broker and grid environment (taking care of authentication, possibly via Globus)
- a grid file catalog, meta data catalog, and replication manager
This scheme is basically the same as the previous AliEn based scheme, except that in the Condor environment Condor manages free resources: as soon as a slave node is reclaimed by its owner, Condor kills or suspends the slave job. Before either of those events Condor sends a signal to the master so that it can restart the slave somewhere else and/or re-schedule the work of that slave on the other slaves.
Pros: uses grid tools for resource and data discovery; no pre-installation of PROOF binaries; can run on any Condor pool; no specific authentication; fully dynamic.
Cons: no guaranteed direct response due to the absence of dedicated "interactive" queues; slaves can come and go.
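To illustrate the re-scheduling idea, the sketch below shows one way a master could react when told that a slave is about to disappear; the data structures and names are invented and do not reflect the actual PROOF implementation:

// Hypothetical sketch: when Condor signals that a slave will be suspended or
// killed, its unprocessed work goes back into a common pool so that it can be
// redistributed over the remaining (or replacement) slaves.
#include <list>

struct Slave { int id; std::list<long> pendingEntries; };

static std::list<Slave> gSlaves;            // currently active slaves
static std::list<long>  gUnassignedEntries; // work waiting to be redistributed

void HandleSlaveLoss(int slaveId)
{
   for (auto it = gSlaves.begin(); it != gSlaves.end(); ++it) {
      if (it->id != slaveId) continue;
      gUnassignedEntries.splice(gUnassignedEntries.end(), it->pendingEntries);
      gSlaves.erase(it);
      break;
   }
}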

9 TGrid Class – Abstract Interface to AliEn
class TGrid : public TObject {
public:
   virtual Int_t        AddFile(const char *lfn, const char *pfn) = 0;
   virtual Int_t        DeleteFile(const char *lfn) = 0;
   virtual TGridResult *GetPhysicalFileNames(const char *lfn) = 0;
   virtual Int_t        AddAttribute(const char *lfn, const char *attrname,
                                     const char *attrval) = 0;
   virtual Int_t        DeleteAttribute(const char *lfn, const char *attrname) = 0;
   virtual TGridResult *GetAttributes(const char *lfn) = 0;
   virtual void         Close(Option_t *option = "") = 0;
   virtual const char  *GetInfo() = 0;
   const char          *GetGrid() const { return fGrid; }
   const char          *GetHost() const { return fHost; }
   Int_t                GetPort() const { return fPort; }
   virtual TGridResult *Query(const char *query) = 0;

   static TGrid *Connect(const char *grid, const char *uid = 0, const char *pw = 0);

   ClassDef(TGrid,0)   // ABC defining interface to GRID services
};
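A concrete grid back end is expected to subclass this interface and be handed out by the static Connect() factory. Below is a minimal, hypothetical sketch of such a subclass; "TMyGrid" is an invented name and all method bodies are placeholders, not the actual AliEn implementation:

// Hypothetical sketch of a concrete TGrid implementation;
// TGrid::Connect("mygrid://...") would create and return one of these.
class TMyGrid : public TGrid {
public:
   TMyGrid(const char *host, Int_t port) { /* open connection, fill fGrid/fHost/fPort */ }

   Int_t        AddFile(const char *lfn, const char *pfn) { /* register lfn -> pfn in the catalog */ return 0; }
   Int_t        DeleteFile(const char *lfn) { return 0; }
   TGridResult *GetPhysicalFileNames(const char *lfn) { /* replica lookup */ return 0; }
   Int_t        AddAttribute(const char *lfn, const char *name, const char *val) { return 0; }
   Int_t        DeleteAttribute(const char *lfn, const char *name) { return 0; }
   TGridResult *GetAttributes(const char *lfn) { return 0; }
   void         Close(Option_t * = "") { /* shut down the connection */ }
   const char  *GetInfo() { return "TMyGrid interface"; }
   TGridResult *Query(const char *query) { /* metadata/catalog query */ return 0; }
};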

10 Running PROOF Using AliEn
TGrid *alien = TGrid::Connect("alien://alien.cern.ch", rdm);
TGridResult *res;
res = alien->Query("lfn:///alice/simulation/ /V0.6*.root");

TDSet *treeset = new TDSet("TTree", "AOD");
treeset->Add(res);

gROOT->Proof();
gProof->Process(treeset, "myselector.C");

// plot/write objects produced in myselector.C
...

11 Varia…
New developments in ROOT:
- GUI (TG) classes, by Valeriy Onuchin: removal of limitations on the number of items in browsers
- Kerberos5 authentication support, by Johannes Muelmenstaedt of FNAL and MIT
- Image processing classes, by Reiner Rohlfs of ISDC