PROOF – Parallel ROOT Facility

Presentation transcript:

PROOF – Parallel ROOT Facility
Fons Rademakers, http://root.cern.ch
"Bring the KB to the PB, not the PB to the KB"
CHEP'03, March 2003

PROOF
- A collaboration between the core ROOT group at CERN and the MIT Heavy Ion Group
- Part of, and based on, the ROOT framework
- Makes heavy use of ROOT networking and other infrastructure classes
- Currently uses no external technologies

Main Motivation
- Design a system for the interactive analysis of very large sets of ROOT data files on a cluster of computers
- The main idea is to speed up query processing by employing parallelism
- In the GRID context, this model will be extended from a local cluster to a wide-area "virtual cluster"; the emphasis there is not so much on interactive response as on transparency
- With a single query, a user can analyze a globally distributed data set and get back a "single" result
- The main design goals are transparency, scalability, and adaptability

Parallel Script Execution

[Diagram: a ROOT session on a local PC connects to a remote PROOF cluster. One proof process acts as the master server, the others as slave servers; each slave node opens its local *.root files (TFile / TNetFile), runs ana.C on them, and returns stdout/objects to the client. The slide also shows purely local sessions: $ root; root [0] .x ana.C]

Static cluster description:

  # proof.conf
  slave node1
  slave node2
  slave node3
  slave node4

Client session:

  $ root
  root [0] tree->Process("ana.C")
  root [1] gROOT->Proof("remote")
  root [2] chain->Process("ana.C")
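Both tree->Process("ana.C") and the chain/TDSet Process calls on the later slides expect ana.C to define a class deriving from ROOT's TSelector. A minimal sketch of such a selector is shown below; it is illustrative only (the class name MyAna and the example histogram are assumptions, not part of the slides):

  // ana.C -- minimal selector skeleton (illustrative sketch)
  #include <TSelector.h>
  #include <TH1F.h>

  class MyAna : public TSelector {
  public:
     TH1F *fHist;                      // example output object (assumption)

     MyAna() : fHist(0) {}

     void SlaveBegin(TTree *) {
        // Runs on every slave: book output objects here
        fHist = new TH1F("h", "example", 100, 0, 100);
        fOutput->Add(fHist);           // objects in fOutput are merged on the master
     }

     Bool_t Process(Long64_t entry) {
        // Called once per entry of the packet assigned to this slave
        fHist->Fill(entry % 100);
        return kTRUE;
     }

     void Terminate() {
        // Runs on the client after the merged output has arrived
     }

     ClassDef(MyAna, 0)
  };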

PROOF – Architecture
- Data access strategies: local data first, also rootd, rfio, SAN/NAS
- Transparency: input objects copied from the client; output objects merged and returned to the client
- Scalability and adaptability: vary packet size (specific workload, slave performance, dynamic load)
- Heterogeneous servers; migrate to multi-site configurations

Workflow For Tree Analysis – Pull Architecture

[Sequence diagram: the master and the slaves (Slave 1 ... Slave N) all receive Process("ana.C"). After initialization the master runs a packet generator; each slave repeatedly calls GetNextPacket(), receives a packet, i.e. a (first entry, number of entries) pair such as (0,100), (100,100), (200,100), (300,40), ..., processes it, and asks for the next one. When the packets are exhausted each slave returns its output with SendObject(histo); the master adds the histograms, the client displays them, and master and slaves wait for the next command.]
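The packet generator behind this diagram can be sketched in a few lines of C++. The sketch below only illustrates the pull idea and is not PROOF's internal code; the class PacketGenerator, the packet size and the entry counts are assumptions:

  #include <cstdio>

  // Illustrative packet generator: hands out (first, num) entry ranges on demand.
  struct PacketGenerator {
     long long fNext;                  // next entry to assign
     long long fTotal;                 // total number of entries
     long long fPacket;                // nominal packet size

     PacketGenerator(long long total, long long packet)
        : fNext(0), fTotal(total), fPacket(packet) {}

     // Returns false once all entries have been handed out.
     bool GetNextPacket(long long &first, long long &num) {
        if (fNext >= fTotal) return false;
        first = fNext;
        num   = (fTotal - fNext < fPacket) ? (fTotal - fNext) : fPacket;
        fNext += num;
        return true;
     }
  };

  // Illustrative slave loop: pull packets until the master has none left.
  // In PROOF many slaves run this loop concurrently; faster slaves simply
  // ask more often, which is what balances the load.
  void SlaveLoop(PacketGenerator &master, int slaveId) {
     long long first, num;
     while (master.GetNextPacket(first, num)) {
        std::printf("slave %d processes entries [%lld, %lld)\n",
                    slaveId, first, first + num);
        // ... the selector's Process() would be called for each entry here ...
     }
     // SendObject(histo): return the partial results to the master for merging.
  }

  int main() {
     PacketGenerator master(340, 100); // e.g. 340 entries, packets of 100
     SlaveLoop(master, 1);             // sequential stand-in for N parallel slaves
     return 0;
  }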

Additional Issues
- Error handling: death of master and/or slaves, Ctrl-C interrupt
- Authentication: Globus, ssh, kerb5, SRP, clear passwd, uid/gid matching

Running a PROOF Job

  // Analyze TChains in parallel
  gROOT->Proof("proof.cern.ch");
  TChain *chain = new TChain("AOD");
  chain->Add("lfn://alien.cern.ch/alice/prod2002/file1");
  . . .
  chain->Process("myselector.C");

  // Analyze generic data sets in parallel
  gROOT->Proof("proof.cern.ch");
  TDSet *objset = new TDSet("MyEvent", "*", "/events");
  objset->Add("lfn://alien.cern.ch/alice/prod2002/file1");
  . . .
  objset->Add(set2003);
  objset->Process("myselector.C++");
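The two variants differ in how the data set is described: a TChain is a list of files that all contain the same TTree (here "AOD"), while a TDSet can describe a generic set of objects (here of class MyEvent) spread over files and directories. The trailing "++" in "myselector.C++" asks ROOT to compile the selector with ACLiC instead of interpreting it.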

PROOF Scalability
- Data set: 8.8 GB in 128 files, 9 million events
- 1 node: 325 s; 32 nodes in parallel: 12 s
- 32 nodes: dual Itanium II 1 GHz CPUs, 2 GB RAM, 2x75 GB 15K SCSI disks, 1 Fast Ethernet, 1 Gb Ethernet NIC (not used)
- Each node holds its own part of the data set (4 files, 277 MB in total); the 32 nodes together hold the 8.8 GB in 128 files
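In other words, going from 1 to 32 nodes reduces the query time by a factor of roughly 325 / 12 ≈ 27, i.e. about 85% parallel efficiency (27 / 32); this is a back-of-the-envelope figure derived only from the numbers quoted above.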

PROOF and Data Grids
- Many Grid services are a good fit: authentication; file catalog and replication services; resource brokers; monitoring
- Use abstract interfaces
- Phased integration: static configuration; use of one or multiple Grid services; driven by the Grid infrastructure
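ROOT's TGrid class, used in the AliEn example two slides below, is an example of such an abstract interface: the client calls TGrid::Connect("alien") and the concrete Grid implementation sits behind that generic API.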

Different PROOF–GRID Scenarios
- Static, stand-alone: current version; static config file; pre-installed
- Dynamic, PROOF in control: using Grid file catalog and resource broker; pre-installed
- Dynamic, AliEn in control: idem, but installed and started on the fly by AliEn
- Dynamic, Condor in control: idem, but additionally allowing slave migration within a Condor pool

Running PROOF Using AliEn

  TGrid *alien = TGrid::Connect("alien");
  TGridResult *res;
  res = alien->Query("lfn:///alice/simulation/2001-04/V0.6*.root");

  TDSet *treeset = new TDSet("TTree", "AOD");
  treeset->Add(res);

  gROOT->Proof(res);   // use files in result set to find remote nodes
  treeset->Process("myselector.C");

  // plot/save objects produced in myselector.C
  . . .

Near Future
- Working with some early users; ironing out several remaining issues; writing an install and user guide
- General release after CHEP, as part of the standard ROOT distribution
- Still to do for analysis: support for event lists and friend trees
- Still to do for the Grid: interfacing to file catalogs and resource brokers; multi-site PROOF sessions