PROOF in Atlas Tier 3 model


PROOF in Atlas Tier 3 model Sergey Panitkin, BNL

Outline
Part One: PROOF in the T3 environment
Part Two: I/O issues in analysis on multi-core hardware

ATLAS Analysis Model - analyzer view (Jim Cochran's slide about the Analysis Model)
[Diagram: at T0/T1, streamed ESD/AOD is thinned/skimmed/slimmed into D1PDs, whose contents are defined by the physics group(s); they are made in official production (T0) and remade periodically on T1. At T2 and/or T3, a first-stage analysis turns them into DnPDs and ROOT histograms, produced outside official production (by group, sub-group, or university group).]
ESD/AOD, D1PD, D2PD are POOL based; D3PD is a flat ntuple

Post-AOD analysis in ATLAS
Advantages of DPDs for T3:
Faster to analyze due to smaller size and faster I/O
Less demand on storage space and infrastructure (one year's worth of DPDs is ~40 TB)
Well suited to T3 types of analyses and scales
How to make a Tier 3 affordable, but still effective for ATLAS analysis? How to analyze ~10^9 DPD events efficiently in ROOT? How to ensure fast analysis turnaround?
Use PROOF! The Parallel ROOT Facility - ROOT's native system for parallel distributed analysis.

PROOF Advantages
Native to ROOT
Easy to install
Comes with a built-in high-performance storage solution - Xrootd
Efficient at extracting maximum performance from cheap hardware
Avoids drawbacks of "classical" batch systems
Scalable in many ways; allows multi-cluster federations
Transparent for users - system complexity is hidden
Comes with its own event-loop model (TSelector)
Development driven by the physics community
Free, open source software
Can nicely coexist with batch queues on the same hardware

Native to ROOT
A system for the interactive or batch analysis of very large sets of ROOT data files on a cluster of computers
Optimized for processing data in ROOT format
Speeds up query processing by exploiting the inherent parallelism in event data
The Parallel ROOT Facility was originally developed by MIT physicists about 10 years ago
PROOF is an integral part of ROOT now: developed and supported by the ROOT team and distributed with ROOT. If you have installed ROOT, you have PROOF.

PROOF and Xrootd
One of the main advantages of the modern PROOF implementation is its close cooperation with Xrootd
PROOF is just a plug-in for Xrootd
Typically PROOF uses Xrootd for communication, clustering, data discovery and file serving
Note that PROOF can be used without Xrootd-based storage: it works with many types of SE (dCache, Lustre, NFS boxes, etc.)
But PROOF is at its best when Xrootd provides high-performance distributed storage
When PROOF is used in conjunction with an Xrootd SE it automatically ensures data discovery and local data processing: a job running on a given node reads the data stored on that node. This typically provides the maximum analysis rate and optimal farm scalability.

PROOF and Xrootd: Easy configuration
A few extra lines in the Xrootd config file, plus a simple PROOF config file, are all that is needed (node1 as master, node2 and node3 as workers):

# General section that applies to all servers
all.export /atlas
if redirector.slac.stanford.edu
   all.role manager
else
   all.role server
fi
all.manager redirector.slac.stanford.edu 3121

# Cluster management specific configuration
cms.allow *.slac.stanford.edu

# xrootd specific configuration
xrootd.fslib /opt/xrootd/prod/lib/libXrdOfs.so
xrootd.port 1094

### Load the XrdProofd protocol:
if exec xrootd
   xrd.protocol xproofd:1093 /opt/xrootd/prod/lib/libXrdProofd.so
fi
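The "simple PROOF config file" mentioned above is typically a static proof.conf that just lists the cluster nodes. A minimal sketch matching the three-node layout on the slide, with placeholder host names, could be:

# proof.conf - static PROOF cluster description (placeholder host names)
master node1.example.org
worker node2.example.org
worker node3.example.org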

Efficient Processing
Efficient at extracting maximum performance from (cheap) hardware
Easy, self-organized clustering
Well suited for (if not geared to) analysis farms with distributed local storage
Local data processing is encouraged - automatic matching of code with data, which typically means optimal I/O
A job is automatically split into an optimal number of sub-jobs
All jobs are dynamically managed to ensure maximum resource utilization
Load is actively managed and balanced by the master (packetizer)
Pull architecture for load management
When local processing is exhausted, remote access automatically begins so that all CPUs are utilized

Efficient Processing: Pull architecture
The PROOF packetizer is the heart of the system
It runs on the client/master and hands out work (packets) to the workers; a packet can be a few hundred events
The packetizer takes data locality and storage type into account, matching workers to local data when possible
By managing the workers' load it makes sure that all workers end at the same time
Pull architecture: workers ask for work, so there is no complex worker state in the master

PROOF Pull Technology Avoids Long Tails
In the push approach the last sub-job determines the total execution time; the time of the slowest sub-job is basically a Landau distribution
Example: total expected time 20 h with a target of 1 h, i.e. 20 sub-jobs of 1 h +/- 5%; in 10000 toy experiments the result has long tails, e.g. 15% of the toys take more than 4 h
Courtesy Fons Rademakers, Gerri Ganis
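As an illustration, a short ROOT macro along the following lines reproduces the flavour of this toy study. The 20 sub-jobs, the ~1 h nominal duration and the 10000 toys come from the slide; modelling each sub-job time as a Landau with 5% width is an assumption.

// toy_tails.C - toy study of the push model: the slowest sub-job sets the total time.
// Numbers (20 sub-jobs, ~1 h each, 10000 toys) follow the slide; the 5% width is assumed.
#include "TRandom3.h"
#include "TH1F.h"
#include "TCanvas.h"

void toy_tails()
{
   const int nToys = 10000;
   const int nJobs = 20;
   TRandom3 rng(0);
   TH1F *h = new TH1F("hslow", "Time of slowest sub-job;time [h];toys", 100, 0.5, 5.0);
   for (int i = 0; i < nToys; ++i) {
      double slowest = 0.;
      for (int j = 0; j < nJobs; ++j) {
         double t = rng.Landau(1.0, 0.05);   // sub-job duration: ~1 h with a long tail
         if (t > slowest) slowest = t;       // in the push model we wait for the last sub-job
      }
      h->Fill(slowest);
   }
   new TCanvas("c", "Push-model long tails");
   h->Draw();
}

Running it with root -l toy_tails.C and looking at the tail of the histogram gives exactly the kind of long-tail picture that the pull architecture is designed to avoid.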

Efficient Processing Akira’s talk about ROOT analysis comparison at the Jamboree http://indico.cern.ch/getFile.py/access?contribId=10&sessionId=0&resId=0&materialId=slides&confId=38991 100 KHz analysis rate Akira Shibata reported his tests with PROOF Root analysis benchmark package Good results for Ntuple based analysis Xrootd improves I/O performance Sergey Panitkin

Scalability: from PROOF...
[Diagram: a ROOT client connects over TCP/IP to a PROOF master (xrootd/xpd), which in turn connects to several PROOF workers (xrootd/xpd), one per node; connections within a node use Unix sockets. Courtesy Fons Rademakers, Gerri Ganis]

Scalability: ...to PROOF Lite...
[Diagram: the ROOT client and PROOF master run together on a single node and talk to local PROOF workers over Unix sockets. Perfect for multi-core desktops/laptops. Courtesy Fons Rademakers, Gerri Ganis]

PROOF Lite
PROOF optimized for single many-core machines
Zero-configuration setup (no config files and no daemons)
Workers are processes, not threads, for added robustness
Like PROOF, it can exploit fast disks, SSDs, lots of RAM, fast networks and fast CPUs
Works with exactly the same user code as PROOF: once your analysis runs on PROOF Lite it will also run on PROOF (see the sketch below)
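A minimal sketch of a PROOF-Lite session, with an assumed tree name and file pattern, shows how little is needed:

// Minimal PROOF-Lite sketch: "lite://" starts one worker per core on the local machine.
{
   TProof *p = TProof::Open("lite://");
   TChain ch("T");                 // "T" is an assumed tree name
   ch.Add("data/ntuple_*.root");   // assumed local ntuple files
   ch.SetProof();                  // attach the chain to the PROOF(-Lite) session
   ch.Process("MySelector.C+");    // exactly the same selector as on a full cluster
}

Replacing "lite://" with the URL of a PROOF master is essentially all that is needed to move the same analysis to a cluster.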

Scalability: ...to federated clusters
Adapts to wide-area virtual clusters: geographically separated domains, heterogeneous machines
A super-master is the user's single point of entry; system complexity is hidden
Automatic data discovery and job matching with local data

Support and documentation
Main PROOF page at CERN, PROOF worldwide forum: http://root.cern.ch/twiki/bin/view/ROOT/PROOF
USAtlas Wiki PROOF page: http://www.usatlas.bnl.gov/twiki/bin/view/ProofXrootd/WebHome
Web page/TWiki at BNL with general farm information, help, examples, tips, talks, links to the Ganglia page, etc.: http://www.usatlas.bnl.gov/twiki/bin/view/AtlasSoftware/ProofTestBed
HyperNews forum for ATLAS PROOF users has been created: hn-atlas-proof-xrootd@cern.ch, https://hypernews.cern.ch/HyperNews/Atlas/get/proofXrootd.html

PROOF at LHC
ALICE: will run PROOF on the CAF for calibrations, alignment, analysis, etc.; PROOF farm at the GSI T2; various local T3s; integration of PROOF with AliRoot
ATLAS: PROOF farm at the Munich LMU T2 (and German NAF?); PROOF farm(s) at the BNL T1; PROOF test farm at the UTA T2; PROOF test farm at the Universidad Autonoma de Madrid T2; PROOF farm at the Wisconsin T3
CMS: PROOF at NAF Germany; PROOF cluster at Purdue (USCMS T2); PROOF farm at Oviedo; planned farms at ETH and PSI (T3 or T2)

Part Two: I/O Issues
PROOF can effectively exploit local processing (I/O localization). How does it work with multi-core hardware? When does the disk subsystem become a bottleneck?
The next few slides are from my CHEP09 talk about PROOF and SSDs; it can be found here: http://indico.cern.ch/contributionDisplay.py?contribId=395&sessionId=61&confId=35523

H->4l analysis: SSD vs HDD
Test machine: dual quad core, 16 GB RAM; the analysis is CPU limited
SSD is about 10 times faster at full load
Best HDD performance is at a 2-worker load
A single analysis job generates a ~10-14 MB/s load on the given hardware

Interactive analysis: SSD vs HDD
A worker is a PROOF parallel analysis job; the analysis is CPU limited
SSD holds a clear speed advantage: up to ~10 times faster in the concurrent-read scenario

H->4l analysis: SSD RAID 0
A 2-disk SSD RAID 0 shows little impact up to a 4-worker load

H->4l analysis: HDD, single vs RAID
A 3x750 GB HDD RAID peaks at a ~3-worker load
A single HDD peaks at a 2-worker load, then performance rapidly deteriorates

The End Sergey Panitkin

The basic PROOF+Condor integration
[Diagram: a Condor + Xrootd + PROOF pool. Normal production Condor jobs are dispatched by the Condor master; PROOF requests go to the PROOF master, which launches PROOF jobs on the same machines via COD requests and uses the local storage on each machine.]

The ROOT Data Model: Trees & Selectors
[Diagram: a chain of trees is looped over event by event (1, 2, ... last). Begin() creates histograms and defines the output list; Process() is called for each event, applies the preselection and analysis, reads only the needed branches/leaves, and fills the output list; Terminate() finalizes the analysis (fitting, ...).]

How to use PROOF
PROOF is designed for the analysis of independent objects, e.g. ROOT trees (the basic data format in particle physics)
Files to be analyzed are put into a chain (TChain) or a data set (TDSet), i.e. a collection of files
The analysis is written as a selector (TSelector)
Input/output is sent using dedicated lists (TList)
If additional libraries are needed, they have to be distributed as a "package"
A sketch of this workflow is shown below.
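The following sketch puts these pieces together on a cluster; the host, file, tree and package names are hypothetical placeholders, not part of any specific setup:

// Sketch of a PROOF analysis session; every name below is a placeholder.
TProof *p = TProof::Open("proofmaster.example.org");

// Extra libraries are shipped as a PAR "package"
p->UploadPackage("MyAnalysisLibs.par");
p->EnablePackage("MyAnalysisLibs");

// Files to analyze go into a chain (or a TDSet)
TChain *chain = new TChain("CollectionTree");          // assumed tree name
chain->Add("root://se.example.org//atlas/dpd/file1.root");
chain->Add("root://se.example.org//atlas/dpd/file2.root");
chain->SetProof();                                     // attach the chain to the session

// Input objects reach the workers via the input list
p->AddInput(new TNamed("cutLevel", "loose"));

// The selector runs in parallel; results come back in its output list
chain->Process("MySelector.C+");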

TSelector
TSelector is a framework for the analysis of event-like data
You derive from the TSelector class and implement member functions with the specific algorithm details
During processing ROOT calls your functions in a predefined sequence
The TSelector skeleton can be generated automatically: ROOT provides the TTree::MakeSelector function to generate a skeleton class for a given TTree.

root > TFile *f = TFile::Open("treefile.root")
root > TTree *t = (TTree *) f->Get("T")
root > t->MakeSelector("MySelector")
root > .!ls MySelector*
MySelector.C MySelector.h

A sketch of how the generated member functions are typically filled in is shown below.
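For orientation, the member functions in the generated skeleton are commonly filled in roughly as follows; the histogram, the fHist data member and the "pt" branch are illustrative additions, not part of the generated code:

// MySelector.C - sketch of how the generated skeleton is commonly filled in.
// Assumes MySelector.h declares "TH1F *fHist;" and a branch buffer "Float_t pt"
// with its TBranch pointer "b_pt" (as MakeSelector would for a branch named "pt").
#include "MySelector.h"
#include "TH1F.h"

void MySelector::SlaveBegin(TTree * /*tree*/)
{
   // Runs on every worker: book histograms and register them in the output list
   fHist = new TH1F("hpt", "p_{T};p_{T} [GeV];events", 100, 0., 100.);
   fOutput->Add(fHist);
}

Bool_t MySelector::Process(Long64_t entry)
{
   // Called for every event; read only the branches that are needed
   b_pt->GetEntry(entry);
   fHist->Fill(pt);
   return kTRUE;
}

void MySelector::Terminate()
{
   // Runs once on the client after the workers' output lists have been merged
   TH1F *h = (TH1F *) fOutput->FindObject("hpt");
   if (h) h->Draw();
}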

Multi-session performance
As expected, per-query performance drops as the number of concurrent queries increases: resources are shared between jobs with equal priorities.

Analysis rate scaling
The aggregate analysis rate saturates at about 3 (full load) concurrent queries
The maximum analysis rate is about 360 MB/s for the given analysis type
It makes sense to run a PROOF farm at the optimal number of queries