ROOT for Data Analysis, slide 1
Intel discussion meeting, CERN, 5 Oct 2003
René Brun, CERN
Distributed Data Analysis


René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 3
ROOT Trends: histogram and ntuple viewers, data presenters (PAW, or ROOT used like PAW); efficient access to large and structured event collections; interaction with user and experiment classes; parallelism on the GRID; batch and interactive access to catalogs, resource brokers, process migration, progress monitors, proxies/caches, virtual data sets.

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 4
Memory and Tree: each node of the in-memory event structure is a branch in the tree. T.Fill() writes the objects currently in memory as a new tree entry; T.GetEntry(6) reads entry 6 of the tree back into memory.
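A minimal sketch of this Fill/GetEntry cycle; the file, tree and branch names are hypothetical, not taken from the slide:

  // Minimal TTree write/read cycle.
  #include "TFile.h"
  #include "TTree.h"
  #include <cstdio>

  void memory_tree() {
     TFile f("tr.root", "RECREATE");
     TTree T("T", "event tree");

     float px = 0, py = 0;                // the in-memory "event" data
     T.Branch("px", &px, "px/F");         // each variable/object becomes a branch
     T.Branch("py", &py, "py/F");

     for (int i = 0; i < 10; ++i) {       // T.Fill(): copy memory -> tree entry i
        px = i; py = 2 * i;
        T.Fill();
     }

     T.GetEntry(6);                       // T.GetEntry(6): copy tree entry 6 -> memory
     printf("entry 6: px=%g py=%g\n", px, py);

     T.Write();
  }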

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 5
Browsing a tree: the 8 branches of T, and the 8 leaves of the branch Electrons; a double-click on a leaf histograms it.
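The same action as a sketch from the ROOT prompt; the file name and the leaf Electrons.E are hypothetical:

  // Browse the tree and histogram one leaf, as the double-click does.
  TFile *f = TFile::Open("events.root");
  TTree *T = (TTree*)f->Get("T");
  new TBrowser();            // inspect branches and leaves interactively
  T->Draw("Electrons.E");    // histogram a single leaf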

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 6
The Tree Viewer & Analyzer: a very powerful class supporting complex cuts, event lists, 1-d, 2-d and 3-d views, and parallelism.
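A sketch of the equivalent prompt-level operations, assuming a tree T with hypothetical leaves px, py, pz:

  T->StartViewer();                              // open the graphical Tree Viewer on T
  T->Draw("px:py", "pz > 1 && abs(px) < 3");     // 2-d view with a composite cut
  T->Draw(">>goodEvents", "pz > 1");             // fill a TEventList with the passing entries
  T->SetEventList((TEventList*)gDirectory->Get("goodEvents"));
  T->Draw("py");                                 // later draws loop only over the listed events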

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 7
Tree Friends (diagram): the same entry (here entry #8) is read from a tree with public read access and from a user-written friend tree.

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 8
Tree Friends
Root > TFile f1("tree1.root");
Root > tree.AddFriend("tree2", "tree2.root");
Root > tree.AddFriend("tree3", "tree3.root");
Root > tree.Draw("x:a", "k<c");
Root > tree.Draw("x:tree2.x", "sqrt(p)<b");
Processing time is independent of the number of friends, unlike table joins in an RDBMS. Friends also separate access rights: collaboration-wide public read, analysis-group protected, user private.
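To make the session above reproducible, a minimal sketch that writes two friendable trees with matching entry numbers; all file, tree and leaf names are hypothetical, chosen only to match the prompt commands:

  #include "TFile.h"
  #include "TTree.h"
  #include "TRandom.h"

  void make_friends() {
     TFile f1("tree1.root", "RECREATE");
     TTree t1("tree", "main tree");
     float x, a, k, c, p, b;
     t1.Branch("x", &x, "x/F"); t1.Branch("a", &a, "a/F");
     t1.Branch("k", &k, "k/F"); t1.Branch("c", &c, "c/F");
     t1.Branch("p", &p, "p/F"); t1.Branch("b", &b, "b/F");

     TFile f2("tree2.root", "RECREATE");
     TTree t2("tree2", "friend tree");
     float x2;
     t2.Branch("x", &x2, "x/F");

     for (int i = 0; i < 1000; ++i) {     // entry i of t2 corresponds to entry i of t1
        x = gRandom->Gaus(); a = gRandom->Uniform(); k = gRandom->Uniform();
        c = gRandom->Uniform(); p = gRandom->Uniform(10); b = gRandom->Uniform(5);
        x2 = x + gRandom->Gaus(0, 0.1);
        t1.Fill(); t2.Fill();
     }
     f1.cd(); t1.Write();
     f2.cd(); t2.Write();
  }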

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 9
Data Volume & Organisation (scale from 100 MB to 1 PB): a TFile typically contains one TTree; a TChain is a collection of TTrees and/or TChains; a TChain is typically the result of a query to the file catalogue.
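A minimal sketch of building such a chain; the tree and file names are hypothetical stand-ins for what a catalogue query would return:

  #include "TChain.h"

  void make_chain() {
     TChain chain("T");                  // name of the tree present in every file
     chain.Add("run1.root");             // files returned by the file-catalogue query
     chain.Add("run2.root");
     chain.Add("run3.root");
     chain.Draw("Electrons.E", "Electrons.E > 10");   // analyse the whole chain as one tree
  }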

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 10
Data Volume & Processing Time, with technology available in 2003 (chart): time for one query reading 10 per cent of the data, as a function of data volume from 100 MB to 1 PB, for ROOT on 1 processor (P IV 2.4 GHz), PROOF on 10 and 100 processors, and PROOF/AliEn on 1000 processors. Times range from about a second at the low end up to about a month for 1 PB on a single processor; the chart marks the interactive versus batch boundary.

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 11
Data Volume & Processing Time, with technology available in 2010 (chart): the same query-time chart projected onto 2010 hardware, again for ROOT on 1 processor, PROOF on 10 and 100 processors, and PROOF/AliEn on 1000 processors; the longest times drop to about 10 days, and the interactive versus batch boundary is marked.

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 12
Interactive Local Analysis: on a public cluster or on the user's laptop. Tools like PAW or its successor are used for visualization and for ntuple/tree analysis.

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 13
GRID: Interactive Analysis, Case 1: data transfer to the user's laptop. Run/file catalog and GRID software are optional; the trees are served by a remote file server, e.g. rootd. Analysis scripts are interpreted or compiled on the local machine.
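A minimal sketch of this case at the ROOT prompt; the host, path and leaf names are hypothetical:

  // Read a tree served by a remote rootd and analyse it on the local machine.
  TFile *f = TFile::Open("root://dataserver.cern.ch//data/run1.root");
  TTree *T = (TTree*)f->Get("T");
  T->Draw("Electrons.E", "Electrons.E > 10");   // the query runs locally; data is read over the network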

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 14
GRID: Interactive Analysis, Case 2: remote data processing. Run/file catalog and GRID software are optional; the trees sit next to a remote data analyzer, e.g. proofd. The client sends commands and scripts and receives histograms; analysis scripts are interpreted or compiled on the remote machine.
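A sketch of such a session at the ROOT prompt using the later PROOF client API (TProof::Open and TChain::SetProof; the 2003-era calls were different), with hypothetical host, file and selector names:

  TProof::Open("proofmaster.cern.ch");     // connect to the remote PROOF master
  TChain chain("T");
  chain.Add("root://dataserver.cern.ch//data/run1.root");
  chain.Add("root://dataserver.cern.ch//data/run2.root");
  chain.SetProof();                        // subsequent Process/Draw calls run on the cluster
  chain.Process("MySelector.C+");          // the selector is compiled on the remote side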

René Brun, 5 Oct 03, Intel: Distributed Data Analysis, slide 15
GRID: Interactive Analysis, Case 3: remote data processing with the full GRID software and a run/file catalog. The remote data analyzer, e.g. proofd, spreads the trees over slave nodes; the client sends commands and scripts and receives histograms and trees. Analysis scripts are interpreted or compiled on the remote master(s).
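The scripts shipped to the master(s) are selector-based; a minimal TSelector skeleton (class, histogram and member names hypothetical) of what runs on each worker:

  // MySelector.C: minimal TSelector skeleton of the kind used with PROOF.
  #include "TSelector.h"
  #include "TTree.h"
  #include "TH1F.h"

  class MySelector : public TSelector {
  public:
     TTree *fChain;   // the tree or chain being processed
     TH1F  *fHist;    // filled on each worker, merged for the client

     MySelector() : fChain(0), fHist(0) {}
     void    Init(TTree *tree) { fChain = tree; }
     void    SlaveBegin(TTree *) {
        fHist = new TH1F("h", "example", 100, 0, 100);
        fOutput->Add(fHist);                // register for automatic merging
     }
     Bool_t  Process(Long64_t entry) {
        fChain->GetEntry(entry);            // read this entry on the worker
        // ... fill fHist from the event data ...
        return kTRUE;
     }
     void    Terminate() { /* back on the client: draw or save the merged results */ }
     Int_t   Version() const { return 2; }  // use the entry-based Process() interface
  };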