Summary: Distributed Data Analysis Track
F. Rademakers, S. Dasu, V. Innocente
CHEP06, TIFR, Mumbai

Outline
- Introduction
- Distributed Analysis Systems
- Submission Systems
- Bookkeeping Systems
- Monitoring Systems
- Data Access Systems
- Miscellaneous
- Conveners’ impressions

We have only 20 min for the summary and therefore cannot do justice to all talks.

Track Statistics
Lies, damn lies and statistics:
- talks: 23
- number of cancellations: 2
- number of no-shows: 1
- average attendance: 25
- minimum attendance: 12
- maximum attendance: 55
- average duration of talks: 23 min
- equipment failures: 1 (laser pointer)
- average outside temperature: 31 C
- average room temperature: 21 C

What Was This Track All About?
- Analysis Systems: DIAL, ProdSys, BOSS, Ganga, PROOF, CRAB
- Submission Systems: DIRAC, PANDA
- Bookkeeping Systems: JobMon, BOSS, BbK
- Monitoring Systems: DashBoard, JobMon, BOSS, MonALISA
- Data Access Systems: xrootd, SAM
- Miscellaneous: Go4, ARDA, Grid Simulations, AJAX Analysis

Data Analysis Systems
The different analysis systems presented, categorized by experiment:
- ALICE: PROOF
- ATLAS: DIAL, GANGA
- CMS: CRAB, PROOF
- LHCb: GANGA

- All systems support, or plan to support, parallelism
- Except for PROOF, all systems achieve parallelism via job splitting and serial batch submission (job-level parallelism)

Classical Parallel Data Analysis
[Diagram: storage, batch farm queues, manager, outputs, catalog; the user submits jobs after splitting the data files, runs myAna.C on each piece, and merges the outputs for the final analysis query]
- “Static” use of resources
- Jobs frozen, 1 job / CPU
- “Manual” splitting, merging
- Limited monitoring (end of single job)
- Possible large tail effects
From PROOF System by Ganis [98]
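To make the manual splitting and merging concrete, here is a toy Python sketch of how such a classical batch analysis is often driven by hand; the file names, the run_ana.sh wrapper and the use of LSF's bsub and ROOT's hadd are illustrative assumptions, not taken from the talk.

```python
# Toy illustration of the classical pattern: manual splitting of the input
# list, one frozen batch job per chunk, manual merging at the end.
# File names, the run_ana.sh wrapper and the LSF 'bsub' submission are
# placeholders, not part of the original talk.
import subprocess

files = [f"data/run1_{i:03d}.root" for i in range(120)]
n_jobs = 20
chunks = [files[i::n_jobs] for i in range(n_jobs)]  # "manual" splitting

for i, chunk in enumerate(chunks):
    # each job runs the same myAna.C macro over its chunk and writes out_<i>.root
    subprocess.run(["bsub", "run_ana.sh", f"out_{i}.root"] + chunk, check=True)

# ... much later, once the last job has finished (the "tail effect"):
subprocess.run(["hadd", "merged.root"] + [f"out_{i}.root" for i in range(n_jobs)],
               check=True)
```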

Interactive Parallel Data Analysis
[Diagram: catalog, storage, interactive farm, scheduler; the master receives the query (data file list, myAna.C), distributes it over the farm, and returns merged outputs and merged feedback]
- Farm perceived as extension of local PC
- More dynamic use of resources
- Automated splitting and merging
- Real-time feedback
- Much better control of tail effects
From PROOF System by Ganis [98]
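By contrast, a minimal PyROOT sketch of how the interactive PROOF model looks from the user side, assuming a hypothetical master URL, remote file paths and a myAna.C selector; splitting, merging and feedback are handled by PROOF itself rather than by the user.

```python
# Minimal PyROOT sketch of the interactive PROOF model described above.
# Master URL, file locations and the selector name are placeholders.
import ROOT

# Open a session on the PROOF master; workers are assigned dynamically
# instead of one frozen job per CPU.
proof = ROOT.TProof.Open("proof-master.example.org")

# Build the dataset as a chain of remote files taken from the catalogue.
chain = ROOT.TChain("Events")
chain.Add("root://storage.example.org//data/run1/file1.root")
chain.Add("root://storage.example.org//data/run1/file2.root")

# Hand the chain to PROOF: splitting across workers, merging of outputs
# and real-time feedback happen automatically.
chain.SetProof()
chain.Process("myAna.C+")  # same analysis selector as in the serial case
```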


DIAL: Distributed Interactive Analysis of Large Datasets
- A useful DIAL system has been deployed for ATLAS
  - Common analysis transformations
  - Access to current data
  - For AOD to histograms and large samples, 15 times faster than a single process
- Easy to use
  - ROOT interface
  - Web-based monitoring
  - Packaged datasets, applications and example tasks
- Demonstrated viability of remote processing
  - Via Condor-G or PANDA
  - Need interactive queues at remote sites, with a corresponding gatekeeper or DIAL service
  - Or improve PANDA responsiveness
From DIAL by Adams [39]


Ganga
- Designed for data analysis on the Grid
  - LHCb will do all its analysis on T1s, T2s mostly for simulation
- System should not be general – we know all the main use cases
  - Use prior knowledge
  - Identified use patterns
- Aid the user in bookkeeping aspects
  - Keeping track of many individual jobs
- Developed in cooperation between LHCb and ATLAS
From LHCb Experiences by Egede [317]
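To illustrate the job and bookkeeping model, a rough sketch of a Ganga session; Job, Executable, LCG and jobs are names provided by the Ganga GPI inside a ganga session rather than ordinary Python imports, and the executable name and backend choice here are illustrative assumptions.

```python
# Rough sketch of the Ganga job model, as typed inside a Ganga session
# (Job, Executable, LCG and jobs come from the Ganga GPI; the executable
# and backend choice here are illustrative only).
j = Job()
j.name = "toy-analysis"
j.application = Executable(exe="myAnalysis.sh")  # user analysis wrapper
j.backend = LCG()                                # run on the LCG/EGEE grid
j.submit()

# Ganga keeps track of every job it has created: the bookkeeping the
# slide refers to is the persistent job repository.
print(jobs)               # overview of all known jobs and their status
print(jobs(j.id).status)  # e.g. 'submitted', 'running', 'completed'
```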


CRAB
- Makes it easy to create a large number of user analysis jobs
  - Assumes all jobs are the same except for some parameters (event numbers to be accessed, output file name, …)
- Allows access to distributed data efficiently
  - Hides WLCG middleware complications; all interactions are transparent for the end user
- Manages job submission, tracking, monitoring and output harvesting
  - The user does not have to deal with sometimes complicated grid commands
  - Leaves time to get a coffee …
- Uses BOSS as a Grid-independent submission engine
From CRAB by Corvo [273]


Submission Systems
The different submission systems, categorized by experiment:
- ALICE: AliEn (not presented)
- ATLAS: ProdSys, PanDA
- CMS: BOSS
- LHCb: DIRAC

- These systems are the DDA launch vehicles for the Grid-based batch analysis solutions

ATLAS Strategy
- ATLAS will use all three main Grids: LCG/EGEE, OSG, NorduGrid
- ProdSys was developed to provide seamless access to all ATLAS grid resources
- At this point the emphasis is on the batch model to implement the ATLAS Computing Model
  - Interactive solutions are difficult to realize on top of the current middleware layer
- We expect our users to send large batches of short jobs to optimize their turnaround
  - Scalability
  - Data access
From ATLAS Strategy by Liko [263]


BOSS
- Batch Object Submission System
- A tool for batch job submission, real-time monitoring and bookkeeping
- Interfaced to many schedulers, both local and grid
- Uses a relational database for persistency
- Full logging and bookkeeping information is stored
- Job commands: submit, kill, query and output retrieval
- Custom job types can be defined, allowing monitoring specific to the submitted application
- Significant new functionality has been identified and is being actively integrated into BOSS
From Evolution of BOSS by Wakefield [240]

BOSS Workflow
[Diagram: boss submit / boss query / boss kill talk to the BOSS DB and the scheduler; a wrapper runs the job on the farm node]
- The user specifies the job, with parameters including:
  - Executable name
  - Executable type (turns on customized monitoring)
  - Output files to retrieve (for sites without a shared file system and grid)
- The user tells BOSS to submit the jobs, specifying the scheduler, i.e. PBS, LSF, SGE, Condor, LCG, gLite, etc.
- A job consists of the job wrapper, the real-time monitoring service and the user's executable
From Evolution of BOSS by Wakefield [240]

DIRAC


Data Access Systems
The different data access systems that were presented:
- SAM
  - Used by CDF in its CAF environment
- xrootd server
  - Used by BaBar, ALICE, STAR
  - All BaBar sites run xrootd, extensive deployment experience
  - Winner of the SC05 throughput test
  - Performs better than even the developers ever expected and had hoped for
- xrootd client
  - Many improvements in the xrootd client-side code
  - Reduced latencies using asynchronous read-ahead, client-side caching and asynchronous opens
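As a small illustration from the analysis side, a PyROOT sketch of reading a file through an xrootd server, with a placeholder host, path and tree name; the latency-hiding features listed above (read-ahead, caching, asynchronous opens) sit inside the xrootd client and require no change to user code.

```python
# PyROOT sketch of remote data access through an xrootd server; the host
# name, file path and tree name are placeholders.
import ROOT

# TFile.Open understands root:// URLs and goes through the xrootd client,
# which handles read-ahead and caching under the hood.
f = ROOT.TFile.Open("root://xrootd.example.org//store/analysis/events.root")
tree = f.Get("Events")
print(tree.GetEntries())
f.Close()
```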


Acknowledgments
- A big thank you to the organizers
- And to the speakers for the high quality talks
  - Especially those whose talks were not properly summarized here
- Hope to see you all at CHEP07 to see how the Distributed Data Analysis systems have evolved

Conveners’ Observations
- Distributed Data Analysis tools are of strategic importance
  - GANGA, DIAL, CRAB, PROOF, …
  - They can be a real differentiator
  - There is a large development activity going on in this area
  - However, none of these tools have yet been exposed to the expected large number of final analysis users
- Development of a plethora of grid-independent access layers
  - DIRAC, BOSS, AliEn, PanDA, …
  - The gap between the grid middleware capabilities and user needs, especially data location, placement and bookkeeping services, left room for this activity
  - Although appropriate now, convergence to one or two tools is desired
- The CPU- and data-intensive portion of analysis is most suited for the grid
  - Skimming and organized “rootTree making” is enabled by these DDA tools
  - Advantage of adapting production-style tools to analysis
  - Can one adapt other stuff from the production toolbox? Bookkeeping? Avoid the arcane work-group level bookkeeping that is common currently
- Interactive analysis on the grid with its large latencies
  - PROOF is taking advantage of co-located CPUs for interactive analysis
  - In the era of multi-core CPUs this is only natural
  - Provides incremental data merging for prompt feedback to users
  - Most DDA tools, coupled to high-latency batch systems, are not quite capable of this
  - Block reservation of co-located nodes, a la Condor MPI Universe, may enable PROOF capabilities over the grid
- High throughput AND low latency storage access is critical for analysis
  - Attention to performance boosting by deferred opens, caching and read-ahead by the xrootd team is encouraging