
Distributed Analysis Tutorial Dietrich Liko

Overview  Three grid flavors in ATLAS EGEE OSG Nordugrid  Distributed Analysis Activities GANGA/LCG PANDA/OSG Other tools  How to access the grid? Certificate VOMS  How to find your data? Where is the data stored? Which data is really available?

Three grids …  Grids have different middleware Different software to submit jobs Different catalogs to store the data  We aim to hide these differences from the ATLAS user

EGEE  Job submission via the LCG Resource Broker The new gLite RB is on its way …  LFC file catalog  CondorG submission is also possible Requires some expertise and has no support from the service provider

Resource Broker Model (diagram: user jobs pass through the Resource Broker (RB) to the Computing Elements (CE))

OSG/Panda  PANDA is an integrated production and distributed analysis system Pilot-job based, similar to DIRAC & AliEn  Simple file catalogs at the sites  Again, CondorG submission is possible

Panda Model (diagram: jobs enter a central task queue; pilots running on the Computing Elements (CE) pull them in)

Nordugrid  ARC middleware for job submission Powerful and simple  RLS file catalog  At this time mainly production, not yet a place for general ATLAS Distributed Analysis

ARC Model (diagram: direct job submission to the Computing Elements (CE))

How can we live with that?  A data management layer to hide these differences – Don Quijote 2 (DQ2)  Tools that aim to hide the difficulties of submitting jobs pathena/PANDA on OSG GANGA on LCG  In the future, better interoperability At the level of the ATLAS tools At the level of the middleware
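The layering idea on this slide can be sketched as a thin abstraction: the user calls one submission function, and the grid-flavor-specific details stay behind a common interface. The class and function names below are illustrative, not the actual GANGA or DQ2 API.

```python
class Backend:
    """Common interface each grid flavor must implement."""
    def submit(self, jobscript):
        raise NotImplementedError

class LCGBackend(Backend):
    def submit(self, jobscript):
        # would hand the job to the LCG Resource Broker
        return f"lcg:{jobscript}"

class PandaBackend(Backend):
    def submit(self, jobscript):
        # would insert the job into the PANDA task queue for pilot pickup
        return f"panda:{jobscript}"

class ArcBackend(Backend):
    def submit(self, jobscript):
        # would submit directly to a CE via the ARC middleware
        return f"arc:{jobscript}"

def submit_job(jobscript, flavor="lcg"):
    """One call for the user; the per-grid differences are hidden."""
    backends = {"lcg": LCGBackend(), "panda": PandaBackend(), "arc": ArcBackend()}
    return backends[flavor].submit(jobscript)
```

This is the same design choice GANGA makes with its pluggable backends: the job description is independent of where it runs.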

Distributed Analysis  Data Analysis AOD & ESD analysis TAG based analysis pathena/PANDA GANGA/LCG  User Production Prodsys LJSF GANGA (DQ2 Integration)

pathena/PANDA  Lightweight client  Integrated into the Athena release Very nice work  A lot of work has been done to better support user jobs Short queues, multitasking pilots etc.  A large set of data is available  Available for some time  Tadashi will tell you more about it
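As a taste of the lightweight client, a pathena submission typically looks like the following. The job options file and dataset names are placeholders, and the exact option syntax may vary between releases:

```shell
# Set up the Athena release and grid environment first, then:
pathena AnalysisSkeleton_topOptions.py \
    --inDS  csc11.005300.MySample.recon.AOD.v11004205 \
    --outDS user.MyName.csc11.005300.myanalysis
# pathena packages the local Athena job and sends it to PANDA;
# the output dataset is registered in DQ2 under --outDS.
```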

GANGA/LCG  Text UI & GUI A pathena-like interface is available  Multiple backends LCG/EGEE LSF – works also with CAT queues PBS And others

Progress on LCG  Many datasets available at CERN and LYON  Job priorities and short queues are being implemented Short queue: CERN, LYON, NIKHEF, FZK, RAL and some Tier-2 Priorities: NIKHEF, CERN, IFIC (PPS)  As of today one can perform distributed analysis at CERN and in LYON  We hope that within this year all the other Tier-1 centers and some Tier-2’s will follow See later this week in the Tier1/Tier-2 coordination

GANGA Status  Significant developments over summer Data available at CERN and LYON, GANGA would work on most sites Short queues/priorities Full DQ2 integration Transparent access to local resources (e.g. CAT queues)  Still in the pipeline Move data and priorities to all Tier-1’s Get the gLite Resource Broker into production Start iterations with users

Tools for simulation  GANGA (see later today)  LJSF  Prodsys Executor  Condor based submission systems

Dashboard Monitoring  We are setting up a framework to monitor distributed analysis jobs MonALISA based (OSG, LCG) R-GMA Imperial College DB Production system  We plan to instrument the submission systems to be able to understand their usage

Since September 1st …

Login to the grid  grid-proxy-init Basic access as of today  voms-proxy-init --voms atlas Can give access to special rights Today: Job Priorities on LCG to separate Production from Analysis
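A typical login sequence built from the commands named on this slide (the proxy-inspection step at the end is an addition for illustration):

```shell
# Plain grid proxy: basic access, no VO attributes
grid-proxy-init

# VOMS proxy: carries ATLAS VO membership, needed for job priorities
voms-proxy-init --voms atlas

# Inspect the resulting proxy (remaining lifetime, VO attributes)
voms-proxy-info --all
```

Both `*-proxy-init` commands prompt for your grid certificate passphrase and write a short-lived proxy, usually under `/tmp`.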

How to find out which data exists  AMI Metadata  Prodsys database /proddb/monitor/Datasets.php  Dataset browser ndamon/query?overview=dslist

How to access data?  Download with dq2_get, analyze locally Works now, but is not scalable  Data is distributed over sites, and jobs are sent to the sites to analyze the data DA wants to promote this way of working
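The "send jobs to the data" model can be sketched as a small brokering step: look up which sites hold a replica of the dataset and pick one, instead of moving the data to the user. The replica table and the preference rule below are illustrative; the real system consults the DQ2 catalogs.

```python
# Illustrative replica map: dataset -> sites holding a complete copy
replicas = {
    "csc11.005300.AOD": ["CERN", "LYON", "BNL"],
    "csc11.005401.ESD": ["BNL"],
}

def broker(dataset, preferred=("CERN", "LYON", "BNL")):
    """Pick a site that already holds the dataset, in order of preference."""
    sites = replicas.get(dataset, [])
    for site in preferred:
        if site in sites:
            return site
    return None  # no replica anywhere: the job cannot be brokered
```

With the replica table above, an ESD job can only be brokered to BNL, which matches the dataset conclusions later in this tutorial.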

Dataset distribution  In principle data should be everywhere AOD & ESD during this year ~ 30 TB max  Three steps Not all data can be consolidated Other grids, Tier-2 Distribution between Tier-1 not yet perfect Distribution to Tier-2’s can only be the next step

CSC11 AOD Data at Tier-1 (table: Datasets, Complete, Files and Size for each of ASGC, BNL, CERN, CNAF, FZK, LYON, RAL, SARA, PIC and TRIUMF)

CSC11 ESD Data at Tier-1 (table: Datasets, Complete, Files and Size for each of ASGC, BNL, CERN, CNAF, FZK, LYON, RAL, SARA, PIC and TRIUMF; FZK holds no ESD datasets)

Monitoring of transfers

Dataset conclusion  AOD Analysis at BNL, CERN, LYON  ESD Analysis only at BNL  We still have to work hard to complete the “collection” of data  We have to push hard to achieve an equal distribution between sites  Nevertheless: it is big progress compared to a few months ago!

Dataset details  BNL alidation/html/bnl_datasets.html  CERN ne/CERNCAF_csc11/AOD/list_CC.html  LYON ne/LYONDISK_csc11/AOD/list_CC.html

DQ2 end user tools  dq2_ls List datasets and files  dq2_get Download a dataset  dq2_put Create a dataset  dq2_poolFCjob0 Create a PoolFileCatalog to access data locally  Details:

Let’s try out the dq2 end user tools  Login on lxplus  source /afs/cern.ch/project/gd/LCG-share/sl3/etc/profile.d/grid_env.sh  alias dq2=/afs/cern.ch/atlas/offline/external/GRID/ddm/pro02/dq2  source /afs/usatlas.bnl.gov/Grid/Don-Quijote/dq2_user_client/setup.sh.CERN
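Once the environment is set up, a minimal session with the tools from the previous slide might look like this. The dataset name is a placeholder, and the option syntax may vary between dq2 tool versions:

```shell
# List datasets matching a pattern
dq2_ls 'csc11.005300*AOD*'

# List the files contained in one dataset
dq2_ls -f csc11.005300.MySample.recon.AOD.v11004205

# Download the dataset to the current directory
dq2_get csc11.005300.MySample.recon.AOD.v11004205
```

Remember that downloading is fine for a first look, but for real analysis the jobs should go to the data, not the other way around.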

Summary  Several tools are available to perform Distributed Analysis Integrated with DQ2  Data is being collected and also distributed Still a lot of work in front of us  We learn how to handle user jobs Job Priorities on LCG Multitasking pilots in PANDA

Next steps  Increase the number of sites We have to push to get the data to all Tier-1s. They are the backbone of the ATLAS data distribution  Interoperability Will for sure be an issue for the next software week