ATLAS TAG Services Jack Cranshaw with support from Thomas Doherty, Julius Hrivnac, Marcin Nowak.

The ELSSI Suite
Interfaces
– ELSSI (Navigator) (QZ)
– iELSSI (QZ)
– RunBrowser (RB)
Services
– Lookup (MN): depends on POOL, TASK
– Extract (JH): depends on POOL/Atlas
– Skim (TD): depends on GANGA/Atlas
– Web/Helper services (QZ): depends on COMA, TASK
– Histo Service (JH)

Example of Component Aggregation
[Diagram of inputs, components and outputs: components include RunBrowser, ELSSI/iELSSI, the web services, Extract, Skim and GANGA/pathena; inputs and outputs include selection criteria, service XML, TAG files (including split TAG files), luminosity XML, Python and job configurations, and data files. All components are available if appropriate inputs are provided.]

Exchange Formats
Services
– Services need to exchange information.
– Services use various programming environments.
– Use a language-neutral format supported by many third-party tools: XML is a natural choice.
POOL Utilities
– Utilities share many common CLI arguments.
– There are cases where a user may want to keep 90% of the arguments the same and modify one, or even use the same arguments with a different utility.
– A generalized C++ CLI argument architecture (Args2Container, CmdLineArgs2) was developed by ATLAS and migrated into the POOL code base.
– The state of the Args2Container is made persistent by writing it to an XML file (see the sketch below).
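A minimal Python sketch (not from the slides) of the idea behind that persistence: capture a utility's arguments, write them to XML, then reload them and override a single value for the next run. The real implementation is the C++ Args2Container/CmdLineArgs2 code in POOL; the element names, argument names, query string and file names below are invented for illustration.

import argparse
import xml.etree.ElementTree as ET

def save_args(args, path):
    # Persist parsed arguments as <arg name="..." value="..."/> elements (hypothetical schema).
    root = ET.Element("ArgContainer")
    for name, value in vars(args).items():
        ET.SubElement(root, "arg", name=name, value=str(value))
    ET.ElementTree(root).write(path)

def load_args(path, overrides=None):
    # Reload a saved argument set, optionally overriding selected values.
    loaded = {e.get("name"): e.get("value") for e in ET.parse(path).getroot()}
    loaded.update(overrides or {})
    return loaded

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--src", default="oracle:/conn/table")
    parser.add_argument("--query", default="NLooseMuon>0")
    args = parser.parse_args([])
    save_args(args, "lastrun.xml")
    # Keep 90% of the previous arguments, changing only the query.
    print(load_args("lastrun.xml", overrides={"query": "NLooseElectron>0"}))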

Analysis Model Context
TAG content and multiple use cases were presented by Tulay on Monday in the PAT session.
Two general use cases were presented by Mark on Tuesday in the Distributed Analysis session:
– Coarse selection: touch all/most payload files
– Fine selection: touch only a few payload files

Lookup Service
(Event) Lookup Service (alias Event Picking)
– Accessible from the command line (python) and the Athenaeum web site.
– Currently installed only at CERN (but available worldwide, and remote access is not much slower). Can be replicated to other sites hosting a TAG database replica.
– Current main user is Panda.
Functionality: for a list of event numbers, the service returns a list of file GUIDs where the events are located. TAGs currently support RAW, ESD, AOD and TAG file GUIDs. All physics streams are searched unless a subset is specified (much faster).
– Example usage:
runEventLookup.py --runs_events " , " --stream "express_express"
840C0BBD-523C-DF C6B94 StreamAOD_ref
...
– Essentially a packaged call to CollListFileGUID using the xml input option.
SVN
– Server: atlasoff/Database/TAGPoolServices
– Python client: atlasoff/Database/AthenaPOOL/AtlasCollectionTools
Often uses local POOL patches or a POOL version ahead of the ATLAS release version.
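An illustrative Python wrapper around that command line call (not from the slides). It assumes runEventLookup.py is on the PATH and prints one "<GUID> <token name>" pair per line, as in the example above; the parsing is a sketch only.

import subprocess

def lookup_guids(runs_events, stream=None):
    # Return the file GUIDs holding the given "run event, run event, ..." pairs.
    cmd = ["runEventLookup.py", "--runs_events", runs_events]
    if stream:
        cmd += ["--stream", stream]  # restricting to one stream is much faster
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return [line.split()[0] for line in out.stdout.splitlines() if line.strip()]

# Hypothetical run/event numbers:
# guids = lookup_guids("140541 1234, 140571 5678", stream="express_express")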

Lookup Service
Performance and Latency
– Lookup rate is roughly 100 Hz (so a 10,000-event request already takes on the order of 100 seconds).
Scalability
– Run-based lookup is not expected to scale past thousands of events, but period-based usage may alleviate this.
– Web server performance problems may require output splitting.

Extract Service
The Extract service provides input TAG files for Athena jobs based on TAG queries.
– Most of the logic is in java/xslt, which is built into a web archive and deployed into a j2ee container.
– Jobs run using python and shell scripts on atlddm10 for production, or in private accounts on other machines for development.
– Accessible from the web (iELSSI) or the command line.
– Input is a config file in XML format (source given as oracle:/conn/table).
– Lots of development on configuration and output recently.
– Essentially a packaged call to CollAppend using the xml input option.
– Output available through the web server (wget).
SVN:
– Manager: atlusr/hrivnac/Athenaeum
– Server: atlasusr/Database/TAGPoolServices
Uses an Atlas release or local patches to an Atlas release.
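An illustrative sketch (not from the slides) of driving Extract from a script: write an XML config carrying the source connection and a query, then fetch the result over HTTP once the extraction has finished. The element names, query and URL are invented placeholders; the real schema and endpoint are defined by the service deployment.

import subprocess
import xml.etree.ElementTree as ET

def write_extract_config(path, src, query, output):
    cfg = ET.Element("extract")                  # hypothetical root element
    ET.SubElement(cfg, "source").text = src      # e.g. "oracle:/conn/table"
    ET.SubElement(cfg, "query").text = query     # TAG selection predicate
    ET.SubElement(cfg, "output").text = output   # name of the output collection
    ET.ElementTree(cfg).write(path)

write_extract_config("extract.xml", "oracle:/conn/table", "NLooseMuon>0", "myselection")
# Retrieve the result once the (asynchronous) extraction has finished:
subprocess.run(["wget", "https://example.cern.ch/tagservices/myselection.root"])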

Extract Service
Performance and Latency
– Extraction rate is roughly 1-10 kHz.
– Large selections can be several million events, so extraction takes minutes.
– Extract should be called asynchronously.
Scalability
– Scalability is theoretically handled by submitting multiple CollAppend jobs in parallel (see the sketch below), which should scale as well as Oracle query planning scales.
Issues
– Extract is not fully functional without the ELSSI computations:
  Luminosity
  Overlap removals
  "Job" splitting based on TASK
– Distributed queries are only available in ELSSI.
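A rough sketch (not from the slides) of that scaling idea: run several CollAppend extractions in parallel over disjoint slices of the selection. CollAppend is the POOL utility named above, but the -src/-dst/-query arguments, collection type names, query syntax and run ranges shown here are assumptions for illustration only.

import subprocess
from concurrent.futures import ProcessPoolExecutor

def extract_slice(run_range):
    # One CollAppend job per run range; flags and types are assumed, not verified.
    lo, hi = run_range
    cmd = ["CollAppend",
           "-src", "tag_table", "RelationalCollection",
           "-dst", "slice_%d_%d" % (lo, hi), "RootCollection",
           "-query", "RunNumber>=%d AND RunNumber<%d" % (lo, hi)]
    return subprocess.call(cmd)

if __name__ == "__main__":
    # Hypothetical run ranges splitting one large selection.
    slices = [(152000, 153000), (153000, 154000), (154000, 155000)]
    with ProcessPoolExecutor(max_workers=3) as pool:
        print(list(pool.map(extract_slice, slices)))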

Skim Service

Takes the output of Extract and runs a grid job to generate one of several standard outputs.
– Most of the logic is in Skimming.py.
– Accessible from the web (iELSSI) or the command line.
– Input is a config file in xml format (ganga).
SVN:
– Manager: atlusr/hrivnac/Athenaeum
– Server: atlasusr/Database/TAGPoolServices
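Conceptually, the skim step submits a grid job that reads the extracted TAGs and writes the requested output. Below is a hedged sketch (not from the slides) of such a submission built from Python as a pathena call; whether Skimming.py actually goes through pathena or the Ganga API, and which options it sets, depends on its configuration, so treat the job options file and dataset names as placeholders.

import subprocess

def submit_skim(job_options, input_dataset, output_dataset):
    # --inDS / --outDS are standard pathena options; everything else here is assumed.
    cmd = ["pathena", job_options,
           "--inDS", input_dataset,    # dataset containing the extracted TAGs
           "--outDS", output_dataset]  # user dataset for the skimmed output
    subprocess.run(cmd, check=True)

# submit_skim("SkimJobOptions.py", "user.someone.mytags/", "user.someone.myskim/")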

Skim Service
Performance and Latency
– Production of submission scripts is dominated by TagPrepare times; various optimizations are being considered. Otherwise, performance is dominated by the DA tools.
Issues
– More options on which flavor of DA tool to use.
– Better support for user optimizations:
  User options
  User code patches
– Improve job resubmission logic.
– Use transforms instead of job options.

Releases and Testing
All code is resident in SVN. Tagged versions are deployed to production.
Dependencies tracked in developers' heads
– Plan to put this into the TASK database
Change control
– Development done in user test areas
– Integration and production done in
Testing
– Component testing in various (un)documented forms is available for all services.
– Integration testing is currently done by running the tutorial.

Resources / Deployment
CERN
– Production server maintained by IT: atlddm10.cern.ch
– Development server available to individual developers: voatlas69.cern.ch
Tier 1's
– Only ELSSI at the moment
– Deployment model being developed
– Each is effectively just another site service in a VO box.