US ATLAS Tier 3 Overview
Doug Benjamin, Duke University

Purpose of a Tier 3
- Tier 3's (and experiment computing in general) are tools to aid physicists in their work.
- That work is analyzing the data to make measurements and scientific discoveries; the computing is a tool and a means to an end.
- Tier 3 productivity: the success of the Tier 3's will be measured by the amount of scientific output: papers written, talks at conferences, and students trained (and theses written).
- Not by CPU hours or events processed.

What is a Tier 3?
- Working definition: "non-pledged resources"; "analysis facilities" at your university/institute.
- The name suggests it is another layer continuing the hierarchy after the Tier 0, Tier 1s, and Tier 2s. That is probably misleading (although, for political reasons, some places in Europe really want to be small Tier 2's).
- The qualitative differences: final analysis vs. simulation and reconstruction; local control vs. ATLAS central control; operational load falling more on local resources (i.e. local people) than on the central team (i.e. other people).

Tier 3 Types
- Tier 3's are non-pledged resources, but that does not imply they should be chaotic or troublesome resources.
- ATLAS examples include:
  - Tier 3's co-located with Tier 2's.
  - Tier 3's with the same functionality as a Tier 2 site; there is a very strong push in Europe that all Tier 3's be grid sites and supply some cycles to ATLAS.
  - National analysis facilities.
  - Non-grid Tier 3's (Tier 3g), the most common type for new sites in the US and likely to be so throughout ATLAS; very challenging due to limited support personnel.

Tier 3: interesting features
Key characteristics (issues, interesting problems):
- Operations: must be simple (for the local team) and must not affect the rest of the system (hence central operations).
- Data management: again, simplicity; a different access pattern (analysis) that is I/O bound and iterative/interactive; more ROOT-based analysis (PROOF?); truly local usage.
- "Performance": reliability (successful jobs / total jobs) and efficiency (CPU time / elapsed time), ultimately events read per second.

Tier 3
- Of course the recipe Tier 3 = (small) Tier 2 could make sense in several cases; this seems to be how Europe thinks things should be.
- But in several other cases it is too heavy for small new sites ("human cost"), and the new model is appealing for small Tier 2-like centres as well.
- Once again it appears that US ATLAS is diverging from ATLAS (we are out in front again).
- In all cases: we've got data! The focus is more and more on doing the analysis than on supporting computing facilities ;)

Tier 3g design/philosophy
- Design a system that is flexible and simple to set up (1 person, < 1 week).
- Simple to operate: < 0.25 FTE to maintain.
- Scalable with data volumes.
- Fast: process 1 TB of data overnight (a rough throughput estimate follows below).
- Relatively inexpensive: run only the needed services/processes and devote most resources to CPUs and disk.
- Using common tools will make it easier for all of us and easier to develop a self-supporting community.
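To make the "1 TB overnight" target concrete, the back-of-the-envelope sketch below converts it into a required aggregate read rate. The 10-hour window is an assumption, not a number from the slides.

```python
#!/usr/bin/env python
"""Back-of-the-envelope check of the 'process 1 TB overnight' design target.
The 10-hour night is an assumed processing window, not a figure from the slides."""

DATA_BYTES = 1.0e12          # 1 TB of analysis data
WINDOW_HOURS = 10.0          # assumed overnight processing window

def required_rate_mb_per_s(data_bytes, window_hours):
    # Aggregate read rate the cluster must sustain across all worker nodes
    return data_bytes / (window_hours * 3600.0) / 1.0e6

if __name__ == "__main__":
    rate = required_rate_mb_per_s(DATA_BYTES, WINDOW_HOURS)
    print("Sustained aggregate rate needed: %.0f MB/s" % rate)
    # ~28 MB/s in total, i.e. a few MB/s per worker node on a small cluster
```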

Tier 3g
- Interactive nodes that can submit grid jobs.
- A batch system with worker nodes.
- ATLAS code available.
- Client tools used to fetch data (dq2-ls, dq2-get), including dq2-get + FTS for better control (a minimal fetch sketch follows below).
- Storage can be one of two types (sites can have both):
  - Located on the worker nodes: Lustre/GPFS (mostly in Europe) or XRootD.
  - Located on dedicated file servers (NFS/XRootD).
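A minimal sketch of the plain dq2-get path mentioned above, assuming the DQ2 client tools are already set up on the interactive node; the dataset name and destination directory are placeholders.

```python
#!/usr/bin/env python
"""Minimal sketch: pull an ATLAS dataset onto local Tier 3 storage with dq2-get.
Assumes the DQ2 client tools are already configured in the environment; the
dataset name and destination directory below are placeholders."""
import subprocess
import sys

DATASET = "user.someuser.example_ntuples.v1"   # hypothetical dataset name
DEST = "/data/atlas/local"                     # hypothetical destination directory

def fetch(dataset, dest):
    # dq2-get downloads the dataset files into a subdirectory of the
    # working directory named after the dataset.
    return subprocess.call(["dq2-get", dataset], cwd=dest)

if __name__ == "__main__":
    sys.exit(fetch(DATASET, DEST))
```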

[Diagram: Tier 3g configuration. Interactive nodes (where users run pathena/prun to submit to the grid) and a Condor batch system running user jobs; user home areas and ATLAS/common software shared via NFSv4 mounts; a gridftp server plus a web server for Panda logs; Lustre storage; connections to the Grid and the local Tier 2 cloud.]
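A sketch of submitting one job to the Condor batch component shown in the diagram; the wrapper script and file names are placeholders, and it assumes condor_submit is available on the interactive node.

```python
#!/usr/bin/env python
"""Minimal sketch: submit one analysis job to the local Condor batch system.
The wrapper script (run_analysis.sh) and input file name are placeholders;
assumes condor_submit is on the PATH of the interactive node."""
import subprocess
import tempfile

# Plain vanilla-universe submit description; all file names are hypothetical.
SUBMIT_DESCRIPTION = """\
universe   = vanilla
executable = run_analysis.sh
arguments  = input_ntuple.root
output     = job.out
error      = job.err
log        = job.log
queue
"""

def submit():
    # Write the submit description to a temporary file and hand it to Condor.
    with tempfile.NamedTemporaryFile("w", suffix=".sub", delete=False) as f:
        f.write(SUBMIT_DESCRIPTION)
        submit_file = f.name
    subprocess.check_call(["condor_submit", submit_file])

if __name__ == "__main__":
    submit()
```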

How data comes to Tier 3g's
- Data will come from any Tier 2 site (the local Tier 1/Tier 2 cloud).
- Two methods:
  - Enhanced dq2-get (uses an FTS channel).
  - Data subscription, which requires an SRM/gridftp server that is part of the DDM Tiers of ATLAS; at the Tier 3 this means a Bestman Storage Resource Manager (SRM) on a file server plus a gridftp server.
- Sites in the DDM ToA will be tested frequently; troublesome sites will be blacklisted (no data), which means extra support load.
- XRootD/PROOF (the pq2 tools) are used to manage the data once it arrives (a simple arrival check is sketched below).
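A small, hedged sketch of the kind of arrival check a site might run after a dq2-get or subscription completes; the destination path is a placeholder, and the check is purely local (file count and total size), not a DDM consistency check.

```python
#!/usr/bin/env python
"""Minimal sketch: quick local sanity check after a transfer to Tier 3 storage.
Counts files and sums their sizes under the destination directory; the path is
a placeholder and this is not a substitute for DDM-level consistency checks."""
import os

DEST = "/data/atlas/local/user.someuser.example_ntuples.v1"  # hypothetical path

def summarize(path):
    n_files, total_bytes = 0, 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            n_files += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return n_files, total_bytes

if __name__ == "__main__":
    n, size = summarize(DEST)
    print("%d files, %.1f GB on disk" % (n, size / 1.0e9))
```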

Tier 3g challenges
- Need a simplified way for users to set up what they need to do their work. We found an existing solution in Canada and collaborate with Asoka De Silva (TRIUMF); we test the US part of the software package.
- ATLAS is constantly producing new software releases, and maintaining an up-to-date code stack is a lot of work. Tier 1 and Tier 2 sites use grid jobs for code installation.
- We need a transformative solution to this problem: the CernVM File System (CVMFS).

[Slide on CVMFS (v2) reproduced from the OSG All Hands Meeting, Fermilab, March]

Transformative technologies
- By their operational requirements, non-grid Tier 3 sites will require transformative ideas and solutions.
- Short-term example: CVMFS (the CernVM web file system), to minimize the effort needed for ATLAS software releases and the conditions DB.
- ATLAS recently officially requested long-term support of CVMFS for Tier 3's, and ATLAS is testing CVMFS for Tier 1's and Tier 2's as well (a simple mount check is sketched below).
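A minimal sketch of the kind of check a Tier 3 admin could run to confirm the ATLAS CVMFS repository is visible on a node; the mount point is the standard ATLAS one but should be treated as an assumption and adjusted to the local configuration.

```python
#!/usr/bin/env python
"""Minimal sketch: verify that the ATLAS CVMFS repository is visible on a
Tier 3 node before pointing users at it.  The repository path below is assumed
to be the standard ATLAS mount point; adjust if the local setup differs."""
import os
import sys

REPO = "/cvmfs/atlas.cern.ch"   # assumed mount point of the ATLAS repository

def check(repo):
    # Listing the directory forces autofs/cvmfs to mount the repository.
    if not os.path.isdir(repo):
        print("CVMFS repository %s is not mounted" % repo)
        return 1
    entries = sorted(os.listdir(repo))
    print("CVMFS repository %s is mounted, top-level entries:" % repo)
    for name in entries[:10]:
        print("  %s" % name)
    return 0

if __name__ == "__main__":
    sys.exit(check(REPO))
```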

Transformative technologies (2)
- XRootD/Lustre: XRootD allows straightforward storage aggregation; some other sites use Lustre.
- Wide-area data clustering will help groups during analysis, e.g. coupling an XRootD cluster of desktops at CERN with the home institution's XRootD cluster (a direct-read sketch follows below).
- dq2-get with FTS data transfer: a robust client tool to fetch data for a Tier 3 (no SRM required, not in the ToA, a simplification).
- Medium/longer-term examples: PROOF, for efficient data analysis, whose tools can also be used for data management at a Tier 3; virtualization / cloud computing.
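A sketch of what direct-read access to an aggregated XRootD store looks like from PyROOT; the redirector host, file path, and tree name are placeholders for a real site.

```python
#!/usr/bin/env python
"""Minimal sketch: open a file through a local XRootD redirector from PyROOT,
the kind of direct-read access an aggregated Tier 3 store provides.  The
redirector host, file path, and tree name are placeholders."""
import ROOT

URL = "root://xrootd.example.edu//atlas/local/ntuple.root"  # hypothetical URL

def read_entries(url, tree_name="physics"):   # tree name is also a placeholder
    f = ROOT.TFile.Open(url)                  # TFile.Open understands root:// URLs
    if not f or f.IsZombie():
        raise IOError("could not open %s" % url)
    tree = f.Get(tree_name)
    n = tree.GetEntries() if tree else 0
    f.Close()
    return n

if __name__ == "__main__":
    print("entries read: %d" % read_entries(URL))
```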

ATLAS XRootD Demonstrator project
- Last June, at the WLCG storage workshop, ATLAS Tier 3 proposed an alternative method for delivering data to Tier 3's using federated XRootD clusters.
- Physicists can get the data that they actually use; it is an alternative to, and simpler than, ATLAS DDM.
- In testing now; the plan is to connect Tier 3 sites and some Tier 2 sites.
- CMS is working on something similar (their focus is between Tier 1 and Tier 2, so the efforts are complementary, and we are collaborating).

Status of Tier 3g's in the US
- Setup instructions have moved to CERN.
- ARRA funds have finally arrived; people are making purchases and making choices.
- Brandeis, the first Tier 3g (other than ANL or Duke) set up with ARRA funds, is almost completely up. I still need to finish setting up Tier 3 Panda (with Alden's help), the gridftp servers, XRootD, and PROOF for managing the data.
- I am revising the instructions as I go. Rik and I have been advising sites as they request help, and we are willing to travel as needed.

Missing pieces for Tier 3's
- A standard installation method for XRootD: VDT has promised a YUM repository (within a few months).
- Data set affinity (binding jobs to data location): Charles is working on this.
- PROOF installation and configuration instructions: important for dataset management and ultimately for data analysis (see Sergey's talk today about the BNL Tier 3).
- Configuration management (Puppet): Yushu will present the status today.
- A standard test suite of "typical" Tier 3 jobs: Attila's talk covered some of this (a sketch of such a job follows below).
- An XRootD proxy: collaborating with CERN to use a proxy for data servers behind a firewall or on a private network.
- An SVN repository for all code, instructions, and scripts.
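A hedged sketch of the kind of "typical" Tier 3 job a standard test suite could run: read a flat ntuple, fill one histogram, and report the event rate (which ties back to the events-per-second efficiency metric). The file, tree, and branch names are placeholders, not ATLAS conventions.

```python
#!/usr/bin/env python
"""Minimal sketch of a 'typical' Tier 3 test job: read a flat ntuple, fill one
histogram, and report the event rate.  File, tree, and branch names below are
placeholders for whatever a real test suite would stage on the site."""
import time
import ROOT

INPUT = "test_ntuple.root"   # hypothetical test file staged on the Tier 3
TREE = "physics"             # hypothetical tree name
BRANCH = "jet_pt"            # hypothetical vector<float> branch

def run(path):
    f = ROOT.TFile.Open(path)
    tree = f.Get(TREE)
    hist = ROOT.TH1F("h_jet_pt", "leading jet pT", 100, 0.0, 500.0)
    start = time.time()
    for event in tree:                      # simple PyROOT event loop
        values = getattr(event, BRANCH)
        if len(values) > 0:
            hist.Fill(values[0])
    elapsed = time.time() - start
    n = tree.GetEntries()
    print("%d events in %.1f s (%.0f events/s)" % (n, elapsed, n / max(elapsed, 1e-6)))
    f.Close()

if __name__ == "__main__":
    run(INPUT)
```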

Conclusions
- Tier 3 computing is important for data analysis in ATLAS.
- A coherent ATLAS-wide effort had begun in earnest; the situation now is much less clear.
- US Tier 3's are being designed according to the needs of the local research groups, striving for a design that requires minimal effort to set up and run successfully.
- Technologies for the Tier 3's are being chosen and evaluated based on performance and stability for data analysis.
- Ideas from the Tier 3's should be moving up the computing chain.

Backup Slides

ATLAS code installation
- NFS file server.
- ManageTier3 SW package (Asoka De Silva, TRIUMF).
- Well tested and straightforward to use.

ATLAS Tier 3 Workshop, Jan
- Organizers: Massimo Lamanna, Rik Yoshida, DB.
- A follow-on to activities in the US the year before; it showed the variety of Tier 3's in ATLAS, with good attendance from all across ATLAS.
- Six working groups were formed to address various issues:
  1. Distributed storage (Lustre/GPFS and XRootD subgroups)
  2. DDM to Tier 3 link
  3. Tier 3 support
  4. PROOF
  5. Software and conditions DB
  6. Virtualization