Added value of new features of the ATLAS computing model and shared Tier-2 and Tier-3 facilities from the community point of view (Gabriel Amorós, on behalf of the IFIC ATLAS computing group)

Added value of new features of the ATLAS computing model and shared Tier-2 and Tier-3 facilities from the community point of view
Gabriel Amorós, on behalf of the IFIC ATLAS computing group: FERNANDEZ, Álvaro; VILLAPLANA, Miguel; FASSI, Farida; GONZALEZ, Santiago; KACI, Mohammed; LAMAS, Alejandro; OLIVER, Elena; SALT, Jose; SANCHEZ, Javier; SANCHEZ, Victoria
EGI Community Forum

Outline
Introduction to ATLAS, IFIC and the Spanish Tier-2
Evolution of the ATLAS data and computing model:
  Flattening the model to a mesh
  Availability and connectivity
  Job distribution
  Dataset replication
IFIC infrastructure and operations
Example of Grid and physics analysis

ATLAS centres: CERN and the IFIC Tier-2 (ATLAS dashboard Google Earth view: http://dashb-earth.cern.ch/dashboard/dashb-earth-atlas.kmz)

Spanish contribution to ATLAS

Spanish ATLAS Tier2 as of February 2012

IFIC Computing Infrastructure
Resources as of February 2012: ATLAS resources plus CSIC resources (not experiment resources).
The CSIC resources are shared as 25% IFIC users, 25% CSIC, 25% European Grid, 25% Iberian Grid, and are used to migrate scientific applications to the EGI Grid.

Summary of year 2011
3,579,606 jobs
6,038,754 CPU hours consumed
13,776,655 kSI2K hours (normalised CPU time)
Supporting 22 Virtual Organizations (VOs)

2012 forecast for IFIC

YEAR        2010  2011  2012  2013
CPU (HS06)  6000  6950  6650  7223
DISK (TB)    500   940  1175  1325

For 2012: we already fulfil the CPU requirements. Disk increase planned for April: about 230 TB, as 4 SuperMicro servers x 57.6 TB (2 TB disks).

From Tiered to a Mesh model

Connectivity
The network is a key component in the evolution of the ATLAS model.
Transfers among sites are measured by ATLAS through the FTS logs: a matrix of NxN components, using < 1 MB and > 1 GB files to separate the initialization and transfer components.
IFIC is qualified as a T2 well connected to the T1s and was included as a multicloud site: direct transfers from/to all ATLAS T1s, and it can process data from many T1s.
T2Ds are the Tier-2 sites that can directly transfer files of any size (especially large files) not only from/to the T1 of the cloud to which they are associated, but also from/to the T1s of the other clouds. T2Ds are candidates for multi-cloud production sites and primary replica repository sites (T2PRR).
This affects IFIC: transfers now come from several sources.
Transfer monitoring: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/DataDistributionSharesForT2 and https://twiki.cern.ch/twiki/bin/viewauth/Atlas/T2D#validated_T2Ds
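To make the measurement above concrete, here is a small sketch (illustrative only, not the actual ATLAS link-test tooling; the record layout of size and duration is an assumption): the small-file sample probes the per-transfer initialization overhead, the large-file sample probes the sustained bandwidth, for one link of the NxN matrix.

```python
# Illustrative sketch: estimate, for one source->destination link, the per-transfer
# initialization overhead and the sustained bandwidth by combining small-file (< 1 MB)
# and large-file (> 1 GB) FTS transfers. Record layout (bytes, seconds) is assumed.

from statistics import median

def link_metrics(transfers):
    """transfers: list of (size_bytes, duration_s) tuples for one link."""
    small = [d for s, d in transfers if s < 1e6]       # dominated by initialization
    large = [(s, d) for s, d in transfers if s > 1e9]  # dominated by data movement

    init_s = median(small) if small else None
    if large and init_s is not None:
        # subtract the initialization estimate before computing the rate
        rates = [s / max(d - init_s, 1e-3) for s, d in large]
        bandwidth_mbps = median(rates) * 8 / 1e6
    else:
        bandwidth_mbps = None
    return init_s, bandwidth_mbps

# Example with a few made-up transfers on one T1 -> IFIC link
samples = [(5e5, 4.2), (8e5, 3.9), (2e9, 95.0), (3e9, 140.0)]
print(link_metrics(samples))
```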

Availability
HammerCloud (for ATLAS): sites are tested with 'typical' analysis jobs; problematic sites are identified online and automatically blacklisted, protecting user jobs from problematic sites.
Last month IFIC had > 90% availability according to HammerCloud (alpha site).
EGI has its own, different availability tools, based on the ops VO, which can give different results.
IFIC plot for last month; note: T1 LFC catalogue movement to T0.
http://dashb-atlas-ssb.cern.ch/dashboard/request.py/siteviewhistorywithstatistics?columnid=564&view=shifter%20view
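A minimal sketch of the blacklisting idea, assuming availability is the fraction of successful functional-test jobs in a sliding window. The 90% figure matches the number quoted above; the data layout and window length are illustrative assumptions, not the real HammerCloud configuration.

```python
# Minimal sketch of availability-based blacklisting, in the spirit of HammerCloud.
# Sites whose recent test-job success fraction falls below the threshold are excluded.

from datetime import datetime, timedelta

THRESHOLD = 0.90            # availability quoted on the slide
WINDOW = timedelta(days=30) # assumed sliding window

def availability(test_jobs, now):
    """test_jobs: list of (timestamp, succeeded) results for one site."""
    recent = [ok for ts, ok in test_jobs if now - ts <= WINDOW]
    return sum(recent) / len(recent) if recent else 0.0

def blacklist(sites, now):
    """sites: dict site_name -> list of (timestamp, succeeded)."""
    return [name for name, jobs in sites.items()
            if availability(jobs, now) < THRESHOLD]

now = datetime(2012, 3, 1)
sites = {
    "IFIC-LCG2": [(now - timedelta(days=d), True) for d in range(20)],
    "SOME-SITE": [(now - timedelta(days=d), d % 2 == 0) for d in range(20)],
}
print(blacklist(sites, now))   # -> ['SOME-SITE']
```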

Job distribution at the Spanish Tier-2
Tier-2 usage in data processing: more job slots are used at Tier-2s than at the Tier-1, and a large part of MC production and analysis is done at Tier-2s.
More analysis jobs and "group production" at Tier-2s, less weight at Tier-1s: production shares limit "group production" jobs at the Tier-1 and run them at Tier-2s, and the analysis share is reduced at Tier-1s.
In 2011, more analysis jobs at the IFIC Tier-2.
Figure: number of jobs running at Tier-0 plus all Tier-1s (left) and at all Tier-2s (right) from October 2010 to March 2012.
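As a rough illustration of how production shares steer activities between tiers, the sketch below splits a tier's job slots according to per-activity shares. The share values are invented for the example and are not the real ATLAS numbers.

```python
# Illustrative only: distribute job slots per tier according to activity shares.
# Share values are invented; the real ATLAS shares and activity names differ.

TIER_SHARES = {
    "Tier-1": {"reprocessing": 0.60, "group_production": 0.10, "analysis": 0.30},
    "Tier-2": {"mc_production": 0.45, "group_production": 0.25, "analysis": 0.30},
}

def slots_per_activity(tier, total_slots):
    """Split the available slots of one tier according to its activity shares."""
    return {activity: round(total_slots * share)
            for activity, share in TIER_SHARES[tier].items()}

print(slots_per_activity("Tier-1", 5000))
print(slots_per_activity("Tier-2", 3000))
```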

Replication of datasets in the ES cloud
IFIC is an alpha T2D site (good connectivity and availability), and therefore a candidate for more dataset replication.
The ES-cloud Tier-2 sites are receiving more datasets, a bit more than the Tier-1 (PIC).
Figure: data transfer volume at the ES-cloud sites from January to March 2011 (left) and 2012 (right).
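The selection logic implied here can be sketched as follows: only T2D sites with high availability are considered as candidates for extra replicas. The ranking by free space and all site values below are assumptions made for illustration, not the real ADC configuration.

```python
# Illustrative ranking of Tier-2 sites as candidates for extra dataset replicas,
# in the spirit of the T2D / "alpha site" classification.

def replica_candidates(sites, min_availability=0.90):
    """sites: list of dicts with 'name', 'is_t2d', 'availability', 'free_tb'."""
    eligible = [s for s in sites
                if s["is_t2d"] and s["availability"] >= min_availability]
    # prefer sites with more free space among the eligible ones (assumed criterion)
    return sorted(eligible, key=lambda s: s["free_tb"], reverse=True)

sites = [
    {"name": "IFIC-LCG2", "is_t2d": True,  "availability": 0.95, "free_tb": 180},
    {"name": "SITE-B",    "is_t2d": True,  "availability": 0.80, "free_tb": 300},
    {"name": "SITE-C",    "is_t2d": False, "availability": 0.99, "free_tb": 250},
]
print([s["name"] for s in replica_candidates(sites)])   # -> ['IFIC-LCG2']
```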

Storage tokens (February 2012)
LOCAL is for T3 local users, outside the T2 pledges. The numbers reflect the actual usable space in the filesystem.

IFIC storage resources and network
Sun Thumpers (X4500 + X4540) running Lustre v1.8
New Supermicro servers: SAS disks, 2 TB per disk, 10 GbE connectivity
SRMv2 (StoRM) and 3 GridFTP servers
Data network based on Gigabit Ethernet, with a 10 GbE network for the new servers

IFIC Lustre filesystems and pools
With Lustre release 1.8.1 we added pool capabilities to the installation. Pools allow us to partition the hardware inside a given filesystem: better data management, the option to separate heterogeneous disks in the future, and distinct T2/T3 ATLAS pools separated from the other VOs, for better management and performance.
Four filesystems with various pools:
/lustre/ific.uv.es: read-only on WNs and UI, read-write on GridFTP + SRM
/lustre/ific.uv.es/sw: software, read-write on WNs and UI (ATLAS uses CVMFS)
/lustre/ific.uv.es/grid/atlas/t3: space for T3 users, read-write on WNs and UI
xxx.ific.uv.es@tcp:/homefs mounted on /rhome (type lustre): shared home for users and MPI applications, read-write on WNs and UI

Example of Grid & physics analysis
Distributed computing and data management tools based on Grid technologies have been used by IFIC physicists to obtain their results (M. Villaplana).

Daily user activity in distributed analysis: an example workflow
Test the analysis locally, with input files downloaded with DQ2.
Submit job 1 to the Grid: it runs over real/simulated input data and creates an output file with only the interesting information; it takes around 20 h.
Submit job 2 to the Grid (at the Tier-3): particle reconstruction, selections, etc., taking the output of job 1 as input; it takes around 2 h.

Tier-3 infrastructure at IFIC
At IFIC the Tier-3 resources are split into two parts:
Resources coupled to the IFIC Tier-2, in a Grid environment: used by IFIC ATLAS users and, when idle, by the ATLAS community.
A computer farm to perform interactive analysis (PROOF) outside the Grid framework.
Reference: ATL-SOFT-PROC-2011-018

Tools for distributed analysis
Grid tools developed for ATLAS users:
For data management:
Don Quijote 2 (DQ2): data information (name, files, sites, number, ...), download and registration of files on the Grid.
Data Transfer Request (DaTRI): extension of DQ2 including quota management; users request a set of data (datasets) to create replicas at other sites (under restrictions).
ATLAS Metadata Interface (AMI): data information (number of events, availability; for simulation, generation parameters, ...).
For Grid jobs:
PanDA Client: tools from the PanDA team for sending jobs in an easy way for the user.
Ganga (Gaudi/Athena and Grid Alliance): a job management tool for local, batch-system and Grid jobs.

Daily user activity in distributed analysis
1) A Python script is created where the requirements are defined: application location, input, output, a replica request to IFIC, and splitting (see the sketch below).
2) The script is executed with Ganga/PanDA and the Grid job is sent.
3) When the job finishes successfully, the output files are copied to the IFIC Tier-3, giving easy access for the user.
In just two weeks, 6 users of this analysis sent 35728 jobs to 64 sites; 1032 jobs ran in the Spanish Tier-2 (2.89%). Input: 815 datasets; output: 1270 datasets.
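A sketch of what such a Ganga script may look like, to be run inside a Ganga session. Class names follow the GangaAtlas/Panda plugins of that period, but the exact attributes, the dataset name and the space-token name are placeholders, not the script actually used in this analysis.

```python
# Sketch of a GangaAtlas job of the kind described above: Athena analysis on PanDA,
# DQ2 input dataset, output requested at the IFIC storage, split into subjobs.
# Run inside a Ganga session (e.g. "ganga myscript.py"); names are placeholders.

j = Job()
j.name = 'ific_analysis_example'

# Application: the user's Athena job options
j.application = Athena()
j.application.option_file = ['MyAnalysis_jobOptions.py']
j.application.prepare()

# Input: a DQ2 dataset (placeholder name)
j.inputdata = DQ2Dataset()
j.inputdata.dataset = ['data11_7TeV.ExampleDataset.AOD']

# Output: register the results and request them at the IFIC storage
j.outputdata = DQ2OutputDataset()
j.outputdata.location = 'IFIC-LCG2_LOCALGROUPDISK'

# Splitting and backend
j.splitter = DQ2JobSplitter()
j.splitter.numsubjobs = 50
j.backend = Panda()

j.submit()
```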

User experience
Users are mostly satisfied with the infrastructure and tools.
Dataset replication is transparent to users, but eventually they notice that it is in action:
"a job was finally sent to a destination that did not originally have the requested dataset"
"I saw later that my original dataset had been copied"
"I just realized because I had used dq2 previously to check where the dataset was"
Replication is effective for official datasets, not for user-generated ones:
"We see that it was not replicating our homemade data"

Summary
The evolution of the ATLAS data model was presented, together with how it affects the IFIC Tier-2: improved usage of the Tier-2 resources, and changes in operations and infrastructure.
A real example of how users are working, and their experience, was also presented: dynamic data replication is useful and transparent.