EGEE-III INFSO-RI-222667. Enabling Grids for E-sciencE. EGEE and gLite are registered trademarks.

Response of the ATLAS Spanish Tier2 for the collisions collected in the first run at LHC

Presenter: Santiago González de la Hoz, IFIC (Instituto de Física Corpuscular), Centro mixto CSIC-Universitat de València
5th EGEE User Forum, HEP, Uppsala (Sweden), 14/04/2010

GONZALEZ, S. (1), SALT, J. (1), AMOROS, G. (1), FERNANDEZ, A. (1), KACI, M. (1), LAMAS, A. (1), OLIVER, E. (1), SANCHEZ, J. (1), VILLAPLANA, M. (1), BORREGO, C. (2), CAMPOS, M. (2), NADAL, J. (2), OSUNA, C. (2), PACHECO, A. (2), DEL PESO, J. (3), PARDO, J. (3), GILA, M. (3), ESPINAL, X. (4)
(1) IFIC-Valencia, (2) IFAE-Barcelona, (3) UAM-Madrid, (4) PIC-Barcelona

Outline
– Introduction
– Operation of the Tier2 during the LHC beam events
   Resources
   Operation procedure
– Data management of the collision events
   Data distribution policy
   Distribution in the storage elements (space tokens) and access to the data
– The simulated physics events production
   The ATLAS distributed production system
   Simulated events production at ES-ATLAS-T2
   Contribution of the Iberian cloud to EGEE and ATLAS
– Distributed Analysis and Local Analysis Facilities
   From raw data to the final analysis phase
– Conclusions

Introduction
Portugal and Spain form the South Western Europe Regional Operations Centre (ROC) within EGEE.
The Iberian Cloud:
– collaborates with the ATLAS computing effort, contributing storage capacity and computing power,
– hosts an ATLAS federated Tier2 infrastructure.
The ATLAS federated Tier2 in Spain consists of three sites:
– IFAE-Barcelona
– UAM-Madrid
– IFIC-Valencia (Tier2 activities coordinator)
The Tier1 is at PIC-Barcelona.

Operation of ES-ATLAS-T2
Resources
– During the first LHC collisions, ES-T2 provided the hardware resources fulfilling the ATLAS requirements of the MoU.
– Disk space is managed by two distributed storage systems:
   dCache at IFAE and UAM
   Lustre at IFIC
– Worker nodes have 2 GB of RAM per CPU core, needed to run the highly demanding ATLAS production jobs.
– In addition, each site provides users with a Tier3 infrastructure for data analysis:
   one part is based on the Grid architecture (batch system),
   another part is a standard computing cluster (interactive analysis).

Operation of ES-ATLAS-T2
Operation procedure (the usual one)
– Tools for automatic installation and configuration (large number of machines)
   Used to install and configure the operating system, the Grid middleware and the storage system.
   QUATTOR is the tool used at ES-T2 (UAM participates in its development).
   The ATLAS software is installed by submitting a Grid job, through a centralized procedure run by the ATLAS software team.
– Monitoring (to assure the needed reliability of the Tier2 centre)
   Detects any possible failure of the services or of the hardware.
   Two levels of monitoring: global LHC Grid (Service Availability Monitoring) and internal to the site, using tools like Nagios, Ganglia or Cacti.
– Management of updates (to avoid jeopardizing the centre availability)
   Every site has a pre-production cluster where software updates are tested.
– Fair share (jobs are scheduled according to their priority); the idea is illustrated in the sketch below
   CPU usage is fairly shared between Monte Carlo production and analysis jobs.
   The percentage assigned to each role is configured in the scheduler (Maui) and used by the batch queue system (Torque).
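To make the fair-share idea concrete, here is a minimal Python sketch of the kind of priority correction a fair-share scheduler applies: a role running below its target share gets a positive bonus, a role above it gets a negative one. The role names, target percentages, usage numbers and weight are illustrative assumptions, not the actual Maui configuration of ES-T2.

    # Minimal sketch of fair-share priority logic, in the spirit of Maui's
    # fair-share targets. Roles, targets, usage and weight are assumptions,
    # not the real ES-ATLAS-T2 settings.
    targets = {"production": 0.65, "analysis": 0.35}    # assumed CPU share split
    usage = {"production": 5200.0, "analysis": 1800.0}  # walltime hours in the window (made up)
    fs_weight = 100.0                                    # strength of the correction

    total = sum(usage.values())

    def fairshare_bonus(role):
        """Positive if the role is below its target share, negative if above."""
        actual = usage[role] / total if total > 0 else 0.0
        return fs_weight * (targets[role] - actual)

    for role in targets:
        print(role, round(fairshare_bonus(role), 1))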

Data management of the collision events
Data distribution policy (during the 2009 beam run period)
– Raw data were transferred from CERN to the 10 Tier1s, keeping 3 replicas of them.
– Reconstructed data (ESD/AOD/dESD) were distributed to Tier1s and Tier2s, retaining 10 (13) replicas.
   The large number of replicas is due to the low luminosity of the early data.
– A large fraction of these data has been kept on disk, in order to perform the analyses needed to understand the detector response.

Data management of the collision events
Distribution in the storage elements (space tokens)
– Storage in ATLAS is organized using space tokens:
   ATLASDATADISK is dedicated to real data.
   ATLASMCDISK is populated with the Monte Carlo production.
   In addition, space tokens are reserved for physics groups and for users.
Access to the data
– Users can submit a job to the Grid; the job will run at the site where the data are located.
– Users can request the replication of some data to the Tier2 site they belong to, in order to access the data in a native mode.
(The original slide shows the data distribution in ES-T2 as of February 2010.)

The simulated events production
The ATLAS distributed production system (based on pilot jobs since January 2008) consists of:
– A database (DB) where the jobs to be run are defined, together with their run-time status.
– An executor (PanDA) which takes the jobs from the DB and manages sending them to the ATLAS computing resources using pilot jobs. The pilots:
   check that the environment is correct for running the jobs,
   are able to report the free resources on the cluster they are running on,
   together with DDM, transfer the needed data to the site before running the job.
– A distributed data management (DDM) system which stores the produced data on the adequate storage resources and registers them in the defined catalogues.
A schematic view of one pilot-job cycle is given below.
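To make the description above concrete, the following schematic Python sketch walks through one pilot-job cycle. Every name in it (the functions, the server URL) is a hypothetical placeholder for illustration, not the real PanDA or DDM interface.

    # Schematic pilot-job cycle as described on the slide. All names below
    # (functions, server URL) are hypothetical placeholders, not real APIs.

    PANDA_SERVER = "https://panda.example.org/jobdispatch"   # placeholder

    def get_free_resources():
        """Inspect the worker node: CPU, memory, scratch space it can offer."""
        return {"cpu_cores": 1, "memory_mb": 2000, "disk_gb": 20}

    def request_job(server, resources):
        """Ask the central job database for a defined job matching these resources."""
        return None   # placeholder: no job returned in this sketch

    def stage_in(job):
        """Together with DDM, transfer the needed input data to the site."""

    def register_output(job):
        """Store the produced data on the adequate storage and register it in the catalogues."""

    def run_pilot():
        resources = get_free_resources()            # report what the node offers
        job = request_job(PANDA_SERVER, resources)
        if job is None:                             # nothing to run: exit quietly
            return
        stage_in(job)                               # inputs brought in before running
        job.run()                                   # execute the ATLAS payload
        register_output(job)                        # outputs stored and catalogued

    if __name__ == "__main__":
        run_pilot()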

The simulated events production
Simulated events production at ES-ATLAS-T2
– ES-T2 has been actively contributing to this production effort.
– An increase is observed over the last few months, when a massive production activity took place in ATLAS.
– Over the last five months:
   33k jobs ran at IFAE
   95k jobs ran at IFIC
   85k jobs ran at UAM
– For some months ES-T2 did not contribute, because of:
   errors related to the storage element (missing files, read/write failures),
   errors due to a bad installation of the ATLAS software,
   downtime needed for upgrades.
M. Kaci et al., submitted to the Ibergrid'10 conference.

The simulated events production
Simulated events production at ES-ATLAS-T2
– Job efficiency = (number of jobs successfully done) / (total number of jobs submitted)
– Wall clock time efficiency = (wall clock time spent running successful jobs) / (total wall clock time used by all submitted jobs)
– The job efficiency is around 90% or higher.
– The wall clock time efficiency is about 95% (a worked example of both definitions is given below).
M. Kaci et al., submitted to the Ibergrid'10 conference.
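As a worked example of the two definitions above (the numbers are illustrative, not the published ES-ATLAS-T2 figures):

    # Worked example of the two efficiency definitions; numbers are illustrative.
    jobs_total = 100_000            # jobs submitted
    jobs_successful = 91_500        # jobs successfully done

    walltime_successful = 190_000.0   # wall clock hours spent by successful jobs
    walltime_total = 200_000.0        # wall clock hours used by all submitted jobs

    job_efficiency = jobs_successful / jobs_total
    wallclock_efficiency = walltime_successful / walltime_total

    print(f"job efficiency:        {job_efficiency:.1%}")          # 91.5%
    print(f"wall clock efficiency: {wallclock_efficiency:.1%}")    # 95.0%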

The simulated events production
Contribution of the Iberian Cloud to EGEE and ATLAS
– 12-14% of EGEE and 8% of ATLAS for the jobs done.
– 10% of EGEE and 6% of ATLAS for the wall clock time.
– Jobs successfully run during 2008, 2009 and the first two months of 2010; efficiencies are given in parentheses (table in the original slide).
– An increase of the simulated events production activity is seen, together with an improvement of the measured efficiencies.
M. Kaci et al., submitted to the Ibergrid'10 conference.

Distributed Analysis and Local Facility
Analyses of collision events performed on the ES-T2 resources fall into two categories:
– Detector and combined performance
– Physics analysis
Typical analysis workflow:
– The first stage is to submit a job to the Grid (distributed analysis).
   The output files are stored in the corresponding Tier2 storage.
– Later, the output files can be retrieved for further analysis on the local facilities of the institute.
ATLAS distributed analysis in ES-T2:
– Two front-ends were used: Ganga and PanDA.
– They differ in the way they interact with the user and with the Grid resources.

Distributed Analysis and Local Facility
Ganga has been intensively used by the ATLAS community on ES-T2:
– More than 800 jobs were submitted via Ganga to ES-T2 from the first collisions to March 2010.
– Users are analyzing RAW, ESD and AOD data.
Jobs are submitted to PanDA via a simple Python client:
– The original slide shows the number of analysis jobs allocated by PanDA in ES-T2 from the first collisions to March 2010.
– More than 5000 jobs were submitted via PanDA.
A minimal Ganga-style job definition is sketched below.
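For illustration, a minimal Ganga-style job definition of the kind used for ATLAS distributed analysis is sketched below (Ganga scripts are Python and run inside a Ganga session). The class names are assumed to be those of the ATLAS Ganga plugins of that period, and the input dataset name is a made-up placeholder.

    # Minimal sketch of an ATLAS distributed-analysis job defined in Ganga.
    # Meant to be run inside a Ganga session: Job, Athena, DQ2Dataset and
    # Panda are assumed to come from the ATLAS Ganga plugins; the input
    # dataset name is a placeholder.

    j = Job()
    j.name = 'collisions-analysis-example'

    j.application = Athena()                  # the user's Athena analysis code
    j.inputdata = DQ2Dataset()                # input dataset managed by the ATLAS DDM
    j.inputdata.dataset = 'data09_900GeV.physics_MinBias.merge.AOD.example/'  # placeholder

    j.backend = Panda()                       # submit through the PanDA back-end
    j.submit()

The same job object can instead be pointed at the gLite/LCG back-end, which is one of the ways the two front-ends differ in how they reach the Grid resources.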

Distributed Analysis and Local Facility
Analysis at the local facilities
– The outcome of the analysis jobs run at the Tier2 through distributed analysis is a dataset of n-tuples (DPD) stored in the SE.
– The next step is carried out at the local facility. The dataset can be retrieved with two different procedures (an example of the second one is given below):
   request a subscription to the ATLAS DDM system to replicate the dataset to the LOCALGROUPDISK area (a space token at the Tier3 facility of the institute), or
   use the ATLAS DDM client tools to download the dataset to the local disk space.
– Every site (UAM, IFAE and IFIC) has a local facility (Tier3) providing CPU and disk resources for the last phase of the analysis:
   some User Interfaces are used to perform interactive analysis on the final datasets produced by the distributed analysis,
   a local batch farm provides additional computing power for analysis,
   a PROOF farm allows parallel processing of ROOT analysis jobs.
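A minimal sketch of the second retrieval procedure (downloading with the ATLAS DDM client tools) follows. It assumes the DQ2 end-user client environment is already set up on the local machine; the dataset name and the target directory are placeholders.

    # Download a dataset to local disk with the ATLAS DDM (DQ2) client tools.
    # Assumes the dq2 client environment is set up; names are placeholders.
    import subprocess

    dataset = 'user.sgonzale.example.output/'    # placeholder DPD/ntuple dataset
    target_dir = '/data/localgroup/ntuples'      # placeholder local disk area

    # dq2-get copies the files of the dataset from Grid storage into the
    # working directory, so run it from the chosen target area.
    subprocess.check_call(['dq2-get', dataset], cwd=target_dir)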

Distributed Analysis and Local Facility
Tier3 facility at IFIC: from the raw data to the plot
– Following the ATLAS Computing Model, only analysis jobs which read data from disk can be submitted to the Grid.
– Raw data are recorded on the tape systems of the Tier1s.
– Raw data are reconstructed, and the resulting ESD and AOD data are distributed to the Tier2 disks.
– In the first step of the analysis (distributed analysis), the AODs are processed to produce filtered versions (dAOD, DPD, n-tuples) containing the events of interest for the institute.
– With these DPDs/n-tuples, the core analysis starts at the local facility (Tier3), using the most common data analysis framework (ROOT) to plot the final results; a sketch is given below.
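As an illustration of this last step at the Tier3, the following PyROOT sketch opens a final n-tuple and draws one distribution; the file name, tree name and branch name are placeholders.

    # Final Tier3 step: read the retrieved n-tuple with ROOT and plot a
    # distribution. File, tree and branch names are placeholders.
    import ROOT

    f = ROOT.TFile.Open("ntuple_example.root")     # DPD/ntuple taken from the SE
    tree = f.Get("analysis")                       # placeholder tree name

    canvas = ROOT.TCanvas("c1", "Final plot")
    hist = ROOT.TH1F("h_pt", "Track p_{T};p_{T} [GeV];Entries", 50, 0.0, 10.0)

    tree.Draw("track_pt >> h_pt")                  # placeholder branch name
    hist.Draw()
    canvas.SaveAs("track_pt.png")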

Conclusions
It has been shown that ES-T2 responded very efficiently at the different stages, from data taking to the final experimental results, by:
– receiving and storing the produced data, thanks to the high availability of its sites and the reliable services provided,
– providing users with easy and quick access to these data,
– contributing continuously and actively to the simulated physics events production,
– providing the required distributed analysis tools to allow users to exploit the data and produce experimental results, and
– setting up the infrastructure for the final analysis stages (Tier3s), managed by the Tier2 personnel.