Grid Computing
4th FCPPL Workshop
Gang Chen & Eric Lançon
LHC Grid Computing
- The LHC started operation in March.
- WLCG became a true production-level computing system for the experiments.
- Grid computing collaboration within FCPPL also faces the challenge of meeting the LHC's requirements.
Grid Organization
- [Diagram: CERN, Lyon, Beijing]
- Active collaboration between the Lyon T1 and the Beijing T2 is mandatory (from Eric's slides for FCPPL 2010)
Activities in 2010
- One person from CC-IN2P3 (Fabio Hernandez) is spending two years at IHEP, starting last summer
  - Strengthens the close collaboration between the two partners
- One person from IHEP (Jingyan Shi) visited CC-IN2P3 for three weeks
  - Exchange of expertise on grid site operations
- Active cooperation between China and France through:
  - Monthly meetings on organizational and operational computing issues in the French cloud
  - Monthly LCG-France technical meetings to share common operational solutions
  - French Cloud conference in November, with three participants from IHEP
- Face-to-face meetings (Eric Lançon, Xiaofei Yan, Gongxing Sun)
  - Visits to Beijing
  - Workshops in France and Japan
Activities in 2010
- Fine tuning of the network between IHEP and CC-IN2P3
  - Guillaume Cessieux and Fazhi Qi involved
- Operation of the ATLAS French cloud
  - Remote operation of sites in China, France, Japan, and Romania
  - Monitoring of production, analysis, and data transfers
  - Shifts operated by 5 people (including Wenjing Wu from IHEP)
- Monitoring of ATLAS Distributed Data Management (DDM)
  - PhD thesis by Donal Zang (IHEP), in cooperation with the main DDM architect (French collaborator)
- CMS-related activities …
Network Performance Tuning
- Problem:
  - CC-IN2P3 → IHEP performance was acceptable, but IHEP → CC-IN2P3 was very bad
    - 81.94 KB/s with one stream
  - Large files (>1 GB) could not be transferred from IHEP to CC-IN2P3
Network Performance Tuning
- Contacted RENATER to adjust the network configuration
- Performance returned to the normal level in September
  - IHEP → CC-IN2P3 throughput with a single stream can reach a few MB/s
  - Comparable with CC-IN2P3 → IHEP
- A performance asymmetry still persists…
- Further work is needed in 2011
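The single-stream numbers on the two slides above are consistent with a TCP-window-limited long-distance path. As a rough illustration, the sketch below computes the effective window implied by the observed 81.94 KB/s and the window (bandwidth-delay product) needed for a few MB/s; the ~200 ms round-trip time between Beijing and Lyon is an assumption used only for this estimate.

```python
# Back-of-the-envelope TCP window estimate for the IHEP <-> CC-IN2P3 link.
# The 81.94 KB/s figure comes from the slide; the 200 ms round-trip time is
# an ASSUMPTION for a Beijing-Lyon path, used only for illustration.

def window_bytes(throughput_bytes_per_s: float, rtt_s: float) -> float:
    """Single-stream TCP throughput is bounded by window / RTT, so the
    window implied by (or required for) a given rate is rate * RTT."""
    return throughput_bytes_per_s * rtt_s

RTT = 0.2                        # assumed round-trip time in seconds
observed = 81.94 * 1024          # observed single-stream throughput, bytes/s
target = 5 * 1024 ** 2           # a few MB/s, as reached after tuning

print(f"window implied by 81.94 KB/s : {window_bytes(observed, RTT) / 1024:.1f} KB")
print(f"window needed for ~5 MB/s    : {window_bytes(target, RTT) / 1024:.0f} KB")
# -> roughly a 16 KB effective window versus ~1 MB needed, which is why
#    TCP buffer tuning along the path (and multiple streams) matters here.
```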
ATLAS DDM/DQ2 Tracer Service
- ATLAS Distributed Data Management service
- Records relevant information about data access and usage on the grid
- Key and critical component of the ATLAS computing model
  - Automatic and dynamic cleaning of grid storage based on popularity
  - Automatic replication of 'hot' data
- Both experiment and user activity have kept increasing since data taking started
- [Plots: evolution of the total space (PB); total number of traces* per month (millions)]
  - *Trace = grid file access (read/write) operation
  - ~60 traces/second on average, with peaks above 300/second
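Conceptually, each trace is a small structured record describing one file access on the grid. The sketch below shows what such a record might look like; the field names and values are illustrative assumptions, not the actual DQ2 schema.

```python
# Illustrative shape of one trace record (a single grid file read/write).
# Field names and values are ASSUMPTIONS for this sketch, not the real DQ2 schema.
from dataclasses import dataclass, asdict
import json
import time

@dataclass
class Trace:
    event_type: str    # e.g. "get" or "put"
    dataset: str       # dataset the file belongs to
    filename: str      # logical file name
    filesize: int      # size in bytes
    site: str          # grid site that served the access
    usr: str           # user or production role
    timestamp: float   # epoch seconds of the operation

trace = Trace("get", "data10_7TeV.SomeDataset", "file-0001.root",
              640 * 1024 ** 2, "BEIJING-LCG2", "prod", time.time())
print(json.dumps(asdict(trace)))   # what a tracer client would report
```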
New DDM/DQ2 Tracer Architecture
- Issues with the old tracer architecture: scalability problems, loss of traces, limited monitoring
- Important contributions from Donal Zang (IHEP)
- Evaluation of new technologies
  - Messaging system & NoSQL databases
  - Official ATLAS R&D task forces / support requested from CERN-IT
- Definition and validation of the new tracer and monitoring architecture
- [Diagram: old architecture (HTTP one-by-one insertion into Oracle, monitoring & API) vs. new architecture (STOMP bulk insertion, tracer agents, statistics agents, real-time statistics, monitoring & API)]
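To make the bulk-insertion idea concrete: instead of inserting each trace into Oracle over HTTP one by one, traces are batched and published to a message broker, from which agents write them into the NoSQL store. The loop below is a minimal sketch of that batching pattern only; publish_batch is a hypothetical stand-in for the real STOMP client call, and the thresholds are illustrative, not production values.

```python
# Minimal sketch of the batch-and-publish pattern behind bulk insertion.
# publish_batch() is a HYPOTHETICAL stand-in for the real STOMP client call;
# BATCH_SIZE and FLUSH_INTERVAL are illustrative, not the production settings.
import json
import time

BATCH_SIZE = 500        # flush when this many traces are queued
FLUSH_INTERVAL = 1.0    # ...or at least once per second

def publish_batch(batch):
    """Placeholder: would serialize the batch and send it as one broker message."""
    message = json.dumps(batch)
    print(f"publishing {len(batch)} traces in one {len(message)}-byte message")

class TracerAgent:
    """Collects traces and ships them to the broker in bulk."""
    def __init__(self):
        self.batch = []
        self.last_flush = time.time()

    def add_trace(self, trace: dict):
        self.batch.append(trace)
        if len(self.batch) >= BATCH_SIZE or time.time() - self.last_flush > FLUSH_INTERVAL:
            self.flush()

    def flush(self):
        if self.batch:
            publish_batch(self.batch)   # one broker message carries many traces
            self.batch = []
        self.last_flush = time.time()
```

Carrying many traces per message removes the per-insert HTTP/Oracle overhead of the old architecture; consumers on the broker side can then write the batches into the NoSQL store and update statistics in near real time.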
Good Results in Production
- All issues solved!
- >1k traces/second, and it can scale linearly
- No lost traces
- Almost real-time monitoring of thousands of metrics
- [Monitoring plots (based on statistics metrics in Cassandra): total file size ~90 TB/hour, average file size ~0.6 GB, file operations ~60/second, average transfer rate ~25 MB/second]
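Metrics of the kind shown on those plots can be obtained by folding the trace stream into time buckets. The sketch below is a batch version for illustration, using the field names of the illustrative trace record above (not the real schema); in the production system these statistics live in Cassandra and are updated as traces arrive.

```python
# Illustrative hourly aggregation of trace records into the kinds of metrics
# shown on the monitoring plots (total volume, op rate, average file size).
# Trace fields follow the illustrative record above, not the real DQ2 schema.
from collections import defaultdict

def hourly_metrics(traces):
    buckets = defaultdict(lambda: {"bytes": 0, "ops": 0})
    for t in traces:
        hour = int(t["timestamp"] // 3600)       # epoch-hour bucket
        buckets[hour]["bytes"] += t["filesize"]
        buckets[hour]["ops"] += 1

    report = {}
    for hour, b in sorted(buckets.items()):
        report[hour] = {
            "total_TB": b["bytes"] / 1e12,             # volume moved that hour
            "ops_per_second": b["ops"] / 3600,         # average operation rate
            "avg_file_GB": b["bytes"] / b["ops"] / 1e9,
            "avg_rate_MB_per_second": b["bytes"] / 3600 / 1e6,
        }
    return report
```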
ATLAS Data Transfer Speed: Lyon to Beijing
- Large improvement of the transfer speed in the last trimester of 2010, thanks to a continuous monitoring effort
ATLAS Data Transfer between Lyon and Beijing
- >130 TB of data transferred from Lyon to Beijing in 2010
- >35 TB of data transferred from Beijing to Lyon in 2010
CMS Data Transfer from/to Beijing
- ~290 TB transferred from elsewhere to Beijing in 2010
- ~110 TB transferred from Beijing to elsewhere in 2010
ATLAS Beijing in 2010
- Beijing ran 10% of the jobs of the FR-cloud T2s
- Production efficiency: 92.5% (average for FR-cloud T2s: 86%)
- Half of Beijing's resources were used for analysis in the second part of 2010
Total Beijing in 2010
- About 8.7 million CPU hours provided and 2.4 million jobs completed during the year:

  Experiment | CPU hours | Jobs
  ATLAS      | 5,054,138 | 1,681,391
  CMS        | 3,639,866 |   752,886
Beijing site
Prospects for 2011
- More integrated operation of the French cloud
- Closer monitoring of data transfers to/from IHEP
- Foreseen areas of cooperation:
  - Improvement of the transfer rate, to ensure that Beijing remains in the top list of ATLAS T2s
  - Caching technology for distributing software & calibration constants
  - Virtual machine testing for deployment on the grid
  - Remote ATLAS control station tests
THANK YOU