Applications Area Issues – RWL Jones, Deployment Team, 2nd June 2005

General Operational Issues

- Interoperability is still an issue for future and running experiments
  - CDF are now planning to use SAM-classic; there will clearly be SAM-only resources for some time
  - Will need specific support – from the experiment?
  - Integrate accounting
- Multiple file catalogues in multiple Grids is a real issue: 'minimal functionality with maximum error propagation'
  - The catalogue model varies greatly between experiments
- VO services
  - Could documentation on VO creation be better?
  - VOMS is now an experiment priority
    - Must be integrated into all aspects
    - Quotas by role/group
    - Monitoring by role/group
    - Need at least three attributes (e.g. role/group/region)
- 64-bit support will very soon be an issue for the experiments
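
The role/group attributes discussed above are carried by VOMS as FQAN strings of the form `/<vo>/<subgroups>/Role=<role>`. As a minimal sketch (not the VOMS API itself), the three attributes the slide asks for can be pulled out of such a string; the example FQANs and the "second path element is the region" convention are assumptions for illustration:

```python
# Minimal sketch: split a VOMS FQAN of the form
#   /<vo>/<subgroups...>/Role=<role>[/Capability=<cap>]
# into the group/role/region attributes discussed above.
# The FQAN examples and the region convention are hypothetical.

def parse_fqan(fqan):
    parts = [p for p in fqan.strip("/").split("/") if p]
    group_parts = [p for p in parts if "=" not in p]
    attrs = dict(p.split("=", 1) for p in parts if "=" in p)
    return {
        "group": "/" + "/".join(group_parts),
        "role": attrs.get("Role", "NULL"),
        # assumption: the second path element names a region, e.g. /atlas/uk/...
        "region": group_parts[1] if len(group_parts) > 1 else None,
    }

print(parse_fqan("/atlas/uk/Role=production"))
# {'group': '/atlas/uk', 'role': 'production', 'region': 'uk'}
```

With group, role and region separated out like this, quota and monitoring policy can key on any of the three attributes independently.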

Data Movement

- Data management is still a very big issue: most job failures are DM failures
  - lcg-gt: crashes if the SE or BDII is down
  - lcg-cp: hangs if there is a time-out
  - lcg-cr: also hangs, and can leave the catalogue in an inconsistent state
- Bulk transfer tools were needed 'yesterday'
- Will we have multiple FTS instances, and for how long?
- Vital that we have storage/SRM for small files as well as large files, at all Tiers
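
Because lcg-cp and lcg-cr can hang indefinitely, the usual workaround was to wrap them in an external hard timeout with retries. A minimal Python sketch of such a wrapper, shown here with a stand-in command rather than the real lcg-* tools:

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout_s=300, retries=3):
    """Run a (possibly hanging) transfer command with a hard timeout,
    retrying on failure. Returns True on the first successful attempt."""
    for attempt in range(1, retries + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s)
            if result.returncode == 0:
                return True
        except subprocess.TimeoutExpired:
            pass  # the command hung: subprocess.run kills it, so try again
    return False

# Stand-in command; a real deployment would wrap e.g. ["lcg-cp", src, dst]:
ok = run_with_timeout([sys.executable, "-c", "pass"], timeout_s=10)
print("transfer ok:", ok)
```

Note this only bounds the hang; it does nothing for the inconsistent catalogue state lcg-cr can leave behind, which needs a separate cleanup pass.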

Storage

- Storage technologies and access patterns need to be known by the experiments and by the Tiers
  - A HEPiX/GDB group will assess this; RJ to chair, AS will be invited
- dCache
  - ATLAS had some issues concerning usability of the BNL dCache SRM with tape
    - As much a problem on the experiment side? (My ignorance)
    - Experiment requirement documents on aspects like this are still lacking
- Castor
  - Access via the Castor grid is not acceptable for production: ATLAS kill the system with 7,000 transfers a day
  - An issue if people are adopting it
- SE stability is becoming a problem
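
For scale, the 7,000 transfers a day quoted above average out to a request roughly every 12 seconds, sustained around the clock; a quick back-of-envelope check:

```python
# Back-of-envelope load implied by the 7,000 transfers/day figure above.
transfers_per_day = 7000
seconds_per_day = 24 * 60 * 60

interval_s = seconds_per_day / transfers_per_day
per_hour = transfers_per_day / 24

print(f"one transfer every {interval_s:.1f} s (~{per_hour:.0f}/hour)")
# one transfer every 12.3 s (~292/hour)
```

That steady-state rate is modest on paper, which suggests the real killer is bursts and per-request overheads rather than average throughput.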

LCG Service Issues

- Network bottlenecks/server overload
  - Too many connections to the same SE
- RLS down: cannot handle large loads
  - Cause unknown
- Jabber server down
  - Slows things down, but does not lose jobs
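
The "too many connections to the same SE" overload above is the classic case for a client-side per-SE concurrency cap. A hypothetical sketch (the cap and hostnames are illustrative, not site policy) using one semaphore per storage element:

```python
import threading
from collections import defaultdict

MAX_CONN_PER_SE = 10  # hypothetical cap; real limits are per-site policy

# One semaphore per storage element, created lazily under a lock.
_se_slots = defaultdict(lambda: threading.Semaphore(MAX_CONN_PER_SE))
_lock = threading.Lock()

def with_se_slot(se_host, transfer_fn):
    """Run transfer_fn while holding one of the SE's connection slots,
    so at most MAX_CONN_PER_SE transfers hit that SE concurrently."""
    with _lock:
        sem = _se_slots[se_host]
    with sem:
        return transfer_fn()

result = with_se_slot("se.example.ac.uk", lambda: "copied")
print(result)  # copied
```

Throttling at the client keeps a burst of jobs from turning one busy SE into a site-wide failure.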

WMS

- Entry in the BDII is now configurable
  - This adds more sites to the experiment BDII
- Some improvement in LCG submission, but it is still painfully slow
  - ATLAS adding a second submission method (Condor-G) almost doubled the throughput: clearly not saturating the resources!
- Job submission slows if too many CEs are present in the BDII
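
The ATLAS gain from adding Condor-G amounts to interleaving jobs across two independent submission back-ends, so neither path is the sole bottleneck. A toy sketch of round-robin dispatch; the submitter function names are hypothetical placeholders, not real APIs:

```python
from itertools import cycle

def submit_via_lcg_rb(job):
    # placeholder for submission through the LCG Resource Broker
    return f"lcg:{job}"

def submit_via_condor_g(job):
    # placeholder for direct Condor-G submission
    return f"condorg:{job}"

def submit_all(jobs, backends):
    """Interleave jobs across back-ends so a slow submission path
    only delays its own share of the jobs."""
    rotation = cycle(backends)
    return [next(rotation)(job) for job in jobs]

ids = submit_all(["job1", "job2", "job3"],
                 [submit_via_lcg_rb, submit_via_condor_g])
print(ids)  # ['lcg:job1', 'condorg:job2', 'lcg:job3']
```

With two back-ends of similar speed, round-robin roughly doubles the aggregate submission rate, consistent with the throughput gain quoted above.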

Integration of Service Challenges

- Experiment planning is at differing stages of readiness; there is still a need for more planning
  - The late appearance of FTS etc. is difficult for experiment planning
- Network provisioning is an issue
- Monitoring tools are still needed
  - The PR demonstrator is evolving; tools for real work are still under discussion

Lancs/ATLAS SC3 Plan

Task | Start Date | End Date | Resource
Demonstrate sustained data transfer (T0-T1-T2):
  Integrate Don Quixote tools into SC infrastructure | Mon 27/06/05 | Fri 16/09/05 | BD, ATLAS
Provision of end-to-end conn. (T0-T1):
  Test basic data movement (CERN-UK) | Mon 04/07/05 | Fri 29/07/05 | BD, MD, ATLAS, RAL
  Review of bottlenecks and required actions | Mon 01/08/05 | Fri 16/09/05 | BD, ATLAS, RAL
  ATLAS Service Challenge 3 (Service Phase) | Mon 19/09/05 | Fri 18/11/05 | BD, ATLAS, RAL
  Review of bottlenecks and required actions | Mon 21/11/05 | Fri 27/01/06 | BD, ATLAS, RAL
  Optimisation of network | Mon 30/01/06 | Fri 31/03/06 | BD, MD, BG, NP, RAL
  Test of data transfer rates (CERN-UK) | Mon 03/04/06 | Fri 28/04/06 | BD, MD, ATLAS, RAL
  Review of bottlenecks and required actions | Mon 01/05/06 | Fri 26/05/06 | BD, ATLAS, RAL
  Optimisation of network | Mon 29/05/06 | Fri 30/06/06 | BD, MD, BG, NP, RAL
  Test of data transfer rates (CERN-UK) | Mon 03/07/06 | Fri 28/07/06 | BD, MD, ATLAS, RAL
  Review of bottlenecks and required actions | Mon 31/07/06 | Fri 15/09/06 | BD, ATLAS, RAL

The plan is organised through the end of SC3 into SC4.

Lancs/ATLAS SC3 Plan

Task | Start Date | End Date | Resource
  Optimisation of network | Mon 18/09/06 | Fri 13/10/06 | BD, MD, BG, NP, RAL
  Test of data transfer rates (CERN-UK) | Mon 16/10/06 | Fri 01/12/06 | BD, MD, ATLAS, RAL
Provision of end-to-end conn. (T1-T2):
  Integrate Don Quixote tools into SC infrastructure at LAN | Mon 19/09/05 | Fri 30/09/05 | BD
  Provision of memory-to-memory conn. (RAL-LAN) | Tue 29/03/05 | Fri 13/05/05 | UKERNA, BD, BG, NP, RAL
  Provision and commission of LAN h/w | Tue 29/03/05 | Fri 10/06/05 | BD, BG, NP
  Installation of LAN dCache SRM | Mon 13/06/05 | Fri 01/07/05 | MD, BD
  Test basic data movement (RAL-LAN) | Mon 04/07/05 | Fri 29/07/05 | BD, MD, ATLAS, RAL
  Review of bottlenecks and required actions | Mon 01/08/05 | Fri 16/09/05 | BD
[SC3 – Service Phase]
  Review of bottlenecks and required actions | Mon 21/11/05 | Fri 27/01/06 | BD
  Optimisation of network | Mon 30/01/06 | Fri 31/03/06 | BD, MD, BG, NP
  Test of data transfer rates (RAL-LAN) | Mon 03/04/06 | Fri 28/04/06 | BD, MD
  Review of bottlenecks and required actions | Mon 01/05/06 | Fri 26/05/06 | BD

Lancs/ATLAS SC3 Plan

Task | Start Date | End Date | Resource
  Optimisation of network | Mon 29/05/06 | Fri 30/06/06 | BD, MD, BG, NP
  Test of data transfer rates (RAL-LAN) | Mon 03/07/06 | Fri 28/07/06 | BD, MD
  Review of bottlenecks and required actions | Mon 31/07/06 | Fri 15/09/06 | BD
  Optimisation of network | Mon 18/09/06 | Fri 13/10/06 | BD, MD, BG, NP
  Test of data transfer rates (RAL-LAN) | Mon 16/10/06 | Fri 01/12/06 | BD, MD

Resource Planning

- This is more of a PMB/CB/T1 Board issue
  - The UK resources are understood by all to be inadequate in 2007
  - New and more worrying is the CERN intention to cut the T0 and T1AF resources; this is to cover the missing effort (which is already committed)