Dissemination and User Feedback
Castor deployment team
Castor Readiness Review – June 2006

Slide 2: Dissemination (Jan van Eldik, IT/FIO/FS)
- Conference contributions (selected from the Castor website)
  - "CASTOR status and overview", HEPiX, Apr 2006
  - "CASTOR: Operational issues and new developments" and "Operation of the CERN Managed Storage environment", CHEP, Sept 2004
  - "CASTOR/New stager" presentation and "Storage Resource Sharing with CASTOR", NASA/IEEE Mass Storage Conference, Apr 2004
- Regular reporting
  - PEB, GDB, LHCC, IT department
- Training
  - CASTOR2 operations training, Dec 2005
  - external operations workshops: Feb 2004, Nov 2004, Jun 2005, Jan 2006
- Documentation: user guides, admin guides, man pages, ...
  - lots of it
  - spread across various web sites, reflecting organizational updates
  - in urgent need of restructuring!

Slide 3: Migrate the LHC experiments to Castor-2 (Jan van Eldik, IT/FIO/FS)
- Main activity since late 2005
- Questionnaire and discussions to configure Castor-2 according to the needs of each experiment
- Identify missing functionality, and provide it
  - enhanced stager_qry command (see the sketch after this slide)
- Agree on the migration steps:
  - initial configuration (and sizes!) of the disk pools
  - staged migration: production, data export, user analysis
  - stop usage of castorgridsc.cern.ch (SC3!)
- Migrate the experiments one by one...
  - but as quickly as possible
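A note on the command mentioned above: stager_qry is the Castor-2 command-line client for querying the stager about file staging states. As a minimal sketch, here is how a migration check might wrap it from Python; the -M flag (one Castor file name per flag) matches the standard CASTOR 2 client, but the exact output layout parsed below is an assumption, not a record of the enhancements discussed on this slide.

```python
import subprocess

def castor_file_states(paths):
    """Query the Castor-2 stager for the staging state of each file.

    Assumes the standard CASTOR 2 client command 'stager_qry', with
    -M naming one Castor file per flag. The output layout is an
    assumption: one line per file, ending in a state token such as
    STAGED, CANBEMIGR or STAGEIN.
    """
    cmd = ["stager_qry"]
    for path in paths:
        cmd += ["-M", path]
    result = subprocess.run(cmd, capture_output=True, text=True)
    states = {}
    for line in result.stdout.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            states[fields[0]] = fields[-1]  # assumed: path first, state last
    return states

if __name__ == "__main__":
    # hypothetical path, for illustration only
    print(castor_file_states(["/castor/cern.ch/user/j/jdoe/example.root"]))
```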

Slide 4: General remarks (Jan van Eldik, IT/FIO/FS)
- From our perspective
  - more similarities than differences between the experiments' needs (fortunately)
  - Castor is accessed through the experiment frameworks
  - the stager_qry enhancements were required by all
- Discussions and presentations helped a lot to avoid confusion and to prevent (many, but not all...) problems
- The migration provided a good opportunity to clean up
  - disk caches flushed!
  - "temporary" SRM mappings...
- Confusion about required and provided network connectivity resulted in many tickets

Slide 5: General remarks (2) (Jan van Eldik, IT/FIO/FS)
- Challenging to communicate effectively with Offline coordinators, production managers, end users and T2 managers
- The support line is well known
  - advertised to all experiments and their users; also part of the Grid Support lines
  - being restructured on our side:
    - 1st level: FIO Service Manager on Duty (rota)
    - 2nd level: Castor Service Manager
    - 3rd level: developers

Slide 6: Alice (Jan van Eldik, IT/FIO/FS)
- discussions with Latchezar Betev
- well-understood usage of Castor
- first to go; presented on February 2
- additional complication: xrootd
  - was running on (public!) castorgrid nodes
  - now running on dedicated Alice machines, with special configurations...
- connectivity required for transfers to non-Tier-1 sites
- Next: Alice data challenge in July

Slide 7: LHCb (Jan van Eldik, IT/FIO/FS)
- discussions with Philippe Charpentier and Joel Closier
- presented on March 8
- requires rootd on the disk servers
  - several problems found, fixed by the Root and Castor teams
- requires a 'durable' SRM endpoint
  - Castor release 2.1.0, to be deployed soon
- user migration still ongoing
- the only experiment to ask for a Windows client (for online needs)

Slide 8: CMS (Jan van Eldik, IT/FIO/FS)
- discussions with Nick Sinanis
- presented on March 9
- two independent Castor-1 stagers
  - stagecmsprod
  - stagecms
- considerable SC3 activity until March/April
- migration completed at the beginning of May
- problems:
  - connectivity required for PhEDEx transfers
  - load on the disk servers caused by normal(!) user activity

Slide 9: Atlas (Jan van Eldik, IT/FIO/FS)
- discussions with Gilbert Poulard and Luc Goossens
- presented on March 27
- most affected by the Castor-1 limitations
  - very frequent crashes
- requires rootd on the disk servers (like LHCb)
- requires a 'durable' SRM endpoint (like LHCb)
- migration completed at the end of April
- problems:
  - failing hardware for the stager database
  - after that: stager database problems...
  - load on the disk servers caused by normal(!) user activity
- we are now preparing the TDAQ data challenge

Slide 10: Service Challenges (Jan van Eldik, IT/FIO/FS)
- a very demanding (and useful!) customer
- simple use of Castor...
  - a single disk pool with disk-resident, read-only, large files
- ... but large setups...
  - SC3 service phase: many imbalanced (old) servers, complicated network topology / Castor setup, handcrafted configurations, etc.
  - SC3 rerun: 16 identical servers on a single IP service
  - SC4 throughput phase: >50 servers on 4 IP services; service phase: ~40 servers
- ... with challenging requirements
  - SRM endpoints to be registered in GOCDB and the site BDII (see the sketch after this slide)
  - SLC4
  - gridftp performance monitoring using R-GMA
  - management of host certificates: still a lot of handwork
  - frequent support requests, 24x7
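The BDII registration requirement above can be spot-checked from the client side. The sketch below queries a site BDII over LDAP for the SRM services it publishes; it assumes the conventional BDII setup of the time (LDAP on port 2170, GLUE 1.x schema under base o=grid), a hypothetical host name, and the ldap3 Python package.

```python
from ldap3 import Server, Connection

# hypothetical site BDII host; port 2170 and base "o=grid" are the
# conventional settings for a GLUE 1.x BDII
BDII_URL = "ldap://bdii.example-site.ch:2170"

def published_srm_services(bdii_url=BDII_URL):
    """Return (type, endpoint) pairs for SRM services in a site BDII."""
    conn = Connection(Server(bdii_url), auto_bind=True)
    conn.search(
        search_base="o=grid",
        # GLUE 1.x publishes services as GlueService entries; SRM
        # service types start with "srm" (e.g. srm_v1)
        search_filter="(&(objectClass=GlueService)(GlueServiceType=srm*))",
        attributes=["GlueServiceType", "GlueServiceEndpoint"],
    )
    return [(str(e.GlueServiceType), str(e.GlueServiceEndpoint))
            for e in conn.entries]

if __name__ == "__main__":
    for srm_type, endpoint in published_srm_services():
        print(srm_type, endpoint)
```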

Slide 11: ITDC (Jan van Eldik, IT/FIO/FS)
- IT Data Challenges – Bernd Panzer
- simulate (and verify) Tier-0 activities
  - many concurrent read + write streams (see the sketch after this slide)
- pushes Castor to its limits (and beyond)
  - scheduling and migration policies
  - new tape infrastructure
  - test deployment of new software releases
- large setup, with many servers
- lots of feedback...
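To make "many concurrent read + write streams" concrete, here is a minimal, generic sketch of that load pattern: a set of writer threads streams large files to a target directory, then a set of reader threads streams them back, with aggregate throughput printed per phase. It illustrates the technique against a local path with illustrative sizes; it is not the actual ITDC harness.

```python
import os
import threading
import time

CHUNK = 1024 * 1024           # 1 MiB per write/read call
FILE_SIZE = 64 * CHUNK        # 64 MiB per stream (illustrative)
STREAMS = 8                   # number of concurrent streams
TARGET = "/tmp/itdc-sketch"   # stand-in for a disk pool mount point

def write_stream(path):
    """Stream FILE_SIZE bytes of random data to one file."""
    buf = os.urandom(CHUNK)
    with open(path, "wb") as f:
        for _ in range(FILE_SIZE // CHUNK):
            f.write(buf)

def read_stream(path):
    """Stream one file back, chunk by chunk."""
    with open(path, "rb") as f:
        while f.read(CHUNK):
            pass

def run_phase(name, worker):
    """Run one worker per stream concurrently and report throughput."""
    threads = [threading.Thread(target=worker,
                                args=(os.path.join(TARGET, f"s{i}.dat"),))
               for i in range(STREAMS)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    total_mb = STREAMS * FILE_SIZE / 1e6
    print(f"{name}: {total_mb:.0f} MB in {time.time() - start:.1f} s")

if __name__ == "__main__":
    os.makedirs(TARGET, exist_ok=True)
    run_phase("write", write_stream)  # concurrent write streams
    run_phase("read", read_stream)    # concurrent read streams
```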

Slide 12: Castor-2 usage: view from LSF (Jan van Eldik, IT/FIO/FS)
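Castor-2 schedules disk-server access through LSF, which is why the scheduler's view (presumably a chart in the original deck) doubles as a usage monitor. A crude version of such a view can be pulled from the LSF command line; the sketch below assumes only the standard bjobs command and its usual tabular output, with the STAT column (RUN, PEND, ...) as the third field.

```python
import subprocess
from collections import Counter

def lsf_job_states():
    """Tally LSF job states across all users.

    Assumes the standard LSF 'bjobs' command, whose default tabular
    output has a header row and the STAT column as the third field.
    """
    out = subprocess.run(["bjobs", "-u", "all"],
                         capture_output=True, text=True).stdout
    states = Counter()
    for line in out.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] != "JOBID":  # skip the header
            states[fields[2]] += 1
    return states

if __name__ == "__main__":
    for state, count in sorted(lsf_job_states().items()):
        print(f"{state:6s} {count}")
```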

Slide 13: Conclusion (Jan van Eldik, IT/FIO/FS)
- Migration of the LHC experiments almost completed...
  - thanks to the experiments for the fruitful discussions
  - most problems found along the way have been fixed
- ... now their setups have to evolve!
- Service Challenges and ITDC very useful
  - requirements, bug reports, field testing, etc.
- Support structure improvements
  - documentation needs an overhaul
  - support flows being rationalized