GridPP3 Storage Perspective, Achievements, Challenges Jens Jensen, STFC RAL GridPP20 TCD Dublin, 11-12 March 2008.

Presentation transcript:

GridPP3 Storage Perspective, Achievements, Challenges Jens Jensen, STFC RAL GridPP20 TCD Dublin, March 2008

Jens Jensen, STFC/RAL Bear with me for a moment View of the past –Achievements –Lessons learned Present –SRM 2 deployment Future –Todo –Really high level stuff

Who we are…
GridPP storage community
As defined by mailing list, has ~55 members
  – Covers every UK site
  – Also in .ie, .nl, .ca, .pl, .it, .de
However, not all are equally active…
  – But that's OK
  – Isn't it?

Support
Developers, Dev support, Depl. support, GridPP support, community support, (local) users

Support
Developers, Dev support, Depl. support, GridPP support, community support, (local) users
1 person…

Support
Developers, Dev support, Depl. support, GridPP support, community support, users
Maybe reality is a little more complicated

"Your name appeared among the beneficiaries who will receive a part-payment of US$2.8 million and has been approved already for months. You are requested to get back to me for more direction and instruction on how to receive your fund. We want to hear from you before we can make the transfer"
Open for questions, goes to Greig and Jens
Almost all spam
Promising to solve our financial problems
They tell us: Storage, size matters

Status

Status
2/3 of sites running DPM
  – Experimentally on Lustre
  – (Cambridge, UCL)
1/3 of sites running dCache
Tier 1 running CASTOR
  – (and dCache)
Bristol (Jon) running StoRM

Status
Finished CCRC'08
Should have SRM2 deployed
  – At least for Atlas (sites)
      – Need space token descrs
      – Problems with space manager in dCache
  – And CMS (sites)
      – More static token descrs initially
  – Information system secondary (tokens static)
      – Still reqd for accounting
Many people worked hard to make it a success

Experiences
Went well, mostly
SRM2 used at RAL
  – Few odd bugs and issues
  – E.g P free
  – Negative file sizes (gridftp 32 bit issue?)
Took time to get space token (descr) agreed
Who speaks for expts?
Using spaces at T2s
  – OK for DPMers
      – Needs firewall open
      – Endpoint published
      – Spaces set up
  – Harder for dCache
      – Problems with space mgr
      – But running on same port
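
The negative file sizes noted on this slide would be consistent with a size passing through a signed 32-bit integer somewhere along the transfer path. The slide only raises this as a question, so what follows is a minimal Python sketch of that general mechanism, not a diagnosis of GridFTP itself. The helper name `as_int32` is made up for illustration.

```python
import struct

def as_int32(size_bytes: int) -> int:
    """Reinterpret the low 32 bits of a size as a signed 32-bit integer."""
    low32 = size_bytes & 0xFFFFFFFF                 # keep only what fits in 32 bits
    packed = struct.pack("<I", low32)               # store as unsigned 32-bit
    return struct.unpack("<i", packed)[0]           # read back as signed 32-bit

one_gib = 1024**3
three_gib = 3 * 1024**3

print(as_int32(one_gib))    # below 2 GiB: value is unchanged
print(as_int32(three_gib))  # between 2 GiB and 4 GiB: value comes out negative
```

Any file of 2 GiB or more is affected by this kind of truncation, which is why such bugs tend to surface only once large physics datasets start to move.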

Lessons
No way to get through to everyone
  – Needs some effort at sites (to do what we need)
  – Workshop at NeSC was a success
Storage is more difficult than you'd think
  – Particularly the occasional peaks
  – Implementation specific optimisations
  – Locating the problem – complex implementations
Need to manage risks more carefully
  – GridPP2: surprising number of risks happened!

Risks... (dating back to Dec06-Feb07, needs revision)

Special Achievements
Beyond the call of duty
Recognised internationally
Or special benefits to users

Information Systems
Information collected globally
Used for accounting
Users locate resources

Information Systems
Much work done on information system backends in GridPP
  – GIP plugin easier
  – DPM (Graeme, then Greig)
  – dCache debug (owned by SARA then DESY)
  – CASTOR
      – Disk servers – Tier 1
      – CASTOR, LSF, tape robot – RAL Storage
      – Oracle databases – RAL DB group

Special Achievements
Accounting
  – Space available and used
  – Resource overview and selection
  – (or non-selection)
Numerous subtle issues with space
What is used? Available?
Can info be relied on for selection?
Subtle implementation issues
Long propeller head discussions
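
One way to see why "What is used? Available?" is a subtle question is with a toy model. The sketch below assumes, purely for illustration, that some space is reserved but not yet written. The class, its field names and the numbers are all hypothetical and are not taken from any GridPP or GLUE schema.

```python
from dataclasses import dataclass

@dataclass
class SpaceReport:
    total: int               # bytes of disk behind the storage element
    written: int             # bytes actually occupied by files
    reserved_unwritten: int  # bytes promised to reservations, not yet written

    def used_strict(self) -> int:
        """'Used' counted as bytes physically written."""
        return self.written

    def used_including_reservations(self) -> int:
        """'Used' counted as everything no longer free to hand out."""
        return self.written + self.reserved_unwritten

    def available_to_new_users(self) -> int:
        """Space that is neither written nor promised to anyone."""
        return self.total - self.written - self.reserved_unwritten

report = SpaceReport(total=100, written=40, reserved_unwritten=30)
print(report.used_strict())                  # 40
print(report.used_including_reservations())  # 70
print(report.available_to_new_users())       # 30
```

Both "used" figures are defensible, so two implementations that publish different ones can each be correct while still giving a selection tool inconsistent information.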

SRM/SRB interoperation using gLite
Pretend SRB is a Classic SE
Classic SE still supported by gLite FTS
[Diagram: FTS, SRB, SRM, GridFTP, disk storage, LFC; SRM selects pool node…]

Achievements - FTS monitoring

Achievements – standards
SRM 2.2 is now an OGF standard
  – Collaboration between SRM developers
  – …and WLCG
  – New challenges ahead
GLUE
  – Contributed to GLUE SE schema
  – 1.3, also some for 2.0

What Keeps the Unreasonable (Wo)Man Awake at Night?
CUS – Campaign for Usable Storage
Fabric
Staff...!!
Coordination

What is Usable Storage
Users: we want usable storage
Deployment: storage is usable if it's being used
Not necessarily…
Identified (currently) 13 areas
  – Somewhat overlapping
  – But that is normal

What is Usable Storage
Robust
  – Doesn't fall over
      – Measure uptime (for some definition of uptime)
Good performance
Requests per second, concurrent users
  – Can be tested – DESY did this for dCache
      – Can be tested! (Dave Newbold for CASTOR, ScotGrid for DPM and dCache)
  – (Also tests the SRM itself)
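
The two measurements named on this slide, uptime "for some definition of uptime" and requests per second, can be sketched from a list of timed test requests. This is only one possible definition of each. The `Probe` record and both function names are assumptions made for this example, not part of any GridPP test suite.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Probe:
    timestamp: float  # seconds since the start of the test
    ok: bool          # whether the test request succeeded

def uptime_fraction(probes: List[Probe]) -> float:
    """Uptime defined as the fraction of probes that succeeded."""
    if not probes:
        raise ValueError("no probes recorded")
    return sum(p.ok for p in probes) / len(probes)

def requests_per_second(probes: List[Probe]) -> float:
    """Successful requests divided by the time span covered by the probes."""
    if not probes:
        raise ValueError("no probes recorded")
    times = [p.timestamp for p in probes]
    span = max(times) - min(times)
    if span <= 0:
        raise ValueError("probes must span a positive time interval")
    return sum(p.ok for p in probes) / span

probes = [Probe(0.0, True), Probe(1.0, True), Probe(2.0, False), Probe(4.0, True)]
print(uptime_fraction(probes))      # 0.75
print(requests_per_second(probes))  # 0.75
```

A different definition, for example weighting each probe by the time until the next one, would give a different uptime for the same data, which is the point of the slide's caveat.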

What is Usable Storage
Good Overall Data Performance
Tests the data movers and networks
  – Experiments are good at this
  – Also 3rd party transfers, and to tape
  – Optimisations
Ensures resource availability
  – Concurrent users (other experiments, same expt)
      – Ancient available/used metrics
  – Load balancing, dynamic alloc.

What is Usable Storage
Monitored. Accountable.
  – See when something goes wrong
      – Reliable accounting data
      – Minimise downtime
Maintainable
  – Ease upgrade, installation and configuration
      – Minimise downtime
      – Tested (prior to release)

What is Usable Storage
Standards compliant and interoperable
  – Provides SRM 2.2 / GLUE 1.3 / GridFTP
  – Extensive test suite available
Secure
  – Access control, secure implementations
Supported
  – Upstream: developers
Publishing metadata in current schema
Usable by applications (interfaces)

Challenges
Services
Capabilities
Scale, Performance
Economy, Sustainability
Middleware
State of the Art
Users

Users
Applications
Culture, History
Customer mgmt
Usability

Services
Trust
Availability
Accounting
Discovery

State of the Art
Web Services
Virtualisation
Media

Middleware
Stability
Applications
Maintenance
Support
Ease of install and config

Scale, Performance
Staging
Transfer rates
Size of files
Number of files
Volume

Sustainability, Economy
Scale
Trust
Dynamic Agreement
Cost Model

Capabilities
Content
Access
Curation
SECURITY

Conclusion
Lots of things achieved
Lots of stuff to do
  – Somehow always harder than expected
  – Doesn't asymptotically tend to zero
  – Plus there are regular peaks so it doesn't even converge
Storage is important! Should not be underestimated
Good community to go forward into GridPP3