Data Transfer Efficiency - leave no byte unchurned Jens Jensen Rutherford Appleton Laboratory GridPP26, U Sussex, March 2011.

Presentation transcript:

Background
GridPP’s data grid
–Distributed Storage Elements
–Data movers (FTS, PhEDEx et al)
–Catalogues (usu. replica)
e-Infrastructure (aka cyberinfrastructure)
(Presentation at ISGC)

The Data Grid
WLCG is primarily a data grid
–Computation can (in principle) be redone
Jobs go to where data is
–Moving a job is quicker than moving data
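As a rough illustration of why jobs are sent to the data, the sketch below compares the idealised time to ship a small job payload with the time to ship its input dataset over the same link. The sizes and link speed (10 MB job, 1 TB dataset, 1 Gbit/s) are assumptions chosen for illustration, not figures from the talk, and the model ignores latency, protocol overhead and contention.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Idealised transfer time: payload size in bits divided by link speed."""
    return size_bytes * 8 / link_bits_per_second


# Hypothetical sizes and link speed, for illustration only.
JOB_BYTES = 10 * 10**6      # 10 MB job payload (assumed)
DATASET_BYTES = 10**12      # 1 TB input dataset (assumed)
LINK_BPS = 10**9            # 1 Gbit/s WAN link (assumed)

job_time = transfer_seconds(JOB_BYTES, LINK_BPS)
data_time = transfer_seconds(DATASET_BYTES, LINK_BPS)

print(f"job: {job_time:.2f} s, dataset: {data_time:.0f} s")
print(f"moving the data takes about {data_time / job_time:.0f} times longer")
```

Under these assumptions the gap is several orders of magnitude, and real transfers would only add per-file overheads on top.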

Premature Optimisation is the Root of All Evil

Postmature non-optimisation is the root of some evil
The role of infrastructure code
–Scientist as a programmer
–“Bad” code moves up the stack?
–“Bad” code improves over time?
Doofers stay in prod’n

Efficiencaciousness Goals
Service
–Availability
–Performance
–Grows as needed
–Robust (no SPoF?)
People
–(Effective) support
–Training
–Expertise
–Availability of…

Approaches
Philosophy
–Get it done – WLCG
–Get it done right – EGI?
–Do It Perfectly The First Time…
Evolutionary (control system) vs revolutionary
–Proactive vs reactive

Efficiencaciousness Issues
Failures
–Sites – BDII, network
–Elements – storage
–Components – disk servers
Timeouts
DDoS

Efficiencaciousness Issues
Overall effort
–Funded, contributed, external
Availability of expertise
–Single Point of Knowledge
Decoherence
2nd Law of Thermodynamics
Learning from incidents

Efficiencaciousness Issues
Primary communication
–Sites
–Users: large VOs, small VOs, single users
–PMB
Secondary
–WLCG
–NGS

Efficiencaciousness Issues
Sites
–There Is Always A Bottleneck Somewhere
–Site dependent
–Usage dependent
Information
–Freshness
–Accuracy (“spped is substute fo accurcy”)

Efficiencaciousness Issues
Usage patterns
–Cf. Wahid’s talk yesterday
–WAN vs LAN (WN) traffic
Technology
–In the narrow sense (drives, controllers)
–And the wider sense: dist’d filesystems
Support: Upstream (EGI), Fabric

Efficiencaciousness Issues
Overheads
–Complexity of use of stack (see next)
–Infrastructure is complex
–But Complexity Has To Go Somewhere
Time-to-production
–Testing, troubleshooting, monitoring, tweaking, tuning

With apologies to the OSI stack

PROGRESS Particular Pain Point Principle

Progressing Forward
What is progress
How to measure progress

The Good News
We’ve come a long way
Don’t think there is a skills gap
–But some SPoKs

Graeme’s talk
“Get the best out of what we can afford to buy”
Proactive sites better
Standards are good

E[GM]I involvement
EMI data roadmap
–Support for dCache, DPM, StoRM
–Support for standards (NFS4, CDMI)
But then
–StoRM=INFN, dCache=DESY, DPM=CERN

The Cloud View
Supplement resources with on-demand
Agile
CDMI is superset of SRM
–But using ReST+JSON, not SOAP
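To give a feel for the ReST+JSON style mentioned above, the sketch below builds (but does not send) an HTTP PUT that would create a data object on a CDMI server. The host name, object path and version string are hypothetical placeholders. The content type, version header and body fields follow my reading of the SNIA CDMI specification and should be checked against the version a given server implements.

```python
import json
import urllib.request


def build_cdmi_put(base_url: str, path: str, text: str,
                   version: str = "1.0.2") -> urllib.request.Request:
    """Build (without sending) a PUT request that creates a CDMI data object."""
    # The request body is plain JSON, in contrast to a SOAP envelope.
    body = json.dumps({"mimetype": "text/plain", "value": text}).encode("utf-8")
    headers = {
        "Content-Type": "application/cdmi-object",
        "Accept": "application/cdmi-object",
        "X-CDMI-Specification-Version": version,  # assumed version string
    }
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Hypothetical endpoint and object name, for illustration only.
req = build_cdmi_put("https://cdmi.example.org", "/container/hello.txt", "hello grid")
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

The request is only constructed here; sending it would need a real endpoint and whatever authentication that service requires.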

(Open) Standards
Standards promote interoperation and stability
Interoperation
Multiple (independent) implementations
–Both Java and (C or C++)

The Case for Non-HEP Data
Benefit from non-HEP data
–Outreachy stuff
–Benefit to society (eg saving lives)
NGI interop (at compute)
Others…

SUMMARY

Efficiencaciousness Goals
Service
–Availability
–Performance
–Grows as needed
–Robust (no SPoF?)
People
–(Effective) support
–Training
–Expertise
–Availability of…