Use Cases

Summary
Define and understand slow transfers
– Identify weak links, narrow down the source
– Understand what perfSONAR measurements mean with respect to transfer-system concepts (throughput)
Uniform way to access and integrate existing network measurements
– Define topology in a common way (map to sites, map storage to sonars)
– Make it possible to create a cost matrix (from multiple sources)
– Integration with experiments' tools/systems
Coordinated response to network performance issues (ATLAS)
– Define procedure to assign ownership and responsibility for throughput issues
Baseline existing links (full mesh), help commission new ones
Details at: – I/edit
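
The cost-matrix item in the summary can be illustrated with a minimal sketch. This is not the WG's implementation: the record layout, the source and site names, and the choice of taking the median across sources are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def build_cost_matrix(measurements):
    """Merge (source, src_site, dst_site, throughput_mbps) records into a
    {(src_site, dst_site): median throughput in Mbit/s} mapping.

    The record layout and the median rule are illustrative assumptions.
    """
    samples = defaultdict(list)
    for _source, src_site, dst_site, throughput_mbps in measurements:
        samples[(src_site, dst_site)].append(throughput_mbps)
    return {link: median(values) for link, values in samples.items()}


# Hypothetical records from two measurement sources for made-up sites.
records = [
    ("perfsonar", "SITE_A", "SITE_B", 900.0),
    ("transfers", "SITE_A", "SITE_B", 700.0),
    ("perfsonar", "SITE_B", "SITE_A", 400.0),
]
cost_matrix = build_cost_matrix(records)
print(cost_matrix[("SITE_A", "SITE_B")])
```

Keeping the matrix directional, with (A, B) and (B, A) as separate entries, reflects that network paths are often asymmetric; a real service would also need to handle stale or missing measurements.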

TCP throughput following Mathis et al. (1997)
$$\text{throughput} \le \frac{MSS}{RTT} \cdot \frac{C}{\sqrt{p}}$$
where
MSS – maximum segment size
C – constant, a lump sum of several terms (TCP implementation, ACK strategy, loss mechanism)
RTT – round-trip time (latency)
p – packet loss
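
As a worked example, the Mathis et al. bound, throughput ≤ (MSS / RTT) · (C / √p), can be evaluated directly. The sketch below assumes C = 1 by default, and the input numbers are illustrative rather than taken from any measurement; the function name is hypothetical.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss, c=1.0):
    """Upper bound on TCP throughput in bits per second (Mathis et al., 1997).

    c defaults to 1.0 as an assumption; its real value depends on the TCP
    implementation, ACK strategy and loss mechanism.
    """
    if rtt_s <= 0 or not 0 < loss <= 1:
        raise ValueError("rtt_s must be > 0 and loss must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss))


# Illustrative numbers only: 1460-byte MSS, 100 ms RTT, 0.01% packet loss.
bound = mathis_throughput_bps(mss_bytes=1460, rtt_s=0.100, loss=1e-4)
print(f"{bound / 1e6:.2f} Mbit/s")
```

With these inputs the bound comes out at roughly 11.7 Mbit/s. Because the bound scales as 1/RTT and 1/√p, halving the round-trip time doubles it, while a hundredfold increase in loss cuts it by a factor of ten.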

TCP throughput and perfSONAR
– iperf3/iperf
– tracepath/PMTU, bwping, traceroute
– owping/owdelay: one-way latency, jitter
– owping/owdelay: one-way packet loss
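
One way to relate latency and loss measurements to a throughput target is to rearrange the Mathis relation, throughput ≤ (MSS / RTT) · (C / √p), for p. The sketch below is an illustration under stated assumptions: C = 1, RTT approximated as the sum of the two one-way delays, and made-up input values. The function names are hypothetical and not part of any perfSONAR tool.

```python
def rtt_from_one_way(forward_delay_s, reverse_delay_s):
    """Approximate RTT as the sum of both one-way delays (an assumption)."""
    return forward_delay_s + reverse_delay_s


def max_tolerable_loss(target_bps, mss_bytes, rtt_s, c=1.0):
    """Largest loss probability at which the Mathis bound still reaches
    target_bps: p <= (C * MSS / (RTT * target))**2, capped at 1.0."""
    if target_bps <= 0 or rtt_s <= 0:
        raise ValueError("target_bps and rtt_s must be > 0")
    p = (c * mss_bytes * 8 / (rtt_s * target_bps)) ** 2
    return min(p, 1.0)


# Made-up one-way delays of 48 ms and 52 ms, 1460-byte MSS, 1 Gbit/s target.
rtt = rtt_from_one_way(0.048, 0.052)
budget = max_tolerable_loss(target_bps=1e9, mss_bytes=1460, rtt_s=rtt)
print(f"RTT ~ {rtt * 1e3:.0f} ms, loss budget ~ {budget:.2e}")
```

For these inputs the loss budget is on the order of 1e-8, which suggests why even small one-way packet-loss rates matter on long-latency paths.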

WG structure
Define and understand slow transfers
– perfSONAR commissioning: support unit, follow up with sites, debugging issues (submitting bug reports to ESnet)
– perfSONAR central configuration and mesh management
– Training
– FTS performance study (Saul)
Uniform way to access and integrate existing network measurements
– Define topology in a common way: proximity service
– Common API: OSG Datastore, publishing results to MQ
– Integration: LHCb perfSONAR to DIRAC pilot project
Coordinated response to network performance issues (ATLAS)
– WLCG Network Throughput SU and underlying procedure
Baseline existing links (full mesh), help commission new ones
– Establishing WLCG-wide meshes
– Running core networking meshes (LHCOPN/LHCONE) to help debug/test new sites
Status report for each topic at every meeting, minutes sent to all WG members

Status
Define and understand slow transfers
– perfSONAR commissioning (support unit, follow up with sites, debugging): done
– perfSONAR central configuration and mesh management: done
– Training: official one available; one session as part of the WG in March, another one planned
– FTS performance study (Saul): ongoing, more details today
Uniform way to access and integrate existing network measurements
– Define topology in a common way (proximity service): prototype available
– Common API (OSG Datastore, publishing results to MQ): OSG Datastore status today; MQ publisher works fine in testing
– Integration: LHCb perfSONAR to DIRAC pilot project started; ATLAS, CMS, ALICE TBD
Coordinated response to network performance issues (ATLAS)
– WLCG Network Throughput SU and underlying procedure: done; some cases already investigated (RAL, SARA), but we need to get more cases from experiments
Baseline existing links (full mesh), help commission new ones
– Establishing WLCG-wide meshes: started
– Running core networking meshes (LHCOPN/LHCONE) to help debug/test new sites: done

Next steps
WG review in Sept/Oct
Define and understand slow transfers
– perfSONAR commissioning: validate 3.5RC
– perfSONAR central configuration and mesh management: done
– Training: common session with ESnet (as pre-GDB)
– FTS performance study (Saul): TBD today
Uniform way to access and integrate existing network measurements
– Define topology in a common way (proximity service): evolve to production service
– Common API: OSG Datastore TBD today; MQ publishing: discuss SLA with OSG and move it to ITB and then production service
– Integration: LHCb perfSONAR to DIRAC pilot project; TBD for ATLAS, CMS, ALICE
Coordinated response to network performance issues (ATLAS)
– WLCG Network Throughput SU and underlying procedure: need to get more cases from experiments; review contacts in the SU
Baseline existing links (full mesh), help commission new ones
– Establishing WLCG-wide meshes
– Running core networking meshes (LHCOPN/LHCONE) to help debug/test new sites: done