Connect. Communicate. Collaborate
PERT: Performance Enhancement & Response Team
  Toby Rodwell, DANTE
  Joint Techs, New Mexico, USA
  7 February 2006
Agenda
  Background
  PERT Organisation
  PERT Systems
  Experience to date
  The future for the PERT
Motivation
  Historically, long-distance circuits (the “wide area”) have been the bottleneck in a network
  In recent years, the capacity of long-distance circuits has increased significantly
  End-to-end performance bottlenecks may now occur at any point in a system: end-system (application, OS, hardware), LAN or WAN
  As a result, it is becoming increasingly difficult for a non-expert end-user to diagnose network performance issues
Origins of the PERT
  Conception of the PERT … Jan 01 Internet2 Meeting
    –A support structure to investigate and resolve problems in the performance of applications over computer networks
    –Comparable to the CERT structure
  Realisation of the PERT … Dec 2002 TERENA meeting
    –6 European NRENs (GARR, SWITCH, CESNET, HEAnet and UKERNA) and DANTE committed to a practical trial of a basic PERT
The PERT Today
  Permanently staffed (8x5)
    –CARNET, CESNET, FCCN, GARR, HUNGARNET, PSNC, RENATER, SWITCH
  PERT Ticket System (PTS)
  PERT Knowledgebase (PERT-KB)
  PERT documentation
    –PERT Troubleshooting Guide
    –Performance User Guide and Best Practice Guide
PERT Organisation
  PERT Managers
    –Toby Rodwell (DANTE), Simon Leinen (SWITCH)
  Case Managers
    –Duty Case Managers (weekly changing)
    –Special Case Managers
  Subject Matter Experts
    –Unfunded volunteers
    –From NRENs, academia and industry
    –Work on an ‘as and when’ basis
PERT Systems
  PERT Knowledgebase (PERT-KB)
    –Updated on an ongoing basis
  PERT Ticket System (PTS)
    –Currently Version 1
    –Design of Version 2 about to start
Experiences to date
  Low number of PERT cases
    –Plan to open up access to the PERT
  Most of these cases concern low throughput over large BDP (Bandwidth Delay Product) paths
    –Often US to Europe
    –Even packet loss of 0.002% has a significant impact
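To illustrate why such a small loss rate matters on a large-BDP path, the following is a minimal sketch, not a PERT tool. It computes the bandwidth-delay product and applies the well-known Mathis et al. steady-state approximation for standard TCP, rate ≤ (MSS / RTT) × (C / √p) with C = √(3/2). The approximation is not named in the slides, and the 100 ms round-trip time, 1 Gbit/s path rate, and segment sizes are assumed values chosen to resemble a US to Europe path.

```python
import math


def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return rate_bps * rtt_s / 8


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate upper bound on standard TCP throughput under random loss.

    Uses the Mathis et al. approximation with C = sqrt(3/2).
    """
    c = math.sqrt(3 / 2)
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


if __name__ == "__main__":
    rtt = 0.100            # assumed 100 ms round-trip time
    loss = 0.002 / 100     # 0.002% packet loss, as quoted on the slide
    path_rate = 1e9        # assumed 1 Gbit/s path

    print(f"BDP: {bdp_bytes(path_rate, rtt) / 1e6:.1f} MB")
    # Assumed MSS values: 1460 bytes (1500-byte packets) and
    # 8960 bytes (9000-byte packets), i.e. packet size minus 40 bytes of headers.
    for mss in (1460, 8960):
        rate = mathis_throughput_bps(mss, rtt, loss)
        print(f"MSS {mss}: about {rate / 1e6:.0f} Mbit/s")
```

Under these assumptions the estimate for 1500-byte packets is a few tens of Mbit/s, far below the path rate, and it scales linearly with the segment size.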
General Findings
  For long-distance, high-data-rate transfers
    –Default TCP settings are insufficient (buffers too small)
    –Jumbo packets are important (GEANT, and now GEANT2, promote the Abilene recommendation of 9000-byte IP packets)
  Other common performance problem
    –Mismatched Ethernet duplex settings
    –(Not seen so far by the GN2 PERT, but a known problem)
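As a hedged illustration of the "buffers too small" finding, the sketch below asks the operating system for larger per-socket TCP buffers and reports what was actually granted. The helper name `request_tcp_buffers` and the 12.5 MB request (the BDP of an assumed 1 Gbit/s, 100 ms path) are illustrative and not taken from the slides. The kernel may cap the request at a system-wide limit, and Linux reports a value that includes its own bookkeeping overhead, so the granted size can differ from the requested one.

```python
import socket


def request_tcp_buffers(sock: socket.socket, size_bytes: int) -> tuple[int, int]:
    """Request send/receive buffers of size_bytes; return the sizes the OS granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
    granted_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_snd, granted_rcv


if __name__ == "__main__":
    wanted = 12_500_000  # assumed target: BDP of a 1 Gbit/s, 100 ms path
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Set the buffers before connect() so the larger window can be negotiated.
        snd, rcv = request_tcp_buffers(s, wanted)
        print(f"requested {wanted} bytes, granted send={snd} recv={rcv}")
        if min(snd, rcv) < wanted:
            print("OS limit is below the request; system-wide tuning may be needed")
```

If the granted size is well below the request, the system-wide maximum is the limiting factor and per-application settings alone will not help.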
The Future of the PERT
  Involve more volunteer Subject Matter Experts
  Gain more experience
    –Investigating more cases
    –Using new tools (perfSONAR)
  Disseminate the lessons learned
  Promote the PERT as a NOC function
  Disestablish the centralised PERT
Any Questions?
  Toby Rodwell, DANTE
Contacting the PERT
  To date, only authorised Primary Customers are able to submit a request for help
    –Primary Customers are NRENs and some pan-European projects
  The PERT has recommended that access be widened to allow anyone to submit a case (the Duty Case Manager (DCM) would filter and prioritise requests)
    –Under consideration
Case Managers
  Special Case Managers (SCMs)
    –For complex or drawn-out cases, an SCM takes over and manages the case through to its closure
    –Work on a ‘best efforts’ basis
  Duty Case Managers (DCMs)
    –Weekly changing
    –Receive requests, assess eligibility, open new cases, set priorities and start investigations
    –Responsible for progressing all cases that have no SCM, or whose SCM is unavailable