Grid High-Performance Networking Research Group
Jon Crowcroft, University of Cambridge, UK
Nagi Rao, ORNL, USA
Volker Sander, Forschungszentrum Jülich, Germany

Agenda for Meeting
- Brief charter review
- Brainstorming about the Top-N things documents
  - Existing "Top ten things network engineers wish grid programmers knew" document
  - Open discussion about things grid programmers wish network engineers knew
- Discussion about future directions
  - What happens after the "top ten" documents?
  - Presentations by
    - David Martin
    - Michael Welzl
  - Reevaluate level of interest within the Grid community
  - Identify people willing to actively work within the group
  - Identify and approve new milestones
  - Discuss relation to other groups and organizations

GHPN History
- Three Birds-of-a-Feather sessions
- GGF-2 (July 2001)
  - 59 people attended
  - Broad discussion
  - Enough interest to pursue a Working/Research Group
- GGF-3 (October 2001)
  - General agreement that the group should be a GGF Research Group
  - Network measurement would be the responsibility of the NM-WG
  - Agreement on the importance of a liaison to external groups
- GGF-4 (February 2002)
  - QoS BOF with several presentations
  - Preparation for RG charter
- Officially established as a Research Group

GHPN Research Group
- GHPN being a Research Group means
  - Indefinite lifespan
    - GHPN is supposed to be a long-term approach
  - Recommendation documents are not in the scope of the group, but GHPN intends to produce Informational, Experimental, or Community Practice documents
  - Might propose the creation of focused working groups (reflected by the Charter)
    - QoS
    - Resource Management
    - …

GHPN Charter
The Grid High-Performance Networking Research Group focuses on the relationship between network research and Grid application and infrastructure development. The objective of GHPNRG is to bridge the gap between the networking and grid research communities. It accomplishes its goal by serving as a forum for information exchange on advances and requirements in both fields, as well as by providing a focal point for liaison activities between the GGF and the various networking standards bodies.
Specific topics of interest include, but are not limited to:
- End-to-end performance
- High-performance transport protocols
- Emerging network technologies
- The interface between Grid applications and network services
- Deployment of new technologies on the Internet and overlay networks

GHPN Charter
The GHPNRG provides a forum for such topics until sufficient maturity and interest to both communities is reached that naturally results in the formation of a separate WG or RG to pursue them further. The discussions of GHPN are carried out both during the meetings and on the GHPN mailing list.
Two specific goals of the GHPNRG are identifying:
- grid application requirements and implementations that are not supported or understood by the networking community, and
- advanced networking features that are not being utilized by grid applications.

Top Ten Things Network Engineers wish Grid Programmers knew
- Authored by Jon Crowcroft
- Presented during GGF-5
- Intention: initiate a discussion
  - This document gives a list of ongoing efforts
  - It is supposed to encourage people to join them
  - Not a GWD document yet, but supposed to be input for a GWD-I document
- Background: based on the IETF view
  - The document is a list of topics and references
  - It needs more explanatory text
  - A conclusion for Grid application developers will be added

Top Ten Things Network Engineers wish Grid Programmers knew
Topic 1: Congestion Control vs. QoS
- It is about sharing capacity (fairness)
- Without QoS, congestion control is not optional!
  - If Grid applications break these features, some ISPs might disconnect them
- Things might relax soon
  - AIMD is not the only route to fairness; new mechanisms are coming up (e.g. ECN) that do not create the saw-tooth
- Note: reliable multicast for data replication follows the idea of TCP
Topic 2: Routing
- Fast forwarding is there
  - Clusters can do 10 Gbps
  - Firewalls can do better than they currently do
- Faster convergence (possible)
- Policies are hard!
  - Influencing which class of packets gets routed where
  - Getting the global view is tough
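The saw-tooth mentioned under Topic 1 can be illustrated with a toy simulation of additive increase, multiplicative decrease (AIMD). This is a minimal sketch, not a TCP implementation: the function name `aimd_trace`, the fixed bottleneck capacity, and the increase and decrease parameters are all assumptions chosen for illustration.

```python
def aimd_trace(capacity, rounds, increase=1.0, decrease=0.5, start=1.0):
    """Return the congestion window after each round trip.

    Toy model (an assumption, not real TCP): the window grows by `increase`
    per round trip and is multiplied by `decrease` whenever it has exceeded
    the fixed `capacity` of the bottleneck.
    """
    window = start
    trace = []
    for _ in range(rounds):
        if window > capacity:
            window *= decrease  # multiplicative decrease on congestion
        else:
            window += increase  # additive increase while the path has room
        trace.append(window)
    return trace


if __name__ == "__main__":
    # Print a crude text plot; the repeated ramps and drops are the saw-tooth.
    for rtt, w in enumerate(aimd_trace(capacity=20, rounds=40), start=1):
        print(f"{rtt:2d} {'#' * int(w)}")
```

The window repeatedly climbs past the capacity and is halved, which is why a single AIMD flow cannot hold a steady rate close to the bottleneck capacity.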

Top Ten Things Network Engineers wish Grid Programmers knew
Topic 3: Packet sizes
- Packet sizes do have an impact on the experienced service
- The tendency was to push up the MTU (Jumbo Frames)
  - The Grid is a global environment; the MTU is that of the weakest link
  - Path MTU discovery automates this
- Multicast MSS is a real problem
- Sub-IP packet size (cells) might be a consideration
Topic 4: Overlays
- The routing overlay du jour is Resilient Overlay Networks (RON) from MIT
  - Improves the robustness and availability of Internet paths between hosts
  - RON nodes monitor the functioning and quality of the Internet paths among themselves, and use this information to decide whether to route packets directly over the Internet or by way of other RON nodes, optimizing application-specific routing metrics
- Auto-magic way to build VPNs
- P2P systems are slightly different
  - Problems with locality and metrics
  - Not the tool for low-latency file access
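The RON decision described under Topic 4 (send directly, or by way of another overlay node) can be sketched as a choice among measured paths. This is a simplified illustration and not RON's actual implementation: the function `best_path`, the restriction to at most one relay, the use of latency as the only metric, and the probe values are all assumptions.

```python
def best_path(latency, src, dst):
    """Pick the lowest-latency route from src to dst using at most one relay.

    `latency[a][b]` holds a measured latency from node a to node b in
    milliseconds (hypothetical probe results). Returns (path, total_latency).
    """
    best = ([src, dst], latency[src][dst])  # start with the direct path
    for relay in latency:
        if relay in (src, dst):
            continue
        total = latency[src][relay] + latency[relay][dst]
        if total < best[1]:
            best = ([src, relay, dst], total)
    return best


if __name__ == "__main__":
    # Hypothetical measurements: the direct A-C path is currently poor.
    probes = {
        "A": {"B": 40, "C": 120},
        "B": {"A": 40, "C": 30},
        "C": {"A": 120, "B": 30},
    }
    print(best_path(probes, "A", "C"))
```

With these assumed measurements the overlay would relay A to C traffic through B, while A to B traffic stays on the direct path.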

Top Ten Things Network Engineers wish Grid Programmers knew
Topic 5: QoS
- Would be a nice thing, even with 64 Lambdas (10GigE at cluster sites)
- QoS is not just policing and scheduling, it is also AAA
- QBSS is a good idea (fairness)
Topic 6: Multicast
- Tier1 routing works, multi-domain is getting better
- There are candidates for reliable multicast
Topic 7: Operating Systems
- There are some good tuning guides
- New performance features (self-selecting NICs, zero-copy stack)
Topic 8: Layer 2 Considerations
- Layer 2 reliability and flow control make life hard for IP and TCP engineers
- Non-Broadcast Multiple Access (NBMA) networks are hard
- Signaling is even harder

Top Ten Things Network Engineers wish Grid Programmers knew
Topic 9: Light vs. Heavyweight Protocols
- One might think of writing a new application-level protocol based on raw IP
- Packet templates can reduce code complexity
- Put transport code into the kernel
Topic 10: Macroscopic Traffic and System Considerations
- Self-similarity does not really matter
- Traffic phase effects do matter
- Flash crowds
Discussion indicated two things to add
- Add topic 0: Firewalls
- OS: enrichment of protocol APIs (better fit to protocol parameters)

Questions/Comments during the Meeting at GGF-5
- Is it fair that application developers cannot treat the network as a black box?
  - The network is (at its current stage) not transparent!
  - QoS could be a solution to this, i.e. the network becomes a manageable Grid resource
- Why is routing important to Grid developers?
  - It is important if you want HA
  - People are working on this
- When you ask Grid developers about their needs, they often do not know them…
  - The intention of this RG is to address this issue
  - We want to minimize the required knowledge about network capabilities
- What should applications do when something goes wrong?
  - Better feedback mechanisms are needed (difference between ISPs and NRNs)
- Latency vs. bandwidth
- Bottlenecks

Top Ten Things Grid Programmers wish Network Engineers knew
Topic 1: There are end-systems which source the traffic
- Protocol specification vs. API specification
- Grid programmers care about APIs with well-defined semantics
Topic 2: Automated socket-buffer tuning is currently not convenient
- In current implementations you are bound to a factor of two
- Sockets might be used internally, so no interface is exposed
  - MPI
Topic 3: The Grid is about services!!!!
- We already deal with SLAs
- What about network services?
  - Service differentiation
  - Policies
  - Integration of control plane management
- Grid applications need Premium and Bandwidth on Demand services
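To make the socket-buffer concern in Topic 2 concrete, the sketch below shows the manual tuning a programmer has to do today: size the buffers to the bandwidth-delay product and ask the operating system for them. It is an illustration under stated assumptions, not a recommended procedure: the helper names `bdp_bytes` and `tuned_socket` and the example link figures are hypothetical, and the operating system may adjust, cap, or reject the requested size, which is why the granted value is read back.

```python
import socket


def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes for an integer bandwidth and RTT."""
    return bandwidth_bits_per_s * rtt_ms // (8 * 1000)


def tuned_socket(bandwidth_bits_per_s, rtt_ms):
    """Create a TCP socket and request buffers of one bandwidth-delay product.

    Returns (socket, requested_bytes, granted_receive_bytes). The granted
    size is whatever the operating system reports after the request.
    """
    wanted = bdp_bytes(bandwidth_bits_per_s, rtt_ms)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, wanted)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, wanted)
    except OSError:
        pass  # some systems refuse oversized requests; keep their defaults
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, wanted, granted


if __name__ == "__main__":
    # Assumed path: 1 Gbit/s with a 100 ms round-trip time.
    sock, wanted, granted = tuned_socket(1_000_000_000, 100)
    print(f"requested {wanted} bytes, system reports {granted} bytes")
    sock.close()
```

This only works when the application owns the socket; when a library such as MPI opens sockets internally, as the slide notes, there is no place to make these calls.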

Top Ten Things Grid Programmers wish Network Engineers knew
Topic 4: The myth of overprovisioned networks
- Grid applications consume bandwidth heavily
- QBSS is intended to consume whatever is available
- Flexible, adaptive Bandwidth on Demand is needed
- Goodput vs. capacity (i.e. bottlenecks)
  - Security
  - Firewalls
Topic 5: Firewalls are hard!
- Which ports should I use?
- A single point of entry might become a bottleneck
- Port usage guide
- Grid-aware firewalls
  - GridFTP
- Firewalls do impact the level of service

Top Ten Things Grid Programmers wish Network Engineers knew
Topic 6: Better instrumentation needed
- Event services
  - Events are common practice in Grid environments
- Adaptive applications can help to optimize
  - their level of service
  - the economic use of the underlying resource
- Localization of bottlenecks and points of error
- Pacing
  - What is the best way to effectively use a guaranteed bandwidth?
Topic 7: VOs vs. VPNs
- If Grids are built for VOs, what about overlay structures that support this overlay?
- Does this affect the security model?
- VPNs with a particular level of service?
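One possible reading of the pacing question in Topic 6 is that a sender should space its packets so that they leave at exactly the guaranteed rate instead of in bursts. The sketch below computes such a departure schedule. It is an assumption-laden illustration: the function name `pacing_schedule` and the example rate and packet sizes are hypothetical, and real pacing also has to cope with timer granularity and operating-system scheduling.

```python
def pacing_schedule(packet_sizes, rate_bits_per_s):
    """Return the departure time in seconds for each packet.

    Each packet leaves as soon as the previous one has been fully sent at
    `rate_bits_per_s`, so the flow never exceeds the guaranteed rate.
    `packet_sizes` are in bytes.
    """
    times = []
    t = 0.0
    for size in packet_sizes:
        times.append(t)
        t += size * 8 / rate_bits_per_s  # time this packet occupies the link
    return times


if __name__ == "__main__":
    # Assumed guarantee of 12 Mbit/s and three 1500-byte packets.
    for i, t in enumerate(pacing_schedule([1500, 1500, 1500], 12_000_000)):
        print(f"packet {i} departs at {t * 1000:.1f} ms")
```

Spreading packets this way keeps the flow within its reservation at every instant, which matters if the network polices the guaranteed bandwidth over short intervals.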

Top Ten Things Grid Programmers wish Network Engineers knew
Topic 8: Specification vs. deployment
- Things have to be available to be useful
- The Grid is heterogeneous
- End-to-end services are of interest
Topic 9: Advance reservation is important
- Network services should be reservable in advance
- Is the network a partitioned resource?
- Liaison to the NSIS IETF WG?
  - The Next Steps in Signaling Working Group is responsible for standardizing an IP signaling protocol with QoS signaling as the use case for NSIS
- Intelligent networks try to handle everything on the fly
Topic 10: Grid programmers are concerned about their application
- Application developers do not want to be networking experts
- Application developers do not want to have to know about routing
Volunteers for writing this document
- George Brett, Jim Ferguson, Thilo Kielmann, and Volker Sander
- Anyone else?
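The advance reservation idea in Topic 9 implies an admission check: a new request for bandwidth over a future time window is accepted only if the link capacity is never exceeded during that window. The following is a minimal sketch of such a check for a single link. The function name `can_reserve`, the tuple layout, and the figures are assumptions for illustration and do not describe NSIS or any existing reservation service.

```python
def can_reserve(existing, request, capacity):
    """Decide whether `request` fits alongside the `existing` reservations.

    Reservations are (start, end, bandwidth) tuples covering the half-open
    interval [start, end). The load on the link only rises at the start of
    a reservation, so it is enough to check the start of the request and
    every existing start time that falls inside the requested window.
    """
    start, end, bandwidth = request
    checkpoints = {start} | {s for s, _, _ in existing if start <= s < end}
    for t in checkpoints:
        load = sum(b for s, e, b in existing if s <= t < e)
        if load + bandwidth > capacity:
            return False
    return True


if __name__ == "__main__":
    # Assumed link of 100 units with two reservations already booked.
    booked = [(0, 10, 60), (5, 15, 30)]
    print(can_reserve(booked, (8, 12, 20), capacity=100))   # overlaps the peak
    print(can_reserve(booked, (10, 20, 50), capacity=100))  # after the first ends
```

A real service would need this decision along every link of an end-to-end path and across administrative domains, which is where a signaling protocol comes in.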

Getting Involved
- GHPN-RG is part of the Data area
- Mailing list is “subscribe ghpn-wg” to
- Webpage is
- Join mailing list and participate
- Goal is to improve communication between cultures
- Need volunteers to help!

Future Directions
- Focused working tasks derived from the top-ten docs?
  - API, Net100
  - Port usage document, firewall issues
  - QoS in the context of Grid RM
- Presentations by
  - David Martin
  - Michael Welzl
- Reevaluate level of interest within the Grid community
- Identify people willing to actively work within the group
- Identify and approve new milestones
- Discuss relation to other groups and organizations