National Center for Atmospheric Research
Pittsburgh Supercomputing Center
National Center for Supercomputing Applications

Web100 and Logistical Networking
Jim Ferguson, Sr. Technical Program Manager
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
16th APAN Meetings – Busan, South Korea – August 28, 2003

Outline
What is Web100?
Web100 Diagnostic Tools
Web100 Distribution
How this relates to Logistical Networking
Web100 Userland Library

The Web100 Project
When there is a problem, just ask TCP
TCP has an ideal vantage point
–Can identify the bottleneck subsystem
–Already measures the network
–Can measure the application
–Can adjust itself

Web100 Components
Kernel Instrument Set (KIS)
Diagnostic Tools
Autotuning
Widely distributed Open Source
TCP ESTATS MIB
Promote vendor adoption

Diagnostic Tools
User-mode tools to prove core functionality
Template for future tool developers
KIS validation
Portable library to hide OS details

TCP Autotuning
Supersedes today’s controls for experts
–Eliminates the primary cause of the “wizard gap”.
–New TCP buffer management model.
–TCP just gets it right without controls.
–Paper submitted for publication (M. Mathis, J. Heffner).

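To make the “wizard gap” concrete: the buffer a single TCP connection needs in order to fill a path is roughly the bandwidth-delay product, and that is the value experts tune by hand today. The sketch below is not the Web100 autotuning algorithm. It only illustrates the arithmetic; the function names and the sample link figures are assumptions chosen for illustration.

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_seconds: float) -> int:
    """Bandwidth-delay product: bytes that must be in flight to keep the path full."""
    return round(bandwidth_bits_per_s * rtt_seconds / 8)


def max_throughput_bits_per_s(buffer_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when the window is capped by the socket buffer."""
    return buffer_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    # Illustrative path: 1 Gbit/s bottleneck, 100 ms round-trip time.
    needed = bdp_bytes(1_000_000_000, 0.100)
    print(f"buffer needed to fill the path: {needed} bytes")

    # The same path with a 64 KB buffer (a window with no window scaling).
    capped = max_throughput_bits_per_s(65_535, 0.100)
    print(f"throughput ceiling with a 64 KB buffer: {capped / 1e6:.1f} Mbit/s")
```

An untuned 64 KB buffer limits this path to a few Mbit/s regardless of link speed, which is the gap that autotuning is meant to close without manual controls.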
Software Distribution
Open source
–Linux kernel patches
–User-mode tools
–Contributed software
Active user/developer support by the Web100 team
Download from the Web100 project website!

IETF Standards
TCP extended statistics MIB
Adds detailed per-connection statistics
Standard TCP instruments and controls
Most recent draft submitted early March
–tsvwg-tcp-mib-extension-03.txt
–TSV WG work item
–Key vendors are already participating

Kernel Instrument Set (KIS)
TCP instruments prototyped in Linux 2.4.
Simple API via /proc.
Instrument groups:
–Options and State
–IP Traffic and Throughput
–Triage
–Congestion Events
–Network Path properties
–TCP API Usage
–TCP Parameters
–WAD Parameters

Why Web100 on Logistical Networks?
Autotuned, fast, safe TCP for data transfers.
Works underneath the ‘data management’ layer of logistical networking schemes.
Allows quick diagnosis of TCP anomalies via access to TCP statistics.

Userland Library
Provides useful abstractions and common inter-application functionality
Base abstraction is the agent (from SNMP terminology).
Other abstractions include the group, variable, and connection.
Includes taking snapshots and generating log files

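The relationship between these abstractions can be sketched as a small data model. This is a hypothetical Python illustration, not the Userland library's actual interface: every class, method, and field name here is an assumption, the counter names are placeholders, and the values are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Variable:
    name: str  # one instrument, e.g. a byte or packet counter


@dataclass
class Group:
    name: str  # a set of variables that are read together
    variables: List[Variable] = field(default_factory=list)


@dataclass
class Snapshot:
    cid: int  # connection the values were read from
    group: str
    values: Dict[str, int]


@dataclass
class Connection:
    cid: int
    counters: Dict[str, int]  # stand-in for live kernel state

    def snap(self, group: Group) -> Snapshot:
        """Copy the current value of every variable in the group."""
        values = {v.name: self.counters[v.name] for v in group.variables}
        return Snapshot(self.cid, group.name, values)


@dataclass
class Agent:
    """Entry point that owns the groups and the connections it can see."""
    groups: Dict[str, Group]
    connections: Dict[int, Connection]


def delta(old: Snapshot, new: Snapshot) -> Dict[str, int]:
    """Per-variable change between two snapshots of the same connection."""
    return {name: new.values[name] - old.values[name] for name in new.values}


if __name__ == "__main__":
    read = Group("read", [Variable("DataBytesOut"), Variable("PktsRetrans")])
    conn = Connection(7, {"DataBytesOut": 1000, "PktsRetrans": 2})
    agent = Agent({"read": read}, {7: conn})

    before = agent.connections[7].snap(agent.groups["read"])
    conn.counters["DataBytesOut"] += 4500  # pretend traffic was sent
    after = agent.connections[7].snap(agent.groups["read"])
    print(delta(before, after))  # one log record's worth of change
```

Taking snapshots of a whole group at once keeps the values mutually consistent, and logging the difference between successive snapshots gives a per-interval record for each connection.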
Userland Library Cont’d
Will handle different operating systems’ methods for exposing Web100 variables (Linux’s /proc vs. BSD’s sysctl(3))
The Userland 2.0 library’s programming interface will change, but existing applications will not break, because versions 1.3 and 2.0 can coexist on the same computer

Userland GUI
Access, display, and control (where applicable) values of Web100 variables
List connections and related process info
Triage pie chart shows the source of congestion: Sender, Receiver, Path
Uses GTK2/GTK1 (common to all standard Linux distributions)

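The slices of a triage chart like this one can be derived from three per-connection time counters, one for each limiting party. The sketch below assumes such counters are available as plain millisecond totals. The function name, argument names, and sample values are hypothetical, and the GUI's real computation is not described in these slides.

```python
from typing import Dict


def triage_shares(sender_ms: int, receiver_ms: int, path_ms: int) -> Dict[str, float]:
    """Fraction of the observed time the connection was limited by each party."""
    total = sender_ms + receiver_ms + path_ms
    if total == 0:
        # No time accumulated yet: report empty slices instead of dividing by zero.
        return {"Sender": 0.0, "Receiver": 0.0, "Path": 0.0}
    return {
        "Sender": sender_ms / total,
        "Receiver": receiver_ms / total,
        "Path": path_ms / total,
    }


if __name__ == "__main__":
    # Invented sample: the connection spent most of its time limited by the path.
    for party, share in triage_shares(250, 250, 500).items():
        print(f"{party:<8} {share:.0%}")
```

A large Receiver share points at the receiving host's window, a large Sender share at the sending application or host, and a large Path share at congestion in the network.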
Userland GUI Cont’d
Display:
–continually list values of all variables
–graph those of interest
–triage analysis per connection
Control:
–toggle auto-tuning per connection
–set tunable variables: LimCwnd, LimRwin

Thank You!
Please contact us with any questions or problems with the Userland.
Feature and improvement suggestions are welcome!