NetLogger Summarizer and the nlioperf tool
Brian L. Tierney, Dan Gunter
BLTierney@lbl.gov, DKGunter@lbl.gov
Lawrence Berkeley National Laboratory, http://dsd.lbl.gov/NetLogger
Center for Enabling Distributed Petascale Science, http://www.cedps-scidac.net
NetLogger Toolkit
We have developed the NetLogger Toolkit (short for Networked Application Logger), which includes:
- tools to make it easy for distributed applications to log interesting events at every critical point
- NetLogger client library (C, Java, Perl, Python)
- tools for host and network monitoring
- event visualization tools that allow one to correlate application events with host/network events
- NetLogger event archive and retrieval tools (new)
NetLogger combines network, host, and application-level monitoring to provide a complete view of the entire system.
Open source, available at http://dsd.lbl.gov/NetLogger/
NetLogger Analysis: Key Concepts
The NetLogger visualization methodology is based on time-correlated and object-correlated events.
To associate a group of events into a “lifeline”, you must assign an “Event ID” to each NetLogger event.
Sample Event IDs: file name, block ID, frame ID, etc.
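The lifeline idea can be illustrated with a minimal sketch: group events by their Event ID, then order each group by timestamp. This is not the NetLogger API; the dictionary field names `ts`, `event`, and `id` and the function `build_lifelines` are hypothetical, chosen only for this example.

```python
from collections import defaultdict

def build_lifelines(events):
    """Group events by event ID and order each group by timestamp."""
    lifelines = defaultdict(list)
    for ev in events:
        lifelines[ev["id"]].append(ev)
    for evs in lifelines.values():
        evs.sort(key=lambda ev: ev["ts"])
    return dict(lifelines)

# Hypothetical events; here the Event ID is a block ID.
events = [
    {"ts": 0.30, "event": "write.start", "id": "block-1"},
    {"ts": 0.10, "event": "read.start", "id": "block-1"},
    {"ts": 0.20, "event": "read.end", "id": "block-1"},
    {"ts": 0.25, "event": "read.start", "id": "block-2"},
]

lifelines = build_lifelines(events)
for event_id, evs in sorted(lifelines.items()):
    print(event_id, [ev["event"] for ev in evs])
```

Each resulting list is one lifeline: the time-ordered path of a single object (file, block, frame) through the system.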
Time-based summarization
Imagine you want detailed instrumentation of a simple loop that reads data, processes it, then writes out some result.
    DO for i=1, N
        log("read.start", ...)
        read_data()
        log("read.end", ...)
        process_data()
        log("write.start", ...)
        write_data()
        log("write.end", ...)
    DONE
Time-based summarization
Report only a periodic summary (mean, standard deviation) of the times and values between “start” and “end” events of a given type and ID.
[Figure: two panels labeled “Go from here..” and “.. to here”]
This gives orders of magnitude data reduction with, for many purposes, no loss in explanatory power.
Since the original instrumentation is still there, you can turn the full detail on and off in a running program.
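A minimal sketch of the summarization idea, assuming a simple in-memory accumulator: durations between matching start and end events are collected, and one record per event type (count, mean, standard deviation) is emitted each reporting interval. The class `IntervalSummarizer` and its methods are hypothetical and are not the NetLogger summarization library's interface; timestamps are passed in explicitly so the example is deterministic.

```python
import math

class IntervalSummarizer:
    """Accumulate start/end durations and emit one summary per reporting interval."""

    def __init__(self, interval):
        self.interval = interval   # seconds between summaries
        self.window_start = None   # time the current window opened
        self.pending = {}          # (event, id) -> start timestamp
        self.durations = {}        # event -> durations seen in this window
        self.summaries = []        # emitted summary records

    def start(self, event, event_id, ts):
        if self.window_start is None:
            self.window_start = ts
        self.pending[(event, event_id)] = ts

    def end(self, event, event_id, ts):
        begin = self.pending.pop((event, event_id))
        self.durations.setdefault(event, []).append(ts - begin)
        if ts - self.window_start >= self.interval:
            self.flush(ts)

    def flush(self, ts):
        """Emit one record per event type, then open a new window."""
        for event, ds in self.durations.items():
            n = len(ds)
            mean = sum(ds) / n
            # Population standard deviation over this window.
            sd = math.sqrt(sum((d - mean) ** 2 for d in ds) / n)
            self.summaries.append(
                {"event": event, "ts": ts, "n": n, "mean": mean, "sd": sd}
            )
        self.durations = {}
        self.window_start = ts

# Simulated loop: each read and each write takes 0.25 s of simulated time.
s = IntervalSummarizer(interval=1.0)
t = 0.0
for i in range(4):
    s.start("read", i, t)
    t += 0.25
    s.end("read", i, t)
    s.start("write", i, t)
    t += 0.25
    s.end("write", i, t)
s.flush(t)  # emit anything left in the final window

for rec in s.summaries:
    print(rec)
```

In this toy run, 16 start/end events collapse into 4 summary records. The reduction grows with the number of loop iterations per interval.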
Sample Use: GridFTP Bottleneck Detector
GridFTP today
- By default only logs a single throughput number
- GridFTP throughput might be limited by:
  - sending/receiving disk (NFS/AFS mounted partition?)
  - network
- Currently no way to tell
Experimental GridFTP enhancement
- Use NetLogger summarization library to monitor all I/O streams
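One way per-stream summaries could point at a bottleneck is sketched below, under an assumption not stated in the slides: each I/O stream reports the bytes it moved and the time it spent inside I/O calls, and the stream with the lowest resulting throughput is flagged as the likely limit. The function `likely_bottleneck`, the field names, and the numbers are all illustrative; this is not how GridFTP or NetLogger is confirmed to do it, and the figures are not measurements.

```python
def likely_bottleneck(stream_summaries):
    """Return (name, throughput in bytes/s) for the slowest I/O stream."""
    rates = {
        name: s["bytes"] / s["busy_seconds"]
        for name, s in stream_summaries.items()
        if s["busy_seconds"] > 0
    }
    slowest = min(rates, key=rates.get)
    return slowest, rates[slowest]

# Illustrative values only: same bytes moved, different time spent in I/O calls.
summaries = {
    "disk_read": {"bytes": 1_000_000_000, "busy_seconds": 4.0},
    "network_write": {"bytes": 1_000_000_000, "busy_seconds": 10.0},
}

name, rate = likely_bottleneck(summaries)
print(f"likely bottleneck: {name} at {rate / 1e6:.0f} MB/s")
```

With a single aggregate throughput number, the disk and network cases above would be indistinguishable. Separate per-stream summaries are what make the comparison possible.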
nlioperf
Started out as a test tool for the NetLogger summarization library, and turned out to be quite a useful test tool in its own right.
Supports:
- multiple TCP streams
- file input/output
- TCP buffer tuning
- bottleneck detection
- multiple levels of logging: full, interval summary, final summary only
nlioperf: network vs disk performance
Bottleneck Detection: WAN Transfer
Ideas
- Add NetLogger summarizer to iperf?
- Add disk testing to iperf?
More Information
NetLogger 4.0beta, just released, includes the new summarization library: http://dsd.lbl.gov/NetLogger
email: DKGunter@lbl.gov or BLTierney@lbl.gov