CEDPS: Center for Enabling Distributed Petascale Science
Brian Tierney
Lawrence Berkeley National Laboratory
2 CEDPS in a Nutshell
Center for Enabling Distributed Petascale Science
CEDPS is pronounced “seds” (silent P)
DOE SciDAC Center for Enabling Technology
July 1, 2006 – June 30, 2011, $2.4M/yr
Collaboration between 5 sites:
  Argonne National Laboratory
  Fermi National Accelerator Laboratory
  Lawrence Berkeley National Laboratory
  USC Information Sciences Institute
  University of Wisconsin-Madison
Three focus areas:
  Moving data to compute resources
  Moving compute services to data sites
  Troubleshooting and diagnosis tools
3 The Petascale Data Challenge
DOE facilities generate many petabytes of data (2 petabytes = all U.S. academic research libraries!)
Remote users (at labs, universities, industry) need data!
Rapid, reliable access is key to maximizing the value of $B facilities
[Figure: massive data at DOE facilities, surrounded by remote distributed users (U)]
4 Bridging the Divide (1): Move Data to Users When & Where Needed
“Deliver this 100 Terabytes to locations A, B, C by 9am tomorrow”
Fast: >10,000x faster than usual Internet
Reliable: recover from many failures
Predictable: data arrives when scheduled
Secure: protect expensive resources & data
Scalable: deal with many users & much data
5 Bridging the Divide (2): Allow Users to Move Computation Near Data
“Perform my computation F on datasets X, Y, Z”
Science services: provide analysis functions near data source
Flexible: easy integration of functions
Secure: protect expensive resources & data
Scalable: deal with many users & much data
6 Bridging the Divide (3): Troubleshoot End-to-End Problems
“Why did my data transfer (or remote operation) fail?”
Identify & diagnose failures & performance problems
Instrument: include monitoring points in all system components
Monitor: collect data in response to problems
Diagnose: identify the source of problems
CEDPS Troubleshooting Work
8 The Troubleshooting Problem
Large production Grids (OSG, TeraGrid, etc.) report a high failure rate
  10-30% of jobs submitted to the Grid fail, mostly because of authentication errors and disk space problems
  Users don’t always notice, as jobs may be automatically resubmitted and may succeed the next time
Troubleshooting in this environment is very difficult
Current approach:
  Log into all hosts used (if possible)
  ‘grep’ various log files looking for problems
    Inconsistent logging levels
    Multiple file formats
  Often a tedious and time-consuming process
9 Our Approach
Discover broken services before users do
  Deploy monitoring software that can raise alerts on errors
  Run test jobs to detect problems
When log analysis is needed:
  Use a unified approach to logging for applications and middleware
  Collect logs centrally
  Develop automatic analysis tools for anomaly detection
10 Logging Solution: “NetLogger” Methodology
NetLogger (short for “Networked Application Logger”)
  Methodology for troubleshooting distributed applications
  Troubleshooting = detection and analysis of faults and performance issues
Key components:
  Common log format: name=value pairs
  Precision ISO-format timestamps and synchronized clocks (via NTP)
  Wrap all “interesting” actions with ‘start’ and ‘end’ log events
    interesting = anything that might fail or run slowly
  Use unique event names, e.g.: event=org.globus.gridFTP.authn.start
  Include an “event ID” in each log message
    allows correlation of sets of events
  Collect logs in a centralized database
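The key components above can be sketched in a few lines of Python. This is a minimal illustration of the methodology, not the NetLogger client API: the helper names `format_event` and `logged`, the `org.example.demo.read` event name, and the convention that a non-zero `status` means failure are all assumptions made for the example.

```python
from contextlib import contextmanager
from datetime import datetime, timezone


def _quote(value):
    """Quote values that are empty or contain whitespace so each pair stays one token."""
    text = str(value)
    return f'"{text}"' if text == "" or any(c.isspace() for c in text) else text


def format_event(event, guid, **fields):
    """Build one log line: ts, event, and guid first, then any extra name=value pairs."""
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    pairs = [("ts", ts), ("event", event), ("guid", guid), *fields.items()]
    return " ".join(f"{name}={_quote(value)}" for name, value in pairs)


@contextmanager
def logged(event, guid, emit=print, **fields):
    """Wrap an action in '<event>.start' and '<event>.end' log events."""
    emit(format_event(event + ".start", guid, **fields))
    status = 0
    try:
        yield
    except Exception:
        status = -1  # assumption: any non-zero status marks a failure
        raise
    finally:
        # The end event is written even when the wrapped action raises.
        emit(format_event(event + ".end", guid, status=status, **fields))


if __name__ == "__main__":
    guid = "A5A563CD-D80C-4E58-9ECD-79C6B611E122"
    with logged("org.example.demo.read", guid, file="/tmp/myfile"):
        pass  # the "interesting" action goes here
```

Because the end event is emitted in a `finally` clause, a failed action still leaves a matching start/end pair in the log, which is what makes later correlation possible.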
11 Example Log: GridFTP
ts= T18:39: Z event=org.globus.gridFTP.start prog=GridFTP localhost=myhost remoteHost=somehost.gov:56010 serverMode=inetd guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3
ts= T18:39: Z event=org.globus.gridFTP.authn.start DN="/DC=org/DC=doegrids/OU=People/CN=Somebody" guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3
ts= T18:39: Z event=org.globus.gridFTP.authn.end DN="/DC=org/DC=doegrids/OU=People/CN=Somebody" msg=" successfully authorized" localUser=uscmspool381 guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 status=0
ts= T18:39: Z event=org.globus.gridFTP.transfer.start file=/tmp/myfile tcpBufferSize=128KB dataBlockSize= numStreams=1 numStripes=1 destHost= guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3
ts= T18:45: Z event=org.globus.gridFTP.transfer.end file=/tmp/myfile bytesTransferred= guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 status=0
ts= T18:45: Z event=org.globus.gridFTP.end guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 status=226
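A log in this format is easy to analyze mechanically. The sketch below parses name=value lines, groups them by `guid`, and measures the time between matching start and end events. The function names are hypothetical, and the full timestamps in `SAMPLE` are made up for the illustration, since the timestamps in the slide are incomplete.

```python
import shlex
from collections import defaultdict
from datetime import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # assumed ISO-style timestamp layout


def parse_line(line):
    """Split one name=value log line into a dict; quoted values may contain spaces."""
    record = {}
    for token in shlex.split(line):
        name, sep, value = token.partition("=")
        if sep:
            record[name] = value
    return record


def group_by_guid(lines):
    """Collect parsed records under the guid that ties them together."""
    groups = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        groups[record.get("guid")].append(record)
    return dict(groups)


def durations(records):
    """Seconds between each '<name>.start' and its matching '<name>.end'."""
    started, result = {}, {}
    for record in records:
        base, _, suffix = record["event"].rpartition(".")
        when = datetime.strptime(record["ts"], TS_FORMAT)
        if suffix == "start":
            started[base] = when
        elif suffix == "end" and base in started:
            result[base] = (when - started.pop(base)).total_seconds()
    return result


# Hypothetical timestamps; event names and guid follow the slide's example.
GUID = "1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3"
SAMPLE = [
    f"ts=2007-01-01T18:39:00.000000Z event=org.globus.gridFTP.transfer.start file=/tmp/myfile guid={GUID}",
    f"ts=2007-01-01T18:45:30.000000Z event=org.globus.gridFTP.transfer.end file=/tmp/myfile guid={GUID} status=0",
]

if __name__ == "__main__":
    for guid, records in group_by_guid(SAMPLE).items():
        print(guid, durations(records))
```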
12 Log Collection
No need to invent something new for this: the syslog-ng tool fills all requirements
  Open source, runs on all major OSes
  Fault tolerant, secure (via stunnel), scalable, easy to configure, etc.
  Large user base
  Can filter logs based on level and content
  Arbitrary number of sources and destinations
  Can act as a proxy, tunnel through firewalls
  Execute programs: Send , load database, etc.
  Built-in log rotation
  Timezone support and fully qualified host names
13 Log collection using syslog-ng
14 Sample Site Deployment For Grid Middleware
15 Troubleshooting Architecture: Site Archive
Log Summarization Library: New Component of the NetLogger Toolkit
17 NetLogger Toolkit
Client library that makes it easy to generate logs
  C, Java, Perl, Python
First released in 1994
Summarizer new in NetLogger 4.0 release, Nov
Open Source available at
Visualization support
  scripts to convert logs to R, gnuplot, Excel format
  collection of sample R and gnuplot scripts
Database tools
  tools to load logs into MySQL
18 New NetLogger Feature: Time-based summarization
Example: you want detailed I/O instrumentation of a simple loop that reads data, processes it, then writes out some result.
DO for i=1, N
    log("read.start", ...)
    read_data()
    log("read.end", ...)
    process_data()
    log("write.start", ...)
    write_data()
    log("write.end", ...)
DONE
19 Time-based summarization
Full logging can produce many gigabytes of logs!
Summarized logs report only a periodic summary (mean, sd) of times and values between “start” and “end” events (of a given type and ID).
This gives orders-of-magnitude data reduction with, for many purposes, no loss in explanatory power.
Since the original instrumentation is still there, you can turn the full detail on and off in a running program.
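The idea of time-based summarization can be illustrated with a short sketch. It is not the NetLogger summarizer itself: the function name `summarize`, the tuple layout of the input events, and the choice to assign each duration to the window in which its end event falls are assumptions made for the example.

```python
import statistics
from collections import defaultdict


def summarize(events, interval=1.0):
    """Reduce start/end events to per-window (count, mean, sd) of durations.

    events: iterable of (time_seconds, event_name, event_id) in time order,
    where event_name ends in '.start' or '.end'.
    Returns {(window_index, base_name): (count, mean, sd)}.
    """
    open_starts = {}
    windows = defaultdict(list)
    for when, name, event_id in events:
        base, _, suffix = name.rpartition(".")
        key = (base, event_id)  # pair events of the same type and ID
        if suffix == "start":
            open_starts[key] = when
        elif suffix == "end" and key in open_starts:
            duration = when - open_starts.pop(key)
            windows[(int(when // interval), base)].append(duration)
    summary = {}
    for window_key, values in windows.items():
        # The sample standard deviation needs at least two values.
        sd = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[window_key] = (len(values), statistics.mean(values), sd)
    return summary


if __name__ == "__main__":
    trace = [
        (0.0, "read.start", 1), (0.25, "read.end", 1),
        (0.25, "read.start", 1), (0.75, "read.end", 1),
        (1.0, "read.start", 1), (1.5, "read.end", 1),
    ]
    for key, stats in sorted(summarize(trace).items()):
        print(key, stats)
```

Six detailed events collapse into two summary records here; with thousands of loop iterations per window the reduction is correspondingly larger.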
20 Sample Use: GridFTP Bottleneck Detector
GridFTP today
  Only logs a single throughput number
  Throughput might be limited by:
    Sending/receiving disk (NFS/AFS mounted partition?)
    Network
  Currently no way to tell which
New GridFTP enhancement
  Use log summarization library to monitor all I/O streams
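One plausible way to use per-stream summaries for bottleneck detection is to compare the throughput each I/O stream achieved while it was busy and flag the slowest one. The sketch below shows that comparison only; it is not the GridFTP implementation, and the function name, stream names, and numbers are all hypothetical.

```python
def find_bottleneck(stream_stats):
    """Return (slowest_stream, rates) from {stream: (bytes_moved, seconds_busy)}.

    The stream with the lowest bytes-per-second while busy is reported as the
    likely bottleneck (an assumption of this sketch, not a GridFTP rule).
    """
    rates = {
        name: nbytes / seconds
        for name, (nbytes, seconds) in stream_stats.items()
        if seconds > 0
    }
    if not rates:
        raise ValueError("no stream reported any busy time")
    return min(rates, key=rates.get), rates


if __name__ == "__main__":
    # Illustrative numbers only: bytes moved and seconds spent in each stream.
    stats = {
        "disk_read": (1_000_000_000, 40.0),
        "net_write": (1_000_000_000, 12.0),
        "net_read": (1_000_000_000, 12.5),
        "disk_write": (1_000_000_000, 20.0),
    }
    slowest, rates = find_bottleneck(stats)
    print("bottleneck:", slowest)
```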
21 Full vs Summary Results Bottleneck = disk read
22 More Information
General CEDPS information:
Log Summarizer:
New GridFTP with NetLogger Log Summarizer:
Contact us if you need troubleshooting help!
Extra Slides
24 NetLogger Summarizer vs ‘R’ summarized logs
25 Logging “Best Practices” Recommendations
Practices
  All logs should contain a unique event name and an ISO-format timestamp
  All system operations that might fail or experience performance variations should be wrapped with start and end events
  All logs from a given execution thread should be tagged with a globally unique ID (GUID), such as a Universally Unique Identifier (UUID)
Log format
  Logs should be composed of lines of ASCII name=value pairs
  Example:
    ts= T18:48: Z event=org.globus.gridFTP.transfer.start prog=GridFTP-v4.2 guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 file=filename src.host=H1 src.port=P1 dst.host=H2 dst.port=P2
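These recommendations are simple enough to check automatically. The following sketch flags log lines that break them; the function name `check_line`, the regular expressions, and the exact timestamp pattern accepted are assumptions made for the example, and the timestamp in the usage line is made up.

```python
import re

# name=value pairs; a value is either double-quoted or a run of non-space characters
PAIR = re.compile(r'(\S+?)=("[^"]*"|\S*)')
# assumed ISO-style UTC timestamp, e.g. 2007-01-01T18:48:00.000000Z
ISO_TS = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$")


def check_line(line):
    """Return a list of best-practice violations found in one log line."""
    problems = []
    if not line.isascii():
        problems.append("non-ASCII characters")
    fields = {name: value.strip('"') for name, value in PAIR.findall(line)}
    for required in ("ts", "event"):
        if required not in fields:
            problems.append(f"missing {required}")
    if "ts" in fields and not ISO_TS.match(fields["ts"]):
        problems.append("timestamp is not ISO format")
    if "guid" not in fields and "id" not in fields:
        problems.append("missing guid")
    return problems


if __name__ == "__main__":
    good = ("ts=2007-01-01T18:48:00.000000Z "
            "event=org.globus.gridFTP.transfer.start "
            "guid=1DDF1F3D-A677-4DBC-8C4E-6A8A3B252AE3 file=filename")
    print(check_line(good))
    print(check_line("event=org.globus.gridFTP.transfer.start file=filename"))
```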
26 Event Names
Use a '.' as a separator and go from general to specific
  Same as Java class names
First part of name should be used as a unique namespace (e.g.: org.globus)
Use start/end suffixes whenever possible
  Helps immensely with troubleshooting
Examples
  org.globus.gridFTP.start
  org.globus.gridFTP.authn.start
  org.globus.gridFTP.authn.end
  org.globus.gridFTP.transfer.start
  org.globus.gridFTP.transfer.end
  org.globus.gridFTP.end

  org.globus.MDS.response.start
  org.globus.MDS.query.start
  org.globus.MDS.query.end
  org.globus.MDS.write.net.start
  org.globus.MDS.write.net.end
  org.globus.MDS.response.end
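The start/end convention helps with troubleshooting because an operation that never finished shows up as a start event without a matching end. A minimal sketch of that check follows; the function name `unfinished` is hypothetical, and the truncated event list imitates a log that stops partway through the MDS example above.

```python
def unfinished(event_names):
    """Return base names that have a '.start' event but no later '.end' event."""
    open_ops = []
    for name in event_names:
        base, _, suffix = name.rpartition(".")
        if suffix == "start":
            open_ops.append(base)
        elif suffix == "end" and base in open_ops:
            open_ops.remove(base)
    return open_ops


if __name__ == "__main__":
    # A log that stops after the network write began.
    truncated = [
        "org.globus.MDS.response.start",
        "org.globus.MDS.query.start",
        "org.globus.MDS.query.end",
        "org.globus.MDS.write.net.start",
    ]
    print(unfinished(truncated))
```

In this case the query completed, so the output points at the response and the network write as the operations still in progress when the log ended.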
27 Globally Unique IDs
Use the ‘guid’ or ‘id’ reserved name to allow correlation of a set of events
  event=org.globus.gridFTP.authn.start id=27023
  event=org.globus.gridFTP.authn.end id=27023
  event=org.globus.gridFTP.transfer.start id=27023
  event=org.globus.gridFTP.transfer.end id=27023
Can use the standard Unix/Windows program ‘uuidgen’ to generate a globally unique ID
  e.g.: A5A563CD-D80C-4E58-9ECD-79C6B611E122
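Applications that cannot conveniently call ‘uuidgen’ can generate an equivalent ID in-process. The sketch below uses Python's standard `uuid` module; the helper names `new_guid` and `tag` are made up for the example, and upper-casing the ID is only there to match the style of the example above.

```python
import uuid


def new_guid():
    """Return a random UUID as an upper-case string."""
    return str(uuid.uuid4()).upper()


def tag(events, guid):
    """Attach the same guid to every event name so the set can be correlated later."""
    return [f"event={name} guid={guid}" for name in events]


if __name__ == "__main__":
    guid = new_guid()  # generate once per execution thread, then reuse
    for line in tag(["org.globus.gridFTP.authn.start",
                     "org.globus.gridFTP.authn.end"], guid):
        print(line)
```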
28 GridFTP: network vs disk performance
29 Syslog-ng Deployment for OSG