Brian L. Tierney, Dan Gunter


NetLogger Summarizer and the nlioperf tool
Brian L. Tierney, Dan Gunter
BLTierney@lbl.gov, DKGunter@lbl.gov
Lawrence Berkeley National Laboratory, http://dsd.lbl.gov/NetLogger
Center for Enabling Distributed Petascale Science, http://www.cedps-scidac.net

NetLogger Toolkit
We have developed the NetLogger Toolkit (short for Networked Application Logger), which includes:
- tools that make it easy for distributed applications to log interesting events at every critical point
- the NetLogger client library (C, Java, Perl, Python)
- tools for host and network monitoring
- event visualization tools that let you correlate application events with host and network events
- NetLogger event archive and retrieval tools (new)
NetLogger combines network, host, and application-level monitoring to provide a complete view of the entire system.
Open source, available at http://dsd.lbl.gov/NetLogger/

NetLogger Analysis: Key Concepts
The NetLogger visualization methodology is based on time-correlated and object-correlated events.
To associate a group of events into a "lifeline", you must assign an "Event ID" to each NetLogger event.
Sample Event IDs: file name, block ID, frame ID, etc.
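The lifeline idea can be illustrated with a minimal Python sketch. It does not use the NetLogger client library or its log format: the event dictionaries and the field names `ts`, `event`, and `id` are assumptions made up for this example. It only shows how events that share an Event ID can be grouped and ordered in time.

```python
from collections import defaultdict

def build_lifelines(events):
    """Group events by Event ID and sort each group by timestamp."""
    lifelines = defaultdict(list)
    for ev in events:
        lifelines[ev["id"]].append(ev)
    for evs in lifelines.values():
        evs.sort(key=lambda e: e["ts"])
    return dict(lifelines)

# Hypothetical events: two data blocks moving through read and write stages.
events = [
    {"ts": 0.10, "event": "read.start", "id": "block-1"},
    {"ts": 0.12, "event": "read.start", "id": "block-2"},
    {"ts": 0.30, "event": "read.end", "id": "block-1"},
    {"ts": 0.45, "event": "read.end", "id": "block-2"},
    {"ts": 0.50, "event": "write.start", "id": "block-1"},
]

lifelines = build_lifelines(events)
for event_id, evs in sorted(lifelines.items()):
    print(event_id, [e["event"] for e in evs])
```

Each resulting list is one lifeline: the time-ordered path of a single object (here a block) through the system.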

Time-based summarization
Imagine you want detailed instrumentation of a simple loop that reads data, processes it, then writes out some result.
    DO for i=1, N
        log("read.start", ...)
        read_data()
        log("read.end", ...)
        process_data()
        log("write.start", ...)
        write_data()
        log("write.end", ...)
    DONE

Time-based summarization
Report only a periodic summary (mean, sd) of the times and values between "start" and "end" events of a given type and ID.
[Figure: going from the full event trace to a periodic summary]
This gives orders-of-magnitude data reduction with, for many purposes, no loss in explanatory power.
Because the original instrumentation is still there, you can turn the full detail on and off in a running program.
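A rough Python sketch of this kind of summarization follows. It is not the NetLogger summarization library: the `Summarizer` class, its `log` and `flush` methods, and the `full_detail` switch are hypothetical names chosen for illustration. The sketch pairs each "start" event with the matching "end" event of the same type and ID, and emits the count, mean, and standard deviation of the durations once per interval.

```python
import math
from collections import defaultdict

class Summarizer:
    """Pairs start/end events and reports per-interval summaries (illustrative only)."""

    def __init__(self, interval):
        self.interval = interval            # summary period, in seconds
        self.full_detail = False            # switch full per-event logging on or off
        self.open = {}                      # (name, event_id) -> start timestamp
        self.durations = defaultdict(list)  # name -> durations in the current window
        self.window_start = None
        self.output = []                    # records that would be written to the log

    def log(self, event, event_id, ts):
        if self.full_detail:
            self.output.append(("event", event, event_id, ts))
        if self.window_start is None:
            self.window_start = ts
        name, _, kind = event.rpartition(".")
        if kind == "start":
            self.open[(name, event_id)] = ts
        elif kind == "end" and (name, event_id) in self.open:
            self.durations[name].append(ts - self.open.pop((name, event_id)))
        if ts - self.window_start >= self.interval:
            self.flush(ts)

    def flush(self, ts):
        """Emit one summary per event type, then start a new window."""
        for name, ds in sorted(self.durations.items()):
            mean = sum(ds) / len(ds)
            # Population standard deviation of the durations in this window.
            sd = math.sqrt(sum((d - mean) ** 2 for d in ds) / len(ds))
            self.output.append(("summary", name, len(ds), mean, sd))
        self.durations.clear()
        self.window_start = ts

# Usage: four reads lasting 1, 2, 3, and 4 seconds, summarized every 10 seconds.
s = Summarizer(interval=10.0)
t = 0.0
for i in range(4):
    s.log("read.start", i, t)
    s.log("read.end", i, t + 1.0 + i)
    t += 5.0
s.flush(t)
for record in s.output:
    print(record)
```

In this run eight events collapse into two summary records. Setting `full_detail = True` on the live object would make subsequent calls also record every individual event, which mirrors the idea of switching detail on and off without changing the instrumentation.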

Sample Use: GridFTP Bottleneck Detector
GridFTP today
- By default, logs only a single throughput number
- Throughput might be limited by:
  - the sending or receiving disk (an NFS/AFS-mounted partition?)
  - the network
- There is currently no way to tell which
Experimental GridFTP enhancement
- Use the NetLogger summarization library to monitor all I/O streams
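The slides do not spell out the detection rule, so the following Python sketch rests on an assumption: if each I/O stream (disk read, network write, and so on) is summarized as total bytes moved and total time spent inside its I/O calls, the stream with the lowest effective throughput is the likely bottleneck. The stream names and the `(bytes, seconds)` summary layout are hypothetical and are not taken from GridFTP or NetLogger.

```python
def stream_throughput(summaries):
    """Return MB/s per stream from {stream: (total_bytes, busy_seconds)}."""
    return {
        stream: nbytes / secs / 1e6
        for stream, (nbytes, secs) in summaries.items()
        if secs > 0
    }

def likely_bottleneck(summaries):
    """Pick the stream with the lowest effective throughput."""
    rates = stream_throughput(summaries)
    return min(rates, key=rates.get)

# Hypothetical per-stream summaries for one transfer of 1 GB.
summaries = {
    "disk.read": (1_000_000_000, 4.0),   # 250 MB/s while reading
    "net.write": (1_000_000_000, 12.5),  # 80 MB/s while sending
}

for stream, rate in sorted(stream_throughput(summaries).items()):
    print(f"{stream}: {rate:.1f} MB/s")
print("likely bottleneck:", likely_bottleneck(summaries))
```

With these made-up numbers the network write stream is the slower one, so the network rather than the disk would be the first place to look.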

nlioperf
- Started out as a test tool for the NetLogger summarization library
- Turned out to be a useful test tool in its own right
- Supports:
  - multiple TCP streams
  - file input/output
  - TCP buffer tuning
  - bottleneck detection
  - multiple levels of logging: full, interval summary, final summary only

nlioperf: network vs disk performance

Bottleneck Detection: WAN Transfer

Ideas
- Add the NetLogger summarizer to iperf?
- Add disk testing to iperf?

More Information
NetLogger 4.0beta, just released, includes the new summarization library: http://dsd.lbl.gov/NetLogger
Email: DKGunter@lbl.gov or BLTierney@lbl.gov