Summer 2002 at SLAC Ajay Tirumala

Main Projects
- Measuring disk throughputs on remote hosts, considering parameters such as:
  - File system
  - Read/write block size
  - Sequential vs. random reads/writes
  - Commit sequence for writes
  - File sizes
- Iperf QUICK mode
  - A new algorithm that reduces the time needed to measure end-to-end bandwidth, and with it the network traffic generated

Disk Throughputs: File Systems
- NFS
  - Uses the client's main memory as a cache; data can be lost during reads/writes, so we need to perform small reads and commit often
- AFS
  - Uses session semantics; the local disk is the cache
- UFS (default file system for Solaris)
  - fwrites go to the disk buffer; data is committed to disk on fsync, when the buffer fills, or when disk caching is disabled
- EXT (most popular file system for Linux)
  - Sits in the layer below the VFS
  - Supports pre-allocation (allotting up to 8 adjacent file blocks when a block is requested)
  - A mount option is available for greater write speeds, at the cost of consistency
- A simple way to record which file system backs a test path is sketched below.
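Since the measured behaviour depends on which of these file systems backs the test path, it helps to record that alongside each run. Below is a minimal sketch of my own, assuming Linux and statfs(2); the magic-number constants come from <linux/magic.h>, and the helper is illustrative rather than part of the original test scripts.

```c
#include <stdio.h>
#include <sys/vfs.h>      /* statfs() */
#include <linux/magic.h>  /* NFS_SUPER_MAGIC, EXT2_SUPER_MAGIC, ... */

/* Illustrative helper: report which file system backs 'path', so the
 * throughput numbers can be interpreted against the caching behaviour
 * described above. */
static const char *fs_name(const char *path)
{
    struct statfs sb;
    if (statfs(path, &sb) != 0) {
        perror("statfs");
        return "unknown";
    }
    switch (sb.f_type) {
    case NFS_SUPER_MAGIC:  return "NFS (client memory cache)";
    case EXT2_SUPER_MAGIC: return "EXT2/3 (pre-allocation, below the VFS)";
    default:               return "other";
    }
}

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : ".";
    printf("%s is on: %s\n", path, fs_name(path));
    return 0;
}
```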

Disk Reads
- The first read will necessitate a disk read in most cases
  - If it is instead served from memory, that indicates either minimal memory activity or a very large memory, since the tests are performed days apart
- The second read (performed immediately after the first) will generally be served from memory, unless disk caching is disabled
- Since there is still a good probability that even the first read comes from memory, we consider disk writes the primary metric for disk speed (a timing sketch follows below)
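A back-to-back read test makes this caching effect visible. The sketch below is mine, not the original test code; it assumes POSIX read(2), an existing test file, and an illustrative 64 KB block size. The first pass usually has to go to the disk, while the second pass is normally served from memory unless disk caching is disabled.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

#define BLOCK (64 * 1024)   /* placeholder read-block size */

/* Read the whole file once and return the observed throughput in MB/s. */
static double read_pass(const char *path)
{
    char buf[BLOCK];
    struct timeval t0, t1;
    long long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); exit(1); }

    gettimeofday(&t0, NULL);
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    gettimeofday(&t1, NULL);
    close(fd);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    if (secs <= 0)
        secs = 1e-9;
    return (total / 1e6) / secs;
}

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : "testfile";  /* placeholder */
    /* First pass: likely a real disk read (unless the file is already cached). */
    printf("first  read: %.1f MB/s\n", read_pass(path));
    /* Second pass: usually served from memory, unless disk caching is disabled. */
    printf("second read: %.1f MB/s\n", read_pass(path));
    return 0;
}
```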

Disk Writes
- Commit modes (fsync was used to commit files to disk):
  - Plain (no commit)
  - Commit each write
  - Commit at end: most indicative of the achievable disk bandwidth (see the sketch below)
- Block sizes
  - For local disks, use large block sizes (1-2 MB)
  - For remote writes, 64 KB or 128 KB suffices
- File sizes
  - Using a large file size (2 GB) increased throughput in some cases; the default was 64 MB
- Caution: NFS may not return an error during fwrites; it may only report the error on fsync
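The sketch below illustrates the "commit at end" mode with plain POSIX calls; the path, block size and 64 MB file size are placeholders, not the original scripts. Note that the return value of fsync() is checked: as cautioned above, on NFS a write may appear to succeed and the error only surface at commit time.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

#define BLOCK      (64 * 1024)            /* placeholder write-block size  */
#define FILE_SIZE  (64LL * 1024 * 1024)   /* placeholder file size (64 MB) */

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : "write_test.dat";  /* placeholder */
    static char buf[BLOCK];
    memset(buf, 0xA5, sizeof buf);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    struct timeval t0, t1;
    gettimeofday(&t0, NULL);

    long long written = 0;
    while (written < FILE_SIZE) {
        ssize_t n = write(fd, buf, sizeof buf);
        if (n < 0) { perror("write"); return 1; }
        written += n;
        /* "Commit each write" mode would call fsync(fd) here instead. */
    }

    /* "Commit at end": one fsync after all writes -- most indicative of the
     * achievable disk bandwidth.  On NFS this is where a deferred write
     * error would finally show up, so check the return value. */
    if (fsync(fd) != 0) { perror("fsync"); return 1; }

    gettimeofday(&t1, NULL);
    close(fd);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    if (secs <= 0)
        secs = 1e-9;
    printf("wrote %lld MB in %.2f s -> %.1f MB/s\n",
           written / (1024 * 1024), secs, (written / 1e6) / secs);
    return 0;
}
```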

Possible Areas to Investigate (Disk)
- Consider different disk subsystems, such as RAID
- Analysis of parallel disk transfers using BBCP; initial tests indicate that where the disk is the limiting factor, a single thread is the best option
- An algorithm to estimate disk speeds without issuing large writes*; manufacturers' specs lose meaning with network file systems, and even with local file systems spanning multiple disks

Iperf QUICK Mode
- Problem
  - Current TCP applications cannot detect when they are out of slow-start
  - Bandwidth-measurement applications therefore have to run for a considerable time to counter the effects of slow-start
- Solution
  - Use Web100 to detect the end of slow-start
  - Measure bandwidth for a small period after slow-start (say 1 s)
  - This should save about 90% of the estimation time and of the traffic generated
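A rough illustration of that figure, using numbers quoted later in these slides: with a slow-start of about 1 s and a 1 s measurement window, a QUICK-mode run finishes in roughly 2 s instead of a conventional 20 s run, i.e. about 90% less time and traffic. The saving shrinks toward 70% on long-haul paths whose slow-start lasts around 5 s, and approaches 95% on low-latency paths where slow-start takes only a few hundred milliseconds.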

Detecting the End of Slow-Start
- Outline
  - Determine a sampling period for the congestion window
  - Detect the absence of an exponential increase every RTT (see the sketch below)
- Handle pathological cases
  - The connection may not get out of slow-start
  - Multiple slow-starts
  - The connection may have a very small bandwidth-delay product, e.g. localhost transfers with latency in nanoseconds
- At present it handles Reno and Vegas; it should handle Net100/Floyd stacks with minor modifications
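A sketch of the detection idea in C. The congestion-window read is a simulated stand-in for a Web100 variable query (the real code uses the Web100 library); the 50 ms RTT, the 50% growth threshold and the timeout are illustrative values, not the ones used in Iperf QUICK mode.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for reading the connection's congestion window
 * through Web100.  Here it simply simulates slow-start (doubling per
 * sample) followed by linear growth, so the sketch is self-contained. */
static unsigned long read_cwnd(void)
{
    static unsigned long cwnd = 14600;   /* ~10 x 1460-byte segments */
    static int samples = 0;
    if (samples++ < 6)
        cwnd *= 2;                       /* exponential (slow-start) phase */
    else
        cwnd += 1460;                    /* linear (congestion-avoidance) phase */
    return cwnd;
}

/* Return 1 once the congestion window stops growing roughly exponentially
 * per RTT, i.e. the connection has left slow-start; return 0 if it never
 * stabilizes within max_wait_ms (covering the pathological cases above). */
int wait_for_slowstart_end(int rtt_ms, int max_wait_ms)
{
    int sample_ms = (rtt_ms > 20) ? rtt_ms : 20;  /* floor of 20 ms, a period Web100 handled well */
    unsigned long prev = read_cwnd();
    int waited = 0;

    while (waited < max_wait_ms) {
        usleep(sample_ms * 1000);
        waited += sample_ms;

        unsigned long cur = read_cwnd();
        if (cur < prev + prev / 2)       /* grew by less than 50% this RTT */
            return 1;
        prev = cur;
    }
    return 0;                            /* never stabilized */
}

int main(void)
{
    if (wait_for_slowstart_end(/* rtt_ms */ 50, /* max_wait_ms */ 10000))
        printf("connection left slow-start; start the short measurement\n");
    else
        printf("cwnd never stabilized; do not report QUICK-mode results\n");
    return 0;
}
```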

The QUICK Mode Algorithm
- Initialize the Iperf sockets and initialize a Web100 connection for the Iperf socket
- Start the Web100 data-collection thread
  - This indicates when the connection is definitely out of slow-start
- Detect the end of slow-start in the data-transfer thread
  - If the congestion window does not stabilize, do NOT report QUICK-mode results
- Measure bandwidth for 1 s (or a user-specified time) after slow-start (see the sketch below)
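A condensed sketch of that flow, assuming POSIX sockets and pthreads and reusing the wait_for_slowstart_end() routine from the previous sketch. The socket is assumed to be already connected, and the 64 KB send buffer and thread layout are illustrative; this is not the actual Iperf QUICK-mode code.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* From the previous sketch: blocks until slow-start ends (returns 1)
 * or gives up (returns 0). */
extern int wait_for_slowstart_end(int rtt_ms, int max_wait_ms);

static volatile int past_slowstart = 0;  /* set by the monitoring thread */
static volatile int gave_up = 0;

/* Monitoring thread: plays the role of the Web100 data-collection thread. */
static void *monitor(void *arg)
{
    (void)arg;
    if (wait_for_slowstart_end(/* rtt_ms */ 50, /* max_wait_ms */ 10000))
        past_slowstart = 1;
    else
        gave_up = 1;                     /* cwnd never stabilized */
    return NULL;
}

/* Send data on an already-connected TCP socket; once slow-start is over,
 * measure throughput (in Mbit/s) for measure_secs seconds.
 * Returns a negative value if QUICK-mode results should not be reported. */
double quick_mode(int sock, double measure_secs)
{
    static char buf[64 * 1024];          /* illustrative send-block size */
    memset(buf, 0, sizeof buf);

    pthread_t tid;
    pthread_create(&tid, NULL, monitor, NULL);

    /* Phase 1: ramp-up -- keep sending, but do not count these bytes. */
    while (!past_slowstart && !gave_up)
        if (send(sock, buf, sizeof buf, 0) < 0)
            return -1.0;
    if (gave_up) {
        pthread_join(tid, NULL);
        return -1.0;
    }

    /* Phase 2: the short measurement window after slow-start. */
    struct timeval t0, now;
    gettimeofday(&t0, NULL);
    long long bytes = 0;
    double elapsed = 0.0;
    do {
        ssize_t n = send(sock, buf, sizeof buf, 0);
        if (n < 0)
            return -1.0;
        bytes += n;
        gettimeofday(&now, NULL);
        elapsed = (now.tv_sec - t0.tv_sec) + (now.tv_usec - t0.tv_usec) / 1e6;
    } while (elapsed < measure_secs);

    pthread_join(tid, NULL);
    return (bytes * 8) / (elapsed * 1e6);   /* bytes handed to the kernel -> Mbit/s */
}
```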

Salient Results
- Slow-start can last
  - From about 0.2 s on low-latency networks
  - Up to about 5 s on long-haul, high-bandwidth networks; the maximum gains from Iperf QUICK mode are seen here
- Unless we use QUICK mode, we can never be sure the connection is out of slow-start
- QUICK-mode throughput differs from a 20 s Iperf run by less than 10%
- Some tests were even performed on dial-up links (as the receiver), with good results

Web100 Experiences
- A must-use tool (I'm a fan)
- The user APIs can be improved
- Behaves well at a sampling time of 20 ms

Possible Areas to Investigate (Iperf QUICK Mode)
- Integrate with bandwidth tests
- Perform tests with slow senders
- Empirical estimates immediately after slow-start, using the RTT and the rate of increase of the congestion window

Links
- Disk: http://www-iepm.slac.stanford.edu/bw/disk_res.html
- Iperf QUICK mode: http://www-iepm.slac.stanford.edu/bw/iperf_res.html
- Documentation and results of tests with all IEPM-BW managed nodes are available from these links

Other Stuff
- Miniperf: a small Iperf-like program written to
  - Monitor user-specified Web100 variable(s)
  - Allow setting window sizes and test times
  - Optionally include parallel-thread functionality
  - Generate graphs (rate-based, sum-based)
  - Generate HTML
- Created a single Iperf version that runs on IPv4/IPv6, with or without Web100

Thank you!!!