1 CSTS WG Prototyping for Forward CSTS Performance, Boulder, November 2011, Martin Karch

2 Prototyping for Fwd CSTS Performance
Structure of the Presentation:
- Background and Objective
- Measurement Set-Up
- Results
- Summary

3 Background and Objective
Reports from NASA on prototyping the experimental Synchronous Forward Frame Service (CLTU Orange Book):
- The CLTU service approach seriously limits throughput
- The target rate (25 Mbit/s) could only be reached by placing 7 frames of 220 bytes each into the data parameter of one single CLTU
- No radiation reports for individual CLTUs (only for the complete one)
No investigation yet available:
- Why can the throughput not be reached when each frame is transferred in a single CLTU?
- What is the cost of acknowledging every frame?

4 Background and Objective
Based on the reports: suggest blocking of several frames into one data parameter for a potential future Forward Frame CSTS (Process Data Operation / Data Processing Procedure, CSTS FW).
Objective of the prototyping:
- Verify that blocking of data items significantly increases the throughput
- Investigate whether the bottleneck is in the service provisioning (the actual protocol between user and provider)
- Results shall support selection of the most appropriate approach for the Forward CSTS specification
Measurements are made for protocol performance.

5 Measurement Set-Up
- 2 machines, each equipped with Xeon 4C X GHz/1333MHz/8MB and 4 GB memory, running Linux SLES bit
- Isolated LAN: 1 Gbit direct cable connection (no switch)
- SGM (SLE Ground Models)
- NIS (Network Interface System)

6 Measurement Set-Up
Provider: SGM (SLE Ground Models) simulation environment. SGM is changed such that:
- The receiving thread puts CLTUs on a queue for radiation
- A 'Radiation Thread' removes CLTUs and discards them
- There is no further simulation of the radiation process (radiation duration)
User: NIS (Network Interface System) simulation environment. NIS is modified to:
- Create CLTU operation objects as fast as possible
- Immediately pass them to the SLE API for transmission
- There is no interface to a Mission Control System (MCS)
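The modified provider behaviour described above (receive thread enqueues, radiation thread dequeues and discards) is a plain producer/consumer pair. A minimal sketch; all names here are illustrative, not taken from the SGM code:

```python
import queue
import threading

radiation_queue = queue.Queue()  # CLTUs awaiting "radiation"
discarded = []                   # record of dequeued CLTUs (for illustration only)

def radiation_thread(stop):
    # Remove CLTUs from the queue and discard them immediately:
    # the radiation process itself (duration, timing) is not simulated.
    while not (stop.is_set() and radiation_queue.empty()):
        try:
            cltu = radiation_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        discarded.append(cltu)   # "discard" - no actual radiation

def receiving_thread(n):
    # The receiving thread only enqueues CLTUs for radiation.
    for i in range(n):
        radiation_queue.put(f"CLTU-{i}")

stop = threading.Event()
t = threading.Thread(target=radiation_thread, args=(stop,))
t.start()
receiving_thread(5)
stop.set()
t.join()
print(len(discarded))  # 5
```

With radiation reduced to a dequeue-and-discard, the measurement isolates the protocol and API path rather than the physical uplink.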

7 Measurement Set-Up
Basis for all steps: SGM-based provider, NIS-based user.
Step 1 measurements: variation of the CLTU length
- Simulates sending many small CLTUs in one TRANSFER DATA invocation (1st approximation)
Step 2 measurements: SLE API modified to
- Aggregate a configurable number of CLTUs (SEQUENCE OF Cltu)
- With minimum annotation (CLTU id, sequence count)
- Send the return when the last data unit is acknowledged
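The Step 2 aggregation can be sketched as follows: a configurable number of CLTUs is blocked into one transfer unit, each entry carrying only the minimal annotation mentioned above. The field names are illustrative, not the actual ASN.1 module of the SLE API:

```python
def block_cltus(cltus, start_id=0):
    """Aggregate several CLTUs into one transfer unit (cf. SEQUENCE OF Cltu),
    adding only a CLTU id and a sequence count per entry."""
    return [{"cltu_id": start_id + i, "seq_count": i, "data": c}
            for i, c in enumerate(cltus)]

# 7 frames of 220 bytes each, the blocking reported to reach the target rate:
frames = [bytes(220) for _ in range(7)]
blocked = block_cltus(frames)
print(len(blocked), blocked[0]["cltu_id"])
```

One confirmed TRANSFER DATA then covers the whole block, so the return is sent only when the last data unit of the block is acknowledged.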

8 Step 1 / Measurement 1
SGM + NIS model optimised: yes
SLE API optimised: no
Nagle + delayed ack: on
RTT: 0.1 ms
Observations:
- Linear curve, proportional to CLTU size
- Constant processing time, independent of CLTU size

9 Step 1 / Measurement 2
SGM + NIS model optimised: yes
SLE API optimised: yes
Nagle + delayed ack: on
RTT: 0.1 ms

10 Step 1 / Measurement 3
SGM + NIS model optimised: yes
SLE API optimised: yes
Nagle + delayed ack: off
RTT: 0.1 ms
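Switching Nagle's algorithm off is a per-socket option; a minimal sketch of how a measurement client could configure it (the function name is illustrative):

```python
import socket

def make_low_latency_socket():
    # Disable Nagle's algorithm so small PDUs are sent immediately
    # instead of being coalesced while waiting for outstanding ACKs.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Suppressing delayed ACK is platform-specific; on Linux it can be
    # requested with TCP_QUICKACK (and must be re-armed after reads).
    if hasattr(socket, "TCP_QUICKACK"):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
    return s

s = make_low_latency_socket()
nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
print(nodelay)
s.close()
```

Nagle and delayed ACK interact badly for small request/response exchanges (the sender waits for an ACK the receiver is deliberately delaying), which is consistent with the factor-2.5 penalty noted in the summary.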

11 Step 1 / Measurement 4
SGM + NIS model optimised: yes
SLE API optimised: yes
Nagle + delayed ack: off
RTT: 400 ms
Observations:
- Processing time still constant
- Transfer time increased

12 Step 1 / Measurement 5.1
Measurement 5.1: reference measurement with iperf, for the measurements with variation of the RTT
Measurement 5.2: the same measurements using SGM + NIS

13 Step 1 / Measurement 5.2
SGM + NIS model optimised: yes
SLE API optimised: yes
Nagle + delayed ack: off
RTT: variable
Observations:
- Shows the influence of the transmission time only
- The delay is the dominating factor, as expected (throughput ~ 1/RTT)
- Ratio Msmnt/Iperf = (1544)
- Ratio Msmnt/Iperf = (1000)

14 Step 1 / Measurement 5 (2)
Operates with the maximum send and receive buffer.
Question: how big must the window size be to achieve throughput values similar to the above (for the example of 40 Mbit/s)?
Maximum data rate = buffer size / RTT
[Table: Window size | RTT [ms] | CLTU size [byte] | CLTU count | Data [byte] | Data rate [Mbit/s] | Send duration [s] | CLTU rate [#/s] | Send time per CLTU [ms], for window sizes from 64 KB up to several MB; numeric values not preserved in the transcript]
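The rule of thumb above can be inverted to size the window: the sender can have at most one window of unacknowledged data in flight per round trip, so the window must cover the bandwidth-delay product. A small sketch for the 40 Mbit/s example:

```python
def required_window_bytes(target_rate_bps, rtt_s):
    # Invert max_rate = window / RTT:
    # window = rate * RTT (the bandwidth-delay product).
    return target_rate_bps / 8 * rtt_s

# 40 Mbit/s over a 400 ms RTT path:
w = required_window_bytes(40e6, 0.400)
print(round(w / 1e6, 3))  # 2.0 (MB) - far beyond a default 64 KB window
```

This matches the trend of the table: only multi-megabyte windows sustain tens of Mbit/s at large RTTs.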

15 Step 1 Measurements Summary
- Linear increase of the data rate with CLTU length (sending as fast as possible, no network delay); constant processing time
- Best results with optimised code: 5 to 10% performance increase (optimised SLE API only) and Nagle and delayed ack switched off (throughput is a factor of 2.5 lower when Nagle and delayed ack are both on)
- With a network delay of 200 ms (400 ms RTT): performance decrease by a factor of 400 compared to Measurement 2 (the best one)
- Maximum data rate = buffer size / RTT
- We have to take care of the size of the CLTU
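The first summary observation (constant per-CLTU processing time, data rate growing linearly with CLTU size) fits a simple fixed-overhead model. The 100 µs overhead below is an assumed, purely illustrative value, not a measured one:

```python
def throughput_mbps(cltu_size_bytes, t_proc_s, line_rate_bps=1e9):
    # Per-CLTU time = constant processing overhead + serialisation time
    # on a 1 Gbit link. While t_proc dominates, throughput grows almost
    # linearly with the CLTU size.
    t_total = t_proc_s + cltu_size_bytes * 8 / line_rate_bps
    return cltu_size_bytes * 8 / t_total / 1e6

# Assumed 100 us of fixed processing per CLTU (illustrative):
small = throughput_mbps(220, 100e-6)       # one 220-byte frame per CLTU
large = throughput_mbps(7 * 220, 100e-6)   # 7 frames blocked into one CLTU
print(round(small, 1), round(large, 1))
```

Under this model, blocking 7 frames into one CLTU raises throughput by nearly a factor of 7, which is why the size of the CLTU matters so much.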

16 What is the Cost of Confirmed Operations?
Data unit size = 8000 bytes:
- CLTU: Mbps; RAF: Mbps (frame size 8000 bytes, 1 frame/buffer)
- Increase by 53%
Data unit size = 2000 bytes:
- CLTU: 53.36 Mbps; RAF: Mbps (frame size 2000 bytes, 1 frame/buffer)
- Increase by 60%

17 Effects of Buffering (RAF)
[Table: Frame size | Frames/buffer | Mbit/s | Frames/sec | msec/frame; numeric values not preserved in the transcript]
- Concatenation of 80 frames of 100 bytes back-to-back into a buffer, then passed to the API as one frame: Mbps
- Frame size = 2000 bytes, 1 frame/buffer: Mbps
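The buffering gain comes from amortising the fixed per-invocation cost over the frames in the buffer. A sketch of that accounting; the two timing constants are illustrative values chosen so that the 4-frames-per-buffer case reproduces the factor of ~1.9 reported in the summary, not measured numbers:

```python
def per_frame_ms(frames_per_buffer, t_invocation_ms, t_frame_ms):
    # One transfer invocation carries a whole buffer, so its fixed cost
    # is shared by all frames in it; per-frame work is paid per frame.
    return t_invocation_ms / frames_per_buffer + t_frame_ms

# Illustrative constants (fitted, not measured): 0.171 ms per invocation,
# 0.1 ms per 2K frame.
single = per_frame_ms(1, 0.171, 0.1)    # single 2K frames
batched = per_frame_ms(4, 0.171, 0.1)   # 4 frames of 2K per buffer
print(round(single / batched, 2))
```

The model also explains the "BUT" in the summary: one large data unit avoids the per-frame cost entirely, so it beats a buffer of the same total size holding many small units.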

18 Same in Graphical Presentation

19 RAF Measurement Configuration
Frame Generator → (frame) → SLE Service Provider Application → SLE API → (transfer buffer, local TCP) → Communication Server → (TCP) → SLE API → SLE Service User Application → (frame)

20 Cost of ASN.1 Encoding (RAF)
Result of profiling for RAF, frame size 100 bytes, 80 frames per buffer:
- Encoding of the transfer buffer including all contained frames: 6.42%
- Encoding of the transfer buffer invocation alone: 2.31%
Further effects might be caused by increased interactions, interrupts, etc.
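For scale, the kind of work the ASN.1 layer does per frame can be sketched with a minimal BER/DER-style definite-length TLV encoder. This is an illustration of the technique only, not the SLE API's actual encoder or its ASN.1 module:

```python
def encode_tlv(tag, payload):
    # Minimal BER definite-length TLV: tag byte, length octets, contents.
    n = len(payload)
    if n < 0x80:
        length = bytes([n])                      # short form
    else:
        nb = (n.bit_length() + 7) // 8           # long form
        length = bytes([0x80 | nb]) + n.to_bytes(nb, "big")
    return bytes([tag]) + length + payload

# A "transfer buffer" of 80 frames of 100 bytes: each frame wrapped as an
# OCTET STRING (0x04), the whole list wrapped as a SEQUENCE (0x30).
frames = [encode_tlv(0x04, bytes(100)) for _ in range(80)]
buffer_encoding = encode_tlv(0x30, b"".join(frames))
print(len(buffer_encoding))  # 8164
```

The byte overhead is tiny (2 bytes per 100-byte frame here), so the 6.42% figure is CPU time spent walking and wrapping the structures, not wire overhead.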

21 Summary of Observations
- The size of the transferred data unit has a significant impact
- Almost constant end-to-end processing time, independent of buffer size
- Linear increase of the net bit rate with data unit size
- Large impact of network delay, due to TCP (expected)
- Significant additional cost of using confirmed operations
- Buffering of frames vs. transfer as individual frames: 4 frames of 2K per buffer vs. single 2K frames gives a factor of 1.9
- BUT: the throughput for a single large data unit is much higher than for a buffer of the same size containing multiple small units
- ASN.1 encoding accounts, in the worst-case test, for 6.4% of the overall local processing time