Optimised Memory Transfer & Flow Control for High Speed Networks - Codito Technologies Pvt. Ltd. - D Y Patil College of Engineering.



Current Trends in High Speed Networks
- Total network bandwidth triples every 12 months.
- CPU processing power doubles every 18 months.
- Memory performance increases by 10% every 12 months.
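To make the gap concrete, the three growth rates can be compounded over t years. This is plain arithmetic on the rates stated in the slide, not a measurement; B, C, and M are names introduced here for relative network bandwidth, CPU processing power, and memory performance.

```latex
B(t) = 3^{t}, \qquad C(t) = 2^{t/1.5}, \qquad M(t) = 1.1^{t}
\\
t = 3:\quad B = 3^{3} = 27, \qquad C = 2^{2} = 4, \qquad M = 1.1^{3} \approx 1.33
```

At these rates, three years would leave bandwidth about 20 times further ahead of memory performance (27 / 1.33), which is why memory-to-memory copies become the limiting cost.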

Software Challenges in High Speed Networks
- Data movement: reasons for copying; copying as a bottleneck
- Checksum
- Flow control algorithm: large bandwidth-delay product
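The checksum item can be illustrated with the standard 16-bit one's-complement Internet checksum (the algorithm described in RFC 1071). This is a minimal sketch for packet-sized buffers, not the presenters' code; the function name `inet_checksum` is an assumption made for this example. It shows why checksumming is a data-touching cost: every byte of the payload has to be read once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement Internet checksum over a byte buffer.
 * Intended for packet-sized inputs: the 32-bit accumulator is only
 * folded at the end, so very large buffers would need earlier folding. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL || len == 0);

    /* Add the data as big-endian 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }

    /* An odd trailing byte is treated as the high byte of a final word. */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    /* The checksum is the one's complement of the folded sum. */
    return (uint16_t)~sum;
}
```

Because the loop visits each byte, a software checksum costs roughly as much memory traffic as one extra read pass over the data, which is the cost that hardware checksum support would remove.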

Software Requirements Specification
- Product functions: framework for data copy elimination; flow control algorithm for optimal bandwidth utilization
- User characteristics: applications
- Constraints: maximum memory limit per process; CPU speed/network speed

Characterizing Network I/O

Zbuf Framework
Assumptions:
- Memory is not limited.
- Hardware checksumming support would further enhance our implementation.
- The application program may reuse the buffer but not the contents of the buffer. Hence we perform an explicit exchange of buffers.

Zbuf Framework
Modules:
- Zbuf Allocator
- API Calls
- Send Module
- Receive Module

Zbuf Allocator
[Diagram labels: Process A address space, zalloc, zbuf, User Domain, Kernel Domain, Memory, Zbuf Zone]
- User/kernel sharing of memory

Maintaining Zbufs in User Space
[Diagram labels: Zbuf Zone, Zbuf Allocator, User Domain, Kernel Domain, Memory]
- Size < MTU: zd (Zbuf Descriptor) -> Zbuf Table -> zbuf
- Size > MTU: zd (Zbuf Descriptor) -> Zbuf Table -> Zbuf Indirect Table -> zbuf
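One way to read the two cases on this slide is that a zbuf descriptor indexes a table whose entry points either straight at one buffer or at an indirect table of MTU-sized buffers. The sketch below is a user-space toy under that reading, not the framework's implementation: every `toy_` name, the 1500-byte MTU, the 64-entry table, and the use of `malloc` in place of the shared user/kernel Zbuf Zone are assumptions made for illustration. A request of exactly one MTU is treated as the direct case, which the slide does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MTU    1500  /* assumed MTU in bytes */
#define TOY_MAX_ZD 64    /* assumed table capacity */

/* One table entry, indexed by the descriptor (zd). */
struct toy_zentry {
    size_t  nchunks;   /* 0 = free slot, 1 = direct, >1 = indirect */
    void   *direct;    /* the buffer itself when nchunks == 1 */
    void  **indirect;  /* array of buffer pointers when nchunks > 1 */
};

static struct toy_zentry toy_table[TOY_MAX_ZD];

/* Release whatever a slot holds and mark it free. */
void toy_zfree(int zd)
{
    assert(zd >= 0 && zd < TOY_MAX_ZD);
    struct toy_zentry *e = &toy_table[zd];
    if (e->indirect != NULL) {
        for (size_t i = 0; i < e->nchunks; i++)
            free(e->indirect[i]);
        free(e->indirect);
    }
    free(e->direct);
    e->nchunks = 0;
    e->direct = NULL;
    e->indirect = NULL;
}

/* Allocate size bytes in MTU-sized units; return a descriptor or -1. */
int toy_zalloc(size_t size)
{
    if (size == 0)
        return -1;
    size_t nchunks = (size + TOY_MTU - 1) / TOY_MTU;

    for (int zd = 0; zd < TOY_MAX_ZD; zd++) {
        struct toy_zentry *e = &toy_table[zd];
        if (e->nchunks != 0)
            continue;                       /* slot already in use */
        if (nchunks == 1) {                 /* fits in one MTU: direct */
            e->direct = malloc(TOY_MTU);
            if (e->direct == NULL)
                return -1;
            e->nchunks = 1;
            return zd;
        }
        /* Larger than one MTU: go through an indirect table. */
        e->indirect = calloc(nchunks, sizeof *e->indirect);
        if (e->indirect == NULL)
            return -1;
        e->nchunks = nchunks;
        for (size_t i = 0; i < nchunks; i++) {
            e->indirect[i] = malloc(TOY_MTU);
            if (e->indirect[i] == NULL) {
                toy_zfree(zd);              /* undo the partial allocation */
                return -1;
            }
        }
        return zd;
    }
    return -1;                              /* table full */
}

/* Return the i-th MTU-sized buffer behind a descriptor, or NULL. */
void *toy_zchunk(int zd, size_t i)
{
    if (zd < 0 || zd >= TOY_MAX_ZD || i >= toy_table[zd].nchunks)
        return NULL;
    struct toy_zentry *e = &toy_table[zd];
    return e->nchunks == 1 ? e->direct : e->indirect[i];
}
```

For example, `toy_zalloc(512)` yields a descriptor with a single direct buffer, while `toy_zalloc(4000)` yields one whose three buffers are reached through the indirect array. Handing out an integer descriptor rather than a pointer keeps the table, and not the application, in control of which physical buffer a descriptor refers to.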

API Calls
    int zalloc(size_t size)
    int z_send(int sockfd, int zd, size_t len, int flags)
    int z_recv(int sockfd, int zd, size_t len, int flags)
    size_t zfread(int zd, size_t size, size_t nmemb, FILE *fp)
    size_t zfwrite(int zd, size_t size, size_t nmemb, FILE *fp)

Send Module
[Diagram labels: User Domain, Kernel Domain, Process A address space, z_send, zbuf, Socket Queue, Device Driver Interface]
- Explicit exchange of buffers
- Zbufs are allocated in MTU-sized units.
- Checksumming?
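The explicit exchange of buffers on the send path can be pictured as swapping buffer ownership instead of copying payload bytes. The following is a minimal sketch of that idea only; `struct toy_slot` and `toy_exchange` are hypothetical names, and the real framework's queue structures are not shown in the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer reference held either by the
 * application or by the socket queue. */
struct toy_slot {
    unsigned char *buf;  /* start of an MTU-sized buffer */
    size_t         len;  /* number of valid bytes in buf */
};

/* Hand the application's filled buffer to the queue and give the
 * application the queue's spare buffer in return. Only pointers and
 * lengths change hands; no payload byte is copied. */
void toy_exchange(struct toy_slot *app, struct toy_slot *queue)
{
    assert(app != NULL && queue != NULL);

    struct toy_slot tmp = *app;
    *app = *queue;
    *queue = tmp;

    /* The application gets a usable buffer back but must not rely on
     * its contents. */
    app->len = 0;
}
```

The cost of the exchange is constant regardless of payload size, whereas a copying send grows with the number of bytes. This also matches the stated assumption that the application may reuse a buffer but not its contents.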

Receive Module
[Diagram labels: Zbuf Allocator, Socket Queue, z_recv, zbuf, User Domain, Kernel Domain, Process A address space, Device Driver Interface]
- Zbuf port registration
- Zbufs are grouped together.

Flow Control Algorithms for High-Speed Networks
- Bandwidth-delay product: 1 Gb/s WAN * 100 ms RTT = 100 Mb
- TCP Reno: re-convergence to the optimal bandwidth takes a long time.
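The bandwidth-delay figure, and the reason Reno-style recovery is slow at this scale, can be checked with a few lines of arithmetic. This is a simplified sketch: the function names are introduced here, and the recovery estimate assumes a 1500-byte segment, a window that halves on loss, and growth of one segment per round trip, none of which are stated in the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidth-delay product in bits: link rate (bit/s) times RTT (ms). */
uint64_t bdp_bits(uint64_t rate_bps, uint64_t rtt_ms)
{
    return rate_bps * rtt_ms / 1000;
}

/* Round trips needed to grow from half the window back to the full
 * window when the sender adds one segment per round trip (assumed
 * additive increase after a single loss). */
uint64_t reno_recovery_rtts(uint64_t bdp_in_bits, uint64_t segment_bytes)
{
    assert(segment_bytes > 0);
    uint64_t window_segments = bdp_in_bits / 8 / segment_bytes;
    return window_segments - window_segments / 2;
}
```

With the slide's numbers, `bdp_bits(1000000000, 100)` gives 100,000,000 bits (100 Mb, or 12.5 MB). Under the stated assumptions that is a window of about 8333 segments, so `reno_recovery_rtts` returns 4167 round trips, which at 100 ms each is roughly 417 seconds to refill the pipe after one loss.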

Packet-Pair Flow Control
- Feedback-based adaptive rate control scheme.
- Estimates the network state by sending data packets in pairs.
- Adapts to changes in the network state within one RTT.
Assumptions:
- The network consists of FQ (fair queueing) servers.
- Routing table updates are infrequent.
- Packets are of the same size.

Packet Pair
[Diagram labels: Source, Server 1, Bottleneck, Sink, rate estimate, RTT, bottleneck rate]

Packet Pair Probes
- Inter-ACK spacing estimates the bottleneck service rate.
- Pipeline depth is determined within one RTT.
- The sending rate is modified according to the inter-ACK spacing.
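The probe arithmetic can be sketched as follows: two equal-sized packets sent back to back leave the bottleneck separated by one service time, so the gap between their ACKs gives the service rate, and that rate multiplied by the RTT gives the pipeline depth. The function names and the choice of units are assumptions for this example; filtering of noisy estimates, which a real sender would need, is left out.

```c
#include <assert.h>

/* Bottleneck service rate in bits per second, estimated from the
 * spacing between the ACKs of two back-to-back packets. */
double pp_bottleneck_bps(double packet_bits, double ack_gap_s)
{
    assert(ack_gap_s > 0.0);
    return packet_bits / ack_gap_s;
}

/* Pipeline depth: how many packets fit in flight during one RTT at
 * the estimated bottleneck rate. */
double pp_pipeline_packets(double rate_bps, double rtt_s, double packet_bits)
{
    assert(packet_bits > 0.0);
    return rate_bps * rtt_s / packet_bits;
}
```

For instance, 1500-byte packets (12,000 bits) whose ACKs arrive 12 microseconds apart suggest a bottleneck of about 1 Gb/s, and over a 100 ms RTT that corresponds to a pipeline of roughly 8333 packets. Both values come from a single pair, which is why the scheme can react within one RTT.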

Design Strategy
- State machine: Startup; Queue Priming; Normal Transmission
- Buffer management strategy
- Retransmission strategy

State Chart Diagram for Packet Pair

Testing Strategy

Testing Strategy – II
- Bottom-up integration for the Zbuf framework
- Smoke testing for Packet Pair

Zbuf Framework Statistics

Conclusion
- The Zbuf framework is scalable across MTU sizes.
- The Zbuf framework is scalable across multiprocessing environments.
- Use of the Packet Pair algorithm on a high-speed network would result in optimal utilization of bandwidth.