FreeBSD Network Stack Performance Srinivas Krishnan University of North Carolina at Chapel Hill.

Presentation transcript:

FreeBSD Network Stack Performance Srinivas Krishnan University of North Carolina at Chapel Hill

Outline: Introduction; Unix network stack improvements; Bottlenecks; Memory copies; Interrupt processing; Zero copy implementation; Receive Live Lock solution

Introduction [Diagram of the packet path. Components: NIC, IP Queue, Transport + Network, Socket Queue. Labels: Packet, Soft Interrupt, Memory Copy, Kernel Processing, User Processing]

Network Stack Reinvented: Van Jacobson's Net Channels. Create a high-speed channel from the NIC to user space; push all processing to user space; apply E2E "truly"; preserve cache coherency for multiprocessor systems. BETTER INTERRUPT PROCESSING

Network Stack Reinvented: Ulrich Drepper's Asynchronous Network I/O. Asynchronous sockets; true zero copy; no locking; event channels. BETTER MEMORY PROCESSING

Reduce Memory Copies. Sending side: copy from user buffer to kernel buffer, then copy from kernel buffer to device buffer. Receive side: copy from device buffer to kernel buffer, then copy from kernel buffer to socket buffer.

Zero Copy Send [Diagram: write -> userspace pages (RAM) -> page-sized chunks -> external mbuf -> DMA into driver buffer -> NIC]
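The "page-sized chunks" step of the send path can be illustrated in user space. This is only a sketch of the idea, not the kernel mechanism: `page_chunks` is a hypothetical helper, the 4096-byte page size is an assumption, and Python's `memoryview` stands in for an external mbuf that references the caller's pages instead of copying them.

```python
PAGE_SIZE = 4096  # assumed page size; the real value is platform dependent


def page_chunks(buf, page_size=PAGE_SIZE):
    """Split a buffer into page-sized views without copying any bytes.

    Each returned memoryview references the caller's buffer, loosely
    mirroring how an external mbuf points at user pages (analogy only).
    """
    view = memoryview(buf)
    return [view[off:off + page_size] for off in range(0, len(view), page_size)]


data = bytearray(10000)      # stand-in for the user buffer passed to write
chunks = page_chunks(data)   # three views: 4096 + 4096 + 1808 bytes
data[0] = 0xFF               # a later change is visible through the view,
                             # which shows that no copy was made
```

The last chunk is shorter than a page, which is one reason a real zero-copy path still needs a fallback for partial pages.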

Zero Copy Read [Diagram: NIC -> packet -> DMA -> kernel buffer (kernel space) -> user buffer (user space), reached through read(fd, buf, s)]

Zero Copy: allocate an external mbuf pool. The NIC MTU has to be >= 4K: use an Intel Pro1000 NIC with jumbo frames. On a 3Com NIC, turn on the DMA buffer and stitch the data together (added overhead).

Page Flipping: on read(….), check the mbuf length against the page size. Less than 1 page: use copyout. At least 1 page: use vm_pgmoveco (……) to exchange the kernel page and the user page.
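The page-flipping decision can be sketched as a small planning function. This is a user-space illustration under stated assumptions: `plan_receive` is a hypothetical name, the 4096-byte page size is assumed, and handling the sub-page tail of a larger buffer by copying is a guess that the slide does not spell out.

```python
PAGE_SIZE = 4096  # assumed page size


def plan_receive(mbuf_len, page_size=PAGE_SIZE):
    """Decide how many whole pages to flip and how many bytes to copy.

    Buffers shorter than one page are copied (the copyout branch);
    whole pages are flipped (the vm_pgmoveco branch). Copying the
    leftover tail of a larger buffer is an assumption of this sketch.
    """
    if mbuf_len < 0:
        raise ValueError("mbuf_len must be non-negative")
    if mbuf_len < page_size:
        return {"flipped_pages": 0, "copied_bytes": mbuf_len}
    full_pages, tail = divmod(mbuf_len, page_size)
    return {"flipped_pages": full_pages, "copied_bytes": tail}
```

With a standard 1500-byte MTU every packet falls into the copy branch, which is consistent with the requirement that the NIC MTU be at least 4K for zero copy to pay off.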

Preliminary Results: 1500-byte MTU (iostat trace) for 10 minutes

Processing Interrupts. Main processing: hard interrupt from NIC to driver; soft interrupt from IP queue to processing. Reduce user-level and interrupt-thread processing. Problem: Receive Live Locks

Receive Live Lock: send a large stream of UDP packets that exceeds the receiver's buffer capacity. The CPU is spent processing network packets, so goodput = 0.
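A toy model makes the collapse to zero goodput concrete. Everything here is an assumption for illustration: the function name, the per-packet costs, and the simplification that interrupt work always preempts delivery to the application. It is not a measurement from the talk.

```python
def goodput(offered_pps, cpu_budget_us=1_000_000,
            irq_cost_us=10, deliver_cost_us=15):
    """Packets per second that reach the application in a toy model.

    Interrupt-level work (irq_cost_us per packet) has absolute priority.
    Only the CPU time left over in each second is available for
    delivering packets to the application (deliver_cost_us each).
    All costs are made-up values in microseconds.
    """
    irq_time = offered_pps * irq_cost_us
    if irq_time >= cpu_budget_us:
        return 0.0  # all CPU time goes to interrupts: receive livelock
    leftover = cpu_budget_us - irq_time
    return min(offered_pps, leftover / deliver_cost_us)
```

In this model goodput rises with offered load up to a peak, then falls as interrupt handling eats the budget, and reaches zero once interrupts alone saturate the CPU.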

Implementation Design [Diagram of the packet path with the added components. Components: NIC, Driver Queue, Scheduler, IP Queue, Transport + Network, Socket Queue. Label: Packet]

Components: all UDP packets are queued in the driver queue. The scheduler is triggered by the arrival of the first UDP packet, checks the queue every n ms (currently 1-2 ms), and schedules the packet departure rate based on timestamps.
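One way to picture timestamp-based departure scheduling is a simple pacer that spaces packets at a target rate. This is a sketch under assumptions: `schedule_exits` is a hypothetical name, times are in seconds, and fixed spacing of 1/rate is one plausible policy rather than the one the talk implements.

```python
def schedule_exits(arrivals, rate_pps):
    """Assign each packet an exit time so departures never exceed rate_pps.

    arrivals: arrival timestamps in seconds, in non-decreasing order.
    A packet leaves at its arrival time or one gap after the previous
    departure, whichever is later.
    """
    if rate_pps <= 0:
        raise ValueError("rate_pps must be positive")
    gap = 1.0 / rate_pps
    exits = []
    next_free = float("-inf")  # earliest time the next packet may leave
    for t in arrivals:
        exit_time = max(t, next_free)
        exits.append(exit_time)
        next_free = exit_time + gap
    return exits
```

For example, a burst of three packets arriving at t = 0 with a rate of 2 packets per second leaves at 0.0, 0.5 and 1.0 seconds, which smooths the burst at the cost of added delay.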

Driver Queue Algorithm: set the maximum and average rates. The driver queue maintains: the average queue length (weighted over time); the current rate of transfer; the timestamps of packets.
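The slide does not say how the queue length is "weighted over time". An exponentially weighted moving average is one common choice, sketched below; the class name, the `weight` parameter and its default are assumptions of this example.

```python
class DriverQueueStats:
    """Tracks a time-weighted average queue length (EWMA, assumed form)."""

    def __init__(self, weight=0.25):
        if not 0.0 < weight <= 1.0:
            raise ValueError("weight must be in (0, 1]")
        self.weight = weight        # share given to the newest sample
        self.avg_queue_len = 0.0    # running average, starts empty

    def update(self, queue_len):
        """Fold one queue-length sample into the running average."""
        self.avg_queue_len = ((1.0 - self.weight) * self.avg_queue_len
                              + self.weight * queue_len)
        return self.avg_queue_len
```

A small weight makes the average slow to react to bursts; a weight near 1 makes it track the instantaneous queue length.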

Algorithm (cont). If current_rate > average_rate: drop N packets such that current_rate == average_rate. If current_rate > max_rate (spike): drop all packets; reduce time wait in queue. If current queue size < threshold: schedule packet exit such that rate == average_rate; append an exit time to each packet.
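The drop rules above can be sketched as a single decision function. The name `packets_to_drop` and the proportional choice of N are assumptions: the slide only states that N should bring the current rate down to the average rate. The spike test runs first because a rate above the maximum is also above the average.

```python
def packets_to_drop(queued, current_rate, average_rate, max_rate):
    """Return how many of the queued packets to drop.

    Spike (current_rate > max_rate): drop everything.
    Above average: keep the fraction average_rate / current_rate of the
    queue, which is one way to pick N (an assumption of this sketch).
    Otherwise: drop nothing.
    """
    if current_rate > max_rate:
        return queued
    if current_rate > average_rate:
        keep = int(queued * average_rate / current_rate)
        return queued - keep
    return 0
```

Packets that survive this step would then be given exit times by the scheduler so that they leave at the average rate.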

Pros and Cons: easy implementation that requires no scheduling changes; reduces CPU utilization in the worst case by ~25%; low overhead; introduces added jitter.

Experimental Setup [Diagram: a sender (Send UDP Data, Intel Pro1000 NICs) and a receiver (Receive UDP Data, Intel Pro1000 NICs). Measurements: iostat trace, netstat trace, custom queue stats]

Queue Stats: at the receiver, collect the average queue size, CPU utilization, packet drops, and total number of packets processed.

Receive Live Lock

Receive Live Lock (soln)

Future Work: feedback from the socket queue and IP queue, so that the weighted average is computed over all 3 queues. Drop at the driver before DMA: the driver buffer is not large enough to keep the weighted queue size. Feedback from the driver queue scheduler to the driver to drop.

Questions?