TCPivo A High-Performance Packet Replay Engine Wu-chang Feng Ashvin Goel Abdelmajid Bezzaz Wu-chi Feng Jonathan Walpole.

Slides:



Advertisements
Similar presentations
Simulation of Feedback Scheduling Dan Henriksson, Anton Cervin and Karl-Erik Årzén Department of Automatic Control.
Advertisements

IP puzzles, probabilistic networking, and other projects at Wu-chang Feng Louis Bavoil Damien Berger Abdelmajid Bezzaz Francis Chang Jin Choi.
Lecture 12 Page 1 CS 111 Online Devices and Device Drivers CS 111 On-Line MS Program Operating Systems Peter Reiher.
Transparent Checkpoint of Closed Distributed Systems in Emulab Anton Burtsev, Prashanth Radhakrishnan, Mike Hibler, and Jay Lepreau University of Utah,
Geoff Salmon, Monia Ghobadi, Yashar Ganjali, Martin Labrecque, J. Gregory Steffan University of Toronto.
1 Web Server Performance in a WAN Environment Vincent W. Freeh Computer Science North Carolina State Vsevolod V. Panteleenko Computer Science & Engineering.
VIA and Its Extension To TCP/IP Network Yingping Lu Based on Paper “Queue Pair IP, …” by Philip Buonadonna.
UC Berkeley 1 A Disk and Thermal Emulation Model for RAMP Zhangxi Tan and David Patterson.
G Robert Grimm New York University Disco.
1 Soft Timers: Efficient Microsecond Software Timer Support For Network Processing Mohit Aron and Peter Druschel Rice University Presented By Jonathan.
Accurate and Efficient Replaying of File System Traces Nikolai Joukov, TimothyWong, and Erez Zadok Stony Brook University (FAST 2005) USENIX Conference.
UC Berkeley 1 Time dilation in RAMP Zhangxi Tan and David Patterson Computer Science Division UC Berkeley.
Supporting Time-sensitive Application on a Commodity OS Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow, Jonathan Walpole Presented by Wen Sun Some Slides.
Supporting Time-Sensitive Applications on a Commodity OS by Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow, Jonathan Walpole Jimi Watson.
Soft Timers: Efficient Microsecond Software Timer Support For Network Processing Mohit Aron and Peter Druschel Rice University Presented by Reinette Grobler.
Performance Evaluation of Load Sharing Policies on a Beowulf Cluster James Nichols Marc Lemaire Advisor: Mark Claypool.
Supporting Time-sensitive Application on a Commodity OS By Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow, Jonathan Walpole Presenter: Shuping Tien.
CS533 Concepts of Operating Systems Class 10 Fine Grain Timing.
Embedded Transport Acceleration Intel Xeon Processor as a Packet Processing Engine Abhishek Mitra Professor: Dr. Bhuyan.
CS533 Concepts of Operating Systems Class 18 Fine Grain Timing.
Protocol Implementation An Engineering Approach to Computer Networking.
Xen and the Art of Virtualization Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield.
Xen and the Art of Virtualization. Introduction  Challenges to build virtual machines Performance isolation  Scheduling priority  Memory demand  Network.
TCP Servers: Offloading TCP/IP Processing in Internet Servers
Hosting Virtual Networks on Commodity Hardware VINI Summer Camp.
Supporting Time-Sensitive Applications on a Commodity OS A. Goel, L. Abeni, C. Krasic, J. Snow and J. Walpole Proceedings of USENIX 5th Symposium on Operating.
LiNK: An Operating System Architecture for Network Processors Steve Muir, Jonathan Smith Princeton University, University of Pennsylvania
High Performance User-Level Sockets over Gigabit Ethernet Pavan Balaji Ohio State University Piyush Shivam Ohio State University.
A TCP/IP transport layer for the DAQ of the CMS Experiment Miklos Kozlovszky for the CMS TriDAS collaboration CERN European Organization for Nuclear Research.
Cluster Computers. Introduction Cluster computing –Standard PCs or workstations connected by a fast network –Good price/performance ratio –Exploit existing.
OPERATING SYSTEMS Goals of the course Definitions of operating systems Operating system goals What is not an operating system Computer architecture O/S.
Efficient Packet Classification with Digest Caches Francis Chang Wu-chang Feng Wu-chi Feng Kang Li.
Our work on virtualization Chen Haogang, Wang Xiaolin {hchen, Institute of Network and Information Systems School of Electrical Engineering.
Real-Time Linux Evaluation NASA Glenn Research Center Kalynnda Berens Richard Plastow
Reference: Ian Sommerville, Chap 15  Systems which monitor and control their environment.  Sometimes associated with hardware devices ◦ Sensors: Collect.
Srihari Makineni & Ravi Iyer Communications Technology Lab
Replicating Memory Behavior for Performance Skeletons Aditya Toomula PC-Doctor Inc. Reno, NV Jaspal Subhlok University of Houston Houston, TX By.
Intel Research & Development ETA: Experience with an IA processor as a Packet Processing Engine HP Labs Computer Systems Colloquium August 2003 Greg Regnier.
1 Soft Timers: Efficient Microsecond Software Timer Support For Network Processing Mohit Aron and Peter Druschel Rice University Presented By Oindrila.
Security Architecture and Design Chapter 4 Part 1 Pages 297 to 319.
An Efficient Gigabit Ethernet Switch Model for Large-Scale Simulation Dong (Kevin) Jin.
Supporting Time Sensitive Application in a Commodity OS Ashvin Goel, Luca, Charles, Jim, Jonathan Walpole Oregon Graduate Institute, Portland Presented.
In a multitasking environment : what is a timeslice (quantum)? what is a context switch? When a large number of processes (jobs) are multitasking the actual.
An Efficient Gigabit Ethernet Switch Model for Large-Scale Simulation Dong (Kevin) Jin.
An Efficient Threading Model to Boost Server Performance Anupam Chanda.
Disco: Running Commodity Operating Systems on Scalable Multiprocessors Presented by: Pierre LaBorde, Jordan Deveroux, Imran Ali, Yazen Ghannam, Tzu-Wei.
Time Management.  Time management is concerned with OS facilities and services which measure real time.  These services include:  Keeping track of.
Operating Systems: Summary INF1060: Introduction to Operating Systems and Data Communication.
Cluster Computers. Introduction Cluster computing –Standard PCs or workstations connected by a fast network –Good price/performance ratio –Exploit existing.
January 8, 2001 SPC Tutorial 1 Washington WASHINGTON UNIVERSITY IN ST LOUIS Agenda 9:00 SPC Hardware -- John DeHart 9:45 SPC Software -- John DeHart 10:30.
16 th IEEE NPSS Real Time Conference 2009 IHEP, Beijing, China, 12 th May, 2009 High Rate Packets Transmission on 10 Gbit/s Ethernet LAN Using Commodity.
Embedded Real-Time Systems Processing interrupts Lecturer Department University.
Soft Timers : Efficient Microsecond Software Timer Support for Network Processing - Mohit Aron & Peter Druschel CS533 Winter 2007.
Real-Time Performance of Linux “A Measurement-Based Analysis of the Real-Time Performance of Linux” (L. Abeni, A. Goel, C. Krasic, J. Snow, J. Walpole)
Current Generation Hypervisor Type 1 Type 2.
Andy Wang COP 5611 Advanced Operating Systems
Course Introduction Dr. Eggen COP 6611 Advanced Operating Systems
Andy Wang COP 5611 Advanced Operating Systems
Final Review CS144 Review Session 9 June 4, 2008 Derrick Isaacson
Chapter 6: CPU Scheduling
CS510 Operating System Foundations
Andy Wang COP 5611 Advanced Operating Systems
Chapter 6: CPU Scheduling
Andy Wang COP 5611 Advanced Operating Systems
Supporting Time-Sensitive Applications on a Commodity OS
Chapter-1 Computer is an advanced electronic device that takes raw data as an input from the user and processes it under the control of a set of instructions.
Andy Wang COP 5611 Advanced Operating Systems
Low Overhead Interrupt Handling with SMT
Cluster Computers.
Chapter 13: I/O Systems “The two main jobs of a computer are I/O and [CPU] processing. In many cases, the main job is I/O, and the [CPU] processing is.
Presentation transcript:

TCPivo A High-Performance Packet Replay Engine Wu-chang Feng Ashvin Goel Abdelmajid Bezzaz Wu-chi Feng Jonathan Walpole

Motivation Many methods for evaluating network devices Simulation Device simulated, traffic simulated ns-2, IXP network processor simulator Model-based emulation Actual device, traffic synthetically generated from models IXIA traffic generator Trace-driven emulation Actual device, actual traffic trace Particularly good for evaluating functions that rely on actual address mixes and packet interarrival/size distributions

Goal of work Packet replay tool for trace-driven evaluation Accurate High-performance Low-cost Commodity hardware Open-source software Solution: TCPivo Accurate replay above OC-3 rates Pentium 4 Xeon 1.8GHz Custom Linux kernel with ext3 Intel Mbs ~$2,000

Challenges Trace management Getting packets from disk Timer mangement Time-triggering packet transmission Scheduling and pre-emption Getting control of the OS Efficient sending loop Sending the packet

Trace management problem Getting packets from disk Requires intelligent pre-fetching Most OSes support transparent pre-fetch via fread() Default Linux fread()latency reading trace

Trace management in TCPivo Double-buffered pre-fetching mmap()/madvise()with sequential access hint

Timer management problem Must accurately interrupt OS to send packets Approaches Polling loop Spin calling gettimeofday()until time to send High overhead, accurate usleep() Register timer interrupt Low overhead, potentially inaccurate Examine each approach using fixed workloads 1 million packet trace Constant-interarrival times d=70 msec, d=2500 msec

Timer management problem Polling loop d=70 msec 78% User-space CPU utilization d=2500 msec 99% User-space CPU utilization

Timer management problem usleep() d=70 msec 40% User-space CPU utilization d=2500 msec 4% User-space CPU utilization

Timer management in TCPivo “Firm timers” Combination of periodic and one-shot timers in x86 PIT (programmable interval timer) APIC (advanced programmable interrupt controller) Use PIT to get close, use APIC to get the rest of the way Timer reprogramming and interrupt overhead reduced via soft timers approach Transparently used via changes to usleep()

Timer management in TCPivo Firm timers d=70 msec 19% User-space CPU utilization d=2500 msec 1% User-space CPU utilization

Scheduling and pre-emption problem Getting control of the OS when necessary Low-latency, pre-emptive kernel patches Reduce length of critical sections Examine performance under stress I/O workload File system stress test Continuously open/read/write/close an 8MB file Memory workload (see paper)

Scheduling and pre-emption problem Firm timer kernel without low-latency and pre- emptive patches I/O Workload, d=70msec

Scheduling and pre-emption in TCPivo Firm timer kernel with low-latency and pre-emptive patches I/O Workload, d=70msec

Efficient sending loop in TCPivo Zeroed payload Optional pre-calculation of packet checksums

Putting it all together On the wire accuracy d=70msec workload at the sender Point-to-point Gigabit Ethernet link Measured inter-arrival times of packets at receiver

Software availability TCPivo Formerly known as NetVCR before an existing product of the same name forced a change to a less catchier name. Linux 2.4 Firm timers Andrew Morton's low-latency patch Robert Love's pre-emptive patch Linux 2.5 Low-latency, pre-emptive patches included High-resolution timers via 1ms PIT (No firm timer support)

Open issues Multi-gigabit replay Zero-copy TOE SMP Accurate, but not realistic for evaluating everything Open-loop (not good for AQM) Netbed/PlanetLab? Requires on-the-fly address rewriting

Questions?