Improving the Performance of the Linux Network Subsystem King Fahd University of Petroleum and Minerals (KFUPM) INFORMATION AND COMPUTER SCIENCE DEPARTMENT.


Improving the Performance of the Linux Network Subsystem
King Fahd University of Petroleum and Minerals (KFUPM)
Information and Computer Science Department
Dr. K. Salah
April 22, 2007
Dhahran, Saudi Arabia

Project Seminar 2007 Dhahran, Saudi Arabia KFUPM, ICS Department K. Salah

Agenda
Introduction
Receive-livelock Phenomenon
Existing Schemes
Previous Work. Why Hybrid Scheme?
Problem Statement
Project Objectives
Equipment
Project Phases and Scheduling
Benefits and Utilizations
Budget
Summary

Introduction
High-speed network devices are widely deployed.
Gigabit Ethernet technology supports 1 Gb/s and 10 Gb/s raw bandwidth.
The network performance bottleneck has shifted to servers and end hosts.
The large increase in bandwidth can hurt OS performance because of the interrupt overhead caused by incoming gigabit traffic.
Because interrupt handling has higher priority than other processing, this leads to the receive-livelock phenomenon.
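To give a sense of scale for the interrupt load, the following rough calculation estimates the worst-case packet arrival rate on a Gigabit Ethernet link. It assumes minimum-size 64-byte frames plus the standard 8-byte preamble and 12-byte inter-frame gap, and, under normal interruption, one interrupt per packet. The function name is illustrative and does not come from the slides.

```python
def max_packet_rate(link_bps, frame_bytes=64):
    """Worst-case packets per second on an Ethernet link of the given bit rate."""
    preamble_bytes = 8      # preamble plus start-of-frame delimiter
    inter_frame_gap = 12    # mandatory idle time between frames, in byte times
    bits_on_wire = (frame_bytes + preamble_bytes + inter_frame_gap) * 8
    return link_bps / bits_on_wire


for gbps in (1, 10):
    rate = max_packet_rate(gbps * 1e9)
    # Time budget the host has per packet if it must keep up with line rate.
    print(f"{gbps} Gb/s: {rate:,.0f} packets/s, {1e9 / rate:.1f} ns per packet")
```

At 1 Gb/s this comes to roughly 1.49 million packets per second, so a host that takes one interrupt per packet has well under a microsecond to handle each one.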

Typical Architecture Model

Packet Arrival Rate - Slow
[Figure labels: Network traffic, Host system, Protocol Stack, Applications]

Packet Arrival Rate - Fast
[Figure labels: Network traffic, Host system, Protocol Stack, Applications; two elements are marked with an X]

Receive-livelock Phenomenon (Source: K. K. Ramakrishnan, 1993)
[Figure: throughput versus offered load, with curves labeled Ideal, Acceptable, and Livelock, and the MLFRR (maximum loss-free receive rate) marked]

Existing Schemes
Normal Interruption
Interrupt Disabling and Enabling
Polling
–Pure Polling vs. NAPI Polling
Interrupt Coalescing (IC)
Hybrid Scheme

Interrupt Disabling and Enabling
The idea of the pure interrupt disable-enable scheme is to keep the interrupts of incoming packets turned off (disabled) as long as there are packets to be processed by the kernel’s protocol stack, i.e., while the protocol buffer is not empty.
When the buffer is empty, the interrupts are turned on again (re-enabled).
–Any packets that arrive while the interrupts are disabled are DMA’d quietly to the protocol buffer without incurring any interrupt overhead.
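The effect of this scheme on the interrupt count can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the slides: every packet takes the same fixed processing time, the cost of the interrupt itself is ignored, and the function name is hypothetical.

```python
def count_interrupts(arrival_times, service_time):
    """Count interrupts raised under the interrupt disable-enable scheme.

    An interrupt is raised only when a packet arrives while the protocol
    buffer is empty (interrupts enabled). Packets that arrive while the
    kernel is still draining the buffer are queued silently.
    """
    interrupts = 0
    busy_until = 0.0  # time at which the buffer drains and interrupts are re-enabled
    for t in sorted(arrival_times):
        if t >= busy_until:
            # Buffer empty, interrupts enabled: this packet raises an interrupt.
            interrupts += 1
            busy_until = t + service_time
        else:
            # Interrupts disabled: the packet is DMA'd quietly and just
            # extends the time needed to drain the buffer.
            busy_until += service_time
    return interrupts


arrivals = [0, 1, 2, 3]
print(count_interrupts(arrivals, service_time=0.5))  # light load
print(count_interrupts(arrivals, service_time=2.0))  # heavy load
```

Under light load every packet finds the buffer empty and raises its own interrupt; under heavy load a single interrupt covers the whole burst, which is the behavior the slide describes.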

Polling
Disable the interrupts of incoming packets altogether, thus eliminating interrupt overhead completely.
The OS periodically polls its host system memory (i.e., the protocol processing buffer or DMA Rx ring) to find packets to process.
–In general, exhaustive polling is rarely implemented. Polling with a quota is usually the case, whereby only a maximum number of packets is processed in each poll in order to leave some CPU power for application processing.
Polling has two drawbacks.
–First, unsuccessful polls can occur because packets are not guaranteed to be present in host memory at all times, and thus CPU power is wasted.
–Second, incoming packets are not processed immediately, because they are queued until they are polled.
Selecting the polling period is crucial.
–Very frequent polling can be detrimental to performance, as significant overhead can be incurred at each poll.
–On the other hand, if polling is performed infrequently, packets may encounter long delays.
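Polling with a quota, and the unsuccessful poll it can lead to, can be sketched as follows. The Rx ring is modeled as a plain queue of packet sequence numbers; the function name and the quota value are illustrative assumptions, not taken from the slides or from the Linux kernel.

```python
from collections import deque


def poll(rx_ring, quota):
    """Process at most `quota` packets from the ring and return them."""
    handled = []
    while rx_ring and len(handled) < quota:
        handled.append(rx_ring.popleft())
    return handled


rx_ring = deque(range(10))          # ten queued packets, numbered 0..9
for _ in range(4):
    batch = poll(rx_ring, quota=4)
    # An empty batch is an unsuccessful poll: CPU time spent, nothing processed.
    print(batch if batch else "unsuccessful poll")
```

The quota bounds the work done per poll, which leaves CPU time for applications, while the final empty poll shows the wasted effort named as the first drawback above.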

Pure Polling vs. NAPI Polling

Pure Polling vs. NAPI Polling

Shortcomings of NAPI
Rotten Packets
–When NAPI re-enables interrupts, one or more packets may sneak in during that time and go undetected until a fresh packet arrives. These packets are known as “rotten packets”.
Poor Performance with CPU-bound Applications
–NAPI was reported not to perform well on hosts that are heavily loaded with CPU-bound applications. The cause is that polling is scheduled through Linux softIRQs: CPU-bound user applications compete with softIRQs for the CPU, so softIRQs (and NAPI) get fewer chances to run.

Interrupt Coalescing
Most network adapters (NICs) are manufactured with support for interrupt coalescing (IC).
In IC, the NIC generates a single interrupt for a group of incoming packets.
–This is in contrast to normal interruption mode, in which the NIC generates an interrupt for every incoming packet.
There are two schemes to mitigate the rate of interrupts:
–Count-based IC: the NIC generates an interrupt when a predefined number of packets has been received.
–Time-based IC: the NIC waits a predefined time period before it generates an interrupt. During this time period multiple packets can be received.
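The two coalescing schemes can be compared with a short sketch that returns the times at which interrupts would fire for a given list of packet arrival times. The function names are hypothetical. Two simplifications are assumed here: the time-based timer is taken to start at the first packet that arrives after the previous interrupt, and the count-based variant has no fallback timeout, so a trailing partial batch never raises an interrupt in this model.

```python
def count_based_ic(arrival_times, batch_size):
    """Interrupt times when the NIC interrupts after every `batch_size` packets."""
    times = sorted(arrival_times)
    # The interrupt fires on arrival of the last packet of each full batch.
    return [times[i] for i in range(batch_size - 1, len(times), batch_size)]


def time_based_ic(arrival_times, window):
    """Interrupt times when the NIC waits `window` after the first pending packet."""
    interrupts = []
    deadline = None
    for t in sorted(arrival_times):
        if deadline is None or t > deadline:
            # First packet of a new batch starts the timer.
            deadline = t + window
            interrupts.append(deadline)
        # Otherwise the packet is covered by the pending interrupt.
    return interrupts


arrivals = [0, 1, 2, 3, 4, 5, 6, 7]
print(count_based_ic(arrivals, batch_size=4))
print(time_based_ic(arrivals, window=2.5))
```

In both cases eight packets cause only two or three interrupts instead of eight, at the price of extra delay for the packets that wait for the batch to complete.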

Hybrid Scheme
A combination of
–Interrupt Disabling and Enabling
–Polling

Why?

Problem Statement
In this research we intend
–to implement a novel hybrid interrupt-handling scheme that improves the performance of the Linux networking subsystem and overcomes the shortcomings of NAPI.
–to prove experimentally that our proposed scheme outperforms NAPI under different system configurations and load conditions.

Project Objectives
Devise a novel scheme for the Linux platform to enhance packet reception on links at Gigabit speed.
–The scheme is expected to outperform NAPI, as currently implemented in the latest Linux 2.6, in terms of latency, throughput, and CPU availability.
–The novel scheme should include a proper solution to measure and forecast the traffic rate.
–The novel scheme should also work for hosts with single and multiple interfaces.
–More importantly, the scheme should work for SMP (Symmetric Multi-Processing) architectures, where the host’s motherboard has multiple processors.

Project Objectives (cont’d)
Find solutions to the shortcomings and open issues of NAPI (other than latency, throughput, and CPU availability). These shortcomings include rotten packets and poor network performance when the system is heavily loaded with CPU-bound applications.
Devise a novel generic benchmark for Linux hosts to find the switching point (cliff point).

Project Objectives (cont’d)
Develop a testbed and experiment to examine and compare the performance of the modified Linux version against the latest Linux NAPI.
–The experiment takes into account numerous test conditions and variables: Linux hosts with single and multiple network interfaces; different types of input traffic (bursty, constant, Poisson); different packet sizes; various types of system load, including CPU-bound and I/O-bound applications; and hosts with single and multiple processors (i.e., SMP).
The experiment should follow the testing and benchmarking guidelines laid out in RFC 2544.

Experimental Equipment

Project Phases and Scheduling
Phase I: (Period of six months)
–This is primarily a Linux network stack re-design and modification phase.
Phase II: (Period of twelve months)
–This phase is concerned with the testbed and experimental setup as well as running the performance evaluation of NAPI and our proposed hybrid scheme.
Phase III: (Period of six months)
–This phase is concerned with the performance of our hybrid scheme for hosts with SMP support.

Phase I
1. Devise an appropriate technique to measure the traffic arrival rate in real time. This task includes the following subtasks:
–Perform an extensive review of techniques to measure and forecast the arrival traffic rate.
–Devise a forecast technique that meets the following requirements: (1) computationally simple and optimized, with minimal overhead and operations, (2) accurate, in terms of being comparable to the actual data rate, (3) stable, in terms of ignoring short traffic spikes, and (4) responsive, in terms of following changes in the actual traffic rate.
–Examine the effectiveness of the proposed technique in forecasting the traffic arrival rate and compare it with other techniques proposed in the literature. The technique must be appropriate for different types of traffic, including bursty traffic with empirical packet sizes. Discrete Event Simulation (DES) will be used to assess the performance and effectiveness of our proposed technique.
–Plot, analyze, and compare the performance of the proposed technique for forecasting the arrival traffic rate.
–Determine (using simulation and fine tuning of parameters) the minimum and maximum values (i.e., confidence interval) of the forecasted/estimated traffic rate. These values will be used as the upper and lower thresholds of the cliff point, and the hybrid scheme will use them to switch between interrupt disable-enable and polling. They will also prevent frequent oscillation between interrupt disable-enable and polling, thereby minimizing the overall overhead.
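One way to picture the two pieces of this task is a smoothed rate estimate feeding a two-threshold switch. The sketch below is not the project's technique, which the slides leave to be devised: it swaps in an exponentially weighted moving average (EWMA) as a stand-in forecaster, and it assumes that interrupt disable-enable is used at low rates and polling at high rates. The class names, the smoothing factor, and the threshold values are all hypothetical.

```python
class RateEstimator:
    """EWMA estimate of the packet arrival rate (an assumed stand-in technique)."""

    def __init__(self, alpha=0.25):
        self.alpha = alpha   # small alpha: stable; large alpha: responsive
        self.rate = 0.0

    def update(self, packets, interval):
        sample = packets / interval
        # One multiply and two adds per sample keep the overhead minimal.
        self.rate += self.alpha * (sample - self.rate)
        return self.rate


class ModeSwitch:
    """Two-threshold (hysteresis) switch between the two receive modes."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self.mode = "interrupt"   # interrupt disable-enable

    def update(self, rate):
        if self.mode == "interrupt" and rate > self.high:
            self.mode = "polling"
        elif self.mode == "polling" and rate < self.low:
            self.mode = "interrupt"
        # Rates between the thresholds never cause a switch, which
        # prevents oscillation around the cliff point.
        return self.mode


switch = ModeSwitch(low=100, high=200)
for rate in (150, 250, 150, 50):
    print(rate, switch.update(rate))
```

The rate of 150 leaves the mode unchanged in both directions, which is the oscillation-damping role the slide assigns to the lower and upper thresholds.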

Phase I – cont’d
2. Understand thoroughly the Linux kernel and the complex NAPI code. This requires the following subtasks:
–Perform an extensive review and study of the Linux 2.6 network stack (NAPI) and the NIC network drivers.
–Set up a utility such as cscope or kscope to navigate and browse the actual Linux code and understand it thoroughly.
–Identify exactly what code needs to be changed in both the Linux kernel and the network driver.
–Identify how the code must differ to support single-processor and multi-processor (i.e., SMP) hosts.
3. Investigate known open issues or shortcomings of NAPI (other than the expected latency at low traffic rate) and critique the solutions proposed in the literature. These shortcomings include rotten packets and poor network performance under heavy CPU-bound applications. More importantly, investigate how our proposed hybrid scheme will resolve these known open issues.

Phase II
4. Modify, test, and recompile the code of Linux 2.6 to implement our proposed hybrid scheme and the scheme to forecast the traffic arrival rate. In addition, the code has to handle solutions to rotten packets and to the problem of poor network stack performance on a system heavily loaded with CPU-bound applications.
5. Learn how to use the IXIA 400T traffic generator/analyzer. Configure a simple experiment of generating and receiving packets.
6. Identify the proper cliff point for the system. This can be accomplished only by determining the interrupt overhead and the protocol processing time, both of which will be determined by measurement.
–Using IXIA or some other technique, devise a generic and useful way to measure interrupt overhead. Determine the distribution of the interrupt overhead.
–Using IXIA or some other technique, devise a way to measure protocol processing at the OS level. Determine the distribution of the kernel’s protocol processing time.

Phase II – cont’d
7. Using the IXIA 400T and a PC with Linux 2.6 and NAPI enabled, measure and plot the following performance metrics: packet forwarding latency; packet forwarding throughput; CPU utilization with packet forwarding.
8. The above experiment will consider the following configurations and conditions: different packet sizes; traffic distribution (Poisson vs. bursty); traffic reception and transfer on a single NIC; traffic reception and transfer on multiple NICs.
9. Using the IXIA 400T and a PC with our proposed hybrid scheme, perform the same performance measurements as in Task 7 and Task 8.
10. Plot and compare the performance of NAPI and our proposed hybrid scheme. Draw proper conclusions.
11. Compare and evaluate the performance of our solutions to the NAPI shortcomings of rotten packets and poor network performance under CPU-bound applications. Consider the performance conditions and configurations of Task 7 and Task 8.

Phase III
12. Repeat the performance evaluation of the previous tasks (Tasks 6-11) under Linux SMP support on a dual-processor motherboard.
–Compare SMP performance to the performance when using only a single processor. This is a huge phase, as six tasks are to be carried out again. It should be noted that, according to RFC 2544 recommendations, obtaining a reported value for a single performance point requires repeating a test at least 20 times, and the reported value must be the average of these 20 recorded values. The recommendations and guidelines also state that the test has to run for at least 20 minutes to obtain one single reported value.
13. Ensure that the novel scheme preserves the order of packets, i.e., that there is no need for packet re-ordering.
14. Prepare and deliver the final report.
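The averaging rule quoted above can be written down as a small helper. This is only an illustration of the rule as the slide states it; the function name, the constant name, and the sample latency values are hypothetical and are not measurements from the project.

```python
from statistics import mean

MIN_TRIALS = 20  # minimum number of repetitions, as stated on the slide


def reported_value(trial_values):
    """Average the recorded trial values into one reported performance point."""
    if len(trial_values) < MIN_TRIALS:
        raise ValueError(
            f"need at least {MIN_TRIALS} trials, got {len(trial_values)}"
        )
    return mean(trial_values)


# Hypothetical latency samples in microseconds, for illustration only.
latencies = [10.0] * 10 + [20.0] * 10
print(reported_value(latencies))
```

Because every plotted point needs its own set of trials, the number of points multiplies quickly across packet sizes, traffic types, and NIC configurations, which is why the slide calls this a huge phase.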

Work Plan

Personnel Requirement
The project team will consist of the primary investigator (PI) and two graduate students (PhD or MS degree candidates).
The graduate students will be computer science/engineering graduates and will work under the supervision and guidance of the PI.

Benefits and Utilization
Contribute to the advancement of open-source operating systems (such as Linux) by providing a step-up version that improves the performance of the networking subsystem to suit Gigabit network traffic.
–This will lead to better Linux-based routers, firewalls, servers, and proxies.
Utilize the previous theoretical work of [24] to devise a new hybrid interrupt-handling scheme that improves the networking performance of Linux or any other operating system.
Provide adequate solutions to the NAPI shortcomings of the current Linux 2.6 networking subsystem.

Benefits and Utilization -- cont’d
Prove and demonstrate that the proposed hybrid scheme is a significant performance enhancement over current versions across many different configurations and load conditions.
Provide an algorithm and computationally optimized technique to forecast the traffic arrival rate. Such an algorithm or technique should have no or minimal impact on Linux performance.
Provide a generic methodology and benchmark to identify the switching point.
The research community at large can benefit substantially from the experimental work in terms of methodology, testbed, experimental setup, and configuration. The experimental methodology and techniques can be employed on similar systems to conduct performance comparisons.

Benefits and Utilization -- cont’d
Major beneficiaries may include almost all Saudi companies, as well as governmental and non-governmental institutions, that show keen interest in using Linux.
–GbE deployment
–Linux wide popularity
The project will benefit KFUPM in general and the Department of Information and Computer Science in particular.
–It is anticipated that a modified version of Linux that best suits Gigabit traffic will carry the name of KFUPM and the ICS department on it.
–KFUPM can be seen as an active contributor to open-source code and community.
Results of general interest to the research community will be published at key international conferences, such as those of IEEE and ACM. It is also anticipated that this research work will lead to publications in reputable refereed journals.
There are no network traffic generators or analyzers at KFUPM.
–Such a project can lay the ground for further research and development by making such equipment available. The equipment can be utilized for research.
–The IT center at the university can also use such equipment for diagnosing and troubleshooting network problems related to performance bottlenecks.

Budget

Summary
In this research we intend to improve the performance of the Linux networking subsystem and overcome the shortcomings of NAPI.
The project will be of great benefit to the research and open-source communities, KFUPM, and the public at large.