An open source user space fast path TCP/IP stack and more…

Enter OpenFastPath!

A TCP/IP stack that:
- lives in user space
- is optimized for scalability, throughput and latency
- uses OpenDataPlane (ODP) to access network hardware
- works with the Data Plane Development Kit (DPDK)
- runs on ARM, x86, MIPS and PPC hardware
- runs natively, in a guest or in the host platform

A modular protocol library for termination and forwarding use cases that:
- provides a framework extensible with new protocols
- augments existing, or implements missing, HW acceleration

The OpenFastPath project is a true open source project:
- uses well-known open source components
- open for all to participate, with no lock-in to HW or SW
- Nokia, ARM and Enea are key contributors

Features implemented

Fast path protocol processing:
- Layer 4: UDP, TCP and ICMP protocols
- Layer 3: ARP/NDP, IPv4 and IPv6 forwarding and routing, IPv4 fragmentation and reassembly, VRF for IPv4, IGMP and multicast
- Layer 2: Ethernet, VLAN
- GRE and VXLAN tunneling

Command line interface:
- Packet dumping and other debugging
- Statistics, ARP, route and interface printing
- Configuration of routes and interfaces, with VRF support

Other:
- IP and ICMP implementations pass Ixia conformance tests
- IP and UDP implementations have been optimized for performance, giving linear scalability
- Routes and MACs are kept in sync with Linux
- TCP implementation optimization is in progress
- Integrated with the NGINX web server
- Integration with the Linux IP stack through a TAP interface
- Binary compatibility with Linux applications: no recompilation is needed to use OFP

OpenFastPath System View

[Block diagram: a user application (termination or forwarding) sits on top of the OFP library and reaches it through the Init, Ingress, Egress, Socket and Hook APIs; OFP runs over ODP/DPDK, which drives the HW/NICs; route tables are kept in sync with the host Linux via Netlink, and a TAP PKTIO interface carries slow path traffic to the Linux stack.]

Here is a block diagram view of the OFP system. The PKTIO module handles communication to and from the Linux slow path as well as packet egress. The user/default dispatcher block implements the dispatcher functionality that reads packets through the ODP APIs, for example:

    pkt_cnt = odp_pktio_recv(pktio, pkt_tbl, OFP_PKT_BURST_SIZE);

or

    buf = odp_schedule(&queue, ODP_SCHED_WAIT);

The dispatcher is placed outside OFP to give the user control over which API to use to get packets from ODP. Depending on the underlying ODP implementation and hardware, scheduler, burst or polling mode can be selected. OFP works together with the Linux IP stack.
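
To make the dispatcher's role concrete, below is a minimal sketch of a burst-mode dispatcher loop built around the odp_pktio_recv() call quoted above. It assumes the OFP entry points ofp_packet_input() and ofp_eth_vlan_processing from the OFP sources of this period; check the exact signatures against the release you use.

    #include <odp.h>
    #include "ofp.h"

    #define PKT_BURST_SIZE 16

    /* Burst-mode dispatcher: pull packets from a NIC via ODP and hand
     * each one to OFP for L2/L3/L4 processing. Traffic OFP does not
     * recognize is forwarded to the Linux slow path over the TAP
     * interface. */
    static void dispatcher_loop(odp_pktio_t pktio)
    {
        odp_packet_t pkt_tbl[PKT_BURST_SIZE];
        int pkt_cnt, i;

        for (;;) {
            pkt_cnt = odp_pktio_recv(pktio, pkt_tbl, PKT_BURST_SIZE);
            if (pkt_cnt <= 0)
                continue;
            for (i = 0; i < pkt_cnt; i++)
                ofp_packet_input(pkt_tbl[i], ODP_QUEUE_INVALID,
                                 ofp_eth_vlan_processing);
        }
    }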

OpenFastPath multicore System View

[Block diagram: N dispatcher threads, one per fast path core, each with its own Ingress API and socket callback/hook API in a single thread context, all running over one SMP multicore OFP library instance; core 0 runs Linux, the user configuration code and the slow path over TAP PKTIO and Netlink; ODP/DPDK spans all cores down to the NICs.]

Now let's look at a multicore OFP system. One core (#0) is required for Linux system calls, mainly for the CLI, route copying and communication with the Linux kernel over the TUN/TAP interface. An additional Linux core might be needed if there is a lot of slow path traffic. The other cores are allocated by ODP for fast path processing.

The user configuration code is a management thread running on the Linux core. It is started by ODP and shares the same memory as the fast path cores.

OFP is a multithreaded multicore library, so a single OFP instance runs across all data plane cores. There are, however, separate independent dispatcher threads, which allows a different dispatcher on each core. On the cores allocated to fast path processing, ODP starts only one thread, in which the dispatcher, OFP and the user application code all run; this is the case when using the hook or callback APIs. If non-blocking legacy socket APIs are used, both can likewise share the same thread. An NGINX worker process, for example, works over OFP by scheduling packets, processing them and then consuming them through non-blocking APIs such as select() and read().
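
As a counterpart to the burst-mode loop on the previous slide, a per-core dispatcher can instead drive the ODP event scheduler, matching the odp_schedule() variant from the system view. Again a minimal sketch, with the same assumed OFP entry points:

    #include <odp.h>
    #include "ofp.h"

    /* Scheduler-mode dispatcher: one instance of this loop runs in the
     * single thread that ODP starts on each fast path core. */
    static void sched_dispatcher_loop(void)
    {
        odp_queue_t queue;
        odp_event_t ev;

        for (;;) {
            /* Block until the ODP scheduler hands this core an event. */
            ev = odp_schedule(&queue, ODP_SCHED_WAIT);
            if (odp_event_type(ev) != ODP_EVENT_PACKET) {
                odp_event_free(ev);
                continue;
            }
            /* User hooks/callbacks registered with OFP fire from here,
             * in this core's thread context. */
            ofp_packet_input(odp_packet_from_event(ev), queue,
                             ofp_eth_vlan_processing);
        }
    }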

Ingress Packet Processing

[Packet flow diagram, L2 to L4: Ethernet/VLAN input feeds IPv4/IPv6 processing with reassembly, routing, ARP/NDP, ICMP, IGMP, GRE and VXLAN handling; a transport (L4) classifier feeds UDP and TCP input, which terminate in the BSD socket and callback APIs; hook APIs are exposed at the IPv4/v6 local, IPv4/v6 forward and GRE points; VXLAN traffic loops back into the L2 input; IP, UDP, TCP, etc. may arrive pre-classified by HW; unknown traffic falls back to the slow path.]

This is a view of the OFP ingress packet flow showing the different modules involved in packet processing. The dark red boxes in the original diagram represent OFP application APIs or BSD socket APIs.

OFP can leverage HW classification functionality: pre-classified packets simply bypass the stages that have already been performed in HW, enabling higher throughput. Packets with unsupported protocols or protocol extensions are sent to the Linux slow path. Notice the relative simplicity compared to the complexity of the Linux TCP/IP stack.
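
The hook points in the diagram are plain C callbacks registered at stack initialization. The sketch below shows one plausible way to attach a function at the IPv4/v6 local hook; the hook ids, the ofp_init_global_t layout and the return codes are recalled from this era's OFP headers and should be treated as assumptions:

    #include <string.h>
    #include "ofp.h"

    /* Called for packets addressed to the local stack, before UDP/TCP
     * input. Returning OFP_PKT_CONTINUE resumes normal processing;
     * OFP_PKT_DROP discards the packet. */
    static enum ofp_return_code my_local_hook(odp_packet_t pkt, void *arg)
    {
        (void)arg;
        if (odp_packet_len(pkt) == 0)
            return OFP_PKT_DROP;
        return OFP_PKT_CONTINUE;
    }

    static void register_hook(void)
    {
        ofp_init_global_t params;

        memset(&params, 0, sizeof(params));
        params.pkt_hook[OFP_HOOK_LOCAL] = my_local_hook;
        /* interface setup elided */
        ofp_init_global(&params);
    }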

Egress Packet Processing

[Packet flow diagram, L4 to L2: UDP and TCP output from the BSD socket or egress APIs feed IPv4/IPv6 output, with IPv4 fragmentation, ICMP error generation, GRE tunneling and VXLAN, down to Ethernet/VLAN output.]

The egress packet flow is even leaner, to maximize throughput. It also supports both the BSD socket APIs and the OFP packet egress APIs.

Optimized OpenFastPath socket APIs

New zero-copy APIs optimized for single thread run-to-completion environments:
- UDP send: an optimized send function that takes a packet container (packet + metadata)
- UDP receive: a function callback can be registered to read on a socket; it receives a packet container and a socket handle
- TCP accept event: a function callback can be registered for the TCP accept event; it receives a socket handle
- TCP receive: a function callback can be registered to read on a socket; it receives a packet container and a socket handle

A standard BSD socket interface is also provided, for compatibility with legacy Linux applications.

Traditional BSD socket communication typically involves a copy operation from the IP stack to the application. This has a major performance impact, and to address it we have implemented new zero-copy APIs optimized for the run-to-completion environment that ODP provides. This is done through a callback API that allows the application to register callbacks for UDP and TCP receive as well as the TCP accept event. The callback function receives an ODP packet container and a socket handle directly, without a copy operation. UDP send has also been optimized as a zero-copy API. Legacy applications can still use the standard BSD sockets with good performance, though not as good as through the callback APIs.
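
For illustration, here is a hypothetical registration flow for the zero-copy UDP receive callback, modeled loosely on OFP's udpecho example; the ofp_sigevent field names and the OFP_SIGEV_HOOK constant are recalled from that example rather than verified, so check ofp_socket.h before relying on them:

    #include "ofp.h"

    /* Zero-copy receive callback: invoked in the dispatcher's thread
     * context with the ODP packet and the socket it arrived on. */
    static void on_udp_recv(union ofp_sigval sv)
    {
        struct ofp_sock_sigval *ss = sv.sival_ptr;
        odp_packet_t pkt = ss->pkt;

        /* ... consume the payload in place, then release the packet ... */
        odp_packet_free(pkt);
        ss->pkt = ODP_PACKET_INVALID;
    }

    /* Must outlive the socket, since OFP keeps a pointer to it. */
    static struct ofp_sock_sigval ss;

    static void setup_socket(void)
    {
        int fd = ofp_socket(OFP_AF_INET, OFP_SOCK_DGRAM, OFP_IPPROTO_UDP);
        struct ofp_sigevent ev;

        ss.sockfd = fd;
        ss.event = OFP_EVENT_RECV;
        ss.pkt = ODP_PACKET_INVALID;
        ev.ofp_sigev_notify = OFP_SIGEV_HOOK;
        ev.ofp_sigev_notify_function = on_udp_recv;
        ev.ofp_sigev_value.sival_ptr = &ss;
        ofp_socket_sigevent(&ev);
        /* then bind the socket as usual with ofp_bind() */
    }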

OFP Performance on ARM

The benchmarking target is to measure external-to-host throughput, latency, jitter and packet loss of the DUT.
- The test application runs starting from core number 4, with a variable number (1 to 4) of Rx/Tx queues whose interrupt handling is mapped to the corresponding cores.
- Using a properly configured Ixia test environment, the UDP baseline testing scenarios are run with the defined network frame sizes (64, 128, 256, 512, 1024, 1248 and 1518 byte frames, with corresponding MTU sizes) to assess and profile baseline network impact. Traffic is sent incrementally to different IP addresses and ports.
- Optionally, scheduling latency is monitored by running cyclictest while traffic is flowing.
- Optionally, CPU load is monitored by running top while traffic is flowing.

UDP echo test: the receiver uses one IP address and as many UDP ports as cores; the sender uses as many IP addresses as ports/cores, with UDP ports in the range 2048-3072.

IP forward test: the receiver side cycles continuously through an IP address range of 4048 addresses and 1000 UDP ports; the sender always uses the same IP address and UDP port.

OFP Performance on x86

- Intel Xeon E5-2697 v3 processor (turbo disabled)
- Two 82599 NICs with a modified netmap ixgbe 4.1.5 driver (12 Rx/Tx queue pairs), totaling 4x10 Gbps of ports

NGINX – OFP TCP/IP integration

[Side-by-side diagram: with the standard TCP/IP stack, NGINX worker processes reach the kernel TCP/IP stack through context switches and locks on the way to the NIC Rx/Tx queues; with OFP, each worker links the OFP TCP/IP stack library through the OFP BSD socket API and drives the NIC Rx/Tx queues directly via the PKTIO API over ODP/DPDK.]

- Avoids context switches
- Avoids locks
- Streamlined packet path
- Better scalability, throughput and latency

Binary compatibility with Linux applications

[Side-by-side diagram: with the standard TCP/IP stack, the unmodified application binary uses the Linux TCP/IP stack; with OFP, the socket API is overloaded with the OFP socket API through LD_PRELOADed libraries: libofp_netwrap_crt overloads the socket calls, libofp_netwrap_proc starts and configures the OFP TCP/IP stack, and libofp provides the stack itself over the ODP/DPDK library. The application is launched as: ./ofp_netwrap.sh ./binary]
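
The wrapping relies on standard LD_PRELOAD symbol interposition. The following is an illustrative sketch of the technique rather than the actual libofp_netwrap_crt source: the preloaded library defines socket() itself, routes supported domains to OFP via ofp_socket(), and falls back to the C library otherwise (translation of the type and protocol constants is omitted):

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <sys/socket.h>
    #include "ofp.h"

    /* Interposed socket(): the dynamic linker resolves the application's
     * socket() calls here because this library is LD_PRELOADed. */
    int socket(int domain, int type, int protocol)
    {
        static int (*libc_socket)(int, int, int);

        if (!libc_socket)  /* locate the real libc implementation once */
            libc_socket = (int (*)(int, int, int))
                          dlsym(RTLD_NEXT, "socket");

        if (domain == AF_INET)  /* route IPv4 sockets to the OFP stack */
            return ofp_socket(OFP_AF_INET, type, protocol);

        /* everything else (AF_UNIX, netlink, ...) stays on Linux */
        return libc_socket(domain, type, protocol);
    }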

What's next? Get involved!
- Download the source code: https://github.com/OpenFastPath/ofp
- Check out www.openfastpath.org for more information about the project
- Subscribe to the mailing list: http://www.openfastpath.org/mailman/listinfo
- Ping us on our freenode channel: #OpenFastPath

For additional information, please visit www.openfastpath.org