LISA: Linux Switching Appliance
Radu Rendec, Ioan Nicu, Octavian Purdila
Universitatea Politehnica Bucuresti
5th RoEduNet International Conference
Overview
● What is LISA
● VLAN switching and tagging
● Linux switching support
● LISA vs Linux switching
● LISA's architecture
● Implementation and optimization bits
● Performance
● Future plans
LISA's initial goals
● High-performance layer 2 and 3 packet switching using a standard PC architecture
● To resolve Linux VLAN scalability issues
● To resolve performance problems with broadcast packets on both trunk and access ports
● Basic VLAN switching features: VLAN switching, VLAN tagging, inter-VLAN routing
● Cisco-like configuration and user interface
● Started as a graduate project last year
What we want LISA to become
● Framework for layer 2 protocol prototyping and analysis
  – the current Linux bridge module is not easily extensible
  – network kernel programming is not for the faint of heart
  – the framework should hide away all of the Linux networking internals and provide a clean API
● New features being implemented
  – STP
  – VTP
  – LTP
VLAN switching and tagging
● Not rocket science, but there are some scalability issues
● Broadcast/multicast packets need to be sent to multiple ports
● Some packets need data processing for tagging or untagging
● Copying and cloning of packets are necessary in certain scenarios
Linux VLAN switching
● The bridge module
  – Combines several network devices into a classic switch
  – Provides a virtual interface, which can be assigned an IP address
  – The virtual interface behaves like a physical interface connected to one of the switch's ports
● The 802.1Q module
  – One virtual interface is created for each VLAN
  – Each virtual interface sends and receives untagged packets
  – When packets are sent or received through a real device, 802.1Q tags are added or removed as appropriate
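To make the tagging and untagging step concrete, here is a minimal sketch of what adding and removing an 802.1Q tag does to an Ethernet frame. It relies on the standard tag layout (a 4-byte tag with TPID 0x8100 inserted after the two MAC addresses); the function names and the sample frame are illustrative assumptions, not code from the Linux 802.1Q module or from LISA.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def add_vlan_tag(frame, vlan_id, priority=0):
    """Insert a 4-byte 802.1Q tag after the destination and source MACs."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be in 0..7")
    tci = (priority << 13) | vlan_id  # PCP (3 bits), DEI = 0, VID (12 bits)
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def strip_vlan_tag(frame):
    """Remove the 802.1Q tag and return (vlan_id, untagged_frame)."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        raise ValueError("frame is not 802.1Q tagged")
    return tci & 0x0FFF, frame[:12] + frame[16:]


# Hypothetical frame: broadcast destination, made-up source MAC, IPv4 EtherType.
untagged = bytes.fromhex("ffffffffffff" "020000000001" "0800") + b"payload"
tagged = add_vlan_tag(untagged, vlan_id=2)
```

Both directions build a new byte string, which is the user-space analogue of the data processing that tagging and untagging require on a packet buffer.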
Configuration example
● eth0 in trunk mode, allowing access to VLANs 1 and 2
● eth1 in trunk mode, allowing access to VLANs 1 and 3
● eth2 in access mode, in VLAN 2
● eth3 in access mode, in VLAN 3
● Routing between VLANs 2 and 3
  – /24 on VLAN 2
  – /24 on VLAN 3
Configuration example (2)
  vconfig add eth0 1
  vconfig add eth0 2
  vconfig add eth1 1
  vconfig add eth1 3
  brctl addbr br1
  brctl addif br1 eth0.1
  brctl addif br1 eth1.1
  brctl addbr br2
  brctl addif br2 eth0.2
  brctl addif br2 eth2
  brctl addbr br3
  brctl addif br3 eth1.3
  brctl addif br3 eth3
  ifconfig br netmask
  ifconfig br netmask
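The port layout of the configuration example can also be written down as a small table, which makes it easy to see where a flooded packet has to go. The sketch below is only a model of that layout: the `PORTS` table and the `flood_targets` helper are illustrative names, and it assumes that every VLAN on a trunk port is sent tagged, as in the vconfig commands.

```python
# Port layout from the configuration example: port -> (mode, set of VLANs).
PORTS = {
    "eth0": ("trunk", {1, 2}),
    "eth1": ("trunk", {1, 3}),
    "eth2": ("access", {2}),
    "eth3": ("access", {3}),
}


def flood_targets(ingress, vlan):
    """Return [(port, tagged)] for every other port that carries `vlan`."""
    targets = []
    for port, (mode, vlans) in PORTS.items():
        if port != ingress and vlan in vlans:
            targets.append((port, mode == "trunk"))
    return targets


print(flood_targets("eth2", 2))  # [('eth0', True)]
print(flood_targets("eth1", 3))  # [('eth3', False)]
```

A broadcast arriving untagged on eth2 leaves tagged on eth0, while one arriving tagged on eth1 in VLAN 3 leaves untagged on eth3.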
Drawbacks
● One virtual interface is necessary for each VLAN of each trunk mode port
● If all 4094 VLANs must be switched between two trunk ports, a total of 8188 virtual interfaces are needed for VLAN support only
● When a packet is flooded / multicast on both trunk mode and access mode ports, the same tagging / untagging operations take place several times
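The interface count above follows directly from the "one virtual interface per VLAN per trunk port" rule. A minimal sketch of the arithmetic, with a hypothetical helper name; the bridge count assumes one bridge per VLAN, as in the configuration example:

```python
def virtual_interfaces_needed(trunk_vlans):
    """trunk_vlans maps each trunk port to the set of VLANs it carries.

    Returns (vlan_interfaces, bridges): one VLAN interface per VLAN per
    trunk port, plus (assumption) one bridge per distinct VLAN.
    """
    vlan_interfaces = sum(len(vlans) for vlans in trunk_vlans.values())
    bridges = len(set().union(*trunk_vlans.values()))
    return vlan_interfaces, bridges


# The small example: eth0 carries VLANs 1 and 2, eth1 carries VLANs 1 and 3.
print(virtual_interfaces_needed({"eth0": {1, 2}, "eth1": {1, 3}}))  # (4, 3)

# The worst case: all 4094 VLANs on two trunk ports.
all_vlans = set(range(1, 4095))
print(virtual_interfaces_needed({"eth0": all_vlans, "eth1": all_vlans}))  # (8188, 4094)
```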
Overview of LISA's architecture (userspace)
Overview of LISA's architecture (kernelspace)
LISA's switching engine
● LISA hooks into the NAPI packet reception code, between the generic packet handlers and the protocol-specific packet handlers
● Switching is done by directly manipulating the socket buffers and queuing them appropriately on the output network interface queues
● Virtual interfaces are seen as real interfaces in userspace, but all processing is done in LISA's switching engine
  – fast, scalable approach
Packet flow (diagram)
● Stages shown: Interface queue (ingress); Generic packet handlers (libpcap, etc.); LISA Switching Engine; Linux L3 processing; Virtual Interfaces; Interface queue (egress)
● Path labels: Switching; Non-switch member
Minimum data copying optimizations
● If a packet needs to be sent to multiple egress ports and does not need processing, then clone it
● First send the packet to all the ports that have the same tagging as the incoming packet
● If a packet needs to be processed, copy it only if the socket buffer is used by someone else
● A copied packet does not need to be cloned again before it is sent to the next outgoing port if that is the only port left
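The "same tagging first" rule can be illustrated with a toy model. The sketch below is not LISA's kernel code: the helper names are hypothetical, ports are plain `(name, tagged)` pairs, and it assumes that once the packet has been converted, the converted buffer is shared by all remaining ports.

```python
def order_egress_ports(ports, ingress_tagged):
    """Put ports whose tagging matches the incoming packet first.

    `ports` is a list of (name, tagged) pairs; the sort is stable, so the
    original order is kept inside each group.
    """
    return sorted(ports, key=lambda port: port[1] != ingress_tagged)


def conversions_needed(ports, ingress_tagged):
    """Return (per_port, grouped) counts of tag/untag operations.

    per_port: every port with different tagging converts the packet itself.
    grouped:  the packet is converted once, after the matching ports are done.
    """
    per_port = sum(1 for _, tagged in ports if tagged != ingress_tagged)
    grouped = 1 if per_port else 0
    return per_port, grouped


ports = [("eth0", True), ("eth2", False), ("eth1", True), ("eth3", False)]
print(order_egress_ports(ports, ingress_tagged=False))
print(conversions_needed(ports, ingress_tagged=False))  # (2, 1)
```

With an untagged incoming packet, the access ports are served first from the unmodified buffer, and the tag is added only once before the trunk ports are served.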
Packet post-processing
● To minimize tagging/untagging, cloning and copying operations

  while (more ports in list) {
      if (there is a "previous port") {
          clone the socket buffer;
          send the socket buffer to "previous port";
      }
      the current port becomes "previous port";
  }
  if (there is a "previous port") {
      send the socket buffer to "previous port";
  } else {
      discard the socket buffer;
  }
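The loop above delays each send by one port, so the last port always receives the original buffer and only N-1 clones are made for N ports. A runnable sketch of the same control flow follows; the `send`, `clone` and `discard` callbacks are hypothetical stand-ins for the kernel operations, and sending the clone while keeping the original for the last port is one reading of the pseudocode.

```python
def post_process(skb, ports, send, clone, discard):
    """Deliver `skb` to every port, cloning only when another port follows."""
    previous = None
    for port in ports:
        if previous is not None:
            # Another port follows, so the previous one gets a clone.
            send(clone(skb), previous)
        previous = port
    if previous is not None:
        send(skb, previous)  # the last port gets the original buffer
    else:
        discard(skb)  # no egress port at all


log = []
clones = []


def clone(skb):
    clones.append(skb)
    return dict(skb)  # shallow copy stands in for a socket buffer clone


post_process({"payload": b"x"}, ["eth0", "eth1", "eth2"],
             send=lambda skb, port: log.append(port),
             clone=clone,
             discard=lambda skb: log.append("discard"))
print(log, len(clones))  # ['eth0', 'eth1', 'eth2'] 2
```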
Testing topology
● Test1, Test2 and Test3 are all Dual Xeon / 2.8 GHz, with Intel 6300 ESB chipset and two BCM5721 (Broadcom Tigon 3) network adapters
● LiSA is a Commel Systems LE 564 embedded system
  – Via embedded x86 processor at 533 MHz, PCI bus
Packet rate results
CPU usage results
Throughput results
Why is it so slow?
● Each packet crosses the PCI bus twice
● The bandwidth of a 33 MHz PCI bus is about 1 Gbit/s
● The PCI bus has no separate address and command lines
● DMA transfers are performed in bus master mode
● The main memory is shared between the CPU and DMA transfers
● When transmitting packets, each packet must be acknowledged by the CPU before another packet can be sent
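A back-of-the-envelope calculation shows how the first two points combine. It assumes the 32-bit variant of the bus (the slide does not state the width) and ignores arbitration, addressing and descriptor overhead, so the real ceiling is lower; the function name is illustrative.

```python
def pci_forwarding_ceiling(bus_width_bits=32, clock_hz=33_000_000, crossings=2):
    """Return (raw bus bandwidth, forwarding ceiling) in bits per second."""
    raw_bps = bus_width_bits * clock_hz
    # Every forwarded packet uses the bus once on receive and once on transmit.
    return raw_bps, raw_bps / crossings


raw, ceiling = pci_forwarding_ceiling()
print(f"raw bus bandwidth: {raw / 1e9:.3f} Gbit/s")       # 1.056 Gbit/s
print(f"forwarding ceiling: {ceiling / 1e6:.0f} Mbit/s")  # 528 Mbit/s
```

Under these assumptions, a switch built on a single shared 33 MHz PCI bus cannot forward more than roughly half of the raw bus bandwidth.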
Future plans
● Integrate the VTP, STP and LCP modules, which are now being developed
● Support zero-copying in the switching engine via the scatter-gather capabilities of modern NICs
● Test the performance improvements when using new bus technologies like PCI-Express
● Test the scalability of the switching engine on SMP systems
● Implement more layer 2 protocols
For more information... Thank you! Questions?