A Simplified, Cost-Effective MPLS Labeling Architecture for Access Networks
Harald Widiger (1), Stephan Kubisch (1), Daniel Duchow (1), Thomas Bahls (2), Dirk Timmermann (1)
(1) University of Rostock, Germany
(2) Siemens Communications, Greifswald, Germany

My name is Harald Widiger. I am with the University of Rostock in Germany, and today I want to present "A Simplified and Cost-Effective MPLS Labeling Architecture for Access Networks".

Outline
- Access Network Architecture
- Multi Protocol Label Switching
- The MPLS-User Network Interface
- Implementation and Simulation Results
- Conclusion

I will first show the access network architecture we are dealing with and the needs that arise from future requirements. Then I will present Multi Protocol Label Switching as the basis of our development. After that, I will explain the system architecture of the MPLS-UNI we developed and present both implementation and simulation results. My presentation will finish with a short conclusion.

4/24/2019 University of Rostock

Access Network Environment
- Need for differentiated services and increased QoS
- Derived from information within each frame
- MPLS-UNI to create space for that information

Current access networks aggregate the DSL connections of residential users and the Ethernet connections of business customers. The concentration is done over several aggregation levels. In the future, customers will demand differentiated services and higher bandwidths such as VDSL or even VDSL2. ISPs, in turn, want to offer more services and to increase the QoS they provide to different customers. Which service applies to whom must be decided, and this decision depends on information within each frame. To insert this information into the frames at the access stage, we developed the MPLS-User Network Interface, which is inserted as an FPGA between the central switching unit and the Broadband Remote Access Server. At that position in the access network architecture, a throughput of 4 Gbps is required today in both the upstream and the downstream direction.

Multi Protocol Label Switching (MPLS)
- Encapsulation scheme
- Meant for fast routing purposes
- Here: simply a container to carry information
- Path of a frame through an MPLS-switched network

What is the MPLS in MPLS-User Network Interface? MPLS means Multi Protocol Label Switching, and it is an encapsulation scheme. It is a protocol meant for fast routing: based on so-called MPLS labels, a data frame is routed very quickly through an MPLS-switched network. At an ingress point of that network, Label Edge Routers (LERs) insert a label stack into each incoming frame. The frame then traverses the MPLS-switched network on the basis of the label stack. When the frame leaves the network, another LER at the egress point removes the MPLS label stack. However, we do not primarily use the MPLS labels for routing but as a container to carry arbitrary information. In fact, we want to carry information that helps other network instances make, for example, QoS decisions for every frame.

MPLS-Encapsulation
- MPLS label stack usually between layer 2 and layer 3 header
- We use the encapsulation scheme by Martini

Usually, the MPLS label stack is inserted between the Ethernet and the IP header. We decided instead to use the "Encapsulation Methods for Transport of Layer 2 Frames Over IP and MPLS Networks" proposed by Martini. Here, the complete frame is encapsulated by another Ethernet header and the MPLS label stack. This method has the great advantage that, at the egress point of an MPLS network, the header and label stack can simply be removed to get back the original frame. The inserted label stack is structured as shown on the right-hand side of the diagram. It consists of two 32-bit labels, each of which can carry 20 bits of arbitrary information. The other 12 bits of each label are reserved for the protocol. With 40 bits, a huge number of different services can be distinguished. If the 40 bits are not sufficient, more labels can be added to the stack.

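To make the label layout concrete, here is a minimal Python sketch of a two-label stack. The split of the 12 protocol bits into a 3-bit traffic class, a 1-bit bottom-of-stack flag, and an 8-bit TTL follows the standard MPLS label stack entry. How the 40 information bits are distributed over the two labels (upper 20 bits first) and the function names are assumptions made for illustration only, not something the slides specify.

```python
import struct


def pack_label_entry(label: int, tc: int = 0, bottom: bool = False, ttl: int = 64) -> bytes:
    """Pack one 32-bit MPLS label stack entry: 20-bit label, 3-bit TC, 1-bit S, 8-bit TTL."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8:
        raise ValueError("traffic class must fit in 3 bits")
    if not 0 <= ttl < 256:
        raise ValueError("TTL must fit in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def pack_label_stack(info: int, ttl: int = 64) -> bytes:
    """Spread a 40-bit value over two labels (assumed order: upper 20 bits first)."""
    if not 0 <= info < (1 << 40):
        raise ValueError("info must fit in 40 bits")
    upper = info >> 20
    lower = info & 0xFFFFF
    return (pack_label_entry(upper, bottom=False, ttl=ttl)
            + pack_label_entry(lower, bottom=True, ttl=ttl))


def unpack_label_stack(stack: bytes) -> int:
    """Recover the 40-bit value from a two-label stack."""
    first, second = struct.unpack("!II", stack[:8])
    return ((first >> 12) << 20) | (second >> 12)
```

The second entry carries the bottom-of-stack flag, so a receiver knows where the stack ends even if more labels are added in front of it.
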
MPLS-User Network Interface (MPLS-UNI)
- MPLS label stack as a container to carry information
- No complete LER implementation with a running LDP is necessary
- Possibility to implement the whole system in hardware
- Primary functionality:
  - Upstream direction: insert an MPLS label stack
  - Downstream direction: remove MPLS label stacks

As mentioned before, instead of using MPLS for switching and routing, we intend to use the MPLS label stack to carry information. As we do not want to make switching decisions based on the MPLS labels, we do not have to implement a complete LER, and no LDP (Label Distribution Protocol) is required. However, even in a design without an LDP, the use of MPLS for switching and routing is not excluded. Because we reduced the functionality, the system can be realized as the simplified and cost-effective hardware solution for an MPLS User Network Interface (UNI) that I introduce today. The module's purpose is to add an MPLS label stack to all incoming frames in the upstream direction and to forward them to the providers' core networks. In the downstream direction, the label stacks of all incoming frames are removed.

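The two directions can be sketched in a few lines of Python. This is only a software illustration of the idea, assuming the outer header is a plain Ethernet header with the MPLS unicast Ethertype 0x8847; the function names and the way the outer addresses are supplied are hypothetical, and the real module is a VHDL design.

```python
import struct

MPLS_UNICAST_ETHERTYPE = 0x8847
BOTTOM_OF_STACK = 0x100  # S bit inside a 32-bit label stack entry


def label_frame(frame: bytes, outer_dst: bytes, outer_src: bytes, label_stack: bytes) -> bytes:
    """Upstream: prepend an outer Ethernet header and the label stack to the whole frame."""
    if len(outer_dst) != 6 or len(outer_src) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    if len(label_stack) == 0 or len(label_stack) % 4:
        raise ValueError("label stack must be a non-empty multiple of 4 bytes")
    outer = outer_dst + outer_src + struct.pack("!H", MPLS_UNICAST_ETHERTYPE)
    return outer + label_stack + frame


def delabel_frame(labeled: bytes) -> bytes:
    """Downstream: strip the outer header and all labels, returning the original frame."""
    if len(labeled) < 14:
        raise ValueError("frame too short for an Ethernet header")
    (ethertype,) = struct.unpack("!H", labeled[12:14])
    if ethertype != MPLS_UNICAST_ETHERTYPE:
        raise ValueError("not an MPLS-labeled frame")
    offset = 14
    while True:
        if offset + 4 > len(labeled):
            raise ValueError("label stack has no bottom-of-stack entry")
        (entry,) = struct.unpack("!I", labeled[offset:offset + 4])
        offset += 4
        if entry & BOTTOM_OF_STACK:
            return labeled[offset:]
```

Removing the labels needs no table lookup: the bottom-of-stack flag alone tells the delabeler where the original frame starts.
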
MPLS-UNI Architecture

The MPLS-UNI module is a hardware solution and provides the advantages such a solution can offer: it performs its tasks with high throughput and at wire speed. The whole system was implemented in VHDL. In the upstream direction, the module extracts a key from the headers of each incoming frame. The key can consist of different fields, which I will describe on the next slide. A memory is then searched with the extracted key, and the frame is MPLS-labeled with the information retrieved from the memory; that is, an MPLS label stack is inserted into the frame. To process a data rate of 1 Gbit per second, the MPLS-UNI has to run at a frequency of at least 125 MHz. As you can see, multiple parallel data paths can be used. To reach the throughput of 4 Gbit per second that we require in the access network, four parallel data paths are implemented. The functionality in the downstream direction is simpler than in the upstream direction. Here, the MPLS-labeled Ethernet frames are unlabeled and forwarded. No memory lookups are necessary, as the original frame is encapsulated as payload in the MPLS frames.

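The relation between clock frequency and throughput can be checked with a short calculation. The sketch below assumes an 8-bit wide data path that moves one byte per clock cycle; the slides give the 125 MHz and 1 Gbit/s figures but do not state the bus width, so that parameter is an assumption.

```python
def line_rate_bps(clock_hz: int, bus_width_bits: int = 8, data_paths: int = 1) -> int:
    """Raw throughput in bit/s: one bus word per clock cycle on each data path."""
    return clock_hz * bus_width_bits * data_paths


one_path = line_rate_bps(125_000_000)                  # one data path
four_paths = line_rate_bps(125_000_000, data_paths=4)  # four parallel data paths
```

Under that assumption, one path at 125 MHz carries 1 Gbit/s and four parallel paths carry 4 Gbit/s.
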
Framebuffer with Key Parser
- Stores frames and parsed keys
- Key is configurable at compile time
- Reduction of required hardware resources in the MPLS-UNI itself

The frame buffer we implemented stores the whole frame in a FIFO, extracts a key from each frame, and stores that key in a second buffer. The key parser can parse different fields of incoming Ethernet frames. Possible fields are the source and destination MAC addresses, VLAN tags, the Ethertype, the source and destination IP addresses, and the DSCP field in the IP header. Any combination of these fields can be configured at compile time. Selecting the parsed fields at compile time saves hardware resources. Example: 352 vs. 721 slices between the minimum and the maximum key. The size of other modules depends on the key size, too. Together with the actual key, information about the number of stored frames and the size of the unused part of the frame buffer is presented to the memory arbiter.

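As a software illustration of a configurable key, the following Python sketch builds a parser for a chosen field combination. Selecting the fields when the parser is created stands in for the compile-time configuration of the hardware. The field names, the fixed field widths, the zero padding of absent fields, and the restriction to a single VLAN tag and IPv4 are assumptions of this sketch, not details from the slides.

```python
import struct

VLAN_TPID = 0x8100
IPV4_ETHERTYPE = 0x0800

# Assumed fixed width in bytes of each selectable field; absent fields are zero-padded.
FIELD_WIDTHS = {"dst_mac": 6, "src_mac": 6, "vlan": 2, "ethertype": 2,
                "src_ip": 4, "dst_ip": 4, "dscp": 1}


def parse_fields(frame: bytes) -> dict:
    """Extract the supported header fields from an Ethernet frame (single VLAN tag, IPv4)."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    fields = {"dst_mac": frame[0:6], "src_mac": frame[6:12]}
    (ethertype,) = struct.unpack("!H", frame[12:14])
    offset = 14
    if ethertype == VLAN_TPID:
        if len(frame) < 18:
            raise ValueError("frame too short for a VLAN tag")
        fields["vlan"] = frame[14:16]  # tag control information
        (ethertype,) = struct.unpack("!H", frame[16:18])
        offset = 18
    fields["ethertype"] = struct.pack("!H", ethertype)
    if ethertype == IPV4_ETHERTYPE and len(frame) >= offset + 20:
        ip = frame[offset:offset + 20]
        fields["dscp"] = bytes([ip[1] >> 2])  # upper 6 bits of the second IPv4 header byte
        fields["src_ip"] = ip[12:16]
        fields["dst_ip"] = ip[16:20]
    return fields


def make_key_parser(selected):
    """Return a parser that concatenates only the selected fields into a fixed-width key."""
    for name in selected:
        if name not in FIELD_WIDTHS:
            raise ValueError(f"unknown field: {name}")

    def parse_key(frame: bytes) -> bytes:
        fields = parse_fields(frame)
        return b"".join(fields.get(name, bytes(FIELD_WIDTHS[name])) for name in selected)

    return parse_key
```

A parser created with `make_key_parser(("src_mac", "dscp"))` yields a 7-byte key, while one that selects every field yields 25 bytes, which mirrors why the key choice affects the size of the parser and of the modules that handle the key.
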
Memory Arbitration

As you could see in the example MPLS-UNI architecture, there are four different and independent data paths competing for memory accesses. The scheduling of those accesses is done by a memory arbiter. We implemented a least laxity first (LLF) scheduler to prioritize the memory accesses. LLF is usually used for process scheduling in operating systems. Here, it assigns the memory access to the key with the smallest so-called slack. The slack is computed as the difference between the deadline to meet and the computation time required. The parameters are presented to the arbiter by the frame buffers. By using LLF, we can achieve fair scheduling between all data paths.

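The selection rule of the arbiter can be sketched in Python as follows. Slack is computed as described above, deadline minus required computation time. The `Request` type, the cycle-based units, and the tie-break by lowest path index are assumptions of this sketch; the slides do not say how the deadline is derived from the frame buffer fill levels.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Request:
    path: int          # index of the requesting data path
    deadline: int      # cycles left until the lookup must be finished
    service_time: int  # cycles the lookup itself needs

    @property
    def slack(self) -> int:
        """Laxity: how long the request can still wait without missing its deadline."""
        return self.deadline - self.service_time


def llf_grant(requests: Sequence[Request]) -> Optional[Request]:
    """Grant the memory to the pending request with the least slack."""
    if not requests:
        return None
    # Assumed tie-break: the lowest path index wins when slacks are equal.
    return min(requests, key=lambda r: (r.slack, r.path))
```

For example, with pending requests whose slacks are 30, 15, and 10 cycles, the request with 10 cycles of slack is served first, even if its deadline is not the earliest one.
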
Implementation Results (Xilinx Virtex 4 FX20-11)

Hardware Module                Speed (MHz)   Logic slices (min/typ/max)   BRAMs (min/typ/max)
MPLS-UNI                       -             1125/1486/2227               11/11/12
MPLS-Labeler                   187           129                          2
MPLS-Delabeler                 322           101                          -
Memory Arbiter                 163           152/203/343                  -
CPU Arbiter                    168           640                          -
Key Parser & Framebuffer       159           352/494/721                  9/9/10
Framebuffer                    177           205                          -
Memory internal (1K Entries)   126           783/1145/1917                3/7/15
Sync FIFOs + MACs              169           850                          6
∑ System                       130           2600/3400/4700               20/24/33

I implemented the classifier in a Xilinx Virtex 4 FPGA for 1024 keys and one data path. The MPLS-UNI itself requires between 1125 and 2227 slices. The memory, when built from internal BRAMs, requires between 783 and 1917 slices. A whole test system including MACs requires up to 4700 slices; typically, 3400 slices are required. The differences in the required resources result from the different configurations and key sizes that are possible for the implementation of the MPLS-UNI.

Performance
- 4 Gbps @ "natural traffic"
  - 30 % 60 Byte
  - 10 % 590 Byte
  - 11 % 1514 Byte
  - 49 % random
- No packet loss
- Average delay of 120 cycles, 860 ns @ 125 MHz

Finally, I want to present the performance of our implementation. We ran extensive simulations on the VHDL RTL description. The graph on the left-hand side shows the behavior of the MPLS-UNI when the data path is driven with 4 Gbps and very small frames. Depending on the size of the lookup memory, we observe different loss rates for small frame sizes. When all frames are minimal, meaning 60 Byte, the frame loss rate is between 35 and 55%. As you can see, the losses drop to zero when the frame size reaches between 100 and 150 Byte. This performance is sufficient, as real traffic never presents a 4 Gbps load with only minimal frames. We also ran simulations with more realistic traffic. Here, we created a mix of frame sizes with 30% minimal frames, 10% 590-Byte frames, 11% maximum-sized frames, and the rest with a random size between 60 and 1514 Byte. In these simulations, with an average frame size of around 400 Byte, there was no packet loss at all, as could be expected from the first simulations with the small packets. Furthermore, the average delay of the MPLS-UNI was less than 120 clock cycles, meaning an inserted delay of less than 1 µs.

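A rough calculation suggests why minimal frames are the hard case: every frame needs one memory lookup, so smaller frames leave fewer clock cycles per lookup. The sketch below assumes that the frame sizes exclude the 4-byte FCS and that each frame additionally occupies 8 bytes of preamble and 12 bytes of inter-frame gap on the wire. Whether the simulations counted this overhead is not stated in the slides, so the numbers are an estimate only.

```python
ETH_OVERHEAD_BYTES = 4 + 8 + 12  # FCS + preamble/SFD + inter-frame gap (assumed on-wire overhead)


def cycles_between_lookups(frame_bytes: int,
                           line_rate_bps: int = 4_000_000_000,
                           clock_hz: int = 125_000_000) -> float:
    """Average clock cycles available per frame, and thus per memory lookup, at full load."""
    wire_bits = (frame_bytes + ETH_OVERHEAD_BYTES) * 8
    return clock_hz * wire_bits / line_rate_bps


minimal = cycles_between_lookups(60)
maximal = cycles_between_lookups(1514)
```

Under these assumptions, the shared lookup memory has about 21 cycles per lookup at 4 Gbps with 60-Byte frames, compared with several hundred cycles for maximum-sized frames.
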
Conclusion
- Powerful and cost-effective solution to expand MPLS networks into the access network area
- At 125 MHz, 4 Gbps can be handled
- Size of the system can be minimized considering the actual tasks
- Functional spectrum can be broadened thanks to reconfigurable hardware

To conclude, we developed a powerful and cost-effective solution to expand MPLS networks into the access network area. The developed hardware can handle at least 4 Gbps at 125 MHz. The required hardware resources can be minimized by taking the actual task and the resulting system requirements into account. And of course, as we developed an architecture for an FPGA, the functional spectrum can be broadened.