Towards Effective Packet Classification

Slides:



Advertisements
Similar presentations
Task Partitioning for Multi-Core Network Processors Rob Ennals, Richard Sharp Intel Research, Cambridge Alan Mycroft Programming Languages Research Group,
Advertisements

A Search Memory Substrate for High Throughput and Low Power Packet Processing Sangyeun Cho, Michel Hanna and Rami Melhem Dept. of Computer Science University.
Data plane algorithms in routers
Packet Classification using Hierarchical Intelligent Cuttings
Multi-dimensional Packet Classification on FPGA: 100Gbps and Beyond
1 IP-Lookup and Packet Classification Advanced Algorithms & Data Structures Lecture Theme 08 – Part I Prof. Dr. Th. Ottmann Summer Semester 2006.
Balajee Vamanan, Gwendolyn Voskuilen, and T. N. Vijaykumar School of Electrical & Computer Engineering SIGCOMM 2010.
Forwarding Metamorphosis: Fast Programmable Match-Action Processing in Hardware for SDN Pat Bosshart, Glen Gibb, Hun-Seok Kim, George Varghese, Nick.
A Scalable and Reconfigurable Search Memory Substrate for High Throughput Packet Processing Sangyeun Cho and Rami Melhem Dept. of Computer Science University.
1 An Efficient, Hardware-based Multi-Hash Scheme for High Speed IP Lookup Hot Interconnects 2008 Socrates Demetriades, Michel Hanna, Sangyeun Cho and Rami.
Fast Firewall Implementation for Software and Hardware-based Routers Lili Qiu, Microsoft Research George Varghese, UCSD Subhash Suri, UCSB 9 th International.
Bio Michel Hanna M.S. in E.E., Cairo University, Egypt B.S. in E.E., Cairo University at Fayoum, Egypt Currently is a Ph.D. Student in Computer Engineering.
HybridCuts: A Scheme Combining Decomposition and Cutting for Packet Classification Author: Wenjun Li, Xianfeng Li Publisher: 2013 IEEE 21 st Annual Symposium.
Outline Introduction Related work on packet classification Grouper Performance Empirical Evaluation Conclusions.
Survey of Packet Classification Algorithms. Outline Background and problem definition Classification schemes – One dimensional classification – Two dimensional.
KARL NADEN – NETWORKS (18-744) FALL 2010 Overview of Research in Router Design.
CSIE NCKU High-performance router architecture 高效能路由器的架構與設計.
Packet Classification on Multiple Fields Pankaj Gupta and Nick McKeown Stanford University {pankaj, September 2, 1999.
CS 268: Lectures 13/14 (Route Lookup and Packet Classification) Ion Stoica April 1/3, 2002.
CS 268: Route Lookup and Packet Classification Ion Stoica March 11, 2003.
1 Energy Efficient Packet Classification Hardware Accelerator Alan Kennedy, Xiaojun Wang HDL Lab, School of Electronic Engineering, Dublin City University.
Two stage packet classification using most specific filter matching and transport level sharing Authors: M.E. Kounavis *,A. Kumar,R. Yavatkar,H. Vin Presenter:
CS 268: Route Lookup and Packet Classification
HybridCuts: A Scheme Combining Decomposition and Cutting for Packet Classification Wenjun Li Xianfeng Li School of Electronic and Computer Engineering.
Worst-Case TCAM Rule Expansion Ori Rottenstreich (Technion, Israel) Joint work with Isaac Keslassy (Technion, Israel)
Computer Networks Switching Professor Hui Zhang
A Scalable, Cache-Based Queue Management Subsystem for Network Processors Sailesh Kumar, Patrick Crowley Dept. of Computer Science and Engineering.
Paper Review Building a Robust Software-based Router Using Network Processors.
ECE 526 – Network Processing Systems Design Network Processor Architecture and Scalability Chapter 13,14: D. E. Comer.
Xinming Chen, Zhen Chen, Beipeng Mu, Lingyun Ruan, Jinli Meng Towards High-performance IPsec on Cavium OCTEON Platform Research Institute of Information.
PARALLEL TABLE LOOKUP FOR NEXT GENERATION INTERNET
Applied Research Laboratory Edward W. Spitznagel 7 October Packet Classification for Core Routers: Is there an alternative to CAMs? Paper by: Florin.
Wire Speed Packet Classification Without TCAMs ACM SIGMETRICS 2007 Qunfeng Dong (University of Wisconsin-Madison) Suman Banerjee (University of Wisconsin-Madison)
Packet Classification on Multiple Fields 참고 논문 : Pankaj Gupta and Nick McKeown SigComm 1999.
Applied Research Laboratory Edward W. Spitznagel 24 October Packet Classification using Extended TCAMs Edward W. Spitznagel, Jonathan S. Turner,
Balajee Vamanan and T. N. Vijaykumar School of Electrical & Computer Engineering CoNEXT 2011.
1. Outline Introduction Related work on packet classification Grouper Performance Analysis Empirical Evaluation Conclusions 2/42.
1 Fast packet classification for two-dimensional conflict-free filters Department of Computer Science and Information Engineering National Cheng Kung University,
Efficient Cache Structures of IP Routers to Provide Policy-Based Services Graduate School of Engineering Osaka City University
A Smart Pre-Classifier to Reduce Power Consumption of TCAMs for Multi-dimensional Packet Classification Yadi Ma, Suman Banerjee University of Wisconsin-Madison.
TCAM –BASED REGULAR EXPRESSION MATCHING SOLUTION IN NETWORK Phase-I Review Supervised By, Presented By, MRS. SHARMILA,M.E., M.ARULMOZHI, AP/CSE.
Performance Analysis of Packet Classification Algorithms on Network Processors Deepa Srinivasan, IBM Corporation Wu-chang Feng, Portland State University.
High-Speed Policy-Based Packet Forwarding Using Efficient Multi-dimensional Range Matching Lakshman and Stiliadis ACM SIGCOMM 98.
Cross-Product Packet Classification in GNIFS based on Non-overlapping Areas and Equivalence Class Author: Mohua Zhang, Ge Li Publisher: AISS 2012 Presenter:
CS 740: Advanced Computer Networks IP Lookup and classification Supplemental material 02/05/2007.
1 A quick tutorial on IP Router design Optics and Routing Seminar October 10 th, 2000 Nick McKeown
Parallel tree search: An algorithmic approach for multi- field packet classification Authors: Derek Pao and Cutson Liu. Publisher: Computer communications.
Packet Switch Architectures The following are (sometimes modified and rearranged slides) from an ACM Sigcomm 99 Tutorial by Nick McKeown and Balaji Prabhakar,
Packet Classification Using Multidimensional Cutting Sumeet Singh (UCSD) Florin Baboescu (UCSD) George Varghese (UCSD) Jia Wang (AT&T Labs-Research) Reviewed.
Dynamic Algorithms with Worst-case Performance for Packet Classification Pankaj Gupta and Nick McKeown Stanford University {pankaj,
Hierarchical packet classification using a Bloom filter and rule-priority tries Source : Computer Communications Authors : A. G. Alagu Priya 、 Hyesook.
Hierarchical Hybrid Search Structure for High Performance Packet Classification Authors : O˜guzhan Erdem, Hoang Le, Viktor K. Prasanna Publisher : INFOCOM,
IP Routers – internal view
Toward Advocacy-Free Evaluation of Packet Classification Algorithms
Data plane algorithms in routers
What’s “Inside” a Router?
An NP-Based Router for the Open Network Lab Hardware
Statistical Optimal Hash-based Longest Prefix Match
Transport Layer Systems Packet Classification
An NP-Based Router for the Open Network Lab Overview by JST
Data Plane Algorithms in Network Processing Systems
Packet Classification Using Coarse-Grained Tuple Spaces
Scalable Multi-Match Packet Classification Using TCAM and SRAM
Using decision trees to improve signature-based intrusion detection
High-performance router/switch architecture 高效能路由器/交換器的 架構與設計
Duo Liu, Bei Hua, Xianghui Hu, and Xinan Tang
Author: Xianghui Hu, Xinan Tang, Bei Hua Lecturer: Bo Xu
IXP C Programming Language
Worst-Case TCAM Rule Expansion
Packet Classification Using Binary Content Addressable Memory
Presentation transcript:

Towards Effective Packet Classification J. Li, Y. Qi, and B. Xu Network Security Lab RIIT, Tsinghua University Dec, 2005

Outline Algorithm Study Network Processor Implementation Summary Understanding Packet Classification Worst-case Complexity Analysis Existing Algorithmic Solutions Our Novel Algorithms Network Processor Implementation Current Hardware Limits External & Internal Traffics Intel IXP Implementation Summary

Outline Algorithm Study Network Processor Implementation Summary Understanding Packet Classification Worst-case Complexity Analysis Existing Algorithmic Solutions Our Novel Algorithms Network Processor Implementation Current Hardware Limits External & Internal Traffics Intel IXP Implementation Summary

1.1 Understanding Packet Classification Packet Classification Overview Incoming Packet HEADER ---- Predicate Action Policy/Rule Database) Packet Classification Forwarding Engine ACTION

1.1 Understanding Packet Classification An Example Rule Set E.g. A packet P(152.168.3.32, 152.163.171.71, …, TCP) would have action A2 (also matches An but A2 has higher priority) applied to it.

1.1 Understanding Packet Classification Rules In the Search Space 255 128 R4 R5 R3 R2 R6 R7 R1 P Rule Xrange Yrange R1 0-31 0-255 R2 128-131 R3 64-71 128-255 R4 67-67 0-127 R5 0-15 R6 128-191 4-131 R7 192-192 R1 (0-31, 0-255) Spaces: Single/multiple dimensions (fields); Span of each dimensions. Rules: Prefix/Range matching; Structural characteristics. Packets: Dynamic characteristics.

1.2 Worst-case Complexity Analysis Point Location: among N non-overlapping regions in F dimensions takes Either O(logN) time with O(NF) space Or O(N) space with O(logF-1N) time E.g. N=1000,F=4:1000G space,1000 accesses De-overlapping: N overlapping regions need up to (2N-1)F non-overlapping region to represent. Range-to-Prefix: N rules in range [0, 2W] need up to (N(2W-1))F prefixes to represent. F: number of fields; W: bit length of each field

1.2 Worst-case Complexity Analysis Conclusion The theoretical bounds tell us that it is not possible to arrive at a practical worst case solution. Fortunately, we don’t have to; No single algorithm will perform well for all cases. Hence a hybrid scheme might be able to combine the advantages of several different approaches. —— P. Gupta

1.3 Existing Algorithmic Solutions Algorithm Categorization (1) Categorization Based on Packet Search Data-structures [17]

1.3 Existing Algorithmic Solutions Algorithm Categorization (2) Categorization Based on Space Partition

1.4 Novel Algorithms: D-Cuts Dynamic Cuttings: Ideas HiCuts Tree Rules D-Cuts Tree If most traffic matches {R1, R3, R4}, i.e. goes through subspace {X(000:111), Y(000:001)} then we can rebuild the HiCuts tree to cut down the worst case depth.

1.4 Novel Algorithms: D-Cuts Dynamic Cuttings: Performance Topt: Optimized for time Sopt: for space

1.4 Novel Algorithms: HSM Hierarchical Space Mapping: Ideas SA AMT DA Indexed Search ? Binary Search ? AMT: Address Mapping Table PMT: Port Mapping Table PLT: Policy Lookup Table Packet PLT AMT PMT SA DA SP DP (RFC) Indexed Search: too large index tables (HSM) Binary Search: not slow but avoid large index tables

1.4 Novel Algorithms: HSM Hierarchical Space Mapping: Performance Classifiers Number of rules RFC(kB) HSM(kB) Percentage Improved FW1 68 802 41 95% FW2 136 838 111 87% FW3 340 1,186 262 78% CR1 500 1,060 119 89% CR2 1,000 2,122 923 61% CR3 1,535 3,454 1,947 44% CR4 1,945 6,320 3,957 37%

1.4 Novel Algorithms: sBits Shifting Bits: Ideas Pointer array: Too Large Replace the pointers with a Bit string Replace the pointers with Indexes Note: 32:1 compression rate No additional memory access Hardware supported

1.4 Novel Algorithms: sBits Shifting Bits: Performance No. Rules HiCuts HyperCuts sBits FW1 68 5,443 35,401 420 FW2 136 10,779 69,782 924 FW3 340 24,645 172,932 2,331 CR1 500 29,409 89,005 3,612 CR2 1000 979,736 871,541 28,287 CR3 1530 13,606,858* 480,225 29,204 CR4 1945 5,928,724* 672,442 43,183 sBits vs. HiCuts/HyperCuts: memory usage (Unit: 32-bit long-word)

1.4 Novel Algorithms: sBits Shifting Bits: Performance sBits vs. HyperCuts: memory usage against rulesets of different size

1.4 Novel Algorithms: Summary Tree-based Algorithms (HiCuts, D-Cuts) Memory efficient No explicit worst-case bound, not fast enough Table-based Algorithms (RFC, HSM) Fast search speed Not memory efficient Hybrid Algorithms (sBits) Combine the advantages of several different approaches. Maybe hard to implement (too complicated)

Outline Algorithm Study Network Processor Implementation Summary Understanding Packet Classification Worst-case Complexity Analysis Existing Algorithmic Solutions Our Novel Algorithms Network Processor Implementation Current Hardware Limits External & Internal Traffics Intel IXP Implementation Summary

2.1 Current Hardware Limits TCAM Board area Power Range matching ASIC/FPGA R&D cost Update General Purpose CPU Continuity of both time and space Network Processor Highly integrated processing units Date plane & Control plane Handle rarely associative network traffics

2.2 External & Internal Traffics Traffic In a Router Processor Memory External Traffic Internal Traffic Example: Assume: 1 rule = 64 Byte in Memory Assume: 1 packet= 64 Byte going through Processor By Linear Search: process 1 packet needs to read 1K rules in worst-case. (Internal Traffic) : (External Traffic) = 1000:1

2.2 External & Internal Traffics Traffic In a Router Processor Memory External Traffic Internal Traffic Example (continue): Assume SRAM Bandwidth in NP = 20GByte/s If (Internal Traffic) : (External Traffic) = 1000:1 External Traffic < (20G/1000) = 20MByte/s (160Mbps) Else if (Internal Traffic) : (External Traffic) = 40:1 External Traffic < (20G/40) = 0.5GByte/s (4Gbps)

2.2 External & Internal Traffics Existing Algorithms (dealing with 2,000 rules) Table-based Algorithms: (Internal Traffic) : (External Traffic) = 1:1~5:1 Best temporal performance Require up to 30MB SRAM Tree-based Algorithms: (Internal Traffic) : (External Traffic) = 20:1~30:1 Require less than 10MB SRAM Unstable performance, no worst-case bound

2.3 Intel IXP Implementation Intel Network Processor Architecture XScale Core 32K IC 32K DC MEv2 10 11 12 15 14 13 Rbuf 64 @ 128B Tbuf Hash 64/48/128 Scratch 16KB QDR SRAM 2 1 RDRAM 3 G A S K E T PCI (64b) 66 MHz IXP2800 16b 18 64b P I 4 or C X Stripe E/D Q 9 16 7 6 5 8 CSRs -Fast_wr -UART -Timers -GPIO -BootROM/SlowPort

2.3 Intel IXP Implementation IXP2xxx Packet Processing Stages Packet Rx Ethernet Decap Range Matching IPv4 Forwarding Queue Managing Packet Tx Scheduling SPI4 CSIX Packet Processing Stages of the Packet Classification Application. Packet classification algorithms are running in Rage Matching PPS.

2.3 Intel IXP Implementation Simulation Result: Linear Search Performance Evaluation of Linear Search Algorithm. Each incoming packet just matches the default rule, so that the worst-case performance is obtained. Deterministic worst-case bound: O(N).

2.3 Intel IXP Implementation Simulation Result: HSM Performance Evaluation of HSM Algorithm. Deterministic worst-case bound: O(logN).

2.3 Intel IXP Implementation Simulation Result: HiCuts (worst-case path) Performance Evaluation of HiCuts Algorithm. Non-deterministic worst-case bound. 1k rules often need a 10-level decision tree.

2.3 Intel IXP Implementation Simulation Result: HiCuts (worst-case path) And what’s more, in the worst-case, it often needs up to 10 times of linear searches after tracing down the decision tree.

Outline Algorithm Study Network Processor Implementation Summary Understanding Packet Classification Worst-case Complexity Analysis Existing Algorithmic Solutions Our Novel Algorithms Network Processor Implementation Current Hardware Limits External & Internal Traffics Intel IXP Implementation Summary

3 Summary No single algorithm will perform well for all cases: We search for algorithms that are “fast enough” and use “not too much” memory. Search Speed should be guaranteed in the worst-case. Hardware limits require flexible algorithms: Designing an effective algorithm should consider the features and limits of the hardware: e.g. SRAM size… Implementation of an algorithm should make full use of all hardware units: e.g. Local Memory for Port Indexing…

References [1] P. Gupta and N. McKeown, “Packet Classification Using Hierarchical Intelligent Cuttings”, Proc. Hot Interconnects, 1999 [2] M.H. Overmars and A.F. van der Stappen, “Range Searching and Point Location among Fat Objects”, J. of Algorithms, 21(3), 1996. [3] P. Gupta and N. McKeown, “Packet Classification on Multiple Fields”, Proc. ACM SIGCOMM 99, 1999. [4] B. Xu, D. Jiang, J. Li, “HSM: A Fast Packet Classification Algorithm”, Proc. IEEE 19th International Conference on Advanced Information Networking and Applications (AINA), 2005. [5] V. Srinivasan, et al., "Fast and Scalable Layer Four Switching“, Proc. ACM SIGCOMM, 1998. [6] F. Baboescu, and G. Varghese, "Scalable Packet Classification“, Proc. ACM SIGCOMM, 2001. [7] S. Singh, F. Baboescu, and G. Varghese, "Packet Classification Using Multidimensional Cutting“, Proc. ACM SIGCOMM, 2003. [8] F. Baboescu, S. Singh, and G. Varghese, “Packet Classification for Core Routers: Is There An Alternative To CAMs?” Proc. INFOCOM, 2003. [9] V.Srinivasan, S.Suri, and G.Varghese, “Packet Classification using Tuple Space Search”, Proc. SIGCOMM, 1999. [10] S. Singh and F. Baboescu, “Packet Classification Repository”, http://ial.ucsd.edu/classification [11] A. Feldman and S. Muthukrishnan, “Tradeoffs for Packet Classification”, Proc. INFOCOM, 2000. [12] T. Lakshman and D. Stiliadis, “High Speed Policy-based Packet Forwarding using Efficient Multi-dimensional Range Matching”, Proc. SIGCOMM,1998. [13] T.Y.C Woo, “A Modular Approach to Packet Classification: Algorithms and Results”, Proc. IEEE INFOCOM, 2000. [14] F. Geraci, M. Pellegrini, and P. Pisati, “Packet Classification via Improved Space Decomposition Techniques”, Proc. IEEE INFOCOM, 2005. [15] Y. Qi and J. Li, “Dynamic Cuttings: Packet Classification with Network Traffic Statistics”, Proc. 3rd International Trusted Internet Workshop (TIW), 2004. [16] P. Gupta and N. McKewon, “Algorithms for Packet Classification”, IEEE Network, March/April 2001, 2001. [17] D. E. Taylor , “Survey & Taxonomy of Packet Classification Techniques”, Washington University in Saint-Louis, US, 2004. [18] M.E. Kounavis, A. Kumar, H. Vin, R. Yavatkar and A.T. Campbell, “Directions in Packet Classification for Network Processors”, Proc. Second Workshop on Network Processors (NP2), 2003.

Thanks junl@tsinghua.edu.cn