1 Energy Efficient Multi-match Packet Classification with TCAM Fang Yu

Presentation transcript:

1 Energy Efficient Multi-match Packet Classification with TCAM Fang Yu

2 Outline
Introduction to multi-match classification
Multi-match classification using TCAM
–May create many intersections
–Consumes many TCAM resources and a lot of power
Filter set splitting algorithm to remove intersections
Simulation results
Conclusions and future work

3 Packet Classification
Single-Match Classification
–Assumption: all the filters are associated with priorities
–Only the highest-priority match matters
–E.g., longest prefix match
Multi-Match Classification
–Report all matching results
–No priority among filters
–PNE (iBox): identify all the relevant functions
–Intrusion Detection System: identify all the related rules
–Need faster solutions because of the complex follow-up processing
[Figure: a packet, divided into packet header and packet payload]

5 Ternary-CAM (TCAM)
Fully associative memory: compares the input string with all the entries in parallel
–If multiple entries match, reports the index of the first match
Each cell takes one of three logic states
–‘0’, ‘1’, and ‘?’ (don’t care)
Current TCAM technology
–Fast match time: 4 ns
–Size: 1-2 MB, priced at $200-$300
–Power consumption is high
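The first-match behaviour described on this slide can be modelled in software. The sketch below is only an illustration: a real TCAM compares all entries in parallel in hardware, whereas this model scans them in order, and the names `ternary_match` and `tcam_lookup` are hypothetical.

```python
def ternary_match(pattern, key):
    """True if every cell of `pattern` is '?' (don't care) or equals the key bit."""
    return len(pattern) == len(key) and all(
        p in ("?", k) for p, k in zip(pattern, key)
    )


def tcam_lookup(entries, key):
    """Software model of a TCAM: return the index of the first matching entry.

    Returns None when no entry matches. Hardware does this comparison in
    parallel; the sequential loop here only reproduces the reported result.
    """
    for index, pattern in enumerate(entries):
        if ternary_match(pattern, key):
            return index
    return None


entries = ["10??", "1???", "????"]
print(tcam_lookup(entries, "1011"))  # 0: entries 1 and 2 also match but are not reported
print(tcam_lookup(entries, "1100"))  # 1
print(tcam_lookup(entries, "0000"))  # 2
```

The first lookup shows the limitation that motivates the rest of the talk: three entries match, but only index 0 is reported.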

6 Report Multi-match Results
Problem: TCAM only reports the first matching result
–For example, two filters intersect
 – “Tcp $SQL_SERVER 1433 $EXTERNAL_NET any”
 – “Tcp Any Any Any 139”
–Intersection:
 – “Tcp $SQL_SERVER 1433 $EXTERNAL_NET 139”
–Return a bit-vector of matched results? The processing cost for the bit-vector can still be O(N)
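To make the notion of an intersection filter concrete, here is a minimal sketch over bit-level ternary patterns rather than Snort rule headers. It assumes both patterns have the same width and use only '0', '1', and '?'; the function name `intersect` is hypothetical.

```python
def intersect(a, b):
    """Return the ternary pattern matched by exactly the keys that match both
    `a` and `b`, or None if the two patterns have no key in common.

    Assumes equal-width patterns over the alphabet '0', '1', '?'.
    """
    cells = []
    for x, y in zip(a, b):
        if x == "?":
            cells.append(y)          # a does not care, so b decides
        elif y == "?" or x == y:
            cells.append(x)          # b does not care, or both agree
        else:
            return None              # one requires '0', the other '1'
    return "".join(cells)


print(intersect("10??", "??01"))  # 1001
print(intersect("1???", "0???"))  # None
```

Field by field, this mirrors the slide's example: the destination port "any" meets "139" and yields "139", just as '?' meets '1' and yields '1'.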

7 Report Multi-match Results (cont.)
Solution: add additional intersection filters
Pros:
–High speed
–Returns all the matching results within one cycle
–Deterministic lookup time
Cons:
–May require a lot of storage and is not energy efficient
 –Creates ~10N intersection filters for the Snort rule set
 –May create O(N^F) intersection filters in the worst case
–Not easily updatable
Goal: decrease the number of intersections and make updates easy

8 Observation
Split filters into two sets to reduce intersections
–Report the union of results from all sets
–No need to include the intersections of filters from different sets
–Decreases the number of filters in TCAM, and hence power consumption
–Increases the number of TCAM accesses
Original: N filters + O(N^2) intersections, 1 TCAM lookup
Two sets: N filters + 1 intersection, 2 TCAM lookups
[Figure: filters F_1 and F_N, with regions matching F_1, matching F_N, and matching both F_1 and F_N]
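The trade-off on this slide can be sketched with two toy filters. In the sketch below each TCAM entry carries the set of filters it stands for, and a multi-match query is the union of one first-match lookup per set. The table layout, the labels "F1" and "FN", and the function names are assumptions made for illustration.

```python
def first_match(table, key):
    """Return the filter ids attached to the first entry of `table` matching `key`.

    `table` is a list of (ternary pattern, set of filter ids) in TCAM order.
    """
    for pattern, filter_ids in table:
        if all(p in ("?", k) for p, k in zip(pattern, key)):
            return set(filter_ids)
    return set()


def multi_match(tables, key):
    """One lookup per table (one TCAM access per filter set), results unioned."""
    matched = set()
    for table in tables:
        matched |= first_match(table, key)
    return matched


# One set: the intersection entry must be stored ahead of both filters.
single = [[("1??1", {"F1", "FN"}), ("1???", {"F1"}), ("???1", {"FN"})]]
# Two sets: no intersection entry is needed, at the cost of a second lookup.
split = [[("1???", {"F1"})], [("???1", {"FN"})]]

print(multi_match(single, "1001") == multi_match(split, "1001"))  # True
```

Here the single-set layout needs three entries and one lookup, while the split layout needs two entries and two lookups.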

9 Problem Definition
Given a set of filters F = {F_1, F_2, ..., F_N}
Filters create a set of intersections I = {I_1, I_2, ..., I_M}
–e.g., I_1 = intersection of (F_1, F_5, F_6)
How to divide the filters into several sets such that
–Residual intersection set I’: intersections of filters that are all in the same set
–N + |I’| < TCAM size
–The number of sets (TCAM accesses) is minimum
–NP-hard problem!
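The quantities in this definition can be computed directly for a candidate partition. The sketch below follows the slide's model, in which an intersection is residual only when all of its filters land in the same set. The representation (intersections as frozensets of filter ids, a `set_of` dictionary) and the function names are assumptions.

```python
def residual_intersections(intersections, set_of):
    """Return the intersections whose filters were all placed in the same set.

    `intersections` is a list of frozensets of filter ids; `set_of` maps each
    filter id to the index of the set it was assigned to.
    """
    return [i for i in intersections if len({set_of[f] for f in i}) == 1]


def tcam_entries(num_filters, intersections, set_of):
    """N + |I'|: the number of TCAM entries needed under this partition."""
    return num_filters + len(residual_intersections(intersections, set_of))


intersections = [frozenset({1, 5, 6}), frozenset({2, 3})]
set_of = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1}
print(tcam_entries(6, intersections, set_of))  # 7: only {2, 3} remains
```

A partition is feasible when this count is below the TCAM size; the optimization problem asks for the fewest sets that achieve that.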

10 Split Filters into Two Sets
Still an NP-hard problem (known as maximum set splitting or maximum hypergraph cut)
Best known approximation algorithms
–Yield a performance ratio of 0.72 relative to the optimum solution
–Require quadratic programming, so they are slow when the number of filters is large
Our algorithm is based on Johnson’s algorithm
–Removes at least half of the intersections
–O(NM) complexity, where N is the total number of filters and M is the total number of intersections

11 Maximum Satisfiability Problem
–A set of literals {F_1, ¬F_1, F_2, ¬F_2, ..., F_N, ¬F_N}
–A set of clauses, each clause is a subset of literals
 –E.g., C_1 = {F_1, F_5, F_6}
–Goal: find an assignment of F that satisfies the maximum number of clauses

12 Johnson’s Algorithm for the Maximum Satisfiability Problem
Assign each clause c a weight of 2^{-|c|}
–E.g., the weight of C_1 = {F_1, F_5, F_6} is 2^{-3}
Let F_i be any variable that hasn’t been assigned a value yet
–If the total weight of the clauses containing F_i is higher than that of the clauses containing ¬F_i
 –Assign F_i a true value and remove all clauses containing F_i
 –Multiply the weight of all the clauses containing ¬F_i by 2
–Otherwise
 –Assign F_i a false value and remove all clauses containing ¬F_i
 –Multiply the weight of all the clauses containing F_i by 2
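The steps on this slide can be sketched as follows. This is an illustrative implementation, not the authors' code: literals are encoded as signed integers (+i for F_i, -i for ¬F_i), variables are processed in index order, and no clause is assumed to contain both a variable and its negation. The names `johnson_max_sat` and `count_satisfied` are hypothetical.

```python
from fractions import Fraction


def johnson_max_sat(num_vars, clauses):
    """Greedy weighted assignment in the style of Johnson's algorithm.

    `clauses` is a list of lists of non-zero ints: +i means F_i, -i means
    not F_i. Returns a dict mapping variable index to True/False.
    """
    # Weight 2^-|c| for every clause that is not yet satisfied.
    weights = {idx: Fraction(1, 2 ** len(c)) for idx, c in enumerate(clauses)}
    assignment = {}
    for var in range(1, num_vars + 1):
        pos = [i for i in weights if var in clauses[i]]
        neg = [i for i in weights if -var in clauses[i]]
        value = sum(weights[i] for i in pos) > sum(weights[i] for i in neg)
        assignment[var] = value
        satisfied, falsified = (pos, neg) if value else (neg, pos)
        for i in satisfied:
            del weights[i]       # clause is satisfied, remove it
        for i in falsified:
            weights[i] *= 2      # clause lost one literal, double its weight
    return assignment


def count_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in c) for c in clauses
    )


clauses = [[1, 5, 6], [-1, -5, -6], [1, 2], [-1, -2]]
result = johnson_max_sat(6, clauses)
print(count_satisfied(clauses, result), "of", len(clauses), "clauses satisfied")
```

Exact fractions are used for the weights so that ties are detected reliably; ties resolve to false, matching the slide's "otherwise" branch.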

13 Johnson’s Theory
If all the clauses have at least k literals
–Johnson’s algorithm satisfies at least a fraction (2^k - 1)/2^k of the total clauses
–e.g., k = 2: at least ¾ of the clauses are satisfied
–It is proved that (2^k - 1)/2^k is the best approximable bound for k > 2
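The bound can be seen from a short weight argument, sketched here under notation that is not on the slide: m is the number of clauses and W_0 is the total initial weight. Each assignment step removes at least as much weight as it adds by doubling, so the total weight of the not-yet-satisfied clauses never grows. A clause that ends up unsatisfied has had every one of its literals falsified, so its final weight is 2^{-|c|} · 2^{|c|} = 1. Counting unsatisfied clauses by their final weight gives:

```latex
\begin{align*}
W_0 &= \sum_{c} 2^{-|c|} \le m \cdot 2^{-k}, \\
\#\text{unsatisfied} &\le W_0 \le \frac{m}{2^{k}}, \\
\#\text{satisfied} &\ge m - \frac{m}{2^{k}} = \frac{2^{k}-1}{2^{k}}\, m .
\end{align*}
```

For k = 2 the last line gives the ¾ figure quoted on the slide.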

14 Filter Split Algorithm
For any intersection (e.g., I_1 = intersection of F_1, F_5, and F_6), add two clauses
–C = {F_1, F_5, F_6} and C’ = {¬F_1, ¬F_5, ¬F_6}
–The total number of clauses is 2M
Run Johnson’s algorithm and assign each filter F_i either a true value (put in set one) or a false value (put in set two)

15 Filter Split Algorithm (cont.)
According to Johnson’s theory
–At least ¾ of the clauses are satisfied: 2M × 3/4 = 1.5M
–So at most 0.5M clauses are unsatisfied, and at least 0.5M of the intersections have both clauses satisfied
Suppose that, for the intersection of F_1, F_5, and F_6, both C = {F_1, F_5, F_6} and C’ = {¬F_1, ¬F_5, ¬F_6} are satisfied
–At least one of F_1, F_5, F_6 is true and at least one is false
–F_1, F_5, F_6 are split into different sets, so this intersection doesn’t need to be present in TCAM
At least 50% of the intersections are removed!
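The split can also be sketched directly on the intersections, without materializing the 2M clauses. In the sketch below, the positive clause C of an intersection is still live while none of its filters has been set true, the negated clause C’ is still live while none has been set false, and both carry weight 2^-(number of unassigned filters). This is an illustration under those assumptions, with hypothetical names, and it assumes every intersection involves at least two filters.

```python
from fractions import Fraction


def split_filters(filters, intersections):
    """Greedy two-way split: True means set one, False means set two.

    `intersections` is a list of sets of filter ids, each with >= 2 members.
    """
    side = {}
    for f in filters:
        w_true = Fraction(0)   # live positive clauses C that contain f
        w_false = Fraction(0)  # live negated clauses C' that contain f
        for members in intersections:
            if f not in members:
                continue
            assigned = [side[g] for g in members if g in side]
            weight = Fraction(1, 2 ** (len(members) - len(assigned)))
            if True not in assigned:    # C is not satisfied yet
                w_true += weight
            if False not in assigned:   # C' is not satisfied yet
                w_false += weight
        side[f] = w_true > w_false
    return side


def remaining(intersections, side):
    """Intersections whose filters all ended up in the same set."""
    return [m for m in intersections if len({side[f] for f in m}) == 1]


intersections = [{1, 5, 6}, {1, 2}, {2, 3}, {3, 4}, {1, 4}]
side = split_filters([1, 2, 3, 4, 5, 6], intersections)
print(len(remaining(intersections, side)), "of", len(intersections), "intersections remain")
```

Each filter scans every intersection once, which matches the O(NM) cost quoted for the algorithm when set membership tests are treated as constant time.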

16 Simulation Results
SNORT intrusion detection rule set
Table columns: Version; Filter Set Size; No split (Unique Intersections, TCAM Entries); Split into 2 sets (Remaining Intersections, TCAM Entries, Saving in %)
[Four data rows; the numeric values did not survive in the transcript]

17 Split filters into Multiple Sets

18 Conclusion
We propose a filter split algorithm to decrease the number of intersections
–O(NM) complexity
–Guaranteed to remove at least 50% of the intersections each time the filter set is split
 –Saves TCAM space
 –Reduces power consumption
–Saves ~80% of TCAM space and power consumption for the Snort rule sets
 –At the cost of one more TCAM access

19 Ongoing Work
Narrow down the search region (Region Split)
–E.g., a tcp packet only needs to search tcp-related filters
–Use SRAM accesses to narrow down the search region
Filter Splits Only
–Accesses all filters in TCAM
–Memory accesses = number of sets
–Storage cost = O(N)
–Update cost: low
Region Splits Only
–Tree-based algorithms
–Memory accesses = O(log N)
–Storage cost = O(N^F)
–Power consumption: minimal
–Update cost: high
Middle ground
–Hybrid SRAM and TCAM approach
–Memory accesses = several (e.g., <5)
–Storage cost = O(N)
–Power consumption: moderate
–Update cost: moderate
[Figure labels: Region Split (SRAM access); Filter Split (TCAM accesses)]