Dynamic Pipelining: Making IP-Lookup Truly Scalable. Jahangir Hasan, T. N. Vijaykumar. School of Electrical and Computer Engineering, Purdue University.


1 Dynamic Pipelining: Making IP-Lookup Truly Scalable Jahangir Hasan T. N. Vijaykumar School of Electrical and Computer Engineering, Purdue University SIGCOMM ’05 Rung-Bo-Su 10/26/05

2 0.Abstract An IP-lookup scheme must address five scalability challenges: –routing-table size, –lookup throughput, –implementation cost, –power dissipation, –and routing-table update cost.

3 Outline 1. Introduction 2. Background 3. Pipelined and Scalable IP-Lookup 4. Brief Review of TCAM-based Schemes 5. Methodology 6. Experimental Results 7. Conclusions

4 1.Introduction Fiber optics enable high line-rates. Two major problems for IP-lookup: –First, the lookup budget is only 2 ns per packet (for a 160 Gbps line-rate and a minimum packet size of 40 bytes). –Second, the routing table holds a large number of prefixes.
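The 2 ns figure follows directly from the line-rate and the minimum packet size. A minimal sketch of the arithmetic (the function name is a hypothetical helper, not something from the slides):

```python
def packet_interarrival_ns(line_rate_gbps: float, min_packet_bytes: int) -> float:
    """Time between back-to-back minimum-sized packets, in nanoseconds.

    bits / (Gbit/s) yields nanoseconds, because 1 Gbit/s = 1 bit per ns.
    """
    bits = min_packet_bytes * 8
    return bits / line_rate_gbps


# 40-byte packets at 160 Gbps: 320 bits / 160 bits-per-ns
print(packet_interarrival_ns(160, 40))
```

Every lookup must complete, or at least a new lookup must be able to start, within this interval for the router to sustain full line-rate.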

5 1.Introduction Key component: –the routing-table memory, which is searched through the prefixes to locate the one that matches the incoming packet.

6 1.Introduction Five key scaling requirements: –Memory required for the routing table. –Keeping up with the ever-increasing line-rates. –Keeping the complexity of heat removal and the cost of cooling reasonable. –Route-update cost. –Implementation cost and complexity.

7 1.Introduction Two categories of IP-lookup schemes: –Trie-based –TCAM-based.

8 1.Introduction Tries scale well in power, but they do not scale well in throughput unless they are pipelined. Two approaches for pipelining tries are: –Hardware-level pipelining (HLP) –Data-structure-level pipelining (DLP) To solve DLP’s problems, we propose scalable dynamic pipelining (SDP).

9 2.Background Requirements: –(1) To avoid denial-of-service attacks and instabilities in the network, the lookup must keep up with minimum-sized packets streaming in at full line-rate. –(2) Provide enough memory to hold the routing table. –(3) Choose the prefix with the longest match.

10 2.Background Trie-Based IP-lookup Schemes
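As a rough illustration of trie-based longest-prefix matching, the following is a minimal sketch of a 1-bit (binary) trie. The class names, the bit-string representation of prefixes, and the sample prefixes are assumptions made for this example and are not taken from the slides.

```python
class TrieNode:
    def __init__(self):
        self.children = [None, None]  # child for bit 0 and child for bit 1
        self.next_hop = None          # set only if a prefix ends at this node


class BinaryTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix_bits: str, next_hop: str) -> None:
        """Insert a prefix given as a bit string, e.g. '10' for 10*."""
        node = self.root
        for bit in prefix_bits:
            i = int(bit)
            if node.children[i] is None:
                node.children[i] = TrieNode()
            node = node.children[i]
        node.next_hop = next_hop

    def lookup(self, addr_bits: str):
        """Walk the address bits, remembering the last prefix seen on the path."""
        node = self.root
        best = node.next_hop
        for bit in addr_bits:
            node = node.children[int(bit)]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop
        return best


trie = BinaryTrie()
trie.insert("", "default")   # hypothetical prefixes and next hops
trie.insert("10", "A")
trie.insert("1010", "B")
print(trie.lookup("10100000"), trie.lookup("10010000"), trie.lookup("01110000"))
```

Each step of the walk is one memory access, so a lookup costs up to one access per address bit, which is what motivates both striding and pipelining.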

11 2.Background Multiple-bit Stride Tries (striding)

12 2.Background The Need for Pipelined Tries –One memory access may take longer than the packet inter-arrival time. –The problem is aggravated in tries, which perform multiple memory accesses for one lookup.

13 3.Pipelined and Scalable IP-Lookup The observation that pipelining can be used to solve the scalability problem of IP-lookup is not new. –Hardware-level pipelined (HLP) scheme. –Data-structure-level pipelined (DLP) scheme.

14 3.Pipelined and Scalable IP-Lookup Hardware-Level Pipelining –k: the number of levels in the multi-bit trie. –d: the total delay of one memory access. –t: the target rate is one lookup every t seconds. –HLP pipelines, at the hardware level, the entire memory holding the trie into k*d/t stages.
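To make the k*d/t stage count concrete, here is a small sketch. Rounding up to a whole number of stages is an assumption of this example, and the numeric values are hypothetical rather than figures from the presentation.

```python
import math


def hlp_stages(k: int, d_ns: float, t_ns: float) -> int:
    """Pipeline depth needed so that a lookup costing k accesses of d_ns each
    can still accept one new lookup every t_ns (rounded up, by assumption)."""
    return math.ceil(k * d_ns / t_ns)


# Hypothetical numbers: 4 trie levels, 6 ns per memory access, 2 ns per packet
print(hlp_stages(4, 6.0, 2.0))
```

The depth grows as t shrinks, so higher line-rates force ever deeper hardware pipelining of one large memory.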

15 3.Pipelined and Scalable IP-Lookup Hardware-Level Pipelining

16 3.Pipelined and Scalable IP-Lookup Hardware-Level Pipelining [Figure: memory organization with address bits X and Y; labels: 2^X, 2^Y, Decoder, Memory Array Access, Multiplex, Output]

17 3.Pipelined and Scalable IP-Lookup Hardware-Level Pipelining [Figure: memory organization with address bits X and Y; labels: 2^X, 2^Y, Decoder, Memory Array Access, Multiplex]

18 3.Pipelined and Scalable IP-Lookup Data-Structure-Level Pipelining –Places each level of the trie in a different memory, so that each memory is accessed only once per packet lookup. –Because it does not rely on expensive memory technologies or deep hardware pipelining, it scales well in power and implementation cost.
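The idea of one memory per trie level can be sketched as follows. This is an illustrative model only: each stage is a Python dict keyed by the bit string leading to a node, which is an assumption made for brevity (hardware would store pointers), and the function names and prefixes are hypothetical.

```python
def build_stages(prefixes: dict) -> list:
    """Split a 1-bit trie into per-level memories.

    stages[d] maps each depth-d node (named by its bit string) to its next hop,
    or to None if the node is only an internal node.
    """
    depth = max(len(p) for p in prefixes)
    stages = [dict() for _ in range(depth + 1)]
    for prefix, hop in prefixes.items():
        for d in range(len(prefix) + 1):
            stages[d].setdefault(prefix[:d], None)  # create the path nodes
        stages[len(prefix)][prefix] = hop
    return stages


def dlp_lookup(stages: list, addr_bits: str):
    """Visit stage 0, 1, 2, ... in order; return the result and accesses per stage."""
    best = None
    accesses = [0] * len(stages)
    for d, memory in enumerate(stages):
        accesses[d] += 1
        key = addr_bits[:d]
        if key not in memory:
            break
        if memory[key] is not None:
            best = memory[key]
    return best, accesses


stages = build_stages({"": "default", "10": "A", "1010": "B"})
print(dlp_lookup(stages, "10100000"))
```

Since a lookup touches each stage at most once, different packets can occupy different stages in the same cycle, which is where the pipelined throughput comes from.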

19 3.Pipelined and Scalable IP-Lookup

20 3.Pipelined and Scalable IP-Lookup Three remaining challenges: –Scalability in memory size, –in route-update cost, –and in lookup throughput.

21 3.Pipelined and Scalable IP-Lookup DLP’s Scalability Problems in Memory Size –Each memory stage must be large enough for any prefix distribution. –For the prefix distribution shown in Figure 4, its worst-case memory size would be no better.

22 3.Pipelined and Scalable IP-Lookup DLP’s Scalability Problems in Route-update Cost –Multibit trie: an update may touch arbitrarily many nodes. –Tree Bitmap: almost doubles the size of each trie node.

23 3.Pipelined and Scalable IP-Lookup DLP’s Non-Scalability in Throughput. Scalable Dynamic Pipelining (SDP). [Figure: example with the prefixes *, 0*, 00*, 000*, 1*, 10*, 100*, 1010*]

24 3.Pipelined and Scalable IP-Lookup DELETE:

25 3.Pipelined and Scalable IP-Lookup Jump Nodes: –A trie node with a stride of k bits must have an array of 2^k pointers. –Often there may be only one child, and the remaining pointers are null.
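The waste from single-child nodes can be illustrated with a generic path-compression count. This sketch is not the paper's jump-node layout, which the slides do not spell out; it only compares how many nodes a plain 1-bit trie needs with how many remain if every chain of single-child, non-prefix nodes is collapsed. The function names and prefixes are hypothetical.

```python
def trie_nodes(prefixes: set) -> set:
    """All nodes of a plain 1-bit trie, each named by the bit string reaching it."""
    nodes = set()
    for prefix in prefixes:
        for d in range(len(prefix) + 1):
            nodes.add(prefix[:d])
    return nodes


def compressed_nodes(prefixes: set) -> set:
    """Nodes that survive collapsing single-child chains (assumed compression rule):
    the root, every node where a prefix ends, and every branching node."""
    nodes = trie_nodes(prefixes)
    keep = set()
    for node in nodes:
        n_children = ((node + "0") in nodes) + ((node + "1") in nodes)
        if node == "" or node in prefixes or n_children == 2:
            keep.add(node)
    return keep


prefixes = {"1010", "00000000"}
print(len(trie_nodes(prefixes)), len(compressed_nodes(prefixes)))
```

In the plain trie most nodes on these two paths exist only to hold a single non-null pointer, which is the overhead that skipping over such chains is meant to remove.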

26 3.Pipelined and Scalable IP-Lookup Jump Nodes:

27 3.Pipelined and Scalable IP-Lookup Per-Stage Memory Bound (a) Binary search tree with N leaves. (b) Memory size of a trie with jump nodes for the worst-case prefix distribution of Figure 4, compared to the size of a 1-bit trie.

28 3.Pipelined and Scalable IP-Lookup Per-Stage Memory Bound (c) The space taken at various levels by a trie with jump nodes, for various prefix distributions.

29 3.Pipelined and Scalable IP-Lookup System Architecture –Shadow trie: a copy of the trie containing all the required auxiliary information. –It is accessed only during the construction or update of the trie. –It uses slow and cheap memory (DRAM). –Modifications access only the shadow trie, and IP-lookups access only the SDP trie.

30 3.Pipelined and Scalable IP-Lookup Ensures that no read operation may encounter the data structure in an inconsistent or erroneous state.

31 3.Pipelined and Scalable IP-Lookup Optimum Cost Incremental Route-updates

32 3.Pipelined and Scalable IP-Lookup Memory Management Overhead. Scalability in Lookup Rate.

33 4. Brief Review of TCAM-based Schemes Content Addressable Memory (CAM): –Compares all memory locations against the input key to find matching entries. Ternary Content Addressable Memory (TCAM): –Supports wildcard bits in the entries. –Finds the longest matching prefix in one operation.
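A software model of the ternary match may help: each entry is a value plus a mask whose zero bits are wildcards, every entry is compared against the key, and the longest matching prefix wins. This is a minimal sketch under stated assumptions; the 8-bit address width, the function names, and the entries are hypothetical, and real TCAM hardware performs all comparisons in parallel rather than in a loop.

```python
def tcam_entry(prefix_bits: str, width: int = 8):
    """Encode a prefix as (value, mask, length); mask bits of 0 are wildcards."""
    length = len(prefix_bits)
    value = (int(prefix_bits, 2) << (width - length)) if length else 0
    mask = ((1 << length) - 1) << (width - length)
    return value, mask, length


def tcam_lookup(entries: list, addr: int):
    """Compare the key against every entry and keep the longest match."""
    matches = [
        (length, hop)
        for (value, mask, length), hop in entries
        if (addr & mask) == value
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0])[1]


entries = [
    (tcam_entry(""), "default"),   # hypothetical entries and next hops
    (tcam_entry("10"), "A"),
    (tcam_entry("1010"), "B"),
]
print(tcam_lookup(entries, 0b10100000))
```

The loop over every entry mirrors the fact that a single TCAM access activates all memory locations, which is the source of its power cost.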

34 4. Brief Review of TCAM-based Schemes Because a single access activates all memory locations, as opposed to just one, a TCAM dissipates much more power than RAM. TCAMs are pipelined at the hardware level. TCAMs do not scale well in power and implementation cost at high line-rates.

Methodology Use CACTI 3.2, a tool that accurately models SRAM and CAM structures. Modeled only for 100 nm CMOS technology.

Experimental Results (a) Worst-case per-stage memory versus trie-levels for DLP

Experimental Results (b) Worst-case total memory versus trie-levels for HLP

Experimental Results (c) A comparison of total worst-case memory versus routing table size for various IP-lookup schemes.

Experimental Results Comparison of power dissipation versus line-rate for various schemes with table sizes of (a) 250,000 (b) 500,000 (c) 1 million prefixes.

Experimental Results Comparison of chip area versus line-rate for various schemes with table sizes of (a) 250,000 (b) 500,000 (c) 1 million prefixes.

Experimental Results Summary of Results –HLP does not scale well in total memory size, power dissipation, route-update cost, and implementation cost. –DLP does not scale well in total memory size, lookup throughput, and route-update cost. –TCAMs do not scale well in implementation cost and power dissipation.

Conclusions Proposed scalable dynamic pipelining (SDP). Three key innovations: –A proven worst-case per-stage memory bound that is significantly tighter than those of previous schemes. –Incremental route updates at optimum cost. –Scalability at both the data-structure level and the hardware level.

Conclusions SDP naturally scales in power and implementation cost. Using detailed hardware simulation, we show that SDP is the only scheme that achieves all five scalability requirements.