BitCuts: A fast packet classification algorithm using bit-level cutting


1 BitCuts: A fast packet classification algorithm using bit-level cutting
2019/7/29 BitCuts: A fast packet classification algorithm using bit-level cutting Authors: Zhi Liu, Shijie Sun, Hang Zhu, Jiaqi Gao, Jun Presenter: Yi-Fang Huang Conference: Computer Communications, volume 109 (September 2017) CSIE CIAL Lab

2 2019/7/29 Introduction
This paper proposes BitCuts, a decision-tree algorithm that performs bit-level cutting. The contributions of this paper are summarized as follows:
We study the cutting schemes of existing decision-tree algorithms and reveal their inefficiencies in terms of speed and space. A "bit-level cut" can zoom into a densely clustered rule space and cut at the right granularity, which avoids unnecessary partitions and excessive memory consumption.
BitCuts uses parallel bit indexing to support fast child-node traversal and enable large node fanout. For 5-tuple rules, child-node indexing can be implemented with two bit-manipulation instructions, which makes decision-tree traversal very fast.
To build an efficient decision tree, a bit-selection algorithm determines the cutting bits at each tree node, aiming to find the bits that separate the rules most effectively.
Note: The reprogrammability and high-bandwidth interfaces of FPGAs make them a common choice for developing network architectures. However, their limited memory restricts the information that can be stored on chip, and off-chip memory is usually too slow to meet the demands of high-speed networks.
National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab
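The child-node indexing described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the "bit-manipulation instruction" is a parallel bit extract (in the style of the x86 BMI2 PEXT instruction), which the slides do not name, and it emulates that operation in software. The names `pext`, `Node`, `cut_mask`, and `find_leaf` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


def pext(value: int, mask: int) -> int:
    """Software parallel bit extract (assumed operation, emulated here).

    Gathers the bits of `value` at the positions set in `mask` and packs
    them into the low-order bits of the result, lowest position first.
    """
    result = 0
    out_pos = 0
    while mask:
        lowest = mask & -mask          # isolate the lowest set bit of the mask
        if value & lowest:
            result |= 1 << out_pos
        out_pos += 1
        mask &= mask - 1               # clear that bit and continue
    return result


@dataclass
class Node:
    # Hypothetical node layout: cut_mask == 0 marks a leaf.
    cut_mask: int = 0
    children: List["Node"] = field(default_factory=list)  # 2**popcount(cut_mask) entries
    rule_ids: List[int] = field(default_factory=list)     # rules stored at a leaf


def find_leaf(root: Node, header: int) -> Node:
    """Walk from the root to a leaf, indexing children by the extracted bits."""
    node = root
    while node.cut_mask:
        node = node.children[pext(header, node.cut_mask)]
    return node


# Usage: cut on header bits 0 and 3, giving a fanout of 4.
leaves = [Node(rule_ids=[i]) for i in range(4)]
root = Node(cut_mask=0b1001, children=leaves)
print(find_leaf(root, 0b1000).rule_ids)   # [2]
print(pext(0b10110110, 0b01010100))       # 3
```

Because the selected bits need not be adjacent, a single mask per node is enough to describe the cut. A 5-tuple header is wider than one 64-bit word, which is one plausible reason two such instructions would be needed.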

3 2019/7/29 The BitCuts overview Note: when some packets are to be filtered, every packet passing through the link must be checked against the filter for the system to operate normally.

4 Classification operation
2019/7/29 Classification operation Note: when some packets are to be filtered, every packet passing through the link must be checked against the filter for the system to operate normally.

5 Classification operation
2019/7/29 Classification operation

6 Classification operation
2019/7/29 Classification operation

7 2019/7/29 Rule set update The Micron AC-510 Board was chosen for its HMC memory module, but it lacks the network interface that would allow our system to be tested in a real network deployment. Instead, our architecture was tested with a spoofed network interface: packets were generated and results verified on-chip. The Test Packet Generator creates and sends a continuous stream of packets at a line rate of up to 10 Gbps. Rather than generating random packets on-chip, this component holds a group of pre-generated packets in on-chip memory and sends them in an infinite loop. The random packets themselves are generated in software and streamed to the test framework at initialization time through the Pico Framework. Note that, since the random packets are generated in software, the expected match results can also be computed in software and sent to the system at initialization time. Note: "spoofed" means imitated. Packets are generated in software; the Test Packet Generator stores the pre-generated packets in on-chip memory (block RAM) and then sends them repeatedly in an infinite loop.
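The test flow described above (generate packets in software, compute the expected match results in software, then replay the stored packets in an endless loop) can be modelled with a short sketch. It is a software stand-in only, not the on-chip Test Packet Generator or the Pico Framework API. The function names, the 104-bit packet width, and the (value, mask) rule format are all assumptions made for illustration.

```python
import itertools
import random
from typing import Iterator, List, Tuple

Rule = Tuple[int, int]  # hypothetical rule format: (value, mask)


def make_packets(count: int, seed: int = 0, width: int = 104) -> List[int]:
    """Pre-generate random packet headers in software (width is an assumption)."""
    rng = random.Random(seed)
    return [rng.getrandbits(width) for _ in range(count)]


def expected_match(packet: int, rules: List[Rule]) -> int:
    """Reference result: index of the first matching rule, or -1 if none match."""
    for index, (value, mask) in enumerate(rules):
        if packet & mask == value & mask:
            return index
    return -1


def replay(packets: List[int], expected: List[int]) -> Iterator[Tuple[int, int]]:
    """Send the stored packets, paired with their expected results, forever."""
    return itertools.cycle(zip(packets, expected))


# Usage: compute the expected results once, then take a few items from the loop.
rules = [(0b1, 0b1), (0, 0)]          # rule 1 has an all-zero mask, so it matches anything
packets = [2, 3]
expected = [expected_match(p, rules) for p in packets]
for packet, want in itertools.islice(replay(packets, expected), 4):
    print(packet, want)
```

Computing the expected results once at initialization keeps the verification step a plain comparison against stored values, which matches the slide's point that both packets and results are prepared in software.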

8 2019/7/29 Rule set update Note: "spoofed" means imitated. Packets are generated in software; the Test Packet Generator stores the pre-generated packets in on-chip memory (block RAM) and then sends them repeatedly in an infinite loop.

9 2019/7/29 Rule set update

10 2019/7/29 Evaluation Note: if the user specifies a pad (a packet segment in the Pcap), it is included in the module header.

11 Packet Throughput & Processing Latency
2019/7/29 Packet Throughput & Processing Latency Table 2 presents the packet throughput of our best system, 160 systolic PEs, along with other state-of-the-art hardware implementations of packet matching systems. Two entries in the table correspond to our work: the system tested with 1504 rules (the maximum rule count at 10 Gbps) and with 2^20 = 1,048,576 rules, as the first and second entries respectively. We note that previous systems outperform our implementation at lower rule counts; this is expected, since these systems use only on-chip memory. Our off-chip memory solution is the only hardware system (to the best of our knowledge) that can support much larger rule counts, achieving a processing rate of 16.4 Mbps at 1M rules.

12 2019/7/29 Resource Utilization Note: if the user specifies a pad (a packet segment in the Pcap), it is included in the module header.

