Online NetFPGA decision tree statistical traffic classifier

1 Online NetFPGA decision tree statistical traffic classifier
2019/5/11
Authors: Alireza Monemi, Roozbeh Zarei, Muhammad N. Marsono
Publisher/Journal: Elsevier / Computer Communications 36 (2013), pages 1329–1340
Referenced: 17 times
Presenter: Yu-Hsiang Lin
Date: 2018/12/26
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.
CSIE CIAL Lab

2 Introduction (1)
The paper proposes an online statistical traffic classifier that uses the C4.5 machine learning algorithm and runs on the NetFPGA platform.
The NetFPGA classifier is constructed by adding three main modules to the NetFPGA reference switch design: the Netflow probe, the feature extractor, and the ML classifier.
The proposed classifier is able to classify input traffic at the maximum line speed of the NetFPGA platform, i.e. 8 Gbps, without any packet loss.
National Cheng Kung University CSIE Computer & Internet Architecture Lab (CIAL)

3 Introduction (2)
Network traffic classification algorithms are generally divided into two groups based on network data layering: stateless and stateful.
In stateless classifiers (often called packet classifiers), the features that distinguish one traffic class from the others are extracted from individual packets. IP address, port address, and even the interval between two consecutive packets are examples of stateless features.
Stateful features are extracted from traffic flows, which means that a stateful flow classifier needs to keep track of all active flows.
For a stateful classifier, statistical properties of the transport layer such as flow duration, flow size, and flow packet inter-arrival time distinguish between different classes of network applications.

4 Design requirements for inline NetFPGA traffic classifier
The proposed device is intended to be placed in front of an edge router or a campus gateway.
In this paper, all flows sent out from the campus network are called uplink flows, while received flows are called downlink flows.
The flows transferred between two endpoints (one inside the campus and the other outside) are together called a bidirectional flow.

5 Netflow probe
The Netflow unit groups all packets that share the same 5-tuple into a unique flow and keeps the following information for every flow: number of packets and bytes in the flow, first and last packet inter-arrival time, and TCP flags.
The proposed design needs access to the Netflow information of the uplink and downlink flows (the bidirectional flow) separately and at the same time.

6 Feature extractor (1)
The feature extractor must be able to receive Netflow data and extract the appropriate feature set.
It extracts statistical features from the first few packets transmitted between two endpoints on the NetFPGA platform.
The paper introduces a set of 35 real-time features that can easily be extracted from the Netflow data.

7 Feature extractor (2)

8 ML classifier
Decision tree (DT) based classifiers are the cornerstone of a machine learning based traffic classifier. An efficient DT design must have the following characteristics:
The hardware structure of the search tree must be updatable with a new tree by software. Classification accuracy decreases over time as traffic behavior changes, so a decision tree generated from a given training dataset may become outdated.
Updating the DT must not interrupt or disrupt the classification process. Updating the DT in hardware is slow, since the host computer of the NetFPGA platform performs the update through the slow register bus, so a mechanism that avoids system downtime during an update is required.

9 Proposed architecture(1) – traffic classifier switch
The time stamp is a 64-bit counter with 1 μs accuracy. It is added to the design to capture the arrival time of each packet.

10 Proposed architecture (2) – flow classifier block diagram & packet bus data format

11 Proposed architecture (3) – flow bus data format

12 Netflow probe (1)
The Netflow probe provides two output buses: the packet bus and the flow bus.
The packet bus has the same format as the input bus, except that the class field is updated with a new value. This value is a unique number (class-ID) that the software classifier assigns to each class.
If the packet belongs to a flow that has already been classified, the class field is set to the class-ID. Otherwise, the class field is left empty (0x00).
The flow bus contains a total of 11 words.

13 Netflow probe (2)
The endpoint with the larger IP address is called endpoint1 and the other is called endpoint2. The flow from endpoint1 to endpoint2 is called flow1, while the flow in the opposite direction is called flow2.
A unique 42-bit number (called the flow-ID) is assigned to both flows transmitted between the two endpoints.
To identify the flow to which the current packet belongs, the swap bit is reset when the current packet is sent from endpoint1 to endpoint2; otherwise, it is set.
If the current packet is the first packet sent between the two endpoints, the New field is asserted.
The Class field contains the flow class, and the Loc field gives the flow location in the flow lookup table.
The Total_Pck_Ovf and Total_Pck_Num fields give the total number of packets transmitted between the two endpoints; they determine when the features should be extracted from a flow.
The flow bus is generated for UDP and TCP packets only.

14 Netflow probe – header parser
This module examines the header of each packet on arrival and extracts the significant fields.
The 104 bits of the 5-tuple are passed to a CRC-64 hash generator. The first 42 bits of the generated hash value are used as the flow-ID.
To generate the same flow-ID for both the uplink and the downlink flow, the source IP and port addresses are swapped with the destination IP and port addresses when the source IP address is larger than the destination IP address.
The TCP-UDP packet field is asserted if the packet is UDP or TCP. For a UDP packet, the TCP flags are all zero.
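The flow-ID derivation described above can be sketched in software as follows. This is only an illustration under stated assumptions: the slides do not name the CRC-64 polynomial, initial value, or field order, so the ECMA-182 polynomial, a zero initial value, the byte layout of the key, and the reading of "first 42 bits" as the most significant 42 bits are all assumptions, and the function names are hypothetical.

```python
import ipaddress

# Assumption: ECMA-182 polynomial; the slides do not say which CRC-64 is used.
CRC64_POLY = 0x42F0E1EBA9EA3693
MASK64 = 0xFFFFFFFFFFFFFFFF


def crc64(data: bytes) -> int:
    """Bitwise, MSB-first CRC-64 with zero initial value and no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 56
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ CRC64_POLY) & MASK64
            else:
                crc = (crc << 1) & MASK64
    return crc


def flow_id(src_ip, src_port, dst_ip, dst_port, proto):
    """Return (42-bit flow-ID, swapped) for one packet's 5-tuple."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    # Header-parser rule from the slide: exchange source and destination
    # when the source IP address is the larger one.
    swapped = src > dst
    if swapped:
        src, dst = dst, src
        src_port, dst_port = dst_port, src_port
    # 32 + 32 + 16 + 16 + 8 = 104 bits; this field order is an assumption.
    key = (src.to_bytes(4, "big") + dst.to_bytes(4, "big")
           + src_port.to_bytes(2, "big") + dst_port.to_bytes(2, "big")
           + proto.to_bytes(1, "big"))
    # Assumption: "first 42 bits" means the 42 most significant bits.
    return crc64(key) >> (64 - 42), swapped
```

Both directions of a conversation hash to the same flow-ID, and only the `swapped` flag differs, which is what lets a single table entry serve the bidirectional flow.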

15 Netflow probe - Flow lookup table (1)
The flow lookup table (FLUT) receives the flow-ID from the header parser module and returns the address of the statistical flow record within the external SRAM memory.
Another output of the FLUT is the detected flow class.

16 Netflow probe - Flow lookup table (2)
The flow-ID is divided into two parts. The first 10 bits are used as the BRAM address, while the remaining 32 bits are used as the flow signature.
Each line of the BRAM can store up to 8 flow records (signature, class, and Least Recently Used Order (LRUO)).
The lookup table compares the signature of a newly arrived packet with all the signatures stored in one BRAM line. If a match occurs, the LRUO of that flow is set to 7 and the LRUO of each of the other flows is reduced by one. Otherwise, an empty place is selected to store the flow signature.
The address returned by the FLUT is the 10-bit flow address joined with the 3-bit position of the current flow within the BRAM line. The combined address is called the flow location.
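A minimal software model of this lookup may help make the addressing concrete. It is a sketch, not the hardware design: the class and method names are hypothetical, the "first 10 bits" are assumed to be the most significant bits of the flow-ID, the LRUO is assumed to saturate at zero, a newly inserted flow is assumed to start with LRUO 7, and evicting the entry with the lowest LRUO when a line is full is an assumption, since the slide only describes the case where an empty place exists.

```python
WAYS = 8        # flow records per BRAM line
ADDR_BITS = 10  # width of the BRAM line address
SIG_BITS = 32   # width of the flow signature


class FlowLookupTable:
    def __init__(self):
        # Each slot is None (empty) or a list [signature, class_id, lruo].
        self.lines = [[None] * WAYS for _ in range(1 << ADDR_BITS)]

    def lookup(self, flow_id):
        """Return (flow_location, class_id, is_new) for a 42-bit flow-ID."""
        addr = flow_id >> SIG_BITS             # assumed: top 10 bits
        sig = flow_id & ((1 << SIG_BITS) - 1)  # remaining 32 bits
        line = self.lines[addr]
        pos = next((i for i, slot in enumerate(line)
                    if slot is not None and slot[0] == sig), None)
        is_new = pos is None
        if is_new:
            empty = [i for i, slot in enumerate(line) if slot is None]
            if empty:
                pos = empty[0]
            else:
                # Assumption: a full line evicts the lowest-LRUO record.
                pos = min(range(WAYS), key=lambda i: line[i][2])
            line[pos] = [sig, 0, 0]  # class 0x00 means "not yet classified"
        # The touched record gets LRUO 7; all others are reduced by one.
        for i, slot in enumerate(line):
            if slot is not None:
                slot[2] = 7 if i == pos else max(slot[2] - 1, 0)
        # Flow location: 10-bit line address joined with 3-bit position.
        return (addr << 3) | pos, line[pos][1], is_new
```

With 1024 lines of 8 records, the 13-bit flow location addresses 8192 entries.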

17 Netflow probe - Inactive flow detector (1)
The inactive flow detector determines how long flow information has remained in memory without any packet arriving from either endpoint.
A flow that has not been active for 60 seconds is removed from the FLUT.

18 Netflow probe - Inactive flow detector (2)
Whenever a new packet arrives, its flow location is passed to the inactive flow detector, which uses the flow location as the address of a dual-port BRAM and stores the current timer value there.
The stored value indicates the last time a flow was active; expired flows have a last-active time one second greater than the current value of the timer.
To detect inactive flows, the other port of this memory is connected to a counter that counts from 0 to the maximum number of FLUT entries while a comparator checks for expired flows.
If a match is found, the Delete pin is asserted and the flow time stamp is reset to zero.
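The store-on-arrival and scan-for-expiry behavior can be illustrated with a simplified software model. It is deliberately not the hardware comparison: instead of comparing stamps against the hardware timer as the slide describes, this sketch keeps absolute second counts and expires an entry once 60 seconds have passed. The class name, method names, and the use of `None` (rather than zero) as the empty marker are hypothetical.

```python
TIMEOUT_S = 60          # inactivity limit stated on the slides
NUM_ENTRIES = 1 << 13   # 10-bit line address + 3-bit position


class InactiveFlowDetector:
    def __init__(self):
        # One last-active stamp per flow location; None means "no flow".
        self.last_active = [None] * NUM_ENTRIES

    def touch(self, flow_location, now_s):
        """Packet-arrival port: record when the flow was last active."""
        self.last_active[flow_location] = now_s

    def scan(self, now_s):
        """Scan port: walk every entry and return the expired locations."""
        expired = []
        for loc, stamp in enumerate(self.last_active):
            if stamp is not None and now_s - stamp >= TIMEOUT_S:
                expired.append(loc)           # models asserting Delete
                self.last_active[loc] = None  # models clearing the stamp
        return expired
```

The two methods correspond to the two BRAM ports: one written on packet arrival, the other swept by the counter.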

19 Netflow probe - Statistic storage (1)
The NetFPGA SRAM memory is used to keep the flow statistics.
The flows transmitted between two endpoints together occupy 4 consecutive addresses in the SRAM memory.

20 Netflow probe - Statistic storage (2)
The address of the first flow word is obtained by multiplying the flow location in the FLUT by four.
If the packet is the first one sent between the two endpoints, all 4 lines of statistical data are updated with new values. Otherwise, only the two fields belonging to the current flow (uplink or downlink) are updated.

21 Netflow probe – divided into two pipeline modules.
The first pipeline module consists of the header parser, the FLUT, and the inactive flow detector. It updates the class field of the packet bus, sends the packet bus to the Out port lookup module, and updates the first 5 words of the flow bus. Updating the packet bus data takes a total of 16 clock cycles.
The second pipeline module consists of the statistic storage module and updates the other 6 words of the flow bus. The flow bus data are generated 32 clock cycles after the packet data are received.
The Netflow module allows simultaneous access to the information of both flows transferred between the two endpoints.

22 Feature extractor module
This module extracts the features from bidirectional flows for the DT classifier.
The features are extracted when the total number of packets transmitted between two endpoints is equal to a user-defined number n.
To generate the average features per packet or per second, a 32-bit pipelined fixed-point divider is used. The divider latency is 32 clock cycles and its throughput is 4 clock cycles per instance. For 6 features, the computation latency is equal to 32 + 6 × 4 = 56 clock cycles.
The feature extractor module has 3 pipeline stages. For the above example, each stage has a latency of 19 clock cycles.
To distinguish between the uplink and downlink features, the flows that match the source IP addresses mentioned below are marked as uplink packets.
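The latency figures above can be reproduced with a few lines of arithmetic. The constants come from the slide; reading the 19-cycle stage latency as the 56-cycle total split over 3 stages and rounded up is an interpretation, not something the slide states, and the function names are hypothetical.

```python
import math

DIVIDER_LATENCY = 32      # clock cycles until the first quotient is ready
CYCLES_PER_INSTANCE = 4   # divider throughput: one division per 4 cycles
PIPELINE_STAGES = 3       # stages in the feature extractor module


def division_latency(num_features):
    """Cycles to compute all averaged features, as counted on the slide."""
    return DIVIDER_LATENCY + num_features * CYCLES_PER_INSTANCE


def per_stage_latency(num_features):
    """Assumed reading: total latency spread evenly over the stages."""
    return math.ceil(division_latency(num_features) / PIPELINE_STAGES)
```

For 6 features this gives 56 cycles in total and 19 cycles per stage, matching the numbers quoted above.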

23 C4.5 Decision tree (DT) (1)
Two consecutive levels of the DT are merged into one level; the result is called a merged DT.

24 C4.5 Decision tree (DT) (2)
The DT classifier is constructed by mapping tree nodes into memory blocks and placing them in pipeline modules.
The slide shows a binary search tree made of three pipeline modules.

25 C4.5 Decision tree (DT) (3)
When the instance is not yet classified (the valid pin is asserted), next_instant_en is asserted and the instance and the last merged node information are passed to the next module via the F_out and node_out pins.
If an instance reaches a leaf (is classified), the done signal is asserted while the detected class is set on the class output.

26 Functional block diagram of a merged node

27 C4.5 Decision tree (DT) (4)
The feature instances passed to a module through the features_in port are captured in a register when next_instance_rd is asserted. The output of this register is connected to the features_out port, which provides the feature instances to the next neighbor module.
The features_reg is connected to 3 parallel multiplexers, which are controlled by f_sel taken from the node data.
The selected features are compared with the cmp_v values, producing 3 cmp signals. A cmp signal is set if the value of the input feature is bigger than the cmp_v value; otherwise it is reset.

28 C4.5 Decision tree (DT) (5)
The node_in port is used to receive the node data from the previous module in the pipeline chain one clock cycle after next_instance_rd is asserted. Otherwise, the node data is provided by the internal Block RAM.
A leaf is reached when one of the three f_sels is zero.
When an instance is classified and valid_reg is set, the done pin is asserted. At the same time, the valid_out pin is reset to inform the next module not to process the feature_in values anymore.
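The comparator logic of one merged node can be sketched in software as below. Much of the detail is assumed rather than taken from the slides: node 0 is treated as the root of the merged pair of levels, node 1 as the child taken when the root comparison is 0 and node 2 as the child taken when it is 1, the 2-bit branch encoding is hypothetical, and feature index 0 is reserved as the leaf marker so that real features start at index 1.

```python
def eval_merged_node(features, f_sel, cmp_v):
    """Evaluate one merged node (two tree levels) for one feature instance.

    features: list of feature values; index 0 is an unused placeholder.
    f_sel:    three feature selectors (0 marks a leaf).
    cmp_v:    three comparison thresholds.
    Returns (is_leaf, branch), with branch in 0..3.
    """
    # Three multiplexers and comparators operate in parallel:
    # cmp is set when the selected feature is bigger than cmp_v.
    cmp = [features[sel] > val for sel, val in zip(f_sel, cmp_v)]

    if f_sel[0] == 0:
        return True, 0                     # the root itself is a leaf
    child = 2 if cmp[0] else 1             # assumed child numbering
    branch = int(cmp[0]) << 1
    if f_sel[child] == 0:
        return True, branch                # leaf reached at the second level
    return False, branch | int(cmp[child])  # continue in the next module


# Example: the root tests feature 1, the children test features 2 and 3.
example = eval_merged_node([0, 10, 20, 30], (1, 2, 3), (5, 25, 25))
```

Merging two levels this way lets one pipeline module resolve a four-way branch per instance instead of a two-way branch.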

29 C4.5 Decision tree (DT) (6)
To achieve on-the-fly updates, a dual-port RAM is used. Port A of the memory is connected to the DT and port B is connected to the host computer.
The memory space is divided into two segments. The Mem_sel pin and its inverted signal are used as the most significant bit of the port A and port B addresses, respectively.
With this structure, the DT can read a node's data from one segment of the memory while the host computer updates the DT in the other segment.
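The two-segment addressing scheme can be modeled in a few lines. This is a sketch: the class and method names are hypothetical, and the `swap` method assumes that Mem_sel is toggled once the host has finished writing the new tree, which the slides do not spell out.

```python
class DoubleBufferedTreeMemory:
    def __init__(self, addr_bits):
        self.addr_bits = addr_bits
        self.mem = [0] * (2 << addr_bits)  # two segments of 2**addr_bits words
        self.mem_sel = 0                   # selects the segment the DT reads

    def read_node(self, addr):
        """Port A (decision tree): Mem_sel is the address MSB."""
        return self.mem[(self.mem_sel << self.addr_bits) | addr]

    def host_write(self, addr, value):
        """Port B (host computer): inverted Mem_sel is the address MSB."""
        self.mem[((self.mem_sel ^ 1) << self.addr_bits) | addr] = value

    def swap(self):
        """Assumed: toggle Mem_sel after the host finishes an update."""
        self.mem_sel ^= 1
```

Because the two ports never address the same segment, a slow host update cannot disturb lookups that are in flight; the new tree only takes effect when the selector flips.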

30 Flow UDP generator
To monitor the input traffic, UDP packets that encapsulate the flow bus are sent to the host computer.
Since the host computer receives these packets through a slow PCI port, the UDP packets are generated only when the total number of packets transmitted between two endpoints is a multiple of a user-configurable parameter.

31 Hardware utilization summary on the Virtex 2vp50ff1152-7

