Real-time Traffic Monitoring and Containment
A. L. Narasimha Reddy
Dept. of Electrical Engineering
Texas A & M University
Outline
  Motivation
  DOS attacks
    – Partial state routers
  DDOS attacks, worms
    – Aggregate packet header data as signals
    – Signal/image based anomaly/attack detectors
Real-time traffic monitoring
  Attacks motivate us to monitor network traffic
    – Potential anomaly/attack detectors
    – Potentially contain/throttle attacks as they happen
  Line speeds are increasing
    – Need simple, effective mechanisms
  Attacks are constantly changing
    – CodeRed yesterday, MyDoom today, what next?
Motivation
  Most current monitoring/policing tools are tailored to known attacks
    – Look for packets with port number 1434 (SQL Slammer)
    – Contain KaZaA traffic to 20% of the link
  They become ineffective when traffic patterns or attacks change
    – New threats are constantly emerging
Motivation
  Can we design generic (and generalized) mechanisms for attack detection and containment?
  Can we make them simple enough to implement at line speeds?
Introduction
  Why look for KaZaA packets?
    – They consume resources
    – They consume more resources than we want
  Not much different from a DOS flood
    – Consumes resources to stage attacks
  Why not monitor resource usage?
    – Do not want to rely on attack-specific information
Attacks
  DOS attacks
    – Few sources = resource hogs
  DDOS attacks, worms
    – Many sources
    – Individual flows look normal
    – Look at the aggregate picture
DOS attacks & Network Flows
  Too many flows to monitor each flow
  Maintain a fixed amount of state/memory
    – State not enough to monitor all flows (partial state)
    – Manage the state to monitor high-bandwidth flows
    – How?
  Sample packets
    – High-BW flows more likely to be selected
  Use a cache and employ an LRU-type policy
    – Traffic driven
    – Cache retains frequently arriving flows
Partial State Approach
  Similar to how caches are employed in computer memory systems
    – Exploit locality
  Employ an engineering solution in an architecture-transparent fashion
Identifying resource hogs
  Lots of web flows
    – Tend to pollute the cache quickly
  Apply probabilistic admission into the cache
    – A flow has to arrive often to be included in the cache
    – Most web flows are not admitted
  Works well in identifying high-BW flows
  Can apply resource management techniques to contain cached/identified flows
LRU with probabilistic admission
  Employ a modified LRU
  On a miss, the flow is admitted with probability p
    – When p is small, keeps smaller flows out
    – High-BW flows are more likely to be admitted
    – Allows high-BW flows to be retained in the cache
  Nonresponsive flows are more likely to stay in the cache
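The admission policy on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the class name `ProbabilisticLRU`, the string flow identifiers, and the synthetic traffic mix are all assumptions made for the example.

```python
import random
from collections import OrderedDict


class ProbabilisticLRU:
    """Fixed-size LRU cache of flow IDs with probabilistic admission (illustrative)."""

    def __init__(self, capacity, p, rng=None):
        self.capacity = capacity
        self.p = p
        self.rng = rng or random.Random()
        self.flows = OrderedDict()  # flow_id -> packets seen while cached

    def on_packet(self, flow_id):
        """Process one packet; return True if the flow is cached afterwards."""
        if flow_id in self.flows:
            # Hit: count the packet and mark the flow most recently used.
            self.flows[flow_id] += 1
            self.flows.move_to_end(flow_id)
            return True
        # Miss: admit only with probability p.
        if self.rng.random() >= self.p:
            return False
        if len(self.flows) >= self.capacity:
            self.flows.popitem(last=False)  # evict the least recently used flow
        self.flows[flow_id] = 1
        return True


# Synthetic mix (assumed): one heavy flow sends about half of the packets,
# the rest are spread over 1000 short web-like flows.
rng = random.Random(1)
cache = ProbabilisticLRU(capacity=4, p=0.05, rng=rng)
for _ in range(20000):
    if rng.random() < 0.5:
        cache.on_packet("heavy")
    else:
        cache.on_packet(f"web-{rng.randrange(1000)}")
print(sorted(cache.flows.items(), key=lambda kv: -kv[1]))
```

A flow that sends many packets gets many admission attempts and is refreshed on every hit, so it tends to stay near the top of the cache, while a short flow rarely wins admission at all.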
Traffic Driven State Management
  Monitor the top 100 flows at any time
    – Don’t know the identity of these flows
    – Don’t know how much BW these may consume
Policy Driven State Management
  An ISP could decide to monitor flows above 1 Mbps
    – Will need state >= link capacity / 1 Mbps
  Could monitor flows consuming more than 1% of link capacity
    – For security reasons
    – At most 100 flows with 1% BW consumption
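The sizing arithmetic behind the two policies can be checked with a short sketch. The function names and the 100 Mbps link speed are assumptions chosen for illustration; only the 1 Mbps and 1% thresholds come from the slide.

```python
from fractions import Fraction


def entries_for_rate_policy(link_bps, threshold_bps):
    """Upper bound on flows that can each send at least threshold_bps at once."""
    return link_bps // threshold_bps


def entries_for_share_policy(share):
    """Upper bound on flows that can each hold at least `share` of the link."""
    return int(1 / Fraction(share))


# Hypothetical 100 Mbps link with the 1 Mbps policy from the slide.
print(entries_for_rate_policy(100_000_000, 1_000_000))
# The 1% policy from the slide; the bound does not depend on link speed.
print(entries_for_share_policy(Fraction(1, 100)))
```

The share-based policy gives a state bound that is fixed in advance, which is why a 1% threshold never needs more than 100 entries.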
Partial State – Trace-driven evaluation
Partial State – Trace-driven evaluation
UDP Cache Occupancy
TCP Cache Occupancy
Resource Management
Preferential Dropping
  [Figure: drop probability versus queue length, with thresholds min_th and max_th and probability levels max_p and 1; one curve gives the drop probability for high-bandwidth flows, the other the drop probability for other flows]
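The preferential dropping curve can be approximated with the standard RED drop function evaluated with a different max_p per class. This is a sketch under assumptions: the slide does not give numeric values, so the thresholds and the two max_p values below are hypothetical, and it is assumed that both classes share min_th and max_th.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Standard RED curve: 0 below min_th, linear up to max_p, 1 at or above max_th."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def preferential_drop_probability(avg_queue, is_high_bw,
                                  min_th=20, max_th=80,
                                  max_p_other=0.02, max_p_high=0.2):
    """Use a steeper curve for flows identified as high bandwidth.

    All default values are hypothetical; they are not taken from the slides.
    """
    max_p = max_p_high if is_high_bw else max_p_other
    return red_drop_probability(avg_queue, min_th, max_th, max_p)


for queue_len in (10, 50, 90):
    print(queue_len,
          preferential_drop_probability(queue_len, is_high_bw=True),
          preferential_drop_probability(queue_len, is_high_bw=False))
```

Between the two thresholds a high-bandwidth flow sees a proportionally higher drop probability, which is what pushes it back toward its fair share.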
Multiple possibilities
  SACRED: Monitor flows above a certain rate (policy driven), differential RED (IWQoS 99)
  LRU-RED: Traffic-driven state management, differential RED (Globecom 01)
    – Approximately fair BW distribution
  LRU-FQ: Traffic-driven state management, fair queuing (ICC 04)
    – Contain DOS attacks
    – Provide shorter delays for short-term flows
LRU-FQ Resource Management
LRU-FQ flow chart – enqueue event
  [Flow chart. Boxes, in extraction order: Packet arrival; Is flow in cache? (Yes/No); Does cache have space? (Yes/No); Admit flow with probability ‘p’; Is flow admitted? (Yes/No); Record flow details, initialize ‘count’ to 0; Increment ‘count’, move flow to top of cache; Is ‘count’ >= ‘threshold’? (Yes/No); Enqueue in partial state queue; Enqueue in normal queue]
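One possible reading of the enqueue flow chart is sketched below. The exact branch wiring is not recoverable from the slide text, so this sketch assumes that every miss is admitted with probability p (evicting the least recently used entry when the cache is full), that `count` starts at 0 on admission and grows on each hit, and that packets of flows with `count` at or above `threshold` go to the partial state queue. The class and constant names are hypothetical.

```python
import random
from collections import OrderedDict

NORMAL = "normal"
PARTIAL_STATE = "partial-state"


class LruFqClassifier:
    """Chooses a queue for each arriving packet (illustrative reading of the chart)."""

    def __init__(self, cache_size, p, threshold, rng=None):
        self.cache_size = cache_size
        self.p = p
        self.threshold = threshold
        self.rng = rng or random.Random()
        self.cache = OrderedDict()  # flow_id -> count, most recently used last

    def classify(self, flow_id):
        if flow_id in self.cache:
            # Flow in cache: increment count and move it to the top.
            self.cache[flow_id] += 1
            self.cache.move_to_end(flow_id)
        elif self.rng.random() < self.p:
            # Miss and admitted: record the flow with count initialized to 0.
            if len(self.cache) >= self.cache_size:
                self.cache.popitem(last=False)
            self.cache[flow_id] = 0
        count = self.cache.get(flow_id)
        if count is not None and count >= self.threshold:
            return PARTIAL_STATE
        return NORMAL


# Parameters as on the results slides: probability 1/25, cache size 11, threshold 125.
classifier = LruFqClassifier(cache_size=11, p=1 / 25, threshold=125, rng=random.Random(7))
decisions = [classifier.classify("udp-blast") for _ in range(1000)]
print(decisions.count(NORMAL), decisions.count(PARTIAL_STATE))
```

Short flows never accumulate enough hits to cross the threshold, so their packets stay in the normal queue even if they are briefly cached.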
Linux IP Packet Forwarding
  [Figure: packet forwarding path across the link layer, IP layer, and upper layers, with the design space marked. Box labels: packet arrival; check & store packet; enqueue packet; request scheduler to invoke bottom half; scheduler invokes bottom half; error checking; verify destination; local packet, deliver to upper layers; route to destination; update packet; packet enqueued; scheduler runs device driver; device prepares packet; packet departure]
Linux Kernel traffic control
  Filters are used to distinguish between different classes of flows.
  Each class of flows can be further categorized into sub-classes using filters.
  Queuing disciplines control how packets are enqueued and dequeued.
LRU-FQ Implementation
  The LRU component of the scheme is implemented as a filter.
    – All parameters (threshold, probability, and cache size) are passed to the filter
  Fair queuing is employed as a queuing discipline.
    – Scheduling is based on each queue’s weight
    – Start-time Fair Queuing
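Start-time Fair Queuing tags each packet with a start tag and a finish tag and serves packets in increasing start-tag order. The sketch below is a simplified user-space model, not the kernel queuing discipline from the talk: it assumes that a dequeue represents a completed transmission, and the class name, queue names, and packet lengths are hypothetical.

```python
import heapq
import itertools


class StartTimeFairQueue:
    """Simplified Start-time Fair Queuing over a fixed set of weighted queues."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.last_finish = {q: 0.0 for q in self.weights}
        self.virtual_time = 0.0
        self.max_finish_served = 0.0
        self.heap = []
        self.seq = itertools.count()  # tie-breaker: arrival order

    def enqueue(self, queue_id, length, payload=None):
        # Start tag: later of the current virtual time and the queue's last finish tag.
        start = max(self.virtual_time, self.last_finish[queue_id])
        finish = start + length / self.weights[queue_id]
        self.last_finish[queue_id] = finish
        heapq.heappush(self.heap, (start, next(self.seq), queue_id, finish, payload))

    def dequeue(self):
        if not self.heap:
            return None
        start, _, queue_id, finish, payload = heapq.heappop(self.heap)
        # Virtual time follows the start tag of the packet being served.
        self.virtual_time = start
        self.max_finish_served = max(self.max_finish_served, finish)
        if not self.heap:
            # End of busy period: jump to the largest finish tag served so far.
            self.virtual_time = self.max_finish_served
        return queue_id, payload


# Equal weights for the two queues, as in the 1:1 setup on the results slides.
sfq = StartTimeFairQueue({"normal": 1.0, "partial": 1.0})
for _ in range(4):
    sfq.enqueue("partial", 100)
for _ in range(2):
    sfq.enqueue("normal", 100)
print([sfq.dequeue()[0] for _ in range(6)])
```

Even though the partial state queue was backlogged first, the normal queue's packets are interleaved as soon as they arrive, which is the isolation property the scheme relies on.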
LRU-FQ - Results
Experimental Setup
Long-term flow differentiation
  Probability = 1/25, cache size = 11, threshold = 125
  Normal TCP fraction = 0.07
Long-term flow differentiation
  Probability = 1/25, cache size = 11, threshold = 125
Protecting Web Mice
Protecting Web Mice: Experimental Setup
  Long-term TCP flows     20
  Long-term UDP flows     2 – 4
  Web clients             20
  Probability             1/50
  Threshold               125
  LRU cache size          11
  LRU : Normal queue      1:1
Protecting Web Mice: Bandwidth Results
  [Tables: Normal Router and LRU-FQ Router, each with columns TCP Fraction, TCP Tput, # Web Requests, UDP Tput, UDP Flows]
Protecting Web Mice: Timing Results
  [Figures: Normal Router and LRU-FQ Router]
Summary of LRU-FQ
  Provides good control of DOS attacks with a limited number of flows
  Provides better delays for short-term flows
  Automatically identifies resource hogs
  Partial-state packet handling cost is not an issue at 100 Mbps
Applications of Partial State
  More intelligent control of network traffic
  Accounting and measurement of high-bandwidth flows
  Denial of Service (DOS) attack prevention
  Tracing of high-bandwidth flows
  QOS routing
Aggregated packet analysis
Approach
  Network traffic -> Signal generation & data filtering (address correlation) -> Statistical or signal analysis (wavelets or DCT) -> Anomaly detection (thresholding) -> Detection signal
Signal Generation
  Traffic volume (bytes or packets)
    – Analyzed before
    – May not be a great signal when links are always congested (typical campus access links)
  A lot more information in packet headers
    – Source address
    – Destination address
    – Protocol number
    – Port numbers
Signal Generation
  Per-packet cost is an important driver
  Update a counter for each packet header field
    – Too much memory to put in SRAM
  Break the field into multiple 8-bit fields
    – 32-bit address into four 8-bit fields
    – 1024 locations instead of 2^32 locations
    – In general, 256 * (k/8) instead of 2^k
    – k/8 counter updates instead of 1
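The byte-wise counter layout can be sketched as follows. This is an illustration of the idea on the slide rather than the prototype's data structure; the class name `ByteFieldCounters` and the example address are assumptions.

```python
import ipaddress


class ByteFieldCounters:
    """One 256-entry counter array per 8-bit slice of a k-bit header field."""

    def __init__(self, bits=32):
        self.num_fields = bits // 8
        self.counts = [[0] * 256 for _ in range(self.num_fields)]

    def update(self, value):
        """Count one packet: k/8 counter increments instead of one."""
        for i in range(self.num_fields):
            shift = 8 * (self.num_fields - 1 - i)  # most significant byte first
            self.counts[i][(value >> shift) & 0xFF] += 1

    def locations(self):
        """Total counter locations: 256 * (k/8) instead of 2^k."""
        return 256 * self.num_fields


counters = ByteFieldCounters(bits=32)
counters.update(int(ipaddress.IPv4Address("192.0.2.7")))  # example address
print(counters.locations())
```

The trade-off is that the four arrays record per-byte histograms only, so the exact set of full addresses seen is no longer recoverable from the counters.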
Signal Generation
  What kind of signals can we generate with addresses, port numbers, and protocol numbers?
Addresses are correlated
  Most of us have habits
    – Access the same web sites
  Large web sites get a significant part of the traffic
    – Google.com, hp.com, yahoo.com
  Large downloads correlate over time
    – ftp, video
  In aggregate, addresses are correlated
Address Correlation – attacks?
  Address correlation changes when traffic patterns change abruptly
    – Denial of service attacks
    – Flash crowds
    – Worms
  Results in differences in correlation
    – High: single attack victim
    – Low: lots of addresses (worm)
Address correlation signals
  Address correlation:
  Simplified address correlation:
Address Correlation Signals
Address Correlation Signals
Signal Analysis
  Capture information over a sampling period
    – Of the order of a few seconds to minutes
  Analyze each sample to detect anomalies
    – Compare with historical norms
  Post-mortem/real-time analysis
    – May use different amounts of data & analysis
      Detailed information of past few samples
      Less detailed information of older samples
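Comparing each sample with historical norms can be as simple as a sliding-window threshold. The sketch below is one assumed form of that comparison, not the detector used in the talk: the window length, the 3-sigma rule, and the class name are all hypothetical choices.

```python
from collections import deque
import statistics


class HistoricalNormDetector:
    """Flags a sample that deviates from the recent history by more than k sigma."""

    def __init__(self, window=20, k=3.0, min_history=5):
        self.history = deque(maxlen=window)  # recent non-anomalous samples
        self.k = k
        self.min_history = min_history

    def observe(self, value):
        anomalous = False
        if len(self.history) >= self.min_history:
            mean = statistics.mean(self.history)
            # Floor the spread so a perfectly flat history does not divide the world in two.
            spread = max(statistics.pstdev(self.history), 1e-9)
            anomalous = abs(value - mean) > self.k * spread
        if not anomalous:
            # Keep anomalies out of the baseline so an attack does not become the norm.
            self.history.append(value)
        return anomalous


detector = HistoricalNormDetector()
signal = [10.0, 11.0, 9.0, 10.0, 10.0, 11.0, 9.0, 10.0, 50.0, 10.0]
print([detector.observe(x) for x in signal])
```

Keeping only a bounded window of recent samples matches the slide's point that older history can be stored in less detail.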
Signal Analysis
  Address correlation as a time series signal
  Employ known techniques to analyze time series signals
  Wavelets: one powerful technique
    – Allows analysis in both time and frequency domain
  Per-sample analysis has more flexibility
    – Not in forwarding path
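To illustrate why wavelets localize a change in both time and scale, here is a minimal Haar transform in plain Python. The Haar wavelet is chosen only because it is the simplest; the slides do not say which wavelet family was used, and the sample signal is synthetic.

```python
import math


def haar_step(signal):
    """One Haar level: return (approximation, detail) for an even-length signal."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level Haar transform: final approximation plus details, finest first."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Synthetic correlation signal with an abrupt change inside the third pair of samples.
samples = [1.0, 1.0, 1.0, 1.0, 1.0, 9.0, 9.0, 9.0]
approx, details = haar_dwt(samples, levels=2)
finest = details[0]
peak = max(range(len(finest)), key=lambda i: abs(finest[i]))
print("largest fine-scale detail at pair", peak,
      "covering samples", 2 * peak, "and", 2 * peak + 1)
```

The fine-scale detail coefficients are zero wherever the signal is flat and large only where it jumps, so thresholding them gives both the presence and the approximate time of the change.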
Does this work?
Analysis of address signal
Image based analysis
  Treat the traffic data as images
  Apply image processing based analysis
  Treat each sample as a frame in a video
    – Video compression techniques lead to data reduction
    – Scene change analysis leads to anomaly detection
    – Motion prediction leads to attack prediction
Signal Generation
Two dimensional images
  Horizontal/vertical lines indicate anomalies
    – Infected machine contacting multiple destinations (worm propagation)
    – Multiple source machines targeting a destination (DDOS)
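A small sketch shows how lines arise in a two-dimensional source/destination image. It assumes, for illustration only, that each axis is indexed by one byte of the address (the last byte here), with sources on rows and destinations on columns; the slides do not specify the mapping, and the function names and traffic are hypothetical.

```python
import ipaddress


def build_image(pairs, byte_index=3):
    """256x256 packet-count image: row = source byte, column = destination byte."""
    image = [[0] * 256 for _ in range(256)]
    for src, dst in pairs:
        row = ipaddress.IPv4Address(src).packed[byte_index]
        col = ipaddress.IPv4Address(dst).packed[byte_index]
        image[row][col] += 1
    return image


def find_lines(image, min_cells):
    """Rows and columns with at least min_cells non-empty pixels."""
    rows = [r for r in range(256)
            if sum(1 for v in image[r] if v) >= min_cells]
    cols = [c for c in range(256)
            if sum(1 for r in range(256) if image[r][c]) >= min_cells]
    return rows, cols


# Synthetic traffic: one scanning source (worm-like) and one victim hit by many sources.
worm = [("10.0.0.5", f"10.0.1.{d}") for d in range(1, 101)]
ddos = [(f"10.0.2.{s}", "10.0.0.9") for s in range(100, 200)]
rows, cols = find_lines(build_image(worm + ddos), min_cells=50)
print("horizontal lines (sources):", rows)
print("vertical lines (destinations):", cols)
```

With this orientation a scanning source fills a row and a flooded victim fills a column; swapping the axes swaps which anomaly appears as which kind of line.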
DCT analysis of addresses
Semi-random attacks
Random attacks
Better than volume analysis
Motion prediction
Advantages
  Not looking for specific known attacks
  Generic mechanism
  Works in real time
    – Latencies of a few samples
    – Simple enough to be implemented inline
Prototypes
  Linux PC boxes
  Intel network processors
    – Can push to Gbps packet forwarding rates
    – Forwarding throughput not impacted
    – Sampling rates of a few ms possible
Conclusion
  Real-time resource accounting is feasible
  Real-time traffic monitoring is feasible
    – Simple enough to be implemented inline
  Can rely on many tools from the signal/image processing area
    – More robust offline analysis possible
    – Concise for logging and playback
Thank you!
  For more information,
Other work
  Enhancements to TCP
    – TCP-DCR for wireless losses, packet reordering
    – Layered TCP for high-speed (Gbps) links
  Alternate routing for improving service availability during link transients
    – Continues routing packets until routing tables are recomputed
    – Important for VOIP applications
TCP Enhancements
  TCP-DCR:
    – Modifies TCP’s congestion response to tolerate non-congestion events (channel errors, packet reordering)
  LTCP (Layered TCP):
    – Improves TCP’s performance in high-speed networks
TCP-DCR – channel errors
TCP-DCR – packet reordering
LTCP
LRU-RED Results
RTT Bias – TCP flows
Impact of Cache size
  Effect of varying cache size
    – To study the impact of cache size on the performance of the scheme
    – Probability = 1/55, threshold = 125
    – Number of TCP flows = 20
    – Equal weights for both queues
Results – Cache size
Normal Workloads
  Performance under normal workloads
    – Behavior of the scheme when non-responsive loads are absent or use their fair share of bandwidth
    – Cache size = 9, threshold = 125
    – Probability = 1/55
Results – Normal workload
Normal Mixed workload