Real-time Traffic Monitoring and Containment
A. L. Narasimha Reddy
Dept. of Electrical Engineering, Texas A & M University

Acknowledgements
Deying Tong, Smitha, Phani Achanta, Seong Soo Kim

Outline
Motivation
DOS attacks
  – Partial state routers
DDOS attacks, worms
  – Aggregate packet header data as signals
  – Signal/image based anomaly/attack detectors

Real-time traffic monitoring
Attacks motivate us to monitor network traffic
  – Potential anomaly/attack detectors
  – Potentially contain/throttle attacks as they happen
Line speeds are increasing
  – Need simple, effective mechanisms
Attacks are constantly changing
  – CodeRed yesterday, MyDoom today, what next?

Motivation
Most current monitoring/policing tools are tailored to known attacks
  – Look for packets with port number 1434 (SQL Slammer)
  – Contain Kazaa traffic to 20% of the link
These tools become ineffective when traffic patterns or attacks change
  – New threats are constantly emerging

Motivation
Can we design generic (and generalized) mechanisms for attack detection and containment?
Can we make them simple enough to implement at line speeds?

Introduction
Why look for Kazaa packets?
  – They consume resources
  – They consume more resources than we want
Not much different from a DOS flood
  – Consumes resources to stage attacks
Why not monitor resource usage?
  – Do not want to rely on attack-specific information

Attacks
DOS attacks
  – Few sources = resource hogs
DDOS attacks, worms
  – Many sources
  – Individual flows look normal
  – Look at the aggregate picture

DOS attacks & Network Flows
Too many flows to monitor each flow
Maintain a fixed amount of state/memory
  – State not enough to monitor all flows (partial state)
  – Manage the state to monitor high-bandwidth flows
  – How?
Sample packets
  – High-BW flows more likely to be selected
Use a cache and employ an LRU-type policy
  – Traffic driven
  – Cache retains frequently arriving flows

Partial State Approach
Similar to how caches are employed in computer memory systems
  – Exploit locality
Employ an engineering solution in an architecture-transparent fashion

Identifying resource hogs
Lots of web flows
  – Tend to pollute the cache quickly
Apply probabilistic admission into the cache
  – A flow has to arrive often to be included in the cache
  – Most web flows are not admitted
Works well in identifying high-BW flows
Can apply resource management techniques to contain cached/identified flows

LRU with probabilistic admission
Employ a modified LRU
On a miss, the flow is admitted with probability p
  – When p is small, keeps smaller flows out
  – High-BW flows are more likely to be admitted
  – Allows high-BW flows to be retained in the cache
Nonresponsive flows are more likely to stay in the cache

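A minimal sketch of an LRU cache with probabilistic admission, written here for illustration. The class name `ProbabilisticLRU`, its `on_packet` method and the injectable `rand` parameter are hypothetical names, not taken from the talk. Evicting the least recently used entry when an admitted flow finds the cache full is an assumption about how the modified LRU behaves.

```python
import random
from collections import OrderedDict


class ProbabilisticLRU:
    """Fixed-size LRU cache of flow ids with probabilistic admission on a miss."""

    def __init__(self, capacity, p, rand=random.random):
        self.capacity = capacity    # number of flows we keep state for
        self.p = p                  # admission probability on a miss
        self.rand = rand            # source of random numbers in [0, 1)
        self.flows = OrderedDict()  # flow id -> packets seen; last item = most recent

    def on_packet(self, flow_id):
        """Process one packet; return True if the flow is cached afterwards."""
        if flow_id in self.flows:
            # Hit: count the packet and refresh the flow's recency.
            self.flows[flow_id] += 1
            self.flows.move_to_end(flow_id)
            return True
        # Miss: admit only with probability p, so a flow must arrive often to get in.
        if self.rand() >= self.p:
            return False
        if len(self.flows) >= self.capacity:
            # Assumed policy: evict the least recently used flow to make room.
            self.flows.popitem(last=False)
        self.flows[flow_id] = 1
        return True
```

Under this sketch a flow that sends n packets while uncached is admitted with probability 1 - (1 - p)^n. With p = 1/25, a 10-packet web flow gets in roughly one time in three, while a flow sending 1000 packets is admitted almost surely.
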
Traffic Driven State Management
Monitor top 100 flows at any time
  – Don’t know the identity of these flows
  – Don’t know how much BW these may consume

Policy Driven State Management
An ISP could decide to monitor flows above 1 Mbps
  – Will need state >= link capacity / 1 Mbps
Could monitor flows consuming more than 1% of link capacity
  – For security reasons
  – At most 100 flows with 1% BW consumption

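The sizing arithmetic on this slide can be written out as a short sketch. The function names are hypothetical; the only fact used is that flows which each exceed a threshold cannot together exceed the link capacity.

```python
def entries_for_rate_threshold(link_bps, threshold_bps):
    """Upper bound on how many flows can each send at threshold_bps or more at once."""
    return link_bps // threshold_bps


def entries_for_share_threshold(percent):
    """Upper bound on how many flows can each consume `percent` % of the link."""
    return 100 // percent
```

For example, a 1 Gbps link with a 1 Mbps policy needs at most 1000 entries, and a 1% policy needs at most 100 entries regardless of link speed.
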
Partial State – Trace-driven Evaluation

Partial State – Trace-driven Evaluation

UDP Cache Occupancy

TCP Cache Occupancy

Resource Management

Preferential Dropping
Figure: drop probability versus queue length, with thresholds minth and maxth and probability levels maxp and 1; one curve gives the drop probability for high-bandwidth flows, the other the drop probability for other flows.

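A minimal sketch of preferential dropping, assuming the standard RED shape: zero below minth, a linear ramp up to maxp at maxth, and 1 beyond. The use of a larger maxp for high-bandwidth flows, and every default value below, are illustrative assumptions rather than the parameters used in the talk.

```python
def red_drop_prob(avg_qlen, min_th, max_th, max_p):
    """RED-style drop probability as a function of the average queue length."""
    if avg_qlen < min_th:
        return 0.0
    if avg_qlen >= max_th:
        return 1.0
    # Linear ramp from 0 at min_th up to max_p at max_th.
    return max_p * (avg_qlen - min_th) / (max_th - min_th)


def preferential_drop_prob(avg_qlen, is_high_bw, min_th=5, max_th=15,
                           max_p_normal=0.02, max_p_high=0.2):
    """Apply a steeper drop curve to flows identified as high bandwidth."""
    max_p = max_p_high if is_high_bw else max_p_normal
    return red_drop_prob(avg_qlen, min_th, max_th, max_p)
```

Between the two thresholds a cached high-bandwidth flow sees a higher drop probability than other flows at the same queue length, which is the differentiation the figure describes.
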
Multiple possibilities
SACRED: monitor flows above a certain rate (policy driven), differential RED (IWQoS 99)
LRU-RED: traffic-driven state management, differential RED (Globecom 01)
  – Approximately fair BW distribution
LRU-FQ: traffic-driven state management, fair queuing (ICC 04)
  – Contain DOS attacks
  – Provide shorter delays for short-term flows

SACRED
Sampling And Caching RED
Maintain flow rate as state for cached flows
If flow rate > threshold, drop at a higher rate
  – Drop rate keeps increasing if the flow stays above the threshold
  – Tends to punish nonresponsive flows, high-BW flows
If flow rate < threshold, remove from cache
  – Make room for another flow

SACRED results – 10% state

SACRED – cache associativity

SACRED – Additive

SACRED – TCP only

LRU-FQ Resource Management

LRU-FQ flow chart – enqueue event
Flow chart nodes:
  – Packet arrival
  – Is flow in cache? (Yes / No)
  – Does cache have space? (Yes / No)
  – Admit flow with probability ‘p’
  – Is flow admitted? (Yes / No)
  – Record flow details; initialize ‘count’ to 0
  – Increment ‘count’; move flow to top of cache
  – Is ‘count’ >= ‘threshold’? (Yes / No)
  – Enqueue in partial state queue
  – Enqueue in normal queue

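One possible reading of the enqueue flow chart, as a sketch. The exact wiring of the chart's branches is not recoverable from the slide text, so the following are assumptions: a missing flow is admitted with probability p whether or not the cache is full, the bottom (least recently used) entry is evicted when there is no space, and a packet goes to the partial state queue only once its flow's count has reached the threshold. The function and constant names are hypothetical.

```python
import random
from collections import OrderedDict

PARTIAL_STATE_QUEUE = "partial-state"
NORMAL_QUEUE = "normal"


def enqueue_decision(cache, flow_id, capacity, p, threshold, rand=random.random):
    """Update the LRU cache for one arriving packet and choose its queue.

    `cache` is an OrderedDict mapping flow id -> count; the first item is the
    top of the cache (most recently seen flow).
    """
    if flow_id in cache:
        # Hit: increment 'count' and move the flow to the top of the cache.
        cache[flow_id] += 1
        cache.move_to_end(flow_id, last=False)
    else:
        # Miss: admit the flow with probability p.
        if rand() >= p:
            return NORMAL_QUEUE
        if len(cache) >= capacity:
            cache.popitem(last=True)  # assumed: evict the bottom (LRU) entry
        cache[flow_id] = 0            # record flow details, 'count' starts at 0
        cache.move_to_end(flow_id, last=False)
    # Flows seen often enough while cached are treated as high bandwidth.
    if cache[flow_id] >= threshold:
        return PARTIAL_STATE_QUEUE
    return NORMAL_QUEUE
```

With the parameters from the experiments (cache size 11, threshold 125), a flow would have to be admitted and then hit 125 more times without being evicted before its packets are diverted.
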
Linux IP Packet Forwarding
Figure: packet path through the link layer, IP layer and upper layers, with the design space marked.
Stages labelled in the figure: packet arrival; check & store packet; enqueue packet; request scheduler to invoke bottom half; device prepares packet; packet departure; error checking; verify destination; route to destination; update packet; packet enqueued; scheduler invokes bottom half; scheduler runs device driver; local packet: deliver to upper layers.

Linux Kernel traffic control
Filters are used to distinguish between different classes of flows.
Each class of flows can be further categorized into sub-classes using filters.
Queuing disciplines control how the packets are enqueued and dequeued.

LRU-FQ Implementation
The LRU component of the scheme is implemented as a filter.
  – Threshold, probability and cache size are all passed as parameters to the filter
Fair queuing is employed as a queuing discipline.
  – Scheduling based on the queue’s weight
  – Start-time Fair Queuing

Experimental Setup

Long-term flow differentiation
Probability = 1/25, cache size = 11, threshold = 125
Normal TCP fraction = 0.07

Long-term flow differentiation
Probability = 1/25, cache size = 11, threshold = 125

Protecting Web Mice

Protecting Web Mice: Experimental Setup
Long-term TCP flows      20
Long-term UDP flows      2 – 4
Web clients              20
Probability              1/50
Threshold                125
LRU cache size           11
LRU : normal queue       1:1

Protecting Web Mice: Bandwidth Results
Table columns, for both the normal router and the LRU-FQ router: UDP flows, UDP throughput, number of web requests, TCP throughput, TCP fraction.

Protecting Web Mice: Timing Results
Figure panels: normal router, LRU-FQ router.

Summary of Partial State
Sampling and caching allow simple identification of resource hogs
Provides good control of DOS attacks with a limited number of flows
Provides fairer distribution of link BW
Partial-state packet handling cost is not an issue at 100 Mbps/1 Gbps
  – 1 Gbps implemented on an Intel network processor

Applications of Partial State
More intelligent control of network traffic
Accounting and measurement of high-bandwidth flows
Denial of Service (DOS) attack prevention
Tracing of high-bandwidth flows
QOS routing

Aggregated packet analysis

Approach
Processing stages shown in the diagram:
  – Network traffic
  – Signal generation & data filtering (address correlation)
  – Statistical or signal analysis (wavelets or DCT)
  – Anomaly detection (thresholding)
  – Detection signal

Signal Generation
Traffic volume (bytes or packets)
  – Analyzed before
  – May not be a great signal when links are always congested (typical campus access links)
A lot more information is available in packet headers
  – Source address
  – Destination address
  – Protocol number
  – Port numbers

Signal Generation
Per-packet cost is an important driver
Update a counter for each packet header field
  – A counter per field value takes too much memory to put in SRAM
Break the field into multiple 8-bit fields
  – 32-bit address into four 8-bit fields
  – 1024 locations instead of 2^32 locations
  – In general, 256 * (k/8) locations instead of 2^k
  – k/8 counter updates per packet instead of 1

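The byte-wise counter layout can be sketched as follows. The function names are hypothetical, and the standard-library `ipaddress` module is used only to turn a dotted address into an integer for the usage example.

```python
import ipaddress


def new_counters(k_bits=32):
    """One table of 256 counters per 8-bit slice of a k-bit header field."""
    return [[0] * 256 for _ in range(k_bits // 8)]


def update(counters, value):
    """Count one packet: k/8 counter updates, one per byte of the field."""
    n = len(counters)
    for i in range(n):
        shift = 8 * (n - 1 - i)          # i = 0 is the most significant byte
        counters[i][(value >> shift) & 0xFF] += 1


# Usage: count the destination address of one packet.
dst_counters = new_counters(32)          # 4 * 256 = 1024 locations
update(dst_counters, int(ipaddress.IPv4Address("192.168.1.10")))
```

The memory saving comes at the cost of losing the joint distribution: the four tables record how often each byte value occurs in each position, not which full addresses were seen.
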
Signal Generation
What kind of signals can we generate with addresses, port numbers and protocol numbers?

Addresses are correlated
Most of us have habits
  – Access the same web sites
Large web sites get a significant part of the traffic
  – google.com, hp.com, yahoo.com
Large downloads correlate over time
  – ftp, video
In aggregate, addresses are correlated

Address Correlation – attacks?
Address correlation changes when traffic patterns change abruptly
  – Denial of service attacks
  – Flash crowds
  – Worms
Results in differences in correlation
  – High: single attack victim
  – Low: lots of addresses (worm)

Address correlation signals
Address correlation:
Simplified address correlation:

Address Correlation Signals

Address Correlation Signals

Signal Analysis
Capture information over a sampling period
  – On the order of a few seconds to minutes
Analyze each sample to detect anomalies
  – Compare with historical norms
Post-mortem/real-time analysis
  – May use different amounts of data & analysis
      Detailed information for the past few samples
      Less detailed information for older samples

Signal Analysis
Address correlation as a time series signal
Employ known techniques to analyze time series signals
Wavelets: one powerful technique
  – Allows analysis in both the time and frequency domains
Per-sample analysis has more flexibility
  – Not in the forwarding path

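To illustrate wavelet analysis of a time series followed by thresholding, here is a minimal sketch using one level of the Haar transform. The slides do not say which wavelet or threshold the detector uses; Haar is chosen here only because it is the simplest wavelet, and the function names and threshold value are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the Haar wavelet transform on an even-length sequence.

    Returns (approx, detail): scaled pairwise sums (slow trend) and scaled
    pairwise differences (abrupt local change).
    """
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def flag_anomalies(detail, threshold):
    """Indices of detail coefficients whose magnitude exceeds the threshold."""
    return [i for i, d in enumerate(detail) if abs(d) > threshold]


# Usage: a flat signal with one abrupt jump between samples 4 and 5.
approx, detail = haar_step([1, 1, 1, 1, 1, 9, 1, 1])
suspicious = flag_anomalies(detail, threshold=3.0)   # pair index 2
```

Because this runs once per sampling period rather than once per packet, the analysis can be more elaborate than anything placed in the forwarding path.
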
Does this work?

Analysis of address signal

Image based analysis
Treat the traffic data as images
Apply image processing based analysis
Treat each sample as a frame in a video
  – Video compression techniques lead to data reduction
  – Scene change analysis leads to anomaly detection
  – Motion prediction leads to attack prediction

Signal Generation

Two dimensional images
Horizontal/vertical lines indicate anomalies
  – Infected machine contacting multiple destinations (worm propagation)
  – Multiple source machines targeting a destination (DDOS)

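A minimal sketch of how a two-dimensional traffic image could be built and scanned for lines. The layout is an assumption: one byte of the source address indexes the row and one byte of the destination address indexes the column, so one source contacting many destinations fills a row and many sources contacting one destination fill a column. The function names and the `min_cells` cutoff are hypothetical.

```python
def build_image(pairs):
    """Build a 256x256 packet-count image from (src_byte, dst_byte) pairs."""
    img = [[0] * 256 for _ in range(256)]
    for src, dst in pairs:
        img[src][dst] += 1          # assumed layout: row = source, column = destination
    return img


def find_lines(img, min_cells):
    """Return (rows, cols) that have at least min_cells non-zero pixels."""
    rows = [r for r in range(256)
            if sum(1 for v in img[r] if v) >= min_cells]
    cols = [c for c in range(256)
            if sum(1 for r in range(256) if img[r][c]) >= min_cells]
    return rows, cols


# Usage: source byte 10 scans 100 destinations; 50 sources hit destination byte 80.
traffic = [(10, d) for d in range(100)] + [(s, 80) for s in range(200, 250)]
scanning_sources, targeted_destinations = find_lines(build_image(traffic), min_cells=40)
```

A real detector would compare each frame against earlier frames rather than use a fixed cutoff; the cutoff keeps the sketch short.
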
DCT analysis of addresses

Semi-random attacks

Random attacks

Complex attacks

Better than volume analysis

Evaluation
True Positive Rate
False Alarm Rate or False Positive Rate
True Negative Rate
False Negative Rate
LR = true positive rate / false positive rate
NLR = false negative rate / true negative rate
Ideally, LR = infinity, NLR = 0

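The two ratios can be computed from raw detection counts as in the sketch below. The function name and argument names are hypothetical, and the sketch assumes the evaluation set contains at least one attack sample and at least one normal sample.

```python
def likelihood_ratios(tp, fp, tn, fn):
    """Return (LR, NLR) from true/false positive and negative counts."""
    tpr = tp / (tp + fn)    # true positive rate
    fnr = fn / (tp + fn)    # false negative rate
    fpr = fp / (fp + tn)    # false positive (false alarm) rate
    tnr = tn / (fp + tn)    # true negative rate
    lr = tpr / fpr if fpr > 0 else float("inf")
    nlr = fnr / tnr if tnr > 0 else float("inf")
    return lr, nlr
```

For example, a detector with 90 detected and 10 missed attacks, and 5 false alarms over 100 normal samples, has LR = 0.9 / 0.05 = 18 and NLR = 0.1 / 0.95, about 0.105. A perfect detector gives LR = infinity and NLR = 0.
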
Comparison of Scalar signals

Protocol Composition
During an attack, the attack protocol's volume will be higher
  – Observation of changes can lead to detection

Protocol Composition

Address based signals

Port Number Domain

Thresholds vs. Detection

Motion prediction

Advantages
Not looking for specific known attacks
Generic mechanism
Works in real-time
  – Latencies of a few samples
  – Simple enough to be implemented inline

Prototypes
Linux PC boxes
On Intel network processors
  – Can push to Gbps packet forwarding rates
  – Forwarding throughput not impacted
  – Sampling rates of a few ms possible

Related Work
Resource usage monitoring
  – Estan & Varghese
  – Bloom filters
  – Kodialam & Lakshman: run detection
  – Mahajan et al.: RED-PD
  – Duffield (AT&T): sampling
  – Others

Related Work – Worms
Payload monitoring
  – Singh, Savage & Varghese; Tang & Chen
  – Look for matches against constant-length payloads
Sampling, Rabin signatures
  – Prototype implementation
  – Detects worms within 5-30 seconds
  – Effective with polymorphic worms

Related Work – Worms
Look for TCP Reset signals
  – Weaver & Paxson
  – Random host scans at specific ports
  – Not all hosts open the attack port
  – An attacking worm will get many Resets
  – Too many Resets => attacker
  – Effective for TCP-based attacks
  – Can detect/contain in real-time

Related Work – Worms
Quick-spreading worms use randomly generated addresses
  – Normal users use names, DNS
  – Worms don’t have DNS activity
  – Lots of accesses without DNS requests => worms
  – Many detectors within a campus: local DNS servers

Related Work – Worms
Address honeypots
  – Arbor Networks, Paxson, Crowcroft
  – Configure machines to accept packets for unassigned addresses
  – Only worms will contact these machines
  – Capture payloads to analyze
  – Quickly propagate signatures

Related Work – Worms
IP Traceback (Savage et al.)
  – Address spoofing makes the origin of attacks difficult to detect
  – Tracing, if universal, will limit attacks
      Fear of detection
  – Post-attack detection
      Not helpful in mitigation or detection
  – Most attack machines are innocent participants

Related Work – host based
Limit the number of new connections of individual hosts
  – Twycross & Williamson (HP)
  – Reduces the speed at which a worm can spread
  – Can be used to detect worms
Monitor application execution sequences
  – Profiling-based indication of anomalous behavior => detect and sandbox worms

Conclusion
Real-time resource accounting is feasible
Real-time traffic monitoring is feasible
  – Simple enough to be implemented inline
Can rely on many tools from the signal/image processing area
  – More robust offline analysis possible
  – Concise for logging and playback

Thank you!!
For more information,

LRU-RED Results

RTT Bias – TCP flows

Impact of Cache size
Effect of varying cache size
  – To study the impact of cache size on the performance of the scheme
  – Probability = 1/55, threshold = 125
  – Number of TCP flows = 20
  – Equal weights for both queues

Results – Cache size

Normal Workloads
Performance under normal workloads
  – How the scheme works when non-responsive loads are absent or use only their fair share of bandwidth
  – Cache size = 9, threshold = 125
  – Probability = 1/55

Results – Normal workload

Normal Mixed workload