Detecting Botnets Using Hidden Markov Models on Network Traces
Wade Gobel
Bio-Grid, Summer 2008
Bots and Botnets
- Malicious self-propagating program
- Difficult to detect
  - Most antivirus software is signature-based
- Ability to communicate and coordinate with botmaster
  - IRC
  - HTTP
- Prevalence
  - Honeypots
- Size is power
Bot Infection
- Security flaws
  - Port scanning
- Compromised servers
  - Increase range
  - Allow for communication indirection
Bot Attacks & Profits
- “Renting out” a botnet
- Spam
- DDoS
- Click fraud
- Identity theft
Bot Detection
- Indicators
  - Similar requests
  - Synchronization
- Problems
  - Potentially little traffic
  - Potential delay between command and action
Hidden Markov Models
Initial State Probabilities
Transition Probabilities
Observation Probabilities
Complete HMM
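The pieces named on the preceding slides can be written compactly in the standard textbook notation. The symbols below ($\lambda$, $\pi$, $A$, $B$, $q_t$, $o_t$, $s_i$, $v_k$, $N$, $M$) are assumptions taken from that convention and do not appear in the slides themselves.

```latex
% A discrete HMM with N hidden states s_1..s_N and M observation symbols v_1..v_M
\lambda = (\pi, A, B)

% Initial state probabilities
\pi_i = P(q_1 = s_i), \qquad \sum_{i=1}^{N} \pi_i = 1

% Transition probabilities
a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i), \qquad \sum_{j=1}^{N} a_{ij} = 1

% Observation probabilities
b_j(k) = P(o_t = v_k \mid q_t = s_j), \qquad \sum_{k=1}^{M} b_j(k) = 1
```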
Example
- States:
- Observations:
- Question: What’s the weather been like?
Example courtesy of
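The question on this slide is a decoding problem: given only the observations, recover the most likely sequence of hidden states. The sketch below uses the Viterbi algorithm, which the slides do not name. The state names, observation names, and every probability in it are hypothetical values chosen for illustration, not the ones from the original example.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for an observation sequence."""
    # best[t][s] = log-probability of the best path that ends in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace the back-pointers from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Hypothetical model: all names and numbers are illustrative assumptions.
states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
emit_p = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}

weather = viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
print(weather)  # ['Sunny', 'Rainy', 'Rainy']
```

Log-probabilities are used so that long sequences do not underflow; the sketch assumes every probability in the model is strictly positive.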
Modeling with HMMs
- Only the observations are given
- Generate the HMM most likely to have produced the observed sequence
- The Baum-Welch algorithm
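A minimal sketch of Baum-Welch re-estimation for a discrete HMM, assuming a single observation sequence of integer symbols and a random starting model. The function names (`forward_backward`, `baum_welch_step`, `fit`) and the symbol encoding are illustrative assumptions, not the implementation used in this project.

```python
import math
import random

def forward_backward(obs, pi, A, B):
    """Scaled forward and backward passes; returns alpha, beta, and the scales."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    scale = [0.0] * T
    alpha[0] = [pi[i] * B[i][obs[0]] for i in range(n)]
    for t in range(T):
        if t > 0:
            alpha[t] = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                        for j in range(n)]
        scale[t] = sum(alpha[t])                      # normalise each step to avoid underflow
        alpha[t] = [a / scale[t] for a in alpha[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   / scale[t + 1] for i in range(n)]
    return alpha, beta, scale

def baum_welch_step(obs, pi, A, B):
    """One EM update; also returns log P(obs | model) for the input model."""
    n, m, T = len(pi), len(B[0]), len(obs)            # assumes T >= 2
    alpha, beta, scale = forward_backward(obs, pi, A, B)
    # gamma[t][i] = P(state i at time t | obs)
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    new_A = [[0.0] * n for _ in range(n)]
    new_B = [[0.0] * m for _ in range(n)]
    for i in range(n):
        visits = sum(gamma[t][i] for t in range(T - 1))
        for j in range(n):
            # expected number of i -> j transitions
            xi = sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / scale[t + 1]
                     for t in range(T - 1))
            new_A[i][j] = xi / visits
        total = sum(gamma[t][i] for t in range(T))
        for t in range(T):
            new_B[i][obs[t]] += gamma[t][i] / total
    return gamma[0][:], new_A, new_B, sum(math.log(c) for c in scale)

def fit(obs, n_states, n_symbols, iterations=20, seed=0):
    """Run Baum-Welch from a random, strictly positive starting model."""
    rng = random.Random(seed)
    def row(k):
        w = [rng.random() + 0.1 for _ in range(k)]
        return [x / sum(w) for x in w]
    pi = row(n_states)
    A = [row(n_states) for _ in range(n_states)]
    B = [row(n_symbols) for _ in range(n_states)]
    history = []
    for _ in range(iterations):
        pi, A, B, log_lik = baum_welch_step(obs, pi, A, B)
        history.append(log_lik)
    return pi, A, B, history

# Hypothetical encoding: 0 = quiet interval, 1 = interval containing a request
trace = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1]
pi, A, B, history = fit(trace, n_states=2, n_symbols=2)
```

Each update cannot lower the likelihood of the observed sequence, so `history` should be non-decreasing. Baum-Welch only finds a local optimum, so the result depends on the starting model.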
The Process
- Collect network data
- Extract some characteristic
- The HMM models the underlying state of the computer or network
- Test for similarity between HMMs
  - Synchronization may result in greater similarity
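The slides do not say how similarity between HMMs is measured. One possible choice, sketched here as an assumption, is a symmetric log-likelihood gap in the spirit of the Juang-Rabiner probabilistic distance: each model is scored on its own trace and on the other host's trace, and the per-symbol difference is averaged. The names `log_likelihood` and `hmm_distance` and both example models are hypothetical.

```python
import math

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: returns log P(obs | model).

    Assumes the model gives the sequence non-zero probability.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        c = sum(alpha)
        total += math.log(c)
        alpha = [a / c for a in alpha]   # rescale so long traces do not underflow
    return total

def hmm_distance(model_a, obs_a, model_b, obs_b):
    """Symmetric per-symbol log-likelihood gap between two (pi, A, B) models.

    A value of 0 means each model explains both traces equally well.
    """
    d_ab = (log_likelihood(obs_a, *model_a) - log_likelihood(obs_a, *model_b)) / len(obs_a)
    d_ba = (log_likelihood(obs_b, *model_b) - log_likelihood(obs_b, *model_a)) / len(obs_b)
    return (d_ab + d_ba) / 2

# Hypothetical two-state models over symbols {0: quiet, 1: request}
host_x = ([0.5, 0.5], [[0.9, 0.1], [0.8, 0.2]], [[0.95, 0.05], [0.1, 0.9]])
host_y = ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.6, 0.4], [0.4, 0.6]])
trace_x = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
trace_y = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]

print(hmm_distance(host_x, trace_x, host_y, trace_y))
```

Under this measure, hosts whose traffic is driven by the same command schedule would be expected to yield a smaller distance than unrelated hosts, which is one way to read the synchronization point above.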
Sample Data Variation
- Regular / random intervals
- Same / different number of bot-initiated requests
- Synchronization
- With / without user browsing
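A rough sketch of how sample traces covering these variations could be generated. The function `make_trace` and all of its parameters are hypothetical; it only illustrates the four axes of variation listed above and is not the data generator used for this work.

```python
import random

def make_trace(length, bot_period, regular=True, n_bot_requests=1,
               user_rate=0.0, offset=0, seed=0):
    """Return per-time-slot request counts for one simulated host.

    regular        -- True: bot fires every bot_period slots; False: random gaps
    n_bot_requests -- requests the bot sends each time it fires
    user_rate      -- per-slot probability of one user-browsing request (0 = none)
    offset         -- slot of the first bot action; equal offsets mean synchronized hosts
    """
    rng = random.Random(seed)
    trace = [0] * length
    t = offset
    while t < length:
        trace[t] += n_bot_requests
        # Random gaps are uniform on [1, 2 * bot_period - 1], so the mean gap
        # matches the regular case.
        t += bot_period if regular else rng.randint(1, 2 * bot_period - 1)
    for slot in range(length):
        if rng.random() < user_rate:
            trace[slot] += 1
    return trace

# Two synchronized bots, one desynchronized bot, and one bot hidden in user traffic
bot_a = make_trace(60, bot_period=10)
bot_b = make_trace(60, bot_period=10, seed=1)
bot_late = make_trace(60, bot_period=10, offset=3)
bot_noisy = make_trace(60, bot_period=10, regular=False, user_rate=0.2, seed=2)
```

Feeding traces like these through the modeling and similarity steps gives a controlled way to check whether synchronized hosts really do end up with more similar HMMs.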