1 Modeling and Measuring Botnets
David Dagon, Wenke Lee (Georgia Institute of Technology)
Cliff C. Zou (Univ. of Central Florida)
Funded by NSF CyberTrust Program, 2006
2 Outline
- Motivation
- Diurnal modeling of botnet propagation
- Botnet population estimation
- Botnet threat assessment
3 Motivation
- Botnets have become a serious threat
- Little research on botnets exists so far
  - Empirical analysis of captured botnets
  - Mainly based on honeypot spying
- Need to understand the botnet as a network
  - Botnet growth dynamics
  - Botnet (online) population, threat level, …
- Be well prepared for next-generation botnets
5 Botnet Monitor: Gatech KarstNet
- Many bots use a Dyn-DNS name to find their C&C server
- KarstNet informs the DNS provider of cc1.com
  - cc1.com is detected by its abnormal DNS queries
- The DNS provider maps cc1.com to the Gatech sinkhole (DNS hijack)
- All or most bots then attempt to connect to the sinkhole
[Figure: bots, attacker, C&C server (cc1.com), and the KarstNet sinkhole]
6 Diurnal Pattern in Monitored Botnets
- The diurnal pattern affects the botnet propagation rate
- The diurnal pattern affects the botnet attack strength
7 Botnet Diurnal Propagation Model
- Model botnet propagation via vulnerability exploits
  - Same as worm propagation
  - Extension of epidemic models
- Model the diurnal pattern
  - Computers in one time zone share the same diurnal pattern
  - "Diurnal shaping function" $\alpha_i(t)$ of time zone i
    - Percentage of online hosts in time zone i
    - Derived from the continuous connection attempts by bots in time zone i to Gatech KarstNet
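8 Modeling Propagation: Single Time Zone
- $I(t)$: # of infected hosts
- $S(t)$: # of vulnerable hosts
- $I'(t)$: # of online infected hosts
- $S'(t)$: # of online vulnerable hosts
- Diurnal pattern means: $I'(t) = \alpha(t)\,I(t)$ and $S'(t) = \alpha(t)\,S(t)$
- Diurnal model: the epidemic model applied to the online populations, with a removal term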
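The single-time-zone model can be explored numerically. The sketch below assumes a standard epidemic model with removal applied to the online populations: $dI/dt = \beta\,\alpha(t)^2 I(t) S(t) - \gamma\,\alpha(t) I(t)$ and $dR/dt = \gamma\,\alpha(t) I(t)$, where $\alpha(t)$ is the fraction of hosts online. The symbols $\beta$ (pairwise infection rate) and $\gamma$ (removal rate), the cosine-shaped `alpha`, and every parameter value are illustrative assumptions, not figures from the slides.

```python
import math

def alpha(t_hours, low=0.3, high=1.0, peak_hour=12.0):
    """Hypothetical diurnal shaping function: fraction of hosts online."""
    mid = (high + low) / 2.0
    amp = (high - low) / 2.0
    return mid + amp * math.cos(2.0 * math.pi * (t_hours - peak_hour) / 24.0)

def simulate(n_hosts=100_000, i0=10.0, beta=5e-6, gamma=0.01,
             hours=96.0, dt=0.01, diurnal=True):
    """Forward-Euler integration; returns a list of (t, infected, removed)."""
    infected, removed = i0, 0.0
    out = [(0.0, infected, removed)]
    for k in range(int(round(hours / dt))):
        t = k * dt
        # With diurnal=False the model reduces to the plain epidemic model.
        a = alpha(t) if diurnal else 1.0
        susceptible = n_hosts - infected - removed
        new_removed = gamma * a * infected                      # online infected get cleaned
        new_infected = beta * a * a * infected * susceptible    # online meets online
        infected += dt * (new_infected - new_removed)
        removed += dt * new_removed
        out.append((t + dt, infected, removed))
    return out

if __name__ == "__main__":
    for label, flag in (("diurnal", True), ("flat", False)):
        t, i, r = simulate(hours=24.0, diurnal=flag)[-1]
        print(f"{label}: infected after {t:.0f} h = {i:.0f}")
```

Running both variants side by side shows how the shaping function slows early growth when the outbreak starts while few hosts are online.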
9 Modeling Propagation: K Multiple Time Zones (Internet)
- Limited ability to model non-uniform scanning
- Model parameters
  - Scan rate from zone j to zone i
  - IP space size of zone i
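A multi-zone version can be sketched the same way. The code below assumes each zone i is infected by the online infected hosts of every zone j, weighted by the scan rate from j to i divided by the IP space size of i, and that removal acts on online infected hosts. The names `eta`, `omega`, and `gamma`, the cosine-shaped `alpha`, and all values are hypothetical choices for illustration.

```python
import math

def alpha(t_hours, utc_offset, low=0.3, high=1.0):
    """Hypothetical shaping function: online fraction peaks at local noon."""
    local = t_hours + utc_offset
    mid, amp = (high + low) / 2.0, (high - low) / 2.0
    return mid + amp * math.cos(2.0 * math.pi * (local - 12.0) / 24.0)

def simulate_zones(n, i0, offsets, eta, omega, gamma=0.01, hours=48.0, dt=0.01):
    """Forward-Euler integration of K coupled time zones.

    n[i]: vulnerable population of zone i; i0[i]: initially infected hosts;
    offsets[i]: UTC offset in hours; eta[j][i]: scan rate from zone j to i;
    omega[i]: IP space size of zone i. Returns final (infected, removed) lists.
    """
    k = len(n)
    infected, removed = list(i0), [0.0] * k
    for step in range(int(round(hours / dt))):
        t = step * dt
        a = [alpha(t, offsets[i]) for i in range(k)]
        new_inf, new_rem = [], []
        for i in range(k):
            # Infection pressure on zone i from online infected hosts in all zones.
            pressure = sum(eta[j][i] / omega[i] * a[j] * infected[j] for j in range(k))
            susceptible = n[i] - infected[i] - removed[i]
            new_inf.append(a[i] * susceptible * pressure)
            new_rem.append(gamma * a[i] * infected[i])
        for i in range(k):
            infected[i] += dt * (new_inf[i] - new_rem[i])
            removed[i] += dt * new_rem[i]
    return infected, removed

if __name__ == "__main__":
    uniform = [[40.0, 40.0], [40.0, 40.0]]   # every zone scans every zone equally
    inf, rem = simulate_zones([50_000.0, 50_000.0], [10.0, 0.0],
                              [0.0, 12.0], uniform, [2.0**24, 2.0**24])
    print([round(x) for x in inf])
```

Setting the off-diagonal entries of `eta` to zero isolates the zones, which is one simple way to express non-uniform scanning in this sketch.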
10 Validation: Fitting model to botnet data
- The diurnal model fits the data more accurately than the traditional epidemic model
11 Applications of diurnal model
- Predict future botnet growth from monitored botnets
  - Do they use the same vulnerability? Then they have a similar $\alpha(t)$
- Improve response priority
  - For botnets released at different times
13 Population estimation I: Capture-recapture
- Botnet population ≈ (# observed in sample 1 × # observed in sample 2) / (# observed in both samples)
- How to obtain two independent samples?
  - KarstNet monitors two C&C servers for one botnet
    - Need to verify independence with more data
    - Study how to get a good estimate when the two samples are not independent
  - KarstNet + honeypot spying
    - Guaranteed independence?
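The capture-recapture estimate is the classic Lincoln-Petersen estimator. A minimal sketch, assuming each sample is a collection of bot IP addresses and that the two samples are independent; the function name and the example addresses are hypothetical.

```python
def lincoln_petersen(sample_a, sample_b):
    """Estimate population size from two samples of bot identifiers."""
    a, b = set(sample_a), set(sample_b)
    both = len(a & b)  # bots observed in both samples
    if both == 0:
        raise ValueError("no overlap between samples; estimate is undefined")
    return len(a) * len(b) / both

if __name__ == "__main__":
    # Hypothetical samples: 100 and 80 bots, 20 of them seen in both.
    seen_at_cc1 = [f"10.0.0.{i}" for i in range(100)]
    seen_at_cc2 = [f"10.0.0.{i}" for i in range(80, 160)]
    print(lincoln_petersen(seen_at_cc1, seen_at_cc2))
```

If the samples are positively correlated (the same bots tend to show up in both), the overlap is inflated and this estimator undercounts, which is why the independence question on the slide matters.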
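14 Population estimation II: DNS cache snooping
- Estimate the # of bots in each domain from the DNS queries for the C&C name sent to the domain's local DNS server
- A non-recursive query does not change the DNS cache
- If the query inter-arrival time is exponentially distributed, then $T_i$ follows the same exponential distribution (memoryless)
- The query rate per bot converts the observed query rate into a # of bots
[Figure: timeline of cache entries, each lasting one cache TTL, with the intervals $T_i$]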
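The estimation step can be sketched as follows, assuming $T_i$ is the gap between a cached entry expiring and the next query that re-populates the cache. Under the exponential assumption, the mean gap estimates the inverse of the aggregate query rate at that DNS server, and dividing by the per-bot query rate gives a bot count. The function names, the per-bot rate, and the simulated data are hypothetical.

```python
import random

def estimate_bots(gaps_seconds, per_bot_rate):
    """Estimate the # of bots behind one DNS server.

    gaps_seconds: observed T_i values; per_bot_rate: queries per second per bot.
    """
    if not gaps_seconds or per_bot_rate <= 0:
        raise ValueError("need at least one gap and a positive per-bot rate")
    mean_gap = sum(gaps_seconds) / len(gaps_seconds)
    if mean_gap <= 0:
        raise ValueError("mean gap must be positive")
    aggregate_rate = 1.0 / mean_gap  # maximum-likelihood rate of an exponential
    return aggregate_rate / per_bot_rate

def simulate_gaps(n_bots, per_bot_rate, n_cycles, seed=0):
    """Draw synthetic T_i values for n_bots independent Poisson query sources."""
    rng = random.Random(seed)
    return [rng.expovariate(n_bots * per_bot_rate) for _ in range(n_cycles)]

if __name__ == "__main__":
    rate = 1.0 / 600.0  # assumed: one C&C lookup per bot every 10 minutes
    gaps = simulate_gaps(n_bots=200, per_bot_rate=rate, n_cycles=5000)
    print(round(estimate_bots(gaps, rate)))
```

The estimate is only as good as the assumed per-bot rate, so in practice that rate would need to come from observing captured bots.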
16 Basic threat assessment
- Botnet size (population estimation)
- Active/online population at attack time (diurnal model)
- IP addresses of bots in botnets
  - Basis for effective filtering/defense
  - KarstNet is a good monitor for this
  - Honeypot spying is not good at this
- Botnet control structure (easy to disrupt?)
  - IPs and # of C&C servers for a botnet?
  - P2P botnets?
17 Botnet attack bandwidth
- Bot bandwidth follows a heavy-tailed distribution
  - Filtering 32% of bots cuts off 70% of attack traffic
- What about bot bandwidth in terms of ASes? Is it heavy-tailed as well?
  - If yes, then contacting the top x% of ASes is enough for a victim to defend against a botnet DDoS attack
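The "top x% carries y% of traffic" question is easy to compute once per-bot bandwidth is measured. A minimal sketch, assuming bandwidth measurements are available as plain numbers and, for the AS view, as (AS number, bandwidth) pairs; the function names and the sample data are hypothetical.

```python
from collections import defaultdict

def traffic_share_of_top(bandwidths, fraction):
    """Share of total bandwidth held by the top `fraction` of sources."""
    if not bandwidths or not 0.0 <= fraction <= 1.0:
        raise ValueError("need measurements and a fraction in [0, 1]")
    ranked = sorted(bandwidths, reverse=True)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("total bandwidth must be positive")
    k = int(round(fraction * len(ranked)))  # how many sources to filter
    return sum(ranked[:k]) / total

def bandwidth_by_as(bots):
    """Sum bandwidth per AS from (asn, bandwidth) pairs."""
    totals = defaultdict(float)
    for asn, bandwidth in bots:
        totals[asn] += bandwidth
    return dict(totals)

if __name__ == "__main__":
    # Hypothetical measurements in kbit/s, using documentation-range AS numbers.
    bots = [(64500, 100.0), (64500, 50.0), (64501, 10.0), (64502, 40.0)]
    per_bot = [bw for _, bw in bots]
    per_as = list(bandwidth_by_as(bots).values())
    print(traffic_share_of_top(per_bot, 0.25))
    print(traffic_share_of_top(per_as, 1.0 / 3.0))
```

Applying the same function to per-AS totals answers the AS-level question directly: a high share for a small fraction means a victim needs to contact only a few ASes.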