Modeling and Measuring Botnets
David Dagon, Wenke Lee (Georgia Institute of Technology)
Cliff C. Zou (Univ. of Central Florida)
Outline
  - Motivation
  - Diurnal modeling of botnet propagation
  - Botnet population estimation
  - Botnet threat assessment
  - Advanced botnets
Motivation
  - Botnets have become a serious threat
  - Not much research on botnets yet
    - Empirical analysis of captured botnets
    - Mainly based on honeypot spying
  - Need to understand the network of a botnet
    - Botnet growth dynamics
    - Botnet (online) population, threat level, ...
  - Be well prepared for next-generation botnets
Botnet Monitor: Gatech KarstNet
  - Many bots use a Dyn-DNS name to find their C&C
  - KarstNet detects a C&C name (e.g., cc1.com) by its abnormal DNS queries
  - KarstNet informs the DNS provider of cc1.com
  - The DNS provider maps cc1.com to the Gatech sinkhole (DNS hijack)
  - All or most bots then attempt to connect to the sinkhole
  [Figure: attacker, C&C servers (cc1.com), bots, and the KarstNet sinkhole]
Diurnal Pattern in Monitored Botnets
  - The diurnal pattern affects the botnet propagation rate
  - The diurnal pattern affects the botnet attack strength
Botnet Diurnal Propagation Model
  - Model botnet propagation via vulnerability exploits
    - Same as worm propagation
    - Extension of epidemic models
  - Model the diurnal pattern
    - Computers in one time zone share the same diurnal pattern
    - "Diurnal shaping function" $\alpha_i(t)$ of time zone i
      - Percentage of online hosts in time zone i
      - Derived from the continuous connection attempts by bots in time zone i to Gatech KarstNet
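As a rough illustration of how a shaping function could be derived from sinkhole connection attempts, the sketch below bins attempts by hour of day and scales the result so the busiest hour equals 1.0. The function name `diurnal_shaping` and the peak normalisation are assumptions made for this example; the slides only state that the function is derived from bots' connection attempts.

```python
from collections import Counter

def diurnal_shaping(hours_seen, period=24):
    """Estimate a shaping function for one time zone from connection attempts.

    hours_seen: iterable of integer hour-of-day values, one per observed
    connection attempt from bots in that time zone.
    Returns a list of length `period`, scaled so the busiest hour is 1.0
    (peak normalisation is an assumption of this sketch).
    """
    counts = Counter(h % period for h in hours_seen)
    peak = max(counts.values(), default=0)
    if peak == 0:
        return [0.0] * period
    return [counts.get(h, 0) / peak for h in range(period)]

# Toy log: four attempts at 14:00, two at 15:00, one at 03:00.
alpha = diurnal_shaping([14, 14, 14, 14, 15, 15, 3])
```

With real data, the hour values would first be shifted into each bot's local time zone before binning.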
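Modeling Propagation: Single Time Zone
  - $I(t)$: # of infected; $I'(t)$: # of online infected
  - $S(t)$: # of vulnerable; $S'(t)$: # of online vulnerable
  - Diurnal pattern means: $I'(t) = \alpha(t) I(t)$, $S'(t) = \alpha(t) S(t)$
  - Epidemic model: $\frac{dI(t)}{dt} = \beta I(t) S(t) - \frac{dR(t)}{dt}$
  - Diurnal model: $\frac{dI(t)}{dt} = \beta I'(t) S'(t) - \frac{dR(t)}{dt}$
  - $\frac{dR(t)}{dt}$: removal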
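To make the difference between the two models concrete, here is a minimal forward-Euler sketch. It assumes the diurnal form dI/dt = beta * a(t)^2 * I * S - gamma * a(t) * I with removal dR/dt = gamma * a(t) * I, which follows the time-zone paper cited in the references; the sinusoidal `alpha`, the parameter values, and the function names are hypothetical.

```python
import math

def alpha(t_hours):
    """Hypothetical shaping function: fraction of hosts online, in [0.2, 1.0]."""
    return 0.6 + 0.4 * math.sin(2 * math.pi * t_hours / 24.0)

def simulate(n_vulnerable, i0, beta, gamma, hours, dt=0.01, shaping=alpha):
    """Forward-Euler integration of the single-time-zone diurnal model.

    Assumed equations (see lead-in):
        dI/dt = beta * a(t)^2 * I * S - gamma * a(t) * I
        dR/dt = gamma * a(t) * I,   S = N - I - R
    Passing shaping=lambda t: 1.0 recovers the classic epidemic model.
    """
    infected, removed = float(i0), 0.0
    for k in range(int(round(hours / dt))):
        a = shaping(k * dt)
        susceptible = n_vulnerable - infected - removed
        new_infections = beta * a * a * infected * susceptible * dt
        new_removals = gamma * a * infected * dt
        infected += new_infections - new_removals
        removed += new_removals
    return infected, removed

# Same parameters, with and without the diurnal effect (illustrative values).
diurnal_i, diurnal_r = simulate(10000, 10, 5e-5, 0.01, 12)
classic_i, classic_r = simulate(10000, 10, 5e-5, 0.01, 12, shaping=lambda t: 1.0)
```

Because only online hosts can infect or be infected, the diurnal run grows more slowly than the classic run over the same interval.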
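Modeling Propagation: K Multiple Time Zones (Internet)
  - $\frac{dI_i(t)}{dt} = \alpha_i(t) S_i(t) \sum_{j=1}^{K} \beta_{ji} \alpha_j(t) I_j(t) - \frac{dR_i(t)}{dt}$, with $\beta_{ji} = \eta_{ji} / \Omega_i$
    - $I_i(t)$, $S_i(t)$, $\alpha_i(t)$: infected count, vulnerable count, and shaping function of zone i
    - $\eta_{ji}$: scan rate from zone j to zone i
    - $\Omega_i$: IP space size of zone i
  - Limited ability to model non-uniform scanning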
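A small multi-zone sketch can show how infection in one zone drives growth in another. It assumes each zone i follows dI_i/dt = a_i(t) * S_i * sum_j (eta_ji / omega_i) * a_j(t) * I_j - gamma_i * a_i(t) * I_i, following the cited time-zone paper; the shifted-sine shaping function, the parameter values, and all names are hypothetical.

```python
import math

def zone_alpha(t_hours, utc_offset):
    """Hypothetical shaping function: one daily curve shifted by the zone's offset."""
    return 0.6 + 0.4 * math.sin(2 * math.pi * (t_hours + utc_offset) / 24.0)

def simulate_zones(n, i0, eta, omega, gamma, offsets, hours, dt=0.01):
    """Forward-Euler integration of the assumed K-zone diurnal model.

    n[i]: vulnerable hosts in zone i; i0[i]: initially infected in zone i;
    eta[j][i]: scan rate from zone j to zone i; omega[i]: IP space size of zone i;
    gamma[i]: removal rate in zone i; offsets[i]: hour offset of zone i.
    """
    k = len(n)
    infected = [float(x) for x in i0]
    removed = [0.0] * k
    for step in range(int(round(hours / dt))):
        t = step * dt
        a = [zone_alpha(t, off) for off in offsets]
        d_inf, d_rem = [], []
        for i in range(k):
            susceptible = n[i] - infected[i] - removed[i]
            # Scans arriving at zone i from online infected hosts in every zone j.
            pressure = sum(eta[j][i] / omega[i] * a[j] * infected[j] for j in range(k))
            new_removals = gamma[i] * a[i] * infected[i] * dt
            d_inf.append(a[i] * susceptible * pressure * dt - new_removals)
            d_rem.append(new_removals)
        for i in range(k):
            infected[i] += d_inf[i]
            removed[i] += d_rem[i]
    return infected, removed

# Two zones 12 hours apart; the outbreak starts in zone 0 only.
sizes, start = [5000, 5000], [10, 0]
space, removal, shift = [1e6, 1e6], [0.01, 0.01], [0, 12]
uniform = [[100, 100], [100, 100]]
inf, rem = simulate_zones(sizes, start, uniform, space, removal, shift, 24)
```

Setting the cross-zone entries of `eta` to zero isolates the zones, which is one simple way to experiment with non-uniform scanning in this sketch.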
Validation: Fitting model to botnet data
  - The diurnal model is more accurate than the traditional epidemic model
Applications of diurnal model
  - Predict future botnet growth from monitored ones
    - Botnets that use the same vulnerability have similar $\alpha(t)$
  - Improve response priority
    - Botnets released at different times
Population estimation I: Capture-recapture
  - $N = \frac{n_1 n_2}{n_{12}}$
    - $N$: botnet population
    - $n_1$, $n_2$: # of bots observed in each of the two samples
    - $n_{12}$: # of bots observed in both samples
  - How to obtain two independent samples?
    - KarstNet monitors two C&Cs for one botnet
      - Need to verify independence with more data
      - Study how to get a good estimate when the two samples are not independent
    - KarstNet + honeypot spying
      - Guaranteed independence?
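The capture-recapture idea can be sketched with the standard Lincoln-Petersen estimator, which divides the product of the two sample sizes by the size of their overlap. The function name and the toy identifiers below are hypothetical, and the estimate is only meaningful if the two samples are independent, which the slide flags as an open question.

```python
def lincoln_petersen(sample_a, sample_b):
    """Estimate a population size from two samples of observed bot identifiers.

    Returns len(A) * len(B) / len(A & B). Raises ValueError when the samples
    share no members, because the estimate is undefined in that case.
    """
    a, b = set(sample_a), set(sample_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("samples do not overlap; estimate is undefined")
    return len(a) * len(b) / overlap

# Toy data: 100 bots seen at one monitor, 200 at another, 50 seen at both.
seen_at_first = [f"bot-{i}" for i in range(0, 100)]
seen_at_second = [f"bot-{i}" for i in range(50, 250)]
estimate = lincoln_petersen(seen_at_first, seen_at_second)
```

Using sets means repeated sightings of the same bot within one sample are counted once.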
Population estimation II: DNS cache snooping
  - Estimate the # of bots in each domain via the DNS queries for the C&C name sent to the domain's local DNS server
  - A non-recursive query will not change the DNS cache
  [Figure: timeline of cache TTL periods separated by gaps $T_i$]
  - If the query inter-arrival time is exponentially distributed, then $T_i$ follows the same exponential distribution (memoryless)
  - Query rate per bot
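One way to read this slide is sketched below, under explicit assumptions: each gap T_i is taken to be the time from a cache entry expiring to the next query that refreshes it, the aggregate query rate is estimated as the reciprocal of the mean gap, and the bot count is that rate divided by an assumed per-bot query rate. The function name and the numbers are hypothetical.

```python
def estimate_bots(gaps_seconds, per_bot_rate):
    """Estimate the number of bots behind one local DNS server.

    gaps_seconds: observed gaps T_i (assumed: cache expiry to next refresh).
    per_bot_rate: assumed C&C lookups per second issued by a single bot.
    With exponential inter-arrival times, 1 / mean(T_i) estimates the
    aggregate query rate; dividing by the per-bot rate gives a bot count.
    """
    gaps = list(gaps_seconds)
    if not gaps or any(g < 0 for g in gaps) or sum(gaps) == 0:
        raise ValueError("need at least one positive gap")
    if per_bot_rate <= 0:
        raise ValueError("per_bot_rate must be positive")
    aggregate_rate = len(gaps) / sum(gaps)
    return aggregate_rate / per_bot_rate

# Toy data: mean gap of 20 s, each bot assumed to look up the C&C once per hour.
bots = estimate_bots([10.0, 20.0, 30.0], per_bot_rate=1 / 3600)
```

The result is sensitive to the assumed per-bot rate, so in practice that value would need to be measured from captured bots.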
Basic threat assessment
  - Botnet size (population estimation)
  - Active/online population at attack time (diurnal model)
  - IP addresses of bots in botnets
    - Basis for effective filtering/defense
    - KarstNet is a good monitor for this
    - Honeypot spying is not good at this
  - Botnet control structure (easy to disrupt?)
    - IPs and # of C&Cs for a botnet?
    - P2P botnets?
Botnet attack bandwidth
  - Bot bandwidth: heavy-tailed distribution
    - Filtering 32% of bots cuts off 70% of attack traffic
  - Is bot bandwidth also heavy-tailed in terms of ASes?
    - If yes, then contacting the top x% of ASes is enough for a victim to defend against a botnet DDoS attack
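The filtering claim can be checked on any bandwidth sample with a few lines of code. This sketch sorts bots by bandwidth and reports the share of total traffic carried by the top fraction; the function name and the toy bandwidth values are hypothetical and are not the measured data behind the 32%/70% figure.

```python
def traffic_cut(bandwidths, fraction):
    """Share of total attack bandwidth removed by filtering the top bots.

    bandwidths: per-bot (or per-AS) bandwidth values, any consistent unit.
    fraction: fraction of sources to filter, highest bandwidth first.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(bandwidths, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    filtered = int(round(fraction * len(ordered)))
    return sum(ordered[:filtered]) / total

# Toy heavy-tailed sample: two fast bots dominate the total.
sample = [100, 50, 10, 10, 10, 10, 5, 3, 1, 1]
cut = traffic_cut(sample, 0.2)
```

The same function works at AS granularity if the input is aggregated bandwidth per AS.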
Monitoring evasion by botmasters
  - Honeypot detection
    - Honeypot defenders are liable for attacks sent out from their honeypots
    [Figure: C&C, bot, and (secret) sensor, with arrows labeled "malicious traffic", "Inform bot's IP", and "Authorize"]
  - C&C hijacking detection (e.g., KarstNet)
    - Check whether C&C names map to their real IPs
      - The attacker knows which computers are used for C&Cs
    - Check whether the C&C passes trivial commands to bots
Advanced hybrid P2P botnet
  - Why would attackers use P2P?
    - Remove the control bottleneck (C&C)
      - C&Cs are easy to monitor
        - One honeypot spy reveals all C&Cs
        - One captured/hijacked C&C reveals all bots
      - C&Cs are easy to shut down (limited number)
  - Current P2P protocols will not work for botnets
    - The bootstrap process is vulnerable to blocking
    - Must disable the global view from each bot (prevent monitoring)
    - Must consider DHCP, private IPs, firewalls, capture, removal
Advanced botnet designs
  - Servent bots: static IP, no firewall blocking
  - Peer-list based connection
    - Max number of servent bot IPs in each bot
      - Limited view of the botnet
    - Built as the botnet spreads
      - No bootstrap process
      - No reveal of the entire botnet
  - Compared to C&C botnets: a large # of C&C bots interconnected with each other
  [Figure: servent bots and client bots]
Advanced botnet designs
  - Public key in bot code, private key held by the botmaster
    - Ensures command authentication/integrity
  - Individualized encryption and service port
    - Defeats traffic-based detection
    - Limited exposure when one bot is captured
  - Peer list:
Advanced botnet designs
  - Easy monitoring by botmaster
    - Command all bots to report to a "sensor" host
      - Each bot reports: peer list, encryption key, service port, IP, diurnal property, IP property, link bandwidth, ...
    - Different sensor hosts in each round of report commands
      - Prevents sensors from being blocked or captured
  - Robust botnet construction by peer-list updating
    - With few re-infections, the initial servent bots are highly connected (each connecting to >60% of bots in a botnet)
    - "Peer-list Update" command: each bot goes to a "sensor" host to get its new peer list
      - Peer list randomly selected from previously reported servent bots
Botnet robustness study
  - $p$: remove the top p fraction of servent bots used in the "update" command
  - $C(p)$: connected ratio, i.e., how many of the remaining bots are connected
  - Simulation settings:
    - 20,000-size botnet, 5,000 of which are servent bots (hundreds of reinfections)
    - 1,000 servent bots used in the update command
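From a defender's point of view, the connected ratio measures how much of a network stays joined after the best-connected nodes are taken down. The sketch below treats the network as an abstract undirected graph, removes the top p fraction of nodes by degree, and reports the largest connected component as a share of the remaining nodes. That definition of the ratio, the uniform treatment of all nodes, and the function name are assumptions of this example rather than the settings of the simulation on the slide.

```python
from collections import deque

def connected_ratio(adjacency, p):
    """Largest connected component / remaining nodes after a targeted removal.

    adjacency: dict mapping each node to the set of its neighbours (undirected).
    p: fraction of nodes to remove, highest degree first.
    """
    by_degree = sorted(adjacency, key=lambda node: len(adjacency[node]), reverse=True)
    removed = set(by_degree[:int(round(p * len(by_degree)))])
    remaining = [node for node in adjacency if node not in removed]
    if not remaining:
        return 0.0
    seen, largest = set(), 0
    for start in remaining:
        if start in seen:
            continue
        # Breadth-first search over nodes that survived the removal.
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in removed and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        largest = max(largest, size)
    return largest / len(remaining)

# Star graph: node 0 is linked to nodes 1..4, so removing it isolates the rest.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
intact = connected_ratio(star, 0.0)
after_takedown = connected_ratio(star, 0.2)
```

A star graph is the fragile extreme; a graph whose nodes each hold many random peers would keep a ratio near 1.0 for much larger p.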
Future work
  - Propagation modeling
    - Diurnal model of email-based propagation
    - Parameters: $\alpha(t)$, $\beta$, removal dynamics
  - Population estimation
    - Validate the independence of monitor samples
    - Validate the Poisson arrival of C&C DNS queries
  - Threat assessment
    - AS-level botnet bandwidth (heavy-tailed?)
    - Bot access link speed: a better representation?
  - Monitoring and modeling of advanced botnets
Reference
  - NSF Cyber Trust grant CNS-0627318, "Collaborative Research: CT-ISG: Modeling and Measuring Botnets." PI: Cliff Zou, PI: Wenke Lee.
  - David Dagon, Cliff C. Zou, and Wenke Lee. "Modeling Botnet Propagation Using Time Zones," in 13th Annual Network and Distributed System Security Symposium (NDSS), San Diego, Feb. 2006 (acceptance ratio: 17/127 = 13.4%).
  - Cliff C. Zou and Ryan Cunningham. "Honeypot-Aware Advanced Botnet Construction and Maintenance," in the International Conference on Dependable Systems and Networks (DSN), Philadelphia, Jun. 2006 (acceptance ratio: 34/187 = 18.2%).
  - Ping Wang, Sherri Sparks, and Cliff C. Zou. "An Advanced Hybrid Peer-to-Peer Botnet," in submission.
  - Cliff Zou homepage: http://www.cs.ucf.edu/~czou/