1
Studying Spamming Botnets Using Botlab
John P. John, Alexander Moshchuk, Steven D. Gribble, Arvind Krishnamurthy
[John 2009] John, John P., Alexander Moshchuk, Steven D. Gribble, and Arvind Krishnamurthy. "Studying Spamming Botnets Using Botlab." In NSDI, vol. 9, pp
Presented by Sharan Dhanala
2
Background on the Botnet Threat
A botnet is a large-scale, coordinated network of computers, each of which executes specific bot software. Botnet operators recruit new nodes by taking control of victim hosts and secretly installing bot code on them. The resulting army of “zombie” computers is typically controlled by one or more command-and-control (C&C) servers. Botnets have become more sophisticated and complex in how they recruit new victims and mask their presence from detection systems:
1. Propagation
2. Customized C&C protocols
3. Rapid evolution
Propagation -> Botnets rely on social engineering to find and compromise victims.
Customized C&C protocols -> Older botnets used IRC to communicate with the C&C server, but newer ones use encrypted, customized protocols (HTTP requests) to send commands to the bots.
Rapid evolution -> Most malware binaries are packed with polymorphic packers, which means that the binaries look different although the underlying code base is the same. Botnet operators also use fast-flux DNS rather than a single web server to host their scams.
3
Botlab architecture
Image source: [
This is the Botlab architecture. We will go through each entity in detail in the following slides.
4
The Botlab Monitoring platform
Botlab’s design was motivated by four requirements:
Attribution -> Botlab must identify the spam botnets, and their hosts, that are responsible for campaigns.
Adaptation -> Botlab must track changes in a botnet’s behaviour over time.
Immediacy -> Botlab must provide information about botnets as soon as possible, because that information might degrade quickly.
Safety -> Botlab must not cause harm.
Incoming spam: On average, UW receives 2.5 million messages each day, over 90% of which are classified as spam. The monitored incoming feed covers about 200,000 UW addresses.
Malware collection: Running captive bot nodes requires up-to-date bot binaries. Botlab crawls URLs found in its incoming spam feed, which contains about 100,000 unique URLs per day; about 1% of them lead to malicious executables or drive-by downloads. Botlab also periodically crawls binaries or URLs contained in public malware repositories or collected by MWCollect Alliance honeypots.
5
The Botlab Monitoring platform Identifying spamming bots
Botlab executes spamming bots within sandboxes to monitor botnet behavior.
Botlab prunes the binaries it obtains to identify those that correspond to botnets and to discard any duplicates of binaries it already monitors. Simple hashing is not sufficient, because the binaries are polymorphically packed, which circumvents signature-based security tools. For a more reliable behavioural signature, Botlab produces a network fingerprint of each binary.
Network fingerprinting: each flow record is a tuple <protocol, IP address, DNS address, port>.
Similarity coefficient of two binaries B1 and B2, where N1 and N2 are their network fingerprints (sets of flow records):
$S(B_1, B_2) = \frac{|N_1 \cap N_2|}{|N_1 \cup N_2|}$
If the similarity coefficient of two binaries is high, the binaries are considered behavioural duplicates.
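The similarity coefficient is the Jaccard index of the two sets of flow records. A minimal Python sketch follows; the `Flow` type, the function name, and the example addresses are illustrative assumptions, not Botlab's actual code.

```python
from typing import NamedTuple, Set


class Flow(NamedTuple):
    """One flow record: <protocol, IP address, DNS address, port>."""
    protocol: str
    ip: str
    dns: str
    port: int


def similarity(n1: Set[Flow], n2: Set[Flow]) -> float:
    """Return |N1 ∩ N2| / |N1 ∪ N2| for two network fingerprints."""
    union = n1 | n2
    if not union:
        # Two empty fingerprints share no observable behaviour (assumption).
        return 0.0
    return len(n1 & n2) / len(union)


# Hypothetical fingerprints of two binaries that contact the same C&C host.
cc = Flow("tcp", "192.0.2.10", "cc.example.net", 80)
smtp_a = Flow("tcp", "192.0.2.25", "mx.example.org", 25)
smtp_b = Flow("tcp", "192.0.2.26", "mx.example.com", 25)

fingerprint_1 = {cc, smtp_a}
fingerprint_2 = {cc, smtp_b}
score = similarity(fingerprint_1, fingerprint_2)  # one shared flow out of three
```

How high the coefficient must be before two binaries count as duplicates is not stated on this slide, so any cut-off used with this sketch would be a tuning assumption.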
6
The Botlab Monitoring platform Identifying spamming bots
Safely generating fingerprints
The tension between safety and effectiveness is evident when constructing signatures of newly gathered binaries. The safe approach is to drop network packets instead of transmitting them, but this is ineffective because most binaries first communicate with the C&C server before fully activating. The effective approach is to give the binaries unconditional access to the Internet, but this is not safe because they might start sending spam. To walk this tightrope, Botlab relies on a human operator with tools that act as a safety net: traffic destined to privileged ports is automatically dropped, and limits are enforced on connection rates, data transmission, and the total window of time a binary is allowed to execute. Traffic is redirected to a spamhole.
Experience classifying bots
VM detection: some bots detect when they are being run in a VM and disable themselves. Botlab executes the gathered binaries both in a VM and on bare metal and compares the results to check whether a binary performs any VM detection.
Domain name checks: some bots check the domain name, which required modifying the spamhole.
SMTP verification: some bots perform comprehensive SMTP verification. MegaD is an example: the MessageID is checked and verified with the C&C server before instructions are sent out.
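The safety-net rules described above can be sketched as a simple per-binary policy check. This is only an illustration: the class name, the field names, and the limits are hypothetical, the connection limit is simplified to a total count rather than a rate, and the real tools are supervised by a human operator.

```python
from dataclasses import dataclass

PRIVILEGED_PORT_LIMIT = 1024  # ports below this number are privileged


@dataclass
class SafetyNet:
    """Hypothetical per-binary limits; the values are chosen by the operator."""
    max_connections: int
    max_bytes: int
    max_runtime_s: float
    connections: int = 0
    bytes_sent: int = 0

    def allow(self, dst_port: int, payload_len: int, elapsed_s: float) -> bool:
        """Decide whether one outgoing connection may leave the sandbox."""
        if dst_port < PRIVILEGED_PORT_LIMIT:
            return False  # traffic to privileged ports is dropped
        if elapsed_s > self.max_runtime_s:
            return False  # the execution window has expired
        if self.connections + 1 > self.max_connections:
            return False  # connection budget exhausted
        if self.bytes_sent + payload_len > self.max_bytes:
            return False  # data transmission budget exhausted
        # Only allowed connections consume the budgets.
        self.connections += 1
        self.bytes_sent += payload_len
        return True


net = SafetyNet(max_connections=2, max_bytes=100, max_runtime_s=60.0)
decisions = [
    net.allow(25, 10, 1.0),    # privileged port
    net.allow(8080, 60, 1.0),  # within all limits
    net.allow(8080, 60, 2.0),  # would exceed the byte budget
    net.allow(8080, 40, 3.0),  # within all limits
    net.allow(8080, 0, 4.0),   # would exceed the connection budget
]
```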
7
The Botlab Monitoring platform Execution Engine
Seven spamming bots: Grum, Kraken, MegaD, Pushdo, Rustock, Srizbi, and Storm.
Avoiding blacklisting: the anonymizing Tor (The Onion Router) network
Multiple C&C servers
C&C redundancy mechanism
Image source: /images/zombie_network.jpg
If botnet owners learn of Botlab’s existence, there is a chance that they will blacklist IP addresses belonging to UW. To manage this, Botlab routes all bot traffic through the anonymizing Tor network. Tor is just a temporary solution; the long-term idea is to install monitoring agents at secret locations, hosted by organisations that want to combat the botnet threat.
8
The Botlab Monitoring platform Correlation analyzer
Correlate incoming spam with outgoing spam and perform attribution; identify IPs for a given botnet. For spam that cannot be directly attributed, cluster based on source IPs and merge with an attributed set if there is overlap.
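The merge step for spam that cannot be attributed directly can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and data layout are hypothetical, and leaving a cluster unattributed when it overlaps more than one botnet is a choice made for this sketch, not something the slide specifies.

```python
from typing import Dict, List, Set, Tuple


def merge_clusters(
    attributed: Dict[str, Set[str]],
    clusters: List[Set[str]],
) -> Tuple[Dict[str, Set[str]], List[Set[str]]]:
    """Merge source-IP clusters into attributed botnet IP sets on overlap.

    attributed: botnet name -> IPs already attributed to that botnet.
    clusters:   source-IP clusters of spam that could not be attributed.
    Returns the enlarged attributed sets and the clusters left unmatched.
    """
    result = {name: set(ips) for name, ips in attributed.items()}
    unmatched = []
    for cluster in clusters:
        # Overlap is tested against the sets as enlarged so far.
        matches = [name for name, ips in result.items() if ips & cluster]
        if len(matches) == 1:
            result[matches[0]] |= cluster
        else:
            # No overlap, or an ambiguous overlap with several botnets.
            unmatched.append(cluster)
    return result, unmatched


known = {
    "Srizbi": {"192.0.2.1", "192.0.2.2"},
    "Rustock": {"198.51.100.1"},
}
unattributed = [
    {"192.0.2.2", "192.0.2.3"},     # overlaps Srizbi only
    {"203.0.113.9"},                # overlaps nothing
    {"192.0.2.1", "198.51.100.1"},  # overlaps both botnets
]
merged, leftover = merge_clusters(known, unattributed)
```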
9
Analysis
Examine the actions of the bots being run in Botlab (outgoing spam).
Analyse the incoming spam feed.
Analysis obtained by studying both the outgoing and incoming spam feeds.
10
Analysis Behavioural characteristics
Image source: [
The authors monitored the bots over the past 6 months and deduced these characteristics. The amount of outgoing spam differs vastly between botnets. The large variability in send rates suggests that these rates might be useful for fingerprinting and distinguishing botnets. Most spam botnets have their C&C servers’ IP addresses statically configured, so Botlab can efficiently pinpoint the IP addresses of all these servers. If these servers can be found efficiently and shut down, the volume of the world’s spam would drop considerably. Most spam botnets do not change their C&C servers for a very long time, which means they stick with one C&C server rather than hopping around. One would expect them to change C&C servers from time to time to avoid detection or to replace a compromised server.
11
Analysis Analysis of outgoing spam
Outgoing spam feeds: size of mailing lists
The outgoing spam feeds are used to estimate the size of the botnets’ recipient lists.
A bot periodically obtains a new chunk of recipients from the master and sends spam to this recipient list.
On each such request, the chunk of recipients is selected uniformly at random from the spam list.
The chunk of recipients received by a bot is much smaller than the spam list size.
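To make the reasoning concrete: if recipients are drawn uniformly at random from a list of size N, then after m draws the expected number of distinct addresses seen is about N(1 - (1 - 1/N)^m), so repeats become more frequent as N gets smaller. The sketch below inverts that relation by bisection. It is an illustrative estimator under the uniform-sampling assumption above, not necessarily the calculation the authors performed, and the function names are hypothetical.

```python
import math


def expected_distinct(list_size: float, draws: int) -> float:
    """Expected distinct addresses after uniform draws with replacement."""
    # Equivalent to N * (1 - (1 - 1/N) ** m), written for numerical stability.
    return list_size * -math.expm1(draws * math.log1p(-1.0 / list_size))


def estimate_list_size(draws: int, distinct: float, upper: float = 1e12) -> float:
    """Estimate the list size N from the repeats seen in captured chunks.

    draws:    total recipient addresses observed across all chunks.
    distinct: number of distinct addresses among them (must show repeats).
    upper:    assumed upper bound on the list size for the search.
    """
    if not 1 <= distinct < draws:
        raise ValueError("need at least one repeat: 1 <= distinct < draws")
    lo, hi = float(distinct), upper
    # expected_distinct grows with the list size, so bisection applies.
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if expected_distinct(mid, draws) < distinct:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Round trip on synthetic numbers: a list of 1000 addresses, 500 draws.
synthetic_distinct = expected_distinct(1000.0, 500)
recovered = estimate_list_size(500, synthetic_distinct)
```

The estimate is only meaningful when the captured chunks contain at least some repeated addresses; with no repeats the data gives only a lower bound on the list size.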
12
Analysis Analysis of outgoing spam
Outgoing spam feeds: overlap in mailing lists
They also examined whether botnets systematically share parts of their spam lists.
Image source: [
Different botnets cover different partitions of the global list, which gives spammers who use multiple botnets the benefit of a wider reach.
13
Analysis Analysis of outgoing spam
Outgoing spam feeds Spam subjects Between any two spam botnets, there is no overlap in subjects sent within a given day, and an average overlap of 0.3% during the length of their study. Subject-based classification. Botnets carefully design and hand-tune custom spam subjects to defeat spam filters and attract attention.
14
Analysis Analysis of Incoming Spam
Analysed 46 million spam messages obtained from a 50-day trace.
The University of Washington’s filtering systems classified 89.2% of incoming mail as spam.
0.5% of spam messages contain viruses as attachments.
95% of spam messages contain HTTP links.
1% contain links to executables.
15
Analysis Analysis of Incoming Spam
Spam campaigns and Web hosting
They cluster spam based on the following attributes:
The domain names appearing in the URLs found in spam.
The content of Web pages linked to by the URLs.
The resolved IP addresses of the machines hosting this content.
Image source: [
95% of the spam in their feed contains links. The content of the web pages linked to by the URLs is the most useful attribute for characterising campaigns.
Domain name clustering -> The graph shows that the number of distinct hostnames is large and increases steadily, because spammers typically use newly registered domains. This clustering is too fine-grained to reveal the true extent of botnet infections.
Content clustering -> 80% of spam pointed to just 11 distinct web pages, and the content of these pages does not change. The conclusion is that although the content of the messages sent out by spammers is obfuscated, the web pages being advertised are static. This clustering can identify distinct campaigns but cannot attribute them to specific botnets.
IP-based clustering -> They resolved all the spam URLs and grouped the IP addresses… They then grouped the spam messages based on the IP clusters and found that 80% of spam corresponds to 15 IP clusters (57 IPs). This clustering is too coarse-grained to determine individual botnets.
16
Analysis Correlation analysis
Spam classification
Image source: [
Image source: [
To classify each spam message received by UW, a subject-based signature is used. Each signature is dynamic: it changes when botnets change their outgoing spam. Six botnets are responsible for 79% of UW’s incoming spam; Srizbi alone accounts for 35%.
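A subject-based signature that follows the captive bots' output can be sketched as a per-botnet set of subjects that grows as outgoing spam is observed. The class and method names are hypothetical, and exact string matching is a simplifying assumption for this sketch.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class SubjectSignatures:
    """Per-botnet subject sets, updated from captive bots' outgoing spam."""

    def __init__(self) -> None:
        self._subjects: Dict[str, Set[str]] = defaultdict(set)

    def observe_outgoing(self, botnet: str, subject: str) -> None:
        """Record a subject seen in spam sent by a captive bot."""
        self._subjects[botnet].add(subject)

    def classify(self, subject: str) -> Optional[str]:
        """Return the botnet whose signature contains the subject.

        Returns None when no botnet, or more than one botnet, matches.
        """
        matches = [name for name, seen in self._subjects.items() if subject in seen]
        return matches[0] if len(matches) == 1 else None


signatures = SubjectSignatures()
signatures.observe_outgoing("Srizbi", "Example subject A")
signatures.observe_outgoing("Rustock", "Example subject B")

label_known = signatures.classify("Example subject A")
label_unknown = signatures.classify("Example subject C")

# The signature is dynamic: a newly observed subject becomes classifiable.
signatures.observe_outgoing("Rustock", "Example subject C")
label_after_update = signatures.classify("Example subject C")
```

The very low subject overlap between botnets reported earlier in the talk is what makes a lookup this simple plausible; ambiguous subjects are left unclassified here.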
17
Analysis Correlation analysis
Spam campaigns
Image source: [
Image source: [
They classified the incoming spam according to spam campaigns; “Other” groups the less common campaigns. Every pair of botnets shares some hosting infrastructure (based on IP clustering), which suggests that scam hosting is a third-party service used by multiple botnets.
18
Analysis Correlation analysis
Recruiting campaigns
Image source: [
They were able to identify incoming spam messages containing links to executables that infect victims with Storm, Pushdo, and Srizbi. The peaks correspond to campaigns launched by the botnets to recruit new victims. They expected these spikes to translate into an increase in the number of messages sent by these three botnets, but that was not the case. A botnet operator will try to limit the overall spam volume sent by the whole botnet rather than assigning all available bots to send spam at the maximum rate.
19
Applications enabled by Botlab
Safer web browsing
They found 40K malicious URLs propagated by Srizbi.
None of them were in malware databases (Google, etc.).
Further, Gmail’s spam filtering rate was only 21% for Srizbi.
Botlab can generate a malware list in real time; they developed a Firefox plugin that checks links against it.
Spam filtering
They developed a Thunderbird extension that compares an incoming message with the list of spam subjects and the list of URLs being propagated by captive bots.
Preliminary results are promising.
Availability of Botlab data: Botlab protects users from messages that contain dangerous links and social engineering traps by using its real-time database of malicious links seen in outgoing botnet-generated spam. The Firefox plugin checks the links a user visits against the database before navigating.
20
Critique of the paper
"Relying on anti-virus software is also impractical, as these tools do not detect many new malware variants." was mentioned in the paper, yet the authors used anti-virus tools to validate their duplicate-binary elimination procedure.
Botnets are continuously evolving, so it is going to be quite hard to conduct safe experiments.
More monitoring could be done at the application layer. In addition to monitoring attachments and message headers, the text content of messages could also be monitored.
Some entities in Botlab need human operators, so human intervention is not completely eliminated. It would be exciting to see a fully automated tool.
The paper can be considered a basis for building a more powerful tool for spam filtering.
21
Conclusion
Described Botlab, a real-time botnet monitoring system.
Behaviour and classification of botnets.
My critique of the paper.
22
Thank you