1
Understanding the Network-Level Behavior of Spammers
By Anirudh Ramachandran and Nick Feamster
Defense Team: Mike Delahunty, Bryan Lutz, Kimberly Peng, Kevin Kazmierski, John Thykattil
2
Agenda
- Introduction
- Background and Related Work
- Data Collection
- Network-level Characteristics of Spammers
- Spam from Botnets
- Spam from Transient BGP Announcements
- Lessons for Better Spam Mitigation
- Conclusion
3
Introduction
- Spam
  - Multiple emails sent to many recipients
  - Unsolicited commercial messages
- Study based on network-level behavior of spammers
  - IP address ranges
  - Spamming modes (route hijacking, bots, etc.)
  - Temporal persistence of spamming hosts
  - Characteristics of spamming botnets
- Much attention has been paid to studying the content of spam
4
Introduction Cont.
- The study posits that network-level properties must be investigated to find creative ways to mitigate spam
- The paper analyzes the network properties of spam observed at a large spam “sinkhole”, together with:
  - BGP route advertisements
  - Traces of command and control messages of a Bobax botnet
  - Legitimate emails
- Surprising conclusions
  - Most spam comes from a small portion of the IP address space (but so does legitimate email)
  - Most spam comes from Microsoft Windows hosts (bots)
  - A small set of spammers use short-lived route announcements to remain untraceable
5
Background: Methods and Mitigation
- Spamming Methods
  - Direct Spamming: via spam-friendly ISPs or dial-up IPs
  - Open Relays and Proxies: mail servers that allow unauthenticated hosts to relay email
  - Botnets: hijacked machines acting under the control of a centralized “botmaster”
  - BGP Spectrum Agility: spammers issue short-lived route announcements for the IP addresses from which they send spam; this hampers traceability
- Mitigation Techniques
  - Filtering: content-based filters and IP blacklists
6
Related Work: Previous Studies
- Packet traces to determine bandwidth bottlenecks from spam sources
- Project Honeypot
  - Acts as a sink for email traffic and hands out trap email addresses to determine the harvesting behavior and identity of spammers
  - Time from harvesting to receipt of the first spam message
  - Countries where harvesting infrastructure is located
  - Persistence of spam harvesters
7
Related Work Cont.
- Mitigation
  - SpamAssassin Project: reverse engineering via mail content analysis
  - DNS blacklist: 80% of IPs sending spam were in the blacklist
- Unusual Route Announcements
  - Bogus well-known addresses
  - Suggestions of short-lived route announcements
8
Data Collection
- Reserve a “sinkhole”
  - Registered domain with no legitimate email addresses
  - Establish a DNS Mail Exchange (MX) record for it
  - All emails received by the server are spam
- Run metrics on incoming emails
  - IP address of the relay; also run a traceroute
  - TCP fingerprint to get the source OS
  - Results of DNS blacklist lookups against 8 different blacklist servers
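The blacklist check on each relay IP can be sketched as follows. This is a minimal illustration of the conventional DNSBL query format (reverse the IPv4 octets and append the blacklist zone), not the tooling used in the study. The zone names are hypothetical placeholders, and no network lookup is performed here.

```python
import ipaddress

# Hypothetical blacklist zones, used only for illustration.
EXAMPLE_ZONES = ["bl.example.org", "spam.example.net"]


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name that is queried to check an IPv4 address in a DNSBL."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def dnsbl_query_names(ip: str, zones=EXAMPLE_ZONES) -> list:
    """One query name per blacklist zone for the given relay IP."""
    return [dnsbl_query_name(ip, zone) for zone in zones]


if __name__ == "__main__":
    for name in dnsbl_query_names("192.0.2.99"):
        print(name)
```

By convention, a listed address resolves to an A record (typically in 127.0.0.0/8), while an unlisted address returns NXDOMAIN, so the names above could be passed to any DNS resolver.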
9
Data Collection Cont.
[Figure: Spam received per day at sinkhole (Aug. 2004 – Dec. 2005)]
10
Data Collection Cont.
- “Hijack” the DNS server for the domain running a botnet
  - Have botnet commands go to a known machine instead
- Monitor the BGP updates from the networks where the spam is received
- Collect logs from a large email provider (40 million mailboxes)
  - Allows analysis of network characteristics for spam and non-spam
11
Data Analysis
- Study focuses on network-level characteristics
- Distribution of spam across the IP address space is similar to that of legitimate email (although not identical)
- Spam is not uniformly distributed over the IP address range
  - 12% of all received spam comes from two Autonomous Systems (ASes)
  - 37% comes from the top 20 ASes
  - Offers insight into spam prevention
- Classifying spam by country: China, Korea, and the US dominate
- Defense suggestion
  - Correlate originating country with IP range to estimate the probability of spam
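The two calculations on this slide can be sketched as follows: the share of spam that comes from the top-k ASes, and a per-(country, IP range) estimate of spam probability as proposed in the defense suggestion. This is a rough sketch under assumed inputs. The function names, the record layout, and the sample values used with it are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def top_k_share(origin_as_per_message, k):
    """Fraction of messages whose origin AS is among the k most frequent ASes."""
    counts = Counter(origin_as_per_message)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for _, n in counts.most_common(k)) / total


def spam_probability_by_group(records):
    """Estimate P(spam) per (country, prefix) from labeled messages.

    records: iterable of (country, prefix, is_spam) tuples (assumed layout).
    """
    spam = defaultdict(int)
    seen = defaultdict(int)
    for country, prefix, is_spam in records:
        key = (country, prefix)
        seen[key] += 1
        if is_spam:
            spam[key] += 1
    return {key: spam[key] / seen[key] for key in seen}


if __name__ == "__main__":
    # Made-up AS numbers from the documentation range.
    messages = [64500] * 6 + [64501] * 3 + [64502] * 1
    print(top_k_share(messages, 2))
```

A raw ratio like this is unreliable for groups with few observations; a real filter would likely need smoothing or a minimum sample size before acting on the estimate.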
12
Cumulative Distribution Function (CDF) of Spam and Legitimate Email
[Figure annotations: “Greater probability of legitimate emails”; “Big increase in probability of received spam”]
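A CDF of this kind can be approximated by bucketing sender addresses by their first octet (/8 block) and accumulating the counts. The sketch below assumes a plain list of IPv4 strings as input. The /8 granularity is an illustrative choice, not necessarily the one used for the figure.

```python
import ipaddress
from collections import Counter


def cdf_by_slash8(ips):
    """Return a 256-entry list: entry i is the fraction of addresses in /8 blocks 0..i."""
    counts = Counter(int(ipaddress.IPv4Address(ip)) >> 24 for ip in ips)
    total = sum(counts.values())
    cdf = []
    running = 0
    for first_octet in range(256):
        running += counts.get(first_octet, 0)
        cdf.append(running / total if total else 0.0)
    return cdf


if __name__ == "__main__":
    sample = ["10.0.0.1", "10.1.2.3", "60.5.5.5", "200.1.1.1"]  # made-up senders
    curve = cdf_by_slash8(sample)
    print(curve[10], curve[60], curve[255])
```

Computing one curve for spam senders and one for legitimate senders, then comparing where each rises steeply, reproduces the kind of comparison the slide describes.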
13
Spam Persistence
- 85% of unique spammers send 10 emails or fewer
- If this holds for all spammers, what is the value of filtering by a specific IP address?
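The persistence figure amounts to counting messages per sender IP and asking what fraction of senders stay at or below a threshold. A minimal sketch, assuming the log is a list with one sender IP per received message:

```python
from collections import Counter


def fraction_low_volume(sender_ips, threshold=10):
    """Fraction of unique sender IPs that sent `threshold` messages or fewer."""
    per_ip = Counter(sender_ips)
    if not per_ip:
        return 0.0
    low = sum(1 for n in per_ip.values() if n <= threshold)
    return low / len(per_ip)


if __name__ == "__main__":
    # Made-up log: four senders with 1, 10, 11, and 2 messages.
    log = (["192.0.2.1"] * 1 + ["192.0.2.2"] * 10
           + ["192.0.2.3"] * 11 + ["192.0.2.4"] * 2)
    print(fraction_low_volume(log))
```

The default threshold of 10 mirrors the slide; the function itself is a hypothetical helper, not part of the study.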
14
Effectiveness of Blacklists
- About 80% of spam comes from IP addresses listed in at least one major blacklist
15
Effectiveness of Blacklists Cont.
- Most spam bots are detected by at least one DNSRBL
- Only 50% of spammers using transient BGP announcements are detected by at least one DNSRBL
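Blacklist coverage of this kind can be computed from per-message lookup results. The sketch below assumes each message is represented by the set of blacklists that listed its sender; the blacklist names in the example are placeholders.

```python
def fraction_listed(listings_per_message, min_lists=1):
    """Fraction of messages whose sender appears in at least `min_lists` blacklists.

    listings_per_message: list of collections of blacklist names, one per message.
    """
    if not listings_per_message:
        return 0.0
    hits = sum(
        1 for lists in listings_per_message if len(set(lists)) >= min_lists
    )
    return hits / len(listings_per_message)


if __name__ == "__main__":
    # Made-up lookup results for five messages.
    results = [{"bl-a"}, set(), {"bl-a", "bl-b"}, {"bl-b"}, set()]
    print(fraction_listed(results))               # listed in at least one
    print(fraction_listed(results, min_lists=2))  # listed in at least two
```

Running the same function separately over bot-originated spam and over spam sent via transient BGP announcements would give the two coverage figures contrasted on the slide.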
16
Spam from Botnets
- Circumstantial evidence suggests that most spam originates from bots
- Spamming hosts and Bobax drones have very similar distributions across the IP address space
  - Suggests that much of the spam received may be due to botnets such as Bobax
17
More on Bots
- Most individual bots send a low volume of spam
18
Operating Systems Used by Spammers
- Used the OS fingerprinting tool “p0f” in Mail Avenger
- Able to identify the OS of 75% of hosts that sent spam
  - Of this 75% identifiable segment, 95% run Windows
  - Consistent with the percentage of hosts on the Internet that run Windows
- Only about 4% run other operating systems, but they are responsible for 8% of received spam
  - This goes against the common perception that most spam originates from Windows botnet drones
19
Spam from Transient BGP Announcements
- Some spammers briefly hijack large portions of IP address space (that do not belong to them), send spam, and withdraw the routes immediately afterward
- Not much is known about this technique, and it is not well defended against
- Very difficult to trace
  - Allows spammers to evade DNSRBLs
- Used 10% or less of the time, as a complementary spamming tactic
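One way to look for this pattern is to pair each route announcement with the next withdrawal of the same prefix and flag pairs with a short lifetime. The sketch below is an illustration only: the `RouteEvent` record, the "A"/"W" event codes, and the one-day default threshold are assumptions made for the example, not the paper's method or a real BGP feed format.

```python
from dataclasses import dataclass


@dataclass
class RouteEvent:
    timestamp: float  # seconds since some epoch
    prefix: str       # e.g. "192.0.2.0/24"
    kind: str         # "A" = announcement, "W" = withdrawal (assumed encoding)


def short_lived_announcements(events, max_lifetime=86400.0):
    """Return (prefix, announced_at, withdrawn_at) for routes withdrawn within max_lifetime."""
    open_since = {}
    flagged = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind == "A":
            # Keep the earliest announcement while the route is still up.
            open_since.setdefault(ev.prefix, ev.timestamp)
        elif ev.kind == "W" and ev.prefix in open_since:
            start = open_since.pop(ev.prefix)
            if ev.timestamp - start <= max_lifetime:
                flagged.append((ev.prefix, start, ev.timestamp))
    return flagged


if __name__ == "__main__":
    feed = [
        RouteEvent(0, "192.0.2.0/24", "A"),
        RouteEvent(600, "192.0.2.0/24", "W"),        # up for 10 minutes
        RouteEvent(100, "198.51.100.0/24", "A"),
        RouteEvent(200000, "198.51.100.0/24", "W"),  # up for more than 2 days
    ]
    print(short_lived_announcements(feed))
```

Correlating the flagged intervals with the arrival times of spam from the same prefixes would be the next step; a short-lived route by itself can also result from ordinary misconfiguration or flapping.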
20
Lessons on Spam Mitigation
Why should we use network-level information?
- Information is less malleable
  - More constant than spam email contents, which content-based filters monitor
- Information is observable in the middle of the network
  - Closer to the source of the spam than other techniques
- Will result in more effective spam filters
  - When combined with other techniques
- Has potential to stop spam that other techniques miss
21
More Lessons
- Improves knowledge of host identity
- Bases detection techniques on aggregate behavior
- Protects against route hijacking
  - “BGP spectrum agility”
  - Other techniques do not
- Uses network-level properties to detect and filter
22
Conclusion
- Studying the network-level behavior of spammers
- Designing better spam filters with network-level filters
- Network-level behavior filters vs. content-based filters
  - Should not replace content-based filters, but complement them
23
Questions?