Understanding the Network-Level Behavior of Spammers
Best Student Paper, ACM SIGCOMM 2006
Anirudh Ramachandran and Nick Feamster
Ye Wang (sando)
2 Contents
Motivation
Data Collection
Data Analysis
–Network-level Characteristics of Spammers
–Spam from Botnets
–Spam from Transient BGP Announcements
Lessons for Better Spam Mitigation
Conclusion & Discussion
3 Motivation
Scalability, Security, Reliability, Operability
keys to next-generation Internet services
the Internet business model stands on them
–only then come performance, more services, and applications
the large amount of funding in these areas confirms this
Security is a tough issue
attackers always win!
spam, botnets, DDoS, worms, probes, hijacks, cracks, phishing
4 Motivation
Spam (and Mitigation)
spam eats bandwidth, degrades service, adds complications
spamming methods: direct, open relays, botnets, spectrum agility
mitigation
–content filters (need large corpora for training)
–IP blacklists (IP-layer behavior is not clear)
Target of this 18-month project
characterize the network-level behavior of spammers
–IP address, AS, country of spammers
–IP-layer techniques of spammers: botnets, routing
give some guidelines for better mitigation
5 Data Collection
Spam Traces
a “sinkhole” corpus domain
Aug. 5, 2005 – Jan. 6, ,000,000 spams
network-level properties collected for each spam
–IP address of the relay
–traceroute
–passive “p0f” TCP fingerprint (indicates the OS)
–whether the relay is listed in DNS blacklists
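The blacklist check in the last bullet can be illustrated with a minimal sketch. It assumes the conventional DNSBL lookup scheme, in which the IPv4 octets are reversed and prepended to the blacklist zone, and the resulting name is then resolved. The zone `dnsbl.example.org` and the function name are hypothetical, and no DNS query is actually sent here.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up when testing an IPv4 relay against a DNSBL zone.

    The zone argument is a placeholder; real blacklists publish their own zones.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Hypothetical relay address and zone, for illustration only.
query = dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
```

Resolving the returned name with any DNS client would complete the check: an answer means the relay is listed, a name-not-found response means it is not.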
6 Data Collection
Legitimate Email Traces
from a large service provider
–*Nick is always welcome
700,000 legitimate emails
Botnet Command and Control Data
a trace of hosts infected by the W32/Bobax worm
DNS queries redirected to a sinkhole running the botnet command and control
BGP Routing Measurements
BGP monitor
–just like our rumor-collector
7 Data Analysis
Network-level Characteristics of Spammers
Distribution across IP address space
–the majority of spam comes from a small fraction of the IP address space
–yet spammers are quite widely distributed
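A distribution like this one can be computed from a list of sender addresses. The sketch below is only an illustration, not the paper's tooling: it buckets senders by /8 block (the bucket size and the function name are assumptions) and returns the cumulative share of messages in address order.

```python
from collections import Counter
import ipaddress


def cumulative_share_by_slash8(ips):
    """Return [(first_octet, cumulative_fraction)] over the /8 blocks that sent mail.

    Each entry in `ips` is one received message's sender address, so an
    address that sent several messages appears several times.
    """
    counts = Counter(int(ipaddress.IPv4Address(ip)) >> 24 for ip in ips)
    total = sum(counts.values())
    running = 0
    cdf = []
    for octet in sorted(counts):  # walk the address space in order
        running += counts[octet]
        cdf.append((octet, running / total))
    return cdf
```

Steep jumps in the returned curve mark the address ranges that contribute most of the mail.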
8 Data Analysis
Network-level Characteristics of Spammers
Distribution across ASes and by country
–(spam and legitimate) 10% from 2 ASes; 36% from 20 ASes
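Figures such as "36% from 20 ASes" are top-k shares. A minimal sketch of that calculation follows, assuming each message has already been mapped to its origin AS number; the mapping step and the function name are not from the slides.

```python
from collections import Counter


def top_k_share(as_numbers, k):
    """Fraction of messages whose origin AS is among the k most frequent ASes."""
    counts = Counter(as_numbers)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in counts.most_common(k)) / total
```

Calling it with k=2 and k=20 on the same per-message AS list gives the two kinds of numbers quoted on the slide.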
9 Data Analysis
Network-level Characteristics of Spammers
The Effectiveness of Blacklists
–80% of relays are listed in the blacklists
10 Data Analysis
Spam from Botnets
Bobax vs spammer distribution
–4,693 of 117,268 Bobax bots (about 4%) sent spam; yet the CDFs over IP address space for spammers and Bobax drones are similar
11 Data Analysis
Spam from Botnets
OS of spamming hosts
–4% of hosts are not Windows, but they sent 8% of the spam
12 Data Analysis
Spam from Botnets
Spamming Bot Activity Profile
–65% are single-shot bots; 75% were active for less than two minutes
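An activity profile of this kind can be derived from the sinkhole log. The sketch below assumes a simplified record format of (sender IP, Unix timestamp) pairs, which is an illustration rather than the paper's actual trace format. It reports the fraction of senders seen exactly once and each sender's observed lifetime.

```python
from collections import defaultdict


def activity_profile(records):
    """Summarise sender activity from (ip, unix_timestamp) records.

    Returns (single_shot_fraction, lifetimes), where lifetimes maps each IP
    to the seconds between its first and last message.
    """
    times = defaultdict(list)
    for ip, ts in records:
        times[ip].append(ts)
    if not times:
        return 0.0, {}
    single_shot = sum(1 for seen in times.values() if len(seen) == 1)
    lifetimes = {ip: max(seen) - min(seen) for ip, seen in times.items()}
    return single_shot / len(times), lifetimes
```

A single-shot sender has a lifetime of zero by construction, so the two statistics should be read together.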
13 Data Analysis
Spam from Transient BGP Announcements
BGP Spectrum Agility
–hijack a /8, send spam, withdraw the route
–66.0.0.0/8 from AS21562, 82.0.0.0/8 from AS8717, (61.0.0.0/8 from AS4678)
14 Data Analysis
Spam from Transient BGP Announcements
How much spam comes from Spectrum Agility
–1% of spam comes from short-lived routes, but sometimes 10%
Prevalence of BGP Spectrum Agility
–Persistence != Volume
–AS4788, AS4678
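The announce, send, withdraw pattern can be detected by joining BGP updates with spam arrival times. The following is a simplified sketch under stated assumptions: updates arrive as time-sorted (timestamp, prefix, kind) tuples with kind "A" or "W", the duration threshold is chosen by the analyst, and no longest-prefix matching against the rest of the routing table is attempted. All names are hypothetical.

```python
import ipaddress


def short_lived_windows(updates, max_duration):
    """Find announcements withdrawn within max_duration seconds.

    updates: time-sorted list of (timestamp, prefix, kind), kind "A" or "W".
    Returns a list of (network, announce_time, withdraw_time).
    """
    announced_at = {}
    windows = []
    for ts, prefix, kind in updates:
        if kind == "A":
            announced_at.setdefault(prefix, ts)  # keep the first announcement time
        elif kind == "W" and prefix in announced_at:
            start = announced_at.pop(prefix)
            if ts - start <= max_duration:
                windows.append((ipaddress.ip_network(prefix), start, ts))
    return windows


def sent_during_short_lived_route(ip, ts, windows):
    """True if a message from ip at time ts falls inside any short-lived window."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net and start <= ts <= end for net, start, end in windows)
```

Counting the messages for which the second function returns true, and dividing by the total, gives a share of spam from short-lived routes comparable in spirit to the figures on the slide.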
15 Lessons for Better Spam Mitigation
1. Spam filtering requires host identity
2. Detection based on aggregate behavior is better than detection based on a single IP address
3. Securing the Internet routing infrastructure bolsters identity and traceability of emails
4. Network-level properties incorporated into spam filters may be effective
16 Conclusion
Methodology
joint analysis of a unique combination of datasets
strong hacking techniques
–*only Nick can handle that easily
measurement-based study
Contribution
important results on spammers’ network-level behavior
–network-level properties are less malleable
–network-level properties may be observable at an early stage
defense guidelines and lessons
17 Discussion
We could learn much from this paper
research motivation must be strong
–significance of Routing Management, IVI, CGENI?
employ diversified techniques to enrich the methodology
arbitrary conclusions should be avoided
Some questions
the problem itself is far from being solved
some data in the paper is still arguable (botnets)
spamming in turn reveals defects in the service itself and in the design of its business model (pay for spam?)
Thank You
All big things in this world are done by people who are naïve and have an idea that is obviously impossible
Frank Richards