Spam: Why? Chris Kanich, Christian Kreibich, Kirill Levchenko, Brandon Enright, Vern Paxson, Geoffrey M. Voelker, Stefan Savage


What is computer security?

Most of computer science is about providing functionality:
- User interface
- Software design
- Algorithms
- Operating systems/networking
- Compilers/PL
- Microarchitecture
- VLSI/CAD
Computer security is not about functionality. It is about how the embodiment of functionality behaves in the presence of an adversary.
Security mindset: think like a bad guy

My background
Collaborative Center for Internet Epidemiology and Defenses (CCIED)
- UCSD/ICSI group created in response to worm threat
- Very well funded, many strong partners
Goals
- Internet epidemiology: measuring/understanding attacks
- Automated defenses: stopping outbreaks/attacks
- Economic and legal issues: that other stuff

Many big successes…
50+ papers, lots of tech transfer, big systems, etc.
Network Telescope
- Passive monitor for > 1% of routable Internet address space
Potemkin & GQ honeyfarms
- Active VM honeypot servers on > 250k IP addresses
Earlybird
- On-line learning of new worm signatures in < 1 ms

But… depressing truth
We didn’t stop Internet worms, let alone malware, let alone cybercrime… nor did anyone else. At best, we moved it around a bit.
By any meaningful metric the bad guys are winning…
Mistake: looking at this solely as a technical problem

Key threat transformations of the 21st century
Efficient large-scale compromises
- Internet communications model
- Software homogeneity
- User naïveté/fatigue
Centralized control
- Makes compromised host a commodity good
- Platform economy
Profit-driven applications
- Commodity resources (IP, bandwidth, storage, CPU)
- Unique resources (PII/credentials, CD keys, address book, etc.)

DDoS for sale
Emergence of economic engine for Internet crime
- Spam, phishing, spyware, etc.
Fluid third-party markets for illicit digital goods/services
- Bots ~$0.5/host, special orders, value-added tiers
- Cards, malware, exploits, DDoS, cashout, etc.

Botnet spammer rental rates
September 2004 postings to SpecialHam.com, Spamforum.biz
Posting 1 (3.6 cents per bot week):
>20-30k always online SOCKs4, url is de-duped and updated
>every 10 minutes. 900/weekly, Samples will be sent on
>request. Monthly payments arranged at discount prices.
Posting 2 (6 cents per bot week):
>$350.00/weekly - $1,000/monthly (USD)
>Type of service: Exclusive (One slot only)
>Always Online: 5, ,000
>Updated every: 10 minutes
Posting 3 (2.5 cents per bot week):
>$220.00/weekly - $800.00/monthly (USD)
>Type of service: Shared (4 slots)
>Always Online: 9, ,000
>Updated every: 5 minutes
Bot payloads

Spamalytics

Key structural asymmetries
Defenders reactive, attackers proactive
- Defenses public, attacker develops/tests in private
- Arms race where best case for defender is to “catch up”
New defenses expensive, new attacks cheap
- Defenses are sunk costs/business model; attacker is agile and not tied to a particular technology
Low risk to attacker, high reward to attacker
- Minimal deterrence
- Functional anonymity on the Internet; very hard to fix
Defenses hard to measure, attacks easy to measure
- Few security metrics (no “evidence-based” security); attackers measure monetization, which drives attack quality

Revisiting the problem
We tend to think about this in terms of technical means for securing computer systems
Most of B IT budget on cyber security is spent on securing the end host
- AV, firewalls, IDS, encryption, etc.
- Single most expensive front to secure
- Single hardest front to secure
But are individual end hosts valuable to bad guys?
- Maybe $1.50? Even less in bulk… not a pain point
What instead? Economically informed strategies
- Identify and attack economic bottlenecks in the value chain
- This means understanding the return on investment for bad guys

Today: the spam problem
We tend to focus on the costs of spam
- > 100 billion spam emails sent every day [Ironport]
- > $1B in direct costs: anti-spam products/services [IDC]
- Estimates of indirect costs (e.g., productivity) x more
But spam exists only because it is profitable
- Someone is buying! (though no one has admitted it to me…)
Our goal
- Understand the underlying economic support for spam

History of the spam business model
Direct mail: origins in 19th century catalog business
- Idea: send unsolicited advertisements to potential customers
- Rough value proposition: Delivery cost < (Conversion rate * Marginal revenue)
Modern direct mail (> $60B in US)
- Response rate: ~2.5% (mean per DMA)
- CPM (cost per thousand) = $250 - $1000
Spam is qualitatively the same…
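As a quick illustration of the value proposition above, the sketch below turns the inequality around and asks how much marginal revenue each response must bring in for a mailing to break even. The function name is hypothetical; the inputs are the direct-mail figures quoted on the slide.

```python
def break_even_revenue_per_response(cost_per_message: float,
                                    conversion_rate: float) -> float:
    """Marginal revenue per response at which delivery cost is exactly recovered.

    Rearranges: delivery cost < conversion rate * marginal revenue.
    """
    if not 0 < conversion_rate <= 1:
        raise ValueError("conversion_rate must be in (0, 1]")
    return cost_per_message / conversion_rate


# Direct-mail figures from the slide: CPM of $250 to $1000, ~2.5% response rate.
for cpm in (250.0, 1000.0):
    cost_per_piece = cpm / 1000.0  # CPM is the cost per thousand pieces
    needed = break_even_revenue_per_response(cost_per_piece, 0.025)
    print(f"CPM ${cpm:.0f}: need more than ${needed:.2f} per response")
```

With these inputs a legitimate direct mailer needs roughly $10 to $40 of marginal revenue per response, which is the baseline that spam's near-zero delivery cost undercuts.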

… but quantitatively different
Advantages of email direct marketing
- No printing cost
- Legitimate delivery cost low (outsourced price ~ $0.001/message [Get Response])
- Dominated by production & lead generation cost (i.e., mailing list)
- But this is for spam as a legal marketing vehicle… a minority
Spam as marketing/bait for criminal enterprises (scams)
- Mailing lists → ε (purchase/steal/harvest), < $10/M retail
- Delivery cost → ε (botnet-based delivery), < $70/M retail

Aside: economic impact of anti-spam technology?
Suppose new technology filters out 99.9% of spam (at sites deploying it)
- Little impact on delivery cost; mainly lowers conversion rate
- Short term, compensate by sending more different emails or to more people
  - … and pity the shmucks with the old 95% filter
- Long term, incentive for spammer to bypass filter
Seems likely the outcome of anti-spam has been
- Increased amount of spam sent
- Change in distribution of recipient pool
- Unclear what profit impact is (deployment biases)

Brief history of the spam arms race
Anti-spam action → Spammer response
1. Real-time IP blacklisting → Send via open relays/proxies
2. Clean up open relays/proxies → Delivery via compromised botnets
3. Content-based learning → Content chaff, polymorphic spam generators, image spam
4. Site takedown → Fast-flux redirect and transparent proxies
5. CAPTCHAs → CAPTCHA outsourcing, OCR-based breaking

Anatomy of a modern pharma spam campaign
(Figure courtesy Stuart Brown, modernlifisrubbish.co.uk)

Estimating spam profits
Recall key basic inequality: (Delivery cost) < (Conversion rate) x (Marginal revenue)
We have some handle on two of these (e.g., [Franklin07])
- Delivery cost to send spam
  - Outsourced cost: retail purchase price < $70/M addrs
  - In-house cost: development/management labor
- Marginal revenue
  - Average pharma sale of $100, affiliate commissions ≈ 50%
Conversion rate is fundamentally different
- We don’t know; estimates vary by orders of magnitude
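To see why the conversion rate is the missing piece, the sketch below plugs the two known quantities from this slide into the inequality and solves for the break-even conversion rate. The function name is hypothetical, and the $70/M price is used as an upper bound, so this is the pessimistic retail case.

```python
def break_even_conversion_rate(cost_per_message: float,
                               revenue_per_sale: float) -> float:
    """Conversion rate at which delivery cost equals expected revenue."""
    return cost_per_message / revenue_per_sale


cost_per_message = 70.0 / 1_000_000  # retail upper bound: $70 per million addresses
revenue_per_sale = 100.0 * 0.50      # $100 average sale, ~50% affiliate commission

rate = break_even_conversion_rate(cost_per_message, revenue_per_sale)
print(f"break-even conversion rate: {rate:.2e}")
print(f"that is about 1 sale per {round(1 / rate):,} messages")
```

Under these assumptions a retail-delivered campaign needs about one sale per 714,000 messages just to cover delivery, so whether spam pays depends entirely on a number nobody outside the operation could observe.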

The measurement conundrum
No accident that we lack good conversion measures
It’s easy to measure spam from a receiver viewpoint
- Which MTA sent it to me?
- What does the content contain?
- Where do the links go? etc.
But the key economic issue is only known by the sender
- Conversion rate * marginal profit = revenue per message sent
What to do?
- Interview spammers? ( ) [Carmack03]
- Guess? (“millions of dollars a day”) [Corman08]
- Send lots of spam and see who clicks on links? (gold standard)

Botnet infiltration
Key idea: distributed C&C is a vulnerability
- Botnet authors like decentralized communications for scalability and resilience, but…
- … to do so, they trust their bots to be good actors
- If you can modify the right bots you can observe and influence actions of the botnet
Rest of today: preliminary results from a case study
- Infiltrated Storm P2P botnet, instrumented ~500M spams
- Delivery rates (anti-spam impacts on delivery)
- Click-through (visits to spam-advertised sites)
- Conversions (purchases and purchase amounts)
Kanich, Kreibich, Levchenko, Enright, Paxson, Voelker and Savage, Spamalytics: an Empirical Analysis of Spam Marketing Conversion, ACM CCS 2008

How this works in detail
Botnet infiltration
- Overview of the Storm peer-to-peer botnet
  - How does Storm work?
- Mechanics of botnet spamming
  - How can Storm’s C&C be instrumented?
Economic issues
- Using a botnet for measurement
  - How to measure conversion via C&C interposition
- Measuring spam delivery pipeline
  - What happens to spam from when a bot sends it…
  - … to when a user clicks “purchase” at a scam site?

Storm
Storm is a well-known peer-to-peer botnet
Storm has a hierarchical architecture
- Workers perform tasks (send spam, launch DDoS attacks, etc.)
- Proxies organize workers, connect to HTTP proxies
- Master servers controlled directly by botmaster
Workers and proxies are compromised hosts (bots)
- Use a Distributed Hash Table protocol (Overnet) for rendezvous
- Roughly 20,000 active bots at any time in April [Kanich08]
Master servers run in “bullet-proof” hosting centers
- Communicate with proxies and workers via command and control (C&C) protocol over TCP
Kanich, Levchenko, Enright, Voelker and Savage, The Heisenbot Uncertainty Problem: Challenges in Separating Bots from Chaff, LEET 2008.

Storm architecture
(Figure: botmaster “Dr. Evil” → master servers → proxy bots → worker bots)

Storm setup
New bots decide if they are proxies or workers
- Inbound connectivity? Yes, proxy. No, worker.
Proxies advertise their status via encrypted variant of Overnet DHT P2P protocol
- Master sends “Breath of Life” packet to new proxies to tell them IP address of master servers (RSA signature)
- Allows master servers to be mobile if necessary
Workers use Overnet to find proxies (tricky: time-based key identifies request)
Workers send to proxy, proxy forwards to one of the master servers in a “safe” data center
Bottom line: imperfect, but remarkably sophisticated

Storm spam campaigns
Workers request “updates” to send spam [Kreibich08]
- Dictionaries: names, domains, URLs, etc.
- Email templates for producing polymorphic spam
  - Macros instantiate fields: %^Fdomains^% from domains dict
- Lists of target addresses (delivered in batches)
Workers immediately act on these updates
- Create a unique message for each address
- Send the message to the target
- Report the results (success, failure) back to proxies
Many campaign types
- Self-propagation malware, pharmaceutical, stocks, phishing, …
Kreibich, Kanich, Levchenko, Enright, Voelker, Paxson and Savage, On the Spam Campaign Trail, LEET 2008.
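The template-plus-dictionary mechanism can be sketched in a few lines. This is a simplified illustration, not Storm's actual macro engine: it handles only the dictionary macro form `%^Fname^%` mentioned on the slide, and the template text, the `instantiate` function, and the seeded random generator are assumptions made for the example.

```python
import random
import re

# Only the dictionary macro %^Fname^% is modelled; real Storm templates use
# several other macro types that this sketch ignores.
DICT_MACRO = re.compile(r"%\^F(\w+)\^%")


def instantiate(template: str, dictionaries: dict, rng: random.Random) -> str:
    """Replace each %^Fname^% with a random entry from the named dictionary."""
    def pick(match: re.Match) -> str:
        return rng.choice(dictionaries[match.group(1)])
    return DICT_MACRO.sub(pick, template)


dictionaries = {
    "names": ["eduardo", "rafael", "chris"],
    "pharma_links": ["spammerdomain1.com", "spammerdomain2.com",
                     "spammerdomain3.com"],
}
# Hypothetical template body, loosely modelled on the campaign shown later.
template = "Hi %^Fnames^%, say hello to bluepill! http://%^Fpharma_links^%/"

rng = random.Random(0)
for _ in range(3):
    print(instantiate(template, dictionaries, rng))
```

Each call yields a slightly different message from the same template, which is what makes the resulting spam polymorphic from a content filter's point of view.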

Storm templates
(Figure: example Storm spam template and its instantiation; macro expansion inserts the target address)

Misc Storm stuff
Templates updated fairly frequently (but mainly just header polymorphism changes)
A few special campaigns
- Test campaigns
- Special mailing list campaigns (e.g., only Canadian recipients)
Storm nodes also harvest addresses
- Grovel hard disk and send back
- Re-integrated into master mailing list (some filtering)
Storm nodes also do DDoS, DNS fast-flux proxying, and Web proxying
Several different levels of message encoding, but nothing really hard to reverse yet

Storm in action
Template:
Received: from %^C0%^P%^R2-6^%:qwertyuiopasdfghjklzxcvbnm^%.%^P%^R2-6^%:qwertyuiopasdfghjklzxcvbnm^%^% ([%^C6%^I^%.%^I^%.%^I^%.%^I^%^%]) by %^A^% with Microsoft SMTPSVC(%^Fsvcver^%); %^D^%
From: To: Subject: Say hello to bluepill!
Dictionaries:
~!pharma_links~! spammerdomain1.com spammerdomain2.com spammerdomain3.com …
~!names~! eduardo rafael katiera chris johnny …
Instantiated messages:
Received: from auz.xwzww ([ ]) by dsl prod-infinitum.com.mx with Microsoft SMTPSVC( ); Wed, 6 Feb :33: From: To: Subject: Say hello to bluepill! spammerdomain1.com
Received: from auz.xwzww ([ ]) by dsl prod-infinitum.com.mx with Microsoft SMTPSVC( ); Wed, 6 Feb :33: From: To: Subject: Say hello to bluepill! spammerdomain2.com
Received: from dkjs.sgdsz ([ ]) by dsl prod-infinitum.com.mx with Microsoft SMTPSVC( ); Wed, 6 Feb :33: From: To: Subject: Say hello to bluepill! spammerdomain3.com

Interposition on Storm
We interpose on the Storm command and control network
- Reverse-engineered Storm protocols, communication scrambling, rendezvous mechanisms [Kanich08] [Kreibich08]
Run unmodified Storm proxy bots in VMs
- Key issue: real bot workers connect to our proxies
Insert rewriting proxies between workers & proxies
- Transparently interpose on messages between Storm proxies and their associated Storm workers
- Generic engine for rewriting traffic based on rules
Interpose to control site URLs and spam delivery
- Which sites the spam advertises (replace URLs in template links)
- To whom spam gets sent (replace addrs in target list)
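A rule-based rewrite of this kind might look like the sketch below. It is an illustration only: the real engine operated on Storm's wire protocol, whereas here an update is modelled as a plain Python dict mapping dictionary names to entry lists. The `rewrite_update` function, the `rules` layout, and the choice to preserve list length are all assumptions of the example.

```python
def rewrite_update(update: dict, rules: dict) -> dict:
    """Return a copy of `update` with rule-covered dictionaries replaced.

    `update` maps a dictionary name to its list of entries; `rules` maps a
    dictionary name to the replacement entries we control. Dictionaries
    without a rule pass through unchanged.
    """
    rewritten = {}
    for name, entries in update.items():
        if name in rules:
            ours = rules[name]
            # Keep the original list length (an assumption of this sketch),
            # cycling through our entries if there are fewer of them.
            rewritten[name] = [ours[i % len(ours)] for i in range(len(entries))]
        else:
            rewritten[name] = list(entries)
    return rewritten


update = {
    "pharma_links": ["spammerdomain1.com", "spammerdomain2.com",
                     "spammerdomain3.com"],
    "names": ["eduardo", "rafael"],
}
rules = {"pharma_links": ["newdomain1.com", "newdomain2.com", "newdomain3.com"]}

rewritten = rewrite_update(update, rules)
print(rewritten["pharma_links"])
```

Only the link dictionary changes, so workers still build messages from the botmaster's own templates while the advertised sites become ones the measurement team controls.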

Modifying template links
Dictionary rewrite: spammerdomain.com, spammerdomain2.com, spammerdomain3.com → newdomain1.com, newdomain2.com, newdomain3.com
Before: Received: from dkjs.sgdsz ([ ]) by dsl prod-infinitum.com.mx with Microsoft SMTPSVC( ); Wed, 6 Feb :33: From: To: Subject: Say hello to bluepill! spammerdomain3.com
After: Received: from dkjs.sgdsz ([ ]) by dsl prod-infinitum.com.mx with Microsoft SMTPSVC( ); Wed, 6 Feb :33: From: To: Subject: Say hello to bluepill! newdomain2.com

Measuring click-through
Create two sites that mirror actual sites in spam
- E-card (self-propagation) and pharmaceutical
- Replace dictionaries with URLs to our sites
E-card (self-prop) site
- Link to benign executable that POSTs to our server
- Log all POSTs to track downloads and executions
Pharma site
- Log all accesses up through clicks on “purchase”
- Track the contents of shopping carts
Strive for verisimilitude to remove bias (spam filtering)
- Site content is similar, URLs have same format as originals, …

Aside: having fun

Measuring delivery
Create various test accounts
- At Web mail providers: Hotmail, Yahoo!, Gmail
- Behind a commercial spam filtering appliance
- As SMTP sinks: accept every message delivered
Put addresses in Storm target delivery lists
Log all emails delivered to these addresses
- Both those labeled as spam (“Junk Email”) and those in the inbox

Ethical context
Consequentialism
First, do no harm (users no worse off than before)
- We do not send any spam
  - Proxies are relays, worker bots send spam
- We do not enable additional spam to be sent
  - Workers would have connected to some other proxy
- We do not enable spam to be sent to additional users
  - Users are already on target lists, only add control addresses
Second, reduce harm where possible
- Our pharma sites don’t take credit card info
- Our e-card sites don’t export malicious code

Legal context
Warning: IANAL (we had lawyers involved, though)
CAN-SPAM
- Subject to strong definition of “initiator”; we don’t fit it
ECPA
- Our proxy is directly addressed by worker bots (“party to” communication carve-out)
CFAA
- We do not contact worker bots, they contact us (“unauthorized access”?)
- We do not cause any information to be extracted or any fundamentally new activity to take place
- Hard to find a good theory of damages (functionally indistinguishable; consequentialism)

But…
In this kind of work there is little precedent
- No agency to get permission; no way to get indemnity
- Lawyers tend to say “I believe this activity has low risk of…”
We communicate our activities to a lot of people
- Security researchers in industry, academia
- Affected network operators/registrars
- Law enforcement
- FTC

Aside: Spam is hard
Lots of operational complexities to a study like this
- Net Ops notices huge Storm infestation
- Address space cleanliness
- Registrar issues
  - GoDaddy
  - TUCOWS
- Abuse complaints
- Spam site support
- Anti-virus signatures
- Law enforcement

Spam conversion experiment
Experimented with Storm March 21 – April 15, 2008
Instrumented roughly 1.5% of Storm’s total output
                Pharmacy campaign   E-card: Postcard   E-card: April Fool
Worker bots     31,348              17,639             3,678
Emails          347,590,389         83,665,479         38,651,124
Duration        19 days             7 days             3 days

Spam pipeline
Stages: Sent → MTA → Inbox → Visits → Conversions
- Pharma: 347.5M sent; 82.7M (24%) at MTA; 10,522 (0.003%) visits; 28 conversions
- Postcard: 21.1M (25%) at MTA; 3,827 (0.005%) visits; 316 conversions
- April Fool: 40.1M sent; 10.1M (25%) at MTA; 2,721 (0.005%) visits; 225 conversions
Pharma: 12M spam emails for one “purchase”
E-card: 1 in 10 visitors execute the binary
Spam filtering software
The fraction of spam delivered into user inboxes depends on the spam filtering software used
- Combination of site filtering (e.g., blacklists) and content filtering (e.g., SpamAssassin)
Difficult to generalize, but we can use our test accounts for specific services
(Table: fraction of spam sent that was delivered to inboxes)
Effects of blacklisting (CBL feed)
(Figure labels: unused, effective, other filtering)
Response rates by country
- Two orders of magnitude
- No large aberrations based on topic

The spammer’s bottom line
Recall that we tracked the contents of shopping carts
Using the prices on the actual site, we can estimate the value of the purchases
- 28 purchases for $2,731 over 25 days, or $100/day ($140 active)
We only interposed on a fraction of the workers
- Connected to approx. 1.5% of workers
- Back-of-the-envelope (be very careful) → $7-10k/day for all, or ~$3M/year
- With a 50% affiliate commission, $1.5M/year revenue
For self-propagation
- Roughly 3-9k new bots/day
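The back-of-the-envelope extrapolation can be reproduced directly from the figures on this slide. The sketch below is only that arithmetic, under the slide's own assumption that the instrumented 1.5% of workers is representative of the whole botnet; the variable and function names are hypothetical.

```python
observed_revenue = 2731.0    # dollars across 28 purchases
days = 25                    # length of the observation window
interposed_fraction = 0.015  # roughly 1.5% of workers
commission = 0.50            # affiliate share kept by the spammer


def extrapolate(daily_observed: float, fraction: float) -> float:
    """Scale an observed daily figure up to the whole botnet."""
    return daily_observed / fraction


low = extrapolate(observed_revenue / days, interposed_fraction)  # whole-window average
high = extrapolate(140.0, interposed_fraction)                   # "$140 active" figure

for label, daily in (("average", low), ("active days", high)):
    yearly = daily * 365
    print(f"{label}: ${daily:,.0f}/day, ${yearly / 1e6:.1f}M/year gross, "
          f"${yearly * commission / 1e6:.1f}M/year after commission")
```

The two scenarios bracket the slide's $7-10k/day and ~$3M/year figures, which is about as much precision as a 1.5% sample can support.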

Summary
First measurement study of spam marketing conversion
Infiltrated Storm botnet, interposed on spam campaigns
- Rewriting proxies take advantage of Storm reverse-engineering
Pharmaceutical spam
- 1 in 12M conversion rate → $1.5M/yr net revenue
- Profitability possibly tied to infrastructure integration
- Sent via retail market, this campaign would not be profitable
- Ergo: in-house delivery (Storm owners = pharma spammers)
Self-propagation spam
- 250k spam emails per infection
- Social engineering effective: one in ten visitors run executable
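The claim that the campaign would lose money at retail delivery prices can be checked with numbers from earlier slides. This is a rough sketch: it takes the $70 per million addresses figure as the retail price (the talk gives it only as an upper bound) and compares it with the commission on the observed pharmacy sales. Variable names are hypothetical.

```python
emails_sent = 347.5e6            # pharmacy campaign emails observed
retail_price_per_million = 70.0  # upper bound on retail delivery price, $/M addresses
observed_sales = 2731.0          # dollars of pharma purchases observed
commission = 0.50                # affiliate share kept by the spammer

delivery_cost = emails_sent / 1e6 * retail_price_per_million
spammer_revenue = observed_sales * commission
break_even_price = spammer_revenue / (emails_sent / 1e6)

print(f"retail delivery cost: ${delivery_cost:,.0f}")
print(f"commission revenue:   ${spammer_revenue:,.0f}")
print(f"break-even price:     ${break_even_price:.2f} per million addresses")
```

Under these assumptions delivery would cost about $24,000 against roughly $1,400 of commission, and the price would have to fall to about $4 per million addresses to break even, which supports the inference that delivery was done in-house.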

What are we doing now?
More analysis
- Extending infiltration to ~15 botnets; comparative analysis
- Characteristic fingerprints of different spammers/crews
- Characterizing supply chain relationships
  - Broadly order on-line “viagra”, rolexes, etc.
  - Cluster credit processor/merchant, mailing materials, etc.
  - Cluster on manufacturing fingerprint (e.g., NIR spectroscopy)
- Measuring monetization by purposely losing credit cards
Proactive defenses
- Automated filter generation from templates
- Automated classification of URLs
- Automated vision-based detection of phishing pages

Security courses at UCSD
- CSE107: Introduction to modern cryptography
- CSE127: Computer Security
But… security plays a role in virtually all of your courses

Questions?
Yahoo!
Collaborative Center for Internet Epidemiology and Defenses

What’s next: Value-chain characterization
- Empirical map establishing links between criminal groups and enablers
  - Affiliate programs, botnets, fast-flux networks, registrars, payment processors, SEO/traffic partners, fulfillment/manufacturing
  - Data mining across huge data feeds we’ve built or established relationships for
- Social network among criminal groups
  - Semantic Web mining

New: Fulfillment measurements
About to start purchasing a wide range of spam-advertised products
- Watches
- Pharma
- Traffic
Cluster purchases based on
- Merchant and processor
- Packaging (postmark, forensic analysis of paper)
- Artifacts of manufacturing process (e.g., FT-NIR on drugs)

New: Bot-based spam filter generation
Observations
- Modest number of bots send most spam
- Virtually all bots use templates with simple rules to describe polymorphism
- Templates + dictionaries ≈ regex describing spam to be generated
- If we can extract or infer these from the botnets, we have a perfect filter for all the spam generated by the botnet
- Very specific filters, extremely low FP risk
(Figure labels: random letters and numbers; phrases from a dictionary)
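The observation that templates plus dictionaries amount to a regular expression can be made concrete with a small sketch. It is not the authors' filter generator: it handles only a dictionary macro `%^Fname^%` and a random-string macro written here as `%^Rmin-max^%`, whose exact semantics are an assumption (taken to mean between min and max lowercase letters or digits). The template line and function name are hypothetical.

```python
import re

# Dictionary macro (%^Fname^%) and an assumed random-string macro (%^Rmin-max^%).
MACRO = re.compile(r"%\^(F\w+|R\d+-\d+)\^%")


def template_to_regex(template: str, dictionaries: dict) -> re.Pattern:
    """Compile a template into a regex matching every message it can generate."""
    parts = []
    pos = 0
    for m in MACRO.finditer(template):
        parts.append(re.escape(template[pos:m.start()]))  # literal text
        body = m.group(1)
        if body[0] == "F":
            # Dictionary macro: any one of the dictionary's entries.
            words = dictionaries[body[1:]]
            parts.append("(?:" + "|".join(re.escape(w) for w in words) + ")")
        else:
            # Random macro: assumed to emit min..max letters or digits.
            lo, hi = body[1:].split("-")
            parts.append("[a-z0-9]{%s,%s}" % (lo, hi))
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts))


dictionaries = {"pharma_links": ["spammerdomain1.com", "spammerdomain2.com"]}
template = "Subject: Say hello to bluepill! http://%^Fpharma_links^%/%^R2-6^%"
spam_filter = template_to_regex(template, dictionaries)

line = "Subject: Say hello to bluepill! http://spammerdomain2.com/ab3x"
print(bool(spam_filter.fullmatch(line)))
```

Because the pattern accepts only strings the template can produce, unrelated mail that happens to share a subject line does not match, which is the intuition behind the low false-positive risk noted above.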

Early results (last week)
- 0 FP with 50 examples
- 0 FN on Storm with 500 examples
- Still tuning for other botnets

Spare slides

Removing crawlers/honeyclients
Anyone can send to our accounts or visit our Web sites, potentially muddying the waters
- Use various heuristics to validate the logs
Validate spam in mailboxes was sent by us
- Spam from other campaigns, bounce messages, etc.
- Subject line matches our campaign, URL from our dictionary
Validate Web accesses were by users in response
- Sites with links in spam are immediately crawled by Google, A/V vendors, etc.
- Special 3rd-level DNS names, special URL encoding
- Ignore hosts that access robots.txt, don’t load JavaScript, don’t load Flash, don’t load images, or make many malformed requests
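A few of these heuristics are easy to express over a Web access log. The sketch below is a hypothetical illustration, not the study's actual filtering code: the `Hit` record, the file-extension checks, and the decision to require both an image load and a script load are assumptions, and it covers only the robots.txt, image, and JavaScript heuristics.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    host: str  # client address
    path: str  # requested path


def likely_human_hosts(hits: list) -> set:
    """Keep hosts that never fetched robots.txt and did load images and scripts."""
    paths_by_host = {}
    for hit in hits:
        paths_by_host.setdefault(hit.host, []).append(hit.path)

    keep = set()
    for host, paths in paths_by_host.items():
        fetched_robots = any(p == "/robots.txt" for p in paths)
        loaded_images = any(p.endswith((".png", ".jpg", ".gif")) for p in paths)
        loaded_script = any(p.endswith(".js") for p in paths)
        if not fetched_robots and loaded_images and loaded_script:
            keep.add(host)
    return keep


hits = [
    Hit("198.51.100.7", "/"), Hit("198.51.100.7", "/logo.png"),
    Hit("198.51.100.7", "/cart.js"),
    Hit("203.0.113.9", "/robots.txt"), Hit("203.0.113.9", "/"),
    Hit("192.0.2.44", "/"),  # fetched the page but no images or scripts
]
print(sorted(likely_human_hosts(hits)))
```

Filtering on whole hosts rather than single requests matters here, since a crawler's individual page fetch looks the same as a browser's.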

Pharma and e-card conversions

Who is targeted?
- Top 20 domains
- Many Web mail & broadband providers, but very long tail
- Campaigns have nearly identical distributions
- Same scammers, or target lists sold to multiple scammers