The Invisible Trail: Third-Party Tracking on the Web Franziska Roesner Assistant Professor Computer Science & Engineering University of Washington 10/10/16 Franziska Roesner – Third-Party Tracking on the Web
With many collaborators!
New technologies bring new benefits… but also new risks.
Security & Privacy Research. Goal: Improve security & privacy of technologies. Security mindset: Challenge assumptions, think like an attacker. Study existing technologies: attack and measure. Design and build defenses and new technologies.
S&P Challenges Arise Everywhere. Today’s talk: web privacy. Who tracks you as you browse, and how?
Outline: Understanding web tracking; Measuring web tracking; Defenses.
Ads That Follow You: Advertisers (and others) track your browsing behaviors for the purposes of targeted ads, website analytics, and personalized content.
Third-Party Web Tracking. Browsing profile for user 123: cnn.com, theonion.com, adult-site.com, political-site.com. These ads allow criteo.com to link your visits between sites, even if you never click on the ads.
Concerns About Privacy
Understanding the Tracking Ecosystem. In 2011, there was much discussion about tracking, but limited understanding of how it actually works. Our goal: systematically study the web tracking ecosystem to inform policy and defenses. Challenges: No agreement on the definition of tracking. No automated way to detect trackers. (State of the art: blacklists.)
Our Approach. ANALYZE: (1) Reverse-engineer trackers’ methods. (2) Develop tracking taxonomy. MEASURE: (3) Build automated detection tool. (4) Measure prevalence in the wild. (5) Evaluate existing defenses. BUILD: (6) Develop new defenses.
Web 101: Cookies. Websites store info in cookies in the browser. Cookies are only accessible to the site that set them and are automatically included with web requests to that site. Example: the browser holds cookie id=123 for theonion.com and cookie id=456 for cnn.com, and sends each one only to the server that set it.
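The cookie rule above can be sketched in a few lines of Python. This is a toy model, not a real browser API: the `CookieJar` class and its method names are hypothetical, and real cookies also carry paths, expiry times, and security attributes that are ignored here.

```python
class CookieJar:
    """Toy browser cookie store: a separate set of cookies per site."""

    def __init__(self):
        self._cookies = {}  # site -> {cookie name: cookie value}

    def store(self, site, name, value):
        # Called when a response from `site` sets a cookie.
        self._cookies.setdefault(site, {})[name] = value

    def cookies_for(self, site):
        # Cookies attached to a request: only those set by the same site.
        return dict(self._cookies.get(site, {}))


jar = CookieJar()
jar.store("theonion.com", "id", "123")
jar.store("cnn.com", "id", "456")

onion_cookies = jar.cookies_for("theonion.com")
cnn_cookies = jar.cookies_for("cnn.com")
other_cookies = jar.cookies_for("example.org")
```

Each site only ever sees its own identifier, which is why a tracker needs to be embedded in many sites to link visits together.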
Web 101: First and Third Parties. Iframes allow one website to include another: <iframe src="https://www.washington.edu"></iframe>. The site the user visits directly is the “first party”; the site embedded in it is the “third party”.
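A rough sketch of how a request might be classified as first or third party, using Python's standard `urllib.parse`. The helper names `site_of` and `is_third_party` are hypothetical, and taking the last two host labels as "the site" is a simplifying assumption; real browsers consult the Public Suffix List instead.

```python
from urllib.parse import urlsplit


def site_of(url):
    # Simplification: treat the last two labels of the host as the site.
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def is_third_party(page_url, resource_url):
    # A resource is third party when its site differs from the page's site.
    return site_of(page_url) != site_of(resource_url)


embedded_ad = is_third_party("https://www.cnn.com/news", "https://static.criteo.com/ad.js")
same_site = is_third_party("https://www.cnn.com/", "https://edition.cnn.com/video")
```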
Anonymous Tracking. Trackers included in other sites use third-party cookies containing unique identifiers to create browsing profiles. Example: criteo.com sets cookie id=789, receives it from every page that embeds criteo.com, and builds the profile “user 789: theonion.com, cnn.com, adult-site.com, …”.
Basic Tracking Mechanisms. Tracking requires: (1) re-identifying a user; (2) communicating the id + visited site back to the tracker.
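The two requirements can be illustrated with a toy tracker server in Python. This is a sketch under stated assumptions, not how any real tracker is implemented: the `Tracker` class is hypothetical, the identifier counter starts at 789 only to echo the example id used in the slides, and the embedding site is passed in directly (in practice it might arrive via the Referer header or a URL parameter).

```python
import itertools
from collections import defaultdict


class Tracker:
    """Toy third-party tracker: one browsing profile per cookie id."""

    def __init__(self):
        self._next_id = itertools.count(789)  # arbitrary starting id
        self.profiles = defaultdict(list)     # cookie id -> sites seen

    def handle_request(self, cookie_id, embedding_site):
        # (1) Re-identify the user: reuse the cookie id, or assign a new one.
        if cookie_id is None:
            cookie_id = str(next(self._next_id))
        # (2) Record the visited site against that id.
        if embedding_site not in self.profiles[cookie_id]:
            self.profiles[cookie_id].append(embedding_site)
        return cookie_id  # sent back to the browser as the cookie value


tracker = Tracker()
uid = tracker.handle_request(None, "theonion.com")
uid = tracker.handle_request(uid, "cnn.com")
uid = tracker.handle_request(uid, "adult-site.com")
profile = tracker.profiles[uid]
```

Note that the user never clicks anything: loading the embedded content is enough to extend the profile.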
Our Tracking Taxonomy [NSDI ’12]. In the wild, tracking is much more complicated. (1) Trackers don’t just use cookies: Flash cookies, HTML5 LocalStorage, etc. (2) Trackers exhibit different behaviors: within-site vs. cross-site; anonymous vs. non-anonymous. Specific behavior types: analytics, vanilla, forced, referred, personal.
Other Trackers? “Personal” Trackers
Personal Tracking. Example: facebook.com sets cookie id=franzi.roesner and builds the profile “user franzi.roesner: theonion.com, cnn.com, adult-site.com, …”. Tracking is not anonymous (linked to accounts). Users directly visit the tracker’s site, which evades some defenses.
Outline: Understanding web tracking; Measuring web tracking; Defenses.
Measurement Study [NSDI ’12]. Questions: How prevalent is tracking (of different types)? How much of a user’s browsing history is captured? How effective are defenses? Approach: Build a tool to automatically crawl the web and to detect and categorize trackers based on our taxonomy. TrackingObserver: tracking detection platform, http://trackingobserver.cs.washington.edu
How prevalent is tracking? 524 unique trackers on the Alexa top 500 websites (homepages + 4 links). 457 domains (91%) embed at least one tracker. (97% of those include at least one cross-site tracker.) 50% of domains embed between 4 and 5 trackers. One domain includes 43 trackers. Tracking is increasing! Unique trackers on the top 500 websites (homepages only): 2011: 383; 2013: 409; 2015: 512.
Who/what are the top trackers? (“Vanilla” and others)
How are users affected? Question: How much of a real user’s browsing history can top trackers capture? Measurement challenges: Privacy concerns. Users may not browse realistically while monitored. Insight: AOL search logs (released in 2006) represent real user behaviors.
How are users affected? Idea: Use AOL search logs to create 30 hypothetical browsing histories: 300 unique queries per user, followed to the top search hits. Trackers can capture a large fraction: Doubleclick: Avg 39% (Max 66%); Facebook: Avg 23% (Max 45%); Google: Avg 21% (Max 61%).
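One way to compute such a fraction for a single history is sketched below in Python. Everything here is illustrative: the `coverage` function, the `.example` site names, and the tracker names are made up and are not the study's data, and counting unique sites is an assumption about the unit of measurement.

```python
def coverage(history, trackers_on_site, tracker):
    """Fraction of unique visited sites on which `tracker` is embedded."""
    visited = set(history)
    if not visited:
        return 0.0
    seen = {site for site in visited
            if tracker in trackers_on_site.get(site, set())}
    return len(seen) / len(visited)


# Hypothetical browsing history and hypothetical crawl results.
history = ["news.example", "blog.example", "shop.example",
           "news.example", "wiki.example"]
trackers_on_site = {
    "news.example": {"tracker-a.example", "tracker-b.example"},
    "blog.example": {"tracker-a.example"},
    "shop.example": {"tracker-b.example"},
}

fraction = coverage(history, trackers_on_site, "tracker-a.example")
print(f"{fraction:.0%}")
```

In this made-up example the tracker appears on 2 of the 4 unique sites visited, so it captures half of the history.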
How has this changed over time? [USENIX Security ’16] The web has existed for a while now… What about tracking before 2011 (our first study)? What about tracking before 2009 (first academic study)? Solution: time travel!
The Wayback Machine to the Rescue. Time travel for web tracking: http://trackingexcavator.cs.washington.edu
1996-2016: More & More Tracking. More trackers of more types, more per site, more coverage.
Outline: Understanding web tracking; Measuring web tracking; Defenses.
Defenses to Reduce Tracking. Do Not Track proposal? Do Not Track is not a technical defense: trackers must honor the request.
Private browsing mode? Private browsing mode protects against local, not network, attackers.
Third-party cookie blocking? Example: on a page from www.bar.com (Bar’s server) that embeds content from www.foo.com (Foo’s server), www.bar.com’s cookie is first-party and www.foo.com’s cookie is third-party.
Quirks of 3rd Party Cookie Blocking. In some browsers, this option means third-party cookies cannot be set, but they CAN be sent. So if a third-party cookie is somehow set, it can be used. How to get a cookie set? One way: be a first party, etc.
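The quirk can be modeled with a small Python sketch. This is an assumption-laden toy, not any particular browser's implementation: the `BlockingBrowser` class, its methods, and the `.example` site names are hypothetical.

```python
class BlockingBrowser:
    """Toy model: third-party cookies cannot be set, but can be sent."""

    def __init__(self):
        self.cookies = {}  # site -> cookie value

    def receive_set_cookie(self, cookie_site, top_level_site, value):
        # Setting is refused when the cookie's site is not the page visited.
        if cookie_site != top_level_site:
            return False
        self.cookies[cookie_site] = value
        return True

    def cookie_to_send(self, request_site):
        # Sending is NOT restricted: any stored cookie goes out with requests
        # to its site, even when that site is embedded as a third party.
        return self.cookies.get(request_site)


browser = BlockingBrowser()

# Embedded in news.example, the tracker cannot set its cookie.
set_as_third_party = browser.receive_set_cookie("tracker.example", "news.example", "id=789")
sent_before = browser.cookie_to_send("tracker.example")

# The user visits tracker.example directly, so it is a first party.
set_as_first_party = browser.receive_set_cookie("tracker.example", "tracker.example", "id=789")

# Back on news.example, the embedded request now carries the cookie.
sent_after = browser.cookie_to_send("tracker.example")
```

This is why trackers whose sites users visit directly, such as social networks, are not stopped by this setting.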
What 3rd Party Cookie Blocking Misses. Defenses for personal trackers (red bars) were inadequate.
Our Defense: ShareMeNot. Prior defenses for personal trackers were ineffective or completely removed social media buttons. Our defense: ShareMeNot (for Chrome/Firefox) protects against tracking without compromising button functionality. It blocks requests to load buttons and replaces them with local versions. On click, it shares to social media as expected. Techniques adopted by Ghostery and the EFF. http://sharemenot.cs.washington.edu
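The block-then-replace idea can be sketched as a request filter in Python. This is only an illustration of the general technique described above, not ShareMeNot's actual code: the `resolve_button` function, the `BUTTON_HOSTS` list, the `LOCAL_STUB` identifier, and the `.example` hosts are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical hosts that serve social media buttons.
BUTTON_HOSTS = {"buttons.social.example"}
# Hypothetical identifier for a button image bundled with the add-on.
LOCAL_STUB = "local://button-placeholder"


def resolve_button(url, user_clicked):
    """Decide what to load for a resource request.

    Requests to button hosts are swapped for a local stand-in, so no
    cookie reaches the tracker, until the user actually clicks.
    """
    host = urlsplit(url).hostname
    if host in BUTTON_HOSTS and not user_clicked:
        return LOCAL_STUB
    return url


button_url = "https://buttons.social.example/like.js"
on_page_load = resolve_button(button_url, user_clicked=False)
on_click = resolve_button(button_url, user_clicked=True)
unrelated = resolve_button("https://news.example/style.css", user_clicked=False)
```

The design choice is that the button stays visible and usable, while contact with the social media site is deferred until the user asks for it.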
Defenses to Reduce Tracking. Do Not Track proposal? Private browsing mode? Third-party cookie blocking? Browser add-ons? None are perfect, so use a combination:
Recommended Browser Add-ons. Privacy Badger (EFF), https://www.eff.org/privacybadger; Lightbeam (Mozilla), https://www.mozilla.org/en-US/lightbeam/; Ghostery, https://www.ghostery.com/
Summary. Web tracking is complicated and ubiquitous. We systematically developed a tracking taxonomy and performed an extensive measurement study. Understanding the tracking ecosystem helps us design new tools and defenses. Thanks to my collaborators! Yoshi Kohno, Ada Lerner, Chris Rovillos, Alisha Saxena, Anna Kornfeld Simpson, David Wetherall.
Contact: www.franziroesner.com, franzi@cs.washington.edu. Research Overview: Improving Security & Privacy. Analyze existing systems, e.g.: web tracking, automobiles, QR codes. Build new systems, e.g.: web, OS, smartphones, UI toolkits, usable encrypted email. Understand mental models, e.g.: smartphone permissions, social media, journalists, lawyers. Anticipate future technologies, e.g.: telerobotics, wearables, augmented reality, IoT.