TARGETED, NOT TRACKED: CLIENT-SIDE SOLUTIONS FOR PRIVACY-FRIENDLY BEHAVIORAL ADVERTISING
Janice Tsai, Misha Bilenko, Matt Richardson



Anonymous User Sees This Ad

Known User Sees A Different Ad

Personalized Advertising Today
- User is tracked: history of activity is stored
  - On ad platform’s server and/or cookie
- History is processed into profile
  - Reduced representation for quick lookup
  - Can also be communicated or sold across parties
- Profile is used for ad targeting
  - Total targeting revenue expected $2.6B by 2014 (eMarketer 2011)
  - Supported by all major ad platforms

Talk Outline
- Client-side vs. server-side profiles
- Client-only Profiles (CoP): balancing privacy and personalization
- Experiments: client- vs. server-side revenue difference

Personalized Advertising is Ubiquitous
- Driven by economics
  - Publishers, platforms: CPM rates 2.7x higher [Beales ‘10]
  - Advertisers: 6x gain in CTR [Yao et al. ‘08]
- What about users?
  - “It’s a little creepy, especially if you don’t know what’s going on” [NYT ‘11]
  - What’s going on is complex and misunderstood [McDonald ’10-11]
- Ad industry: self-regulation, users can opt out via
- Browsers: Do Not Track (FF, IE, Safari), KeepMyOptOuts (Chrome)
- Privacy advocates: self-regulation is insufficient
- W3C Tracking Protection Working Group
- Legislation: multiple bills/hearings in US; European e-Privacy directive

Personalized Advertising Mechanics
- User information drives market efficiency
- Users have no knowledge/control of their information
- First- vs. third-party distinction is increasingly non-trivial
[Diagram labels: User, Publisher, Ad Platform, Advertiser, Aggregator]

Server-side User Profiles in Advertising
[Diagram labels: (query or url), (ad)]

Problem: No User Control over Data
- Users do not know what is stored, where, and why
  - Use, retention, sharing
- Users cannot edit or delete their behavioral data
  - Deleting cookies is insufficient: re-identification, LSOs, local storage
  - Opting out ≠ having your data purged
- Most users find tracking invasive when asked [McDonald-Cranor ’10]
  - But they don’t do much about it: Do Not Track adoption in Firefox is 4-6%

Current “Do Not Track” Proposals
- Provide a mechanism for users to prevent being tracked
- Existing browser implementations
  - HTTP headers, opt-out cookies: browser contacts server but notifies it that the user does not want to be tracked; user must trust service providers
  - Domain blocking / TPL lists: browser doesn’t send requests to certain domains
- Tracking vs. targeting: collection vs. usage
- “All or nothing” approach: privacy = no targeting
  - Undesirable extremes: inefficiency vs. loss of revenue
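The header-based mechanism can be sketched from the server's side. This is a minimal illustration, assuming the browser sends the request header `DNT: 1`; the function names and the in-memory log are hypothetical, and nothing here is taken from the talk itself.

```python
def tracking_allowed(headers):
    """Return False when the request carries the Do Not Track header 'DNT: 1'."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"


def handle_request(headers, log, url):
    """Hypothetical handler: record the visit only if tracking is allowed."""
    if tracking_allowed(headers):
        log.append(url)
```

The check runs entirely on the server, so the user cannot verify that it happens. That is the trust assumption the slide points out.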

Client-Side Tracking
- Tracking is performed solely on client machine
- User retains control, targeting is still possible
  - User can delete or edit profile
- Services don’t retain user history
  - No back-end sharing of user data between companies
  - Avoids issues around retention policies, deleting all copies, etc.
- Studies indicate users care more about being tracked than about being targeted

Existing Plugin-Based Approaches
- Privad, Adnostic, RePRIV
- User installs client plugin which collects user data and communicates with ad network
- Difficulties
  - Requires user to install plugin
  - Requires significant changes to existing ad serving infrastructure
  - Hard to manage click fraud, ad budgets
  - Bandwidth (e.g., 10x ads sent to client)
  - Targeting algorithms baked into plugin may slow innovation
  - Targeting on client = less information than targeting on server

Alternative: Client-Only Profiles (CoP)
- Profile stored in cookie on client machine
- Browser sends profile to server upon page request
- Server returns page and updated profile in cookie
- Server does not log user activity
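The round trip described on this slide can be sketched as a stateless request handler. This is an illustration under stated assumptions, not the system from the talk: the cookie encoding (base64 JSON), the term-count profile, and the "most frequent keyword" targeting rule are all hypothetical stand-ins.

```python
import base64
import json
from collections import Counter


def decode_profile(cookie_value):
    """Parse the profile cookie; an absent cookie means an empty profile."""
    if not cookie_value:
        return Counter()
    raw = base64.urlsafe_b64decode(cookie_value.encode("ascii"))
    return Counter(json.loads(raw))


def encode_profile(profile):
    """Serialize the profile into a string that can be set as a cookie value."""
    raw = json.dumps(dict(profile), sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def handle_page_request(cookie_value, query, max_keywords=3):
    """Serve one request: target on the received profile, update it, return it.

    Nothing is written to server-side storage; the only copy of the
    profile leaves with the response cookie.
    """
    profile = decode_profile(cookie_value)
    # Toy targeting rule (assumption): pick the most frequent profile keyword.
    ad_keyword = profile.most_common(1)[0][0] if profile else None
    # Toy update rule (assumption): count the terms of the current query.
    profile.update(query.lower().split())
    trimmed = Counter(dict(profile.most_common(max_keywords)))
    return ad_keyword, encode_profile(trimmed)
```

A first request with no cookie gets an untargeted response and a fresh profile cookie; each later request is targeted using whatever profile the browser chooses to send back.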

Client-only Profiles

+ No plugins (Adnostic, RePRIV, Privad: users install plugins)
+ No major changes to serving infrastructure
+ Targeting server-side (advanced features/algorithms)
+ Profile update server-side (advanced features/algorithms)
- Must trust ad platform to comply with policy and not retain
    Debatable proposition for security community…
    …but Do Not Track already makes the same assumption
What will it cost compared to server-side tracking?

Comparison of Tracking Approaches

Incremental Profile Updates: Task
- How much does incremental update hurt?
  - Compare to profiles constructed on server from full history
  - May depend on the task (personalizing ads, content, search results)
- Representative task: predicting future ad clicks
  - Discriminates long-term user interests
  - Can be used for ad selection, ranking, CTR prediction, auction
- Bid Increments
  - Advertiser specifies an increment to their bid if the user has the keyword in their profile
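The bid-increment rule can be written as a small function. This is a sketch; the `Bid` record and its field names are hypothetical, and real auctions apply further adjustments that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    keyword: str              # keyword the advertiser is bidding on
    base_bid: float           # bid used when the profile does not match
    profile_increment: float  # extra amount when the keyword is in the profile


def effective_bid(bid, profile_keywords):
    """Return the bid after applying the profile-based increment, if any."""
    if bid.keyword in profile_keywords:
        return bid.base_bid + bid.profile_increment
    return bid.base_bid
```

For example, a bid of 1.00 with an increment of 0.25 on "boots" becomes 1.25 for a user whose profile contains "boots" and stays at 1.00 otherwise.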

Incremental Profile Updates: Method [Bilenko and Richardson KDD-2011]
- Algorithm based on machine learning
  - Features based on behavior frequency/recency, context, etc.
  - ML function predicts p(click|keyword) using these features
- Select top-k keywords for profile
  - Keyword value is incremental utility of ads not covered in profile so far
  - Leads to a submodular optimization problem
  - Solved by efficient, accurate approximate algorithm
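The "incremental utility of ads not covered so far" idea can be illustrated with the standard greedy heuristic for coverage-style submodular objectives. This is not the KDD-2011 algorithm, whose details are not given on the slide; the inputs `keyword_ads` (ads each keyword would match) and `ad_utility` (predicted value per ad) are hypothetical.

```python
def greedy_profile(keyword_ads, ad_utility, k):
    """Pick up to k keywords, each time taking the largest marginal gain.

    keyword_ads: dict mapping keyword -> set of ad ids it covers (assumed input)
    ad_utility:  dict mapping ad id -> predicted utility (assumed input)
    """
    covered = set()
    profile = []
    candidates = set(keyword_ads)
    for _ in range(k):
        best_kw, best_gain = None, 0.0
        # Sorted iteration makes tie-breaking deterministic.
        for kw in sorted(candidates):
            # Marginal gain: utility of ads this keyword adds beyond the profile.
            gain = sum(ad_utility[ad] for ad in keyword_ads[kw] - covered)
            if gain > best_gain:
                best_kw, best_gain = kw, gain
        if best_kw is None:
            break  # no remaining keyword adds any utility
        profile.append(best_kw)
        covered |= keyword_ads[best_kw]
        candidates.remove(best_kw)
    return profile
```

For monotone submodular objectives under a size limit, this kind of greedy selection is known to come within a factor of 1 - 1/e of the optimum, which is one reason coverage formulations are convenient.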

Incremental Updates: Study
- Two months of activity on Bing search engine
  - 2.4 million users (randomly sampled from total population)
  - Train predictor using first 6 weeks
- Cookie contains
  - Profile: top-k keywords by predicted value
  - Cache: LRU policy
- Metric
  - Fraction of future clicks in profile (proportional to revenue gain)
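The two cookie components and the evaluation metric can be sketched as follows. The class and function names are hypothetical, the cache holds bare keywords rather than whatever per-keyword statistics the study stored, and the metric is the plain fraction described on the slide.

```python
from collections import OrderedDict


class LRUKeywordCache:
    """Fixed-size keyword cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def touch(self, keyword):
        """Record a use of keyword, evicting the oldest entry if over capacity."""
        if keyword in self._items:
            self._items.move_to_end(keyword)
        self._items[keyword] = True
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def keywords(self):
        """Return cached keywords, least recently used first."""
        return list(self._items)


def fraction_of_clicks_in_profile(profile, future_click_keywords):
    """Share of future clicks whose keyword is present in the profile."""
    if not future_click_keywords:
        return 0.0
    hits = sum(1 for kw in future_click_keywords if kw in profile)
    return hits / len(future_click_keywords)
```

Under this metric, a profile of {"a", "c"} scored against future clicks on a, b, c, a covers three of four clicks.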

Incremental Updates: Results
- Retains 97-99% of gain vs. server-side tracking
- Requires only keywords in profile (50 in cache)

Conclusions
- Client-side tracking balances privacy and market efficiency
- Possible approach: CoP, which
  - Ensures user control over tracking
  - Requires insignificant change to existing infrastructure
  - Retains 97+% of revenue gains by ad targeting
- Should Do Not Track distinguish client- and server-side tracking?
  - 1st vs. 3rd party are increasingly difficult to differentiate

THANKS!

Backup Slide
If I have to trust the server anyway, why not trust it to store my profile as well?
- Trusting not to store is a lower bar than trusting to properly handle profile
- Storing profile on server = trusting any team with access to your profile to:
  - Know the policies
  - Correctly implement things like opt-out, retention, publication
  - Either never copy your history, or ensure your edits/deletions are propagated through to all copies
  - Not share it with any other team that might not know these things
- Storing profile on client = trusting just the team that receives the profile to use it and throw it away

1st vs. 3rd party
- Distinction is getting increasingly muddled
- 1st party data collection is becoming pervasive
- 3rd party collection can be tightly controlled by advertiser

Regulatory Interest in Behavioral Advertising
- United States
  - Federal Trade Commission has proposed a regulatory framework calling for Do Not Track solutions
  - Legislation calls for Do Not Track solutions: US Senate, US House of Representatives, California Legislature
- Europe
  - Notice and Consent prior to depositing cookies

Do Not Track Solutions

Do Not Track Solutions
[Table: DNT solution (Blocking Traffic, Opt-Out Cookie, HTTP Header) by browser (Apple Safari, Google Chrome, Microsoft IE, Mozilla Firefox); cell values not preserved]
Do Not Track solutions are built into each browser, with the exception of Google Chrome, where the opt-out cookies are part of a browser extension.