Automated Experiments on Ad Privacy Settings


Automated Experiments on Ad Privacy Settings By Ajinkya Thorve

Introduction Advancement of tracking technologies has led to increased data collection. Collected data is used, sold, and resold for serving targeted advertisements. Serious privacy concern! To increase transparency and provide control, Google offers the Ad Settings page: https://www.google.com/settings/ads

Google Ad Settings Page Ref: http://suite4social.com/make-googles-ad-settings-work-for-you/

The Problem Little information is available about how these pages operate. Need to explore how user behavior (either directly with the Ad Settings or with content providers) alters the ads and settings shown to the user. Need to study the degree to which the settings provide transparency and choice, as well as check for the presence of discrimination.

Privacy Properties 1. Discrimination Discrimination between two classes is a difference in behavior towards those two classes: membership in a class causes a change in ads. Discrimination is not always bad (e.g. clothing ads).

Privacy Properties (contd.) 2. Transparency Display to users what the ad network may have learned about them. Cannot expect an ad network to be completely transparent. Only study the extreme case of the lack of transparency — opacity. If some browsing activity results in a significant effect on the ads served, but has no effect on the ad settings — lack of transparency.

Privacy Properties (contd.) 3. Choice Effectful choice: Altering the settings has some effect on the ads seen by the user. This shows that altering the settings is not merely a “placebo button”; it has a real effect on the network’s ads. Ad choice: Removing an inferred interest results in a decrease in the number of ads related to the removed interest. Not always possible to see effectful choice. Cars – no ads in repository. Also, does not capture whether the effect on ads is meaningful.

Methodology Null hypothesis: inputs do not affect the outputs. Inputs: user behavior, Ad Settings. Output: ads seen by the user. The goal: to establish that changes in a certain type of input to a system cause an effect on a certain type of output of the system. Examples of input changes and the outputs they are checked against: visiting websites -> Ad Settings page; changes in Ad Settings -> ads seen.
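A null hypothesis of this kind ("inputs do not affect the outputs") is commonly checked with a permutation test: if group membership does not matter, shuffling the group labels should produce a statistic as large as the observed one fairly often. The slide does not spell out the test statistic, so the sketch below assumes a simple one (absolute difference in mean per-agent ad counts). The function names and the data are hypothetical and serve only as illustration.

```python
import random


def permutation_test(group_a, group_b, statistic, n_permutations=2000, seed=0):
    """Estimate a p-value for the null hypothesis that group labels do not matter."""
    rng = random.Random(seed)
    observed = statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis, any relabeling of the agents is equally likely.
        rng.shuffle(pooled)
        if statistic(pooled[:n_a], pooled[n_a:]) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


def mean_difference(a, b):
    """Assumed test statistic: absolute difference of the group means."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


# Hypothetical per-agent counts of ads in some category (made-up numbers).
treated = [9, 8, 10, 9, 7, 10, 8, 9]
control = [1, 2, 0, 1, 3, 2, 1, 0]
p_value = permutation_test(treated, control, mean_difference)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value means that the observed difference would be rare if the treatment had no effect, which is grounds for rejecting the null hypothesis.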

AdFisher An automated tool to run experiments using the above methodology for a set of treatments, measurements, and classifiers. Extensible: the experimenter can implement additional functionality or even study a different online platform.

AdFisher (contd.) To simulate a new person, AdFisher creates an agent from a fresh browser instance with no browsing history, cookies, or other personalization. To simulate interests, AdFisher downloads the top 100 URLs for different categories from Alexa and creates lists of webpages. AdFisher randomly assigns each agent to a group and applies the appropriate treatment. Next, AdFisher takes measurements from the agent: it parses the page to find the ads shown by Google and stores them. Each measurement uses 10 page reloads, with 5 s between successive reloads.
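The assignment and measurement steps described above can be sketched as follows. This is not AdFisher's code: `assign_groups`, `collect_ads`, and the stand-in page loader are hypothetical names, and the balanced split between groups is an assumption (the slide only says the assignment is random). A real run would drive a browser; here the page loader and ad parser are passed in as plain functions so the logic can run without one.

```python
import random
import time


def assign_groups(agent_ids, group_names=("experimental", "control"), seed=None):
    """Randomly assign agents to groups, keeping group sizes balanced (assumption)."""
    rng = random.Random(seed)
    shuffled = list(agent_ids)
    rng.shuffle(shuffled)
    return {
        agent: group_names[i % len(group_names)]
        for i, agent in enumerate(shuffled)
    }


def collect_ads(load_page, extract_ads, reloads=10, delay_seconds=5.0, sleep=time.sleep):
    """Reload a page `reloads` times, pausing between successive reloads, and gather ads."""
    ads = []
    for i in range(reloads):
        if i > 0:
            sleep(delay_seconds)  # pause only between reloads, not before the first
        ads.extend(extract_ads(load_page()))
    return ads


# Usage with stand-ins: no real browser, and the pause is replaced by a no-op.
groups = assign_groups(range(10), seed=1)
fake_page = "<div class='ad'>Example ad</div>"
ads = collect_ads(lambda: fake_page, lambda page: [page], sleep=lambda seconds: None)
print(groups)
print(len(ads), "ads collected")
```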

AdFisher (contd.) Measurements are taken on news sites, since they generally show many ads. Among the top 20 news websites on alexa.com, only five displayed text ads served by Google. Most of the experiments use the Times of India, as it serves the most (five) text ads per page reload. Some experiments are repeated on the Guardian (three ads per reload) to demonstrate that the results are not specific to one site.

AdFisher (contd.) AdFisher splits the entire data set into training and testing subsets, and examines a training subset of the collected measurements to select a classifier that distinguishes between the measurements taken from each group. AdFisher has functions for converting the text ads seen by an agent into three different feature sets: the URL feature set, the URL+Title feature set, and the word feature set.
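One way to picture the three feature sets is as three different ways of counting the same list of ads. The slide only names the sets, so the exact definitions below are assumptions: URL counts, counts of (URL, title) pairs, and counts of words taken from the title, URL, and body. The ad dictionaries and their keys are hypothetical.

```python
import re
from collections import Counter


def url_features(ads):
    """Count how often each ad URL was shown to the agent."""
    return Counter(ad["url"] for ad in ads)


def url_title_features(ads):
    """Count ads identified by both URL and title."""
    return Counter((ad["url"], ad["title"]) for ad in ads)


def word_features(ads):
    """Count lowercase alphanumeric words across title, URL, and body."""
    words = []
    for ad in ads:
        text = " ".join((ad["title"], ad["url"], ad["body"]))
        words.extend(re.findall(r"[a-z0-9]+", text.lower()))
    return Counter(words)


# Made-up ads seen by a single agent.
sample_ads = [
    {"url": "jobs.example", "title": "Executive Jobs", "body": "Apply now"},
    {"url": "jobs.example", "title": "Executive Jobs", "body": "Apply now"},
    {"url": "jobs.example", "title": "Career Coaching", "body": "Find jobs"},
]
print(url_features(sample_ads))
print(url_title_features(sample_ads))
print(word_features(sample_ads).most_common(3))
```

The URL+Title variant separates ads that share a destination but use different titles, which the URL-only counts would merge.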

AdFisher (contd.) Explored a variety of classification algorithms provided by the scikit-learn library. Logistic regression with an L2 penalty over the URL+title feature set consistently performed well compared to the others.
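A minimal sketch of such a classifier with scikit-learn, assuming each agent's ads have already been turned into a dictionary of URL+title counts. The helper name `train_and_score`, the "url|title" key format, and the toy data are all hypothetical; AdFisher's actual training code and parameters are not shown in the slides.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def train_and_score(feature_dicts, labels, test_size=0.25, seed=0):
    """Fit an L2-regularized logistic regression and report held-out accuracy."""
    train_x, test_x, train_y, test_y = train_test_split(
        feature_dicts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    vectorizer = DictVectorizer()
    # scikit-learn's LogisticRegression applies an L2 penalty by default;
    # C is the inverse regularization strength.
    model = LogisticRegression(C=1.0, max_iter=1000)
    model.fit(vectorizer.fit_transform(train_x), train_y)
    accuracy = model.score(vectorizer.transform(test_x), test_y)
    return model, vectorizer, accuracy


# Toy data: one count dictionary per agent, keyed by "url|title" (made-up values).
features, labels = [], []
for i in range(20):
    features.append({"coach.example|Executive Coaching": 4 + i % 3, "news.example|Headlines": 2})
    labels.append("group_a")
    features.append({"auto.example|Car Dealer": 4 + i % 3, "news.example|Headlines": 2})
    labels.append("group_b")

model, vectorizer, accuracy = train_and_score(features, labels)
print(f"test accuracy: {accuracy:.2f}")
```

Scoring on a held-out test subset matters here: accuracy well above chance on agents the classifier never saw is what indicates that the two groups really received different ads.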

Experiments

Experiments 1. Discrimination Set up AdFisher to have the agents in one group visit the Google Ad Settings page and set the gender bit to female, while agents in the other group set theirs to male. All the agents then visited the top 100 websites listed under the Employment category of Alexa. The agents then collected ads from the Times of India. The learned classifier attained a test accuracy of 93%, suggesting that Google did in fact treat the genders differently.

Experiments (contd.) 2. Transparency The experimental group visited substance abuse websites while the control group idled. None of the 500 agents in the experimental group had interests related to substance abuse on their Ad Settings pages. Collected the ads shown to the agents.

Experiments (contd.) 3. Effectful Choice Tested whether making changes to Ad Settings has an effect on the ads seen, thereby giving the users a degree of choice over the ads. Simulated an interest in online dating by visiting the website www.midsummerseve.com. Agents in the experimental group removed the interest “Dating & Personals”. All the agents then collected ads from the Times of India. Found statistically significant differences between the groups. Thus, the ad settings appear to actually give users the ability to avoid ads they might dislike or find embarrassing.

Conclusions Conducted 21 experiments using 17,370 agents that collected over 600,000 ads. Found instances of discrimination, opacity, and choice in targeted ads. Cannot assign blame; cannot determine whether Google, the advertiser, or complex interactions among them caused the issues; lack the access needed to make this determination.

My Understanding and Issues Only a few thousand browser agents, cannot generalize results. “...we do not claim these findings to generalize or imply widespread issues, we find them concerning and warranting further investigation by those with visibility into the ad ecosystem.” “We do not claim that we will always find a difference if one exists, nor that the differences we find are typical of those experienced by users.”

My Understanding and Issues (contd.) Limitations of the experiment: Only text ads, only two websites. “It comes with stock functionality for collecting and analyzing text ads. Experimenters can add methods for image, video, and flash ads.” “The experimenter can add parsers to collect ads from other websites.”

My Understanding and Issues (contd.) Same IP address. “We do not claim “completeness” or “power”: we might fail to detect some use of information.” “For example, Google might not serve different ads upon detecting that all the browser agents in our experiment are running from the same IP address. Despite this limitation in our experiments, we found interesting instances of usage.”

References Datta, A., Tschantz, M. & Datta, A. (2015). Automated Experiments on Ad Privacy Settings. Proceedings on Privacy Enhancing Technologies, 2015(1), pp. 92-112.

Thank You!