
1 Non-tracking Web Analytics
Istemi Ekin Akkus (1), Ruichuan Chen (1), Michaela Hardt (2), Paul Francis (1), Johannes Gehrke (3)
(1) Max Planck Institute for Software Systems, (2) Twitter Inc., (3) Cornell University

2 Web Analytics
Statistics about users visiting a publisher website
Akkus et al., Non-tracking Web Analytics

3 Analytics by Data Aggregators
Collect analytics for many publishers from many clients
Infer extended analytics
– Age, gender, education level, other sites visited, …
Provide aggregate information to publishers & advertisers
[Figure labels: Data Aggregator; Publisher; Aggregate Extended Analytics]

4 Analytics Today
[Figure labels: Publisher; Client; Data Aggregator]

5 Tracking
Data aggregators criticized
– Collection of individual information
Criticisms led to reactions
– Do-not-Track proposal, EU cookie law
– Voluntary opt-out mechanisms by aggregators
– Client-side tools to blacklist aggregators
Fewer tracked users → less data for inference → worse extended analytics for publishers

6 Goal
Replicate the functionality of today’s systems without tracking

7 Specific Goals
Privacy
– No individual information collected by publishers & aggregators
Functionality
– Aggregate information for publishers & aggregators
– No new organizational components
– Practical and efficient

8 Outline
Motivation & Goals
Components & Assumptions
Non-tracking Analytics
Implementation & Evaluation
Conclusion

9 Components
Client locally stores information about the user
Publisher serves webpages to clients
Aggregator provides aggregation service

10 Assumptions
Potentially malicious client
– May try to distort results
Potentially malicious publisher
– May try to violate individual user privacy
Honest-but-curious data aggregator
– Follows the protocol
– Doesn’t collude with publishers

11 Outline
Motivation & Goals
Components & Assumptions
Non-tracking Analytics
– Publisher as Proxy
– Noise
– Yes-No Queries
– Auditing
Implementation & Evaluation
Conclusion

12 Today
Not anonymous; need a proxy…
…but don’t want a new component
Publisher already interacts with clients!

13 Publisher as Anonymizing Proxy
1. Publisher distributes queries to be executed
2. Publisher collects encrypted answers
3. Publisher forwards answers to the aggregator
4. Aggregator counts anonymous answers and returns results
Clients never exposed to the data aggregator
[Figure labels: 1. Queries; 2. Encrypted Answers; 3. Encrypted Answers; 4. Results]
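
The four steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the toy RSA key (built from two known Mersenne primes, with an ad hoc random prefix instead of a real padding scheme) is insecure and stands in for the 2048-bit RSA key mentioned later in the talk.

```python
import secrets
from collections import Counter

# Toy RSA key pair for the aggregator, built from two known Mersenne primes.
# Illustration only: far too small and too structured to be secure.
P, Q = 2**107 - 1, 2**127 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def client_encrypt(answer: str) -> int:
    """Client: encrypt an answer under the aggregator's public key (N, E)."""
    # Marker byte + random bytes, so equal answers give different ciphertexts.
    padded = b"\x01" + secrets.token_bytes(8) + answer.encode()
    assert len(padded) <= 29, "answer too long for this toy key"
    return pow(int.from_bytes(padded, "big"), E, N)


def publisher_forward(ciphertexts):
    """Publisher: relays ciphertexts it cannot read, without client identities."""
    return list(ciphertexts)


def aggregator_count(ciphertexts):
    """Aggregator: decrypt the anonymous answers and count them."""
    counts = Counter()
    for c in ciphertexts:
        m = pow(c, D, N)
        padded = m.to_bytes((m.bit_length() + 7) // 8, "big")
        counts[padded[9:].decode()] += 1  # skip marker byte and random bytes
    return dict(counts)


answers = ["yes", "no", "yes"]  # one answer per client
results = aggregator_count(publisher_forward(client_encrypt(a) for a in answers))
print(results)
```

The publisher only ever handles integers it cannot decrypt, and the aggregator only ever sees answers with no client attached, which is the split of knowledge the slide describes.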

14 Identifiers in Responses
Rare attributes
– Job: CEO of ACME
[Figure labels: example.com; Enc(CEO of ACME); “CEO of ACME visits my site!”; “CEO of ACME visits example.com”]

15 Noise
2. Encrypted Answers
3. Add Noise_Publisher
4. Noisy Encrypted Answers
5. Add Noise_Aggregator
6. Double-noisy Result
7. Remove Noise_Publisher
Both entities obtain noisy results
– Result with Noise_Aggregator
– Result with Noise_Publisher
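
A small numeric sketch of the double-noise flow may help. It assumes, for illustration only, that the publisher's noise takes the form of extra 'yes' answers, that the aggregator's noise is added to the count, and that both noise values are fixed rather than random; encryption is omitted and all names are hypothetical.

```python
def publisher_add_noise(answers, noise_publisher):
    # Step 3: the publisher mixes in extra 'yes' answers of its own.
    return answers + ["yes"] * noise_publisher


def aggregator_count(noisy_answers, noise_aggregator):
    # The aggregator sees the true count plus the publisher's noise ...
    aggregator_view = noisy_answers.count("yes")
    # Step 5: ... and adds its own noise before returning the result.
    return aggregator_view, aggregator_view + noise_aggregator


def publisher_remove_noise(double_noisy, noise_publisher):
    # Step 7: the publisher subtracts the noise it added itself.
    return double_noisy - noise_publisher


client_answers = ["yes", "no", "yes", "yes", "no"]  # true 'yes' count is 3
noise_publisher, noise_aggregator = 4, 2            # fixed illustrative values

noisy = publisher_add_noise(client_answers, noise_publisher)
aggregator_view, double_noisy = aggregator_count(noisy, noise_aggregator)
publisher_view = publisher_remove_noise(double_noisy, noise_publisher)

print(aggregator_view)  # true count + Noise_Publisher
print(publisher_view)   # true count + Noise_Aggregator
```

Under these assumptions neither party sees the exact count: the aggregator's view still carries the publisher's noise, and the publisher's final value still carries the aggregator's.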

16 Differentially-private Noise
Hides the existence of an individual answer
– CEO: real or noise?
Requires numerical values
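
The slide does not say which noise distribution is used. As a hedged illustration of why numerical values are needed, here is the standard Laplace mechanism for a count (sensitivity 1); the function names and the `epsilon` value are assumptions for this sketch.

```python
import random


def laplace_noise(epsilon, sensitivity=1.0, rng=random):
    """Sample Laplace(0, sensitivity / epsilon) noise."""
    scale = sensitivity / epsilon
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count, epsilon, rng=random):
    """Release a count with noise calibrated to one user's contribution."""
    return true_count + laplace_noise(epsilon, rng=rng)


rng = random.Random(0)
samples = [laplace_noise(0.5, rng=rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
mean_abs = sum(abs(s) for s in samples) / len(samples)
print(round(noisy_count(1, 0.5, rng=rng), 2))
```

Noise like this can be added to a count such as "number of CEOs", but not to a free-text answer such as "CEO", which is the gap the following slides address.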

17 Yes-No Questions
Convert queries to binary & count answers
– “What is your job?” → “Is your job ‘CEO’?”
Noise as additional answers
– Enc(‘Yes’), Enc(‘No’)
Bonus: limits a malicious client
– Either +1 or 0
Many possible values → many questions
– Job: ‘CEO’, ‘Student’, ‘Gardener’, ...
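
A minimal sketch of the conversion, with hypothetical helper names and encryption left out: an open question becomes a yes-no question, and the counter only ever adds 1 or 0 per answer, so a single client cannot shift the result by more than one.

```python
def to_yes_no(attribute, value):
    """Turn an open question into a binary one for a single value."""
    return f"Is your {attribute} '{value}'?"


def client_answer(profile, attribute, value):
    """Client: answer from locally stored information."""
    return "yes" if profile.get(attribute) == value else "no"


def count_yes(answers):
    """Aggregator: each answer contributes +1 or 0, whatever it contains."""
    return sum(1 for a in answers if a == "yes")


profiles = [{"job": "CEO"}, {"job": "Student"}, {"job": "CEO"}]
question = to_yes_no("job", "CEO")
answers = [client_answer(p, "job", "CEO") for p in profiles]
answers.append("yes" * 1000)  # a malformed answer from a malicious client
total = count_yes(answers)
print(question, total)
```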

18 Buckets
Multiple yes-no questions with one query
1. Enumerate possible answer values
– Job: {‘CEO’, ‘Student’, ‘Gardener’, ‘Teacher’, ...}
2. A fixed number of ‘Yes’ answers
– Job: 1
3. Clients choose ‘Yes’ for the matching bucket
– Enc(‘CEO = Yes’)
4. Publisher generates additional answers
– Enc(‘CEO = Yes’), Enc(‘Student = Yes’), ...
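
The bucket scheme can be sketched as below. Encryption is omitted, the noise values are fixed for readability, and the helper names are hypothetical; the sketch also assumes every user's job appears in the bucket list.

```python
from collections import Counter

BUCKETS = ["CEO", "Student", "Gardener", "Teacher"]  # step 1: enumerated values


def client_answer(job):
    """Steps 2-3: exactly one 'Yes', for the bucket matching the user."""
    assert job in BUCKETS, "sketch assumes the job is one of the buckets"
    return f"{job} = Yes"


def publisher_noise(extra_per_bucket):
    """Step 4: the publisher generates additional answers per bucket."""
    return [f"{b} = Yes" for b, n in extra_per_bucket.items() for _ in range(n)]


def aggregator_count(answers):
    """Count 'Yes' answers per bucket."""
    counts = Counter(a.split(" = ")[0] for a in answers)
    return {b: counts[b] for b in BUCKETS}


jobs = ["CEO", "Student", "Student"]
noise = {"CEO": 2, "Student": 1, "Gardener": 3, "Teacher": 0}

forwarded = [client_answer(j) for j in jobs] + publisher_noise(noise)
noisy_counts = aggregator_count(forwarded)
true_counts = {b: noisy_counts[b] - noise[b] for b in BUCKETS}
print(noisy_counts, true_counts)
```

One query now covers every listed job, and the aggregator cannot tell whether a given "CEO = Yes" came from a client or from the publisher.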

19 Impracticalities of Differential Privacy
Requires a privacy budget
– Stop answering when budget expires
– No answers from clients → low-utility results
Assumes a static database; our setting is dynamic
– User population of a publisher changes
– Certain user data may change
→ Clients keep answering queries
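
To make the budget problem concrete, here is a hedged sketch of what strict budget accounting would look like under basic sequential composition, where the epsilon values of answered queries add up. The class and the numbers are illustrative assumptions; the slide argues that this behaviour is impractical here, which is why clients keep answering.

```python
class PrivacyBudget:
    """Tracks the remaining epsilon under basic sequential composition."""

    def __init__(self, total_epsilon):
        self.remaining = total_epsilon

    def try_spend(self, epsilon):
        """Return True and deduct if the query fits in the remaining budget."""
        if epsilon > self.remaining:
            return False  # budget expired: a strict client would stop answering
        self.remaining -= epsilon
        return True


budget = PrivacyBudget(total_epsilon=1.0)
decisions = [budget.try_spend(0.5) for _ in range(3)]
print(decisions)
```

After two queries the strict client falls silent, and every later query from the publisher gets no answers at all, which is the low-utility outcome named on the slide.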

20 Malicious Publishers
Isolation attacks
– Isolate a user’s response
– Repeat the same query
– Cancel out noise
1. Specific query conditions or buckets
– Monitoring and approval by the data aggregator
2. Selectively dropping client responses

21 Isolation via Dropping Responses
[Figure labels: example.com; Enc(CEO), Enc(Student), Enc(Gardener); Enc(Driver), Enc(Mechanic), Enc(Driver)]
Mechanic: 1 + noise
Driver: 2 + noise
CEO: 1 + noise
User in the middle is a CEO!
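
One reading of this figure, sketched under stated assumptions: the publisher drops every response except the target's, pads the batch with answers it generated itself (Driver, Mechanic, Driver), repeats the query, and averages the results so the aggregator's noise cancels. The uniform integer noise is a stand-in for the real noise distribution, and all names are hypothetical.

```python
import random
from collections import Counter

BUCKETS = ["CEO", "Student", "Gardener", "Driver", "Mechanic"]


def aggregator_result(answers, rng):
    """Aggregator: per-bucket counts plus zero-mean noise (uniform stand-in)."""
    counts = Counter(answers)
    return {b: counts[b] + rng.randint(-3, 3) for b in BUCKETS}


def isolation_attack(target_response, rng, repetitions=2000):
    """Malicious publisher: forward only the target's response plus known fakes.

    `target_response` is treated as opaque here; the publisher never reads it.
    """
    fakes = ["Driver", "Mechanic", "Driver"]  # generated by the publisher itself
    known = Counter(fakes)
    totals = Counter()
    for _ in range(repetitions):              # repeating the query averages out noise
        result = aggregator_result([target_response] + fakes, rng)
        for bucket, value in result.items():
            totals[bucket] += value
    # Whatever is left after subtracting the known fakes belongs to the target.
    residual = {b: totals[b] / repetitions - known[b] for b in BUCKETS}
    return max(residual, key=residual.get)


guess = isolation_attack("CEO", random.Random(1))
print(guess)
```

The attack works because the publisher knows every answer in the batch except one, so the only unexplained count must come from the isolated user.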

22 Auditing
[Figure labels: example.com; Enc(CEO), Enc(Student), Enc(nonce), Enc(Driver), Enc(Mechanic), Enc(Driver); Enc(example.com, nonce); “nonce?”]
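
The slide shows only figure labels, so the following is a plausible reading rather than the authors' protocol: a client occasionally sends a nonce through the publisher in place of an answer, separately reports the pair (publisher, nonce) to the aggregator by some other route that the slide does not specify, and the aggregator checks that each reported nonce actually arrived in the publisher's forwarded batch. Encryption is omitted and all names are hypothetical.

```python
import secrets


def make_audit(publisher):
    """Client: create a nonce to send via the publisher, plus a report about it."""
    nonce = secrets.token_hex(8)
    return nonce, (publisher, nonce)


def missing_nonces(forwarded, reports, publisher):
    """Aggregator: which reported nonces never arrived from this publisher?"""
    seen = set(forwarded)
    return [n for (pub, n) in reports if pub == publisher and n not in seen]


nonce, report = make_audit("example.com")
client_batch = ["CEO", "Student", nonce, "Driver", "Mechanic"]

honest = missing_nonces(client_batch, [report], "example.com")
dropped = [r for r in client_batch if r in ("Driver", "Mechanic")]  # publisher drops the rest
cheating = missing_nonces(dropped, [report], "example.com")
print(honest, cheating)
```

Because the publisher cannot tell an encrypted nonce from an encrypted answer, dropping responses selectively risks dropping an audit nonce and being caught.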

23 Outline
Motivation & Goals
Components & Assumptions
Non-tracking Analytics
– Publisher as Proxy
– Noise
– Yes-No Queries
– Auditing
Implementation & Evaluation
Conclusion

24 Implementation
2000 lines of code in total
– Client: Firefox extension
– Publisher software: Piwik plugin
– Aggregator software: simple server
Deployed and tested with over 200 users
RSA public key cryptosystem

25 Evaluation – Decryption Overhead
Aggregator: 2.4 GHz CPU, 2048-bit key
Publisher: 50K users, 2 sets of queries/week
1. Information currently provided
– Demographics, other sites
– 3.6 CPU hours/week
2. Information available through our system
– # pages browsed, search engines, visit frequency to other sites
– 3 CPU hours/week

26 Evaluation – Client Overhead
Bandwidth overhead
– <100KB/week to download 11 queries
– 8KB/week for all query responses
CPU overhead for encryption
– Google Chrome: 380 enc/sec
– Firefox: 20 enc/sec
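
As a rough back-of-the-envelope check of these numbers, the sketch below assumes that each response is a single RSA ciphertext the size of the 2048-bit modulus (256 bytes), that 8KB means 8192 bytes, and that there is no other protocol overhead. None of these assumptions are stated on the slide, so the resulting figures are estimates only.

```python
KEY_BITS = 2048                       # key size from the decryption-overhead slide
CIPHERTEXT_BYTES = KEY_BITS // 8      # assumed: one response = one 256-byte ciphertext
RESPONSE_BYTES_PER_WEEK = 8 * 1024    # "8KB/week for all query responses"

responses_per_week = RESPONSE_BYTES_PER_WEEK // CIPHERTEXT_BYTES


def encryption_seconds(n_encryptions, rate_per_second):
    """CPU time needed to encrypt n answers at a given browser rate."""
    return n_encryptions / rate_per_second


chrome_seconds = encryption_seconds(responses_per_week, 380)   # Google Chrome rate
firefox_seconds = encryption_seconds(responses_per_week, 20)   # Firefox rate
print(responses_per_week, round(chrome_seconds, 3), round(firefox_seconds, 3))
```

Under these assumptions a client would encrypt about 32 responses per week, costing well under a second of CPU time in Chrome and under two seconds in Firefox.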

27 Summary
Extended analytics without tracking
– Differential privacy guarantees for users
– Aggregate information for publishers & aggregators
No new organizational component
Practical & feasible to deploy

