Computers and Society Carnegie Mellon University Spring 2005 Lorrie Cranor and Dave Farber 1 Privacy Week 9 - March 15, 17

Privacy policies
- Policies let consumers know about a site’s privacy practices
- Consumers can decide whether practices are acceptable and when to opt out
- Presence increases consumer trust
- Make companies subject to FTC privacy-related enforcement
- Rapid adoption *
* G.R. Milne and M.J. Culnan, Using the Content of Online Privacy Notices to Inform Public Policy: A Longitudinal Analysis of the US Web Surveys. The Information Society 18, 5,

Privacy policy problems
BUT policies often
- are difficult to understand
- are hard to find
- take a long time to read
- change without notice

Privacy policy components
- Identification of site, scope, contact info
- Types of information collected
  - Including information about cookies
- How information is used
- Conditions under which information might be shared
- Information about opt-in/opt-out
- Information about access
- Information about data retention policies
- Information about seal programs
- Security assurances
- Children’s privacy
There is lots of information to convey -- but the policy should be brief and easy to read too!
What is opt-in? What is opt-out?

How are online privacy concerns different from offline privacy concerns?

Web privacy concerns
- Data is often collected silently
  - Web allows large quantities of data to be collected inexpensively and unobtrusively
- Data from multiple sources may be merged
  - Non-identifiable information can become identifiable when merged
- Data collected for business purposes may be used in civil and criminal proceedings
- Users given no meaningful choice
  - Few sites offer alternatives

Browser Chatter
Browsers chatter about
- IP address, domain name, organization
- Referring page
- Platform: O/S, browser
- What information is requested
  - URLs and search terms
- Cookies
To anyone who might be listening
- End servers
- System administrators
- Internet Service Providers
- Other third parties
  - Advertising networks
- Anyone who might subpoena log files later

Typical HTTP request with cookie
  GET /retail/searchresults.asp?qu=beer HTTP/1.0
  Referer:
  User-Agent: Mozilla/4.75 [en] (X11; U; NetBSD 1.5_ALPHA i386)
  Host:
  Accept: image/gif, image/jpeg, image/pjpeg, */*
  Accept-Language: en
  Cookie: buycountry=us; dcLocName=Basket; dcCatID=6773; dcLocID=6773; dcAd=buybasket; loc=; parentLocName=Basket; parentLoc=6773; ShopperManager%2F=ShopperManager%2F=66FUQULL0QBT8MMTVSC5MMNKBJFWDVH7; Store=107; Category=0
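
To make the request above concrete, here is a minimal sketch of what a server (or anyone reading its logs) can pull out of it. The helper name `parse_cookie_header` is made up for illustration, and the cookie string is a shortened copy of the one on the slide.

```python
from urllib.parse import urlsplit, parse_qs

def parse_cookie_header(value):
    """Split a Cookie header value into a name -> value dict."""
    cookies = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, val = item.partition("=")  # split on the first "=" only
        cookies[name] = val
    return cookies

request_line = "GET /retail/searchresults.asp?qu=beer HTTP/1.0"
cookie_header = "buycountry=us; dcLocName=Basket; dcCatID=6773; loc=; Store=107; Category=0"

method, target, version = request_line.split(" ")
query = parse_qs(urlsplit(target).query)   # search terms travel in the URL
cookies = parse_cookie_header(cookie_header)

print(query["qu"])
print(cookies["Store"], cookies["dcCatID"])
```

The point is only that the search term and the shopper's stored state arrive together in a single request, so they can be logged together.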

Referer log problems
- Forms submitted with the GET method put the form values in the URL
- These URLs are sent in the Referer header to the next host
- Example: rder?name=Tom+Jones&address=here +there&credit+card= & PIN=1234&->index.html
- Access log example
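
A small sketch of the leak described above, under stated assumptions: the host names `shop.example` and `other.example` and the form fields are hypothetical, and the "browser" is just a dict of headers.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# A form submitted with GET ends up entirely in the URL.
form = {"name": "Tom Jones", "address": "here there", "PIN": "1234"}
order_url = "http://shop.example/order?" + urlencode(form)

# The user then follows a link to another host; the browser reports
# the page it came from in the Referer header.
next_request_headers = {"Host": "other.example", "Referer": order_url}

# The second host (and its access log) can read the form values back.
leaked = parse_qs(urlsplit(next_request_headers["Referer"]).query)
print(leaked)
```

Submitting such forms with POST keeps the values out of the URL, which is one common way to avoid this particular leak.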

Cookies
- What are cookies?
- Why are people concerned about cookies?
- What useful purposes do cookies serve?

Cookies 101
- Cookies can be useful
  - Used like a staple to attach multiple parts of a form together
  - Used to identify you when you return to a web site so you don’t have to remember a password
  - Used to help web sites understand how people use them
- Cookies can do unexpected things
  - Used to profile users and track their activities, especially across web sites

How cookies work – the basics
- A cookie stores a small string of characters
- A web site asks your browser to “set” a cookie
- Whenever you return to that site your browser sends the cookie back automatically
[Diagram] First visit to site: site tells browser “Please store cookie xyzzy”. Later visits: browser tells site “Here is cookie xyzzy”.

How cookies work – advanced
- Cookies are only sent back to the “site” that set them – but this may be any host in domain
- Sites setting cookies indicate path, domain, and expiration for cookies
- Cookies can store user info or a database key that is used to look up user info – either way the cookie enables info to be linked to the current browsing session
[Diagram] Two example cookies: “Send me with any request to x.com until 2008” and “Send me with requests for index.html on y.x.com for this session only”. Cookie contents such as User=Joe and Visits=13 may be stored in the cookie itself or looked up in a database of users and visits.
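
The scoping rules above can be sketched roughly as follows. This is a simplified illustration, not the full matching rules real browsers apply; the `Cookie` class and `should_send` function are made-up names, and the expiry date of 1 January 2008 is an assumption standing in for "until 2008".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Cookie:
    name: str
    value: str
    domain: str
    path: str
    expires: Optional[datetime]  # None means "this session only"

def should_send(cookie, host, path, now, session_active=True):
    """Simplified check: does the browser replay this cookie on this request?"""
    domain_ok = host == cookie.domain or host.endswith("." + cookie.domain)
    path_ok = path.startswith(cookie.path)
    if cookie.expires is None:
        fresh = session_active          # session cookie
    else:
        fresh = now < cookie.expires    # persistent cookie
    return domain_ok and path_ok and fresh

# "Send me with any request to x.com until 2008"
broad = Cookie("User", "Joe", "x.com", "/", datetime(2008, 1, 1))
# "Send me with requests for index.html on y.x.com for this session only"
narrow = Cookie("Visits", "13", "y.x.com", "/index.html", None)

now = datetime(2005, 3, 15)
print(should_send(broad, "y.x.com", "/any/page", now))     # broad scope
print(should_send(narrow, "x.com", "/index.html", now))    # wrong host
```

Note that the broad cookie is replayed to every host under x.com, which is what "any host in domain" means in practice.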

Cookie terminology
- Cookie Replay – sending a cookie back to a site
- Session cookie – cookie replayed only during current browsing session
- Persistent cookie – cookie replayed until expiration date
- First-party cookie – cookie associated with the site the user requested
- Third-party cookie – cookie associated with an image, ad, frame, or other content from a site with a different domain name that is embedded in the site the user requested
  - Browser interprets third-party cookie based on domain name, even if both domains are owned by the same company

Web bugs
- Invisible “images” (1-by-1 pixels, transparent) embedded in web pages that cause referer info and cookies to be transferred
- Also called web beacons, clear gifs, tracker gifs, etc.
- Work just like banner ads from ad networks, but you can’t see them unless you look at the code behind a web page
- Also embedded in HTML formatted messages, MS Word documents, etc.
- For software to detect web bugs see:
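
As a rough idea of how a detector might look at "the code behind a web page", here is a heuristic sketch that flags 1-by-1 images. It is not the software the slide refers to; the class name `WebBugFinder` and the example hosts are hypothetical, and a real detector would also need to consider styling and whether the image host is a third party.

```python
from html.parser import HTMLParser

class WebBugFinder(HTMLParser):
    """Collect the src of every <img> declared as 1 pixel by 1 pixel."""

    def __init__(self):
        super().__init__()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        if a.get("width") == "1" and a.get("height") == "1":
            self.suspects.append(a.get("src"))

page = """
<html><body>
<img src="http://shop.example/logo.gif" width="120" height="40">
<img src="http://ads.example/track.gif?id=42" width="1" height="1">
</body></html>
"""

finder = WebBugFinder()
finder.feed(page)
print(finder.suspects)
```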

How data can be linked
- Every time the same cookie is replayed to a site, the site may add information to the record associated with that cookie
  - Number of times you visit a link, time, date
  - What page you visit
  - What page you visited last
  - Information you type into a web form
- If multiple cookies are replayed together, they are usually logged together, effectively linking their data
  - Narrow scoped cookie might get logged with broad scoped cookie

Ad networks
- Ad company can get your name and address from CD order and link them to your search
[Diagram] Search Service: user searches for medical information; the ad on the page sets a cookie. CD Store: user buys a CD; the ad on the page replays the cookie.
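
The linking step can be sketched in a few lines. This is an illustration only: the `AdNetwork` class, its `serve_ad` method, and the event strings are invented, and a real ad server would learn the details from the referer and request data rather than being handed them.

```python
from collections import defaultdict

class AdNetwork:
    """Toy ad server that keys everything it sees on one cookie ID."""

    def __init__(self):
        self.profiles = defaultdict(list)
        self._next_id = 0

    def serve_ad(self, cookie_id, site, details):
        if cookie_id is None:                 # first contact: set a cookie
            self._next_id += 1
            cookie_id = "id%d" % self._next_id
        self.profiles[cookie_id].append((site, details))
        return cookie_id                      # browser stores and replays this

network = AdNetwork()
# Ad shown on the search service sets the cookie.
cookie = network.serve_ad(None, "search service", "query: medical information")
# Ad shown on the CD store receives the same cookie back.
cookie = network.serve_ad(cookie, "CD store", "order: name and address")

print(network.profiles[cookie])
```

Neither site shares data with the other; the shared third-party cookie is what joins the two records.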

What ad networks may know…
- Personal data:
  - Email address
  - Full name
  - Mailing address (street, city, state, and Zip code)
  - Phone number
- Transactional data:
  - Details of plane trips
  - Search phrases used at search engines
  - Health conditions
“It was not necessary for me to click on the banner ads for information to be sent to DoubleClick servers.” – Richard M. Smith

Online and offline merging
- In November 1999, DoubleClick purchased Abacus Direct, a company possessing detailed consumer profiles on more than 90% of US households.
- In mid-February 2000 DoubleClick announced plans to merge “anonymous” online data with personal information obtained from offline databases
- By the first week in March 2000 the plans were put on hold
  - Stock dropped from $125 (12/99) to $80 (03/00)

Offline data goes online…
The Cranor family’s 25 most frequent grocery purchases (sorted by nutritional value)!

Subpoenas
- Data on online activities is increasingly of interest in civil and criminal cases
- The only way to avoid subpoenas is to not have data
- In the US, your files on your computer in your home have much greater legal protection than your files stored on a server on the network

Original Idea behind P3P
A framework for automated privacy discussions
- Web sites disclose their privacy practices in standard machine-readable formats
- Web browsers automatically retrieve P3P privacy policies and compare them to users’ privacy preferences
- Sites and browsers can then negotiate about privacy terms
P3P: Introduction

P3P history
- Idea discussed at November 1995 FTC meeting
- Ad Hoc “Internet Privacy Working Group” convened to discuss the idea in Fall 1996
- W3C began working on P3P in Summer 1997
  - Several working groups chartered with dozens of participants from industry, non-profits, academia, government
  - Numerous public working drafts issued, and feedback resulted in many changes
  - Early ideas about negotiation and agreement ultimately removed
  - Automatic data transfer added and then removed
  - Patent issue stalled progress, but ultimately became non-issue
- P3P issued as official W3C Recommendation on April 16, 2002

P3P1.0 – A first step
- Offers an easy way for web sites to communicate about their privacy policies in a standard machine-readable format
  - Can be deployed using existing web servers
- This will enable the development of tools that:
  - Provide snapshots of sites’ policies
  - Compare policies with user preferences
  - Alert and advise the user

The basics
- P3P provides a standard XML format that web sites use to encode their privacy policies
- Sites also provide XML “policy reference files” to indicate which policy applies to which part of the site
- Sites can optionally provide a “compact policy” by configuring their servers to issue a special P3P header when cookies are set
- No special server software required
- User software to read P3P policies called a “P3P user agent”
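
A sketch of how a user agent might read a compact policy from the P3P header. The header value below is an invented example, the function names are hypothetical, and the choice of "unwanted" tokens (TEL and UNR, which I read as the compact forms for telemarketing use and unrelated recipients) is an assumption about one user's preferences, not a rule from any browser.

```python
import re

def compact_policy_tokens(p3p_header):
    """Return the tokens inside CP="..." or an empty list if absent."""
    match = re.search(r'CP="([^"]*)"', p3p_header)
    return match.group(1).split() if match else []

def acceptable(p3p_header, unwanted):
    """Assumed preference rule: reject if no compact policy or any unwanted token."""
    tokens = compact_policy_tokens(p3p_header)
    if not tokens:
        return False
    return not (set(tokens) & set(unwanted))

# Value of a P3P response header sent alongside Set-Cookie (example only).
header = 'policyref="/w3c/p3p.xml", CP="NOI DSP COR CUR ADM DEV OUR"'
unwanted = {"TEL", "UNR"}

print(compact_policy_tokens(header))
print(acceptable(header, unwanted))
```

Because the compact policy rides along in an ordinary response header, the server side needs only configuration, which matches the "no special server software" point above.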

What’s in a P3P policy?
- Name and contact information for site
- The kind of access provided
- Mechanisms for resolving privacy disputes
- The kinds of data collected
- How collected data is used, and whether individuals can opt-in or opt-out of any of these uses
- Whether/when data may be shared and whether there is opt-in or opt-out
- Data retention policy
P3P: Enabling your web site – overview and options

P3P/XML encoding
[Annotated XML listing of a P3P policy; surviving fragments: <POLICY discuri=" name="policy">, <DATA ref="#business.contact-info.online.uri">, “Web Privacy With P3P”, “We keep standard web server logs.”]
Callouts on the listing:
- P3P version
- Location of human-readable privacy policy
- P3P policy name
- Site’s name and contact info
- Access disclosure
- Statement
- Human-readable explanation
- How data may be used
- Data recipients
- Data retention policy
- Types of data collected
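
Since the listing did not survive, here is a hedged stand-in: a small policy written from my understanding of the P3P 1.0 vocabulary, plus a sketch of how a user agent could summarize it. The policy text, the `example.com` URLs, the site name, and the helper names are all assumptions, not the policy shown on the slide.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2002/01/P3Pv1}"

# Hypothetical policy for illustration only.
POLICY_XML = """
<POLICIES xmlns="http://www.w3.org/2002/01/P3Pv1">
 <POLICY discuri="http://www.example.com/privacy.html" name="policy">
  <ENTITY><DATA-GROUP>
   <DATA ref="#business.name">Example Site</DATA>
   <DATA ref="#business.contact-info.online.uri">http://www.example.com/</DATA>
  </DATA-GROUP></ENTITY>
  <ACCESS><nonident/></ACCESS>
  <STATEMENT>
   <CONSEQUENCE>We keep standard web server logs.</CONSEQUENCE>
   <PURPOSE><admin/><current/><develop/></PURPOSE>
   <RECIPIENT><ours/></RECIPIENT>
   <RETENTION><indefinitely/></RETENTION>
   <DATA-GROUP>
    <DATA ref="#dynamic.clickstream"/>
    <DATA ref="#dynamic.http"/>
   </DATA-GROUP>
  </STATEMENT>
 </POLICY>
</POLICIES>
"""

def child_names(parent, name):
    """Names of the elements nested under parent/<name>, without namespace."""
    el = parent.find(NS + name)
    return [] if el is None else [c.tag.replace(NS, "") for c in el]

def summarize(policy_xml):
    root = ET.fromstring(policy_xml.strip())
    summary = []
    for statement in root.iter(NS + "STATEMENT"):
        summary.append({
            "purposes": child_names(statement, "PURPOSE"),
            "recipients": child_names(statement, "RECIPIENT"),
            "retention": child_names(statement, "RETENTION"),
            "data": [d.get("ref") for d in statement.iter(NS + "DATA")],
        })
    return summary

summary = summarize(POLICY_XML)
print(summary)
```

Each key in the summary corresponds to one of the callouts: how data may be used, data recipients, data retention policy, and types of data collected.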

P3P1.0 Spec Defines
- A standard vocabulary for describing set of uses, recipients, data categories, and other privacy disclosures
- A standard schema for data a Web site may wish to collect (base data schema)
- An XML format for expressing a privacy policy in a machine readable way
- A means of associating privacy policies with Web pages or sites
- A protocol for transporting P3P policies over HTTP

A simple HTTP transaction
[Diagram] Browser and Web Server
1. Request web page: GET /index.html HTTP/1.1, Host:
2. Send web page: HTTP/ OK, Content-Type: text/html...

… with P3P 1.0 added
[Diagram] Browser and Web Server
1. Request Policy Reference File: GET /w3c/p3p.xml HTTP/1.1, Host:
2. Send Policy Reference File
3. Request P3P Policy
4. Send P3P Policy
5. Request web page: GET /index.html HTTP/1.1, Host:
6. Send web page: HTTP/ OK, Content-Type: text/html...
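
The extra round trips can be sketched with an in-memory stand-in for the web server. Everything here is illustrative: the `SITE` dict replaces real HTTP, the policy path `/w3c/policy.xml` is an assumption, and only the INCLUDE patterns of the policy reference file are handled, using Python's `fnmatch` as an approximation of the spec's wildcard matching.

```python
import fnmatch
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2002/01/P3Pv1}"

# In-memory stand-in for the web server (hypothetical content).
SITE = {
    "/w3c/p3p.xml": """
<META xmlns="http://www.w3.org/2002/01/P3Pv1">
 <POLICY-REFERENCES>
  <POLICY-REF about="/w3c/policy.xml#policy">
   <INCLUDE>/*</INCLUDE>
  </POLICY-REF>
 </POLICY-REFERENCES>
</META>
""",
    "/w3c/policy.xml": "<POLICIES/>",
    "/index.html": "<html>...</html>",
}

def fetch(path):
    return SITE[path]

def policy_for(path):
    """Steps 1-2: read the policy reference file and find the matching policy."""
    ref_file = ET.fromstring(fetch("/w3c/p3p.xml").strip())
    for ref in ref_file.iter(NS + "POLICY-REF"):
        for include in ref.findall(NS + "INCLUDE"):
            if fnmatch.fnmatchcase(path, include.text):
                return ref.get("about")
    return None

def load_page(path):
    about = policy_for(path)
    # Steps 3-4: fetch the policy document itself (fragment names the policy).
    policy = fetch(about.split("#")[0]) if about else None
    # Steps 5-6: fetch the page.
    return policy, fetch(path)

print(policy_for("/index.html"))
```

A user agent would evaluate the policy between steps 4 and 5 and decide whether to warn the user or withhold cookies.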

Transparency
- P3P clients can check a privacy policy each time it changes
- P3P clients can check privacy policies on all objects in a web page, including ads and invisible images

P3P in IE6
- Privacy icon on status bar indicates that a cookie has been blocked – pop-up appears the first time the privacy icon appears
- Automatic processing of compact policies only; third-party cookies without compact policies blocked by default
- Users can click on privacy icon for list of cookies; privacy summaries are available at sites that are P3P-enabled
- Privacy summary report is generated automatically from full P3P policy

P3P in Netscape 7
- Preview version similar to IE6, focusing on cookies; cookies without compact policies (both first-party and third-party) are “flagged” rather than blocked by default
- [Screenshot callout] Indicates flagged cookie
- Users can view English translation of (part of) compact policy in Cookie Manager
- A policy summary can be generated automatically from full P3P policy

AT&T Privacy Bird
- Free download of beta from
- “Browser helper object” for IE 5.01/5.5/6.0
- Reads P3P policies at all P3P-enabled sites automatically
- Puts bird icon at top of browser window that changes to indicate whether site matches user’s privacy preferences
- Clicking on bird icon gives more information
- Current version is information only – no cookie blocking

[Screenshots of AT&T Privacy Bird]
- Chirping bird is privacy indicator
- Click on the bird for more info
- Privacy policy summary - mismatch
- Users select warning conditions
- Bird checks policies for embedded content

Why web sites adopt P3P
- Demonstrate corporate leadership on privacy issues
  - Show customers they respect their privacy
  - Demonstrate to regulators that industry is taking voluntary steps to address consumer privacy concerns
- Distinguish brand as privacy friendly
- Prevent IE6 from blocking their cookies
- Anticipation that consumers will soon come to expect P3P on all web sites
- Individuals who run sites value personal privacy

P3P early adopters
- News and information sites – CNET, About.com, BusinessWeek
- Search engines – Yahoo, Lycos
- Ad networks – DoubleClick, Avenue A
- Telecom companies – AT&T
- Financial institutions – Fidelity
- Computer hardware and software vendors – IBM, Dell, Microsoft, McAfee
- Retail stores – Fortunoff, Ritz Camera
- Government agencies – FTC, Dept. of Commerce, Ontario Information and Privacy Commissioner
- Non-profits – CDT

Impacts
- Somewhat early to evaluate P3P
- Some companies that P3P-enable think about privacy in new ways and change their practices
  - Systematic assessment of privacy practices
  - Concrete disclosures – less wiggle room
  - Disclosures about areas previously not discussed in privacy policy
- Hopefully we will see greater transparency, more informed consumers, and ultimately better privacy policies
P3P: The future

Discussion questions
- What elements are needed in order to facilitate a robust market for privacy of personal information?
- How can P3P help realize such a market?

Homework discussion
- Argue for or against placing privacy-related restrictions on public web cams.
- P3P and privacy policies
  - What aspects of each privacy policy did you like and what aspects did you not like?
  - Compare the experience of reading the privacy policies with using each P3P user agent
- Pick one new-technology-related privacy concern that you believe to be particularly significant. Explain the privacy issue and why you think it is a significant concern. What might be done to mitigate the concern?

Degrees of anonymity
(listed from more anonymity to less)
- Absolute privacy: adversary cannot observe communication
- Beyond suspicion: no user is more suspicious than any other
- Probable innocence: each user is more likely innocent than not
- Possible innocence: nontrivial probability that user is innocent
- Exposed (default on web): adversary learns responsible user
- Provably exposed: adversary can prove your actions to others

The Anonymizer
- Acts as a proxy for users
- Hides information from end servers
- Sees all web traffic
- Adds ads to pages (free service; subscription service also available)
[Diagram] Client sends Request to Anonymizer, which forwards it to Server; Reply returns from Server through Anonymizer to Client.
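
What "hides information from end servers" could look like at the header level, as a sketch only. Which headers the real service removed is not stated here; the set below is an assumption, and `anonymize` is a made-up function name. The end server would also see the proxy's IP address instead of the client's, which this sketch does not model.

```python
# Headers assumed to be identifying, for illustration.
IDENTIFYING = {"referer", "user-agent", "cookie"}

def anonymize(headers):
    """Return a copy of the request headers with identifying ones dropped."""
    return {k: v for k, v in headers.items() if k.lower() not in IDENTIFYING}

request_headers = {
    "Host": "shop.example",
    "Referer": "http://search.example/?q=beer",
    "User-Agent": "Mozilla/4.75 [en] (X11; U; NetBSD 1.5_ALPHA i386)",
    "Accept-Language": "en",
    "Cookie": "Store=107",
}

forwarded = anonymize(request_headers)
print(forwarded)
```

The proxy itself still sees everything in `request_headers`, which is the "sees all web traffic" caveat above.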

Cryptography Basics
- Encryption algorithm used to make content unreadable by all but the intended receivers
  - E(plaintext, key) = ciphertext
  - D(ciphertext, key) = plaintext
- Symmetric (shared) key cryptography
  - A single key is used for E and D
  - D( E(p, k1), k1 ) = p
- Management of keys determines who has access to content
  - E.g., password encrypted
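
To see the identity D( E(p, k1), k1 ) = p in action, here is a toy cipher that XORs the message with a repeating key. It is deliberately simplistic and not secure; it only illustrates that the same shared key drives both E and D.

```python
from itertools import cycle

def E(plaintext: bytes, key: bytes) -> bytes:
    """Toy encryption: XOR each byte with the repeating key (NOT secure)."""
    return bytes(p ^ k for p, k in zip(plaintext, cycle(key)))

def D(ciphertext: bytes, key: bytes) -> bytes:
    """XOR is its own inverse, so decryption reuses the same operation."""
    return E(ciphertext, key)

p = b"attack at dawn"
k1 = b"k1-shared-secret"
k2 = b"some-other-key!!"

c = E(p, k1)
print(D(c, k1) == p)   # holder of k1 recovers the plaintext
print(D(c, k2) == p)   # a different key does not
```

Whoever holds k1 can read the content, which is why key management decides who has access.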

Public Key Cryptography
- Public key cryptography
  - Each key pair consists of a public and a private component: k+ (public key), k- (private key)
  - D( E(p, k+), k- ) = p
  - D( E(p, k-), k+ ) = p
- Public keys are distributed (typically) through public key certificates
  - Anyone can communicate secretly with you if they have your certificate
- E.g., SSL-based web commerce
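
Both identities can be checked with textbook RSA on tiny numbers. This is a classroom sketch with no padding and trivially small primes, so it is not usable for real security; the variable and function names are illustrative, and `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
# Toy RSA key pair from two small primes (illustration only).
prime_a, prime_b = 61, 53
n = prime_a * prime_b
phi = (prime_a - 1) * (prime_b - 1)
e = 17
d = pow(e, -1, phi)          # modular inverse of e

k_plus = (e, n)              # public key, k+
k_minus = (d, n)             # private key, k-

def apply_key(m, key):
    """Textbook RSA: the same modular exponentiation serves as E and D."""
    exponent, modulus = key
    return pow(m, exponent, modulus)

p = 65                       # the "plaintext", a number smaller than n

# D( E(p, k+), k- ) = p : secrecy, only the private key holder can read it
print(apply_key(apply_key(p, k_plus), k_minus) == p)
# D( E(p, k-), k+ ) = p : anyone with the public key can check who produced it
print(apply_key(apply_key(p, k_minus), k_plus) == p)
```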

Mixes [Chaum81]
Sender routes message randomly through network of “Mixes”, using layered public-key encryption.
[Diagram] Sender -> Mix A -> Mix B -> Mix C -> Destination
- Sender sends { B, { C, { dest, msg }kC }kB }kA to Mix A
- Mix A removes its layer and sends { C, { dest, msg }kC }kB to Mix B
- Mix B removes its layer and sends { dest, msg }kC to Mix C
- Mix C removes its layer and delivers msg to Destination
kX = encrypted with public key of Mix X
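
The layering can be sketched without real cryptography by modelling each encrypted layer as a sealed box that only the named mix may open. This is a simulation of the structure only: `Sealed`, `seal`, `open_layer`, `wrap`, and `relay` are invented names, and a real mix network would use actual public-key encryption and would also batch and reorder messages.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sealed:
    """Stand-in for 'payload encrypted with the public key of for_mix'."""
    for_mix: str
    payload: object

def seal(mix, payload):
    return Sealed(mix, payload)

def open_layer(mix, box):
    if box.for_mix != mix:
        raise ValueError("only %s can open this layer" % box.for_mix)
    return box.payload

def wrap(route, dest, msg):
    """Build the onion from the inside out: last mix's layer first."""
    onion = seal(route[-1], (dest, msg))
    for i in range(len(route) - 2, -1, -1):
        onion = seal(route[i], (route[i + 1], onion))
    return onion

def relay(first_mix, onion):
    """Each mix opens one layer and learns only the next hop."""
    hop, visited = first_mix, []
    while isinstance(onion, Sealed):
        visited.append(hop)
        hop, onion = open_layer(hop, onion)
    return visited, hop, onion

onion = wrap(["A", "B", "C"], "dest", "msg")
visited, final_hop, delivered = relay("A", onion)
print(visited, final_hop, delivered)
```

Mix A sees the sender but not the destination, and Mix C sees the destination but not the sender, which is the property the layering is meant to provide.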

Crowds
- Users join a Crowd of other users
- Web requests from the crowd cannot be linked to any individual
- Protection from
  - end servers
  - other crowd members
  - system administrators
  - eavesdroppers
- First system to hide data shadow on the web without trusting a central authority
[Diagram] Crowd members and Web servers

Anonymous email
- Anonymous remailers allow people to send email anonymously
- Similar to anonymous web proxies
  - Send mail to remailer, which strips out any identifying information (very controversial)
  - Johan (Julf) Helsingius ~ Penet
- Some can be chained and work like mixes