Hippocratic Databases Rakesh Agrawal Jerry Kiernan Ramakrishnan Srikant Yirong Xu.


1 Hippocratic Databases
Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu

2 Vision Paper
Algorithms
Performance Graphs
Founding Principles
New Challenges

3 The Hippocratic Oath
“What I may see or hear in the course of treatment or even outside of the treatment in regard to the life of men, which on no account [ought to be] spread abroad, I will keep to myself, holding such things shameful to be spoken about.”
– Hippocratic Oath, 8 (circa 400 BC)

4 Privacy Violations
Accidents:
–Kaiser, GlobalHealthtrax
Lax security:
–Massachusetts govt.
Ethically questionable behavior:
–Lotus & Equifax, Lexis-Nexis, Medical Marketing Service, Boston University, CVS & Giant Food
Illegal:
–Toysmart

5 Growing Privacy Concerns
Popular Press:
–Economist: The End of Privacy (May 99)
–Time: The Death of Privacy (Aug 97)
Govt. legislation
S. Garfinkel, "Database Nation: The Death of Privacy in the 21st Century", O'Reilly, Jan 2000
Special issue on internet privacy, CACM, Feb 99

6 Related Work
Statistical Databases
–Provide statistical information (sum, count, etc.) without compromising sensitive information about individuals. [AW89]
Multilevel Secure Databases
–Multilevel relations, e.g., records tagged “secret”, “confidential”, or “unclassified”, e.g. [JS91]
Wish to protect privacy in transactional databases that support daily operations.
–Cannot restrict queries to statistical queries.
–Cannot tag all the records “top secret”.

7 Current Database Systems
Ullman, “Principles of Database and Knowledge-Base Systems”
Fundamental:
–Manage persistent data.
–Access a large amount of data efficiently.
Desirable:
–Support for data model, high-level languages, transaction management, access control, and resiliency.
Similar list in other database textbooks.

8 The Vision
We propose Hippocratic Databases that include responsibility for the privacy of data they manage as a founding tenet.

9 Approach
Derive founding principles from current privacy legislation.
Strawman Design
Challenges & Open Problems

10 Caveats
Technology alone cannot address all concerns about privacy.
–Solution has to be a mix of laws, societal norms, markets and technology.
–But by advancing technology, we can influence the overall quality of the solution.
Not all the world’s data lives in database systems.
–Additional inducement for data to move to its right home.
–Hippocratic databases can serve as guide for other types of data repositories.

11 Privacy Legislation
Fair Information Practices Act (US, 1974)
OECD Guidelines (Europe, 1980)
Canadian Standards Association’s Model Code for Protection of Personal Information (1995)
Australian Privacy Amendment (2000)
Japan: proposed legislation (2003)

12 The Ten Principles
Collection Group
–Purpose Specification, Consent, Limited Collection
Use Group
–Limited Use, Limited Disclosure, Limited Retention, Accuracy
Security & Openness Group
–Safety, Openness, Compliance

13 Collection Group
1. Purpose Specification
–For personal information stored in the database, the purposes for which the information has been collected shall be associated with that information.
2. Consent
–The purposes associated with personal information shall have consent of the donor of the personal information.
3. Limited Collection
–The information collected shall be limited to the minimum necessary for accomplishing the specified purposes.

14 Use Group
4. Limited Use
–The database shall run only those queries that are consistent with the purposes for which the information has been collected.
5. Limited Disclosure
–Personal information shall not be communicated outside the database for purposes other than those for which there is consent from the donor of the information.

15 Use Group (2)
6. Limited Retention
–Personal information shall be retained only as long as necessary for the fulfillment of the purposes for which it has been collected.
7. Accuracy
–Personal information stored in the database shall be accurate and up-to-date.

16 Security & Openness Group
8. Safety
–Personal information shall be protected by security safeguards against theft and other misappropriations.
9. Openness
–A donor shall be able to access all information about the donor stored in the database.
10. Compliance
–A donor shall be able to verify compliance with the above principles. Similarly, the database shall be able to address a challenge concerning compliance.

17 Talk Outline
Motivation
Founding Principles
Strawman Design
New Challenges

18 Strawman Architecture
[Diagram: component groups Privacy Policy, Data Collection, Queries, and Other, plus the Store]

19 Architecture: Policy
[Diagram components: Privacy Policy, Privacy Metadata Creator, Privacy Metadata, Store]
Privacy Metadata Creator: converts privacy policy into privacy metadata tables.
For each purpose & piece of information (attribute):
–External recipients
–Retention period
–Authorized users
Different designs possible.
Principles addressed: Limited Disclosure, Limited Retention

20 Privacy Policies Table
Purpose          Table     Attribute  External-recipients      Authorized-users    Retention
purchase         customer  name       {delivery, credit-card}  {shipping, charge}  1 month
purchase         customer  email      empty                    {shipping}          1 month
register         customer  name       empty                    {registration}      3 years
register         customer  email      empty                    {registration}      3 years
recommendations  order     book       empty                    {mining}            10 years
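
A rough illustration of how the privacy policies table could be held and queried in memory. This is a sketch only, not the authors' design: the names `PolicyRow`, `PRIVACY_POLICIES`, and `lookup` are hypothetical, and retention is kept as the text shown on the slide.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class PolicyRow:
    """One row of the privacy policies table (hypothetical representation)."""
    purpose: str
    table: str
    attribute: str
    external_recipients: FrozenSet[str]
    authorized_users: FrozenSet[str]
    retention: str  # kept as text, exactly as written on the slide


# The five rows from the slide; "empty" is modelled as an empty set.
PRIVACY_POLICIES = [
    PolicyRow("purchase", "customer", "name",
              frozenset({"delivery", "credit-card"}),
              frozenset({"shipping", "charge"}), "1 month"),
    PolicyRow("purchase", "customer", "email",
              frozenset(), frozenset({"shipping"}), "1 month"),
    PolicyRow("register", "customer", "name",
              frozenset(), frozenset({"registration"}), "3 years"),
    PolicyRow("register", "customer", "email",
              frozenset(), frozenset({"registration"}), "3 years"),
    PolicyRow("recommendations", "order", "book",
              frozenset(), frozenset({"mining"}), "10 years"),
]


def lookup(purpose: str, table: str, attribute: str) -> Optional[PolicyRow]:
    """Return the policy row for (purpose, table, attribute), or None."""
    for row in PRIVACY_POLICIES:
        if (row.purpose, row.table, row.attribute) == (purpose, table, attribute):
            return row
    return None
```

A missing row (for example, any lookup with purpose "telemarketing") returns `None`, which a caller could treat as "no policy, so no access".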

21 Architecture: Data Collection
[Diagram components: Data Collection, Privacy Constraint Validator, Audit Info, Audit Trail, Privacy Metadata, Store]
Privacy Constraint Validator: privacy policy compatible with user’s privacy preference? (Consent)
Audit Trail: audit trail for compliance. (Compliance)
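
One possible reading of the validator's compatibility check, as a minimal sketch. It assumes a deliberately simple preference model that is not from the slides: the user lists, per purpose, the external recipients they tolerate, and a purpose missing from that list means no consent. The function name `rejected_policy_rows` is hypothetical.

```python
def rejected_policy_rows(policy_rows, preferences):
    """Return the (purpose, attribute) pairs the user has not consented to.

    policy_rows: iterable of (purpose, attribute, external_recipients)
    preferences: dict mapping purpose -> set of tolerated external recipients
                 (assumed model; a missing purpose means no consent)
    """
    rejected = []
    for purpose, attribute, recipients in policy_rows:
        tolerated = preferences.get(purpose)
        # Reject if the purpose is unknown to the user, or if the policy
        # discloses to a recipient the user did not accept.
        if tolerated is None or not set(recipients) <= tolerated:
            rejected.append((purpose, attribute))
    return rejected


policy = [
    ("purchase", "name", {"delivery", "credit-card"}),
    ("purchase", "email", set()),
    ("register", "name", set()),
    ("recommendations", "book", set()),
]
user_preferences = {
    "purchase": {"delivery", "credit-card"},
    "register": set(),
}
conflicts = rejected_policy_rows(policy, user_preferences)
```

Here the user never opted into "recommendations", so that row is the only conflict. An empty result would mean the policy is compatible with the preferences.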

22 Architecture: Data Collection
[Diagram components: Data Collection, Privacy Constraint Validator, Data Accuracy Analyzer, Record Access Control, Audit Info, Audit Trail, Privacy Metadata, Store]
Data Accuracy Analyzer: data cleansing, e.g., catch typos in address. (Accuracy)
Record Access Control: associate set of purposes with each record. (Purpose Specification)

23 Architecture: Queries
[Diagram components: Queries, Attribute Access Control, Record Access Control, Privacy Metadata, Store]
1. Telemarketing cannot issue query tagged “charge”.
2. Query tagged “telemarketing” cannot see credit card info.
3. Telemarketing query only sees records that include “telemarketing” in set of purposes.
Principles addressed: Safety, Limited Use
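
The three checks on this slide can be sketched as one guarded query function. Everything here is illustrative: the user names, the attribute lists per purpose, the sample records, and `run_query` itself are assumptions made for the example, not part of the strawman design.

```python
# Which purposes each user may tag a query with (hypothetical data).
AUTHORIZED_PURPOSES = {
    "telemarketing_dept": {"telemarketing"},
    "billing_dept": {"charge"},
}

# Which attributes a query with a given purpose may read (hypothetical data).
VISIBLE_ATTRIBUTES = {
    "telemarketing": {"name", "phone"},
    "charge": {"name", "credit_card"},
}

# Each record carries the set of purposes its donor consented to.
RECORDS = [
    {"name": "Alice", "phone": "555-0100", "credit_card": "card-A",
     "purposes": {"charge", "telemarketing"}},
    {"name": "Bob", "phone": "555-0101", "credit_card": "card-B",
     "purposes": {"charge"}},
]


def run_query(user, purpose, attributes):
    """Apply the slide's three checks, then project the permitted records."""
    # Check 1: the user must be authorized for the purpose tag.
    if purpose not in AUTHORIZED_PURPOSES.get(user, set()):
        raise PermissionError(f"{user} may not issue queries tagged {purpose!r}")
    # Check 2 (attribute access control): no attribute outside the purpose.
    forbidden = set(attributes) - VISIBLE_ATTRIBUTES.get(purpose, set())
    if forbidden:
        raise PermissionError(f"{purpose!r} queries may not read {sorted(forbidden)}")
    # Check 3 (record access control): only records consented for the purpose.
    return [{a: r[a] for a in attributes}
            for r in RECORDS if purpose in r["purposes"]]
```

With this data, a telemarketing query for name and phone returns only Alice's row, because Bob's record does not list "telemarketing" among its purposes.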

24 Architecture: Queries
[Diagram components: Queries, Query Intrusion Detector, Attribute Access Control, Record Access Control, Audit Info, Audit Trail, Privacy Metadata, Store]
Query Intrusion Detector: e.g., telemarketing query that asks for all phone numbers.
Audit Trail: compliance; training data for query intrusion detector.
Principles addressed: Safety, Compliance
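
The slide does not say how the query intrusion detector works, only that the audit trail serves as its training data. As one very simple possibility (an assumption, not the authors' method), the sketch below learns a typical result size per purpose from the audit trail and flags queries that ask for far more rows. The functions `build_profile` and `is_suspicious` and the factor of 10 are hypothetical.

```python
from statistics import median


def build_profile(audit_trail):
    """Learn the median result size per purpose from (purpose, rows) entries."""
    sizes = {}
    for purpose, rows_returned in audit_trail:
        sizes.setdefault(purpose, []).append(rows_returned)
    return {purpose: median(values) for purpose, values in sizes.items()}


def is_suspicious(profile, purpose, rows_requested, factor=10):
    """Flag a query whose result size is far above the learned norm."""
    typical = profile.get(purpose)
    if typical is None:
        return True  # no history for this purpose: flag for manual review
    return rows_requested > factor * typical


audit_trail = [
    ("telemarketing", 40),
    ("telemarketing", 55),
    ("telemarketing", 60),
    ("charge", 1),
]
profile = build_profile(audit_trail)
```

Under this toy model, a telemarketing query that would return every phone number in a large table stands out against a history of queries returning a few dozen rows.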

25 Architecture: Other
[Diagram components: Other, Data Retention Manager, Encryption Support, Data Collection Analyzer, Privacy Metadata, Store]
Data Retention Manager: delete items in accordance with privacy policy. (Limited Retention)
Encryption Support: additional security for sensitive data. (Safety)
Data Collection Analyzer: analyze queries to identify unnecessary collection, retention & authorizations. (Limited Collection)
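
A minimal sketch of what the Data Retention Manager's purge step might look like. The assumptions are mine: one month is approximated as 30 days and a year as 365 days, an item collected for several purposes is kept until the longest of their retention periods has passed, and the names `RETENTION`, `expired`, and `purge` are hypothetical.

```python
from datetime import date, timedelta

# Retention period per purpose (approximations of "1 month" and "3 years").
RETENTION = {
    "purchase": timedelta(days=30),
    "register": timedelta(days=3 * 365),
}


def expired(record, today):
    """True once every purpose attached to the record has run out."""
    deadline = max(
        (record["collected_on"] + RETENTION[p] for p in record["purposes"]),
        default=record["collected_on"],  # no purposes left: nothing justifies keeping it
    )
    return today > deadline


def purge(records, today):
    """Return only the records that may still be retained."""
    return [r for r in records if not expired(r, today)]


records = [
    {"owner": "Alice", "purposes": {"purchase"}, "collected_on": date(2003, 1, 1)},
    {"owner": "Bob", "purposes": {"purchase", "register"}, "collected_on": date(2003, 1, 1)},
]
kept = purge(records, today=date(2003, 3, 1))
```

Two months after collection, Alice's purchase-only record has passed its 30-day window and is dropped, while Bob's record survives because "register" still applies.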

26 Strawman Architecture
[Diagram of the full architecture]
Privacy Policy: Privacy Metadata Creator
Data Collection: Privacy Constraint Validator, Data Accuracy Analyzer, Audit Info
Queries: Attribute Access Control, Record Access Control, Query Intrusion Detector, Audit Info
Other: Data Retention Manager, Data Collection Analyzer, Encryption Support
Store: Privacy Metadata, Audit Trail

27 Talk Outline
Privacy
Founding Principles
Strawman Design
New Challenges

28 New Challenges
General
–Language
–Efficiency
Use
–Limited Collection
–Limited Disclosure
–Limited Retention
Security and Openness
–Safety
–Openness
–Compliance

29 Language
Need a language for privacy policies & user preferences.
P3P can be used as starting point.
–Developed primarily for web shopping.
–What about richer domains?
How do we balance expressibility and usability?
–Arrange concepts in hierarchy or subsumption relationship.
[Figure: example concepts. Purpose: contact, email, phone, home, work. P3P recipients: Ours, Same, Delivery, Unrelated, Public.]
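
To make the subsumption idea concrete, here is a small sketch. The shape of the hierarchy is an assumption: the slide's figure only survives as a list of labels, so the parent links below (and the node names "home-phone" and "work-phone") are a guess for illustration.

```python
# Child -> parent links in an assumed purpose hierarchy.
PARENT = {
    "email": "contact",
    "phone": "contact",
    "home-phone": "phone",
    "work-phone": "phone",
}


def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = PARENT.get(node)  # walk up; None once we pass the root
    return False
```

With such a hierarchy, consent given for "contact" would cover a use tagged "work-phone", while consent for "phone" would not cover "email". Users can then state a few general preferences instead of one per leaf concept.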

30 Language (2)
How do we accommodate user negotiation models?
–User willing to disclose information only if fairly compensated.
–Value of privacy as coalitional game [KPR2001]

31 Efficiency
How do we minimize the cost of privacy checking?
How do we incorporate purpose into database design and query optimization?
Tradeoffs between space & running time.
–Only tag records in customer table with purpose, not all records. But now need to do a join when scanning records in order table.
How does the secure-databases work on decomposing multilevel relations into single-level relations [JS91] apply here?
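
The space/time trade-off can be seen in a small SQLite example. The schema is an assumption made for illustration (table and column names are hypothetical): purposes are stored once per customer, so order rows carry no tag, and every purpose-restricted scan of the order table has to join back to the customer's purposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    -- Purposes are tagged on customers only, saving space in orders.
    CREATE TABLE customer_purpose (customer_id INTEGER, purpose TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, book TEXT);

    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO customer_purpose VALUES
        (1, 'purchase'), (1, 'recommendations'), (2, 'purchase');
    INSERT INTO orders VALUES (10, 1, 'Book A'), (11, 2, 'Book B');
""")


def books_visible_to(purpose):
    """Scan orders, joining to customer purposes to enforce the tag."""
    rows = conn.execute(
        """
        SELECT o.book
        FROM orders AS o
        JOIN customer_purpose AS cp ON cp.customer_id = o.customer_id
        WHERE cp.purpose = ?
        ORDER BY o.id
        """,
        (purpose,),
    )
    return [book for (book,) in rows]
```

Tagging each order row with its own purposes would remove the join at the cost of storing the tags again for every order.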

32 Limited Collection
How do we identify attributes that are collected but not used?
–Assets are only needed for mortgage when salary is below some threshold.
What’s the needed granularity for numeric attributes?
–Queries only ask “Salary > threshold” for rent application.
How do we generate minimal queries?
–Redundancy may be hidden in application code.
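
Two of these questions have simple first approximations, sketched below under stated assumptions. The query log is assumed to be available as a list of attribute lists, which sidesteps the hard part (redundancy hidden in application code), and both function names are hypothetical.

```python
def unused_attributes(collected, query_log):
    """Attributes that are collected but never referenced by any logged query."""
    used = set()
    for attributes in query_log:
        used |= set(attributes)
    return set(collected) - used


def coarsen_salary(salary, threshold):
    """Store only the answer to "Salary > threshold", not the salary itself."""
    return salary > threshold


collected = {"name", "salary", "assets"}
query_log = [["name"], ["name", "salary"]]
candidates_to_stop_collecting = unused_attributes(collected, query_log)
```

If every query in the log only ever compares salary against one threshold, the boolean from `coarsen_salary` carries all the information the application needs, which is the granularity point made on the slide.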

33 Limited Disclosure
Can the user dynamically determine the set of recipients?
Example: Alice wants to add EasyCredit to set of recipients in EquiRate’s database.
Digital signatures.

34 Limited Retention
Completely forgetting some information is non-trivial.
How do we delete a record from the logs and checkpoints, without affecting recovery?
How do we continue to support historical analysis and statistical queries without incurring privacy breaches?

35 Safety
Encryption provides additional layer of security.
How do we index encrypted data?
How do we run queries against encrypted data?
[SWP00], [HILM02]

36 Openness
A donor shall be able to access all information about the donor stored in the database.
How does the database check Alice is really Alice and not somebody else?
–Princeton admissions office broke into Yale’s admissions using applicant’s social security number and birth date.
How does Alice find out what databases have information about her?
–Symmetrically private information retrieval [GIKM98].

37 Compliance
Universal Logging
–Can we provide each user whose data is accessed with a log of that access, along with the query reading the data?
–Use intermediaries who aggregate and analyze logs for many users.
Tracking Privacy Breaches
–Insert “fingerprint” records with emails, telephone numbers, and credit card numbers.
–Some data may be more valuable for spammers or credit card theft. How do we identify categories to do stratified fingerprinting rather than randomly inserting records?
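
A small sketch of stratified fingerprinting, limited to email addresses. The category names, the `example.com` addresses, and both functions are hypothetical, and how to choose the categories is exactly the open question the slide raises.

```python
import uuid


def make_fingerprints(categories, per_category=2):
    """Create unique decoy emails per category; returns email -> category."""
    fingerprints = {}
    for category in categories:
        for _ in range(per_category):
            # A random, never-published address: any mail it receives
            # implies the record holding it left the database.
            email = f"fp-{uuid.uuid4().hex}@example.com"
            fingerprints[email] = category
    return fingerprints


def leaked_categories(fingerprints, observed_emails):
    """Categories whose decoy records show up in an observed mailing list."""
    return {fingerprints[e] for e in observed_emails if e in fingerprints}


fingerprints = make_fingerprints(["high-value", "low-value"])
```

Decoy records built from these addresses would be inserted alongside real customer rows. If a decoy later turns up in a spam list, `leaked_categories` indicates which stratum of the data was breached.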

38 Summary
Database systems that take responsibility for the privacy of data they manage.
Key privacy principles
Strawman design
Technical challenges

39 Closing Thoughts
“Code is law … it is all a matter of code: the software and hardware that rule the internet” -- L. Lessig
We can architect cyberspace to protect values we believe are fundamental, or we can architect it to allow those values to disappear.
Where does the database community want to go from here?

40 Strawman Architecture
[Diagram of the full architecture, repeated from slide 26]
Privacy Policy: Privacy Metadata Creator
Data Collection: Privacy Constraint Validator, Data Accuracy Analyzer, Audit Info
Queries: Attribute Access Control, Record Access Control, Query Intrusion Detector, Audit Info
Other: Data Retention Manager, Data Collection Analyzer, Encryption Support
Store: Privacy Metadata, Audit Trail

41 Privacy
Privacy is the right of individuals to determine for themselves when, how and to what extent information about them is communicated to others.
-- Alan Westin

