Presentation is loading. Please wait.

Presentation is loading. Please wait.

Paper Authors: Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu Presented By: Camille Gaspard Originally taken from: pages.cpsc.ucalgary.ca/~hammad/Fall04-00_files/Reg_Hippocratic%20Databases2.ppt.

Similar presentations


Presentation on theme: "Paper Authors: Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu Presented By: Camille Gaspard Originally taken from: pages.cpsc.ucalgary.ca/~hammad/Fall04-00_files/Reg_Hippocratic%20Databases2.ppt."— Presentation transcript:

1 Paper Authors: Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu Presented By: Camille Gaspard Originally taken from: pages.cpsc.ucalgary.ca/~hammad/Fall04-00_files/Reg_Hippocratic%20Databases2.ppt

2 2 Privacy with respect to databases Related work Founding principles of Hippocratic databases. Strawman design Future research Limiting Disclosure in Hippocratic Databases Hippocratic Oath: “And about whatever I may see or hear in treatment, or even without treatment, in the life of human beings – things that should not ever be blurted out outside – I will remain silent, holding such things to be unutterable.”

3 Dramatic and escalating increase in digital data Lack of privacy awareness, lax security We need to re-architect database systems to include responsibility for the privacy of data. 3 Digitization Storage Processors Networks Techniques Automatic collection

4 Privacy is the right of individuals to determine for themselves when, how and to what extent information about them is communicated to others. Computer security is the prevention of, or protection against, access to information by unauthorized recipients, and intentional but unauthorized destruction or alteration of that information 4

5 Traditional databases Statistical databases Secure databases 5

6 Fundamental to a database system is 1. Ability to manage persistent data. 2. Ability to access a large amount of data efficiently. Universal capabilities of a database system 1. Support for at least one data model. 2. Support for certain high-level languages that allow the user to define the structure of data, access data, and manipulate data. 3. Transaction management, the capability to provide correct, concurrent access to the database by many users at once. 4. Access control, the ability to deny access to data by unauthorized users and the ability to check the validity of the data. 5. Resiliency, the ability to recover from system failures without losing data. 6

7 Hippocratic databases require all the capabilities provided by current database systems Different focus Need to rethink data definition and query languages, query processing, indexing and storage structures, and access control mechanisms 7 Privacy Consented sharing Efficiency Maximizing Concurrency Resiliency

8 Goal: Provide statistical information (sum, count, average, maximum, minimum, pth percentile) without compromising sensitive information about individuals Query restriction Restricting the size of query results Controlling the overlap among successive queries Keeping audit trails of all answered queries and checking for possible compromises Data perturbation Swapping values between records Replacing the original database by a sample from the same distribution Adding noise to the values in the database Adding noise to the results of a query Hippocratic databases share the goal of preventing disclosure of private information but the class of queries for Hippocratic databases is much broader. 8

9 Sensitive information is transmitted over a secure channel and stored securely Access controls Encryption Multilevel secure databases Multiple levels of security (top secret, secret, confidential, unclassified) Eg: Bell-LaPadula model (no read up, no write down) 9

10 United States Privacy Act of 1974 requires federal agencies to 1. permit an individual to determine what records pertaining to him are collected, maintained, used, or disseminated; 2. permit an individual to prevent records pertaining to him obtained for a particular purpose from being used or made available for another purpose without his consent; 3. permit an individual to gain access to information pertaining to him in records, and to correct or amend such records; 4. collect, maintain, use or disseminate any record of personally identifiable information in a manner that assures that such action is for a necessary and lawful purpose, that the information is current and accurate for its intended use, and that adequate safeguards are provided to prevent misuse of such information; 5. permit exemptions from the requirements with respect to the records provided in this Act only in those cases where there is an important public policy need for such exemption as has been determined by specific statutory authority; and 6. be subject to civil suit for any damages which occur as a result of willful or intentional action which violates any individual’s right under this Act. 10

11 Recent privacy documents 1996 Health Insurance Portability and Accountability Act (HIPAA) 1999 Gramm-Leach-Bliley Financial Services Modernization Act 1995 Canada Personal Information Protection and Electronic Documents Act (PIPEDA) 11

12 Collection Retention Use Disclosure Example: Grad student information at the university 12

13 1. Purpose Specification. For personal information stored in the database, the purposes for which the information has been collected shall be associated with that information. 2. Consent. The purposes associated with personal information shall have consent of the donor of the personal information. 3. Limited Collection. The personal information collected shall be limited to the minimum necessary for accomplishing the specified purposes. 4. Limited Use. The database shall run only those queries that are consistent with the purposes for which the information has been collected. 5. Limited Disclosure. The personal information stored in the database shall not be communicated outside the database for purposes other than those for which there is consent from the donor of the information. 13

14 6. Limited Retention. Personal information shall be retained only as long as necessary for the fulfillment of the purposes for which it has been collected. 7. Accuracy. Personal information stored in the database shall be accurate and up-to-date. 8. Safety. Personal information shall be protected by security safeguards against theft and other misappropriations. 9. Openness. A donor shall be able to access all information about the donor stored in the database. 10. Compliance. A donor shall be able to verify compliance with the above principles. Similarly, the database shall be able to address a challenge concerning compliance. 14

15 Use scenario: Mississippi Bookstore Mississippi is an on-line bookseller who needs to obtain certain minimum personal information to complete a purchase transaction. This information includes name, shipping address, and credit card number. Mississippi also needs an email address to notify the customer of the status of the order. Mississippi uses the purchase history of customers to offer book recommendations on its site. It also publishes information about books popular in the various regions of the country (purchase circles). 15

16 16 Name: Super Mario Privacy pragmatist Likes the convenience of providing his email and shipping address only once by registering at Mississippi. He also likes recommendations but he does not want his transactions used for purchase circles.

17 17 Name: Princess Peach Privacy fundamentalist Does not want Mississippi to retain any information once her purchase transaction is complete.

18 18 Name: Luigi Mississippi employee with questionable ethics The database and privacy officer must ensure that he is not able to obtain more information that she is supposed to.

19 Privacy Metadata Schema Database Schema Privacy-Policies Table 19

20 Privacy-Authorizations Table 20 This object is copied from the original paper.

21 This design may be too restrictive or may not fit in some situations. May create a new table with the columns User Table Attribute Purpose External recipient 21

22 22 This object is copied from the original paper.

23 Privacy Constraint Validator checks whether the business’s privacy policy is acceptable to the user Example: If Princess Peach required a 2 week retention period, the database would reject the transaction Data is inserted with the purpose for which it may be used Data Accuracy Analyzer addresses the Principle of Accuracy. For example, verify that the postal code corresponds to the street address. 23

24 Submitted to the database along with their purpose. Example: recommendations Before query execution: Attribute Access Control checks privacy-authorizations table for a match on purpose, attribute and user. During query execution: Record Access Control ensures that only records whose purpose attribute includes the query’s purpose will be visible to the query. 24

25 After query execution: Query Intrusion Detector is run on the query results to spot queries whose access pattern is different from the usual access pattern for queries with that purpose and by that user. Detector uses the Query Intrusion Model built by analyzing past queries for each purpose and each authorized user. An audit trail of all queries is maintained for external privacy audits, as well as addressing challenges regarding compliance. 25

26 Data Retention Manager deletes data items that have outlived their purpose. Data Collection Analyzer examines the set of queries for each purpose to determine if any information is being collected but not used. This supports the Principle of Limited Collection. DRA determines if data is being kept for longer than necessary. (Limited Retention) DCA determines if people have unused (unnecessary) authorizations to issue queries with a given purpose. (Limited Use) Encryption Support allows some data items to be stored in encrypted form to guard against snooping. 26

27 Platform for Privacy Preferences (emerging standard developed by the WWW Consortium) P3P provides a way for a web site to encode its data- collection practices in an XML P3P policy The sites policy is programmatically compared to a user’s privacy preferences How to enforce? Integrate with Hippocratic databases The Electronic Privacy Information Center (EPIC) has been critical of P3P and believes P3P makes it too difficult for users to protect their privacy 27

28 28

29 P3P language insufficient Developed for web shopping  language restricted P3P is a good starting for a language which can be used in a wider variety of environments such as finance, insurance, and health care Difficult to find balance between expressibility and usability Work is being done to arrange purposes in a hierarchy rather than the flat space that P3P uses Other research treats the problem as a coalition game [Kleinberg et al. 2001]where information is valuable and users are compensated for revealing some private information. 29

30 Current databases have undergone years of optimization without concern for privacy. What type of performance hit will integrated privacy checking entail? Some techniques from multilevel secure databases will apply Technique that can reduce the cost of each check: Assume number of purposes is 32 or less Encode the set of purposes by setting a bit in a word Then the Record Access Control check only requires a bit-wise AND and XOR of two words with a check that the result is 0 Storage of purpose – space versus efficiency 30

31 Access Analysis: Analyze the queries for each purpose and identify attributes that are collected for a given purpose but not used. Problem: Necessity of one attribute may depend on others Granularity Analysis: Analyze the queries for each purpose and numeric attribute and determine the granularity at which information is needed. Minimal Query Generation: Generate the minimal query that is required to solve a given problem. 31

32 Strawman design as it exists now does not protect against a dynamically determined set of recipients Example: EquiRate is a credit rating agency that maintains a database of credit ratings Princess Peach has asked EquiRate to only disclose her credit rating to companies that she has contacted Luigi has stolen Princess Peach’s identity and has contacted EasyCredit pretending to be her. EasyCredit contacts EquiRate with Princess Peach’s credentials and obtains her credit rating Both companies have adhered to the protocol but it has failed 32

33 Potential solution 1. Princess Peach digitally signs EasyCredit’s company ID with her cryptographic private key 2. She provides the result to EasyCredit 3. EasyCredit contacts EquiRate and gives them Alice’s signed permission 4. EquiRate verifies the signature and provides the information to EasyCredit 33

34 Deleting a record from a database is simple but how does one completely remove the information? Deleting information for all logs, checkpoints, and backups is difficult 34

35 How do we protect against someone whose access controls prevent them from accessing tables through the DBMS but they have access to the files via other means? Example: Suppose Luigi has no access to the DBMS but has root privileges on the system Encryption incurs performance penalties How can one index or query encrypted columns 35

36 A person challenging information about themselves that was not provided by them (eg: credit report) Super Mario wishes to find out what databases have information about him If the database does not have information about him, it should not know who issued the query Related work: symmetrically private information retrieval 36

37 Provide each user whose data is accessed with a log of that access along with the query reading the data, without paying a large performance penalty Fingerprinting – sign up with a company that inserts fingerprint records into the database with an email address, phone number and credit card number along with various approved purposes. If one is used for an unauthorized purpose it would be immediately detected and the company notified that a breach has occurred Challenge: use the least number of fingerprint records to maximum effect. 37

38 Presented a vision, inspired by the Hippocratic Oath, of databases that preserve privacy Enunciated key privacy principles Discussed a strawman design for a Hippocratic database Identified technical challenges Discussed related work 38

39 Paper Authors: Kristen LeFevrey, Rakesh Agrawaly, Vuk Ercegovac, Raghu Ramakrishnan, Yirong Xuy and David DeWitt Presented By: Camille Gaspard

40 It’s essential to manage disclosure in a more fine-grain manner. Cell-level disclosure schemes have not been well studied before this work. 40

41 Investigate the alternative semantics for limited disclosure in relational databases and present two models of cell-level limited disclosure: table semantics and query semantics, weighing the relative semantic and performance tradeoffs implicit in the two models. Rather than viewing the disclosure control problem as one of checking against a list of privileges, they transform it into a query modification problem. Experimental evaluation. 41

42 The table semantics model conceptually defines a view of each data table for each purpose-recipient pair, based on the disclosure constraints specified in the privacy policy. These views combine to produce a coherent relational database for each purpose- recipient pair, independent of any queries, and queries are evaluated against this database. 42

43 The query semantics model takes the query into account when enforcing disclosure. Both models mask prohibited values using SQL's null value. 43

44 Policy definition: Privacy policies are expressed electronically and stored in the database where they can be used to enforce limited disclosure. Privacy meta-data: The rules and conditions prescribed by the privacy policy are stored a privacy meta-data tables in the database. Application context: Each query must be associated with a purpose and recipient. In their system, this information is inferred based on the context of the application issuing the query. The query interceptor infers the purpose and recipient of the query based on context information stored in an additional meta-data table. 44

45 Query modifier: SQL queries issued to the database are intercepted and augmented to reflect the privacy policy rules regarding the purpose and recipient issuing the query. The results of this new query are returned to the issuer. Disclosure model: The result of executing a privacy modified query will reflect one of two cell-level limited disclosure models. 45

46 46

47 47

48 48

49 49

50 Limited Disclosure. Can we really limit it? Limited Retention. Can we really delete the data? Isn’t it better to keep such complex privacy mechanism as part of the application? Why not using new techniques in writing stored procedures? Databases are only part of the whole system. Securing the database helps but not prevent attacks. A holistic solution is what might be needed. 50

51 51


Download ppt "Paper Authors: Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu Presented By: Camille Gaspard Originally taken from: pages.cpsc.ucalgary.ca/~hammad/Fall04-00_files/Reg_Hippocratic%20Databases2.ppt."

Similar presentations


Ads by Google