Inference Problem
Access Control Policies
  Direct access
  Information flow
  Not addressed: indirect data access
CSCE Farkas 2 Lecture 19
Indirect Information Flow Channels
  Covert channels
  Inference channels
Inference Channels
  Non-sensitive information + Meta-data = Sensitive information
Inference Channels
  Statistical Database Inferences
  General Purpose Database Inferences
Statistical Databases
  Goal: provide aggregate information about groups of individuals
    E.g., average grade point of students
  Security risk: specific information about a particular individual
    E.g., grade point of student John Smith
  Meta-data:
    Working knowledge about the attributes
    Supplementary knowledge (not stored in database)
Types of Statistics
  Macro-statistics: collections of related statistics presented in 2-dimensional tables
    [Example table: counts by Sex (Female, Male) and Year, with row and column sums; cell values not recoverable]
  Micro-statistics: individual data records used for statistics after identifying information is removed
    [Example table with columns Sex, Course, GPA, Year; cell values not recoverable]
Statistical Compromise
  Exact compromise: find the exact value of an attribute of an individual (e.g., John Smith's GPA is 3.8)
  Partial compromise: find an estimate of an attribute value corresponding to an individual (e.g., John Smith's GPA is between 3.5 and 4.0)
Methods of Attacks and Protection
Small/Large Query Set Attack
  C: characteristic formula that identifies groups of individuals
  If C identifies a single individual I, i.e., count(C) = 1:
    Find out existence of a property
      count(C and D) = 1 means I has property D
      count(C and D) = 0 means I does not have D
    OR
    Find the value of a property
      sum(C, D) gives the value of D
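The small query set attack can be sketched in a few lines of Python. This is only an illustration: the records, attribute names, and the helper functions `count` and `total` are hypothetical and stand in for whatever statistical query interface the database exposes.

```python
# Hypothetical records; names and values are made up for illustration.
RECORDS = [
    {"name": "Ann", "sex": "F", "major": "CSCE", "year": 2, "probation": False, "gpa": 3.1},
    {"name": "Bob", "sex": "M", "major": "CSCE", "year": 3, "probation": True, "gpa": 2.4},
    {"name": "Cat", "sex": "F", "major": "MATH", "year": 3, "probation": False, "gpa": 3.8},
    {"name": "Dan", "sex": "M", "major": "MATH", "year": 2, "probation": False, "gpa": 3.5},
]

def count(pred):
    """Statistical query: number of records satisfying a characteristic formula."""
    return sum(1 for r in RECORDS if pred(r))

def total(pred, attr):
    """Statistical query: sum of attribute `attr` over the matching records."""
    return sum(r[attr] for r in RECORDS if pred(r))

# C uses only attributes the attacker already knows about the target.
C = lambda r: r["sex"] == "M" and r["major"] == "CSCE" and r["year"] == 3
# D is the sensitive property the attacker wants to test.
D = lambda r: r["probation"]

is_isolated = count(C) == 1                          # C identifies one individual I
has_property = count(lambda r: C(r) and D(r)) == 1   # existence of property D
leaked_gpa = total(C, "gpa")                         # value of an attribute of I
```

No name appears in any query, yet the aggregate answers reveal one person's probation status and GPA.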
Small/Large Query Set Attack (cont.)
  Protection from small/large query set attack: query-set-size control
  A query q(C) is permitted only if n ≤ |C| ≤ N - n, where n ≥ 0 is a parameter of the database and N is the number of records in the database
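A minimal sketch of query-set-size control follows. The function name `permitted` and the values of N and n are assumptions chosen for illustration. The upper bound matters because a very large query set has a very small complement, which could be obtained by subtracting from a query over the whole database.

```python
def permitted(query_set_size, n, N):
    """Query-set-size control: allow q(C) only if n <= |C| <= N - n."""
    return n <= query_set_size <= N - n

N = 100  # hypothetical number of records in the database
n = 5    # hypothetical size threshold

too_small = permitted(1, n, N)    # a single individual: rejected
mid_size = permitted(50, n, N)    # a large enough group: accepted
too_large = permitted(98, n, N)   # complement has only 2 records: rejected
```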
Tracker attack
  q(C) is disallowed
  C = C1 and C2
  Tracker: T = C1 and ~C2
  q(C) = q(C1) - q(T)
Tracker attack (cont.)
  q(C and D) is disallowed
  C = C1 and C2
  Tracker: T = C1 and ~C2
  q(C and D) = q(T or (C and D)) - q(T)
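The two tracker formulas can be checked against a size-controlled count query. The sketch below is an assumption-laden illustration: the ten records, the threshold `N_MIN`, and the guarded query function `q` are all hypothetical. It relies on T and (C and D) being disjoint, so counts over their union simply add.

```python
# Hypothetical data: (sex, major, honors)
RECORDS = [
    ("F", "CSCE", True),
    ("M", "CSCE", False),
    ("M", "CSCE", True),
    ("M", "CSCE", False),
    ("M", "CSCE", False),
    ("F", "MATH", False),
    ("F", "MATH", True),
    ("M", "MATH", False),
    ("F", "MATH", False),
    ("M", "MATH", True),
]
N_MIN = 2  # hypothetical size-control parameter n

def q(pred):
    """Count query guarded by query-set-size control."""
    size = sum(1 for r in RECORDS if pred(r))
    if not (N_MIN <= size <= len(RECORDS) - N_MIN):
        raise PermissionError("query set size not permitted")
    return size

C1 = lambda r: r[1] == "CSCE"
C2 = lambda r: r[0] == "F"
C = lambda r: C1(r) and C2(r)        # isolates one individual
T = lambda r: C1(r) and not C2(r)    # the tracker
D = lambda r: r[2]                   # sensitive property

try:
    q(C)
    direct_allowed = True
except PermissionError:
    direct_allowed = False           # the direct query is refused

# Both q(C1) and q(T) pass the size check, so their difference leaks q(C).
count_C = q(C1) - q(T)
# Padding (C and D) with the tracker makes the query set large enough.
count_C_and_D = q(lambda r: T(r) or (C(r) and D(r))) - q(T)
```

Every query the attacker issues is individually permitted; only the arithmetic on the answers is revealing.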
Query overlap attack
  [Diagram: overlapping query sets C1 and C2 over the individuals John, Kathy, Max, Fred, Eve, Paul, Mitch; John is in C1 but not in C2]
  q(John) = q(C1) - q(C2)
  Protection: query-overlap control
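A small sketch of the overlap attack and of one possible overlap check. The salary figures, the exact membership of C1 and C2, and the function `overlap_ok` with its `max_overlap` parameter are assumptions for illustration; the slide only fixes that the two sets differ in John.

```python
# Hypothetical salaries in thousands; names taken from the slide.
SALARY = {"John": 52, "Kathy": 61, "Max": 48, "Fred": 55,
          "Eve": 70, "Paul": 44, "Mitch": 66}

def q(names):
    """Sum query over an explicit query set."""
    return sum(SALARY[name] for name in names)

# Assumed query sets: C1 is C2 plus John.
C1 = {"John", "Kathy", "Max", "Fred"}
C2 = {"Kathy", "Max", "Fred"}
john_salary = q(C1) - q(C2)

def overlap_ok(new_set, answered_sets, max_overlap):
    """Query-overlap control: reject a query set that shares more than
    max_overlap records with any previously answered query set."""
    return all(len(new_set & old) <= max_overlap for old in answered_sets)

second_query_allowed = overlap_ok(C2, [C1], max_overlap=2)
disjoint_query_allowed = overlap_ok({"Eve", "Paul", "Mitch"}, [C1], max_overlap=2)
```

The check requires the system to remember every query set it has answered, which is one reason overlap control is costly in practice.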
Insertion/Deletion Attack
  Observing changes over time
    q1 = q(C)
    insert(i)
    q2 = q(C)
    q(i) = q2 - q1
  Protection: insertions/deletions performed in pairs
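The arithmetic behind the insertion attack, with made-up integer values. The second half shows one reading of "performed in pairs", namely that two records are always added together; this interpretation is an assumption, and pairing alone does not help if the attacker already knows one of the two values.

```python
# Hypothetical confidential values of the records in query set C.
scores = [82, 91, 77]

q1 = sum(scores)          # q1 = q(C) before the update
scores.append(95)         # insert(i): one new record joins C
q2 = sum(scores)          # q2 = q(C) after the update
leaked = q2 - q1          # q(i) = q2 - q1, the new record's value

# With updates applied in pairs, the difference reveals only a combined value.
paired = [82, 91, 77]
before = sum(paired)
paired.extend([95, 60])   # two records inserted together
after = sum(paired)
combined = after - before
```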
Statistical Inference Theory
  Given an unlimited number of statistics and correct statistical answers, all statistical databases can be compromised (Ullman)
Privacy Preserving Data Mining
  Related to statistical DB privacy
  We will cover it later in the semester
Inferences in General-Purpose Databases
  Queries based on sensitive data
  Inference via database constraints
  Inferences via updates
Queries based on sensitive data
  Sensitive information is used in the selection condition but not returned to the user.
  Example: Salary: secret, Name: public
    Query: select Name where Salary = $25,000
  Protection: apply the query to database views at different security levels
How to mitigate this problem?
  Time of evaluation
  Architecture
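The difference between evaluating a selection on the base data and evaluating it on the user's view can be sketched as follows. The table contents and the function names `unsafe_select_names` and `safe_select_names` are hypothetical; this is one way to picture the view-based protection, not a description of a particular system.

```python
# Hypothetical base relation: Name is public, Salary is secret.
EMP = [
    {"name": "Black", "salary": 25000},
    {"name": "Red", "salary": 42000},
]

def unsafe_select_names(salary):
    """Evaluates the condition on secret data, then returns only public data.
    The answer still tells the user who earns `salary`."""
    return [r["name"] for r in EMP if r["salary"] == salary]

# The public view is built first; secret attributes are simply absent.
PUBLIC_VIEW = [{"name": r["name"]} for r in EMP]

def safe_select_names(salary):
    """Evaluates the same query against the public view, so the secret
    attribute can never satisfy the selection condition."""
    return [r["name"] for r in PUBLIC_VIEW if r.get("salary") == salary]

leaky_answer = unsafe_select_names(25000)
protected_answer = safe_select_names(25000)
```

The point of the sketch is ordering: access control is applied before the selection condition is evaluated, not after.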
Database Constraints
  Integrity constraints
  Database dependencies
  Key integrity
Integrity Constraints
  C = A + B
  A = public, C = public, and B = secret
  B can be calculated from A and C (B = C - A), i.e., secret information can be calculated from public data
Database Dependencies
  Metadata:
    Functional dependencies
    Multi-valued dependencies
    Join dependencies
    etc.
Functional Dependency
  FD: A → B, that is, for any two tuples in the relation, if they have the same value for A, they must have the same value for B.
  Example: FD: Rank → Salary
    Secret information: Name and Salary together
    Query1: Name and Rank
    Query2: Rank and Salary
    Combine the answers to Query1 and Query2 to reveal Name and Salary together
  See slides in dissertation-farkas-rotated.pdf
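Combining the two permitted query answers amounts to a join on Rank. The employee rows below are hypothetical and chosen so that Rank → Salary holds; the variable names are illustrative only.

```python
# Hypothetical relation (Name, Rank, Salary) satisfying the FD Rank -> Salary.
EMPLOYEES = [
    ("Black", "Clerk", 38000),
    ("Red", "Manager", 42000),
    ("Green", "Clerk", 38000),
]

# Each query on its own returns only a permitted attribute combination.
query1 = {(name, rank) for name, rank, _ in EMPLOYEES}      # Name and Rank
query2 = {(rank, salary) for _, rank, salary in EMPLOYEES}  # Rank and Salary

# Because of the FD, each rank maps to exactly one salary.
rank_to_salary = dict(query2)

# Joining the two answers on Rank reveals the secret association.
inferred = {name: rank_to_salary[rank] for name, rank in query1}
```

Without the functional dependency the join would only narrow each name down to a set of candidate salaries; with it, the inference is exact.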
Key integrity
  Every tuple in the relation has a unique key
  Users at different levels see different versions of the database
  Users might attempt to update data that is not visible to them
Example (P = public, S = secret label on each value)
  Secret View
    Name (key)    Salary      Address
    Black  P      38,000 P    Columbia S
    Red    S      42,000 S    Irmo     S
  Public View
    Name (key)    Salary      Address
    Black  P      38,000 P    Null     P
Updates
  Public user's view:
    Name (key)    Salary      Address
    Black  P      38,000 P    Null     P
  Public user:
    1. Update Black's address to Orlando
    2. Add new tuple: (Red, 22,000, Manassas)
  If the system chooses to
    Refuse update: covert channel
    Allow update:
      Overwrite high data: may be incorrect
      Create new tuple: which data is correct? (polyinstantiation); violates key constraints
Updates
  Secret user's view:
    Name (key)    Salary      Address
    Black  P      38,000 P    Columbia S
    Red    S      42,000 S    Irmo     S
  Secret user:
    1. Update Black's salary to 45,000
  If the system chooses to
    Refuse update: denial of service
    Allow update:
      Overwrite low data: covert channel
      Create new tuple: which data is correct? (polyinstantiation); violates key constraints
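Polyinstantiation is often modeled by treating the classification as part of the key, so the same name can exist once per level. The sketch below makes that assumption and simplifies the slides' per-value labels to a single label per tuple. The names `LEVELS`, `insert`, and `view` are hypothetical.

```python
LEVELS = {"P": 0, "S": 1}  # public < secret

# Key is (name, tuple level), so one name may have several instances.
table = {
    ("Black", "P"): (38000, None),
    ("Red", "S"): (42000, "Irmo"),
}

def insert(name, level, salary, address):
    """Insert at the user's own level; only a same-level duplicate is a key clash."""
    key = (name, level)
    if key in table:
        raise KeyError(f"duplicate key at level {level}: {name}")
    table[key] = (salary, address)

def view(user_level):
    """Return the tuples at or below the user's level, ordered by (name, level)."""
    rows = [
        (name, lvl, sal, addr)
        for (name, lvl), (sal, addr) in table.items()
        if LEVELS[lvl] <= LEVELS[user_level]
    ]
    return sorted(rows, key=lambda row: row[:2])

# The public user adds Red without learning that a secret Red already exists.
insert("Red", "P", 22000, "Manassas")
public_rows = view("P")
secret_reds = [row for row in view("S") if row[0] == "Red"]
```

The insert succeeds, so no covert channel is opened, but the secret user now sees two tuples named Red and must decide which one is correct. The key on Name alone no longer holds.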
Inference Problem
  No general technique is available to solve the problem
  Need assurance of protection
  Hard to incorporate outside knowledge