Quantification of Integrity

Quantification of Integrity
Michael Clarkson and Fred B. Schneider Cornell University SnT Seminar University of Luxembourg September 9, 2010 Joint work with FBS at Cornell University. The title says it all. This work is about quantification of integrity. But in more detail, I mean…

Goal Information-theoretic Quantification of programs’ impact on Integrity of Information [Denning 1982] Idea of applying info theory to programs isn’t new. But looking at integrity is a quite recent development. And what *is* integrity?

What is Integrity? Common Criteria: Biba (1977): Databases:
Protection of assets from unauthorized modification Biba (1977): Guarantee that a subsystem will perform as it was intended Isolation necessary for protection from subversion Dual to confidentiality Databases: Constraints that relations must satisfy Provenance of data Utility of anonymized data …no universal definition //Biba said “adheres to well-defined code of conduct” by analogy to personal integrity. So we take 2 starting points.

Our Notions of Integrity
Corruption: damage to integrity Starting Point Corruption Measure Taint analysis Contamination Program correctness Suppression

Corruption: damage to integrity Starting Point Corruption Measure Taint analysis Contamination Program correctness Suppression …I’ll start by talking about contamination. Contamination: bad information present in output Suppression: good information lost from output …distinct, but interact

Contamination Goal: model taint analysis Program untrusted Attacker
User User Untrusted = tainted; trusted = untainted. This model has active attacker who provides inputs, not just an attacker who observes output.

Untrusted input contaminates trusted output
Contamination Goal: model taint analysis Untrusted input contaminates trusted output Program untrusted Attacker Attacker trusted User User If this looks familiar---it should.

Contamination u contaminates o o:=(t,u)

(Can’t u be filtered from o?)
Contamination u contaminates o (Can’t u be filtered from o?) o:=(t,u) Answer: Yes. Just like in real life, contaminants can be filtered out. But contamination is about the amount of contaminant present in the output---not about what user does with output.

Quantification of Contamination
Use information theory: information is surprise X, Y, Z: distributions I(X,Y): mutual information between X and Y (in bits) I(X,Y | Z): conditional mutual information An event with probability ½ conveys 1 bit of information. E.g., coin flip.

Program User Attacker trusted untrusted

Uin Program User Attacker trusted untrusted Tin Tout Uin and Tin characterize probability with which agents choose inputs. Tout is a function of the input distributions and the program, which might itself behave randomly.

Contamination = I(Uin,Tout | Tin) Uin Program User Attacker trusted untrusted Tin Tout Why condition on Tin? To model the fact that the user can observe the trusted inputs---since he does supply them himself. [Newsome et al. 2009] Dual of [Clark et al. 2005, 2007]

Example of Contamination
o:=(t,u) Contamination = I(U, O | T) = k bits if U is uniform on [0,2k-1]

Corruption: damage to integrity Starting Point Corruption Measure Taint analysis Contamination Program correctness Suppression Contamination: bad information present in output Suppression: good information lost from output

Program Suppression Goal: model program (in)correctness Specification
Sender Receiver correct (Specification must be deterministic)

Program Suppression Goal: model program (in)correctness Specification
Sender Receiver correct Implementation untrusted Attacker Attacker Attacker and Sender could share choosing the Spec’s Inputs. How to divide Input between Attacker and Sender? Doesn’t matter; just need joint probability distribution on Spec and Impl to ensure that inputs are the same in individual runs. Likewise, can match up differing named, typed outputs via joint distribution. trusted Sender Receiver real

Program Suppression Goal: model program (in)correctness
Specification Sender Receiver correct Implementation untrusted Attacker Attacker Implementation might be buggy; attacker might influence output. Normally think of any incorrectness as bad. Here, we quantify---more nuanced. trusted Sender Receiver real Implementation might suppress information about correct output from real output

Example of Program Suppression
Spec. for (i=0; i<m; i++) { s := s + a[i]; } a[0..m-1]: trusted

Spec. for (i=0; i<m; i++) { s := s + a[i]; } a[0..m-1]: trusted Impl. 1 for (i=1; i<m; i++) { s := s + a[i]; } Suppression—a[0] missing No contamination

Spec. for (i=0; i<m; i++) { s := s + a[i]; } a[0..m-1]: trusted Impl. 1 Impl. 2 for (i=1; i<m; i++) { s := s + a[i]; } for (i=0; i<=m; i++) { s := s + a[i]; } a[m]: untrusted Suppression—a[0] missing No contamination Suppression—a[m] added Contamination

Suppression vs. Contamination
output := input Contamination Attacker Attacker * * Suppression

Quantification of Program Suppression
Specification Sender Receiver Implementation untrusted Attacker Attacker trusted Sender Receiver

In Spec Specification Sender Receiver Implementation untrusted Attacker Attacker trusted Sender Receiver

In Spec Specification Sender Receiver Uin Implementation untrusted Attacker Attacker trusted Sender Receiver Tin Impl

In Spec Specification Sender Receiver Uin Implementation untrusted Attacker Attacker trusted Sender Receiver Tin Impl Program transmission = I(Spec , Impl)

H(X): entropy (uncertainty) of X H(X|Y): conditional entropy of X given Y Program Transmission = I(Spec, Impl) = H(Spec) − H(Spec | Impl)

H(X): entropy (uncertainty) of X H(X|Y): conditional entropy of X given Y Program Transmission = I(Spec, Impl) = H(Spec) − H(Spec | Impl) Total info to learn about Spec

H(X): entropy (uncertainty) of X H(X|Y): conditional entropy of X given Y Program Transmission = I(Spec, Impl) = H(Spec) − H(Spec | Impl) Info actually learned about Spec by observing Impl Total info to learn about Spec

H(X): entropy (uncertainty) of X H(X|Y): conditional entropy of X given Y Program Transmission = I(Spec, Impl) = H(Spec) − H(Spec | Impl) Info actually learned about Spec by observing Impl Total info to learn about Spec Info NOT learned about Spec by observing Impl

H(X): entropy (uncertainty) of X H(X|Y): conditional entropy of X given Y Program Transmission = I(Spec, Impl) = H(Spec) − H(Spec | Impl) Program Suppression = H(Spec | Impl)

Spec. for (i=0; i<m; i++) { s := s + a[i]; } Impl. 1 Impl. 2 for (i=1; i<m; i++) { s := s + a[i]; } for (i=0; i<=m; i++) { s := s + a[i]; } Of course, we can quantify suppression of impl. 2 exactly, given specific probability distributions. I’ve defined contamination and suppression now for mutual information based definitions. Can extend to richer scenario of beliefs. Suppression = H(A) Suppression ≤ H(A) A = distribution of individual array elements

Echo Specification Tin Tin output := input trusted Sender Receiver

Echo Specification Tin Tin output := input trusted Sender Receiver Uin
Implementation untrusted Attacker Attacker trusted Sender Receiver Tin Tout

Simplifies to information-theoretic model of channels, with attacker
Echo Specification Tin Tin output := input trusted Sender Receiver Uin Implementation untrusted Attacker Attacker trusted Sender Receiver Tin Tout Simplifies to information-theoretic model of channels, with attacker

Channel Suppression Channel transmission = I(Tin,Tout)
Uin Channel untrusted Attacker Attacker trusted Sender Receiver Tin Tout Channel transmission = I(Tin,Tout) Channel suppression = H(Tin | Tout) (Tout depends on Uin)

Belief-based Metrics What if user’s/receiver’s distribution on unobservable inputs is wrong?  Belief-based information flow [Clarkson et al. 2005] Belief-based generalizes information-theoretic: On single executions, the same In expectation, the same …if user’s/receiver’s distribution is correct Correct means that the distribution is NOT wrong. What about relationship between confidentiality and integrity?

Suppression and Confidentiality
Declassifier: program that reveals (leaks) some information; suppresses rest Leakage: [Denning 1982, Millen 1987, Gray 1991, Lowe 2002, Clark et al. 2005, 2007, Clarkson et al. 2005, McCamant & Ernst 2008, Backes et al. 2009] Thm. Leakage + Suppression is a constant  What isn’t leaked is suppressed This is for a single execution. Constant determined by distribution on secret inputs---in particular, I(h) where h is the high input event for the execution.

Database Privacy Statistical database anonymizes query results: …sacrifices utility for privacy’s sake response Anonymizer Database User User query anonymized response

Database Privacy Statistical database anonymizes query results: …sacrifices utility for privacy’s sake …suppresses to avoid leakage response Anonymizer Database User User query anonymized response anon. resp. := resp.

Database Privacy Statistical database anonymizes query results: …sacrifices utility for privacy’s sake …suppresses to avoid leakage …sacrifices integrity for confidentiality’s sake response Anonymizer Database User User query anonymized response

K-anonymity DB. Every individual must be anonymous within set of size k [Sweeney 2002] Iflow. Every output corresponds to k inputs …no bound on leakage or suppression

L-diversity DB. Every individual’s sensitive information should appear to have L (roughly) equally likely values [Machanavajjhala et al. 2007] DB. Entropy L-diversity: H(anon. block) ≥ log L [Øhrn and Ohno-Machado 1999, Machanavajjhala et al. 2007] Iflow. H(Tin | tout) ≥ log L (if Tin uniform) …implies suppression ≥ log L

Differential Privacy DB. [Dwork et al. 2006, Dwork 2006]
No individual loses privacy by including data in database Anonymized data set reveals no information about an individual beyond what other individuals already reveal Iflow. Output reveals almost no information about individual input beyond what other inputs already reveal …implies almost all information about individual suppressed …quite similar to noninterference

Summary Measures of information corruption:
Contamination (generalizes taint analysis, dual to leakage) Suppression (generalizes program correctness, no dual) Application: database privacy (model anonymizers; relate utility and privacy; security conditions)

More Integrity Measures
Attacker- and program-controlled suppression Granularity: Average over all executions Single executions Sequences of executions …interaction of attacker with program Application: Error-correcting codes

Quantification of Integrity
Michael Clarkson and Fred B. Schneider Cornell University SnT Seminar University of Luxembourg September 9, 2010 Joint work with FBS at Cornell University. The title says it all. This work is about quantification of integrity. But in more detail, I mean…

Quantification of Integrity

Similar presentations

Presentation on theme: "Quantification of Integrity"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Quantification of Integrity

Similar presentations

Presentation on theme: "Quantification of Integrity"— Presentation transcript:

Similar presentations

About project

Feedback