International Technology Alliance in Network & Information Sciences Knowledge Inference for Securing and Optimizing Secure Computation Piotr (Peter) Mardziel,

Presentation transcript:

International Technology Alliance in Network & Information Sciences
Knowledge Inference for Securing and Optimizing Secure Computation
Piotr (Peter) Mardziel, Michael Hicks, Aseem Rastogi, Matthew Hammer, Jonathan Katz (UMD), Mudhakar Srivatsa (IBM TJ Watson)
With Towsley et al. (UMass), Kasturi Rangan (UCLA)
Annual Meeting of the ITA, October 2013

Sharing between coalition domains is critical for mission success
[Figure: coalition scenario with a scout (Coalition A), a supporting force (Coalition B), an unmanned air vehicle (UAV) (Coalition A), a back-office data analyst (Coalition A), satellite communications backhaul (Coalition A), and a mixed force (Coalition A, C); data items are labeled X, X’, Y, and Z.]

ITA technologies facilitate sharing
• ITA has developed many excellent technologies for sharing information
  – Gaian DB
  – Information fabric
  – Controlled English Store
• All harness information and make it available to coalition partners
  – Provide a query or pub/sub interface
• But: there may be risk in sharing all information
  – Might like to allow some queries but not others: reject a query if it would reveal too much information about the raw data, or if a sequence of queries would do so even though no single query would

Our research: Knowledge inference
• Key idea: use program analysis (of the query) to understand what the answer reveals about sensitive information to a (rational) recipient
• We call this analysis knowledge inference
• We have used knowledge inference in a variety of applications

Summary of Results (outline)
• Knowledge-based security [CSF’11, NIPSPP’12, JCS’13, HOTNETS’13]
  – Enforce a security policy based on the adversary’s (accumulated) knowledge
  – Implementation and experimental evaluation
  – Proof of soundness: will never underestimate adversary knowledge
• Knowledge-based security for SMC [PLAS’12]
  – Adapt knowledge inference to consider multiple parties’ secrets
  – Proof of soundness
• Optimizing SMC [PLAS’13]
  – Identify inferrable values by knowledge inference; do not bother to compute these using SMC
  – Leads to 30x speedup
  – Proof of correctness of technique

Papers on ITACS
• [JCS’13] Piotr Mardziel, Stephen Magill, Mike Hicks and Mudhakar Srivatsa, Dynamic Enforcement of Knowledge-based Security Policies, Journal of Computer Security, Feb’12.
• [NIPSPP’12] Piotr Mardziel and Kasturi Rangan, Probabilistic Computation for Information Security, NIPS Probabilistic Programming Workshop, Dec’12.
• [PLAS’12] P. Mardziel, M. Hicks, J. Katz and M. Srivatsa, Knowledge-Oriented Secure Multiparty Computation, Programming Languages and Analyses for Security, June’12.
• [PLAS’13] Aseem Rastogi, Piotr Mardziel, Michael Hicks and Matthew Hammer, Knowledge Inference for Optimizing Secure Multi-party Computation, Programming Languages and Analyses for Security, June’13.
• [HOTNETS’13] Z. Shafiq, F. Le, M. Srivatsa and D. Towsley, Cross-Path Inference Attacks on Multipath TCP, ACM HotNets, July’13.

Knowledge about the world
• Learning about the world from observations
Prior belief: 0.5 : Today = not-raining, 0.5 : Today = raining
Model: weather maps Today to Outlook
Observation: Outlook = sunny
Inference gives the revised belief: 0.82 : Today = not-raining, 0.18 : Today = raining

Knowledge about secrets
• Characterize adversary knowledge.
[Figure: a system maps Secret to Public Output. From the observation Public Output = “login failed”, inference yields a belief over secret values: … 0.01 : Secret = … ; … : Secret = 43 …]

Levels of knowledge?
• Characterize system as safe vs. unsafe.
[Figure: beliefs over the secret (entries beginning 0.05, 0.02, and 0.01, plus the fully revealing belief 1.00 : Secret = 42) placed on a safe/unsafe scale by inference and by approximate inference.]

Soundness of knowledge
• Soundly approximate level of knowledge.
[Figure: the same beliefs (entries beginning 0.05, 0.02, and 0.01, plus 1.00 : Secret = 42) on the safe/unsafe scale, comparing actual inference with sound approximate inference.]

Technology: probabilistic programming
• Programs
  – whose inputs and outputs may be distributions rather than values
  – which may contain uses of probabilistic choice
• Effectively represent an algorithmic description of a probabilistic model
  – conditional probability distribution relating inputs and outputs
Pr [ Outlook = sunny | Today = not-raining ] = 0.9
CODE
  weather(today) {
    if (today == "not-raining") {
      if (flip 0.9) return "sunny" else return "overcast"
    } else if (today == "raining") {
      if (flip 0.8) return "overcast" else return "sunny"
    }
  }
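The inference step on the weather slides can be reproduced by hand with Bayes' rule. The sketch below is only an illustration, not the authors' tool: it reads the likelihoods off the `weather` program and assumes a 0.5/0.5 prior over `Today` (the names `LIKELIHOOD` and `posterior` are made up for this example).

```python
from fractions import Fraction

# Pr[Outlook | Today], read off the weather program on the slide.
LIKELIHOOD = {
    "not-raining": {"sunny": Fraction(9, 10), "overcast": Fraction(1, 10)},
    "raining": {"sunny": Fraction(2, 10), "overcast": Fraction(8, 10)},
}

def posterior(prior, observed_outlook):
    """Bayes' rule: Pr[Today | Outlook] is proportional to Pr[Outlook | Today] * Pr[Today]."""
    joint = {today: p * LIKELIHOOD[today][observed_outlook] for today, p in prior.items()}
    total = sum(joint.values())
    return {today: p / total for today, p in joint.items()}

# Assumed prior: both kinds of day equally likely.
prior = {"not-raining": Fraction(1, 2), "raining": Fraction(1, 2)}
belief = posterior(prior, "sunny")
print({today: round(float(p), 2) for today, p in belief.items()})
# {'not-raining': 0.82, 'raining': 0.18}
```

Under these assumptions the revised belief matches the 0.82 / 0.18 shown on the "Knowledge about the world" slide.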

Knowledge-based security
• Maintain a representation of each querier’s belief about the secret’s possible values
  – Belief ≜ probability distribution
• Each query result revises the belief; reject if the actual secret becomes too likely
  – Bayesian reasoning to revise belief
• Cannot let rejection defeat our protection.
[Figure: timeline of queries Q1, Q2, Q3, …, each either answered (OK) or rejected (Reject).]

Policy = knowledge threshold
• Answer a query if, for the querier’s revised belief, Pr[my secret] < t
  – Call t the knowledge threshold
• Choice of t depends on the risk of revelation
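A minimal sketch of a threshold policy over an explicitly enumerated belief, assuming deterministic queries and a small finite secret domain. The function names are hypothetical, and the decision rule (check every output the querier might see, not just the actual one) is one conservative way to keep a rejection from depending on the real secret; the talk does not spell out the exact rule.

```python
from fractions import Fraction

def revise(belief, query, output):
    """Condition the belief on the event query(secret) == output."""
    kept = {s: p for s, p in belief.items() if query(s) == output}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}

def is_safe(belief, query, t):
    """Accept only if, for every output the querier might see, no secret
    value reaches probability t. The check never looks at the actual
    secret, so a rejection by itself reveals nothing about it."""
    outputs = {query(s) for s, p in belief.items() if p > 0}
    return all(max(revise(belief, query, out).values()) < t for out in outputs)

def answer(belief, query, secret, t):
    """Return (response, new_belief); the response is None on rejection."""
    if not is_safe(belief, query, t):
        return None, belief
    out = query(secret)
    return out, revise(belief, query, out)

# Hypothetical setup: secret in 0..9, uniform initial belief, t = 1/2.
belief = {s: Fraction(1, 10) for s in range(10)}
secret, t = 7, Fraction(1, 2)
r1, belief = answer(belief, lambda s: s >= 5, secret, t)   # answered
r2, belief = answer(belief, lambda s: s == 7, secret, t)   # rejected
print(r1, r2, sorted(belief))
# True None [5, 6, 7, 8, 9]
```

The second query is rejected because one of its possible answers would pin the secret down exactly, whatever the actual secret happens to be.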

αProb: Implementation (CSF’11, JCS’13)
• Queries are simple imperative programs
• Approach: abstract interpretation for implementing probabilistic operations. Building blocks:
  – lattice point enumeration
  – integer programming
• Key idea: abstract interpretation is sound
  – Never underestimates the knowledge
  – But may overestimate it: this improves audit time, but may reject some legal queries
• Application to sensor networks, location [NIPSPP’12]
  – Gave demo earlier in the week
• Application to MPTCP [HOTNETS’13]

Current activity: Modeling time/change
• Secrets can change over time.
• In progress: formal model, theorems about knowledge of both the stream of secrets and the delta function
Pr [ Secret 1 = 42 ] = 1.0
Pr [ Secret 2 = 42 | Secret 1 = 42 ] = …
CODE
  delta(secret 1) {
    if (flip 0.9) return secret 1
    else return (uniform 0,255)
  }
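As a rough illustration of how a belief about Secret 1 turns into a belief about Secret 2, the `delta` program can be evaluated exactly. This sketch assumes `uniform 0,255` draws an integer from 0 to 255 inclusive (256 values); `delta_distribution` and `step` are names invented for the example.

```python
from fractions import Fraction

def delta_distribution(secret1, lo=0, hi=255, keep=Fraction(9, 10)):
    """Exact output distribution of delta(secret1): with probability `keep`
    the secret is unchanged, otherwise it is redrawn uniformly from lo..hi
    (inclusive bounds are an assumption about `uniform 0,255`)."""
    n = hi - lo + 1
    dist = {v: (1 - keep) / n for v in range(lo, hi + 1)}
    dist[secret1] += keep
    return dist

def step(belief):
    """Push a belief about Secret 1 through delta to get a belief about Secret 2."""
    out = {}
    for s1, p1 in belief.items():
        for s2, p2 in delta_distribution(s1).items():
            out[s2] = out.get(s2, Fraction(0)) + p1 * p2
    return out

belief1 = {42: Fraction(1)}   # Pr[Secret 1 = 42] = 1.0
belief2 = step(belief1)
print(belief2[42], float(belief2[42]))
# 461/512 0.900390625
```

Under the inclusive-range assumption, an adversary who knew Secret 1 exactly still assigns probability 0.9 + 0.1/256 to Secret 2 having the same value.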

Other activities
• Expand expressiveness, improve performance
  – Model continuous distributions, not just discrete ones
  – Employ other forms of approximation
• More applications
  – Multipath TCP flows
  – Sensor networks
  – Mobility

Joint computations over secrets
• Rather than asymmetric queries, may want to compute joint results
  – Coalitions each have sensor networks; use them to answer queries while hiding details
  – Coalitions perform joint mission planning; staff mission without knowing total resources
[Figure: two parties hold x and y; Q = some function; the joint result Q(x,y) is, for example, “attack at dawn”.]

Secure multiparty computation
• Multiple parties have secrets to protect.
• Want to compute some function over their secrets without revealing them.
Q = if x ≥ y then out := True else out := False
[Figure: parties holding x and y obtain Q(x,y) = True / False.]

Secure multiparty computation
• Use trusted third party.
Q = if x ≥ y then out := True else out := False
[Figure: the parties send x and y to a trusted third party T, which returns Q(x,y) = True.]

Secure multiparty computation
• SMC lets the participants compute this without a trusted third party.
Q = if x ≥ y then out := True else out := False
[Figure: the parties compute Q(x,y) = True among themselves; T is no longer involved.]

Secure multiparty computation
• Nothing is learned beyond what is implied* by the query output.
Q = if x ≥ y then out := True else out := False
[Figure: parties holding x and y obtain Q(x,y) = True / False.]

Secure multiparty computation
• Nothing is learned beyond what is implied* by the query output.
  – * what is implied can be a lot
Q = if x ≥ y then out := True else out := False
[Figure: A holds x, B holds y = 2, and Q(x,2) = False. x = ?]

Secure multiparty computation
• Nothing is learned beyond what is implied* by the query output.
  – * what is implied can be a lot
Q = if x ≥ y then out := True else out := False
[Figure: A holds x, B holds y = 2, and Q(x,2) = False. x = 1]

Secure multiparty computation
• Nothing is learned beyond what is implied* by the query output.
  – * what is implied can be a lot
Q = if x ≥ y then out := True else out := False
[Figure: A holds x, B holds y = 3, and Q(x,3) = False. x = ?]

Secure multiparty computation
• Nothing is learned beyond what is implied* by the query output.
  – * what is implied can be a lot
Q = if x ≥ y then out := True else out := False
[Figure: A holds x, B holds y = 3, and Q(x,3) = False. x ∈ {1,2}]

Secure multiparty computation
• Nothing is learned beyond what is implied* by the query output.
  – * what is implied can be a lot
Q = if x ≥ y then out := True else out := False
[Figure: A holds x, B holds y = ∞, and Q(x,∞) = False. x = ?]

Secure multiparty computation
• Nothing is learned beyond what is implied* by the query output.
  – * what is implied can be a lot
Q = if x ≥ y then out := True else out := False
[Figure: A holds x, B holds y = ∞, and Q(x,∞) = False. x ≥ 1]
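The three cases above (y = 2, y = 3, y = ∞) can be checked by enumerating which values of x remain consistent with the output B sees. This is a small sketch that assumes x is an integer between 1 and 10; the slides only imply x ≥ 1, and `consistent_secrets` is a hypothetical helper.

```python
def Q(x, y):
    # The query from the slides: out := (x >= y)
    return x >= y

def consistent_secrets(domain, y, observed):
    """All values of x that B cannot rule out after seeing Q(x, y) == observed."""
    return [x for x in domain if Q(x, y) == observed]

domain = range(1, 11)  # assumption: x is an integer in 1..10

print(consistent_secrets(domain, 2, False))             # [1]
print(consistent_secrets(domain, 3, False))             # [1, 2]
print(consistent_secrets(domain, float("inf"), False))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

How much the output implies therefore depends on B's own input: y = 2 reveals x exactly, while y = ∞ reveals nothing beyond the domain.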

Knowledge-based security for SMC (PLAS’12)
• Results (details in paper):
  – Adapt knowledge inference to SMC setting
  – Enforce threshold-based policies, using two techniques: belief sets and SMC-based belief tracking
  – Proof that our methods are sound (never underapproximate adversary knowledge)
• Implementation not sufficiently performant for use on-line

Goal: Make SMC more performant (PLAS’13)
• SMC is an appealing technology, but it is very slow
  – Implementation based on “garbled circuits”
  – Several orders of magnitude slower than normal computation
• Recent work has developed general methods to improve SMC performance
  – Circuit-level optimizations
  – Pipelining circuit generation and execution (increases parallelism and decreases memory)
  – But: ultimately SMC is always going to be much slower than normal computation
• Idea: use knowledge inference to find opportunities to replace SMC with normal computation in particular programs, with no loss to security

Example: Joint Median Computation
Inputs: { A1, A2 } and { B1, B2 }
Assume: A1 < A2 and B1 < B2 and Distinct(A1, A2, B1, B2)
  a = A1 ≤ B1;
  b = a ? A2 : A1;
  c = a ? B1 : B2;
  d = b ≤ c;
  output = d ? b : c;
Can show that Alice and Bob can infer a and d

Secure Computation
  a = A1 ≤ B1;
  b = a ? A2 : A1;
  c = a ? B1 : B2;
  d = b ≤ c;
  output = d ? b : c;
Knowledge leads to optimized protocol

Median Example: Analysis from Bob’s Perspective
  a = A1 ≤ B1;
  b = a ? A2 : A1;
  c = a ? B1 : B2;
  d = b ≤ c;
  output = d ? b : c;
Cases: A1 ≤ B1 ∧ A2 ≤ B1; A1 ≤ B1 ∧ A2 > B1; A1 > B1 ∧ A2 ≤ B1; A1 > B1 ∧ A2 > B1
d = ( output ≠ B1 ∧ output ≠ B2 ) (Recall: Distinct(A1, A2, B1, B2))
a = ( output ≤ B1 ) (Recall: B1 < B2)

Formalization of Knowledge
Party p knows x if x can be uniquely determined by p’s inputs I and outputs O.
Equivalently: two program executions that agree on I and O also agree on x.

Knowledge in Median Example
Let states σ map program variables to values.
  a = A1 ≤ B1;
  b = a ? A2 : A1;
  c = a ? B1 : B2;
  d = b ≤ c;
  output = d ? b : c;
Bob knows a if, for all final states σ1 and σ2 such that σ1[B1] = σ2[B1], σ1[B2] = σ2[B2], and σ1[output] = σ2[output], we have σ1[a] = σ2[a].
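The definition above can be tested by brute force on a small domain. The sketch below is not the algorithm from the paper, which works symbolically. It enumerates all final states of the median program for inputs drawn from 1..6 (an arbitrary choice) and checks whether any two states that agree on a party's inputs and the output also agree on a given variable. The helper names are hypothetical.

```python
from itertools import combinations, permutations

def median_program(A1, A2, B1, B2):
    """Run the median program and return its final state."""
    a = A1 <= B1
    b = A2 if a else A1
    c = B1 if a else B2
    d = b <= c
    output = b if d else c
    return {"A1": A1, "A2": A2, "B1": B1, "B2": B2,
            "a": a, "b": b, "c": c, "d": d, "output": output}

def final_states(values):
    """All final states for inputs with A1 < A2, B1 < B2, all four distinct."""
    states = []
    for four in combinations(values, 4):
        for A1, A2, B1, B2 in permutations(four):
            if A1 < A2 and B1 < B2:
                states.append(median_program(A1, A2, B1, B2))
    return states

def knows(states, observed, var):
    """True if any two states agreeing on `observed` also agree on `var`."""
    seen = {}
    for s in states:
        key = tuple(s[v] for v in observed)
        if seen.setdefault(key, s[var]) != s[var]:
            return False
    return True

states = final_states(range(1, 7))
bob = ("B1", "B2", "output")
print(knows(states, bob, "a"), knows(states, bob, "d"), knows(states, bob, "b"))
# True True False
```

On this domain Bob can infer a and d but not b, which is consistent with the claim that a and d need not be hidden by the secure computation.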

Results (details in paper)
• Make the previous definition into an algorithm by using an idea called self-composition
  – Allows us to create a formula whose satisfiability determines whether a variable is known
  – Can give this formula to an SMT solver
  – Result: implementation and proof of correctness (sound and relatively complete)
• We have also developed an algorithm that is constructive
  – Computes a formula that witnesses knowledge of the variable
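One common way to phrase such a check, given here as a sketch and not necessarily the exact encoding used in the paper, is to compose the program with a renamed copy of itself and ask a solver whether the two copies can disagree on x. The symbol \varphi_P(\sigma), standing for "σ is a final state of program P", is notation introduced for this illustration.

```latex
% Knowledge of x by party p with inputs I and outputs O:
\[
  \mathrm{Knows}_p(x) \iff
  \forall \sigma_1, \sigma_2.\;
  \bigl(\varphi_P(\sigma_1) \wedge \varphi_P(\sigma_2) \wedge
        \sigma_1[I] = \sigma_2[I] \wedge \sigma_1[O] = \sigma_2[O]\bigr)
  \Rightarrow \sigma_1[x] = \sigma_2[x]
\]
% Self-composed query for an SMT solver (the negation of the above):
\[
  \varphi_P(\sigma_1) \wedge \varphi_P(\sigma_2) \wedge
  \sigma_1[I] = \sigma_2[I] \wedge \sigma_1[O] = \sigma_2[O] \wedge
  \sigma_1[x] \neq \sigma_2[x]
\]
```

With this encoding, an unsatisfiable query means p knows x, and a satisfying assignment is a pair of executions that p cannot tell apart yet that differ on x.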

Ongoing work
• Building SMC compiler
  – Novel programming language for expressing mixed-mode multiparty computation (M3PC): a combination of joint and local computations
  – Will employ knowledge-inference optimization to transform SMC programs to M3PC programs
  – Developing novel back end based on garbled circuits (standard mechanism) and oblivious RAM

Summary
• Research agenda based on knowledge inference
  – Determining what a party can learn about a secret given a run of a program
  – Can use this for enforcing security, and optimizing computation
• Ongoing work continues this agenda
  – Time-varying secrets
  – New applications (greater expressiveness)
  – New computational platform

BACKUP

Expressibility
• Prior work [CSF’11, JCS’13] supported limited language features.
  – distributions: piecewise bounds over discrete domains
  – possible but inconvenient to express other distributions
[Figure: discrete distributions with upper bounds and lower bounds.]

Expressibility: continuous distributions
• Continuous distributions for modeling real-world processes.

Polynomial approximation
• Improve precision by polynomial bounds (as opposed to constant).

Scales better than enumeration
[Figure: bday1 small vs. bday1 large. Each prior is a range of the form 0 ≤ bday ≤ …, … ≤ byear ≤ … (the one surviving bound is byear ≤ 1992), with each point equally likely.]

Performance/precision tradeoff
[Figure: birthday query 1+2+special.]

Intervals very fast generally
[Table: times in seconds. Columns: Query, Intervals, Octagons, Polyhedra. Rows: Bday1 (small), Bday1+2 (small), Bday1+2+spec, Bday1 (large), Bday1+2 (large), Bday1+2+spec, Pizza, Photo, Travel.]
All achieve maximum precision when given unlimited polyhedra

LattE is the performance bottleneck

Merging order matters for precision
• Each point represents a different merging order for the given bound
• Median precision point depicted as a box
• Semi-interquartile range given in gray
• Best precision possible is at the very bottom (about 3.8 * …)

Knowledge-based security for SMC (PLAS’12)
• Approach:
  – Adapt knowledge inference to SMC setting
  – Develop means to enforce threshold-based policies
• Knowledge inference:
  – Each party A knows his own secret and estimates what others know about it; A also estimates something about each other party’s secret
  – Goal: define how a query result revises each party’s belief, and each party’s estimate of other beliefs about his secret
• Threshold security:
  – Using knowledge inference, accept/reject query based on inferred knowledge (of others)

THE MINOR INCONVENIENCE
What you learn depends on your secret.

Secure multiparty computation
• Knowledge depends on secret.
Q = if x ≥ y then out := True else out := False
[Figure: Peter holds x, the audience holds y. Q(x,?) = False; annotations: x ∈ {1,…,42}, y = ?]

Secure multiparty computation
• Knowledge depends on secret.
• Knowledge leaks information about secret.
Q = if x ≥ y then out := True else out := False
[Figure: Peter holds x, the audience holds y. Q(x,?) = False; annotations: x ∈ {1,…,42}, y = ?]

Secure multiparty computation
• Knowledge depends on secret.
• Knowledge leaks information about secret.
Q = if x ≥ y then out := True else out := False
[Figure: Peter holds x, the audience holds y. Q(x,?) = False; annotations: x ∈ {1,…,42}, y = 43]

THE MEDIOCRE INCONVENIENCE
I am not allowed to know what knowledge you attain.

It gets worse…
• Your knowledge depends on your secret.
• Your knowledge leaks information about your secret.
• My knowledge-based policy depends on your knowledge.
• My policy decision leaks information about your knowledge.
• Therefore: my policy decision leaks information about your secret.

THE BIG INCONVENIENCE
I am not allowed to know whether my knowledge-based policy permits you to see the output of a query.

[Figure: Peter and the audience.] I give up

THE MINOR IDEA
Option 1: Be (very) conservative

Approach 1: Belief sets
• Generalize the asymmetric case
  – Party A estimates B’s knowledge as an enumeration: B knows his own secret perfectly, and knows A’s secret imperfectly. Thus: enumerate all possible values of B’s secret according to A’s estimation
  – Perform knowledge inference for each element of the enumeration
[Figure: A holds x, B holds y. The belief set holds one belief δ per possible value y = 1, y = 2, y = 3, …; the query Q1(x,y) revises each δ to δ’.]

Belief sets: assessment
• Pros
  – Straightforward generalization, can reuse existing technology
• Cons
  – Conservative: threshold security rejects in the worst (rare) case
  – Expensive: for N parties with size-M beliefs, we must try N*M possibilities
[Figure: A holds x, B holds y. The belief set holds one belief δ per possible value y = 1, y = 2, y = 3, …; the query Q1(x,y) revises each δ to δ’.]
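A minimal sketch of the belief-set idea for two parties, under several assumptions made only for this example: x is an integer in 1..10, B's belief about x is uniform for every candidate y, the query is the earlier comparison, and the threshold is 1/2. It deliberately ignores the question of whether A's accept/reject decision itself leaks information. All names are hypothetical.

```python
from fractions import Fraction

def Q1(x, y):
    return x >= y   # example query, borrowed from the earlier slides

def revise(belief, y, output):
    """B's revised belief about x after seeing Q1(x, y) == output."""
    kept = {x: p for x, p in belief.items() if Q1(x, y) == output}
    total = sum(kept.values())
    return {x: p / total for x, p in kept.items()}

def worst_case_knowledge(x, belief_set):
    """A's estimate: for each y that B might hold (with belief delta about x),
    compute B's revised belief and return the highest probability that B
    could end up assigning to A's actual secret x."""
    worst = Fraction(0)
    for y, delta in belief_set.items():
        delta_prime = revise(delta, y, Q1(x, y))
        worst = max(worst, delta_prime.get(x, Fraction(0)))
    return worst

uniform = {v: Fraction(1, 10) for v in range(1, 11)}
belief_set = {y: dict(uniform) for y in (1, 2, 3)}   # candidates y=1, y=2, y=3
x, t = 2, Fraction(1, 2)
worst = worst_case_knowledge(x, belief_set)
print(worst, worst < t)
# 1/2 False
```

Here only the candidate y = 3 pushes B's belief up to the threshold, yet the query is rejected outright, which illustrates why the approach is described as conservative.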

Approach 2: knowledge inference in SMC
[Figure: timeline (TIME) of queries Q1(x,y) and Q2(x,y) between Peter (x) and the audience (y), with responses NOPE and okay, the result Q1(x,y) = true, and beliefs δ1, δ2 revised to δ1’, δ2’, δ2’’.]

SMC like a trusted third party
[Figure: the same timeline with a trusted third party T between Peter (x) and the audience (y); queries Q1(x,y) and Q2(x,y), responses True and NOPE, beliefs δ1, δ2 revised to δ1’, δ2’, δ2’’.]

Answer may depend on secret
[Figure: the same timeline with T between Peter (x) and the audience (y); queries Q1(x,y) and Q2(x,y), responses True and NOPE, beliefs δ1, δ2 revised to δ1’, δ2’, δ2’’.]

SMC belief tracking: assessment
• Pros
  – More precise: SMC can know each party’s secrets exactly
• Cons
  – Party not allowed to know others’ outcomes, so knowledge estimate is incomplete (kept by pseudo-PT)
  – Expensive: SMC is bad enough without adding probabilistic programming. Supporting it is still a matter of research