Secure Query Processing in an Untrusted (Cloud) Environment.


1 Secure Query Processing in an Untrusted (Cloud) Environment

2 Agenda
Introduction to the model
Overview of different approaches

3 Model
Parties: Data owner / Users, and the Database Service Provider (SP), which stores the data in its database
The SP provides a professional database service
– Backup
– Performance
– …
Users get their data back from the SP for their own use
Example: "Find Alice’s record" returns
EmpID  HourlyRate  WorkingHour
2      40          36

4 Introduction: security concern
Data owner / Users: trusted party
Database Service Provider (SP): untrusted party, yet it holds the sensitive data
Objectives of our work:
(1) Protect sensitive data from being seen by untrusted parties (including the SP)
(2) Users still enjoy the database service from the SP
March 2009: Google Docs allowed unintended access to some private documents
June 2013: a Facebook bug leaked the contact info of 6 million users

5 Secure database system - overview
The data owner (DO) / user encrypts the data before sending it to the service provider (SP)
Plain table at the DO:
EmpID  HourlyRate  WorkingHour
1      30          23
2      40          36
Encrypted rows (EmpID, HourlyRate, WorkingHour) at the SP: 79826334164104547322019 and 5730923749300489226453
Q: SELECT * WHERE HourlyRate * WorkingHour > 900
Q is sent as Q’, a transformed query with some ‘trapdoors’ that help the SP compute the answer
The SP returns the encrypted row 5730923749300489226453, which the user decrypts to (2, 40, 36)

6 Approaches to solve the problem
Hardware-based solutions
– Trusted DB [SIGMOD 2011], Cipherbase [SIGMOD 2013]
Homomorphic-encryption-based solutions
– CryptDB [CACM 2012], MONOMI [PVLDB 2013]
Secure Multiparty Computation (SMC) approach
– ShareMind [PAISI 2012]
Our solution
Secure indexing approaches
– Orthogonal to the above solutions; can be integrated into any of them
– Domain partitioning [SIGMOD 2002]

7 Before we discuss different approaches…
The SP is assumed to be more powerful than the users.
Users are trusted. They can see plain data.
– A baseline solution exists
  Users retrieve the entire encrypted database
  Decrypt it, then do whatever they want
– Problems of the baseline solution
  High communication cost and high processing cost at users
What the different approaches are trying to do
– Delegate the query processing job to the SP
  Utilize the power of the SP
– Users obtain the (encrypted) query answers only
  Low communication cost and low processing cost at users
– We can always revert to the baseline method!

8 Hardware-based solutions
Use of a secure co-processor
Can store a key on it
– No party can observe the key stored on it
Provides an API for cryptographic operations using the stored key
Tamper resistance
– The device cannot be compromised through physical intrusion

9 Use of secure coprocessor
Parties: Users, and the SP with its database
One or more secure co-processors are installed at the SP side
Data: encrypted using a secret key before it is stored at the SP
The key is sent to the secure co-processor through a secure channel
Note: the key is known to the users and the secure co-processor only
Query ("Find Alice’s record"): the co-processor decrypts the records one by one and processes the query
Result: can be encrypted or plain. In this example, just return yes/no
Answer: the user decrypts the answer

10 Optimization strategies
Add more secure co-processors for parallel processing
Compute the part of the query that does not involve encrypted data on the DBMS first
– Example: SELECT * FROM T WHERE A>10 AND B<20
  If A is encrypted while B is not, the DBMS first processes the predicate B<20
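
The second strategy can be sketched in a few lines. This is only an illustration: the XOR "cipher", the `KEY` constant and the `decrypt_on_coprocessor` helper are hypothetical stand-ins for the secure co-processor, not part of any system named in the slides. The point is that filtering on the plain column B first reduces the number of decryptions the co-processor has to perform.

```python
# Hypothetical stand-ins: the XOR "cipher" is a placeholder, NOT real
# encryption, and decrypt_on_coprocessor() models the secure co-processor.
KEY = 0x5A
decrypt_calls = 0  # counts how often the co-processor is invoked


def encrypt(value):
    return value ^ KEY


def decrypt_on_coprocessor(ciphertext):
    global decrypt_calls
    decrypt_calls += 1
    return ciphertext ^ KEY


# Table T: column A is stored encrypted, column B is stored in plain.
T = [
    {"A": encrypt(5), "B": 12},
    {"A": encrypt(15), "B": 30},
    {"A": encrypt(25), "B": 7},
    {"A": encrypt(40), "B": 19},
]


def select_a_gt_10_and_b_lt_20(rows):
    # Step 1 (ordinary DBMS, no key needed): evaluate B < 20 on plain data.
    candidates = [row for row in rows if row["B"] < 20]
    # Step 2 (secure co-processor): decrypt A only for the surviving rows.
    return [row for row in candidates if decrypt_on_coprocessor(row["A"]) > 10]


result = select_a_gt_10_and_b_lt_20(T)
print(len(result), "matching rows,", decrypt_calls, "decryptions")
```

Without the reordering, all four rows would have to pass through the co-processor.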

11 Pros and cons
Pros
– Strong security protection as long as the secure co-processor is not compromised
– Can process any query
Cons
– Requires special hardware
– Expensive (prices in USD; data obtained on 7 Feb 2014)

12 Homomorphic-encryption-based solutions
Homomorphic encryption
– A special type of encryption that allows certain types of operations (on plain values) to be executed on encrypted values
Let E be an encryption function
– Homomorphic property: $E(f(x, y)) = g(E(x), E(y))$
– Examples
  RSA: $E(a) \cdot E(b) = E(a \cdot b)$
  OPES [SIGMOD 04]: $E(a) > E(b)$ iff $a > b$
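
The RSA example on this slide can be checked with a short sketch. The parameters below (p = 61, q = 53, e = 17) are toy values chosen for illustration and are not from the slides; this is textbook, unpadded RSA, which is what makes the multiplicative property hold, and it is not secure in practice.

```python
# Textbook (unpadded) RSA with toy parameters; illustrative only.
p, q = 61, 53
n = p * q                  # public modulus, 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)


def encrypt(m):
    return pow(m, e, n)


def decrypt(c):
    return pow(c, d, n)


a, b = 50, 23
# The party holding only ciphertexts multiplies them, with no key involved.
c_product = (encrypt(a) * encrypt(b)) % n

# E(a) * E(b) = E(a * b): the product of ciphertexts decrypts to a * b
# (as long as a * b is smaller than n).
print(decrypt(c_product))
```

The values 50 and 23 are the HourlyRate and WorkingHour of the first employee on the next slide, so the decrypted product is that employee's HR*WH.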

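13 Using homomorphic encryptions
Plain table at the users (HourlyRate and WorkingHour are sensitive):
EmpID  HourlyRate  WorkingHour
1      50          23
2      30          36
Encrypted versions of the table (EID, HR, WH) at the SP, by OPES and by RSA:
1ka6fjh3a45 2d2s2aAnm24
1Hj%345877 2Ks12#AA244
Query: SELECT HR*WH WHERE HR > 35
– HR > 35: the user sends E(35) by OPES (ejAAS) and the SP compares (>, <) on the OPES-encrypted values
– HR*WH: the SP multiplies the RSA-encrypted values and returns z%^#5, which decrypts to HR*WH = 1150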

14 Pros and cons
Pros
– Low overheads in query processing at the SP
  Example: just needs multiplication on RSA-encrypted data, without encryption or decryption
Cons
– Multiple encrypted versions of the same data may be needed
– Does not support composition of operations
  Without data interoperability
  Example: cannot compute A*B > C

15 Without data interoperability
RSA: $E_1(x) \cdot E_1(y) = E_1(x \cdot y)$
– Supports multiplication over encrypted data
OPES: $E_2(a) > E_2(b)$ if $a > b$
– Supports comparison over encrypted data
How to compute x*y > b over encrypted data?
– The two schemes operate on different spaces: the product is available as $E_1(a) = E_1(x \cdot y)$, but the comparison needs $E_2(a)$ and $E_2(b)$
– The user has to decrypt $E_1(a)$ and then encrypt it as $E_2(a)$

16 Secure Multiparty Computation (SMC) approach
Plain table at the users:
EID  HR  WH
1    50  23
2    30  36
Secret sharing: $v = v_1 + v_2 + v_3 \bmod 100$
Share tables (EID, HR, WH) held by SP #1, SP #2 and SP #3, one table per SP:
(1, 60, 28), (2, 31, 18)
(1, 40, 56), (2, 0, 5)
(1, 50, 39), (2, 99, 13)
Each SP can’t derive the plain value v by having one share $v_i$ only
Query: SELECT EID, WH WHERE HR > 35
By exchanging some information (may involve multiple rounds), the result can be computed securely
Result shares (EID, WH) returned by the SPs: (2, 18), (2, 13), (2, 5)
Result at the users: (EID, WH) = (2, 36)
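
A minimal sketch of the additive secret sharing used on this slide, with the modulus 100 and the share values taken from the slide. The `share` and `reconstruct` helper names are illustrative, and drawing the first two shares uniformly at random is an assumption about how shares are generated.

```python
import secrets

MOD = 100  # modulus used on the slide


def share(v, parties=3, mod=MOD):
    """Split v into additive shares: all but the last are random."""
    shares = [secrets.randbelow(mod) for _ in range(parties - 1)]
    shares.append((v - sum(shares)) % mod)
    return shares


def reconstruct(shares, mod=MOD):
    """Recombine the shares: v = v1 + v2 + v3 mod 100."""
    return sum(shares) % mod


# Shares from the slide, one value per SP.
hr_row1, wh_row1 = [60, 40, 50], [28, 56, 39]
hr_row2, wh_row2 = [31, 0, 99], [18, 5, 13]

print(reconstruct(hr_row1), reconstruct(wh_row1))  # plain row 1: HR, WH
print(reconstruct(hr_row2), reconstruct(wh_row2))  # plain row 2: HR, WH
```

Any single share is a uniformly distributed number in 0..99, which is why one SP alone learns nothing about v.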

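17 Example: addition protocol
Operation: z = x + y, with every value shared as $v = v_1 + v_2 + v_3 \bmod n$
SP #1 holds $x_1, y_1$; SP #2 holds $x_2, y_2$; SP #3 holds $x_3, y_3$
Each SP computes
$s_1 = x_1 + y_1 - r_1$
$s_2 = x_2 + y_2 - r_2$
$s_3 = x_3 + y_3 - r_3$
After exchanging the $s_i$, the SPs obtain their shares of z
$z_1 = s_3 + r_1$
$z_2 = s_1 + r_2$
$z_3 = s_2 + r_3$
so that $z_1 + z_2 + z_3 = x + y$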
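
The protocol on this slide can be simulated in one process as follows. It is a sketch under two assumptions that the slide does not state explicitly: each $r_i$ is a random mask picked by SP #i, and each $s_i$ is passed to the next SP in a ring (SP #1 to SP #2, SP #2 to SP #3, SP #3 to SP #1), which is what the index pattern of the $z_i$ suggests.

```python
import secrets


def share(v, n, parties=3):
    """Additive sharing: v = v1 + v2 + v3 mod n."""
    shares = [secrets.randbelow(n) for _ in range(parties - 1)]
    shares.append((v - sum(shares)) % n)
    return shares


def add_protocol(x_shares, y_shares, n):
    """Return shares of z = x + y, following the s_i / z_i steps."""
    k = len(x_shares)
    # Assumption: every SP picks its own random mask r_i.
    r = [secrets.randbelow(n) for _ in range(k)]
    # Local step at SP i: s_i = x_i + y_i - r_i.
    s = [(x_shares[i] + y_shares[i] - r[i]) % n for i in range(k)]
    # SP i receives s_{i-1} from its predecessor (s[-1] wraps around to
    # the last SP) and adds its own mask: z_1 = s_3 + r_1, z_2 = s_1 + r_2, ...
    return [(s[i - 1] + r[i]) % n for i in range(k)]


n = 100
x, y = 23, 36
z_shares = add_protocol(share(x, n), share(y, n), n)
z = sum(z_shares) % n
print(z)
```

The masks cancel in the sum, so the result is correct regardless of the random choices, while each forwarded $s_i$ is hidden by $r_i$.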

18 Pros and cons
Pros
– Theoretically supports any computation
– Usually low processing cost at the SPs
  Most protocols do not need cryptographic operations
Cons
– High communication costs between SPs
  Multiple rounds of communication
– The SPs must not collude
– 3 times the cost due to multiple SPs

19 Our solution
2-party secret sharing between the users and the SP
Share table at the users (a table of pseudo-random numbers):
Row-id  X       Y
$r_1$   $x_{1a}$  $y_{1a}$
$r_2$   $x_{2a}$  $y_{2a}$
…       …       …
Share table at the SP:
Row-id    X       Y
$E(r_1)$  $x_{1b}$  $y_{1b}$
$E(r_2)$  $x_{2b}$  $y_{2b}$
…         …       …
Row-ids are encrypted by some existing encryption method
Without knowing the shares at the users, the SP can’t recover the plain data
Storing the share table incurs a high storage overhead to users
Instead, the users keep a column key for each column ($ck_X$, $ck_Y$) and the row-ids $r_1$, $r_2$, …

20 The actual storage at both sides
Plain data:
A  B
2  3
4  1
Shares at the users:
Row key  A   B
1        8   8
2        32  29
Shares at the SP:
Row key  A   B
E(1)     9   31
E(2)     22  29
$v = v_1 v_2 \bmod n$, with n = 35
Users only remember the column keys of A and B (each contains two values)
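
The numbers on this slide can be checked directly against $v = v_1 v_2 \bmod n$. The sketch below also shows one way the SP-side share could be produced from a plain value and a user-side share, namely by multiplying with the modular inverse of the user share. That `split` step is an inference from the recovery formula, not something the slide states, and the helper names are illustrative.

```python
N = 35  # modulus from the slide


def split(v, user_share, n=N):
    """SP-side share v2 with user_share * v2 = v (mod n).

    Assumption: obtained via the modular inverse, which requires
    gcd(user_share, n) == 1 (Python 3.8+ for pow with exponent -1).
    """
    return (v * pow(user_share, -1, n)) % n


def recover(user_share, sp_share, n=N):
    """Slide formula: v = v1 * v2 mod n."""
    return (user_share * sp_share) % n


# Values from the slide, indexed by row key.
user_shares = {1: {"A": 8, "B": 8}, 2: {"A": 32, "B": 29}}
sp_shares = {1: {"A": 9, "B": 31}, 2: {"A": 22, "B": 29}}

plain = {
    row: {col: recover(user_shares[row][col], sp_shares[row][col]) for col in "AB"}
    for row in user_shares
}
print(plain)
```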

21 Operation on our encrypted data
Operation: C = A+B
The users hold the column keys of A and B; the SP holds the encrypted values:
Row key  A   B
E(1)     9   31
E(2)     22  29
Similar to SMC, there will be some communication between the user and the SP
But the communication is uni-directional (only user -> SP)
Some ‘hints’ are sent to the SP to help it compute the operation
The SP computes $C_e = A' + B'$:
Row key  $C_e$
E(1)     20
E(2)     5

22 Retrieving the data
Query: SELECT C WHERE A * B + D > 20
At the user: table schema (A, B, C, D) and column keys
Encrypted values at the SP:
Row-id  A  B  C  D
E(1)    …  …  …  …
E(2)    …  …  …  …
Find the answers:
Row-id  Match?
E(1)    No
E(2)    Yes
E(6)    No
E(4)    No
…       …
Projection on C only; the encrypted answer is sent back to the user (row-ids must be there):
Row key  C
E(2)     3
E(16)    12
…        …

23 Decrypting the result
Query: SELECT C WHERE A * B + D > 20
At the user: table schema (A, B, C, D) and column keys
Encrypted answers:
Row key  C
E(2)     3
E(16)    12
…        …
The user computes its own item keys:
Row key  C
2        31
16       17
…        …
Decrypt with $v = v_1 v_2 \bmod n$, n = 35:
C
23
29
…

24 Features of our algorithms
The user operates with column keys only; the share table is never computed during query processing
– Low processing cost at the user and low communication cost
Data interoperability
– Allows composition of operations
– Supports a wide range of queries

25 With data interoperability
How to compute x+y > b over encrypted data?
– Addition: $E(x) + E(y) = E(a) = E(x+y)$
– Comparison on the same encrypted values: $E(a) > E(b)$
Other examples: $(x_1 - x_2)^2 + (y_1 - y_2)^2$ can be computed using addition and multiplication only

26 Features of our design
Uni-directional communication (user -> SP)
– One round of communication for all operations
Allows operations between plain and encrypted data
– Encrypting everything is not suggested
  Processing on encrypted data incurs overheads
– Queries may be composed of both plain and encrypted data
– Example: SELECT * WHERE A*B > C, where A is encrypted, B and C are not

27 Cons
Incurs a high processing cost at the SP, due to massive cryptographic operations
Still under development
– Currently focuses on integer-type data
– Query plan optimization

28 END.

29 ADDITIONAL MATERIALS

30 Secure item key generator
INPUT: row key r, column key (m, x)
– All are kept private
System parameters: n, g
– Selected by the DO; n is public, g is not
Generation function: $v_k = m g^{xr} \bmod n$
Security:
– Extension of the RSA function
– Even if an attacker observes several item keys, it is computationally hard to derive the secret parameters and hence other item keys
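
The generation function can be sketched as below, using the toy parameters n = 35 and g = 2 from the illustrations that follow. The column keys (m, x) = (2, 2) for A and (1, 3) for B are read off the worked hint formulas on slide 34; the `item_key` function name is illustrative.

```python
def item_key(m, x, r, g=2, n=35):
    """Item key for row key r under column key (m, x): m * g^(x*r) mod n."""
    return (m * pow(g, x * r, n)) % n


# Column keys as they appear in the worked example on slide 34.
col_key_A = (2, 2)
col_key_B = (1, 3)

for r in (1, 2):
    print(r, item_key(*col_key_A, r), item_key(*col_key_B, r))
```

The resulting item keys are exactly the user-side shares shown on slide 20, which is why the user needs to remember only the column keys and can regenerate any share on demand.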

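31 Illustration 1: Multiplication of 2 columns
Operation: C = AB, with n = 35, g = 2
Plain data:
Row  A  B
1    2  3
2    4  1
At the DO: table schema and column keys of A and B
Encrypted values at the SP:
Row  $A_e$  $B_e$
1    9      31
2    22     29
The SP computes $C_e = A_e B_e$:
Row  $C_e$
1    34
2    8
Result: the DO computes the item keys of C and decrypts
Row  Item key of C  C
1    29             6
2    18             4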
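
The illustration can be replayed end to end. The column keys (2, 2) for A and (1, 3) for B are taken from the worked formulas on slide 34, and deriving the column key of C as $(m_a m_b, x_a + x_b)$ follows from the decryption step in the proof on the next slide; the variable and function names are illustrative.

```python
N, G = 35, 2  # n and g from the slide


def item_key(m, x, r):
    """Item key m * g^(x*r) mod n for row key r."""
    return (m * pow(G, x * r, N)) % N


col_A, col_B = (2, 2), (1, 3)
# Column key of the product column: multiply the m parts, add the x parts.
col_C = ((col_A[0] * col_B[0]) % N, col_A[1] + col_B[1])

# Encrypted values stored at the SP, indexed by row key.
A_e = {1: 9, 2: 22}
B_e = {1: 31, 2: 29}

# SP side: multiply the encrypted values, no key material needed.
C_e = {r: (A_e[r] * B_e[r]) % N for r in A_e}

# DO side: regenerate the item keys of C and decrypt.
C_keys = {r: item_key(*col_C, r) for r in C_e}
C = {r: (C_keys[r] * C_e[r]) % N for r in C_e}
print(C_e, C_keys, C)
```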

32 Proof of correctness
The DO holds the column keys of A, B and C; for row E(r) the SP holds a', b' and computes $C_e = a'b'$
We have
$a = m_a g^{r x_a} a'$
$b = m_b g^{r x_b} b'$
Decryption on C:
$m_a m_b g^{r(x_a + x_b)} (a'b') = (m_a g^{r x_a} a')(m_b g^{r x_b} b') = ab$

33 Illustration 2
Addition C=A+B
– Example: SELECT * WHERE salary + bonus > 40,000
Preparation stage
– We add a constant column S to the plain database
– S is encrypted, i.e., the DO keeps a column key of S and the SP keeps a column of encrypted values
Plain data before the preparation stage:
A  B
2  3
4  1
Plain data with the constant column S:
A  B  S
2  3  1
4  1  1

34 Plain data and operation C = A + B
A  B  S  C
2  3  1  5
4  1  1  5
Storage at both sides: the DO keeps the column keys of A, B, S and C; the SP keeps the encrypted values
Row key  $A_e$  $B_e$  $S_e$
E(1)     9      31     8
E(2)     22     29     4
The DO gives hints to the SP:
$p_A = x_s^{-1} (x_c - x_a) \bmod \Phi(n)$, here $p_A = 13^{-1} \cdot (5-2) \bmod 24 = 15$
$p_B = x_s^{-1} (x_c - x_b) \bmod \Phi(n)$, here $p_B = 13^{-1} \cdot (5-3) \bmod 24 = 2$
$q_A = m_a m_S^{p_A} m_C^{-1} \bmod n$, here $q_A = 2 \cdot 11^{15} \cdot 4^{-1} \bmod 35 = 18$
$q_B = m_b m_S^{p_B} m_C^{-1} \bmod n$, here $q_B = 1 \cdot 11^{2} \cdot 4^{-1} \bmod 35 = 4$
The SP computes the encrypted answers:
$A' = q_A A_e S_e^{p_A}$, $B' = q_B B_e S_e^{p_B}$, $C_e = A' + B'$
Row key  A'  B'  $C_e$
E(1)     29  26  20
E(2)     4   1   5
Item keys of C at the DO:
Row key  C
1        23
2        1
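
The whole addition example can be reproduced with the numbers on this slide. The column keys (m, x) are read off the worked hint formulas: A = (2, 2), B = (1, 3), S = (11, 13), C = (4, 5), with n = 35, g = 2 and Φ(n) = 24. The function and variable names are illustrative, and the code is a sketch of the arithmetic only.

```python
N, G, PHI = 35, 2, 24  # n, g and Phi(n) from the slide

# Column keys (m, x) kept by the DO.
col = {"A": (2, 2), "B": (1, 3), "S": (11, 13), "C": (4, 5)}

# Encrypted values kept by the SP, indexed by row key.
A_e = {1: 9, 2: 22}
B_e = {1: 31, 2: 29}
S_e = {1: 8, 2: 4}


def hint(name):
    """DO side: compute the hint (p, q) for one operand column."""
    m, x = col[name]
    m_s, x_s = col["S"]
    m_c, x_c = col["C"]
    p = (pow(x_s, -1, PHI) * (x_c - x)) % PHI           # p = x_s^-1 (x_c - x)
    q = (m * pow(m_s, p, N) * pow(m_c, -1, N)) % N      # q = m m_S^p m_C^-1
    return p, q


p_A, q_A = hint("A")
p_B, q_B = hint("B")

# SP side: uses only the hints and its own encrypted values.
C_e = {}
for r in A_e:
    a_prime = (q_A * A_e[r] * pow(S_e[r], p_A, N)) % N
    b_prime = (q_B * B_e[r] * pow(S_e[r], p_B, N)) % N
    C_e[r] = (a_prime + b_prime) % N

# DO side: item key of C is m_c * g^(x_c * r) mod n, then decrypt.
m_c, x_c = col["C"]
C = {r: (m_c * pow(G, x_c * r, N) * C_e[r]) % N for r in C_e}
print((p_A, q_A), (p_B, q_B), C_e, C)
```

Only the four hint values travel from the DO to the SP, which matches the uni-directional, single-round communication claimed for the design.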

35 Proof of correctness
The DO holds the column keys of A, B, S and C; for row E(r) the SP holds a', b', s' and computes c'
We have
$a = m_a g^{r x_a} a'$
$b = m_b g^{r x_b} b'$
$1 = m_s g^{r x_s} s' \Rightarrow s' = m_s^{-1} g^{-r x_s}$
The hints are
$p_A = x_s^{-1} (x_c - x_a) \bmod \Phi(n)$, $p_B = x_s^{-1} (x_c - x_b) \bmod \Phi(n)$
$q_A = m_a m_s^{p_A} m_c^{-1} \bmod n$, $q_B = m_b m_s^{p_B} m_c^{-1} \bmod n$
Following the procedure ($A' = q_A A_e S_e^{p_A}$, $B' = q_B B_e S_e^{p_B}$, $C_e = A' + B'$), we have
$c' = (q_A a' s'^{p_A}) + (q_B b' s'^{p_B})$
$c' = (m_a m_s^{p_A} m_c^{-1}) a' s'^{p_A} + (m_b m_s^{p_B} m_c^{-1}) b' s'^{p_B}$
Since $s'^{p_A} = (m_s^{-1} g^{-r x_s})^{p_A} = m_s^{-p_A} g^{-r x_s p_A}$ (and likewise for $p_B$),
$c' = (m_a m_c^{-1}) a' g^{-r x_s p_A} + (m_b m_c^{-1}) b' g^{-r x_s p_B}$
$c' = (m_a m_c^{-1}) a' g^{-r(x_c - x_a)} + (m_b m_c^{-1}) b' g^{-r(x_c - x_b)}$
$c' = m_c^{-1} g^{-r x_c} (m_a g^{r x_a} a' + m_b g^{r x_b} b')$
$c' = m_c^{-1} g^{-r x_c} (a + b)$
Decryption on c':
$m_c g^{r x_c} c' = a + b$

