1
How to Keyword-Search Securely in Cloud Storage Service Kaoru Kurosawa Ibaraki University, Japan ICISC 2014, Dec. 3-5, Chung-Ang University, Korea
2
Cloud storage services are now available: Amazon S3 and Cloud Drive (Amazon), Google Drive (Google), OneDrive (Microsoft), iCloud (Apple), Dropbox, and many more.
3
We know that we should store our documents encrypted. But then we cannot even do keyword search.
4
A Searchable Symmetric Encryption (SSE) scheme solves this problem. It consists of a store phase and a search phase. 4
5
In the store phase, the client stores the encrypted files (or documents) E(D1), ⋯, E(DN) and the encrypted Index E(Index) on the server.
6
In the search phase, the client sends an encrypted keyword E(keyword) to the server.
7
The server somehow returns the encrypted files E(D3), E(D6), E(D10) which contain the keyword.
8
So the client can retrieve the encrypted files which contain a specific keyword, while keeping the keyword secret from the server.
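As a rough picture, the interface an SSE scheme exposes to the client looks like the sketch below (Python, with names such as `store` and `search` chosen here for illustration; they are not from any particular scheme):

```python
# Client-side view of a generic SSE scheme: store once, then search by keyword
# while revealing only an encrypted token to the server.
from typing import Protocol

class SSEClient(Protocol):
    def store(self, documents: dict[int, bytes], index: dict[str, list[int]]) -> None:
        """Encrypt the documents and the index and upload them to the server."""
    def search(self, keyword: str) -> list[bytes]:
        """Send an encrypted token for `keyword`; return the matching encrypted files."""
```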
9
SSE has been studied by D. Song, D. Wagner, A. Perrig (2000); Eu-Jin Goh (2003); Golle, Staddon, Waters (2004); Y. Chang and M. Mitzenmacher (2005); Curtmola, Garay, Kamara and Ostrovsky (2006); Peishun Wang, Huaxiong Wang, Josef Pieprzyk (2008); Kamara, Papamanthou and Roeder (2012); Cash, Jarecki, Jutla, Krawczyk, Rosu, Steiner (2013); Cash and Tessaro (2014).
10
In this talk: (1) UC-Secure Searchable Symmetric Encryption; (2) How to Update Documents Verifiably in Searchable Symmetric Encryption; (3) Garbled Searchable Symmetric Encryption.
11
First UC-Secure Searchable Symmetric Encryption, Kaoru Kurosawa and Yasuhiro Ohtaki (FC 2012) 11
12
By a passive attack, a malicious server tries to break privacy: she tries to learn the keyword and the documents.
13
By an active attack, a malicious server tries to break reliability: she tries to forge or delete some files, or replace E(D3) with another file E(D100).
14
Curtmola, Garay, Kamara and Ostrovsky (2006) gave a rigorous definition of security against passive attacks (privacy). They also presented a scheme which satisfies their definition.
15
At FC 2012 we studied and proved that Privacy + Reliability = UC security. (Privacy was defined by Curtmola et al.; reliability and UC security are from our paper.)
16
Curtmola et al. showed an SSE scheme as follows. Consider the following Index: Austin appears in D3, D6, D10; Boston in D8, D10; Washington in D1, D4, D8.
17
The client first constructs E(Index) as follows. He chooses a pseudorandom permutation π; E(Index) is an array whose addresses are π(1), π(2), π(3), …
18
He next computes π(Austin, 1), π(Austin, 2) and π(Austin, 3), and writes the indexes (3, 6, 10) at these addresses of E(Index).
19
Do the same for each keyword: for example, the indexes (8, 10) are written at addresses π(Boston, 1) and π(Boston, 2).
20
In the store phase, the client stores this E(Index) and the ciphertext of each file on the server.
21
In the search phase, the client sends the trapdoor t(Austin) = (π(Austin, 1), π(Austin, 2), π(Austin, 3)).
22
The server looks up these addresses in E(Index) and finds the corresponding indexes (3, 6, 10),
23
and returns E(D3), E(D6), E(D10).
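A minimal sketch of this construction, assuming HMAC-SHA256 stands in for the pseudorandom permutation π (so addresses are PRF outputs rather than a true permutation):

```python
# Curtmola-style encrypted index: document numbers stored at addresses pi(keyword, i).
import hmac, hashlib

def addr(key: bytes, keyword: str, i: int) -> bytes:
    """pi(keyword, i): pseudorandom address derived from the keyword and a counter."""
    return hmac.new(key, f"{keyword}|{i}".encode(), hashlib.sha256).digest()

def build_index(key: bytes, index: dict[str, list[int]]) -> dict[bytes, int]:
    """Store phase: write document number j at address pi(keyword, i)."""
    e_index = {}
    for keyword, doc_ids in index.items():
        for i, j in enumerate(doc_ids, start=1):
            e_index[addr(key, keyword, i)] = j
    return e_index

def trapdoor(key: bytes, keyword: str, max_docs: int) -> list[bytes]:
    """Search phase (client): t(keyword) = (pi(keyword,1), ..., pi(keyword,max))."""
    return [addr(key, keyword, i) for i in range(1, max_docs + 1)]

def search(e_index: dict[bytes, int], t: list[bytes]) -> list[int]:
    """Search phase (server): return the document numbers found at the queried addresses."""
    return [e_index[a] for a in t if a in e_index]

key = b"client-secret-key"
e_index = build_index(key, {"Austin": [3, 6, 10], "Boston": [8, 10]})
print(search(e_index, trapdoor(key, "Austin", 3)))   # [3, 6, 10]
```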
24
This scheme Is secure against passive attacks. But it is not secure against active attacks. 24
25
This scheme Is secure against passive attacks. But it is not secure against active attacks. We will show how to make this scheme verifiable. 25
26
A naive approach is to add a MAC to each E(Di): the server returns the files E(D3), E(D6), E(D10) together with their MACs MAC(E(D3)), MAC(E(D6)), MAC(E(D10)).
27
But a malicious server can replace some pair with another pair of (file, MAC), say (E(D100), MAC(E(D100))).
28
The client cannot detect this cheating, because (E(D100), MAC(E(D100))) is itself a valid (file, MAC) pair.
29
In our verifiable scheme, we include π(Austin, 1) in the input of the MAC. So the server returns E(D3) together with Tag3 = MAC(π(Austin, 1), E(D3)).
30
This method works because the MAC now authenticates the whole communication: the query π(Austin, 1) as well as the returned ciphertext E(D3).
31
In the store phase, the client writes such MAC values in E(Index): at address π(Austin, 1) he stores (3, tag3 = MAC(π(Austin, 1), E(D3))), at π(Austin, 2) he stores (6, tag6 = MAC(π(Austin, 2), E(D6))), and at π(Austin, 3) he stores (10, tag10 = MAC(π(Austin, 3), E(D10))).
32
For a query π(Austin, 1), the server returns E(D3) and tag3.
33
The client then checks that tag3 = MAC(π(Austin, 1), E(D3)) holds for the returned E(D3).
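A minimal sketch of this tag computation and check, assuming HMAC-SHA256 as the MAC and the `addr()` function from the previous sketch as π(keyword, i):

```python
# Verifiable variant: the tag binds the server's answer to the query address.
import hmac, hashlib

def make_tag(mac_key: bytes, address: bytes, ciphertext: bytes) -> bytes:
    """tag = MAC(pi(keyword, i), E(D_j))."""
    return hmac.new(mac_key, address + ciphertext, hashlib.sha256).digest()

def client_verify(mac_key: bytes, address: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """The client recomputes the MAC over (address, ciphertext) and compares."""
    return hmac.compare_digest(tag, make_tag(mac_key, address, ciphertext))

# A server that swaps in a (ciphertext, tag) pair made for a *different* address
# is caught, because the address is part of the MAC input.
```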
34
We next consider the definition of security. The security against active attacks consists of privacy and reliability. We define privacy similarly to Curtmola et al. as follows.
35
Minimum leakage: in the store phase, from E(D1), ⋯, E(DN) and E(Index), the server learns |D1|, …, |DN| and |{keywords}|.
36
In the search phase, for t(keyword) the server returns C(keyword) = (E(D3), E(D6), E(D10)) and Tag. This means that the server learns the corresponding indexes {3, 6, 10}.
37
We call this information, |D1|, …, |DN|, |{keywords}| and the corresponding indexes {3, 6, 10}, the minimum leakage.
38
The Privacy definition requires that the server should not be able to learn any more information 38
39
The Privacy definition requires that the server should not be able to learn any more information To formulate this, we consider a real game and a simulation game 39
40
In the real game, the distinguisher sends D = {D1, …, DN}, W = {set of keywords} and Index to the challenger, who returns C = {E(D1), ⋯, E(DN)} and I = E(Index).
41
In the search phase, the distinguisher sends a keyword and the challenger returns t(keyword).
42
This is repeated for further keywords.
43
Finally, the distinguisher outputs a bit b = 0 or 1.
44
In the simulation game, the distinguisher sends D = {D1, …, DN}, W and Index to the challenger, who forwards only the minimum leakage |D1|, …, |DN| and |{keywords}| to a simulator. The simulator somehow computes C = {E(D1), ⋯, E(DN)} and I = E(Index), which are returned to the distinguisher.
45
In the search phase, the distinguisher sends a keyword; the simulator receives only the minimum leakage {3, 6, 10} and somehow computes t(keyword).
46
This is repeated for further keywords.
47
Finally, the distinguisher outputs a bit b = 0 or 1.
48
We say that Privacy is satisfied if there exists a simulator such that the real game ≈ the simulation game 48
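In symbols, this can be written as follows (λ denotes the security parameter; the notation is added here for illustration):

```latex
% Privacy: some PPT simulator fools every PPT distinguisher D.
\exists\,\mathrm{Sim}\;\;\forall\,\mathrm{PPT}\ D:\quad
\bigl|\Pr[D = 1 \text{ in the real game}] - \Pr[D = 1 \text{ in the simulation game}]\bigr|
\le \mathrm{negl}(\lambda)
```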
49
This definition of privacy was given by Curtmola et al. But it looks artificial: who is the distinguisher?
50
The server? No. The client? No. (Neither party of the protocol plays the distinguisher in the game above.)
51
This question will be resolved when we consider UC security: from the viewpoint of UC security, this is a very natural definition of privacy. We will come back to this point later.
52
Next, reliability. The client sends t(keyword), and the honest server returns C(keyword) = {E(D3), E(D6), E(D10)} and Tag.
53
We say that Reliability is satisfied if no server can forge (C(keyword)*, Tag*) such that C(keyword)* ≠ C(keyword) 53
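A minimal formalization, again with notation added here for illustration:

```latex
% Reliability: no PPT malicious server can make the client accept a wrong answer.
\forall\,\mathrm{PPT}\ \tilde{S}:\quad
\Pr\!\left[\tilde{S}\ \text{outputs}\ (C(\mathit{keyword})^{*},\mathrm{Tag}^{*})\ \text{with}\
C(\mathit{keyword})^{*}\neq C(\mathit{keyword})\ \text{and the client accepts}\right]
\le \mathrm{negl}(\lambda)
```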
54
By the way, even if a protocol Σ is secure stand-alone, it may not be secure if Σ is executed concurrently, or if Σ is a part of a larger protocol.
55
Universal Composability (UC) is a framework which guarantees that a protocol Σ remains secure even if it is executed concurrently, and even if it is a part of a larger protocol.
56
The notion of UC was introduced by Canetti. He proved that UC-security is maintained under a general protocol composition. 56
57
We formulated the UC security of verifiable SSE schemes. To do so, we defined the ideal functionality F_vSSE as follows.
58
In the ideal world, the environment Z gives D = {D1, …, DN}, W = {set of keywords} and Index to the dummy client.
59
The dummy client relays them to the ideal functionality F_vSSE.
60
F_vSSE keeps them,
61
and sends the minimum leakage |D1|, …, |DN| and |{keywords}| to the UC adversary S.
62
In the search phase, Z gives a keyword to the dummy client.
63
The dummy client relays it to F_vSSE.
64
F_vSSE sends the minimum leakage {3, 6, 10} to the UC adversary S.
65
The UC adversary S returns Accept or Reject to F_vSSE.
66
Suppose S returns Reject.
67
Then F_vSSE sends Reject to the dummy client.
68
The dummy client relays it to Z.
69
Suppose instead S returns Accept.
70
Then F_vSSE sends {D3, D6, D10} to the dummy client.
71
The dummy client relays them to Z.
72
This is an ideal world because: (Correctness) the dummy client receives {D3, D6, D10} correctly or outputs Reject; (Security) the UC adversary S learns only the minimum leakage.
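A toy model of F_vSSE in Python may make the two properties concrete (method names and the adversary interface are chosen here for illustration, not taken from the paper):

```python
# Toy ideal functionality F_vSSE: it stores the plaintext data and leaks only
# the minimum leakage to the UC adversary S.
class IdealFunctionalityVSSE:
    def __init__(self, adversary):
        self.adversary = adversary          # the UC adversary S
        self.docs, self.index = None, None

    def store(self, docs: dict[int, bytes], keywords: set[str], index: dict[str, list[int]]):
        self.docs, self.index = docs, index
        # leak only the document lengths and the number of keywords
        self.adversary.leak_store([len(d) for d in docs.values()], len(keywords))

    def search(self, keyword: str):
        ids = self.index.get(keyword, [])
        # leak only which document numbers match, then let S decide Accept/Reject
        if self.adversary.leak_search(ids) == "Reject":
            return "Reject"
        return [self.docs[i] for i in ids]   # the correct answer, by construction
```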
73
Further, S can corrupt the dummy server.
74
Also, Z can interact with S freely.
75
Z finally outputs 0 or 1.
76
In the real world, the environment Z gives D = {set of documents}, W = {set of keywords} and Index to the client.
77
Then the client and the server run the store phase.
78
In the search phase, Z gives a keyword to the client.
79
The client and the server run the search phase.
80
Then the client sends D3, D6, D10 to Z.
81
An adversary A can corrupt the server.
82
Further, Z can interact with A freely.
83
Z finally outputs 0 or 1.
84
We say that a verifiable SSE scheme is UC-secure if for any adversary A, there exists a UC adversary S such that the real world ≈ the ideal world.
85
Equivalence (our theorem): a verifiable SSE scheme is UC-secure if and only if it satisfies privacy and reliability. Here we consider non-adaptive adversaries.
86
Proof. In the real world, Z gives D, W and the Index (Austin → D3, D6, D10; Boston → D8, D10; Washington → D1, D4, D8) to the client.
87
The client sends these ciphertexts, E(D1), …, E(D10) and E(Index), to the server.
88
Suppose that the adversary A corrupts the server.
89
Then A sends these ciphertexts E(D1), …, E(D10), E(Index) to Z.
90
Recall the real game of privacy: the distinguisher sends D, W, Index and the challenger returns C = {E(D1), ⋯, E(DN)} and I = E(Index).
91
In the UC framework, let the client play the role of the challenger and let the environment Z play the role of the distinguisher.
92
Then this is equivalent to the real game of privacy.
93
In the ideal world, Z gives D, W, Index to the dummy client, which relays them to F_vSSE; the UC adversary S receives only |D1|, …, |DN| and |{keywords}|.
94
S must be able to send E(D1), …, E(D10), E(Index) to Z from this information alone.
95
Recall the simulation game of privacy: the simulator receives only the minimum leakage |D1|, …, |DN| and |{keywords}| and somehow computes C = {E(D1), ⋯, E(DN)} and I = E(Index).
96
In the UC framework, let Z play the role of the distinguisher, let the dummy client and F_vSSE play the role of the challenger, and let the UC adversary S play the role of the simulator.
97
Then this is equivalent to the simulation game of privacy.
98
The proof of the equivalence proceeds in this way. 98
99
The proof of the equivalence proceeds in this way. At first glance, the definition of privacy looked artificial, but as we have seen, it is very natural from the viewpoint of UC security.
100
Lesson: SSE is a good example for understanding the notion of UC security.
101
Theorem: our scheme satisfies privacy and reliability if E is CPA-secure and the MAC is unforgeable.
102
Corollary: our scheme is UC-secure.
103
Next: How to Update Documents Verifiably in Searchable Symmetric Encryption, Kaoru Kurosawa and Yasuhiro Ohtaki (CANS 2013).
104
Kamara, Papamanthou and Roeder (2012) showed a dynamic SSE scheme such that the client can add, delete and modify the documents. However, their scheme is not verifiable.
105
Our contribution (verifiable / dynamic): Curtmola et al.: no / no; our FC 2012 scheme: yes / no; Kamara et al.: no / yes; our scheme of CANS 2013: yes / yes.
106
First we show a more efficient SSE scheme than Curtmola et al., and a more efficient verifiable SSE scheme than our FC 2012 scheme.
107
Consider this example, where each keyword has a bit vector over (D1, D2, D3, D4, D5): Austin = (10101), Boston = (01010), Washington = (11100).
108
In our SSE scheme, the client computes a table whose columns are E(D1), …, E(D5) and whose rows are labeled PRF(Austin), PRF(Boston), PRF(Washington) with entries (10101), (01010), (11100), where PRF means a pseudorandom function,
109
and adds a mask to each row: (10101) + PRF'(Austin), (01010) + PRF'(Boston), (11100) + PRF'(Washington).
110
The client stores this table on the server.
111
In the search phase, the client sends PRF(Austin) and PRF'(Austin) to the server.
112
The server decrypts the masked row using PRF'(Austin) and obtains (10101),
113
and returns E(D1), E(D3) and E(D5).
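A minimal sketch of this masked-row index, assuming HMAC-SHA256 stands in for both PRF and PRF' and bitwise XOR stands in for the '+' that adds the mask:

```python
# Toy bit-vector SSE: row label = PRF(w), row value = bits XOR PRF'(w).
import hmac, hashlib

def prf(key: bytes, label: str, nbytes: int = 32) -> bytes:
    return hmac.new(key, label.encode(), hashlib.sha256).digest()[:nbytes]

def mask(bits: list[int], pad: bytes) -> list[int]:
    # XOR each bit with one bit derived from the pad (stand-in for "+ PRF'(w)")
    return [b ^ ((pad[i // 8] >> (i % 8)) & 1) for i, b in enumerate(bits)]

K1, K2 = b"key-for-PRF", b"key-for-PRF-prime"
index = {"Austin": [1, 0, 1, 0, 1], "Boston": [0, 1, 0, 1, 0], "Washington": [1, 1, 1, 0, 0]}

# Store phase: the server receives {PRF(w): bits XOR PRF'(w)}.
table = {prf(K1, w): mask(bits, prf(K2, w)) for w, bits in index.items()}

# Search phase: the client sends PRF(w) and PRF'(w); the server unmasks the row.
def search(table, row_label: bytes, pad: bytes) -> list[int]:
    return [i + 1 for i, b in enumerate(mask(table[row_label], pad)) if b == 1]

print(search(table, prf(K1, "Austin"), prf(K2, "Austin")))   # [1, 3, 5] -> return E(D1), E(D3), E(D5)
```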
114
In our verifiable SSE scheme, the client stores this table together with Tag_A = MAC(PRF(Austin), E(D1), E(D3), E(D5)), Tag_B = MAC(PRF(Boston), E(D2), E(D4)) and Tag_W = MAC(PRF(Washington), E(D1), E(D2), E(D3)).
115
That is, the tags are stored alongside the table, where Tag_A = MAC(PRF(Austin), E(D1), E(D3), E(D5)) and so on.
116
In the search phase, the client sends PRF(Austin) and PRF'(Austin), and the server returns E(D1), E(D3), E(D5) and Tag_A.
117
The client accepts E(D1), E(D3), E(D5) if Tag_A = MAC(PRF(Austin), E(D1), E(D3), E(D5)).
118
Theorem: the above verifiable SSE scheme satisfies privacy and reliability if E is CPA-secure, PRF and PRF' are pseudorandom functions, and the MAC is unforgeable.
119
Now suppose that the client wants to modify D1 to D′1, where D1 contains Austin and Washington.
120
Therefore, in the update phase the client must update E(D1), Tag_A and Tag_W.
121
We want to do this more efficiently. In the proposed scheme, we break this part, (PRF(Austin), E(D1), E(D3), E(D5)), down into (PRF(Austin), 1, 3, 5), (1, E(D1)), …, (5, E(D5)).
122
The client authenticates each piece separately: (PRF(Austin), 1, 3, 5), (1, E(D1)), …, (5, E(D5)).
123
The last problem is how to timestamp these pieces (1, E(D1)), …, (5, E(D5)). Remember that the client wants to update files.
124
We can solve this problem by using any authentication scheme which has the timestamp functionality, such as a Merkle hash tree, an authenticated skip list, or an RSA accumulator (used in this talk).
125
Let x1 = H(1, E(D1)), x2 = H(2, E(D2)), x3 = H(3, E(D3)), x4 = H(4, E(D4)), x5 = H(5, E(D5)). For simplicity, suppose that x1, …, x5 are primes. Then the client computes and keeps A = g^(x1·x2·x3·x4·x5) mod N, where N = pq.
126
In the search phase, the client sends PRF(Austin) and PRF'(Austin), and the server returns (1, E(D1)), (3, E(D3)), (5, E(D5)), Tag1 = MAC(PRF(Austin), 1, 3, 5), and the witness y = g^(x2·x4) mod N.
127
The client verifies that Tag1 = MAC(PRF(Austin), 1, 3, 5) and that A = y^(x1·x3·x5) mod N.
128
Indeed, y^(x1·x3·x5) = g^(x1·x2·x3·x4·x5) mod N = A when the server is honest.
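A toy numeric sketch of this accumulator check (the modulus, generator and the "primes" xi below are made up for illustration; a real instantiation uses a large RSA modulus and a hash-to-prime function H):

```python
# Toy RSA accumulator: A = g^(x1*...*x5) mod N; the server proves membership of
# {x1, x3, x5} with witness y = g^(x2*x4) mod N; the client checks A == y^(x1*x3*x5).
from math import prod

N = 3233 * 4463          # stand-in modulus (a real scheme uses a large N = p*q)
g = 7                    # stand-in generator
x = {1: 101, 2: 103, 3: 107, 4: 109, 5: 113}   # stand-ins for primes H(i, E(D_i))

A = pow(g, prod(x.values()), N)                # kept by the client

# Server side: witness for the returned pieces {1, 3, 5}
returned = [1, 3, 5]
y = pow(g, prod(x[i] for i in x if i not in returned), N)

# Client side: recompute x_i = H(i, E(D_i)) from the returned pieces and check
assert A == pow(y, prod(x[i] for i in returned), N)

# Update (modify D_1): per the slides, the client derives x1' = H(1, E(D_1')),
# updates A to A' = g^(x1'*x2*...*x5) mod N, and sends only (1, E(D_1')) to the server.
```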
129
In the update phase, to modify D1 to D1′, the client sends only (1, E(D1′)) to the server.
130
He then updates A to A′ = g^(x1′·x2·x3·x4·x5) mod N, where x1′ = H(1, E(D1′)) and N = pq.
131
To delete D1, modify D1 to D1′ = "delete".
132
How to add files Please see the paper.
133
We defined the UC security of verifiable dynamic SSE schemes
134
We then proved that The proposed scheme is UC-secure against non-adaptive adversaries under the strong RSA assumption if – E is CPA-secure – PRF and PRF’ are pseudorandom functions – H is a collision-resistant hash function
135
Finally Garbled Searchable Symmetric Encryption Kaoru Kurosawa (FC 2014) 135
136
So far, I have talked about single keyword search SSE schemes.
137
Next I will talk about multiple keyword search SSE schemes.
138
Golle, Staddon and Waters (2004) showed a multiple-keyword SSE scheme which has keyword fields (From, To, Subject). For example: D1 = (Keyword 1, Keyword 2, Keyword 4), D2 = (Keyword 2, Keyword 1, Keyword 5), D3 = (Keyword 3, Keyword 2, Keyword 6).
139
In their scheme, a client can specify at most one keyword in each keyword field.
140
In such a scheme, however, it is hard to retrieve files which contain both Alice and Bob somewhere in the keyword fields, e.g. D1 = (Alice, Bob, Keyword 4) and D2 = (Bob, Keyword 5, Alice).
141
Wang et al. (2008) showed a keyword-field-free SSE scheme, but it works only for AND search.
142
Cash et al. (CRYPTO 2013) showed a keyword-field-free SSE scheme which can support any search formula (in the random oracle model).
143
However, the search formula is revealed to the server and the search phase requires 2 rounds. (Wang et al.: only AND, 1 round, no formula secrecy; Cash et al.: any formula, 2 rounds, no formula secrecy.)
144
At FC 2014, I showed an SSE scheme such that even the search formula is kept secret.
145
Also, it can support any search formula, and the search phase requires only 1 round.
146
The proposed SSE scheme is based on Yao’s garbled circuit.
147
Yao (1982) constructed a secure two-party protocol using a garbled circuit and oblivious transfer: Alice with input x and Bob with input y jointly compute f(x, y).
148
Since then, garbled circuits have found many applications: multi-party secure protocols, one-time programs, KDM-security, verifiable computation, homomorphic computations and others.
149
The proposed scheme is the first application of garbled circuits to SSE
150
A garbled circuit of f is an encoding garble(f) such that one can compute f(X) from garble(f) and label(X) without learning anything about f and X.
151
However, if garble(f) or label(X) is reused, then some information about (f, X) is leaked.
152
Recently, Goldwasser et al. constructed a scheme such that garble(f) can be reused. I constructed a scheme such that label(X) can be reused, and applied it to multiple-keyword SSE.
153
High-level overview of the proposed scheme. Consider this example with keywords (w1, w2, w3) and files D1, D2, where D1 has bit vector (111) and D2 has (100).
154
Let X1 = (111) be the bit vector of D1 and X2 = (100) be the bit vector of D2.
155
The client computes label(X1) for D1 and label(X2) for D2.
156
The client also computes a table with column labels PRF(w1), PRF(w2), PRF(w3) and rows (E(D1), label(X1)) and (E(D2), label(X2)),
157
and sends this table to the server.
158
In the 1st search phase, suppose that the client wants to search on f(w1, w2, w3) = w1 ⋀ w2 ⋀ w3. He computes garbled circuits of f: Γ1 for D1 and Γ2 for D2.
159
The client sends PRF(w1), …, PRF(w3), Γ1, Γ2 and counter = 1.
160
The server has the stored table with columns PRF(w1), PRF(w2), PRF(w3) and rows (E(D1), label(X1)), (E(D2), label(X2)).
161
The server computes f(X1) = 1 from counter = 1, label(X1) and the garbled circuit Γ1.
162
Similarly she computes f(X2) = 0 from counter = 1, label(X2) and Γ2.
163
Since f(X1) = 1 and f(X2) = 0, the server returns E(D1).
164
In the 2nd search phase, suppose that the client wants to search on g(w1, w2, w3) = w1 ⋁ w2 ⋁ w3. He computes garbled circuits of g: Δ1 for D1 and Δ2 for D2.
165
The client sends PRF(w1), …, PRF(w3), Δ1, Δ2 and counter = 2.
166
The server computes g(X1) = g(X2) = 1, and returns E(D1), E(D2).
167
Note that label(X1) is reused for Γ1 and Δ1,
168
and label(X2) is reused for Γ2 and Δ2.
169
More details: Bellare et al. (2012) defined garbling schemes and input-circuit privacy; Kurosawa (2014) extended them to extended garbling schemes and label-reusable privacy.
170
The difference is that counter is included in the extended GC generation algorithm (eGC.gen) and in the extended GC evaluation algorithm (eGC.eval)
171
This is a Boolean circuit f with input wires 1-4 and gates XOR, AND, OR.
172
This is the topological circuit f⁻ (the same wiring with input wires 1-4, but with the gate types hidden).
173
The Label.gen algorithm chooses 2 random strings (vi^0, vi^1) for each input wire i such that the lsbs are different: lsb(vi^0) ≠ lsb(vi^1).
174
label(0000) is the vector (v1^0, v2^0, v3^0, v4^0).
175
label(1111) is the vector (v1^1, v2^1, v3^1, v4^1).
176
The eGC.gen algorithm takes a Boolean circuit f, all the strings (vi^0, vi^1), and the counter,
177
and outputs a garbled circuit Γ.
178
The eGC.eval algorithm takes the topological circuit f⁻, the garbled circuit Γ, the counter, and for example label(0011) = (v1^0, v2^0, v3^1, v4^1),
179
and outputs f(0,0,1,1).
180
Label-reusable privacy (informal): even if label(x1, …, xn) = (v1^{x1}, …, vn^{xn}) is reused for multiple garbled circuits Γ1, Γ2, …, no information about (x1, …, xn) and (f1, f2, …) is leaked, where Γi is a garbled circuit of fi.
181
Our construction of the extended garbling scheme which satisfies label-reusable privacy is the same as the usual construction of a garbling scheme, except that the counter is included in the input of the hash function H.
182
For simplicity, consider a single gate f(x1, x2). Each input wire has two labels: (v1^0, v1^1) and (v2^0, v2^1).
183
The eGC.gen algorithm computes y00 = H(counter, v1^0, v2^0) ⊕ f(0,0), y01 = H(counter, v1^0, v2^1) ⊕ f(0,1), y10 = H(counter, v1^1, v2^0) ⊕ f(1,0), y11 = H(counter, v1^1, v2^1) ⊕ f(1,1).
184
Note that the hash value H(counter, ·, ·) works as a one-time pad for each f(a, b).
185
Roughly speaking, the garbled circuit Γ is a random permutation of (y00, …, y11).
186
More precisely, construct this table with rows (lsb(v1^0), lsb(v2^0), y00), (lsb(v1^0), lsb(v2^1), y01), (lsb(v1^1), lsb(v2^0), y10), (lsb(v1^1), lsb(v2^1), y11).
187
If lsb(v1^0) = 0, then the 1st column is (0, 0, 1, 1).
188
If lsb(v2^0) = 1, then the 2nd column is (1, 0, 1, 0), so the rows are (0,1,y00), (0,0,y01), (1,1,y10), (1,0,y11).
189
Then permute the rows so that the first two columns read (00), (01), (10), (11) in order.
190
The rows become (00, y01), (01, y00), (10, y11), (11, y10); the garbled circuit Γ consists of these 4 bits (y01, y00, y11, y10) in this order.
191
The eGC.eval algorithm takes the counter, label(11) = (v1^1, v2^1) and the garbled circuit Γ.
192
Since lsb(v1^1) = 1 and lsb(v2^1) = 0, it looks at the 3rd row of Γ (the row indexed by 10), which contains y11 = H(counter, v1^1, v2^1) ⊕ f(1,1).
193
Then it computes f(1,1) = y11 ⊕ H(counter, v1^1, v2^1) from the given inputs.
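A toy single-gate implementation of eGC.gen and eGC.eval, assuming SHA-256 stands in for the hash function H and 16-byte random labels:

```python
# Toy single-gate extended garbling: the counter is part of the hash input,
# so the same labels can be reused with fresh counters.
import os, hashlib

def H_bit(counter: int, v1: bytes, v2: bytes) -> int:
    """One output bit of H(counter, v1, v2)."""
    return hashlib.sha256(counter.to_bytes(8, "big") + v1 + v2).digest()[0] & 1

def label_gen():
    """Two labels per input wire, with different least significant bits."""
    def pair():
        v0 = os.urandom(16)
        v1 = os.urandom(15) + bytes([v0[-1] ^ 1])   # last byte copied from v0 with its low bit flipped
        return (v0, v1)
    return pair(), pair()

def egc_gen(counter: int, f, wire1, wire2):
    """Gamma: table indexed by (lsb(v1^a), lsb(v2^b)) holding H(counter, v1^a, v2^b) XOR f(a, b)."""
    gamma = [0, 0, 0, 0]
    for a in (0, 1):
        for b in (0, 1):
            row = ((wire1[a][-1] & 1) << 1) | (wire2[b][-1] & 1)
            gamma[row] = H_bit(counter, wire1[a], wire2[b]) ^ f(a, b)
    return gamma

def egc_eval(counter: int, gamma, v1: bytes, v2: bytes) -> int:
    """Evaluate with one label per wire: pick the row by the lsbs, strip the pad."""
    row = ((v1[-1] & 1) << 1) | (v2[-1] & 1)
    return gamma[row] ^ H_bit(counter, v1, v2)

wire1, wire2 = label_gen()
AND = lambda a, b: a & b
gamma = egc_gen(counter=1, f=AND, wire1=wire1, wire2=wire2)
print(egc_eval(1, gamma, wire1[1], wire2[1]))   # f(1,1) = 1
print(egc_eval(1, gamma, wire1[1], wire2[0]))   # f(1,0) = 0
```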
194
Theorem: the above construction satisfies label-reusable privacy in the random oracle model.
195
How to apply the extended garbling scheme to multiple-keyword SSE. Consider this example: for D1, (e11, e12, e13) = (1, 1, 1); for D2, (e21, e22, e23) = (1, 0, 0), where eij = 1 means Di contains wj.
196
The client computes v11^0 = AES_k(1,1,0) and v11^1 = AES_k(1,1,1).
197
Since e11 = 1, let v11 = v11^1 = AES_k(1,1,1).
198
In this way, the client computes each entry of the table: for D1, (v11, v12, v13) = (v11^1, v12^1, v13^1); for D2, (v21, v22, v23) = (v21^1, v22^0, v23^0).
199
Let label(X1) = (v11, v12, v13) and label(X2) = (v21, v22, v23).
200
Namely, for D1 the client generates the strings (v11^0, v11^1), (v12^0, v12^1), (v13^0, v13^1) by using AES,
201
and chooses each element of label(X1) according to X1 = (111): label(X1) = (v11^1, v12^1, v13^1).
202
Similarly, for D2 the client generates the strings (v21^0, v21^1), (v22^0, v22^1), (v23^0, v23^1) by using AES,
203
and chooses each element of label(X2) according to X2 = (100): label(X2) = (v21^1, v22^0, v23^0).
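A minimal sketch of this label derivation, with HMAC-SHA256 standing in for AES_k as the keyed PRF (the paper uses AES; any PRF serves for illustration):

```python
# Derive wire labels v_ij^b = PRF_k(i, j, b) and pick label(X_i) from the bits e_ij.
import hmac, hashlib

def v(k: bytes, i: int, j: int, b: int) -> bytes:
    """Stand-in for AES_k(i, j, b): a 16-byte pseudorandom wire label."""
    return hmac.new(k, bytes([i, j, b]), hashlib.sha256).digest()[:16]

def make_label(k: bytes, i: int, bits: list[int]) -> list[bytes]:
    """label(X_i) = (v_i1^{e_i1}, ..., v_in^{e_in}): one label per keyword, chosen by the bit."""
    return [v(k, i, j + 1, b) for j, b in enumerate(bits)]

k = b"aes-key-stand-in"
label_X1 = make_label(k, 1, [1, 1, 1])   # D1 contains w1, w2, w3
label_X2 = make_label(k, 2, [1, 0, 0])   # D2 contains only w1
# In the search phase the client re-derives both v_ij^0 and v_ij^1 from k and the
# same indices, so he can garble a circuit for these labels without storing them.
```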
204
The client further computes the table with column labels PRF(w1), PRF(w2), PRF(w3) and rows (E(D1), label(X1)), (E(D2), label(X2)),
205
and sends it to the server.
206
After the store phase, the client keeps only the secret keys of AES, E, PRF and PRF'. He remembers nothing other than these.
207
In the search phase, suppose that the client searches on f(w1, w2, w3) = w1 ⋀ w2 ⋀ w3.
208
For D1, the client re-generates the strings (v11^0, v11^1), …, (v13^0, v13^1) by using AES in the same way as in the store phase.
209
Then the client runs eGC.gen on input f(w1, w2, w3) = w1 ⋀ w2 ⋀ w3, the counter and (v11^0, v11^1), …, (v13^0, v13^1), and computes the garbled circuit Γ1.
210
For D2, the client computes the garbled circuit Γ2 similarly.
211
The client sends PRF(w1), …, PRF(w3), Γ1, Γ2, the topological circuit f⁻ and the counter.
212
The server has the stored table with columns PRF(w1), PRF(w2), PRF(w3) and rows (E(D1), label(X1)), (E(D2), label(X2)).
213
The server runs eGC.eval on input label(X1), the garbled circuit Γ1, the topological circuit f⁻ and the counter, and computes z1 = f(X1).
214
The server returns E(D1) if z1 = 1, and does the same for D2.
215
Theorem: in the proposed scheme, if the underlying extended garbling scheme satisfies label-reusable privacy, then only the following information is leaked to the server (other than the minimum leakage):
216
the topological circuit f⁻, and (π(j1), …, π(jc)), where π is a random permutation and {w_{j1}, …, w_{jc}} are the queried keywords.
217
In the scheme of Cash et al. (2013), if "Japan AND Crypto" is searched, the following information is leaked to the server: the search formula (= AND), the search result of Japan or that of Crypto, and some more information (see Sec. 5.3 of their paper).
218
Communication overhead of the proposed scheme: let m = # of files, c = # of search keywords, and s = # of gates of f. In the search phase, the communication overhead is |counter| + (c + 4m(s−1))×128 + 4m bits.
219
If the # of search keywords is 2, the communication overhead is |counter| + 256 + 4×(# of files) bits.
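A quick sanity check of the formula (the 64-bit counter size is an assumption for illustration):

```python
# Communication overhead in the search phase: |counter| + (c + 4*m*(s-1))*128 + 4*m bits.
def search_overhead_bits(m: int, c: int, s: int, counter_bits: int = 64) -> int:
    return counter_bits + (c + 4 * m * (s - 1)) * 128 + 4 * m

# Two search keywords combined by a single gate (c = 2, s = 1): the formula
# collapses to |counter| + 256 + 4*(# of files) bits, as stated above.
for m in (100, 1000, 10000):
    print(m, search_overhead_bits(m, c=2, s=1))
```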
220
Computer simulation: we used a machine with a 2.4 GHz CPU and 32 GB RAM, running CentOS 6.5, with C++ and the NTL library. The total # of keywords is 20, and we generated the Index randomly.
221
The running time of the client in the search phase
222
The running time of the server in the search phase
223
Comparison: Wang et al. (2008): only AND, 1 round; Cash et al. (CRYPTO 2013): any formula, 2 rounds, formula leaked; the proposed scheme, Kurosawa (FC 2014): any formula, 1 round, formula kept secret.
224
Summary: (1) UC-Secure Searchable Symmetric Encryption; (2) How to Update Documents Verifiably in Searchable Symmetric Encryption; (3) Garbled Searchable Symmetric Encryption.
225
Thank you !