Searching Over Encrypted Data
Charalampos Papamanthou
ECE and UMIACS, University of Maryland, College Park
Research Supported By
Cloud computing today
Providers: Google, Yahoo!, Amazon
Clients: industries, federal government, universities
Is there any privacy?
The cloud provider uses its own keys to encrypt the clients’ data
So, rest assured…?
Where we want to get to in the future
We want clients to be in control of their data
–Encrypt at the client’s machine
–Cloud only gets to see the ciphertext!
Google and Yahoo! are already moving forward with such an approach
See end-to-end (
How about searching?
In theory possible, but impractical
–Fully-homomorphic encryption
–Oblivious RAM
–Two-party computation
My group’s approach
–Theory: Searchable encryption
–Practice: Pmail (demo at the end)
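Potential approaches
–Approach 1: the server stores y_i = enc(f_i); to search, the client downloads all y_i and decrypts them to f_i: large bandwidth!
–Approach 2: the server stores y_i = enc(f_i) and the client keeps the index; the client looks up the file id, requests y_id, and decrypts it: large client space!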
Searchable Encryption
The server stores y_i = enc(f_i) and the index; the client sends a token t(w) and receives only the matching ciphertexts y_w
–First paper by Song, Wagner and Perrig in 2000
–Encrypt files + index appropriately
–Search with encrypted tokens
–Return only the relevant files
–Only for static indexes (dynamic variants are not practical)
Caveat Searchable encryption leaks information –Search pattern –Access pattern
What is an index
Microsoft: F2, F10, F11
Brown: F2, F8, F14
Berkeley: F1, F2
Greece: F4, F10, F12
Equivalently, a list of (keyword, file) pairs: …(Microsoft, F2), (Microsoft, F10),…,(Brown, F8),…,(Greece, F4)…
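The keyword-to-files mapping above can be sketched in a few lines of Python. This is only an illustration: the file contents and the names `build_index` and `to_pairs` are made up for the example and do not come from the talk.

```python
from collections import defaultdict

def build_index(files):
    """Map each keyword to the sorted list of file ids that contain it."""
    index = defaultdict(set)
    for file_id, text in files.items():
        for word in text.split():
            index[word].add(file_id)
    return {word: sorted(ids) for word, ids in index.items()}

def to_pairs(index):
    """Flatten the index into (keyword, file id) pairs."""
    return [(word, file_id) for word in sorted(index) for file_id in index[word]]

# Hypothetical toy corpus: each file is just a bag of keywords.
files = {
    "F1": "Berkeley",
    "F2": "Microsoft Brown Berkeley",
    "F4": "Greece",
    "F8": "Brown",
}
index = build_index(files)
pairs = to_pairs(index)
```

The flattened pair list is the form that the later slides work with, since each (w, d) pair becomes one entry of the encrypted table.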
What is a token
Definition of the token for word w: a hash function takes the word w and the key K as input and outputs the token t_w
Tokens are deterministic!
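A minimal sketch of a deterministic token, assuming HMAC-SHA256 as the keyed hash. The slide only says "hash function", so the choice of HMAC, the function name `make_token`, and the demo key are assumptions for illustration.

```python
import hashlib
import hmac

def make_token(key: bytes, word: str) -> bytes:
    """Derive the token t_w from the key K and the word w (HMAC-SHA256 assumed)."""
    return hmac.new(key, word.encode("utf-8"), hashlib.sha256).digest()

# Demo key only; a real client would use a randomly generated secret key.
key = b"\x00" * 32

t1 = make_token(key, "Microsoft")
t2 = make_token(key, "Microsoft")
t3 = make_token(key, "Brown")
```

Because the token is deterministic, `t1` and `t2` are identical, so anyone who sees two searches can tell whether they were for the same keyword without learning the keyword itself.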
Basic scheme (NDSS 2014)
Each pair (w, d) of the initial index D is stored in an encoded hash table T under KEY = HASH(t_w || count || 0)
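The table construction might look roughly like the following sketch. Only the key derivation HASH(t_w || count || 0) is taken from the slide; the use of SHA-256 and HMAC, the 4-byte counter encoding, the masking of the stored value with HASH(t_w || count || 1), and all function names are assumptions made for illustration, not the published scheme.

```python
import hashlib
import hmac

ID_LEN = 16  # assumed fixed length for padded file ids (ids must fit in 16 bytes)

def make_token(key: bytes, word: str) -> bytes:
    """Token t_w for word w (HMAC-SHA256 is an assumption)."""
    return hmac.new(key, word.encode("utf-8"), hashlib.sha256).digest()

def prf(t_w: bytes, count: int, tag: int) -> bytes:
    """HASH(t_w || count || tag), instantiated here with SHA-256."""
    return hashlib.sha256(t_w + count.to_bytes(4, "big") + bytes([tag])).digest()

def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR, truncated to the shorter input."""
    return bytes(x ^ y for x, y in zip(a, b))

def build_table(key: bytes, pairs):
    """Encode each (w, d) pair as one entry of the hash table T."""
    table = {}
    counts = {}  # how many entries have been stored per keyword so far
    for word, doc_id in pairs:
        t_w = make_token(key, word)
        count = counts.get(word, 0)
        padded = doc_id.encode("utf-8").ljust(ID_LEN, b"\x00")
        # Location from tag 0 (as on the slide); mask from tag 1 (assumption).
        table[prf(t_w, count, 0)] = xor(padded, prf(t_w, count, 1))
        counts[word] = count + 1
    return table

key = b"\x00" * 32  # demo key only
pairs = [("Microsoft", "F2"), ("Microsoft", "F10"), ("Brown", "F8"), ("Greece", "F4")]
table = build_table(key, pairs)
```

Without t_w, the table is a set of pseudorandom-looking keys and values; the per-keyword counter gives every (w, d) pair its own slot.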
Searching for keyword w
Client: Sends t_w
Server: Looks up the entries mapping to t_w
–Learns nothing about keyword w
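A sketch of the server-side lookup, under the same illustrative assumptions as the table layout it searches: entries live at HASH(t_w || count || 0) and are masked with HASH(t_w || count || 1). The masking step, the helper names, and the toy data are hypothetical and not taken from the talk.

```python
import hashlib
import hmac

ID_LEN = 16  # assumed fixed length for padded file ids

def make_token(key: bytes, word: str) -> bytes:
    """Client side: token t_w for word w (HMAC-SHA256 is an assumption)."""
    return hmac.new(key, word.encode("utf-8"), hashlib.sha256).digest()

def prf(t_w: bytes, count: int, tag: int) -> bytes:
    """HASH(t_w || count || tag), instantiated here with SHA-256."""
    return hashlib.sha256(t_w + count.to_bytes(4, "big") + bytes([tag])).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def store(table, t_w: bytes, count: int, doc_id: str) -> None:
    """Place one masked file id at the location derived from (t_w, count)."""
    padded = doc_id.encode("utf-8").ljust(ID_LEN, b"\x00")
    table[prf(t_w, count, 0)] = xor(padded, prf(t_w, count, 1))

def search(table, t_w: bytes):
    """Server side: walk count = 0, 1, 2, ... until a location is empty."""
    results = []
    count = 0
    while True:
        entry = table.get(prf(t_w, count, 0))
        if entry is None:
            break
        plain = xor(entry, prf(t_w, count, 1))
        results.append(plain.rstrip(b"\x00").decode("utf-8"))
        count += 1
    return results

key = b"\x00" * 32  # demo key only
table = {}
store(table, make_token(key, "Microsoft"), 0, "F2")
store(table, make_token(key, "Microsoft"), 1, "F10")
store(table, make_token(key, "Brown"), 0, "F8")

hits = search(table, make_token(key, "Microsoft"))
misses = search(table, make_token(key, "Berkeley"))
```

The server works only with t_w and never sees the keyword. In this sketch it does learn which file ids match, which corresponds to the access pattern leakage listed in the caveat.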
Updating the index Important: Old tokens should not work for new files –Addressed in our NDSS 2014 paper
Research in my group
Searchable encryption with support for updates (CCS 2012, NDSS 2014)
Parallel algorithms for searchable encryption (FC 2013)
[Ongoing research]
–[Theory] New searchable encryption schemes that
  are more expressive (range and conjunctive search)
  leak less information (eliminating search pattern leakage)
  support more efficient updates (improving the polylog(n) bound)
  use weaker cryptographic assumptions (removing the random oracle)
–[Practice] Developing Pmail (plugin for Gmail)
  Pick the right SE scheme
  Web security issues (how to integrate securely with the Gmail API)
  Usability issues (how can we design the interface so that more people can use it)
Pmail Demo