CryptDB: Protecting Confidentiality with Encrypted Query Processing Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, and Hari Balakrishnan MIT CSAIL
Problem Confidential data leaks from databases E.g., Sony Playstation Network, impacted 77 million personal information profiles Threat 2: any attacks on all servers Threat 1: passive DB server attacks User 1 DB Server SQL Application User 2 System administrator User 3 Hackers
CryptDB in a nutshell Goal: protect confidentiality of data Threat 2: any attacks on all servers Threat 1: passive DB server attacks User 1 DB Server SQL Application User 2 on encrypted data User 3 Process SQL queries on encrypted data Use fine-grained keys; chain these keys to user passwords based on access control
Contributions First practical DBMS to process most SQL queries on encrypted data Hide DB from sys. admins., outsource DB Protects data of users logged out during attack, even when all servers are compromised Limit leakage from compromised applications Modest overhead: 26% throughput loss for TPC-C No changes to DBMS (e.g., Postgres, MySQL)
Threat 1: Passive attacks to DB Server Trusted Under attack Proxy DB Server plain query transformed query Application Encrypted DB decrypted results Stores schema, master key No data storage No query execution encrypted results Process queries completely at the DBMS, on encrypted database Process SQL queries on encrypted data
Randomized encryption Deterministic encryption Application Randomized encryption Deterministic encryption SELECT * FROM emp WHERE salary = 100 table1/emp SELECT * FROM table1 WHERE col3 = x5a8c34 col1/rank col2/name col3/salary Proxy x934bc1 x4be219 60 x5a8c34 x95c623 100 x84a21c ? x5a8c34 x2ea887 800 x5a8c34 x17cea7 100
Deterministic encryption Application Deterministic encryption OPE (order) encryption SELECT * FROM emp WHERE salary ≥ 100 table1 (emp) SELECT * FROM table1 WHERE col3 ≥ x638e54 col1/rank col2/name col3/salary Proxy x638e54 x922eb4 x1eab81 x934bc1 60 x5a8c34 100 x84a21c 800 x5a8c34 x638e54 x922eb4 100
Two techniques Use SQL-aware set of encryption schemes Adjust encryption of database based on queries Most SQL uses a limited set of operations
Encryption schemes Scheme Construction Function RND AES in CBC none Highest RND AES in CBC none HOM Paillier +, * e.g., sum SEARCH Song et al.,‘00 word search restricted ILIKE e.g., =, !=, IN, COUNT, GROUP BY, DISTINCT Security DET AES in CMC equality JOIN join our new scheme see paper e.g., >, <, ORDER BY, SORT, MAX, MIN OPE Boldyreva et al.’09 order first implementation
How to encrypt each data item? Encryption schemes needed depend on queries May not know queries ahead of time col1-RND col1-HOM col1-SEARCH col1-DET col1-JOIN col1-OPE rank ALL? ‘CEO’ ‘worker’ Leaks order!
Onions of encryptions SEARCH text value RND RND Onion Search each value DET OPE OR JOIN OPE-JOIN value value HOM int value Onion Equality Onion Order Onion Add Same key for all items in a column for same onion layer Start out the database with the most secure encryption scheme
Adjust encryption Strip off layers of the onions Proxy gives keys to server using a SQL UDF (“user-defined function”) Proxy remembers onion layer for columns Do not put back onion layer
Example: … … … SELECT * FROM emp WHERE rank = ‘CEO’; emp: rank name salary ‘CEO’ ‘worker’ table 1: RND … col1-OnionEq col1-OnionOrder col1-OnionSearch col2-OnionEq DET … JOIN RND RND SEARCH RND … ‘CEO’ RND RND SEARCH RND Onion Equality SELECT * FROM emp WHERE rank = ‘CEO’;
Example (cont’d) … … … SELECT * FROM emp WHERE rank = ‘CEO’; table 1 RND … col1-OnionEq col1-OnionOrder col1-OnionSearch col2-OnionEq DET DET … JOIN RND DET RND SEARCH RND … ‘CEO’ DET RND RND SEARCH RND Onion Equality SELECT * FROM emp WHERE rank = ‘CEO’; UPDATE table1 SET col1-OnionEq = Decrypt_RND(key, col1-OnionEq); SELECT * FROM table1 WHERE col1-OnionEq = xda5c0407;
Confidentiality level Queries encryption scheme exposed Encryption schemes exposed for each column are the most secure enabling queries amount of leakage equality predicate on a column DET repeats aggregation on a column HOM nothing no filter on a column RND nothing common in practice Never reveals plaintext
Application protection Arbitrary attacks on any servers User 1 Passive attacks DB Server Proxy User 2 Application SQL User 3 User password gives access to data allowed to user by access control policy Protects data of logged out users during attack
Challenge: data sharing SPEAKS_FOR msg_id SPEAKS_FOR msg_id ENC_FOR msg_id msg_id sender receiver msg_id message 5 Alice Bob 5 “secret message” Km5 Km5 Km5 How to enforce access control cryptographically? Alice-pass Bob-pass Capture read access policy of application at SQL level? Key chains from user passwords Process queries on encrypted data Annotations
(user-defined functions) Implementation SQL Interface Server Unmodified DBMS query transformed query CryptDB SQL UDFs (user-defined functions) Application CryptDB Proxy results encrypted results No change to the DBMS Portable: from Postgres to MySQL with 86 lines One-key: no change to applications Multi-user keys: annotations and login/logout
Evaluation Does it support real queries/applications? What is the resulting confidentiality? What is the performance overhead?
Queries not supported More complex operators, e.g., trigonometry Operations that require combining incompatible encryption schemes e.g., T1.a + T1.b > T2.c Extensions: split queries, precompute columns, or add new encryption schemes
Real queries/applications Total columns Encrypted columns phpBB 563 23 HotCRP 204 22 grad-apply 706 103 TPC-C 92 sql.mit.edu 128,840 # cols not supported 1,094 Annotations + lines of code changed 38 31 113 Multi-user keys One-key SELECT 1/log(series_no+1.2) … … WHERE sin(latitude + PI()) …
Resulting confidentiality Application Total columns Encrypted columns phpBB 563 23 HotCRP 204 22 grad-apply 706 103 TPC-C 92 sql.mit.edu 128,840 Min level is RND 21 18 95 65 80,053 Min level is DET 1 6 19 34,212 Min level is OPE 1 2 8 13,131 Multi-user keys One-key Most columns at RND Most columns at OPE analyzed were less sensitive
Performance Hardware: 2.4 GHz Intel Xeon E5620 – 8 cores, 12 GB RAM DB server throughput MySQL: Application 1 Plain database Application 2 Latency CryptDB: Application 1 CryptDB Proxy Encrypted database Application 2 CryptDB Proxy Hardware: 2.4 GHz Intel Xeon E5620 – 8 cores, 12 GB RAM
TPC-C performance Latency (ms/query): 0.10 MySQL vs. 0.72 CryptDB Throughput loss 26%
TPC-C microbenchmarks Homomorphic addition No cryptography at the DB server in the steady state! Encrypted DBMS is practical
Related work Cryptography proposals Fully homomorphic encryption (starting with [Gentry’10]) Search on encrypted data (e.g., [Song et al.,’00]) Systems proposals (e.g., [Hacigumus et al.,’02]) Lower degree of security, rewrite the DBMS, client-side processing Query integrity (e.g., [Nguyen et al.,’07], [Sion’05])
Conclusions CryptDB: The first practical DBMS for running most standard queries on encrypted data Protects data of users logged out during attack even when all servers are compromised Modest overhead and no changes to DBMS Website: http://css.csail.mit.edu/cryptdb/ Demo at poster session! Thanks!