Need for Privacy Enhancing Technologies
What is challenging about standard encryption?
Challenge: Privacy versus Data Utilization Dilemma
  - A client holds sensitive data and outsources it, encrypted, to storage on the cloud
  - The client still wants to SEARCH and ANALYZE the outsourced data
  - Impact of standard encryption: CAN’T SEARCH! CAN’T ANALYZE!
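Searchable Encryption (Generic Framework)
  - Client: holds files f_1, ..., f_n, extracts keywords w_1, ..., w_n, and builds a searchable representation (a data structure over tokens t_1, ..., t_n)
  - Cloud: stores the encrypted files c_1, ..., c_n together with the searchable representation
  - Search keyword w_1: the client sends trapdoor t_1; the matching ciphertext c_1 is returned and decrypted to f_1
  - Update file f_i: the client sends (z_i, V)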
Prior Work on Searchable Encryption (Milestones)
Curtmola et al. (CCS 2006)
  (+) Efficient encrypted searches
  (-) No update on files (addition/removal not possible)
Variants of CCS 2006 with various properties (ranked, multi-keyword, wildcard, …)
  (-) No update and inefficient
Kamara et al. (CCS 2012)
  (+) Updates: new files can be added/removed
  (-) Update leaks information (insecure updates)
Kamara et al. (FC 2013)
  (+) Secure updates
  (-) Searchable words are fixed (cannot add a new keyword later)
  (-) Extremely large cloud storage (multiple TBs, impractical)
Contribution: A New Dynamic Symmetric SE Scheme
A. A. Yavuz, J. Guajardo, A. Ragi, “Dynamic and Parallelizable Symmetric Searchable Encryption with Secure Updates”, patent filed (disclosure allowed)
For 10^5 keywords and 10^6 files, compared to Kamara et al. (FC 2013):
  - 5120 times smaller storage at the cloud
  - 20 times faster update
  - 680 times smaller communication overhead
Both files and keywords can be added/removed securely
Our Scheme: Searchable Representation
Searchable representation: binary matrix I
  - Row i ∈ {1,…,m} corresponds to keyword w_i; column j ∈ {1,…,n} corresponds to file f_j
  - If I[i,j]=1 then keyword w_i appears in file f_j, otherwise it does not
  - Integrates index and inverted index; simple yet efficient
  - Search via row operations (inverted index)
  - Update via column operations (index)
[Figure: m × n matrix indexed by (i,j), rows labeled with keywords w_1, w_2, ..., w_m and columns with files f_1, f_2, ..., f_n]
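The row/column view above can be made concrete with a small plaintext sketch. This is only an illustration of the matrix idea, without any encryption; the function names (`build_index`, `search`, `update_file`) and the use of Python dictionaries for the keyword-to-row and file-to-column maps are assumptions made for the example, not part of the scheme as presented.

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]

def build_index(files: Dict[str, List[str]]) -> Tuple[Matrix, Dict[str, int], Dict[str, int]]:
    """Build binary matrix I: one row per keyword, one column per file."""
    keywords = sorted({w for words in files.values() for w in words})
    file_ids = sorted(files)
    row_of = {w: i for i, w in enumerate(keywords)}   # keyword -> row i
    col_of = {f: j for j, f in enumerate(file_ids)}   # file -> column j
    I = [[0] * len(file_ids) for _ in keywords]
    for f, words in files.items():
        for w in words:
            I[row_of[w]][col_of[f]] = 1               # I[i][j] = 1: w_i appears in f_j
    return I, row_of, col_of

def search(I: Matrix, row_of: Dict[str, int], col_of: Dict[str, int], w: str) -> List[str]:
    """Search is a row operation: read row i and report the columns set to 1."""
    if w not in row_of:
        return []
    row = I[row_of[w]]
    return sorted(f for f, j in col_of.items() if row[j] == 1)

def update_file(I: Matrix, row_of: Dict[str, int], col_of: Dict[str, int],
                f: str, words: List[str]) -> None:
    """Update is a column operation: rewrite column j for the changed file."""
    j = col_of[f]
    present = set(words)
    for w, i in row_of.items():
        I[i][j] = 1 if w in present else 0
```

For example, after indexing three files, searching for a keyword reads one row, and changing a file's content rewrites one column; no other part of the matrix is touched.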
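Our Scheme: Map keyword/file to the matrix
  - Keyword w → {1,…,m} and file f → {1,…,n}: dynamic and efficient
  - Map a keyword to a row i (table TW) and a file to a column j (table TF)
  - Open-address hash tables: collision-free (one-to-one), O(1) access
[Figure: TW holds (row index, keyword token) entries such as (1, t_55), (2, t_300), ..., (m, t_2); TF holds (column index, file identifier) entries such as (1, z_100), ..., (257, z_r), ..., (n, z_6)]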
Our Scheme: Encrypt Searchable Representation
  - Derive a row key r_i for each row i
  - Encrypt each row i with r_i (AES-128 in CTR mode)
Achieving dynamic keywords:
  - Static schemes: keys are derived from keywords
  - Our scheme: break the static relation between keys and keywords
[Figure: m × n matrix indexed by (i,j), with row keys r_1, ..., r_m]
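A minimal sketch of per-row encryption follows. It rests on two loudly stated assumptions: the row key is derived from a master key and the row index (the slide only says the key must not be tied to the keyword), and HMAC-SHA256 in counter mode stands in for AES-128 CTR so that the example runs with the Python standard library alone. The names `derive_row_key`, `keystream_bits` and `xor_row` are hypothetical.

```python
import hashlib
import hmac
from typing import List

def derive_row_key(master_key: bytes, i: int) -> bytes:
    """Row key r_i from the master key and row index i (assumed derivation).
    The key depends on the row position, not on the keyword stored there."""
    tag = hmac.new(master_key, b"row" + i.to_bytes(8, "big"), hashlib.sha256)
    return tag.digest()[:16]  # 128-bit key

def keystream_bits(row_key: bytes, n: int) -> List[int]:
    """n keystream bits from a counter-mode PRF (stand-in for AES-128 CTR)."""
    bits: List[int] = []
    counter = 0
    while len(bits) < n:
        block = hmac.new(row_key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        for byte in block:
            bits.extend((byte >> k) & 1 for k in range(8))
        counter += 1
    return bits[:n]

def xor_row(row: List[int], row_key: bytes) -> List[int]:
    """Encrypt or decrypt a row of bits; XOR with the keystream is its own inverse."""
    return [b ^ s for b, s in zip(row, keystream_bits(row_key, len(row)))]
```

Calling `xor_row` twice with the same row key returns the original row, which is the property the search step relies on when a row is decrypted with r_i.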
Our Scheme: Search on Encrypted Representation
Search keyword w on I’:
  - Decrypt the i’th row I’[i,*] with r_i to obtain I[i,*]
  - If I[i,j]=1, then ciphertext c_j contains t_w
  - Matching ciphertexts (e.g. c_1, c_55, c_253, ..., c_n) are decrypted with k_4 to get f_1, f_55, …, f_n
[Figure: client and cloud; encrypted matrix I’ and decrypted row i of I]
Our Scheme: Update on Encrypted Representation
Add a new file f to I’:
  - Client: build the column for f and encrypt it, E(.)
  - Cloud: replace the j’th column of I’ with the new column
[Figure: client and cloud; m × n matrix I’ with column j replaced]
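The column update can be sketched as follows. This is an illustration under assumptions, not the scheme's actual construction: each cell (i,j) is masked with one PRF bit derived from a row key, HMAC-SHA256 stands in for the AES-based primitives, and a per-column `version` counter is introduced here (hypothetical) so that a re-written column never reuses an old mask bit.

```python
import hashlib
import hmac
from typing import List

def row_key(master_key: bytes, i: int) -> bytes:
    """Row key r_i derived from the master key and row index (assumed)."""
    return hmac.new(master_key, b"row" + i.to_bytes(4, "big"), hashlib.sha256).digest()[:16]

def cell_mask(r_i: bytes, j: int, version: int) -> int:
    """One mask bit for cell (i, j); `version` changes on every update of column j."""
    msg = j.to_bytes(4, "big") + version.to_bytes(4, "big")
    return hmac.new(r_i, msg, hashlib.sha256).digest()[0] & 1

def encrypt_column(master_key: bytes, bits: List[int], j: int, version: int) -> List[int]:
    """Client side: mask each bit of the new column with its row's key."""
    return [b ^ cell_mask(row_key(master_key, i), j, version) for i, b in enumerate(bits)]

def replace_column(enc_matrix: List[List[int]], j: int, enc_column: List[int]) -> None:
    """Cloud side: overwrite column j of the encrypted matrix."""
    for i, c in enumerate(enc_column):
        enc_matrix[i][j] = c

def decrypt_row(enc_matrix: List[List[int]], i: int, r_i: bytes, versions: List[int]) -> List[int]:
    """Search side: unmask row i using r_i and the current column versions."""
    return [c ^ cell_mask(r_i, j, versions[j]) for j, c in enumerate(enc_matrix[i])]
```

The cloud only ever sees masked columns, yet a later row decryption with r_i recovers the updated bits, which is the interplay between column updates and row searches that the slide describes.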
Comparison with State of the Art
Scheme | File update | Security | Secure update | Keyword universe | Update comm. | Update time | Index size
Kamara et al. (FC 2013) | Yes | CKA2+ | Yes | Fixed | (2zk) O(n log2(m)) | O((n/p) log2(m)) | (2zk) O(n m)
Our scheme | Yes | CKA2+ | Yes | Dynamic | b O(n) | O(n/p) | O(n m)
Concrete values: update communication of 1000 MB for Kamara et al. (FC 2013) versus 1.5 MB for our scheme; index size of 112 GB for our scheme
Parameters: n=10^5 keywords; m=10^6 files; n’=10^3 (# of keywords existing in an updated file); z=32 bits (pointer size); k=80 (security parameter); b=128 bits (symmetric block size); p=4 CPU cores; r=200 (# of files containing a keyword)
  - Dynamic keyword universe
  - Secure and efficient update
  - Smallest index size with CKA2+ security
Implementation (Benchmarking Results)
Setup: Enron dataset, Ubuntu OS, 4 GB RAM, Intel i5 processor, 256 GB hard disk
Average time (msec) for Build Index, Search Keyword, Add File and Delete File, measured under three configurations:
  - #keyword: 1,000,000; #file: 5,000
  - #keyword: 200,000; #file: 50,000
  - #keyword: 2,000; #file: 2,000,000
Findings:
  - All operations are practical
  - Search takes under a msec, and only 10 msec for 2 million files
  - Update time varies from 8 msec to 2 sec
Security Analysis of Our DSSE (Very Brief)
  - Confidentiality focus (integrity/authentication can be added)
  - Access pattern: file identifiers that satisfy a search query (search results)
  - Search pattern: history of searches (whether a search token was used in the past)
  - IND-CKA2 (adaptive chosen keyword attacks): given {I’, c_0,…,c_n, z_0,…,z_n, t_0,…,t_m}, no adversary can learn any information about f_0,…,f_n and w_0,…,w_m other than the access and search patterns, even if queries are adaptive
Theorem 1: Our DSSE scheme is (L1,L2)-secure in the ROM based on IND-CKA2, where L1 and L2 leak the access and search patterns, respectively. Real and simulated views are indistinguishable due to the PRF and the IND-CPA cipher.
Implementation Details of Our DSSE
  - C/C++, own code
  - Tomcrypt API
      Symmetric key encryption: AES-CTR 128-bit
      MAC: CMAC-128
      Key derivation function: CMAC-128
      File encryption: CCM (Counter with CBC-MAC)
  - Intel AESNI sample library: AES implementation using assembly language instructions. As the KDF, we further exploit AES-ASM by using CMAC.
  - Hash tables: Google open source static C++ data structure
Outline
Privacy Enhancing Technologies for Big Data Analytics
  - Privacy versus data utilization dilemma
  - A new searchable encryption scheme
Efficient Security Mechanisms for Smart-Infrastructures
  - Security challenges: smart-grid, inter/intra car systems
  - Fast and scalable authentication: ER, ETA, PISB, ESCAR, patents
Heart of Secure Systems: Protecting Audit Logs (PhD Thesis)
  - Research challenges and contributions
Side labels: Research OSU; Towards Secure Smart-Infrastructures; Towards Practical PETs
Security Challenges for Smart-Infrastructures
Reliable Cyber-Physical Systems (e.g., smart-grid) are vital
  - Susceptible: Northeast blackout (2003), 50 million people, $10 billion cost
  - Attacks: false data injection [Yao CCS’09], over 200 cyber-attacks in 2013
  - Vulnerability: commands and measurements are not authenticated
Requirements for a security method
  - Real-time: extremely fast processing (a few ms)
  - Limited bandwidth: compact
  - Several components: scalability
Limitations of existing methods
  - PKC is not yet feasible (computation, storage, tag size)
  - Symmetric crypto is not scalable (key management)
Security Challenges for Smart-Infrastructures (II)
Vulnerability: commands and measurements are not authenticated
Security for inter-car networks
  - Manipulate direction/velocity, crashes
Security for intra-car networks
  - Large attack surface [Usenix '11]
  - ECUs of brake/acceleration, airbag
Challenges
  - Strict safety requirements
  - Limited bandwidth, real-time processing
  - The state of the art cannot address these (as discussed)
Fast, compact and scalable security is needed!
[Figure: ECU connected to the Internet]
Contributions: Secure Intra-car Systems (I)
Motivation: secure communication among ECUs in the car
Challenges: safety requirements, extremely limited resources
Contributions
  - A. A. Yavuz, J. Guajardo, “Efficient UMACs for CAN systems via key update mechanisms”, May 2012 (patent)
  - J. Guajardo, A. A. Yavuz, “Bandwidth Efficient Symmetric Encryption Methods”, June 2012 (patent)
  - A. A. Yavuz, “Signal-based Automotive Communication Security and Its Interplay with Safety Requirements”, Embedded Security in Cars Conference, Germany, November 2012 (with B. Glas, J. Guajardo, H. Hacioglu, M. Ihle, K. Wehefritz)
Impact: embedded crypto software, deployment for OEMs (2018)
Customers: GM, BMW
Contributions: Secure Smart-Infrastructures (II)
A. A. Yavuz, “Emergent Response (ER): An Efficient and Scalable Real-time Broadcast Authentication for Command and Control Messages”
  - Patent + IEEE Transactions Information Sec.
A. A. Yavuz, “Practical Immutable Signatures (PISB)”, LNCS DBSec 2013
  - Immutable and 40 times faster than the state of the art
  - Idea: leverage SA-RSA to compute an umbrella signature on C-RSA; eliminates interaction, more efficient
A. A. Yavuz, “Efficient and Tiny Authentication (ETA)”, ACM WiSec 2013
  - An order of magnitude more efficient than RSA/ECDSA
  - Smallest key/signature sizes (240 bits, 320 bits)
  - Idea: tailor Schnorr signatures, O(1)-size pre-computation tokens, proof in the ROM with reduction to DLP
Rapid Authentication – Motivation and Preliminary Work
Fast broadcast authentication: minimum end-to-end crypto delay
Limitations of the state of the art
  - Online-offline and OTSs: very large signature and key sizes
  - DLP-based methods (DSA tokens): signer efficient but verifier costly
  - RSA/Rabin: verifier efficient but signer costly
Both signer and verifier efficiency with a compact signature?
  - Pre-computation for RSA without linear overhead? Both signer and verifier efficient!
  - Condensed-RSA (C-RSA) aggregates RSA signatures
Rapid Authentication – Basic Idea
Messages have structure in some protocols: can it be leveraged?
Example message layout: Source IP (32 bits) | Destination IP (32 bits) | Command (6 bits) | Value (6 bits) | Options (5 bits)
1) Pre-compute RSA signatures on each sub-message in the fields (offline phase), giving signature tables:
  - cmd1, …, cmd64: 64 signatures
  - val1, …, val64: 64 signatures
  - opt1, …, opt32: 32 signatures
2) Combine the pre-computed signatures with C-RSA according to the message (online signing), e.g. “increase”, “level 5”, “voltage”
3) Verify the Condensed-RSA signature
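The three steps can be sketched with textbook RSA and a hash-then-sign step. Everything concrete here is an assumption for illustration: the modulus is built from two small Mersenne primes and is far too short for real use, SHA-256 reduced modulo N stands in for a proper full-domain hash, only the command, value and options fields are covered, and the names `precompute_table`, `sign_online` and `verify` are hypothetical. Python 3.8 or later is needed for `pow(E, -1, phi)`.

```python
import hashlib
from typing import Dict, List

# Toy RSA parameters (assumption: illustration only, insecure key size).
P = 2**89 - 1            # Mersenne prime
Q = 2**107 - 1           # Mersenne prime
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent

def h(field: str, value: str) -> int:
    """Hash a (field, value) sub-message into Z_N; the field name binds the position."""
    digest = hashlib.sha256(f"{field}={value}".encode()).digest()
    return int.from_bytes(digest, "big") % N

def precompute_table(field: str, values: List[str]) -> Dict[str, int]:
    """Step 1 (offline): one RSA signature per possible value of a field."""
    return {v: pow(h(field, v), D, N) for v in values}

def sign_online(tables: Dict[str, Dict[str, int]], message: Dict[str, str]) -> int:
    """Step 2 (online): Condensed-RSA, multiply the looked-up signatures mod N."""
    sigma = 1
    for field, value in message.items():
        sigma = (sigma * tables[field][value]) % N
    return sigma

def verify(message: Dict[str, str], sigma: int) -> bool:
    """Step 3: check sigma^E against the product of the sub-message hashes mod N."""
    expected = 1
    for field, value in message.items():
        expected = (expected * h(field, value)) % N
    return pow(sigma, E, N) == expected
```

Online signing costs only table lookups and modular multiplications, with no private-key exponentiation, which is where the speed-up over signing each message from scratch would come from. Verification still uses a single public-key exponentiation.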
Improved RA: Structure-Free RA (SCRA)
Problems with RA: structured message, table might be large
Sign messages without assuming structure or length:
  - Form (Message||s), where s is a one-time random number with |s|=80
  - Apply a hash function: any length in, 160 bits out (truncate)
  - Split the digest into fields: Field 1 (8 bits), Field 2 (8 bits), Field 10 (8 bits), …
  - Pre-compute signature table (offline)
Benchmark setup: Intel(R) Core(TM) i7 Q720 at 1.60GHz CPU and 2GB RAM running Ubuntu (MIRACL library); execution times in µsec
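The hashing step can be sketched as below. The assumptions are stated up front: |s| is read as 80 bits, SHA-256 truncated to 160 bits stands in for the unnamed hash function, and the whole digest is split into 8-bit fields; the function name `scra_fields` is hypothetical. Each resulting field value would then serve as an index into a pre-computed signature table, which this sketch does not build.

```python
import hashlib
import secrets
from typing import List, Optional, Tuple

def scra_fields(message: bytes, s: Optional[bytes] = None) -> Tuple[bytes, List[int]]:
    """Hash (message || s), truncate to 160 bits, and split into 8-bit fields."""
    if s is None:
        s = secrets.token_bytes(10)                 # one-time random number, 80 bits (assumed)
    digest = hashlib.sha256(message + s).digest()[:20]   # truncate to 160 bits
    fields = list(digest)                           # each byte is one 8-bit field
    return s, fields
```

Because the fields come from a fixed-length digest, their number and size no longer depend on the message format or length, so the same pre-computed table can serve any message.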