Dynamic Authenticated Index Structures for Outsourced Databases
Feifei Li, Marios Hadjieleftheriou, George Kollios, Leonid Reyzin (Boston University, AT&T Labs-Research)

Presentation transcript:

1 Dynamic Authenticated Index Structures for Outsourced Databases
Feifei Li, Marios Hadjieleftheriou, George Kollios, Leonid Reyzin
Boston University, AT&T Labs-Research

2 Outsourced Database (ODB) Systems [HIM02]
Owner(s): publish the database.
Servers: host the database and provide query services.
Clients: query the owner's database through the servers.
Security issue: servers may be untrusted or compromised.
Source: H. Hacigumus, B. R. Iyer, and S. Mehrotra, ICDE'02.

3 Query Example
Client query: SELECT * FROM T WHERE 5 < A < 11
The owner and the server hold the same table T:

tuple        A     B
$r_1$        …     …
…            …     …
$r_{i-1}$    5     …
$r_i$        6     …
$r_{i+1}$    9     …
$r_{i+2}$    12    …

The server returns the tuples with A = 6 and A = 9.

4 Injection
Client query: SELECT * FROM T WHERE 5 < A < 11
The owner and the server hold the same table T, with A = 5, 6, 9, 12 for tuples $r_{i-1}, \dots, r_{i+2}$.
The server returns 6, 7, 9: the tuple with A = 7 does not exist in the owner's database.

5 Drop
Client query: SELECT * FROM T WHERE 5 < A < 11
The owner and the server hold the same table T, with A = 5, 6, 9, 12 for tuples $r_{i-1}, \dots, r_{i+2}$.
The server returns only 6: the qualifying tuple with A = 9 has been dropped.

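6 Omission
Client query: SELECT * FROM T WHERE 5 < A < 11
Server's table: A = 5, 6, 9, 12 for tuples $r_{i-1}, \dots, r_{i+2}$.
Owner's table after an update: A = 5, 6, 8, 9, 12 for tuples $r_{i-1}, \dots, r_{i+3}$.
The server returns 6, 9: the tuple with A = 8 added by the update is omitted.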

7 Query Authentication
Query correctness: results do exist in the owner's database.
Query completeness: no answers have been omitted from the result.
Query freshness: results are based on the most current version of the database.
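The failure modes on the preceding slides can be made concrete with a toy check. The sketch below assumes ground-truth access to the owner's table, which a real client does not have (hence the need for authentication); `classify_result` and its dictionary keys are hypothetical names used only for illustration.

```python
def classify_result(owner_values, lo, hi, returned):
    """Compare a server answer with the owner's ground truth for lo < A < hi."""
    expected = {v for v in owner_values if lo < v < hi}
    returned = set(returned)
    return {
        "injected": sorted(returned - expected),  # violates correctness
        "missing": sorted(expected - returned),   # violates completeness (or freshness)
    }

owner = [5, 6, 9, 12]  # A values from the slide example
print(classify_result(owner, 5, 11, [6, 9]))      # honest answer
print(classify_result(owner, 5, 11, [6, 7, 9]))   # injection of 7
print(classify_result(owner, 5, 11, [6]))         # 9 dropped
# Stale answer: the owner has inserted 8, the server still answers from old data.
print(classify_result([5, 6, 8, 9, 12], 5, 11, [6, 9]))
```

An authenticated structure lets the client reach the same verdicts from the result and the VO alone, without seeing the owner's table.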

8 General Approach for Query Authentication in ODB Systems
The owner's database is stored at the server together with authenticated structures.
For a client query Q, the server returns both the result for Q and the associated VO.
VO: verifiable object.

9 Cost Metrics
The computation overhead for the owner
The owner-server communication cost
The storage overhead for the server
The computation overhead for the server
The client-server communication cost
The computation cost for the client (for verification)
The update cost

10 Outline
Problem overview
Cryptographic tools
Merkle B (MB) Tree
Embedded Merkle B (EMB) Tree
Related Works
Experiments

11 Collision-resistant hash functions
It is computationally hard to find $x_1 \neq x_2$ such that $h(x_1) = h(x_2)$.
Computationally hard? Based on well-established assumptions such as the hardness of discrete logarithms [M90].
Example: SHA1 [SHA195].
Observations (under Crypto++ [crypto] and OpenSSL [openssl]):
Computation cost: 3-6 μs
Storage cost: 20 bytes
Source: K. McCurley, American Mathematical Society.
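As a quick illustration of the storage figure, the sketch below computes SHA-1 digests with Python's standard `hashlib` module instead of the Crypto++ and OpenSSL libraries used in the talk; the wrapper name `h` is only a convenience for this example.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-1 digest of the input bytes."""
    return hashlib.sha1(data).digest()

digest = h(b"r1")
print(len(digest))             # digest length in bytes
print(h(b"r1") == h(b"r1"))    # hashing is deterministic
print(h(b"r1") == h(b"r2"))    # different inputs give different digests here
```

SHA-1 appears here only because the slides use it; practical collisions have since been demonstrated, so a current deployment would likely choose a newer hash function.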

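12 Public key digital signature schemes
KeyGen produces a key pair (SK, PK).
Sender: holds SK, computes $\sigma = \mathrm{Sign}(m, SK)$ and sends $(m, \sigma)$ over an insecure channel.
Recipient: holds PK and runs $\mathrm{Ver}(m, PK, \sigma)$ to decide whether the signature is valid.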

13 Public key digital signature schemes
Formally defined by [GMR88].
One such scheme: RSA [RSA78].
Observations (under Crypto++ [crypto] and OpenSSL [openssl]):
Computation cost: about 3-4 ms for signing, and on the order of μs for verifying
Storage cost: 128 bytes
Sources: S. Goldwasser, S. Micali, R. Rivest, SIAM Journal on Computing; R. Rivest, A. Shamir, L. Adleman, Commun. ACM.

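14 Merkle Hash Tree [M89]
Leaves: one hash $h_i$ per tuple $r_i$, for $r_1, \dots, r_8$.
Internal nodes hash the concatenation of their children, e.g. $h_{12} = H(h_1 \mid h_2)$, giving $h_{12}, h_{34}, h_{56}, h_{78}$, then $h_{1..4}, h_{5..8}$, and the root $h_{1..8}$.
The root is signed: $\sigma = \mathrm{Sign}(h_{1..8}, SK)$.
Source: R. C. Merkle, CRYPTO.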
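The construction can be sketched in a few lines. This is a minimal illustration, assuming SHA-1 as H, byte-string tuples, leaf hashes $h_i = H(r_i)$, and a tuple count that is a power of two; `merkle_root` is a hypothetical helper name, and the signing step is left out.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def merkle_root(tuples):
    """Root digest of a binary Merkle tree; len(tuples) must be a power of two."""
    level = [H(t) for t in tuples]              # h_1 .. h_n
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated: H(left | right).
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

records = [f"r{i}".encode() for i in range(1, 9)]   # r1 .. r8
root = merkle_root(records)                         # h_{1..8}, the value the owner signs
print(root.hex())
```

Changing any single tuple changes the root, which is why one signature on $h_{1..8}$ covers all eight tuples.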

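16 Merkle B (MB) Tree
Each node stores pointers $p_0, \dots, p_f$, keys $k_1, \dots, k_f$ and one hash per pointer, $h_0, \dots, h_f$.
The hash of an entry covers the hashes stored in the child node it points to, e.g. $h_1 = \mathrm{Hash}(h_{10} \mid \dots \mid h_{1f})$.
Given page size $P$, the fanout of the B+ tree is $f = (P - |int| - |h|) / (2|int| + |h|)$.
For the root node, $\sigma = \mathrm{Sign}(h_0 \mid \dots \mid h_f)$.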
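A worked instance of the fanout formula may help. The numbers below are assumptions chosen for illustration: a 1 KB page (the page size on the experiments slide), 4-byte integers for keys and pointers, and 20-byte SHA-1 hashes. Rounding down to an integer is also an assumption, and `mb_tree_fanout` is a hypothetical helper name.

```python
def mb_tree_fanout(page_size: int, int_size: int, hash_size: int) -> int:
    """f = (P - |int| - |h|) / (2|int| + |h|), rounded down to a whole entry."""
    return (page_size - int_size - hash_size) // (2 * int_size + hash_size)

# (1024 - 4 - 20) / (2*4 + 20) = 1000 / 28, so 35 entries fit.
print(mb_tree_fanout(1024, 4, 20))  # 35
```

Every entry carries a hash in addition to its key and pointer, so fewer entries fit in a page than in an index that stores only keys and pointers.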

17 Range Selection Query in MB tree
Query range q, with boundaries LB(q) and RB(q).
Query subtree: the subtree rooted at LCA(q).
Path LCA(q): its hash path in the Merkle B tree.

18 Query path
[Figure: index nodes $I_1$ to $I_8$ above leaf nodes $L_1$ to $L_{12}$, with query q and LB(q) marked; the legend distinguishes entries for which $r_i$ is returned from entries for which $h_i$ is returned.]

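19 Query Example: f=
Query: SELECT * FROM T WHERE 5 < A < 11, with boundaries LB(q) and RB(q).
Tree: leaf hashes $h_1, \dots, h_8$; internal hashes $h_{12}, h_{34}, h_{56}, h_{78}, h_{1..4}, h_{5..8}$; root $h_{1..8}$ with $\sigma = \mathrm{Sign}(h_{1..8}, SK)$.
The figure marks LCA(q) and Path LCA(q), which contributes $h_{1..4}$.
VO: 5, 12, $h_{1..4}$, $\sigma$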

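20 Client Side Verification
Query: SELECT * FROM T WHERE 5 < A < 11
Query results: 6, 9
VO: 5, 12, $h_{1..4}$, $\sigma$
The client reconstructs the query subtree: $h_5, h_6, h_7, h_8$, then $h_{56}$, $h_{78}$ and $h_{5..8}$.
The subtree under $h_{1..4}$ is unknown to the client; its hash is taken from the VO.
The client then computes $h_{1..8}$ and checks $\mathrm{Ver}(h_{1..8}, PK, \sigma)$: valid?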
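The verification steps can be sketched for this specific eight-leaf example. Several things here are assumptions for illustration: SHA-1 as the hash, leaf hashes computed over the decimal string of A, the values 1 to 4 in the left subtree, and a trusted root digest standing in for the check $\mathrm{Ver}(h_{1..8}, PK, \sigma)$. The function is tied to the fixed shape of this example and is not a general MB tree verifier.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def leaf(value: int) -> bytes:
    return H(str(value).encode())

def node(left: bytes, right: bytes) -> bytes:
    return H(left + right)

# Owner side: eight tuples; only the A values 5, 6, 9, 12 come from the slides.
values = [1, 2, 3, 4, 5, 6, 9, 12]
h = [leaf(v) for v in values]
h_1_4 = node(node(h[0], h[1]), node(h[2], h[3]))
h_5_8 = node(node(h[4], h[5]), node(h[6], h[7]))
trusted_root = node(h_1_4, h_5_8)  # stands in for a verified signature on h_{1..8}

def client_verify(lo, hi, lb, results, rb, sibling_hash, root):
    """Check an answer to lo < A < hi using boundary tuples lb, rb and h_{1..4}."""
    if len(results) != 2:  # this fixed-shape sketch expects exactly h_6 and h_7
        return False
    # Boundary tuples must enclose the range; results must fall inside it.
    if not (lb <= lo and hi <= rb and all(lo < v < hi for v in results)):
        return False
    h5, h6, h7, h8 = (leaf(v) for v in (lb, *results, rb))
    rebuilt = node(sibling_hash, node(node(h5, h6), node(h7, h8)))
    return rebuilt == root

print(client_verify(5, 11, 5, [6, 9], 12, h_1_4, trusted_root))  # True
print(client_verify(5, 11, 5, [6, 7], 12, h_1_4, trusted_root))  # False
```

The boundary tuples 5 and 12 are what give completeness: they are adjacent to the results in the hashed leaf order and lie outside the range, so no qualifying tuple can be hidden between them.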

21 Query Example: f=
[Figure: an MB tree with query q; tuple 5 is marked LB(q) and tuple 10 is marked RB(q).]
VO: tuples 5 and 10, hashes of 1, 3, 12, 14, 16, and hashes of entries 20, 29, 42: 8 hashes.

22 VO size of MB tree
Hash values for sibling entries for nodes along the two boundary paths of the query subtree.
Hash values for sibling entries for nodes along the path LCA(q).

24 Improve c/s comm. cost
We can show that the client/server communication cost is minimized when 2 < f < 3, so f = 2 is optimal in practice.
However, with f = 2 the query efficiency is the worst.

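25 Embedded Merkle B (EMB) tree: a fractal structure
Each node has the MB tree layout: pointers $p_0, \dots, p_f$, keys $k_1, \dots, k_f$ and hashes $h_0, \dots, h_f$, with child entries $p_{10}, k_{11}, h_{10}, \dots, p_{1f}, k_{1f}, h_{1f}$.
An MB tree with fanout $f_e$ is built on each node.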

26 Query and Authentication
The outer structure is an MB tree with fanout $f_k$.
Each node is built with an MB tree with fanout $f_e$.

27 EMB tree Analysis
We can show that:
Query cost is the same as for an MB tree with fanout $f_k$.
Authentication cost (c/s comm. cost and client verification cost) is the same as for an MB tree with fanout $f_e$.
Intuition: $f_k$ is smaller than the fanout of a normal MB tree for a given page size P.

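28 Query Example: f=
[Figure: an EMB tree with query q; tuple 5 is marked LB(q) and tuple 10 is marked RB(q); the nodes whose hashes enter the VO are circled in red.]
VO: tuples 5 and 10, hashes of red circle nodes (2), hash of red circle node, hashes of red circle nodes (2): 5 hashes.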

29 EMB tree's variants
EMB$^-$ tree: don't store the embedded tree, build it on the fly. The fanout $f_k$ is the same as for a normal MB tree, which gives better query performance and better storage performance.
EMB$^*$ tree: use a multi-way search tree instead of a B+ tree as the embedded tree. The hash path in the embedded tree can stop at the index level and does not have to reach the leaf level, which reduces the VO size.

30 Signature-Based Approach: ASB Tree, based on [PJR05]
Signature chain: $S(r_1 \mid r_2), S(r_2 \mid r_3), \dots, S(r_{n-2} \mid r_{n-1}), S(r_{n-1} \mid r_n)$, with a B+ tree on top.
1. Order the database tuples w.r.t. the query attribute.
2. Sign consecutive pairs.
3. Build a B+ tree on top of them.
4. For a query [a, b] (a and b are indexes here), return tuples [a-1, b+1] together with the signatures in [a-1, b].
5. Verify every pair of consecutive tuples.
Source: H. Pang, A. Jain, K. Ramamritham, and K.-L. Tan, SIGMOD.
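The five steps can be sketched end to end, leaving out the B+ tree (which only speeds up locating the range). Everything cryptographic here is an assumption for illustration: a textbook RSA signature over a SHA-1 digest with a toy key built from two Mersenne primes, far too small for real use. All function names are hypothetical, and the sketch assumes the query range is non-empty and does not touch either end of the table.

```python
import hashlib

# Toy RSA key from two Mersenne primes (illustration only, not secure).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse, Python 3.8+

def digest(a: int, b: int) -> int:
    return int.from_bytes(hashlib.sha1(f"{a}|{b}".encode()).digest(), "big") % N

def sign_pair(a: int, b: int) -> int:
    return pow(digest(a, b), D, N)           # S(a | b)

def verify_pair(a: int, b: int, sig: int) -> bool:
    return pow(sig, E, N) == digest(a, b)

def owner_sign(sorted_values):
    """Steps 1-2: one signature per consecutive pair of the sorted tuples."""
    return [sign_pair(a, b) for a, b in zip(sorted_values, sorted_values[1:])]

def server_answer(values, sigs, lo, hi):
    """Step 4: tuples at indexes a-1..b+1 and signatures at indexes a-1..b."""
    inside = [i for i, v in enumerate(values) if lo < v < hi]
    a, b = inside[0], inside[-1]
    return values[a - 1 : b + 2], sigs[a - 1 : b + 1]

def client_verify(tuples, sigs, lo, hi):
    """Step 5: every consecutive pair is signed and the boundaries enclose the range."""
    pairs = list(zip(tuples, tuples[1:]))
    chain_ok = len(sigs) == len(pairs) and all(
        verify_pair(x, y, s) for (x, y), s in zip(pairs, sigs))
    return chain_ok and tuples[0] <= lo and tuples[-1] >= hi

values = [1, 5, 6, 9, 12, 15]
sigs = owner_sign(values)
tuples, proof = server_answer(values, sigs, 5, 11)
print(tuples, client_verify(tuples, proof, 5, 11))
```

Dropping 9 from the answer would require a signature on the pair (6, 12), which the owner never produced, so the chain check fails.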

31 Reduce S/C comm. cost [MNT04]
Aggregation signature: instead of $(m_1, \sigma_1), \dots, (m_k, \sigma_k)$, send $m_1, \dots, m_k$ with a single $\sigma = \mathrm{combine}(\sigma_1, \dots, \sigma_k)$.
Overhead: the computation cost of modular multiplication with a large modulus (approx. 100 μs per multiplication).
Source: E. Mykletun, M. Narasimha, and G. Tsudik, NDSS'04.

32 Extend Merkle Tree for DAG Model [DGMS03] [MNDGKS04]
DAG: Directed Acyclic Graph.
Apply the same idea used in the Merkle tree to a DAG structure.
The authors briefly mention the possibility of using a B tree to improve query efficiency; the MB tree is a generalization of this idea.
Source: C. Martel, G. Nuckolls, P. Devanbu, M. Gertz, A. Kwong, and S. Stubblebine, Algorithmica.

33 Experiments
Experiment setup:
Crypto functions: Crypto++ and OpenSSL
Page size: 1 KB
100,000 tuples
2.8 GHz Intel Pentium 4 CPU
Linux machine

34 Construction Cost: time

35 Construction Cost: Size

36 Query specific I/O:

37 VO construction I/O:

38 Query Cost: Total I/O

39 Query Cost: VO computation time

40 VO size

41 Verification time

42 Update for ASB Tree

43 Update cost

44 Conclusion
Authenticated index structures that achieve a good balance between query efficiency and authentication efficiency
Other query types
Multi-dimensional query authentication

45 Thanks! Download the Authenticated Index Structure Library prototype at:

46 References
[CRYPTO] Crypto++ Library. weidai/cryptlib.html.
[DGMS00] P. Devanbu, M. Gertz, C. Martel, and S. G. Stubblebine. Authentic third-party data publication. In IFIP Workshop on Database Security.
[DGMS03] P. Devanbu, M. Gertz, C. Martel, and S. Stubblebine. Authentic data publication over the internet. Journal of Computer Security, 11(3).
[GR97] R. Gennaro and P. Rohatgi. How to sign digital streams. In CRYPTO 97.
[GMR88] S. Goldwasser, S. Micali, and R. L. Rivest. A digital signature scheme secure against adaptive chosen-message attacks. SIAM Journal on Computing, 17(2), April.
[HIM02] H. Hacigumus, B. R. Iyer, and S. Mehrotra. Providing database as a service. In ICDE.
[M90] K. McCurley. The discrete logarithm problem. In Cryptology and Computational Number Theory, Proc. Symposium in Applied Mathematics 42. American Mathematical Society.
[M89] R. C. Merkle. A certified digital signature. In CRYPTO, 1989.

47 References
[MNDGKS04] C. Martel, G. Nuckolls, P. Devanbu, M. Gertz, A. Kwong, and S. Stubblebine. A general model for authenticated data structures. Algorithmica, 39(1).
[MNT04] E. Mykletun, M. Narasimha, and G. Tsudik. Authentication and integrity in outsourced databases. In Symposium on Network and Distributed Systems Security (NDSS'04).
[NT05] M. Narasimha and G. Tsudik. DSAC: Integrity of outsourced databases with signature aggregation and chaining. In CIKM.
[OPENSSL] OpenSSL.
[PT04] H. Pang and K.-L. Tan. Authenticating query results in edge computing. In ICDE.
[PJR05] H. Pang, A. Jain, K. Ramamritham, and K.-L. Tan. Verifying completeness of relational query results in data publishing. In SIGMOD.
[RSA78] R. L. Rivest, A. Shamir, and L. Adleman. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM, 21(2).
[SHA195] National Institute of Standards and Technology. FIPS PUB 180-1: Secure Hash Standard. NIST, 1995.

48 Cost Analysis: Merkle B Tree
Construction cost, O/S comm. cost, storage cost: formulas not preserved in this transcript
Server computation cost: 0
Query cost: $O(\log_f n)$

49 Cost Analysis: Merkle B Tree
Update cost: $O(\log_f n)\, C_H + C_s$
Update comm. cost: $O(\log_f n)\, |h| + |\sigma|$
C/S comm. cost, client computation cost: formulas not preserved in this transcript

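50 Freshness?
The owner sends an update to the server, with new signature(s) $\sigma_v$.
The client sends a query to the server.
The server returns q + VO, with the VO constructed from the previous version and its signature(s) $\sigma_{v-1}$.
The client checks the VO and concludes: "Hmm, it's correct!"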

51 Solution to Freshness
There must be client-owner communication.
Reducing this communication cost is the key issue.
Observation: this cost is correlated with the number of signatures maintained in the authentication structure used by the owner.

52 Other Query Types
Projection: basic authenticated unit for the tuple.
Join: authenticate one relation first, then authenticate a set of selection queries into the other relation.
Aggregate: based on an aggregation index.

53 Condensed RSA [MNT04]
KeyGen:
Choose two large primes p and q, $p \neq q$.
Set $n = pq$.
Compute $\varphi(n) = (p-1)(q-1)$.
Choose e such that $1 < e < \varphi(n)$ and e is coprime to $\varphi(n)$.
Compute d such that $de \equiv 1 \pmod{\varphi(n)}$.
$(d, n)$ is the secret key and $(e, n)$ is the public key.

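54 Condensed RSA [MNT04]
Sign: given $m_i$, compute $h_i = H(m_i)$, then compute the signature (formula not preserved in this transcript).
Verify: given $m_i$, compute $h_i = H(m_i)$, then check the verification equation (formula not preserved in this transcript).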
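A minimal sketch of how condensed RSA is commonly described, offered as an assumption about what the Sign and Verify steps compute rather than as the slide's own formulas: each message is signed as $\sigma_i = h_i^d \bmod n$, the signatures are combined as $\sigma = \prod_i \sigma_i \bmod n$, and the verifier checks $\sigma^e \equiv \prod_i h_i \pmod n$. The primes are two small Mersenne primes chosen only so the example runs quickly, SHA-1 plays the role of H, and all function names are hypothetical.

```python
import hashlib
from math import gcd, prod

def keygen(p: int, q: int, e: int = 65537):
    """KeyGen: returns the secret key (d, n) and the public key (e, n)."""
    n, phi = p * q, (p - 1) * (q - 1)
    assert p != q and 1 < e < phi and gcd(e, phi) == 1
    d = pow(e, -1, phi)  # d*e = 1 (mod phi(n)); needs Python 3.8+
    return (d, n), (e, n)

def h(message: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha1(message).digest(), "big") % n

def sign(message: bytes, sk) -> int:
    d, n = sk
    return pow(h(message, n), d, n)          # sigma_i = h_i^d mod n

def condense(signatures, n: int) -> int:
    return prod(signatures) % n              # sigma = product of sigma_i mod n

def verify_condensed(messages, sigma: int, pk) -> bool:
    e, n = pk
    return pow(sigma, e, n) == prod(h(m, n) for m in messages) % n

sk, pk = keygen(2**61 - 1, 2**89 - 1)        # toy primes, not secure
msgs = [b"r6", b"r9"]
sigma = condense([sign(m, sk) for m in msgs], pk[1])
print(verify_condensed(msgs, sigma, pk))             # True
print(verify_condensed([b"r6", b"r7"], sigma, pk))   # False
```

The server sends one value of the size of a single signature however many tuples are in the result, at the price of one modular multiplication per tuple on each side.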

55 Updates
Batch updates will help!
Using a standard balls-and-bins argument, we can bound the number of affected nodes for k updates (formula not preserved in this transcript).
Cost for the per-update approach: formula not preserved in this transcript.

56 Updates
Batch update still has a cost that is linear in the number of signing operations.
In terms of the number of signing operations:
Insertion: best case k+2, worst case 2k.
Deletion: best case 1, worst case k.

57 Cost Analysis: ASB tree
Construction cost: $n C_s + C_b$
O/S comm. cost, storage cost: formulas not preserved in this transcript
Server computation cost: 0 or $|q|\, C_{mod\_multiplication}$
Query cost: $\log_f n + |q|/f + |q||\sigma|/P$

58 Cost Analysis: ASB tree
Update cost: $2C_s$ or $C_s$
Update comm. cost: $2|\sigma|$ or $|\sigma|$
C/S comm. cost: $|q||\sigma| + |q|$ or $|\sigma| + |q|$
Client computation cost: $|q|\, C_v$ or $C_v + |q|\, C_{mod\_multiplication}$

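59 Multi-dimensional Range Query [NT05]
Signature chaining.
[Figure: a tuple $r_5$ chained to its neighbours in each of the orderings on attributes $A_1$, $A_2$ and $A_3$; the neighbours shown are $r_6$ and $r_7$, $r_2$ and $r_6$, and $r_7$ and $r_{12}$.]
Source: M. Narasimha and G. Tsudik, CIKM.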

60 Tradeoff: query vs. authentication efficiency
Key observations:
Query efficiency trades off against authentication efficiency.
It is impossible to have one solution that optimizes all cost metrics.