
Slide 1: Why and How to Build a Trusted Database System on Untrusted Storage?
Radek Vingralek, STAR Lab, InterTrust Technologies
In collaboration with U. Maheshwari and W. Shapiro

Slide 2: What?
- Trusted storage: can be read and written only by trusted programs

Slide 3: Why?
- Digital Rights Management (figure: content, contract)

Slide 4: What? Revisited
(figure: a processor with a small amount of trusted storage (<50 B) and volatile memory, attached to bulk untrusted storage)

Slide 5: What? Refined
- Must also protect against accidental data corruption: atomic updates; efficient backups; type-safe interface; automatic index maintenance
- Must run in an embedded environment: small footprint
- Must provide acceptable performance

Slide 6: What? Refined
- Can assume a single-user workload: no or only simple concurrency control; optimized for response time, not throughput; lots of idle time (can be used for database reorganization)
- Can assume a small database: 100 KB to 10 MB; the working set can be cached (no-steal buffer management)

Slide 7: A Trivial Solution
(figure: COTS DBMS; encryption and hashing; key and H(db) in trusted storage; plaintext data; database on untrusted storage)
- Critique: does not protect metadata; cannot use sorted indexes

Slide 8: A Better Solution
(figure: (COTS) DBMS; the entire database is encrypted and hashed; key and H(db) in trusted storage; plaintext data; database on untrusted storage)
- Critique: must scan, hash, and crypt the entire database to read or write

Slide 9: Yet a Better Solution
(figure: a hash tree over the database; trusted storage holds only the root hash H(A), while nodes A through G and the hashes H(B) through H(G) live on untrusted storage)
- Open issues: could we do better than a logarithmic overhead? Could we integrate the tree search with data location?
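
To make the hash-tree idea concrete, here is a minimal C++ sketch of validating a leaf chunk against a root hash held in trusted storage. The node names follow the slide (A is the root, D through G are leaves); FNV-1a is only a stand-in for a cryptographic hash such as SHA-1, and the fixed binary tree is an illustrative assumption.

    // Minimal hash-tree sketch (stand-ins only): FNV-1a replaces a real
    // cryptographic hash such as SHA-1; node layout follows the slide (A..G).
    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <vector>

    static uint64_t fnv1a(const std::string& data) {
        uint64_t h = 1469598103934665603ULL;
        for (unsigned char c : data) { h ^= c; h *= 1099511628211ULL; }
        return h;
    }

    // Internal node hash = H(child hashes); only the root hash H(A) lives in
    // trusted storage, everything else sits on untrusted storage.
    static uint64_t node_hash(uint64_t left, uint64_t right) {
        return fnv1a(std::to_string(left) + "|" + std::to_string(right));
    }

    int main() {
        // Leaf chunks D..G on untrusted storage.
        std::vector<std::string> leaves = {"chunk D", "chunk E", "chunk F", "chunk G"};
        uint64_t hD = fnv1a(leaves[0]), hE = fnv1a(leaves[1]);
        uint64_t hF = fnv1a(leaves[2]), hG = fnv1a(leaves[3]);
        uint64_t hB = node_hash(hD, hE), hC = node_hash(hF, hG);
        uint64_t hA = node_hash(hB, hC);          // trusted root hash H(A)

        // Reading chunk D: recompute the path D -> B -> A and compare with H(A).
        uint64_t check = node_hash(node_hash(fnv1a("chunk D"), hE), hC);
        std::cout << (check == hA ? "valid" : "tampered") << "\n";
        return 0;
    }

Only H(A) needs to be trusted; reading any leaf touches one root-to-leaf path, which is the logarithmic overhead the open issue on this slide refers to.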

Slide 10: TDB Architecture
(figure: layered architecture between trusted and untrusted storage)
- Chunk Store: encryption, hashing, atomic updates; a chunk is a byte sequence of 100 B to 100 KB
- Object Store: object cache, concurrency control; an object is an abstract type
- Collection Store: index maintenance; scan, match, and range queries over collections of objects
- Backup Store: full and incremental backups, validated restore

Slide 11: Chunk Store - Specification
- Interface: allocate() -> ChunkId; write(ChunkId, Buffer); read(ChunkId) -> Buffer; deallocate(ChunkId)
- Crash atomicity: commit = [ write | deallocate ]*
- Tamper detection: raise an exception if chunk validation fails
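
A possible C++ rendering of this interface follows; the Buffer alias, the exception type, and the explicit commit() method are illustrative assumptions rather than TDB's actual declarations.

    // Hypothetical C++ rendering of the Chunk Store interface on this slide.
    #include <cstdint>
    #include <stdexcept>
    #include <vector>

    using ChunkId = uint64_t;
    using Buffer  = std::vector<uint8_t>;

    // Raised when a chunk fails hash validation (tamper detection).
    struct TamperDetected : std::runtime_error {
        using std::runtime_error::runtime_error;
    };

    class ChunkStore {
    public:
        virtual ~ChunkStore() = default;
        virtual ChunkId allocate() = 0;                       // allocate() -> ChunkId
        virtual void    write(ChunkId id, const Buffer& b) = 0;
        virtual Buffer  read(ChunkId id) = 0;                 // may throw TamperDetected
        virtual void    deallocate(ChunkId id) = 0;
        // Crash atomicity: all writes/deallocates since the last commit become
        // durable together, or not at all.
        virtual void    commit() = 0;
    };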

Slide 12: Chunk Store - Storage Organization
- Log-structured storage organization: no static representation of chunks outside of the log; the log lives in untrusted storage
- Advantages: traffic analysis cannot link updates to the same chunk; atomic updates come for free; variable-sized chunks are easily supported; copy-on-write snapshots enable fast backups; integrates well with hash verification (see the next slide)
- Disadvantages: destroys clustering (mitigated by a cacheable working set); cleaning overhead (mitigated by plenty of idle time)

Slide 13: Chunk Store - Chunk Map
- Integrates the hash tree and the location map
- Map: ChunkId -> Handle; Handle = <Hash, Location>; MetaChunk = Array[Handle]
(figure: meta chunks R, S, T point to data chunks X and Y; trusted storage holds H(R))
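
A minimal sketch of these structures, under assumed field widths and an assumed fan-out of 256, so each level of the map consumes one byte of the ChunkId:

    // Data-structure sketch of the chunk map; field widths and the fan-out
    // constant are illustrative assumptions, not TDB's real layout.
    #include <array>
    #include <cstddef>
    #include <cstdint>

    struct Handle {
        uint64_t hash;      // truncated cryptographic hash of the chunk's contents
        uint64_t location;  // position of the current chunk version in the log
    };

    // A meta chunk is simply an array of handles; meta chunks are themselves
    // stored as chunks, so the map forms a tree rooted at a handle kept in
    // trusted storage (H(R) on the slide).
    constexpr std::size_t kFanout = 256;               // assumed fan-out
    using MetaChunk = std::array<Handle, kFanout>;

    // Logical map: ChunkId -> Handle, realized by walking meta chunks from the
    // root; with a fan-out of 256, each level consumes one byte of the ChunkId.
    inline std::size_t levelIndex(uint64_t chunkId, unsigned level) {
        return (chunkId >> (8 * level)) % kFanout;
    }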

Slide 14: Chunk Store - Read
- Basic scheme: dereference handles from the root down to X
- Dereference: use the location to fetch, use the hash to validate
- Optimized: a trusted cache maps ChunkId -> Handle; look for a cached handle upward from X, then dereference handles down to X; this avoids validating the entire path
(figure: path from root R through the meta chunks to data chunk X, with cached handles; trusted storage holds H(R))
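
The sketch below shows the dereference step only, flattened to a single meta-chunk level and using in-memory stand-ins for the untrusted log and the hash; the upward search for cached handles and the validation of the meta chunk itself are elided.

    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <stdexcept>
    #include <string>
    #include <vector>

    using Bytes = std::vector<uint8_t>;

    // Stand-in for a cryptographic hash such as SHA-1 (FNV-1a).
    uint64_t hashBytes(const Bytes& d) {
        uint64_t h = 1469598103934665603ULL;
        for (uint8_t c : d) { h ^= c; h *= 1099511628211ULL; }
        return h;
    }

    struct Handle { uint64_t hash; uint64_t location; };

    std::map<uint64_t, Bytes>  untrustedLog;    // location -> stored bytes (untrusted)
    std::map<uint64_t, Handle> rootMetaChunk;   // ChunkId -> Handle (itself hashed in TDB)

    // Dereference a handle: use the location to fetch, the hash to validate.
    Bytes deref(const Handle& h) {
        Bytes data = untrustedLog.at(h.location);
        if (hashBytes(data) != h.hash) throw std::runtime_error("tamper detected");
        return data;
    }

    int main() {
        Bytes payload = {'h', 'i'};
        untrustedLog[100] = payload;                             // chunk 7's current version
        rootMetaChunk[7] = Handle{hashBytes(payload), 100};

        Bytes ok = deref(rootMetaChunk.at(7));                   // fetch + validate
        std::cout << std::string(ok.begin(), ok.end()) << "\n";  // prints "hi"

        untrustedLog[100][0] = 'X';                              // simulate tampering
        try { deref(rootMetaChunk.at(7)); }
        catch (const std::exception& e) { std::cout << e.what() << "\n"; }
        return 0;
    }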

Slide 15: Chunk Store - Write
- Basic: write chunks from X up to the root
- Optimized: buffer the dirty handle of X in the cache and defer the upward propagation
(figure: dirty path from X up to root R; trusted storage holds H(R))

Slide 16: Chunk Store - Checkpointing the Map
- When dirty handles fill the cache: write the affected meta chunks to the log, writing the root chunk last
(figure: log containing chunk versions of X followed by meta chunks S, T and root chunk R; trusted storage holds H(R))
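
A checkpoint sketch, again flattened to one meta-chunk level: dirty handles buffered in the trusted cache are folded into the root chunk R, R is appended to the log, and the trusted hash H(R) is updated last. All names and containers are illustrative.

    #include <cstdint>
    #include <map>
    #include <vector>

    struct Handle { uint64_t hash; uint64_t location; };

    std::map<uint64_t, Handle> rootMetaChunk;             // ChunkId -> Handle (root chunk R)
    std::map<uint64_t, Handle> dirtyHandles;              // buffered in the trusted cache
    std::vector<std::map<uint64_t, Handle>> untrustedLog; // stand-in for the log
    uint64_t trustedRootHash = 0;                         // H(R), kept in trusted storage

    // Stand-in hash of a meta chunk: folds every (id, hash, location) triple.
    uint64_t hashMetaChunk(const std::map<uint64_t, Handle>& m) {
        uint64_t h = 1469598103934665603ULL;
        auto mix = [&h](uint64_t v) { h ^= v; h *= 1099511628211ULL; };
        for (const auto& [id, hd] : m) { mix(id); mix(hd.hash); mix(hd.location); }
        return h;
    }

    // Triggered when dirty handles fill the cache.
    void checkpoint() {
        for (const auto& [id, hd] : dirtyHandles)          // fold buffered updates into R
            rootMetaChunk[id] = hd;
        dirtyHandles.clear();
        untrustedLog.push_back(rootMetaChunk);             // append the affected meta
                                                           // chunk(s), the root going last
        trustedRootHash = hashMetaChunk(rootMetaChunk);    // finally update H(R)
    }

    int main() {
        dirtyHandles[7] = Handle{0xABCD, 100};             // buffered update to chunk 7
        checkpoint();
        return trustedRootHash == hashMetaChunk(rootMetaChunk) ? 0 : 1;  // 0 = consistent
    }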

Slide 17: Chunk Store - Crash Recovery
- Process the log forward from the last root chunk; the portion after it is the residual log, the portion before it the checkpointed log
- The residual log must be validated
(figure: checkpointed log ending in meta chunks S, T and root chunk R, followed by the residual log up to the crash; trusted storage holds H(R))

Slide 18: Chunk Store - Validating the Log
- Keep an incremental hash of the residual log in trusted storage, updated after each commit
- The hash protects all current chunks: those in the residual log directly, and those in the checkpointed log through the chunk map
(figure: as on the previous slide, but trusted storage now holds H*(residual-log))
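
A small, runnable illustration of the incremental hash, with FNV-1a standing in for the real cryptographic hash or MAC:

    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <vector>

    // Extend a running hash with one log record (stand-in for a cryptographic hash/MAC).
    uint64_t mix(uint64_t h, const std::string& data) {
        for (unsigned char c : data) { h ^= c; h *= 1099511628211ULL; }
        return h;
    }

    int main() {
        const uint64_t seed = 1469598103934665603ULL;

        // Normal operation: each commit appends a record to the residual log
        // (untrusted storage) and extends H* (trusted storage).
        std::vector<std::string> residualLog;
        uint64_t trustedHash = seed;                    // H*(residual-log)
        std::vector<std::string> commits = {"commit 1", "commit 2", "commit 3"};
        for (const std::string& rec : commits) {
            residualLog.push_back(rec);
            trustedHash = mix(trustedHash, rec);        // updated after each commit
        }

        // Crash recovery: re-hash the residual log and compare before replaying it.
        uint64_t replayHash = seed;
        for (const std::string& rec : residualLog) replayHash = mix(replayHash, rec);
        std::cout << (replayHash == trustedHash ? "residual log valid" : "tampered") << "\n";
        return 0;
    }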

Slide 19: Chunk Store - Counter-Based Log Validation
- A commit chunk is written with each commit; it contains a sequential hash of the commit set and is signed with the system secret key
- A one-way counter is used to prevent replays
- Benefits: allows a bounded discrepancy between trusted and untrusted storage; does not require writing to trusted storage after each transaction
(figure: residual log containing commit chunks c.c. 73 and c.c. 74)
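
One possible reading of the replay check is sketched below: each commit chunk carries a sequence number and a keyed hash (standing in for the signature with the system secret key), and recovery rejects a log whose newest commit chunk lags the trusted one-way counter by more than an assumed bound. The bound, the toy MAC, and all names are assumptions made for illustration, not TDB's actual scheme.

    #include <cstdint>
    #include <iostream>
    #include <string>

    // Toy keyed hash standing in for the real signature/MAC with the secret key.
    uint64_t toyMac(uint64_t key, const std::string& data, uint64_t counter) {
        uint64_t h = key;
        for (unsigned char c : data) { h ^= c; h *= 1099511628211ULL; }
        h ^= counter; h *= 1099511628211ULL;
        return h;
    }

    struct CommitChunk {
        uint64_t counter;        // sequence number of this commit
        uint64_t commitSetHash;  // (stand-in) sequential hash of the commit set
        uint64_t mac;            // keyed over the two fields above
    };

    constexpr uint64_t kBound = 8;   // assumed allowed discrepancy

    bool acceptLog(const CommitChunk& last, uint64_t oneWayCounter, uint64_t secretKey) {
        // Reject forged commit chunks.
        if (last.mac != toyMac(secretKey, std::to_string(last.commitSetHash), last.counter))
            return false;
        // Reject replays of an old log: the newest commit chunk may not be more
        // than kBound commits behind the trusted one-way counter.
        return last.counter + kBound >= oneWayCounter;
    }

    int main() {
        uint64_t key = 0xC0FFEE;
        CommitChunk c{74, 1234, toyMac(key, "1234", 74)};
        std::cout << acceptLog(c, /*oneWayCounter=*/75, key) << "\n";  // 1: within bound
        std::cout << acceptLog(c, /*oneWayCounter=*/90, key) << "\n";  // 0: stale replay
        return 0;
    }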

Slide 20: Chunk Store - Log Cleaning
- The log cleaner creates free space by reclaiming obsolete chunk versions
- Segments: the log is divided into fixed-sized regions called segments (~100 KB); segments are securely linked in the residual log for recovery
- Cleaning step (sketched below): read one or more segments; check the chunk map to find live chunk versions (using the ChunkIds in the chunk version headers); write the live chunk versions to the end of the log; mark the segments as free
- Segments in the residual log may not be cleaned
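
Here is the in-memory sketch of the cleaning step referred to above; the segment layout, the chunk map representation, and all names are illustrative.

    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <utility>
    #include <vector>

    struct ChunkVersion { uint64_t chunkId; std::vector<uint8_t> data; };
    struct Segment { std::vector<ChunkVersion> versions; bool free = false; };

    // ChunkId -> (segment index, slot) of the current (live) version.
    std::map<uint64_t, std::pair<std::size_t, std::size_t>> chunkMap;
    std::vector<Segment> logSegments;

    // Clean one segment; assumes a fresh tail segment (logSegments.back()) is
    // open and is not the segment being cleaned.
    void cleanSegment(std::size_t s) {
        for (std::size_t i = 0; i < logSegments[s].versions.size(); ++i) {
            const ChunkVersion& v = logSegments[s].versions[i];
            auto it = chunkMap.find(v.chunkId);           // ChunkId from the version header
            if (it == chunkMap.end() || it->second != std::make_pair(s, i))
                continue;                                 // obsolete version: just drop it
            // Live version: rewrite it at the end of the log and repoint the map.
            logSegments.back().versions.push_back(v);
            chunkMap[v.chunkId] = {logSegments.size() - 1,
                                   logSegments.back().versions.size() - 1};
        }
        logSegments[s].free = true;                       // segment can now be reused
    }

    int main() {
        logSegments.resize(2);                              // segment 0 to clean, 1 is the tail
        logSegments[0].versions = {{7, {'a'}}, {7, {'b'}}}; // two versions of chunk 7
        chunkMap[7] = {0, 1};                               // only the second one is live
        cleanSegment(0);
        std::cout << "live versions moved: " << logSegments[1].versions.size() << "\n"; // 1
        return 0;
    }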

Slide 21: Chunk Store - Multiple Partitions
- Partitions may use separate crypto parameters (algorithms, keys)
- Enables fast copy-on-write snapshots and efficient backups
- Makes it more difficult for the cleaner to test chunk version liveness
(figure: partition map, per-partition position maps, and data chunks for partitions P and Q, before and after a snapshot)

Slide 22: Chunk Store - Cleaning and Partition Snapshots
(figure: worked example with chunks P.a, P.b, P.c: snapshot of P into Q; P updates c; the cleaner moves Q's copy of c; checkpoint; crash; residual log)

Slide 23: Backup Store
- Creates and restores full or incremental backups of partitions
- Backup creation uses snapshots to guarantee backup consistency (with respect to concurrent updates) without locking
- During a restore, the Backup Store must verify the integrity of the backup (using a signature) and the correctness of the incremental restore sequencing

Slide 24: Object Store
- Provides type-safe access to named C++ objects; objects provide pickle and unpickle methods for persistence (no transparent persistence)
- Implements full transactional semantics in addition to atomic updates
- Maps each object to a single chunk: less data is written to and read from the log, and concurrency control is simpler
- Provides an in-memory cache of decrypted, validated, unpickled, type-checked C++ objects
- Implements a no-steal buffer management policy
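
A sketch of the pickle/unpickle convention, with an invented Account type and byte layout; the pickled buffer is what the Object Store would hand to the Chunk Store, which then encrypts and hashes it.

    #include <cstdint>
    #include <cstring>
    #include <iostream>
    #include <vector>

    using Buffer = std::vector<uint8_t>;

    struct Account {
        uint64_t id = 0;
        int64_t  balance = 0;

        Buffer pickle() const {                        // object -> chunk contents
            Buffer b(sizeof(id) + sizeof(balance));
            std::memcpy(b.data(), &id, sizeof(id));
            std::memcpy(b.data() + sizeof(id), &balance, sizeof(balance));
            return b;
        }
        static Account unpickle(const Buffer& b) {     // chunk contents -> object
            Account a;
            std::memcpy(&a.id, b.data(), sizeof(a.id));
            std::memcpy(&a.balance, b.data() + sizeof(a.id), sizeof(a.balance));
            return a;
        }
    };

    int main() {
        Account a{42, 1000};
        Buffer chunk = a.pickle();                     // one object = one chunk
        Account b = Account::unpickle(chunk);
        std::cout << b.id << " " << b.balance << "\n"; // prints "42 1000"
        return 0;
    }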

Slide 25: Collection Store
- Provides access to indexed collections of C++ objects using scan, exact-match, and range queries
- Performs automatic index maintenance during updates; implements insensitive iterators
- Uses functional indices: an extractor function obtains a key from an object (see the sketch below)
- Collections and indexes are represented as objects; index nodes are locked according to 2PL
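
The sketch below illustrates a functional index: an extractor function pulls the key out of each object, and the index is maintained automatically on insert. The Account type, the key type, and the single exact-match query are illustrative; scan and range queries and 2PL locking of index nodes are omitted here.

    #include <cstddef>
    #include <cstdint>
    #include <functional>
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    struct Account { uint64_t id; std::string owner; int64_t balance; };

    class IndexedCollection {
    public:
        using Extractor = std::function<std::string(const Account&)>;
        explicit IndexedCollection(Extractor e) : extract_(std::move(e)) {}

        void insert(const Account& a) {                // automatic index maintenance
            objects_.push_back(a);
            index_.emplace(extract_(a), objects_.size() - 1);
        }
        // Exact-match query answered through the index.
        std::vector<Account> match(const std::string& key) const {
            std::vector<Account> out;
            auto [lo, hi] = index_.equal_range(key);
            for (auto it = lo; it != hi; ++it) out.push_back(objects_[it->second]);
            return out;
        }
    private:
        Extractor extract_;
        std::vector<Account> objects_;
        std::multimap<std::string, std::size_t> index_;  // key -> position in collection
    };

    int main() {
        IndexedCollection byOwner([](const Account& a) { return a.owner; });
        byOwner.insert({1, "alice", 100});
        byOwner.insert({2, "bob", 200});
        for (const Account& a : byOwner.match("bob")) std::cout << a.id << "\n";  // 2
        return 0;
    }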

Slide 26: Performance Evaluation - Benchmark
- Compared TDB to BerkeleyDB using TPC-B
- TPC-B was used because an implementation is included with BerkeleyDB, and BerkeleyDB's functionality limited the choice of benchmarks (e.g., one index per collection)

Slide 27: Performance Evaluation - Setup
- Evaluation platform: 733 MHz Pentium III, 256 MB RAM; Windows NT 4.0, NTFS files; EIDE disk with 8.9 ms read and 10.9 ms write seek time, 7200 RPM (4.2 ms avg. rotational latency); one-way counter implemented as a file on NTFS
- Both systems used a 4 MB cache
- Crypto parameters (for the secure version of TDB): SHA-1 for hashing (hash truncated to 12 B), 3DES for encryption

Slide 28: Performance Evaluation - Results
- Response time, averaged over 100,000 transactions in a steady state
- TDB utilization was set to 60%
(bar chart: average response time in ms for BerkeleyDB, TDB, and TDB-S; plotted values are 6.8, 3.8, and 5.8 ms)

Slide 29: Response Time vs. Utilization
- Measured response times for different TDB utilizations (chart)

Slide 30: Related Work
- Theoretical work: Merkle trees (Merkle, 1980); checking the correctness of memories (Blum et al., 1992)
- Secure audit logs (Schneier and Kelsey, 1998): append-only data, read sequentially
- Secure file systems: Cryptographic FS (Blaze, 1993); read-only SFS (Fu et al., 2000); Protected FS (Stein et al., 2001)

Slide 31: A Retrospective Instead of Conclusions
- Got a lot of mileage from using log-structured storage
- Partitions add a lot of complexity
- Cleaning is not a big problem
- Crypto overhead is small on modern PCs (<6%)
- The code footprint is too large for many embedded systems, which need it within 10 KB; see GnatDb (described in a technical report)
- For more information: U. Maheshwari, R. Vingralek, and W. Shapiro, "How to Build a Trusted Database System on Untrusted Storage," OSDI 2000; technical reports available at http://www.star-lab.com/tr/

Slide 32: Database Size vs. Utilization (chart)


