
Slide 1: Why and How to Build a Trusted Database System on Untrusted Storage?
Radek Vingralek, STAR Lab, InterTrust Technologies
In collaboration with U. Maheshwari and W. Shapiro

Slide 2: What?
- Trusted storage: can be read and written only by trusted programs

Slide 3: Why?
- Digital Rights Management (figure: content, contract)

Slide 4: What? Revisited
(figure: a processor with a small amount of trusted storage (<50 B) and volatile memory, attached to bulk untrusted storage)

Slide 5: What? Refined
- Must also protect against accidental data corruption: atomic updates; efficient backups; type-safe interface; automatic index maintenance
- Must run in an embedded environment: small footprint
- Must provide acceptable performance

Slide 6: What? Refined
- Can assume a single-user workload: no or only simple concurrency control; optimized for response time, not throughput; lots of idle time (can be used for database reorganization)
- Can assume a small database: 100 KB to 10 MB; the working set can be cached (no-steal buffer management)

Slide 7: A Trivial Solution
(figure: COTS DBMS; encryption and hashing; key and H(db) in trusted storage; plaintext data; database on untrusted storage)
- Critique: does not protect metadata; cannot use sorted indexes

Slide 8: A Better Solution
(figure: (COTS) DBMS; the entire database is encrypted and hashed; key and H(db) in trusted storage; plaintext data; database on untrusted storage)
- Critique: must scan, hash, and crypt the entire database to read or write

Slide 9: Yet a Better Solution
(figure: a hash tree over the database; trusted storage holds only the root hash H(A), while nodes A through G and the hashes H(B) through H(G) live on untrusted storage)
- Open issues: could we do better than a logarithmic overhead? Could we integrate the tree search with data location?
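
To make the hash-tree idea concrete, here is a minimal C++ sketch of validating a leaf chunk against a root hash held in trusted storage. The node names follow the slide (A is the root, D through G are leaves); FNV-1a is only a stand-in for a cryptographic hash such as SHA-1, and the fixed binary tree is an illustrative assumption.

    // Minimal hash-tree sketch (stand-ins only): FNV-1a replaces a real
    // cryptographic hash such as SHA-1; node layout follows the slide (A..G).
    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <vector>

    static uint64_t fnv1a(const std::string& data) {
        uint64_t h = 1469598103934665603ULL;
        for (unsigned char c : data) { h ^= c; h *= 1099511628211ULL; }
        return h;
    }

    // Internal node hash = H(child hashes); only the root hash H(A) lives in
    // trusted storage, everything else sits on untrusted storage.
    static uint64_t node_hash(uint64_t left, uint64_t right) {
        return fnv1a(std::to_string(left) + "|" + std::to_string(right));
    }

    int main() {
        // Leaf chunks D..G on untrusted storage.
        std::vector<std::string> leaves = {"chunk D", "chunk E", "chunk F", "chunk G"};
        uint64_t hD = fnv1a(leaves[0]), hE = fnv1a(leaves[1]);
        uint64_t hF = fnv1a(leaves[2]), hG = fnv1a(leaves[3]);
        uint64_t hB = node_hash(hD, hE), hC = node_hash(hF, hG);
        uint64_t hA = node_hash(hB, hC);          // trusted root hash H(A)

        // Reading chunk D: recompute the path D -> B -> A and compare with H(A).
        uint64_t check = node_hash(node_hash(fnv1a("chunk D"), hE), hC);
        std::cout << (check == hA ? "valid" : "tampered") << "\n";
        return 0;
    }

Only H(A) needs to be trusted; reading any leaf touches one root-to-leaf path, which is the logarithmic overhead the open issue on this slide refers to.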

Slide 10: TDB Architecture
(figure: layered architecture between trusted and untrusted storage)
- Chunk Store: encryption, hashing, atomic updates; a chunk is a byte sequence of 100 B to 100 KB
- Object Store: object cache, concurrency control; an object is an abstract type
- Collection Store: index maintenance; scan, match, and range queries over collections of objects
- Backup Store: full and incremental backups, validated restore

Slide 11: Chunk Store - Specification
- Interface: allocate() -> ChunkId; write(ChunkId, Buffer); read(ChunkId) -> Buffer; deallocate(ChunkId)
- Crash atomicity: commit = [ write | deallocate ]*
- Tamper detection: raise an exception if chunk validation fails
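
A possible C++ rendering of this interface follows; the Buffer alias, the exception type, and the explicit commit() method are illustrative assumptions rather than TDB's actual declarations.

    // Hypothetical C++ rendering of the Chunk Store interface on this slide.
    #include <cstdint>
    #include <stdexcept>
    #include <vector>

    using ChunkId = uint64_t;
    using Buffer  = std::vector<uint8_t>;

    // Raised when a chunk fails hash validation (tamper detection).
    struct TamperDetected : std::runtime_error {
        using std::runtime_error::runtime_error;
    };

    class ChunkStore {
    public:
        virtual ~ChunkStore() = default;
        virtual ChunkId allocate() = 0;                       // allocate() -> ChunkId
        virtual void    write(ChunkId id, const Buffer& b) = 0;
        virtual Buffer  read(ChunkId id) = 0;                 // may throw TamperDetected
        virtual void    deallocate(ChunkId id) = 0;
        // Crash atomicity: all writes/deallocates since the last commit become
        // durable together, or not at all.
        virtual void    commit() = 0;
    };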

Slide 12: Chunk Store - Storage Organization
- Log-structured storage organization: no static representation of chunks outside of the log; the log lives in untrusted storage
- Advantages: traffic analysis cannot link updates to the same chunk; atomic updates come for free; variable-sized chunks are easily supported; copy-on-write snapshots enable fast backups; integrates well with hash verification (see the next slide)
- Disadvantages: destroys clustering (mitigated by a cacheable working set); cleaning overhead (mitigated by plenty of idle time)

Slide 13: Chunk Store - Chunk Map
- Integrates the hash tree and the location map
- Map: ChunkId -> Handle; Handle = <Hash, Location>; MetaChunk = Array[Handle]
(figure: meta chunks R, S, T point to data chunks X and Y; trusted storage holds H(R))
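
A minimal sketch of these structures, under assumed field widths and an assumed fan-out of 256, so each level of the map consumes one byte of the ChunkId:

    // Data-structure sketch of the chunk map; field widths and the fan-out
    // constant are illustrative assumptions, not TDB's real layout.
    #include <array>
    #include <cstddef>
    #include <cstdint>

    struct Handle {
        uint64_t hash;      // truncated cryptographic hash of the chunk's contents
        uint64_t location;  // position of the current chunk version in the log
    };

    // A meta chunk is simply an array of handles; meta chunks are themselves
    // stored as chunks, so the map forms a tree rooted at a handle kept in
    // trusted storage (H(R) on the slide).
    constexpr std::size_t kFanout = 256;               // assumed fan-out
    using MetaChunk = std::array<Handle, kFanout>;

    // Logical map: ChunkId -> Handle, realized by walking meta chunks from the
    // root; with a fan-out of 256, each level consumes one byte of the ChunkId.
    inline std::size_t levelIndex(uint64_t chunkId, unsigned level) {
        return (chunkId >> (8 * level)) % kFanout;
    }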

Slide 14: Chunk Store - Read
- Basic scheme: dereference handles from the root down to X
- Dereference: use the location to fetch, use the hash to validate
- Optimized: a trusted cache maps ChunkId -> Handle; look for a cached handle upward from X, then dereference handles down to X; this avoids validating the entire path
(figure: path from root R through the meta chunks to data chunk X, with cached handles; trusted storage holds H(R))
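
The sketch below shows the dereference step only, flattened to a single meta-chunk level and using in-memory stand-ins for the untrusted log and the hash; the upward search for cached handles and the validation of the meta chunk itself are elided.

    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <stdexcept>
    #include <string>
    #include <vector>

    using Bytes = std::vector<uint8_t>;

    // Stand-in for a cryptographic hash such as SHA-1 (FNV-1a).
    uint64_t hashBytes(const Bytes& d) {
        uint64_t h = 1469598103934665603ULL;
        for (uint8_t c : d) { h ^= c; h *= 1099511628211ULL; }
        return h;
    }

    struct Handle { uint64_t hash; uint64_t location; };

    std::map<uint64_t, Bytes>  untrustedLog;    // location -> stored bytes (untrusted)
    std::map<uint64_t, Handle> rootMetaChunk;   // ChunkId -> Handle (itself hashed in TDB)

    // Dereference a handle: use the location to fetch, the hash to validate.
    Bytes deref(const Handle& h) {
        Bytes data = untrustedLog.at(h.location);
        if (hashBytes(data) != h.hash) throw std::runtime_error("tamper detected");
        return data;
    }

    int main() {
        Bytes payload = {'h', 'i'};
        untrustedLog[100] = payload;                             // chunk 7's current version
        rootMetaChunk[7] = Handle{hashBytes(payload), 100};

        Bytes ok = deref(rootMetaChunk.at(7));                   // fetch + validate
        std::cout << std::string(ok.begin(), ok.end()) << "\n";  // prints "hi"

        untrustedLog[100][0] = 'X';                              // simulate tampering
        try { deref(rootMetaChunk.at(7)); }
        catch (const std::exception& e) { std::cout << e.what() << "\n"; }
        return 0;
    }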

Slide 15: Chunk Store - Write
- Basic: write chunks from X up to the root
- Optimized: buffer the dirty handle of X in the cache and defer the upward propagation
(figure: dirty path from X up to root R; trusted storage holds H(R))

Slide 16: Chunk Store - Checkpointing the Map
- When dirty handles fill the cache: write the affected meta chunks to the log, writing the root chunk last
(figure: log containing chunk versions of X followed by meta chunks S, T and root chunk R; trusted storage holds H(R))
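
A checkpoint sketch, again flattened to one meta-chunk level: dirty handles buffered in the trusted cache are folded into the root chunk R, R is appended to the log, and the trusted hash H(R) is updated last. All names and containers are illustrative.

    #include <cstdint>
    #include <map>
    #include <vector>

    struct Handle { uint64_t hash; uint64_t location; };

    std::map<uint64_t, Handle> rootMetaChunk;             // ChunkId -> Handle (root chunk R)
    std::map<uint64_t, Handle> dirtyHandles;              // buffered in the trusted cache
    std::vector<std::map<uint64_t, Handle>> untrustedLog; // stand-in for the log
    uint64_t trustedRootHash = 0;                         // H(R), kept in trusted storage

    // Stand-in hash of a meta chunk: folds every (id, hash, location) triple.
    uint64_t hashMetaChunk(const std::map<uint64_t, Handle>& m) {
        uint64_t h = 1469598103934665603ULL;
        auto mix = [&h](uint64_t v) { h ^= v; h *= 1099511628211ULL; };
        for (const auto& [id, hd] : m) { mix(id); mix(hd.hash); mix(hd.location); }
        return h;
    }

    // Triggered when dirty handles fill the cache.
    void checkpoint() {
        for (const auto& [id, hd] : dirtyHandles)          // fold buffered updates into R
            rootMetaChunk[id] = hd;
        dirtyHandles.clear();
        untrustedLog.push_back(rootMetaChunk);             // append the affected meta
                                                           // chunk(s), the root going last
        trustedRootHash = hashMetaChunk(rootMetaChunk);    // finally update H(R)
    }

    int main() {
        dirtyHandles[7] = Handle{0xABCD, 100};             // buffered update to chunk 7
        checkpoint();
        return trustedRootHash == hashMetaChunk(rootMetaChunk) ? 0 : 1;  // 0 = consistent
    }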

Slide 17: Chunk Store - Crash Recovery
- Process the log forward from the last root chunk; the portion after it is the residual log, the portion before it the checkpointed log
- The residual log must be validated
(figure: checkpointed log ending in meta chunks S, T and root chunk R, followed by the residual log up to the crash; trusted storage holds H(R))

Slide 18: Chunk Store - Validating the Log
- Keep an incremental hash of the residual log in trusted storage, updated after each commit
- The hash protects all current chunks: those in the residual log directly, and those in the checkpointed log through the chunk map
(figure: as on the previous slide, but trusted storage now holds H*(residual-log))
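
A small, runnable illustration of the incremental hash, with FNV-1a standing in for the real cryptographic hash or MAC:

    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <vector>

    // Extend a running hash with one log record (stand-in for a cryptographic hash/MAC).
    uint64_t mix(uint64_t h, const std::string& data) {
        for (unsigned char c : data) { h ^= c; h *= 1099511628211ULL; }
        return h;
    }

    int main() {
        const uint64_t seed = 1469598103934665603ULL;

        // Normal operation: each commit appends a record to the residual log
        // (untrusted storage) and extends H* (trusted storage).
        std::vector<std::string> residualLog;
        uint64_t trustedHash = seed;                    // H*(residual-log)
        std::vector<std::string> commits = {"commit 1", "commit 2", "commit 3"};
        for (const std::string& rec : commits) {
            residualLog.push_back(rec);
            trustedHash = mix(trustedHash, rec);        // updated after each commit
        }

        // Crash recovery: re-hash the residual log and compare before replaying it.
        uint64_t replayHash = seed;
        for (const std::string& rec : residualLog) replayHash = mix(replayHash, rec);
        std::cout << (replayHash == trustedHash ? "residual log valid" : "tampered") << "\n";
        return 0;
    }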

Slide 19: Chunk Store - Counter-Based Log Validation
- A commit chunk is written with each commit; it contains a sequential hash of the commit set and is signed with the system secret key
- A one-way counter is used to prevent replays
- Benefits: allows a bounded discrepancy between trusted and untrusted storage; does not require writing to trusted storage after each transaction
(figure: residual log containing commit chunks c.c. 73 and c.c. 74)
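
One possible reading of the replay check is sketched below: each commit chunk carries a sequence number and a keyed hash (standing in for the signature with the system secret key), and recovery rejects a log whose newest commit chunk lags the trusted one-way counter by more than an assumed bound. The bound, the toy MAC, and all names are assumptions made for illustration, not TDB's actual scheme.

    #include <cstdint>
    #include <iostream>
    #include <string>

    // Toy keyed hash standing in for the real signature/MAC with the secret key.
    uint64_t toyMac(uint64_t key, const std::string& data, uint64_t counter) {
        uint64_t h = key;
        for (unsigned char c : data) { h ^= c; h *= 1099511628211ULL; }
        h ^= counter; h *= 1099511628211ULL;
        return h;
    }

    struct CommitChunk {
        uint64_t counter;        // sequence number of this commit
        uint64_t commitSetHash;  // (stand-in) sequential hash of the commit set
        uint64_t mac;            // keyed over the two fields above
    };

    constexpr uint64_t kBound = 8;   // assumed allowed discrepancy

    bool acceptLog(const CommitChunk& last, uint64_t oneWayCounter, uint64_t secretKey) {
        // Reject forged commit chunks.
        if (last.mac != toyMac(secretKey, std::to_string(last.commitSetHash), last.counter))
            return false;
        // Reject replays of an old log: the newest commit chunk may not be more
        // than kBound commits behind the trusted one-way counter.
        return last.counter + kBound >= oneWayCounter;
    }

    int main() {
        uint64_t key = 0xC0FFEE;
        CommitChunk c{74, 1234, toyMac(key, "1234", 74)};
        std::cout << acceptLog(c, /*oneWayCounter=*/75, key) << "\n";  // 1: within bound
        std::cout << acceptLog(c, /*oneWayCounter=*/90, key) << "\n";  // 0: stale replay
        return 0;
    }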

Slide 20: Chunk Store - Log Cleaning
- The log cleaner creates free space by reclaiming obsolete chunk versions
- Segments: the log is divided into fixed-sized regions called segments (~100 KB); segments are securely linked in the residual log for recovery
- Cleaning step (sketched below): read one or more segments; check the chunk map to find live chunk versions (using the ChunkIds in the chunk version headers); write the live chunk versions to the end of the log; mark the segments as free
- Segments in the residual log may not be cleaned
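
Here is the in-memory sketch of the cleaning step referred to above; the segment layout, the chunk map representation, and all names are illustrative.

    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <utility>
    #include <vector>

    struct ChunkVersion { uint64_t chunkId; std::vector<uint8_t> data; };
    struct Segment { std::vector<ChunkVersion> versions; bool free = false; };

    // ChunkId -> (segment index, slot) of the current (live) version.
    std::map<uint64_t, std::pair<std::size_t, std::size_t>> chunkMap;
    std::vector<Segment> logSegments;

    // Clean one segment; assumes a fresh tail segment (logSegments.back()) is
    // open and is not the segment being cleaned.
    void cleanSegment(std::size_t s) {
        for (std::size_t i = 0; i < logSegments[s].versions.size(); ++i) {
            const ChunkVersion& v = logSegments[s].versions[i];
            auto it = chunkMap.find(v.chunkId);           // ChunkId from the version header
            if (it == chunkMap.end() || it->second != std::make_pair(s, i))
                continue;                                 // obsolete version: just drop it
            // Live version: rewrite it at the end of the log and repoint the map.
            logSegments.back().versions.push_back(v);
            chunkMap[v.chunkId] = {logSegments.size() - 1,
                                   logSegments.back().versions.size() - 1};
        }
        logSegments[s].free = true;                       // segment can now be reused
    }

    int main() {
        logSegments.resize(2);                              // segment 0 to clean, 1 is the tail
        logSegments[0].versions = {{7, {'a'}}, {7, {'b'}}}; // two versions of chunk 7
        chunkMap[7] = {0, 1};                               // only the second one is live
        cleanSegment(0);
        std::cout << "live versions moved: " << logSegments[1].versions.size() << "\n"; // 1
        return 0;
    }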

Slide 21: Chunk Store - Multiple Partitions
- Partitions may use separate crypto parameters (algorithms, keys)
- Enables fast copy-on-write snapshots and efficient backups
- Makes it more difficult for the cleaner to test chunk version liveness
(figure: partition map, per-partition position maps, and data chunks for partitions P and Q, before and after a snapshot)

Slide 22: Chunk Store - Cleaning and Partition Snapshots
(figure: worked example with chunks P.a, P.b, P.c: snapshot of P into Q; P updates c; the cleaner moves Q's copy of c; checkpoint; crash; residual log)

Slide 23: Backup Store
- Creates and restores full or incremental backups of partitions
- Backup creation uses snapshots to guarantee backup consistency (with respect to concurrent updates) without locking
- During a restore, the Backup Store must verify the integrity of the backup (using a signature) and the correctness of the incremental restore sequencing

Slide 24: Object Store
- Provides type-safe access to named C++ objects; objects provide pickle and unpickle methods for persistence (no transparent persistence)
- Implements full transactional semantics in addition to atomic updates
- Maps each object to a single chunk: less data is written to and read from the log, and concurrency control is simpler
- Provides an in-memory cache of decrypted, validated, unpickled, type-checked C++ objects
- Implements a no-steal buffer management policy
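
A sketch of the pickle/unpickle convention, with an invented Account type and byte layout; the pickled buffer is what the Object Store would hand to the Chunk Store, which then encrypts and hashes it.

    #include <cstdint>
    #include <cstring>
    #include <iostream>
    #include <vector>

    using Buffer = std::vector<uint8_t>;

    struct Account {
        uint64_t id = 0;
        int64_t  balance = 0;

        Buffer pickle() const {                        // object -> chunk contents
            Buffer b(sizeof(id) + sizeof(balance));
            std::memcpy(b.data(), &id, sizeof(id));
            std::memcpy(b.data() + sizeof(id), &balance, sizeof(balance));
            return b;
        }
        static Account unpickle(const Buffer& b) {     // chunk contents -> object
            Account a;
            std::memcpy(&a.id, b.data(), sizeof(a.id));
            std::memcpy(&a.balance, b.data() + sizeof(a.id), sizeof(a.balance));
            return a;
        }
    };

    int main() {
        Account a{42, 1000};
        Buffer chunk = a.pickle();                     // one object = one chunk
        Account b = Account::unpickle(chunk);
        std::cout << b.id << " " << b.balance << "\n"; // prints "42 1000"
        return 0;
    }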

Slide 25: Collection Store
- Provides access to indexed collections of C++ objects using scan, exact-match, and range queries
- Performs automatic index maintenance during updates; implements insensitive iterators
- Uses functional indices: an extractor function obtains a key from an object (see the sketch below)
- Collections and indexes are represented as objects; index nodes are locked according to 2PL
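
The sketch below illustrates a functional index: an extractor function pulls the key out of each object, and the index is maintained automatically on insert. The Account type, the key type, and the single exact-match query are illustrative; scan and range queries and 2PL locking of index nodes are omitted here.

    #include <cstddef>
    #include <cstdint>
    #include <functional>
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    struct Account { uint64_t id; std::string owner; int64_t balance; };

    class IndexedCollection {
    public:
        using Extractor = std::function<std::string(const Account&)>;
        explicit IndexedCollection(Extractor e) : extract_(std::move(e)) {}

        void insert(const Account& a) {                // automatic index maintenance
            objects_.push_back(a);
            index_.emplace(extract_(a), objects_.size() - 1);
        }
        // Exact-match query answered through the index.
        std::vector<Account> match(const std::string& key) const {
            std::vector<Account> out;
            auto [lo, hi] = index_.equal_range(key);
            for (auto it = lo; it != hi; ++it) out.push_back(objects_[it->second]);
            return out;
        }
    private:
        Extractor extract_;
        std::vector<Account> objects_;
        std::multimap<std::string, std::size_t> index_;  // key -> position in collection
    };

    int main() {
        IndexedCollection byOwner([](const Account& a) { return a.owner; });
        byOwner.insert({1, "alice", 100});
        byOwner.insert({2, "bob", 200});
        for (const Account& a : byOwner.match("bob")) std::cout << a.id << "\n";  // 2
        return 0;
    }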

Slide 26: Performance Evaluation - Benchmark
- Compared TDB to BerkeleyDB using TPC-B
- TPC-B was used because an implementation is included with BerkeleyDB, and BerkeleyDB's functionality limited the choice of benchmarks (e.g., one index per collection)

Slide 27: Performance Evaluation - Setup
- Evaluation platform: 733 MHz Pentium III, 256 MB RAM; Windows NT 4.0, NTFS files; EIDE disk with 8.9 ms read and 10.9 ms write seek time, 7200 RPM (4.2 ms avg. rotational latency); one-way counter implemented as a file on NTFS
- Both systems used a 4 MB cache
- Crypto parameters (for the secure version of TDB): SHA-1 for hashing (hash truncated to 12 B), 3DES for encryption

Slide 28: Performance Evaluation - Results
- Response time, averaged over 100,000 transactions in a steady state
- TDB utilization was set to 60%
(bar chart: average response time in ms for BerkeleyDB, TDB, and TDB-S; plotted values are 6.8, 3.8, and 5.8 ms)

Slide 29: Response Time vs. Utilization
- Measured response times for different TDB utilizations (chart)

Slide 30: Related Work
- Theoretical work: Merkle trees (Merkle, 1980); checking the correctness of memories (Blum et al., 1992)
- Secure audit logs (Schneier and Kelsey, 1998): append-only data, read sequentially
- Secure file systems: Cryptographic FS (Blaze, 1993); read-only SFS (Fu et al., 2000); Protected FS (Stein et al., 2001)

Slide 31: A Retrospective Instead of Conclusions
- Got a lot of mileage from using log-structured storage
- Partitions add a lot of complexity
- Cleaning is not a big problem
- Crypto overhead is small on modern PCs (<6%)
- The code footprint is too large for many embedded systems, which need it within 10 KB; see GnatDb (described in a technical report)
- For more information: U. Maheshwari, R. Vingralek, and W. Shapiro, "How to Build a Trusted Database System on Untrusted Storage," OSDI 2000; technical reports available at http://www.star-lab.com/tr/

Slide 32: Database Size vs. Utilization (chart)


