Slide 1: Increasing Intrusion Tolerance Via Scalable Redundancy
Carnegie Mellon, December 2005 SRS Principal Investigator Meeting
Mike Reiter (reiter@cmu.edu), Natassa Ailamaki, Greg Ganger, Priya Narasimhan, Chuck Cranor
Slide 2: Technical Objective
- Design, prototype, and evaluate new protocols for implementing intrusion-tolerant services that scale better. Here, "scale" refers to efficiency as the number of servers and the number of failures tolerated grow.
- Targeting three types of services:
  - Read-write data objects
  - Custom "flat" object types for particular applications, notably directories for implementing an intrusion-tolerant file system
  - Arbitrary objects that support object nesting
Slide 3: The Problem Space
- Distributed services manage redundant state across servers to tolerate faults.
- We consider tolerance to Byzantine faults, as might result from an intrusion into a server or client: a faulty server or client may behave arbitrarily.
- We also make no timing assumptions in this work (an "asynchronous" system).
- Primary existing practice: replicated state machines. This approach offers no load dispersion, requires full data replication, and degrades in message count as the system scales.
- When appropriate, we compare against Castro & Liskov's BFT system.
Slide 4: This Talk in Context
- January 2005 PI meeting: focused on the basic read/write (R/W) protocol
- July 2005 PI meeting: focused on the Q/U protocol for implementing arbitrary "flat" objects
- This meeting:
  - Discuss "lazy verification" extensions to the R/W protocol
  - Discuss the nested objects protocol
Slide 5: Highlights: Read/Write Response Time
Response time under load is fault-scalable.
- 10 clients (each with 2 requests outstanding) and up to 26 storage-nodes
- 2.8 GHz Pentium IV machines used as both storage-nodes and clients
- Mixed workload: equal parts reads and writes
- 4 KB data-item size
[Graph: read/write response time under load as more failures are tolerated.]
Slide 6: Highlights: The Q/U Protocol
- Working set of experiments fit in server memory
- Tests run for 30 seconds; measurements taken in the middle 10
- Cluster of Pentium 4 2.8 GHz machines with 1 GB RAM
- 1 Gb switched Ethernet, no background traffic
Slide 7: Read/Write Failure Scenarios
Two types of failures:
- Incomplete writes: a client writes data to only a subset of servers.
- Poisonous writes: a client writes data inconsistently to servers, so subsequent readers observe different values depending on which subset of servers they interact with.
Replicated data is easy to handle (via hashes); erasure-coded data is more difficult. Protocols must verify writes to protect against incomplete and poisonous writes.
Slide 8: The Nature of Write Operations
Insights for protocol design:
1. A single data version forces write-time verification; versioning servers remove the destructive nature of writes.
2. Obsolescent writes are common in storage systems; read-time verification avoids unnecessary verifications.
3. Most workloads exhibit low concurrency, which favors optimistic concurrency control.
Slide 9: Original Read/Write Protocol
- Use versioning servers: frees servers from verifying every write at write-time
- Read-time verification performed by clients:
  - Better scalability
  - Avoids verification for obsolescent writes
  - Clients read earlier versions in case of incomplete/poisonous writes (see the sketch below)
- Optimism premised on low faults/concurrency
- Supports erasure codes; Byzantine-tolerant; asynchronous
- Linearizable read/write operations on blocks
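A minimal sketch of this read-time fallback, assuming a simplified local model of the protocol: the Version record, Status enum, and the classification of each version are hypothetical stand-ins for the completeness and consistency checks described on later slides.

```java
import java.util.List;

public class ReadFallback {
    enum Status { COMPLETE, INCOMPLETE, POISONOUS }

    // One candidate version as the reader classifies it; in the real
    // protocol the status comes from cross-checksum and quorum checks.
    record Version(long timestamp, Status status) {}

    // Walk versions from newest to oldest, skipping writes that
    // read-time verification classifies as incomplete or poisonous.
    static long readLatestComplete(List<Version> newestFirst) {
        for (Version v : newestFirst) {
            if (v.status() == Status.COMPLETE) {
                return v.timestamp();
            }
        }
        throw new IllegalStateException("no complete write found");
    }

    public static void main(String[] args) {
        List<Version> log = List.of(
                new Version(3, Status.INCOMPLETE),   // crashed writer
                new Version(2, Status.POISONOUS),    // Byzantine writer
                new Version(1, Status.COMPLETE));
        System.out.println(readLatestComplete(log)); // prints 1
    }
}
```

The optimism is visible here: in the common case (no faults, no concurrency) the newest version classifies as complete on the first iteration and no extra work is done.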
Slide 10: Example Write and Read
[Timeline figure: a client writes version 1 and then version 2 to the servers; write 1 completes, then write 2 completes; a subsequent read returns version 2.]
Slide 11: Tolerating Client Crashes
[Timeline figure: a client crashes partway through writing version 2, leaving servers holding different versions. A later reader detects the incomplete write, repairs it by writing version 2 to the rest of the servers, and returns version 2.]
Slide 12: Erasure Coding
- Reed-Solomon / information dispersal [Rabin89]
- Each fragment is 1/m of the object size
- Example: 2-of-5 erasure coding of a 64 KB object (checked in the sketch below)
  - Write: encode the object into n = 5 fragments; total data written is 160 KB
  - Read: any m = 2 fragments (64 KB) decode back to the 64 KB object
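The slide's arithmetic can be reproduced directly. This sketch just verifies the 2-of-5 numbers; it is bookkeeping only, not an actual Reed-Solomon encoder.

```java
public class ErasureMath {
    public static void main(String[] args) {
        int objectBytes = 64 * 1024;  // 64 KB object
        int m = 2, n = 5;             // any m of the n fragments suffice to decode

        int fragmentBytes = objectBytes / m;   // each fragment is 1/m of the object: 32 KB
        int totalWritten  = n * fragmentBytes; // written across all servers: 160 KB
        int totalRead     = m * fragmentBytes; // read to reconstruct: 64 KB

        System.out.printf("fragment = %d KB, written = %d KB, read = %d KB%n",
                fragmentBytes / 1024, totalWritten / 1024, totalRead / 1024);
    }
}
```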
Slide 13: Tolerating Byzantine Servers: Cross Checksum
Cross checksum for 2-of-3 erasure coding (sketched below):
1. Generate the fragments
2. Hash each fragment
3. Concatenate the hashes to form the cross checksum
4. Append the cross checksum to each fragment
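A minimal sketch of the construction, assuming SHA-1 as the hash function (the deck does not name one) and placeholder fragment contents:

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;

public class CrossChecksum {
    public static void main(String[] args) throws Exception {
        // Three placeholder fragments from a 2-of-3 encoding.
        byte[][] fragments = { "frag-1".getBytes(), "frag-2".getBytes(), "frag-3".getBytes() };

        MessageDigest sha = MessageDigest.getInstance("SHA-1");
        ByteArrayOutputStream cc = new ByteArrayOutputStream();
        for (byte[] f : fragments) {
            cc.write(sha.digest(f));  // hash of each fragment, concatenated in order
        }
        byte[] crossChecksum = cc.toByteArray();  // appended to every fragment on write
        System.out.println("cross checksum bytes: " + crossChecksum.length); // 3 * 20
    }
}
```

Because every server holds the hashes of all n fragments, a reader can detect a Byzantine server that returns a corrupted fragment.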
Slide 14: Tolerating Byzantine Clients
[Figure: a "poisonous" 2-of-3 erasure coding of the value {1,0}. A Byzantine client generates the parity fragment inconsistently with the data fragments, so the value read depends on which set of fragments is decoded: {1,0}, {1,1}, or {0,0}.]
Slide 15: Validating Timestamps
- Embed the cross checksum in the logical timestamp
- On write: each server validates its own fragment against the cross checksum (sketched below)
- On read: the client reads the fragments and validates the cross checksum
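A sketch of the per-server validation step, again assuming SHA-1 and a flat concatenation layout for the cross checksum; the names here are illustrative, not the actual protocol API.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.util.Arrays;

public class ValidateFragment {
    static final int HASH_LEN = 20; // SHA-1 digest length

    // Server i recomputes the hash of the fragment it received and
    // compares it to slot i of the cross checksum carried in the
    // logical timestamp. A mismatch means the write is rejected.
    static boolean validate(int i, byte[] fragment, byte[] crossChecksum) throws Exception {
        byte[] h = MessageDigest.getInstance("SHA-1").digest(fragment);
        byte[] expected = Arrays.copyOfRange(crossChecksum, i * HASH_LEN, (i + 1) * HASH_LEN);
        return MessageDigest.isEqual(h, expected);
    }

    public static void main(String[] args) throws Exception {
        byte[][] frags = { "frag-0".getBytes(), "frag-1".getBytes() };
        MessageDigest sha = MessageDigest.getInstance("SHA-1");
        ByteArrayOutputStream cc = new ByteArrayOutputStream();
        for (byte[] f : frags) cc.write(sha.digest(f));
        System.out.println(validate(0, frags[0], cc.toByteArray())); // true
        System.out.println(validate(1, frags[0], cc.toByteArray())); // false: wrong slot
    }
}
```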
Slide 16: How Do You Get Rid of Old Versions?
Two more pieces complete the picture:
- Garbage collection (GC)
- Bounding the otherwise unbounded number of incomplete/poisonous writes
Slide 17: Lazy Verification Overview
- Servers can perform verification lazily, in idle time
  - Shifts verification cost out of the read/write critical path
  - Allows servers to perform GC
- Per-client, per-block limits on unverified writes
  - Limits the number of incomplete/poisonous writes
- Maintains the good R/W-protocol properties
  - Optimism
  - Verification elimination for obsolescent writes
Slide 18: Basic Garbage Collection
Periodically, every server:
- Scans through all blocks
- Performs a verification read (acting like a normal client)
  - Discovers the latest complete write timestamp (LCWTS)
  - Reconstructs the block to check for poisonous writes
- Deletes all versions prior to the LCWTS (the deletion step is sketched below)
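A sketch of the deletion step on a single server, assuming the per-block version log is held in a sorted map keyed by timestamp; discovery of the LCWTS itself (the verification read) is elided.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class VersionGC {
    public static void main(String[] args) {
        // Stand-in for one block's version log: timestamp -> fragment data.
        NavigableMap<Long, byte[]> versions = new TreeMap<>();
        versions.put(1L, new byte[0]);
        versions.put(2L, new byte[0]);
        versions.put(5L, new byte[0]);

        long lcwts = 5L;  // latest complete write timestamp, from the verification read

        // Every version strictly older than the LCWTS is now safe to delete:
        // the complete write at lcwts supersedes them all.
        versions.headMap(lcwts, false).clear();
        System.out.println(versions.keySet());  // prints [5]
    }
}
```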
Slide 19: Limiting Unverified Writes
- An administrator can set limits on the number of unverified writes: per-client, per-block, and per-client-per-block (a sketch follows)
- Limit = 1 degenerates to write-time verification
- Limit = ∞ degenerates to pure read-time verification
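A sketch of one of these counters (per-client only, for brevity); the limit value, names, and reset-on-verification behavior are assumptions about one plausible bookkeeping scheme, not the measured system.

```java
import java.util.HashMap;
import java.util.Map;

public class WriteLimits {
    static final int PER_CLIENT_LIMIT = 16;  // illustrative; set by the administrator
    final Map<String, Integer> unverified = new HashMap<>();

    // Returns false when the client has hit its limit, forcing
    // verification before any further writes are accepted.
    boolean admitWrite(String clientId) {
        int count = unverified.getOrDefault(clientId, 0);
        if (count >= PER_CLIENT_LIMIT) {
            return false;  // caller must run lazy verification first
        }
        unverified.put(clientId, count + 1);
        return true;
    }

    void onVerified(String clientId) {
        unverified.put(clientId, 0);  // verification clears the client's debt
    }
}
```

Setting PER_CLIENT_LIMIT to 1 recovers write-time verification; removing the check recovers pure read-time verification, matching the two extremes on the slide.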
Slide 20: Scheduling
- Background: in idle time (hence the "lazy" in lazy verification)
- On demand, when:
  - Verification limits are reached
  - Free space in the history pool (cache or disk) runs low
Slide 21: Block Selection
- If verification is invoked because a limit was exceeded: no choice; verify that client's block
- Otherwise: verify the block with the most versions, to maximize amortization of the verification cost
- Prefer to verify blocks in cache (a sketch of this policy follows the list):
  - No unnecessary disk write
  - No disk read to start verification
  - No cleaning of on-disk version structures
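A sketch of the selection policy under the stated preferences; the Candidate shape is invented for illustration.

```java
import java.util.Comparator;
import java.util.List;

public class BlockSelection {
    // One block eligible for lazy verification.
    record Candidate(String blockId, int versionCount, boolean inCache) {}

    // Prefer cached blocks (no disk read to start verification),
    // then the block with the most unverified versions, so each
    // verification pass cleans up as much as possible.
    static Candidate pick(List<Candidate> candidates) {
        return candidates.stream()
                .max(Comparator.comparing(Candidate::inCache)
                        .thenComparingInt(Candidate::versionCount))
                .orElseThrow();
    }

    public static void main(String[] args) {
        Candidate best = pick(List.of(
                new Candidate("a", 9, false),
                new Candidate("b", 3, true),
                new Candidate("c", 7, true)));
        System.out.println(best.blockId());  // "c": cached, most versions
    }
}
```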
Slide 22: Server Cooperation
Simple approach: every server independently verifies every block, costing ~n² messages (each of the n servers sends a read request to, and receives a read reply from, every server).
Slide 23: Server Cooperation (cont.)
Cooperative approach: only b + 1 servers perform verification and share the result via verification hints, costing ~b·n messages (the figure shows b = 1).
Slide 24: Experimental Setup
- 2.8 GHz Pentium 4 machines, used as both servers and clients
- 1 Gb switched Ethernet, no background traffic
- In-cache only (to evaluate protocol cost)
- 16 KB blocks
- Vary the number of server Byzantine failures tolerated (b)
- n = 4b + 1 servers with (b+1)-of-n encoding, for maximal storage and network efficiency (e.g., b = 1 gives n = 5 and 2-of-5 encoding)
Slide 25: Response Time Experiment
- 1 client, 1 outstanding request
- Vary b from 1 to 5 to investigate how response times change as more server failures are tolerated
- Alternate between reads and writes
- Idle time of 10 ms between operations, allowing verification to occur in the background
Slides 26-28: Write Response Time
[Graph, built up across three slides: write response time as b varies from 1 to 5.]
Slides 29-31: Read Response Time
[Graph, built up across three slides: read response time as b varies from 1 to 5.]
Slide 32: Write Throughput
- b = 1, n = 5
- 4 clients, 8 outstanding requests each; no idle time
- Server working set: 4096 blocks (64 MB); 100% writes; in-cache only
- A full history pool triggers lazy verification
- Vary the server history pool size to see the effect of delaying verification
Slides 33-36: Write Throughput (cont.)
[Graph, built up across four slides: write throughput versus history pool size.]
Slide 37: Nested Objects
- Goal: support nested method invocations among Byzantine fault-tolerant, replicated objects that are accessed via quorum systems
- Semantics and programmer interface are modeled after Java Remote Method Invocation (RMI) [http://java.sun.com/products/jdk/rmi/]
- Distributed objects can be:
  - Passed as parameters to method calls on other distributed objects
  - Returned from method calls on other distributed objects
Slide 38: Java Remote Method Invocation (RMI)
- Standard Java mechanism to invoke methods on objects in other JVMs
- Local interactions are with a handle that implements the interfaces of the remote object (a minimal example follows)
[Figure: a local client holds a handle; the invocation travels to the remote object on the remote server, and the response returns through the handle.]
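To make the programming model concrete, here is a minimal remote interface in standard java.rmi; the Directory type and lookup method are invented for this example, echoing the directory objects mentioned in the technical objective.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// A remote object implements this interface; a client holds a handle
// (stub) that implements the same interface and marshals each call
// to the remote JVM.
public interface Directory extends Remote {
    // Every remote method must declare RemoteException, since any
    // invocation can fail with a communication error.
    String lookup(String name) throws RemoteException;
}
```

A client obtains a Directory handle (e.g., from an RMI registry) and calls lookup on it exactly as it would on a local object; the handle hides the remoteness.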
Slide 39: RMI: Nested Method Invocations
- Handles can be passed as parameters into method invocations on other remote objects
- A method invocation on one remote object can therefore result in method invocations on other remote objects
Slide 40: RMI: Handle Returned
Handles can be returned from method invocations on other remote objects.
Slide 41: Replicated Objects
- Replicas behave as a single logical object
- Can withstand the Byzantine (arbitrary) failure of up to b servers
- Scales linearly with the number of servers
[Figure: a client handle invoking replicas A, B, C, D of one logical object.]
Slide 42: Quorum Systems
- Given a universe of n servers, a quorum system is a set of subsets (quorums) of the universe, every pair of which intersect
- Scales well as a function of n, since quorum size can be significantly smaller than n
- Example: a grid with n = 144, where one quorum is 1 row + 1 column
[Figure: two intersecting quorums q1 and q2 on the 12x12 grid.]
Slide 43: Byzantine Quorum Systems
- Extend quorum systems to withstand the Byzantine failure of up to b servers
- Every pair of quorums intersects in at least 2b + 1 servers (hence at least b + 1 correct servers)
- A new quorum must be selected if a response is not received from every server in a quorum
- Example: a grid with n = 144 and b = 3, where one quorum is 2 rows + 2 columns (a sketch of this construction follows)
[Figure: two intersecting quorums q1 and q2 on the 12x12 grid.]
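This sketch checks the slide's grid construction numerically: with b = 3, two quorums of 2 rows + 2 columns each intersect in at least 2b + 1 = 7 servers (8 in the fully disjoint case below, since each row of one quorum crosses each column of the other).

```java
import java.util.HashSet;
import java.util.Set;

public class GridQuorum {
    static final int SIDE = 12;  // 12 x 12 = 144 servers, numbered 0..143

    // A quorum is the union of two full rows and two full columns.
    static Set<Integer> quorum(int r1, int r2, int c1, int c2) {
        Set<Integer> q = new HashSet<>();
        for (int i = 0; i < SIDE; i++) {
            q.add(r1 * SIDE + i); q.add(r2 * SIDE + i);  // two rows
            q.add(i * SIDE + c1); q.add(i * SIDE + c2);  // two columns
        }
        return q;
    }

    public static void main(String[] args) {
        Set<Integer> q1 = quorum(0, 1, 0, 1);
        Set<Integer> q2 = quorum(10, 11, 10, 11);  // disjoint rows and columns
        q1.retainAll(q2);
        System.out.println("intersection = " + q1.size() + " (>= 2b+1 = 7)");  // prints 8
    }
}
```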
Slide 44: Byzantine Quorum-Replicated Objects
- Method invocations are sent to a quorum
- At least b + 1 identical responses are required; a value returned identically by b + 1 servers must be correct, since at most b servers are faulty (see the voting sketch below)
[Figure: a client handle invoking a quorum of replicas A, B, C, D.]
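A sketch of the client-side vote: with at most b faulty servers, any response reported identically by b + 1 servers came from at least one correct server and can be accepted.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ResponseVoter {
    // Returns the first response value seen from at least b + 1
    // servers, or null if no value is yet confirmed (the client then
    // waits for more replies or selects a new quorum).
    static String vote(List<String> responses, int b) {
        Map<String, Integer> tally = new HashMap<>();
        for (String r : responses) {
            int count = tally.merge(r, 1, Integer::sum);
            if (count >= b + 1) {
                return r;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // b = 1: one faulty server lies, but "ok" still reaches b + 1 = 2 votes.
        System.out.println(vote(List.of("ok", "bad", "ok", "ok"), 1));  // "ok"
    }
}
```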
Slide 45: Nested Method Invocations
Handles can be passed as parameters into method invocations on other distributed objects.
[Figure: a nested invocation flowing among quorum-replicated objects 1, 2, 3, each with replicas A, B, C, D.]
Slide 46: Handle Returned
Handles can be returned from method invocations on other distributed objects.
[Figure: a handle returned through a nested invocation among quorum-replicated objects.]
Slide 47: Necessity of Authorization
- Faulty replicas can invoke unauthorized methods
- Correct replicas might perform duplicate invocations
[Figure: a nested invocation in which replicas issue unauthorized or duplicate calls.]
Slide 48: Authorization Framework Requirements
- Method-invocation authority can be delegated:
  - Explicitly, to other clients
  - Implicitly, to other distributed objects, when a handle is passed as a parameter to a method invocation on a second object, or returned from a method invocation on a second object
- Arbitrary nesting depths are supported
Slide 49: Authorization Framework
[Figure: each replica i holds a private/public key pair; a certificate issued by a replicated object states that b + 1 of a set of replica keys together "speak for" that object, so a delegated invocation is valid only when at least b + 1 replicas endorse it.]
Slide 50: Operation Ordering Protocol
- Worst-case 4-round protocol: Get, Suggest, Propose, Commit
- Extends the protocol previously used in Fleet [Chockler et al. 2001]
- Operations are applied in batches, increasing throughput
Slide 51: Operation Ordering Protocol: Client Side
- The fundamental challenge is the absence of a single trusted client (a trusted client could simply order all operations)
- Instead, a single untrusted client replica drives the protocol. The driving client:
  - Acts as a point of centralization to distribute authenticated server messages
  - Makes no protocol decisions
  - Is unable to cause correct servers to take conflicting actions
  - Can be unilaterally replaced by another client replica when necessary
Slide 52: Experimental Setup
- Object nesting implemented as an extension of Fleet
- Pentium 4 2.8 GHz processors
- 1000 Mbps Ethernet (TCP, not multicast)
- Linux 2.4.27
- Java HotSpot Server VM 1.5.0
- Native Crypto++ library for key generation, signing, and verification [http://www.cryptopp.com/]
Slide 53: Latency for Non-Nested Invocation
[Graph: latency of a non-nested invocation.]
Slide 54: A Real Byzantine Fault
[Figure: an example of a Byzantine fault observed in practice.]
Slide 55: Impediments to Dramatic Increases
- Impossibility results
- Load dispersion across quorums
- Round complexity of protocols
- Strong consistency conditions
Weakening consistency is one place to look for big improvements.