Carnegie Mellon
Increasing Intrusion Tolerance Via Scalable Redundancy
Mike Reiter (reiter@cmu.edu), Natassa Ailamaki, Greg Ganger, Priya Narasimhan, Chuck Cranor

Technical Objective
- To design, prototype, and evaluate new protocols for implementing intrusion-tolerant services that scale better
  - Here, “scale” refers to efficiency as the number of servers and the number of failures tolerated grow
- Targeting three types of services
  - Read-write data objects
  - Custom “flat” object types for particular applications, notably directories for implementing an intrusion-tolerant file system
  - Arbitrary objects that support object nesting

Expected Impact
- Significant efficiency and scalability benefits over today’s protocols for intrusion tolerance
- For example, for data services, we anticipate
  - At least a twofold latency improvement over the current best, even at small configurations (e.g., tolerating 3-5 Byzantine server failures)
  - Improvements that grow as the system scales up
  - A twofold improvement in throughput, again growing with system size
- Without such improvements, intrusion tolerance will remain relegated to small deployments in narrow application areas

The Problem Space
- Distributed services manage redundant state across servers to tolerate faults
  - We consider tolerance to Byzantine faults, as might result from an intrusion into a server or client
  - A faulty server or client may behave arbitrarily
  - We also make no timing assumptions in this work
  - An “asynchronous” system
- Primary existing practice: replicated state machines
  - Offers no load dispersion, requires data replication, and degrades as the system scales in terms of number of messages

Evaluation
- Baseline for current work: the BFT library
  - Popular, publicly available implementation of Byzantine fault-tolerant state machine replication (by Castro & Liskov)
  - Reported to be an efficient implementation of that approach
- Two measures
  - Average latency of operations, from the client’s perspective
  - Peak sustainable throughput of operations
- Our consistency definition: linearizability of invocations

Background - Read/Write protocol
- Servers provide a read/write block interface
- Servers version blocks on every write
- Decentralized, optimistic, scalable, Byzantine fault-tolerant
[Figure labels: Data block, Client, Servers]

R/W semantics
- The R/W protocol is appropriate for block storage
- But the R/W protocol is inappropriate for building general services
  - It doesn’t provide replicated state machine semantics
- A metadata service for an R/W-based block store motivated us to develop a protocol with stronger semantics

R/W semantics insufficient for metadata
- Consider 2 clients inserting a file in the same directory
- Last write wins; good for blocks, bad for directories
[Figure labels: Client A, Client B, Directory]
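
To make the lost-update problem on this slide concrete, here is a minimal single-process sketch. The names `store` and `rw_insert` are hypothetical, and the directory is modeled as a plain Python set stored in one read/write block; none of this comes from the talk's implementation.

```python
# Hypothetical illustration: a directory stored as one read/write block.
def rw_insert(store, read_snapshot, name):
    """Client-side insert: modify a previously read copy, then overwrite the block."""
    updated = set(read_snapshot)
    updated.add(name)
    store["directory"] = updated  # unconditional write: last write wins


store = {"directory": {"a.txt"}}

snap_a = set(store["directory"])  # Client A reads the directory
snap_b = set(store["directory"])  # Client B reads the same version

rw_insert(store, snap_a, "from_a.txt")  # A writes first
rw_insert(store, snap_b, "from_b.txt")  # B overwrites A's result

# Client A's insert has silently disappeared.
lost_update = "from_a.txt" not in store["directory"]
```

For a data block this overwrite is acceptable, but for a directory it drops an entry that a client believed was inserted.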

Query/Update (Q/U) protocol
- A protocol with replicated state machine semantics
  - Provides linearizable query and update operations
- Protocol properties
  - Decentralized
  - Handles Byzantine clients and server failures; asynchronous
  - Efficient common-case operation
  - Optimistic protocol leverages versioning servers
  - Single-phase queries and updates, if concurrency- and failure-free
  - Avoids expensive cryptography (digital signatures)
  - Scalable
  - Avoids server-to-server broadcast
  - Atomic multi-object updates

Outline
- Motivation
- Query/Update protocol
  - Overview
  - Query, update operations
  - Validation, object syncing, multi-object operations
- Evaluation

Read/conditional-write primitive
- Servers accept an update operation only if the object hasn’t been modified since it was read
[Figure labels: Client A, Client B, Directory]
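
The primitive can be sketched on a single server as follows. This is only an illustration under simplifying assumptions: `VersionedDirectory` is a hypothetical class, versions are plain integers, and replication, faults, and networking are ignored.

```python
class VersionedDirectory:
    """Hypothetical single-server object with a read/conditional-write interface."""

    def __init__(self):
        self.version = 0
        self.entries = set()

    def read(self):
        # Return the version alongside a copy of the state.
        return self.version, set(self.entries)

    def conditional_write(self, read_version, new_entries):
        # Accept the write only if nothing changed since the client's read.
        if read_version != self.version:
            return False
        self.entries = set(new_entries)
        self.version += 1
        return True


directory = VersionedDirectory()
ver_a, entries_a = directory.read()  # Client A reads
ver_b, entries_b = directory.read()  # Client B reads the same version

ok_a = directory.conditional_write(ver_a, entries_a | {"from_a.txt"})
ok_b = directory.conditional_write(ver_b, entries_b | {"from_b.txt"})  # stale read

# Client B re-reads and retries, so neither insert is lost.
ver_b, entries_b = directory.read()
ok_retry = directory.conditional_write(ver_b, entries_b | {"from_b.txt"})
```

Compared with last-write-wins, the rejected client learns that its view was stale and can retry against the current version.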

Handling Byzantine clients
- For Byzantine fault tolerance, clients must pass the operation to servers
  - Constrains clients to a narrow object interface
- Servers apply the operation to the old object to validate the new object
[Figure labels: Op, Directory, directory + op]

Clients and objects
- Client just sends operations
  - Client does not read/write the object
- Server applies the operation to its local object
[Figure labels: Op, history]
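
One way to picture "the client just sends operations" is a server that keeps a version history and applies only operations from a fixed table. The operation names, the `Server` class, and the set-based directory below are assumptions made for illustration, not the talk's code.

```python
# Hypothetical deterministic operations on a directory modeled as a frozenset.
def insert_entry(directory, name):
    return directory | {name}


def remove_entry(directory, name):
    return directory - {name}


# The narrow object interface: clients can only name one of these operations.
ALLOWED_OPS = {"insert": insert_entry, "remove": remove_entry}


class Server:
    def __init__(self):
        self.history = [frozenset()]  # object versions, oldest first

    def apply(self, op_name, arg):
        op = ALLOWED_OPS.get(op_name)
        if op is None:
            raise ValueError(f"operation not in object interface: {op_name}")
        # The server computes the new version itself from its latest local version.
        new_state = frozenset(op(self.history[-1], arg))
        self.history.append(new_state)
        return new_state


server = Server()
server.apply("insert", "a.txt")
server.apply("insert", "b.txt")
server.apply("remove", "a.txt")
```

Because the server derives each new version from its own copy, a faulty client cannot hand it an arbitrary object state; it can only request operations the interface permits.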

Query/Update protocol
- Servers host objects
  - Optimistic protocol, versioning
- Export an operation interface (more than read/write)
  - Can export any deterministic operation
- Server exports three types of operations:
  - Read History (Object): returns timestamp vector
  - Query (Object, Version): read-only; returns object state; e.g., getattr
  - Update (Object, OHS, Value): mutating; updates object, conditioned on the object not having been modified; e.g., setattr
[Figure: a server holding versioned objects A, B, and C]

Read history operation
- Client requests the version history of an object
- Each server replies with a list of timestamps
[Figure: timeline of read-history and history-reply messages; the replies form the Object History Set (OHS)]

Query operation
- Client performs read-history operation
  - Constructs OHS and identifies the Latest version that is complete
- Client queries the Latest version at the servers
[Figure: timeline of read-history, history-reply, query, and query-reply messages; labels: Latest, Object History Set (OHS)]
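
A rough sketch of how a client might pick Latest from an OHS follows. It assumes integer timestamps and treats a version as complete once at least `threshold` servers report it; both the function `latest_complete` and that threshold rule are simplifying assumptions, and the slides do not spell out the actual classification rule.

```python
def latest_complete(ohs, threshold):
    """Return the highest timestamp reported by at least `threshold` servers.

    `ohs` maps a server id to that server's list of timestamps.
    Returns None when no timestamp reaches the threshold.
    """
    counts = {}
    for history in ohs.values():
        for ts in set(history):  # count each server at most once per timestamp
            counts[ts] = counts.get(ts, 0) + 1
    complete = [ts for ts, n in counts.items() if n >= threshold]
    return max(complete) if complete else None


# Hypothetical OHS from five servers; one server has seen version 3,
# and one server has not yet seen version 2.
ohs = {
    "s1": [1, 2],
    "s2": [1, 2],
    "s3": [1, 2],
    "s4": [1, 2, 3],
    "s5": [1],
}

latest = latest_complete(ohs, threshold=4)
```

With this input, version 3 is held by only one server and is not treated as complete, so the client would query version 2.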

Update operation
- Client performs read-history operation
  - Constructs OHS and identifies the Latest version that is complete
- Client sends operation and OHS to servers
  - Operation is conditioned on OHS
[Figure: timeline of read-history, history-reply, update, and update-reply messages; labels: OHS, Latest, Object History Set (OHS)]

Server validation for update operations
- A server needs to verify that the client conditioned the operation on Latest
- Validation steps:
  - Ensure read/conditional-write semantics
  - Check that local history matches that in OHS
  - Classify Latest write version
    - Ensures operation is based on appropriate timestamp
  - Protection against Byzantine failures
  - Check authenticators
    - Ensures integrity of OHS

Server validation example
- Earlier example of 2 clients concurrently updating the same directory
- Servers reject client B’s operation, due to “stale” OHS
[Figure: timeline of read-history, history-reply, and update messages for Client A and Client B]
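
The validation steps and the stale-OHS rejection can be sketched for a single server as below. This is a simplification under stated assumptions: the authenticator is modeled as an HMAC over the server's own history (the slides only say authenticators are checked, not how they are built), histories must match exactly, and classification of Latest is omitted. All function and variable names are hypothetical.

```python
import hashlib
import hmac


def make_authenticator(key, history):
    # Assumed construction: an HMAC over the comma-joined timestamps.
    msg = ",".join(str(ts) for ts in history).encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def validate_update(server_id, local_history, key, ohs):
    """`ohs` maps a server id to (history, authenticator) as seen by the client."""
    claimed_history, auth = ohs[server_id]
    # Integrity of the OHS: the entry must carry this server's authenticator.
    expected = make_authenticator(key, claimed_history)
    if not hmac.compare_digest(expected, auth):
        return "reject: bad authenticator"
    # Read/conditional-write: the client's view must match the local history.
    if list(claimed_history) != list(local_history):
        return "reject: stale OHS"
    return "accept"


key = b"hypothetical-server-key"
local_history = [1]

# Clients A and B both read the history while it is [1].
ohs_a = {"s1": ([1], make_authenticator(key, [1]))}
ohs_b = {"s1": ([1], make_authenticator(key, [1]))}

result_a = validate_update("s1", local_history, key, ohs_a)
local_history.append(2)  # the server accepts A's update, creating version 2
result_b = validate_update("s1", local_history, key, ohs_b)

# A client that invents a history cannot produce a valid authenticator.
forged = {"s1": ([1, 2], "not-a-real-authenticator")}
result_forged = validate_update("s1", local_history, key, forged)
```

Client B's OHS was valid when it was read, but it no longer matches the server's history once A's update is applied, so B must read the history again and retry.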

Q/U protocol details
- Handling Byzantine clients and server faults
  - Through validating timestamps and OHS
- During classification of Latest, may require repair
- Incomplete operations: use barriers to fix failures
- Flexible protocol: can handle different types and numbers of faults
  - For asynchronous with Byzantine clients:
  - N = 3t + 2b + 1, to tolerate t server faults, b of which are Byzantine
- Object syncing
- Multi-object operations
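
As a quick worked example of the sizing formula on this slide, the helper below evaluates N = 3t + 2b + 1. The function name `min_servers` is hypothetical, and the check that b does not exceed t follows from the slide's wording that b of the t faults are Byzantine.

```python
def min_servers(t, b):
    """Servers needed to tolerate t server faults, b of which are Byzantine."""
    if not 0 <= b <= t:
        raise ValueError("b must satisfy 0 <= b <= t")
    return 3 * t + 2 * b + 1


n_small = min_servers(t=1, b=1)   # 3 + 2 + 1 = 6
n_mixed = min_servers(t=2, b=1)   # 6 + 2 + 1 = 9
n_large = min_servers(t=5, b=5)   # 15 + 10 + 1 = 26
```

When every tolerated fault may be Byzantine (t = b), the formula reduces to N = 5b + 1.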

Object syncing
- A server may not have the latest version of an object
- If a server lacks the latest version of an object, the OHS contains information about which other servers have that version
- The server must sync the object with another server
  - Hashes in OHS allow the server to validate the synced object
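
A minimal sketch of hash-validated syncing, assuming the OHS supplies a SHA-256 digest of the wanted version and that each peer is reachable through a fetch callable. The hash function choice, `sync_object`, and the peer list format are assumptions for illustration only.

```python
import hashlib


def digest(obj_bytes):
    return hashlib.sha256(obj_bytes).hexdigest()


def sync_object(expected_hash, peers):
    """Try peers in order; accept the first copy whose hash matches.

    `peers` is a list of (peer_id, fetch) pairs, where fetch() returns the
    object bytes or None. Returns (peer_id, bytes), or None if no peer
    supplies a matching copy.
    """
    for peer_id, fetch in peers:
        candidate = fetch()
        if candidate is not None and digest(candidate) == expected_hash:
            return peer_id, candidate
    return None


latest_version = b"directory contents at the latest version"
expected = digest(latest_version)  # stands in for the hash carried in the OHS

peers = [
    ("faulty", lambda: b"tampered contents"),  # a faulty peer returns bad data
    ("missing", lambda: None),                 # a peer that lacks the version
    ("good", lambda: latest_version),
]

synced = sync_object(expected, peers)
```

The hash check means the syncing server does not need to trust whichever peer answers first.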

Multi-object operation
- An update can span multiple objects
- A client must construct an OHS for each object
- Servers perform validation for each object
- Operations are performed atomically across multiple objects
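
The all-or-nothing behavior can be sketched on one server by validating every object before changing any of them. Integer versions stand in for per-object OHS validation here, and `apply_multi` and the rename scenario are hypothetical.

```python
def apply_multi(objects, expected_versions, updates):
    """Apply `updates` to several objects only if every object validates.

    `objects` maps name -> {"version": int, "value": ...}.
    `expected_versions` maps name -> version the client conditioned on.
    Every object named in `updates` must also appear in `expected_versions`.
    """
    if not set(updates) <= set(expected_versions):
        return False
    # Phase 1: validate every object; nothing is modified yet.
    for name, version in expected_versions.items():
        if objects[name]["version"] != version:
            return False
    # Phase 2: all checks passed, so apply every update.
    for name, new_value in updates.items():
        objects[name]["value"] = new_value
        objects[name]["version"] += 1
    return True


# Hypothetical rename: move "f" from dir1 to dir2 in one operation.
objects = {
    "dir1": {"version": 3, "value": {"f"}},
    "dir2": {"version": 7, "value": set()},
}

moved = apply_multi(objects, {"dir1": 3, "dir2": 7},
                    {"dir1": set(), "dir2": {"f"}})

# A second attempt conditioned on the old dir1 version changes nothing.
stale = apply_multi(objects, {"dir1": 3, "dir2": 8},
                    {"dir1": {"g"}, "dir2": set()})
```

Splitting validation from application is what keeps a failed check on one object from leaving the other object half-updated.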

Prototype evaluation
- Built a counter object using Q/U and BFT protocols
  - inc method increments counter and returns new value
  - fetch method returns current counter value
- Lightweight operations to demonstrate network and computation overhead inherent to protocols
- Both Q/U and BFT implement efficient, optimistic queries
  - Evaluation focuses on updates
- Q/U common case: no concurrency; preferred quorums
- BFT common case: shared counter to allow batching
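
For reference, the counter's two methods amount to the following local object. This sketch shows only the object interface described on the slide; it is not replicated and involves neither Q/U nor BFT.

```python
class Counter:
    """Local stand-in for the benchmark object; no replication involved."""

    def __init__(self):
        self.value = 0

    def inc(self):
        # Update operation: increments the counter and returns the new value.
        self.value += 1
        return self.value

    def fetch(self):
        # Query operation: returns the current value without modifying it.
        return self.value


counter = Counter()
first = counter.inc()
second = counter.inc()
current = counter.fetch()
```

Operations this cheap leave almost no application work to measure, which is why they expose the overhead of the protocols themselves.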

Experimental setup
- Cluster of Pentium 4 2.8 GHz machines, 1 GB RAM
- 1 Gb switched Ethernet, 18.3 Gbps/35.7 Mpps switch
  - No background traffic
- Working set of experiments fits in server memory
  - To focus on protocol overhead, not on disk accesses
- Experiments are run for 30 seconds
  - Measurements from middle 10 seconds

Fault scalability (1)
- Investigate throughput as the number of server faults tolerated (b) increases
- Measured saturated throughput
  - Ran with 1, 3, 5, …, 20 clients with 2 outstanding requests
  - For each b, selected the highest throughput value
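
The measurement procedure (a 30-second run measured over its middle 10 seconds, then the highest throughput kept per b) could be computed along these lines. The function names are hypothetical, and every number in the example is a synthetic placeholder, not a result from the talk.

```python
def window_throughput(completion_times, start=10.0, end=20.0):
    """Operations per second completed in [start, end) of a run."""
    in_window = [t for t in completion_times if start <= t < end]
    return len(in_window) / (end - start)


def saturated_throughput(samples):
    """For each b, keep the highest throughput over all client counts.

    `samples` maps b -> {number_of_clients: throughput}.
    """
    return {b: max(by_clients.values()) for b, by_clients in samples.items()}


# Synthetic run: one operation completes every 0.5 s for 30 s.
completions = [i * 0.5 for i in range(60)]
ops_per_sec = window_throughput(completions)

# Synthetic placeholder throughputs, chosen only to exercise the function.
placeholder_samples = {
    1: {1: 10.0, 3: 25.0, 5: 24.0},
    2: {1: 9.0, 3: 20.0, 5: 22.0},
}
peaks = saturated_throughput(placeholder_samples)
```

Measuring only the middle of the run keeps start-up and shutdown effects out of the reported figure.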

Fault scalability (2)

Throughput and response time under load (1)
- Investigate throughput and response time under load
- Demonstrates protocol behavior beyond the saturated-throughput data point
- Increased number of clients from 1 to 20 for b = 1

Throughput and response time under load (2)

Conclusions
- Developed the Q/U protocol for accessing shared objects in a distributed system
  - Fault-scalable
  - Byzantine fault-tolerant
  - Optimistic, efficient
  - Atomic multi-object operations
- Evaluation
  - Protocol scales with number of failures tolerated
  - Throughput and response time consistent under load
