Outline
Why distributed computing?
Atomic Broadcast
The Atom system
Relevance for e-textiles
What’s next?
Q&A
Why Distributed Computing?
Spread and balance the computational weight of applications
Solve bigger problems
Deal with problems locally instead of centralizing all the data
Example
Space filtering vs. raw consensus
–Acoustic Beam Forming: master collects information from slaves and decides according to the relevance of data
–Consensus: no master, all processes decide upon one common value
Atomic Broadcast: Definition (1)
Atomic Broadcast = the same set of messages is delivered by all the processes in the same order
Consensus = all processes decide upon one common value among those proposed
Atomic Broadcast: Definition (2)
Validity: If a correct process broadcasts a message m, it eventually delivers m
Uniform agreement: If a process delivers a message m, then every correct process eventually delivers m
Uniform integrity: Every message m is delivered at most once by each process, and only if it was previously broadcast by sender(m)
Total order: If two correct processes p and q both deliver messages m and m’, then p delivers m before m’ iff q delivers m before m’
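The integrity and total order properties can be checked mechanically on recorded delivery logs. The following is a minimal sketch, assuming each process's deliveries are available as an ordered list of message identifiers; the function names and the sample logs are hypothetical and are not part of any system described in these slides.

```python
from itertools import combinations

def at_most_once(log):
    """Integrity (first half): no message appears twice in one process's log."""
    return len(log) == len(set(log))

def same_relative_order(log_p, log_q):
    """Total order for one pair: messages delivered by both p and q
    must appear in the same relative order in both logs."""
    common = set(log_p) & set(log_q)
    return [m for m in log_p if m in common] == [m for m in log_q if m in common]

def check_total_order(logs):
    """logs maps a process name to the list of messages it delivered, in order."""
    return all(same_relative_order(logs[p], logs[q])
               for p, q in combinations(logs, 2))

# Hypothetical logs: r stopped early (for example, it crashed), which total order allows.
good = {"p": ["m1", "m2", "m3"], "q": ["m1", "m2", "m3"], "r": ["m1", "m2"]}
# p and q disagree on the order of m1 and m2.
bad = {"p": ["m1", "m2"], "q": ["m2", "m1"]}

print(check_total_order(good))  # True
print(check_total_order(bad))   # False
```

Validity and agreement are liveness properties ("eventually"), so a finite log can only refute them at the end of a run, not confirm them while it is still in progress.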
Atomic Broadcast: Bad News
Impossible to achieve deterministically in a totally asynchronous system if even one process may crash [Fischer, Lynch, Paterson 85]
Atomic Broadcast: Good News
Can be done using unreliable failure detectors
Based on a Consensus algorithm described in [Chandra, Toueg 96]
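The idea of building atomic broadcast on top of consensus can be sketched as a sequence of rounds: in each round every process proposes the messages it has received but not yet delivered, consensus picks one batch, and everyone delivers that batch in a fixed deterministic order. The code below is a single-threaded illustration under strong assumptions: `consensus_stub` is a hypothetical stand-in for a real failure-detector-based consensus algorithm, no process crashes, and the reliable-broadcast layer is reduced to a static set per process.

```python
def consensus_stub(proposals):
    """Stand-in for real consensus (assumption): every process receives the
    first non-empty proposal, scanning process ids in ascending order."""
    for pid in sorted(proposals):
        if proposals[pid]:
            return proposals[pid]
    return frozenset()

def atomic_broadcast_rounds(r_delivered):
    """r_delivered maps pid -> set of messages handed over by reliable broadcast.
    Returns pid -> list of messages in atomic-delivery order."""
    a_delivered = {pid: [] for pid in r_delivered}
    while True:
        # Each process proposes what it has received but not yet delivered.
        proposals = {pid: frozenset(r_delivered[pid]) - set(a_delivered[pid])
                     for pid in r_delivered}
        batch = consensus_stub(proposals)
        if not batch:
            return a_delivered  # nothing left to order
        # Everyone delivers the decided batch in the same deterministic order.
        for pid in a_delivered:
            new = sorted(m for m in batch if m not in a_delivered[pid])
            a_delivered[pid].extend(new)

# Hypothetical input: the processes have received different subsets so far.
result = atomic_broadcast_rounds({0: {"a", "b"}, 1: {"b", "c"}, 2: {"c"}})
print(result)  # every process ends with ['a', 'b', 'c']
```

Because every process delivers exactly the decided batches, in the same round order and with the same sort inside each batch, the delivery sequences are identical even though the inputs differ.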
Atom
Open-source Atomic Broadcast system
Atom
[Architecture diagram. Labels: One_run, do_decide, do_Consensus, AB task 1, AB task 2, AB task 3, RB, FD trust, FD suspect, R-broadcast, Producer, Consumer, A-deliver, A-broadcast, start, cancel]
Relevance to E-textiles
Synchronization of data
Coordination of decisions and actions
Light-weight process
Buffer sizes can be predicted
What’s Next?
Scalability is a problem for classic fault-tolerant distributed algorithms
Bimodal Multicast [Ken Birman, Mark Hayden, Oznur Ozkasap, Zhen Xiao, Mihai Budiu, Yaron Minsky – 1998]
–Gossip protocol
–Relaxes the “strong” reliability guarantees, replacing them with probabilistic guarantees
–Converges to “strong” reliability in the absence of failures
–Scalable with steady throughput
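To give a feel for how a gossip protocol trades strict guarantees for probabilistic ones, here is a small simulation of plain push gossip. It is not the Bimodal Multicast protocol itself, only a simplified sketch; the function `gossip_rounds` and its parameters (`fanout`, `loss`, `seed`) are hypothetical names chosen for illustration.

```python
import random

def gossip_rounds(n, fanout, loss, seed=0, max_rounds=50):
    """Simulate push gossip among n processes.
    Each round, every process that has the message forwards it to `fanout`
    randomly chosen processes; each transmission is lost with probability `loss`.
    Returns the number of informed processes after each round."""
    rng = random.Random(seed)          # seeded for reproducibility
    has_msg = [False] * n
    has_msg[0] = True                  # process 0 is the original sender
    history = [1]
    for _ in range(max_rounds):
        senders = [i for i in range(n) if has_msg[i]]
        for _sender in senders:
            for target in rng.sample(range(n), fanout):
                if rng.random() >= loss:   # transmission survived
                    has_msg[target] = True
        history.append(sum(has_msg))
        if all(has_msg):
            break
    return history

# Example run: 100 processes, fanout 3, 20% message loss.
print(gossip_rounds(100, 3, 0.2))
```

The returned list never decreases, and with a moderate fanout the informed count typically grows quickly in the early rounds; no single round is guaranteed to reach everyone, which is the sense in which the guarantee is probabilistic.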
Questions …