Published by Aubrie Boyd. Modified over 9 years ago.
1
Improving Robustness in Distributed Systems
Per Bergqvist, per@synapse.se
Erlang User Conference 2001 (courtesy CellPoint Systems AB)
2
Design base
- Cluster of cooperating hosts
- Erlang and C
- COTS hardware based
- Unix based (i.e. Solaris or Linux)
- 10/100/1000 base-T back plane ("system area network")
3
Cluster
- Shared, distributed system configuration
- Each host has ONE cluster controller
- Dispatches and supervises worker tasks
- Master cluster controller: holds the configuration database (persistent replica)
- Slave cluster controller: gets its configuration from the master cluster controllers
- Cluster is DOWN when all master cluster controllers are inaccessible
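The availability rule on this slide can be written down in a few lines. This is only an illustrative sketch in Python, not code from the original system: the names `Controller` and `cluster_is_up` are hypothetical, and the `reachable` flag stands in for whatever liveness check the real controllers use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controller:
    host: str        # host the controller runs on (one controller per host)
    role: str        # "master" or "slave"
    reachable: bool  # outcome of a liveness check that is assumed to exist elsewhere


def cluster_is_up(controllers):
    """The cluster is DOWN only when every master controller is inaccessible."""
    return any(c.role == "master" and c.reachable for c in controllers)
```

Note that reachable slave controllers do not keep the cluster up on their own, since they hold no persistent replica of the configuration.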
4
Typical system (diagram: Firewall, Switch, Traffic Control)
5
Cluster: Key Benefits
- Single system view
- Enforces decoupling of parts of O&M from actual traffic processing
6
Implementing a cluster
- Cluster->Host->Node->NodeData
- Cluster global parameters
- Subscription mechanisms for configuration changes
- Mnesia as configuration database on master cluster controllers
- Home-brewed configuration distribution to slave controllers (NOT using Mnesia)
- (Worker) node supervision
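One way to picture the hierarchy and the subscription mechanism is a store keyed by paths, where a subscriber registers for a path prefix and hears about every change below it. The sketch below is a toy model under that assumption; `ConfigStore` and its methods are hypothetical names, and it says nothing about how the original system stored or distributed its data.

```python
from collections import defaultdict


class ConfigStore:
    """Toy configuration store keyed by (host, node, key) paths within one cluster."""

    def __init__(self):
        self._data = {}                        # path tuple -> value
        self._subscribers = defaultdict(list)  # path prefix -> callbacks

    def subscribe(self, prefix, callback):
        """Call callback(path, old, new) for every change at or below prefix."""
        self._subscribers[tuple(prefix)].append(callback)

    def get(self, path, default=None):
        return self._data.get(tuple(path), default)

    def set(self, path, value):
        path = tuple(path)
        old = self._data.get(path)
        if old == value:
            return  # unchanged value, so nobody is notified
        self._data[path] = value
        # Notify subscribers of the path itself and of every enclosing prefix.
        # The empty prefix () plays the role of a cluster-wide subscription.
        for i in range(len(path) + 1):
            for callback in self._subscribers.get(path[:i], []):
                callback(path, old, value)
```

A worker interested only in its own host would subscribe to `("host1",)`, while an O&M tool could subscribe to `()` and see every change in the cluster.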
7
Mnesia gotchas
- First distributed node startup
- Disallow writes unless all replicas are accessible
- Use a timeout on table load, then force load
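The last two points describe a control flow that is easy to model. In Mnesia itself the relevant calls are `mnesia:wait_for_tables/2` and `mnesia:force_load_table/1`; the Python below only mirrors the logic, with `wait_for_tables` and `force_load` as hypothetical callables supplied by the caller. It also assumes the write rule means "refuse writes unless every replica is reachable", which is one reading of the slide.

```python
def load_tables(tables, wait_for_tables, force_load, timeout_ms=30000):
    """Wait for tables with a timeout, then force load whatever is still missing.

    wait_for_tables(tables, timeout_ms) must return the list of tables that
    did NOT load in time (an empty list on success). force_load(table) loads
    one table from the local replica without waiting for other replicas, so
    updates made elsewhere in the meantime may be lost.
    """
    missing = wait_for_tables(tables, timeout_ms)
    for table in missing:
        force_load(table)
    return missing


def write_allowed(replica_hosts, reachable_hosts):
    """Permit writes only while every replica of the table is reachable."""
    return set(replica_hosts) <= set(reachable_hosts)
```

The 30 second default is an arbitrary placeholder; the right value depends on how long a normal table load takes in the deployment.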
8
... BUT ...
- TCP-based distribution
- Network partitioning
9
Network parameters
- Align TCP retransmission intervals with Erlang heartbeats
- Align TCP and IP rerouting parameters
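To make "align" concrete, the arithmetic can be sketched as follows. The Erlang side is documented: with `net_ticktime` set to T seconds, an unresponsive node is detected between 0.75 T and 1.25 T. The TCP side below is a deliberately simplified backoff model, not the behaviour of any particular stack, and treating "aligned" as "TCP gives up inside the Erlang detection window" is an assumption made for illustration.

```python
def tick_detection_window(net_ticktime_s=60.0):
    """Erlang detects an unresponsive node within net_ticktime -/+ 25%."""
    return 0.75 * net_ticktime_s, 1.25 * net_ticktime_s


def tcp_give_up_time(initial_rto_s, max_rto_s, retransmissions):
    """Seconds until a simplified TCP sender abandons an unacknowledged segment.

    Model: the timer starts at initial_rto_s, doubles after every
    retransmission, and is capped at max_rto_s. The sender waits one timeout
    before each retransmission and one more after the last.
    """
    total, rto = 0.0, initial_rto_s
    for _ in range(retransmissions + 1):
        total += rto
        rto = min(rto * 2, max_rto_s)
    return total


def aligned(net_ticktime_s, initial_rto_s, max_rto_s, retransmissions):
    """True if the modelled TCP give-up time falls inside the Erlang window."""
    low, high = tick_detection_window(net_ticktime_s)
    return low <= tcp_give_up_time(initial_rto_s, max_rto_s, retransmissions) <= high
```

With a 1 s initial timeout, a 60 s cap and 5 retransmissions the model gives 1 + 2 + 4 + 8 + 16 + 32 = 63 s, which sits inside the 45 to 75 s window for the default 60 s tick time. With 15 retransmissions it gives 663 s, so TCP would keep a dead connection alive long after Erlang has declared the node down.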
10
Typical system II: Dual back plane (diagram: Firewall, Switch, Traffic Control)
11
Erlang multi-homing problem (diagram: Host A, Host B, Host C)
12
Multi-home Erlang w/ TCP
- Add an alias interface to the loopback i/f
- Patch TCP distribution to bind to the alias
- Publish the alias interface via (all wanted) real hw i/f's
  - Method 1: Static routes and gratuitous/proxy ARP
  - Method 2: Use new (routing) protocol
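The effect of "bind to alias" can be shown with an ordinary socket: binding before connecting fixes the source address of the connection, so the peer association no longer depends on which physical interface carries the packets. This is a generic Python sketch of that idea, not the actual patch to the Erlang distribution; `connect_from` is a hypothetical helper name.

```python
import socket


def connect_from(local_addr, remote_addr, remote_port, timeout_s=5.0):
    """Open a TCP connection whose source address is local_addr.

    Binding before connect() pins the local end of the connection to the
    alias address instead of the address of the outgoing interface.
    Port 0 lets the OS pick an ephemeral source port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout_s)
        sock.bind((local_addr, 0))
        sock.connect((remote_addr, remote_port))
    except OSError:
        sock.close()
        raise
    return sock
```

The listening side would bind to its own alias address in the same way, so that both ends of the connection are identified by addresses that survive a link failure.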
13
ARP method
- Implement a utility to:
  - broadcast unsolicited ARP responses
  - respond to ARP requests for the alias i/f address
- Add static routes on all far end systems
- NOTE: all real interfaces need to be on the same IP subnet
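For readers unfamiliar with the frame format, here is a sketch of what an unsolicited ARP response for the alias address looks like on the wire. It only builds the bytes; actually sending them needs a raw link-layer socket and suitable privileges, which is platform specific and not shown. The function name is hypothetical, and filling the target hardware address with the broadcast address is one common convention, assumed here rather than taken from the talk.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6


def gratuitous_arp_reply(mac, ip):
    """Build an Ethernet frame carrying an unsolicited ARP reply for ip.

    mac is the 6-byte hardware address of the real interface the frame is
    sent on; ip is the dotted-quad alias address being announced.
    """
    if len(mac) != 6:
        raise ValueError("mac must be 6 bytes")
    ip_bytes = socket.inet_aton(ip)
    ethernet = BROADCAST_MAC + mac + struct.pack("!H", 0x0806)  # EtherType ARP
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                        # hardware type: Ethernet
        0x0800,                   # protocol type: IPv4
        6, 4,                     # hardware / protocol address lengths
        2,                        # opcode 2: reply
        mac, ip_bytes,            # sender hardware / protocol address
        BROADCAST_MAC, ip_bytes,  # target fields; sender IP repeated as target IP
    )
    return ethernet + arp
```

Sending one such frame per real interface, each with that interface's own hardware address, is what lets neighbours on the subnet map the alias address to a currently working link.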
14
New routing protocol
- Broadcast (Ethernet frames) what you have, including interface priority
- Let the far end select path based on what/when they receive
- Far end dynamically sets up host routes
- Use short retransmission intervals
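The far-end selection logic might look roughly like the sketch below. It is a guess at the general shape, not the protocol from the talk: the class and method names are hypothetical, a larger number is assumed to mean higher priority, and an advertisement is assumed to expire after `max_age_s` seconds without a refresh. Installing the chosen host route in the kernel is left out.

```python
import time


class PathSelector:
    """Far-end view of the interfaces a peer advertises for its alias address."""

    def __init__(self, max_age_s=1.0, clock=time.monotonic):
        self._max_age_s = max_age_s
        self._clock = clock
        self._seen = {}  # (alias, via) -> (priority, time last heard)

    def on_advertisement(self, alias, via, priority):
        """Record that alias was advertised as reachable through `via`."""
        self._seen[(alias, via)] = (priority, self._clock())

    def best_path(self, alias):
        """Return the highest-priority path heard recently enough, or None."""
        now = self._clock()
        live = [
            (priority, via)
            for (a, via), (priority, heard) in self._seen.items()
            if a == alias and now - heard <= self._max_age_s
        ]
        return max(live)[1] if live else None
```

Short retransmission intervals matter here because `max_age_s` can only be a small multiple of the advertisement period, and that value bounds how long traffic keeps flowing toward a dead link.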
15
Erlang multi-homing resolved? (diagram: Host A, Host B, Host C)
16
Summing up
- Erlang can support multi-homing with some additional work
- By using a loopback alias i/f, link failure becomes a routing problem (the peer-to-peer association is kept intact)
- Solaris TCP/IP stack parameters are:
  - hard to find (only in out-of-date app. notes)
  - hard to set "right"
  - host global
- A distribution mechanism with built-in support for multi-homing is preferred
17
Erlang Distribution over SCTP
Per Bergqvist et al, per@synapse.se
Erlang User Conference 2002