1
Self-Stabilizing Distributed File System
Department of Computer Science, Ben-Gurion University
A BGU – IBM joint project
2
DFS Motivation
- Performance.
- Fault tolerance: any server can take responsibility for any role.
- Place files closer to users (local file access).
3
What Is Self-Stabilizing?
A self-stabilizing system can automatically recover following the occurrence of (transient) faults. The idea is to design a system that can be started in an arbitrary state and still converge to a desired behavior. (Self-Stabilization, S. Dolev)
4
Self-Stabilization Motivation
- The combination and type of faults cannot be totally anticipated in ongoing systems.
- Any ongoing system must be self-stabilizing (or manually monitored).
- A self-stabilizing algorithm can recover from any arbitrary state reached due to the occurrence of faults.
5
Design
- File system replication servers are coordinated using a spanning tree.
- The tree is constructed by a self-stabilizing update algorithm using multicast messages.
- Updates are propagated using a self-stabilizing β-synchronizer.
6
Design (Cont'd)
- Clients join the replication tree and form a caching tree.
- File leases are used to provide cache consistency.
7
Replication Tree
- Using a layered self-stabilizing algorithm, we construct a single spanning tree consisting of the file system servers.
8
Leader Election
- A single leader coordinates the construction of the spanning tree.
- If no leader exists, a server becomes a leader.
- If more than one leader exists, the server with the minimal ID survives.
- Messages are periodically sent using global multicast (or broadcast).
9
Leader Election Algorithm
Every T1 do:
  – If (p = leader) then send-multicast('I'm a leader')
  – Leader-exists = true
Every T1 + Td do:
  – If (not leader-exists) then leader = p
  – Leader-exists = false
Upon arrival of message do:
  – If (p.volume = volume) then
      If (p = leader) then leader = min(leader, sender)
      Else leader = sender
  – Leader-exists = true
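The election rules can be tried out in a small round-based simulation. This is a minimal sketch, not the project's code: it assumes a single volume, reliable multicast, one round per timeout period (T1 + Td), and that only a server which is itself the leader, or which hears one, treats a leader as existing. The names `run_round` and `stabilize` are hypothetical.

```python
def run_round(leader, ids):
    """Simulate one timeout period for every server in `ids`.

    `leader[p]` is server p's current belief about who leads. It may start
    out arbitrary, even pointing at an ID that does not exist.
    """
    leader = dict(leader)  # work on a copy, leave the caller's state alone
    # Servers that believe they lead multicast at the start of the period.
    senders = sorted(p for p in ids if leader[p] == p)
    # Assumption: a leader counts itself as evidence that a leader exists.
    leader_exists = {p: leader[p] == p for p in ids}

    # "Upon arrival of message" rule, applied at every other server.
    for sender in senders:
        for p in ids:
            if p == sender:
                continue
            if leader[p] == p:
                leader[p] = min(leader[p], sender)  # competing leaders: minimal ID survives
            else:
                leader[p] = sender
            leader_exists[p] = True

    # Timeout rule: a server that heard no leader takes the role itself.
    for p in ids:
        if not leader_exists[p]:
            leader[p] = p
    return leader


def stabilize(leader, ids, rounds=10):
    """Run a fixed number of rounds from an arbitrary starting state."""
    for _ in range(rounds):
        leader = run_round(leader, ids)
    return leader


# Two arbitrary starting states: two competing leaders, and no leader at all.
print(stabilize({1: 4, 2: 2, 3: 9, 4: 4, 5: 2}, [1, 2, 3, 4, 5]))
print(stabilize({1: 7, 2: 7, 3: 7}, [1, 2, 3]))
```

In the first run servers 2 and 4 both claim leadership and server 2 survives, because the minimum is taken among the competing leaders only. In the second run nobody leads, every server times out and promotes itself, and the following rounds reduce the set of leaders to server 1.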
10
Spanning Tree Construction
- A network version of the self-stabilizing update algorithm.
- Multicast messages with a limited, local TTL (the radius).
- Define the neighboring relation for the update algorithm.
- Keep the communication graph connected.
11
Induced Graph Example
12
Update Algorithm
- Collect routing tables from all neighbors in the induced graph.
- Build a distributed BFS spanning tree from the tables.
- Select a manager (local leader) for the tree: the server with the minimal ID.
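The outcome of the update algorithm can be illustrated with a centralized sketch. It assumes the neighbor tables have already been collected into one adjacency map, that the induced graph is connected, and that the BFS tree is rooted at the manager; the slides do not state the root, and the real algorithm is distributed. The function name `bfs_tree` and the sample graph are hypothetical.

```python
from collections import deque


def bfs_tree(neighbors, root):
    """Return a parent map describing a BFS spanning tree rooted at `root`.

    `neighbors` maps each server ID to the set of server IDs it can reach
    directly in the induced graph.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Visit neighbors in ID order so the result is deterministic.
        for nxt in sorted(neighbors[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent


# Hypothetical induced graph of four servers.
neighbors = {3: {5, 8}, 5: {3, 9}, 8: {3, 9}, 9: {5, 8}}

manager = min(neighbors)            # manager = server with the minimal ID
tree = bfs_tree(neighbors, manager)
print(manager, tree)
```

Each entry in `tree` names a server's parent, which is the node it would forward requests to on the way to the manager.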
13
Tree Optimization
- The update algorithm creates connected components of the communication graph that is induced by the radius.
- Goal: find the minimal radius that keeps connectivity.
- Increase the radius by a factor of 2 until a single component spans the system.
- Run a second instance of update with a smaller radius and compare outputs; if they are the same, decrease the radius.
- Search for the minimal radius using binary search.
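The doubling phase followed by a binary search can be sketched as follows. This is an illustration under assumptions: pairwise hop distances are known up front in a `dist` table, two servers are neighbors when their distance is at most the radius, and a direct connectivity check stands in for comparing the outputs of two update instances. The names `is_connected` and `minimal_radius` are hypothetical.

```python
def is_connected(dist, radius):
    """True if linking every pair within `radius` yields one component."""
    nodes = list(dist)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        u = stack.pop()
        for v in nodes:
            if v not in seen and dist[u][v] <= radius:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)


def minimal_radius(dist):
    """Smallest integer radius (at least 1) that keeps the graph connected."""
    # Phase 1: double the radius until a single component spans the system.
    hi = 1
    while not is_connected(dist, hi):
        hi *= 2
    # Phase 2: binary search between the last failing and first working radius.
    lo = hi // 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_connected(dist, mid):
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical servers placed on a line; distance = difference in position.
positions = {"a": 0, "b": 3, "c": 4, "d": 9}
dist = {u: {v: abs(pu - pv) for v, pv in positions.items()}
        for u, pu in positions.items()}
print(minimal_radius(dist))
```

The doubling loop terminates only if all distances are finite, so the sketch assumes every server is reachable at some radius.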
14
Tree Structure
15
Replication Consistency
- A self-stabilizing β-synchronizer verifies that the signatures of accessed files are identical in all servers.
- If more than a single signature exists, there is a conflict.
- The leader decides (user-defined algorithm) on the correct file content and notifies the servers.
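The conflict check amounts to comparing one signature per server. The sketch below assumes SHA-256 as the signature and uses a majority vote as a stand-in for the user-defined decision algorithm; neither choice is stated in the slides. All function names and the in-memory `replicas` layout are hypothetical.

```python
import hashlib
from collections import Counter


def signature(content: bytes) -> str:
    """Signature of a file's content (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def collect_signatures(replicas, path):
    """Map each server to the signature of its copy of `path`."""
    return {server: signature(files[path]) for server, files in replicas.items()}


def majority_server(replicas, path):
    """Example policy: pick the lowest-ID server holding the most common copy."""
    sigs = collect_signatures(replicas, path)
    winning_sig, _ = Counter(sigs.values()).most_common(1)[0]
    return min(server for server, sig in sigs.items() if sig == winning_sig)


def resolve(replicas, path, choose=majority_server):
    """Detect a conflict and, if one exists, install the chosen content everywhere.

    Returns True if a conflict was found and repaired, False otherwise.
    """
    sigs = collect_signatures(replicas, path)
    if len(set(sigs.values())) <= 1:
        return False  # a single signature: the replicas agree
    content = replicas[choose(replicas, path)][path]
    for files in replicas.values():
        files[path] = content
    return True


# Hypothetical state: server 3 holds a stale copy.
replicas = {1: {"/a": b"v2"}, 2: {"/a": b"v2"}, 3: {"/a": b"v1"}}
print(resolve(replicas, "/a"), replicas[3]["/a"])
```

Passing a different `choose` function swaps the resolution policy without touching the detection step.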
16
Caching Tree
- Clients extend the replication tree to a caching tree.
- The same update algorithm constructs both the replication tree and the caching tree (minor modifications are required).
17
Cache Tree Diagram
18
File Access
- A read request is sent to the tree parent (either a server or a cache).
- A write request travels to the replication tree root (leader) and is propagated by the β-synchronizer.
- Caching consistency depends on the propagation mechanism.
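The read and write paths can be pictured on a toy tree. This is a simplified sketch with several assumptions: a node answers a read from its own store when it can and otherwise asks its parent, a write is pushed from the root to every node by plain recursion (standing in for the β-synchronizer), and leases are ignored. The `Node` class is hypothetical.

```python
class Node:
    """A server or client cache in the replication/caching tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.store = {}  # path -> content held locally
        if parent is not None:
            parent.children.append(self)

    def read(self, path):
        """Serve from the local store, otherwise ask the tree parent."""
        if path in self.store:
            return self.store[path]
        if self.parent is None:
            raise FileNotFoundError(path)
        data = self.parent.read(path)
        self.store[path] = data  # keep a cached copy on the way back
        return data

    def write(self, path, data):
        """Send the write up to the root, which then propagates it down."""
        node = self
        while node.parent is not None:
            node = node.parent
        node._propagate(path, data)

    def _propagate(self, path, data):
        # Stand-in for the synchronizer: push the update to the whole subtree.
        self.store[path] = data
        for child in self.children:
            child._propagate(path, data)


leader = Node("leader")
server = Node("server", parent=leader)
client = Node("client", parent=server)
leader.store["/f"] = b"old"

print(client.read("/f"))
client.write("/f", b"new")
print(client.read("/f"))
```

Because the update is pushed to every node before `write` returns, a later read never sees stale data in this sketch; a real propagation mechanism would determine how strong that guarantee is.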
19
Read/Write Example
20
Linux Based bguFS (1)
[Diagram labels: Application; User Level; Kernel Level; VFS; bguFS Module; Cache: valid data?; Local file system; Kernel update; SyncDaemon: Cache manager & Server; Upcalls; Network Communication; Updates]
21
Linux Based bguFS (2)
[Diagram labels: Application; User Level; Linux libc library; Library File Commands; SyncDaemon: Cache manager & Server; Upcalls; Network Communication]
New implementation for "C" commands: fopen, fclose, fread, fwrite, etc.
22
Tasks
- Leader election and a radius-based spanning tree.
- Optimal radius (binary) search and β-synchronizer.
- Distributed file R/W (operations) implementation.
- Kernel VFS module (1).
- C library "hacking" solution (2).