1
Self Stabilizing Distributed File System
Shlomi Dolev and Ronen I. Kat
Department of Computer Science, Ben-Gurion University
Research sponsored by IBM
2
DFS Motivation
- Performance
- Fault tolerance
- Placing files closer to users
3
Related Work
- File systems
- NFS – Network File System protocol
- AFS – Andrew File System – CMU (1988)
- Coda – CMU (1998)
- InterMezzo – Peter J. Braam, CMU
- Peer to peer (2000)
- Global storage: OceanStore – Berkeley
- Serverless: Microsoft Farsite
4
Talk Overview
- Self-stabilization
- Design
- Algorithms
- File system implementation
- Future work
5
Self Stabilization
- Self healing
- Adaptiveness
- Automatic recovery
- Autonomic computing
- Self stabilization: Dijkstra 1974
6
Self Stabilization
- A self-stabilizing system is a system that can automatically recover following the occurrence of (transient) faults.
- The idea is to design a system that can be started in an arbitrary state and still converge to a desired behaviour.
- See, e.g., Self-Stabilization, S. Dolev.
7
Self Stabilization Motivation
- The combination and type of faults cannot be totally anticipated in on-going systems
- Any on-going system must be self stabilizing (or manually monitored)
- A self-stabilizing algorithm can recover from any arbitrary state reached due to the occurrence of faults
8
Design
9
- Replication servers are joined to a spanning tree
- A spanning tree is constructed
- File updates are propagated using a self-stabilizing synchronizer
10
Design (Cont’)
- Clients join the replication tree and form a caching tree
- File leases
- Global locking
11
Algorithms – Self Stabilizing
- Electing a leader (leader election)
- Collecting connectivity information
- Optimising communication costs
- Synchronizer for file consistency
12
Leader Election
- A single leader coordinates construction
- If none exists, a server becomes the leader
- If more than one exists, one survives
- Messages are periodically broadcast
13
Leader Election Algorithm
Every T1 do:
    If (p = leader) then
        send-multicast(‘I’m a leader’)
        Leader-exists = true
Every T1+Td do:
    If (not leader-exists) then leader = p
    Leader-exists = false
Upon arrival of message do:
    If (p.volume = volume) then
        If (p = leader) then leader = min(leader, sender)
        Else leader = sender
        Leader-exists = true
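Note: the three rules on this slide can be tried out in a small simulation. The sketch below is not the bgRFS code. It assumes synchronous rounds (one T1 tick, reliable delivery to every other server, then one T1+Td tick), and it reads the volume test as "the message's volume equals my volume". The names Server, run_rounds and the volume label "vol0" are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    ident: int
    volume: str
    leader: int          # arbitrary initial value, possibly a non-existent ID
    leader_exists: bool  # arbitrary initial value

    def every_t1(self):
        """Heartbeat rule: a server that believes it is the leader announces it."""
        if self.leader == self.ident:
            self.leader_exists = True
            return (self.ident, self.volume)
        return None

    def every_t1_plus_td(self):
        """Timeout rule: take over if no leader was heard, then reset the flag."""
        if not self.leader_exists:
            self.leader = self.ident
        self.leader_exists = False

    def on_message(self, sender, volume):
        """Arrival rule: competing leaders yield to the smaller ID."""
        if volume != self.volume:
            return
        if self.leader == self.ident:
            self.leader = min(self.leader, sender)
        else:
            self.leader = sender
        self.leader_exists = True


def run_rounds(servers, rounds):
    """Assumed timing model: heartbeat, delivery to all others, timeout check."""
    for _ in range(rounds):
        messages = [m for s in servers if (m := s.every_t1()) is not None]
        for sender, volume in messages:
            for s in servers:
                if s.ident != sender:
                    s.on_message(sender, volume)
        for s in servers:
            s.every_t1_plus_td()


# Start from a corrupted state: everyone points at a leader (99) that does not exist.
servers = [Server(ident=i, volume="vol0", leader=99, leader_exists=True)
           for i in (7, 3, 5)]
run_rounds(servers, rounds=6)
print({s.ident: s.leader for s in servers})
```

Under this timing model every server ends up pointing at the smallest ID, even though the initial state named a leader that does not exist.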
14
Algorithms – Self Stabilizing
- Electing a leader (leader election)
- Collecting connectivity information
- Optimising communication costs
- Synchronizer for file consistency
15
Induced Graph Example
16
Update Algorithm
- Collect routing tables from all neighbours in the induced graph
- Elect a manager (local leader) for the tree: a server with the minimal ID
- Build a distributed BFS spanning tree
- The algorithm converges
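Note: the shape of the result can be illustrated with a centralized sketch. It is not the distributed, self-stabilizing update algorithm itself. It assumes the induced graph is already known as an adjacency map and is connected. The function name bfs_spanning_tree and the sample graph are hypothetical.

```python
from collections import deque


def bfs_spanning_tree(graph):
    """Return (root, parent map) of a BFS tree rooted at the minimal ID.

    graph maps each server ID to the set of its neighbours in the
    induced graph (assumed connected and symmetric).
    """
    root = min(graph)            # the manager: the server with the minimal ID
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(graph[node]):
            if neighbour not in parent:   # first visit fixes the BFS parent
                parent[neighbour] = node
                queue.append(neighbour)
    return root, parent


graph = {4: {2, 9}, 2: {4, 7, 9}, 7: {2}, 9: {2, 4}}
root, parent = bfs_spanning_tree(graph)
print(root, parent)
```

Each server keeps only its parent pointer, which is what a distributed version would also store locally.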
17
Algorithms – Self Stabilizing
- Electing a leader (leader election)
- Collecting connectivity information
- Optimising communication costs
- Synchronizer for file consistency
18
Optimising Communication Costs
- Goal: find the minimal radius that keeps connectivity
- Increase the radius by a factor of 2
- Run a 2nd instance of update with a smaller radius
- Search for the minimal radius using binary search
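Note: the doubling step followed by binary search can be sketched as below. The slide does not define the radius precisely, so this sketch assumes a symmetric matrix of positive integer distances between at least two servers, and treats two servers as neighbours when their distance is at most the radius. The names is_connected, minimal_radius and the sample positions are hypothetical.

```python
def is_connected(dist, radius):
    """True if the graph with edges dist[u][v] <= radius is connected."""
    n = len(dist)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in range(n):
            if v not in seen and dist[u][v] <= radius:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


def minimal_radius(dist, start=1):
    """Double the radius until connected, then binary-search downwards."""
    hi = start
    while not is_connected(dist, hi):
        hi *= 2
    # hi // 2 was tested and failed, unless the start radius already worked.
    lo = hi // 2 if hi > start else 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_connected(dist, mid):
            hi = mid
        else:
            lo = mid
    return hi


# Four servers on a line at positions 0, 3, 4 and 10.
positions = [0, 3, 4, 10]
dist = [[abs(a - b) for b in positions] for a in positions]
print(minimal_radius(dist))
```

Binary search is valid here because connectivity is monotone in the radius: once a radius keeps the graph connected, every larger radius does too.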
19
Tree Structure
20
Caching Tree
- Extends the replication tree
- The update algorithm constructs both
- Servers execute two instances
- Caches execute one instance
21
Combined Spanning Tree
22
Algorithms – Self Stabilizing
- Electing a leader (leader election)
- Collecting connectivity information
- Optimising communication costs
- Synchronizer for file consistency
23
Synchronization Mechanism
- Provide reliable command and timing
- Propagate commands between servers
- Collect and distribute information
24
Replication Consistency
- Verifies signatures
- Multiple signatures indicate a conflict
- Conflict resolution
- Broadcast resolved signature
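Note: the conflict check can be sketched as below. The slides do not say how signatures are computed or how conflicts are resolved, so both are assumptions here: the signature is a SHA-256 digest of the file content, and the hypothetical resolver keeps the signature reported by most replicas, breaking ties lexicographically. The names signature and check_replicas are hypothetical.

```python
import hashlib
from collections import Counter


def signature(content: bytes) -> str:
    """Assumed signature scheme: SHA-256 of the file content."""
    return hashlib.sha256(content).hexdigest()


def check_replicas(reported):
    """reported maps server ID -> signature for one file.

    Returns (conflict, resolved_signature). More than one distinct
    signature counts as a conflict.
    """
    distinct = set(reported.values())
    if len(distinct) <= 1:
        return False, next(iter(distinct), None)
    counts = Counter(reported.values())
    # Placeholder policy: most common signature, ties broken by string order.
    winner = min(distinct, key=lambda sig: (-counts[sig], sig))
    return True, winner


good = signature(b"version 2")
stale = signature(b"version 1")
conflict, resolved = check_replicas({1: good, 2: good, 3: stale})
print(conflict, resolved == good)
```

The resolved signature is what would then be broadcast to the replicas.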
25
Locking Table
- A (unified) global lock table
- Locks are requested
- Leader resolves multiple locks
- Locks are removed by cancelling the lock request
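Note: a lock table of this kind can be sketched as a map from file to the set of pending requests. The slide does not state how the leader chooses among competing requests, so the rule used here (the smallest requester ID wins) is only a placeholder assumption. The class name LockTable and its methods are hypothetical.

```python
class LockTable:
    """Sketch of a unified lock table: a lock exists while its request exists."""

    def __init__(self):
        self.requests = {}  # file path -> set of requesting node IDs

    def request(self, path, node):
        """A node asks for the lock on path."""
        self.requests.setdefault(path, set()).add(node)

    def cancel(self, path, node):
        """Cancelling the request is what releases the lock."""
        holders = self.requests.get(path, set())
        holders.discard(node)
        if not holders:
            self.requests.pop(path, None)

    def holder(self, path):
        """Leader-side resolution (assumed policy: minimal ID wins)."""
        nodes = self.requests.get(path)
        return min(nodes) if nodes else None


table = LockTable()
table.request("/vol0/report.txt", 5)
table.request("/vol0/report.txt", 2)
first = table.holder("/vol0/report.txt")
table.cancel("/vol0/report.txt", 2)
second = table.holder("/vol0/report.txt")
print(first, second)
```

Because the lock is derived from the request set, a request that is withdrawn leaves no separate lock state behind to clean up.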
26
File System Implementation
27
Accessing a File
[Flowchart. Decisions: "Update?", "Cached?" (Yes / No). Actions: Lock file, Get signature, Get a copy, Use local copy]
28
Closing a File
[Flowchart. Decision: "Update?" (Yes / No). Actions: Send new signature, Confirm signature]
29
Meta Access
- Globally processed
- Blocked until a lock is obtained
- Steps: lock file, execute command, wait for confirmation
30
Linux Based bgRFS
[Architecture diagram. Labels:]
- Application (user level)
- Linux system calls
- System calls, new implementation: open, close, lstat, mkdir, etc.
- SyncDaemon: cache manager & server
- Up calls
- Network communication
31
Future Work
- Kernel VFS module
- Communication improvements:
  – Reducing update messages
  – Using timers with the synchronizer
- Performance enhancements
- Integrating disconnected operations
- Conflict resolution algorithms
32
Credits
Undergraduate Students:
- Amir Livneh livneha@cs.bgu.ac.il
- Itay Granik granik@cs.bgu.ac.il
- Boris Lansky lanskyb@cs.bgu.ac.il
- Naama Shmuel shmueln@cs.bgu.ac.il
- Moshe Shish shishm@cs.bgu.ac.il
- Guy Erlich erlichg@cs.bgu.ac.il
- Avital Chohen avitalco@cs.bgu.ac.il
- Yael Biran birany@cs.bgu.ac.il
- Tamir Fridman tamirf@cs.bgu.ac.il
- Shiraz Bernard shirazb@cs.bgu.ac.il
- Zvika Ferents ferents@cs.bgu.ac.il
- Roy Feintuch feintuch@cs.bgu.ac.il
- Chen Shalev shalevc@cs.bgu.ac.il
- Shay Kraim kraim@cs.bgu.ac.il
- Alex Hayuit
Faculty:
- Prof. Shlomi Dolev dolev@cs.bgu.ac.il
Graduate Students:
- Ronen I. Kat kat@cs.bgu.ac.il
System Engineer:
- Albina Budker albinabu@cs.bgu.ac.il
33
Visit us at www.cs.bgu.ac.il/~bgrfs