Authors: Alessandro Duminuco, Ernst Biersack, and Taoufik En-Najjary

Proactive Replication in Distributed Storage Systems Using Machine Availability Estimation
Authors: Alessandro Duminuco, Ernst Biersack, and Taoufik En-Najjary
Presented by Xiaoyu Sun, 9/21/2018

Outline
Motivation
Goal of this paper
An adaptive control problem
Impact of estimation time
A hybrid scheme for availability
Validation
Experiments
Conclusion
Questions to answer: What system are we talking about? Why is the problem this paper tries to handle important? What approaches have people already used? What are the advantages and disadvantages of these approaches?

Motivation
Peer-to-peer (P2P) based distributed storage systems
Service guarantees
What is P2P? What can P2P be used for? Content delivery, networking, and search (e.g. YaCy, a free distributed search engine).

Motivation
Durability: once stored, data are never lost, although they may not be available all the time.
Availability: data can be retrieved at any moment.

Motivation
Methods used for redundancy in storage systems:
Replication of the original data
Parity encoding of the original data
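To make the difference between the two redundancy methods concrete, here is a minimal sketch of the simplest parity code: a single XOR parity block. It is an illustration only. The function names are hypothetical, and the paper's storage system may use a different erasure code.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    # zip(*blocks) walks the blocks column by column (one int per block).
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_parity(surviving + [parity])


# Illustrative data: three data blocks plus one parity block.
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(data)

# Lose the second block, then rebuild it from the rest.
recovered = recover_missing([data[0], data[2]], parity)
```

Replication would store full copies of `data`; the parity block here costs one extra block and tolerates the loss of any single block.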

Motivation
Types of failure behavior
Permanent failures: data must be repaired to cope with permanent failures and to assure durability.
Transient failures: the peer is later reintegrated into the system.
Hint: traces of peer availability show that temporary disconnections are much more frequent than permanent ones.

Motivation
Advantages of the reactive approach: adaptiveness, availability
Disadvantages of the reactive approach: waste of resources, bursty use of resources

Motivation
Advantages of the proactive approach: a fixed repair rate, smooth resource usage
Disadvantages of the proactive approach: it fails to handle changing failure behavior, so durability can be compromised

Goal of this paper
Durability
Adaptiveness
Under limited network bandwidth: maximize the smoothness of the repair rate, and with it the smoothness of the bandwidth needed for the repairs. Bursty repairs degrade the performance of other activities.

An adaptive control problem
Periodically infer the failure behavior of the peers.
The number of available peers at time t.
R(t): the repair rate, a time-dependent signal.
∆T: the observation period used by the estimator, and the interval between two updates of R(t).
Signals report the occurrence of a repair or a reconnection.

The system model
Peer states: connected, temporarily disconnected, abandoned.
µ: single-peer disconnection rate (related to the session time).
Single-peer reconnection rate (related to the disconnection time).
R(t): repair rate, a time-dependent signal.
P: abandon probability.
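The three-state model can be sketched as a small simulation of one peer. This is a minimal sketch under assumptions the slide does not state: session and disconnection times are drawn from exponential distributions, and the abandon decision is taken at each disconnection with probability P. The function and parameter names are hypothetical.

```python
import random


def simulate_peer(mu, reconnection_rate, p_abandon, rng, max_sessions=10_000):
    """Follow one peer until it abandons (or max_sessions is reached).

    Returns (time_connected, time_disconnected, sessions).
    """
    time_connected = 0.0
    time_disconnected = 0.0
    sessions = 0
    for _ in range(max_sessions):
        # Connected state: session length with mean 1/mu (assumed exponential).
        time_connected += rng.expovariate(mu)
        sessions += 1
        # On disconnection the peer abandons with probability p_abandon.
        if rng.random() < p_abandon:
            break
        # Temporarily disconnected: mean 1/reconnection_rate (assumed exponential).
        time_disconnected += rng.expovariate(reconnection_rate)
    return time_connected, time_disconnected, sessions


rng = random.Random(0)  # fixed seed so the run is repeatable
connected, disconnected, sessions = simulate_peer(
    mu=1.0, reconnection_rate=2.0, p_abandon=0.1, rng=rng
)
```

With these assumptions a peer goes through 1/P sessions on average before abandoning, which is why temporary disconnections dominate when P is small.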

The system model
Q1 represents the peers in the connected state.
Q2 represents the peers in the disconnected state.
G/G/1 (Kendall notation): the first G stands for the probability distribution of the inter-arrival times, the second G stands for the probability distribution of the service times, and 1 stands for the number of servers.
Distribution symbols: M stands for an exponential probability density, G for any arbitrary probability distribution, and D for the deterministic case in which all customers have the same value.

The system model
Little's law: the long-term average number of customers in a stable system, L, is equal to the long-term average arrival rate, λ, multiplied by the long-term average time a customer spends in the system, W. Expressed algebraically: L = λW.
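A short worked example of L = λW, applied to a queue of disconnected peers. The numbers are hypothetical and chosen only to make the arithmetic easy to follow; they do not come from the paper.

```python
def littles_law_population(arrival_rate: float, mean_time_in_system: float) -> float:
    """Long-term average population L = lambda * W of a stable system."""
    if arrival_rate < 0 or mean_time_in_system < 0:
        raise ValueError("rate and time must be non-negative")
    return arrival_rate * mean_time_in_system


# Hypothetical numbers: peers enter the disconnected state at 12 per hour
# and stay disconnected for 0.5 hours on average.
avg_disconnected_peers = littles_law_population(12.0, 0.5)
print(avg_disconnected_peers)  # 6.0
```

The law holds regardless of the distributions involved, which is why it fits a model whose queues are described with general (G) distributions.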

The estimator
The estimator estimates two parameters: μ and P.
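The estimator's equations did not survive in this transcript, so the following is only a naive sketch of how μ and P could be estimated from event counts over one observation period. It is an assumption, not the paper's estimator: μ is taken as disconnections per connected peer per unit of time, and P as the fraction of disconnections not matched by a reconnection, which is only reasonable when the system is close to steady state. All names are hypothetical.

```python
def estimate_parameters(disconnections, reconnections, avg_connected_peers, delta_t):
    """Naive estimates (mu_hat, p_hat) from counts observed during delta_t."""
    if disconnections <= 0 or avg_connected_peers <= 0 or delta_t <= 0:
        raise ValueError("need positive counts, peers and period length")
    # Disconnection rate per connected peer.
    mu_hat = disconnections / (avg_connected_peers * delta_t)
    # Share of disconnections never followed by a reconnection (assumption).
    p_hat = 1.0 - reconnections / disconnections
    # Clamp, because short periods can show more reconnections than disconnections.
    p_hat = min(1.0, max(0.0, p_hat))
    return mu_hat, p_hat


# Hypothetical observation: 50 disconnections and 45 reconnections among
# 100 connected peers during 10 time units.
mu_hat, p_hat = estimate_parameters(50, 45, 100, 10.0)
```

With these numbers `mu_hat` is about 0.05 per time unit and `p_hat` is about 0.1.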

The estimator

The controller

Impact of estimation time
The estimation time ∆T is the most crucial parameter of this model.
Impact on bandwidth usage: with ∆T = 0, one tries to push system reactivity too much. The result is uneven use of the bandwidth resources.
Robustness of the estimation: the time needed to estimate the parameters is dynamic. A concern is correlated failures of many nodes, where most of the available fragments suddenly disappear.
Guideline: choose the maximum ∆T that divides time into segments in which the system can be approximated as being statistically stable. Any different choice would make the controller follow short-term fluctuations.

Impact of estimation time
The implementation in this paper does not fix ∆T.
D denotes the average number of disconnections observed during an estimation period.
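One way to read this slide is that the period length follows from D instead of being fixed. The sketch below assumes that connected peers disconnect independently at rate μ, so about N · μ · ∆T disconnections are expected in a period with N connected peers. The formula is an illustration of that assumption, not the paper's equation, and the names are hypothetical.

```python
def estimation_period(target_disconnections, connected_peers, mu_hat):
    """Period length expected to contain `target_disconnections` events."""
    if target_disconnections <= 0 or connected_peers <= 0 or mu_hat <= 0:
        raise ValueError("all arguments must be positive")
    # Expected disconnections per unit of time across all connected peers.
    system_disconnection_rate = connected_peers * mu_hat
    return target_disconnections / system_disconnection_rate


# Hypothetical values: D = 50, 100 connected peers, mu_hat = 0.05 per time unit.
delta_t = estimation_period(50, 100, 0.05)
```

Under this reading the period shrinks automatically when peers disconnect more often and grows when they are stable.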

A hybrid scheme for availability
The objective of the controller is to make the repair rate equal to the rate of permanent failures.
Define a lower threshold THpro: if the number of available fragments hits THpro, the system switches to a purely reactive scheme.
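The switching rule can be sketched as follows. This is a minimal sketch under assumptions: the rate of permanent failures is approximated as P · μ times the number of available fragments, and the reactive branch rebuilds fragments up to a hypothetical `target_fragments` level that the slide does not mention. The names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class RepairDecision:
    mode: str               # "proactive" or "reactive"
    repair_rate: float      # repairs per unit of time (proactive mode)
    immediate_repairs: int  # fragments rebuilt right away (reactive mode)


def hybrid_controller(available, mu_hat, p_hat, th_pro, target_fragments):
    """Pick the repair mode from the current number of available fragments."""
    if available <= th_pro:
        # At or below the threshold: repair immediately, without smoothing.
        return RepairDecision("reactive", 0.0, max(0, target_fragments - available))
    # Otherwise match the estimated rate of permanent failures (assumption).
    return RepairDecision("proactive", p_hat * mu_hat * available, 0)


# Hypothetical values: THpro = 10 and a target of 32 fragments.
normal = hybrid_controller(20, mu_hat=0.05, p_hat=0.1, th_pro=10, target_fragments=32)
emergency = hybrid_controller(8, mu_hat=0.05, p_hat=0.1, th_pro=10, target_fragments=32)
```

Here `normal` stays proactive with a rate of about 0.1 repairs per time unit, while `emergency` switches to reactive mode and asks for 24 immediate repairs.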

Validation
System model validation

Validation

Validation
Estimator validation
The convergence time depends on μ. This leads us to say that in a changing environment we cannot use a constant estimation period; instead, ∆T_D should be adapted to the order of magnitude of the parameter μ, as we did in eq. (8).

Validation
Controller validation

Validation
[Figure: upper-left and lower-right panels]

Experiments
Goals of the experiments: evaluate
the capacity to assure durability;
the smoothness of the repair rate.

Experiments
The cost of the reactive scheme is bursty repair activity.

Experiments
When D is too big with respect to the parameter dynamics, the estimator is not able to cope with their changes.
For too small values of D the estimation is not reliable, and for too big values the distribution of the number of available fragments degrades too much.

Experiments

Conclusion
This system combines the resilience of reactive schemes with the smoothness of proactive schemes.
The authors validated the proposed scheme and demonstrated its effectiveness using synthetic data.
