Data Replication CS 188 Distributed Systems February 3, 2015

Some Other Possibilities
What if the machines sharing files are portable and not always connected? What if the machines communicate across the Internet? What if the load on some files is too heavy for a single machine?

An Answer to These Questions
Replicate the data Keep multiple copies of the data on different machines Depending on details, make different copies available for different purposes

How Does This Help?
What if the machines sharing files are portable and not always connected? Put a replica of the data on the portable machine
What if the machines communicate across the Internet? Avoid expensive cross-Internet traffic by having replicas on both sides
What if the load on some files is too heavy for a single machine? Share the load among multiple replicas

Other Replication Advantages
Reliability: if one machine fails, replicas of its data might be elsewhere
Flexibility: easier to assign data workloads to storage resources

The Replication Concept
[Figure: multiple identical copies of the same text] There is a conceptual object (like a file) We keep more than one physical copy of it Maybe several Each copy is meant to be a full representation of the object So accessing any should be the same as accessing any other

Replication and Caching
The two are obviously similar Caching usually implies it’s temporary Replication usually implies it’s permanent Caching is usually for local use only Replication is usually for more general use These distinctions are not actually binary, though Permanent isn’t always really permanent Some caches service multiple machines

There Are Some Differences
For example, invalidation on write is feasible for cached data It isn’t feasible for replicated data One can always throw away a cached copy of data (modulo local needs) One can’t always throw away a replica Especially the only one

Replication and Reading
If the data is read-only, the replication problem is easy IF . . . The problems arise if the data is ever written Life then becomes much more complicated

Read-Only Replication
Merely ensure that all copies start off the same They never change Accessing any copy as good as any other Still a problem of finding and choosing replicas to access

Read-Only Data and Metadata
Usually we treat file metadata as part of the file Maybe the data is read only But is the metadata? How about access permissions? How about access time? If metadata can be updated, you still have issues

Choosing Read-Only Replicas
Mostly a performance question Which one is “closest?” Which one is “least loaded?” Initial placement might make a big difference And what if replicas can move?
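The two selection criteria above can be sketched as follows. This is only an illustration: the `Replica` record, its `rtt_ms` and `load` fields, and the sample values are hypothetical, and a real system would have to measure them somehow.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    rtt_ms: float  # hypothetical metric: measured round-trip time to the replica
    load: float    # hypothetical metric: fraction of capacity in use (0.0 to 1.0)


def closest(replicas):
    """Pick the replica with the smallest round-trip time."""
    return min(replicas, key=lambda r: r.rtt_ms)


def least_loaded(replicas):
    """Pick the replica with the lowest current load."""
    return min(replicas, key=lambda r: r.load)


replicas = [
    Replica("local-lan", rtt_ms=2.0, load=0.95),
    Replica("same-region", rtt_ms=20.0, load=0.40),
    Replica("cross-internet", rtt_ms=150.0, load=0.10),
]

print(closest(replicas).name)       # local-lan
print(least_loaded(replicas).name)  # cross-internet
```

The two criteria can disagree, as they do here, which is one reason initial placement matters.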

Varying Read-Only Replication Factors
We can add or delete read-only replicas easily Some issues regarding open files When should we add a replica? When should we delete a replica? When should we move a replica to a different location?

Replication and Writing
Life becomes complicated when you write replicated data Physically the write occurs at one copy Logically the write should be applied to all copies Going from the physical reality to the logical goal is challenging

Illustrating the Problem
[Figure: a yellow replica and a blue replica of the same text; the yellow replica receives new text, which is then copied to the blue replica] We write to the yellow replica The yellow and blue replicas should be the same, but they aren’t What do we do? Problem solved! But . . .

A Fly in the Ointment
[Figure: one replica holds the new text, the other still holds the old text] We’ve gotten ourselves into this state What if the writer’s next access is to the other replica?

A Worse Situation
[Figure: one replica holds the new text, the other still holds the old text] What if someone else reads the other copy?

An Even Worse Situation
[Figure: each of the two replicas has been overwritten with a different new text] What if someone else writes the other copy?

These Situations Arose Before Distributed Computing
What if there are two processes on one machine? What if they read a file and then both choose to write it? Or one writes without the other’s knowledge? Still problematic, but easier to solve

Single Machine Solutions
Have only one copy of shared data Replication advantages less on a single machine, anyway Use locks to control access to shared data Both solutions rely on a single piece of storage that both parties consult So they don’t work on two machines
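A minimal sketch of the single-machine lock solution, using Python threads as stand-ins for the two parties. The `shared` dictionary and the `writer` function are illustrative names, not anything from the lecture. Note that both threads consult the same `lock` object in the same memory, which is exactly what two separate machines cannot do.

```python
import threading

shared = {"value": 0}    # the single copy of the shared data
lock = threading.Lock()  # one lock, stored in one place, consulted by everyone


def writer(times):
    for _ in range(times):
        with lock:  # read-modify-write happens under the lock
            shared["value"] += 1


threads = [threading.Thread(target=writer, args=(10000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["value"])  # 20000
```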

Cross-Machine Locking
Why can’t I just share a lock between two machines? A lock is really a piece of data Saying who holds it Either you store it on one machine or on both Storing on just one leads to performance and reliability problems Storing on both gets us back to our original problem But now the shared data is the lock itself

Primary Copy Options
Only allow writes to one replica So no issue of conflicting writes to different replicas Doesn’t solve the read/write concurrency problem Issues if the primary copy fails Or if its server is overloaded Or if there are network partitions
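A rough sketch of a primary copy scheme, under the assumption that the primary pushes updates to the other replicas some time after accepting them. The class and method names (`PrimaryCopyGroup`, `propagate`, and so on) are invented for illustration. It shows that conflicting writes are ruled out while a reader of a secondary can still see stale data.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class PrimaryCopyGroup:
    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries
        self.pending = []  # updates accepted by the primary but not yet pushed

    def write(self, replica, key, value):
        # Only the primary accepts writes, so writes cannot conflict.
        if replica is not self.primary:
            raise PermissionError("writes must go to the primary copy")
        self.primary.data[key] = value
        self.pending.append((key, value))

    def propagate(self):
        # Push pending updates to every secondary, in the order accepted.
        for key, value in self.pending:
            for secondary in self.secondaries:
                secondary.data[key] = value
        self.pending.clear()


p, s = Replica("primary"), Replica("secondary")
group = PrimaryCopyGroup(p, [s])

group.write(p, "doc", "v2")
stale = s.read("doc")  # None: the secondary has not seen the write yet
group.propagate()
fresh = s.read("doc")  # "v2"
```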

A Diversion Into Clocks
Ultimately, these issues relate to the question of ordering events What order do things happen in? In a distributed system One form of ordering used a lot in the real world is time Can we use time to solve our problem?

Time Services
One way to make things happen in order is to timestamp them Read a clock and slap a time stamp on the event As in normal life, things only happen in time order Possible solution for ordering distributed events

Time Services and Replication
Maybe we can slap a timestamp on every write And maybe use timestamps to control reads The timestamps of multiple writes control the order in which they occur Doesn’t solve all the problems, but does solve some
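One way timestamps could control the order of writes is a "newest timestamp wins" rule, sketched below. The `TimestampedReplica` class and the integer timestamps are assumptions made for the example. Two replicas that receive the same writes in different orders end up in the same state, but the sketch does nothing sensible about writes that carry equal timestamps, which is one of the problems timestamps alone do not solve.

```python
class TimestampedReplica:
    def __init__(self):
        self.value = None
        self.stamp = None

    def apply(self, stamp, value):
        """Keep a write only if it is newer than the one currently held."""
        if self.stamp is None or stamp > self.stamp:
            self.value, self.stamp = value, stamp
            return True
        return False


# Hypothetical writes, each tagged with the clock reading taken by its writer.
writes = [(322, "from node 1"), (315, "from node 2"), (327, "from node 3")]

a, b = TimestampedReplica(), TimestampedReplica()
for stamp, value in writes:            # replica a sees one arrival order
    a.apply(stamp, value)
for stamp, value in reversed(writes):  # replica b sees the opposite order
    b.apply(stamp, value)

print(a.value, "|", b.value)  # from node 3 | from node 3
```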

Using a Clock
[Figure: Nodes 1, 2, and 3 each read the clock and send timestamped messages (3:15, 3:22, 3:27)] Now B can know the proper order of writes

The Problem With Clocks
A clock is (ultimately) a physical resource So it’s in exactly one place We use messages to access remote places And messages take varying amounts of time to get from one place to another So, with a single clock, can’t guarantee proper ordering
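The effect of varying message delays can be shown with a few lines of arithmetic. In this sketch (all names and numbers are made up), each event asks a single remote clock for a timestamp, and the clock is read when the request arrives. The event that really happened first ends up with the later timestamp.

```python
# Each event occurs at some real time and then needs a message to reach
# the one clock; the timestamp is whatever the clock reads on arrival.
events = [
    {"name": "X", "occurs_at": 0.0, "delay_to_clock": 5.0},
    {"name": "Y", "occurs_at": 1.0, "delay_to_clock": 1.0},
]

for e in events:
    e["stamp"] = e["occurs_at"] + e["delay_to_clock"]

real_order = [e["name"] for e in sorted(events, key=lambda e: e["occurs_at"])]
stamp_order = [e["name"] for e in sorted(events, key=lambda e: e["stamp"])]

print(real_order)   # ['X', 'Y']
print(stamp_order)  # ['Y', 'X']
```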

Solutions to Clock Problems
Physical clocks Logical clocks

Physical Clocks
Each node keeps its own local clock Modern machines always have them, anyway Stamp each synchronizable event with the local clock Problem becomes keeping the clocks synchronized

Globally Accessible Clocks
In the general case, this usually means GPS clocks GPS satellites broadcast highly accurate clock signals Over the entire Earth’s surface Anyone with a GPS receiver that’s working can hear it

Pros and Cons of Physical Clocks
Pro: simplicity
Cons: need constant access to clock; transmission errors/delays damage synchronization; requires strong knowledge of transmission delays; never possible to reduce clock skew to zero

Logical Clocks
Don’t try to keep track of passage of actual time Use a logical mechanism to keep track of proper order of events Essentially, assign artificial timestamps that maintain the causality required for the computation

When Are Logical Clocks Useful?
When relative order of events is the issue Rather than relationship to wall clock time Often the case for operations of distributed applications Not always when there is a relationship to the real world

Lamport Clocks
Fundamental logical clock system Each process Pi has a clock Ci Each event is assigned a time at its processor → is the happens-before relation a → b means a happened before b If a → b, C(a) < C(b)

Implementing Lamport Clocks
Whenever an event occurs, increment the local clock Assign new value to event But how do we provide the correct global view? Since processes live on different processors

Handling Messages in Lamport Clocks
Processes communicate only via send and receive of messages Which are events If Pi sends to Pj, Ci(send) < Cj(receive) Since send must happen-before receive How do we force that?

Rules for Lamport Clocks
1) If a → b within the same process, C(a) < C(b)
2) If a is a sending event in Pi and b is the corresponding receiving event in Pj, then C(a) < C(b)
Enforcing Rule 1 is easy, since it’s on the same processor

Enforcing Rule 2
Timestamp outgoing messages with time of send Receiver j adds increment d to maximum of message timestamp and local clock Cj = max(C(a), Cj) + d, then C(b) = Cj Ensures that receive event b gets a clock value after send event a
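The two rules can be sketched in a few lines. The `LamportClock` class and its method names are illustrative; the update on receive is the rule stated above, Cj = max(C(a), Cj) + d, with d defaulting to 1.

```python
class LamportClock:
    def __init__(self, d=1):
        self.time = 0
        self.d = d  # the increment

    def local_event(self):
        """Rule 1: every local event advances the local clock."""
        self.time += self.d
        return self.time

    def send(self):
        """Sending is an event; its timestamp travels with the message."""
        return self.local_event()

    def receive(self, msg_time):
        """Rule 2: Cj = max(message timestamp, Cj) + d."""
        self.time = max(msg_time, self.time) + self.d
        return self.time


pi, pj = LamportClock(), LamportClock()

c_a = pi.local_event()          # 1
c_send = pi.send()              # 2
c_receive = pj.receive(c_send)  # max(2, 0) + 1 = 3

print(c_a, c_send, c_receive)   # 1 2 3
```

With one local event followed by a send on one process and the matching receive on another, the sketch produces the values 1, 2, and 3.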

Lamport Clocks Example 1
[Figure: process i performs event a and then a send; process j performs the matching receive] C(a) = 1, C(send) = 2, C(receive) = 3 C(a) < C(send) C(send) < C(receive)

Properties of Lamport Clocks
Happens-before is transitive If a → b and b → c, then a → c If a → b, then C(a) < C(b) But the converse is not true C(a) < C(b) does not imply a → b How can that happen?

Lamport Clock Example 2
[Figure: events a and b on process i, event d on process j, no messages between them] C(a) = 1, C(b) = 2, C(d) = 1 C(a) < C(b) C(d) < C(b) ????!!!!

The Sad Truth About Distributed Systems Concurrency
Abandon all hope ye who enter here You’ve got to forget your godlike view In the absence of a physical clock, YOU CAN’T ORDER ALL EVENTS PROPERLY!!!!!!!! But perhaps you don’t believe that . . .

Lamport Clock Example 3
[Figure: the same events as Example 2, occurring in a different real-time order] C(a) = 1, C(b) = 2, C(d) = 1 But the “order” of events was different than before
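Examples 2 and 3 can be reproduced with a small sketch. The `stamp` helper is hypothetical: it takes events in real-time order, with no messages exchanged, and applies only the local increment rule. The two schedules differ in when d really happens, yet the assigned clock values are identical.

```python
def stamp(schedule):
    """Assign Lamport times to (process, event) pairs given in real-time order.

    No messages are exchanged, so each process just counts its own events.
    """
    clocks, stamps = {}, {}
    for process, event in schedule:
        clocks[process] = clocks.get(process, 0) + 1
        stamps[event] = clocks[process]
    return stamps


example2 = stamp([("i", "a"), ("i", "b"), ("j", "d")])  # d happens last
example3 = stamp([("j", "d"), ("i", "a"), ("i", "b")])  # d happens first

print(example2)              # {'a': 1, 'b': 2, 'd': 1}
print(example2 == example3)  # True
```

In both runs C(d) < C(b), even though nothing connects d to b, so the clock values alone cannot reveal which schedule took place.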

Why Do We Have This Problem?
Not really because we aren’t keeping a physical clock It’s because we aren’t communicating enough to derive the order If each process sent the other a message after each local event, our examples would have proper ordering

Obtaining the Proper Order for Example 2
[Figure: Example 2 with a synchronization message sent after each local event] C(a) < C(b), C(b) < C(d)

And For Example 3
[Figure: Example 3 with a synchronization message sent after each local event] C(d) < C(a), C(a) < C(b)

But There’s a Problem
What if we have true concurrency? What if an event occurs while a synchronization message is in transit?

Lamport Clocks Example 4
[Figure: event a on process i and event d on process j both occur while a synchronization message is in transit] C(d) = C(a) Because of concurrency, you can’t win

Lamport Clocks and Partial Orders
Basic Lamport clocks only give a partial order They don’t order events with equal times Easy to provide a full order Number all processes Concatenate process number to clock

In Our Examples
Say process i is numbered 1 and process j is numbered 2 In example 1, no equal times In example 2, C(a) = (1,1), C(b) = (2,1), C(d) = (1,2) So C(a) is ordered before C(d)
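Concatenating the process number to the clock amounts to comparing (clock, process number) pairs, which Python tuples do lexicographically. A small sketch using the Example 2 values, with the dictionary layout chosen just for illustration:

```python
# (Lamport time, process number) for each event in Example 2.
events = {"a": (1, 1), "b": (2, 1), "d": (1, 2)}

# Tuples compare on the clock first and fall back to the process number
# only to break ties, so every pair of events is ordered.
total_order = sorted(events, key=lambda name: events[name])

print(total_order)  # ['a', 'd', 'b']
```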

Fully Ordered Clocks in Example 1
[Figure: Example 1 with process numbers appended] C(a) = (1,1), C(send) = (2,1), C(receive) = (3,2)

[Figure: Example 2 with process numbers appended] C(a) = (1,1), C(b) = (2,1), C(d) = (1,2) But d still ordered before b

Don’t Read Too Much Into This Ordering
In example 3, C(a) = (1,1), C(b) = (2,1), C(d) = (1,2) C(a) is still ordered before C(d) Even though we “know” d happened first This ordering is complete, but somewhat arbitrary

[Figure: Example 3 with process numbers appended: C(a) = (1,1), C(b) = (2,1), C(d) = (1,2)]

Vector Logical Clocks
In normal Lamport clocks, C(a) < C(b) does not imply that a happened before b; a and b might be concurrent instead Vector clocks allow us to distinguish those cases At the cost of keeping more information

How Vector Clocks Work
Each process keeps a vector of clocks For n total processes, n vector elements, one per process Each element of the vector is the newest clock from that process seen locally Comparisons are done on full vectors

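Vector Clock Example
[Figure: events a and e on process i, events b and f on process j, event d on process k; a message from e is received at f] C(a) = 1,0,0 C(e) = 2,0,0 C(b) = 0,1,0 C(f) = 2,2,0 C(d) = 0,0,1 Greater-than operation matches Lamport criteria exactly When message arrives, set each vector element separately C(a) < C(e) < C(f) C(b) < C(f) But C(a) !< C(b)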
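A minimal vector clock sketch follows. The class and helper names are illustrative, and processes i, j, and k are assumed to occupy vector positions 0, 1, and 2. On receive, each element is set separately to the larger of the local and incoming values, and the receiver then counts the receive as its own event.

```python
class VectorClock:
    def __init__(self, n, me):
        self.v = [0] * n  # one element per process
        self.me = me      # index of the owning process

    def local_event(self):
        self.v[self.me] += 1
        return tuple(self.v)

    def receive(self, msg):
        # Set each element separately to the newest value seen,
        # then count the receive itself as a local event.
        self.v = [max(x, y) for x, y in zip(self.v, msg)]
        self.v[self.me] += 1
        return tuple(self.v)


def happened_before(u, v):
    """True if u is less than or equal to v everywhere and differs somewhere."""
    return all(x <= y for x, y in zip(u, v)) and u != v


def concurrent(u, v):
    return u != v and not happened_before(u, v) and not happened_before(v, u)


i, j, k = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)

a = i.local_event()  # (1, 0, 0)
e = i.local_event()  # (2, 0, 0), sent along with a message to j
b = j.local_event()  # (0, 1, 0)
f = j.receive(e)     # (2, 2, 0)
d = k.local_event()  # (0, 0, 1)

print(happened_before(a, f), happened_before(b, f))  # True True
print(concurrent(a, b), concurrent(d, f))            # True True
```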

Vector Clocks Pros and Cons
Partial ordering only where causal relationships exist Higher overheads for clock storage and message transport Potential killers for huge numbers of processes Tricky (not impossible) when number of processes changes

What’s This Got to Do With Replication?
Writes can be “clock” events We can use vector clocks to keep track of writes to multiple replicas Doesn’t prevent concurrent writes But does detect them Which leads to the possibility of optimistic replication
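As a sketch of how vector clocks could track writes to replicas, each replica below keeps a version vector with one counter per replica and bumps its own counter on every local write. The `Replica` class, the `sync_from` method, and its return strings are assumptions made for the example, not a description of any particular system. Comparing vectors at synchronization time tells an ordinary update apart from a pair of concurrent writes.

```python
def compare(u, v):
    """Classify version vector u relative to v."""
    le = all(x <= y for x, y in zip(u, v))
    ge = all(x >= y for x, y in zip(u, v))
    if le and ge:
        return "equal"
    if le:
        return "before"
    if ge:
        return "after"
    return "concurrent"


class Replica:
    def __init__(self, index, n):
        self.index = index
        self.version = [0] * n  # one write counter per replica
        self.value = None

    def write(self, value):
        self.version[self.index] += 1  # a write is a "clock" event
        self.value = value

    def sync_from(self, other):
        relation = compare(self.version, other.version)
        if relation == "before":       # other is strictly newer: adopt it
            self.value = other.value
            self.version = list(other.version)
            return "updated"
        if relation == "concurrent":   # detected, but not resolved here
            return "conflict"
        return "up to date"


yellow, blue = Replica(0, 2), Replica(1, 2)

yellow.write("v1")
first = blue.sync_from(yellow)   # blue was behind, so it takes v1

yellow.write("v2")               # both replicas are now written
blue.write("v3")                 # without hearing from each other
second = blue.sync_from(yellow)  # [1, 1] versus [2, 0]

print(first, "/", second)  # updated / conflict
```

Detection is all this sketch does; deciding what to do with the two conflicting values is a separate question.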
