Accessing Remote Sites CS 188 Distributed Systems January 13, 2015


1 Accessing Remote Sites CS 188 Distributed Systems January 13, 2015

2 Deutsch's “Seven Fallacies of Distributed Computing”
1. The network is reliable
2. There is no latency (instant response time)
3. The available bandwidth is infinite
4. The network is secure
5. The topology of the network does not change
6. There is one administrator for the whole network
7. The cost of transporting additional data is zero
Bottom line: true transparency is not achievable

3 Introduction
Distributed systems require nodes to talk to each other
How do they do that, mechanically? And to achieve various effects, such as:
- Synchronization of computations
- Data passing
- File systems

4 Messages
The ultimate answer to all of these problems
At the bottom, it's all done with messages
But messages are very general
So even given you're using messages, there are many options

5 Key Characteristics of Messages
- They are explicitly sent: someone asks to send them
- They have well-defined contents, under the control of the sender
- They are all-or-nothing: the entire message is delivered, or nothing is
- Receipt is usually optional: the receiver must ask to get the message
- Always unidirectional, usually unicast
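These characteristics can be captured in a minimal sketch; the `Message` class and its field names are hypothetical, chosen only to mirror the bullets above:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Message:
    """One message, as characterized above (illustrative only)."""
    sender: str       # explicitly sent: some identified party asks to send it
    receiver: str     # unidirectional, usually unicast: one destination
    payload: bytes    # well-defined contents, under the sender's control

# The frozen dataclass makes the contents immutable once sent,
# matching the all-or-nothing character of delivery.
m = Message(sender="A", receiver="B", payload=b"Done!")
```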

6 Synchronizing Computation With Messages
Machine A:
    for (i = 0; i < 10; i++) {
        for (j = 0; j < 10; j++) {
            /* do stuff */
            . . .
        }
    }
Machine A then sends “Done!” to tell Machine B that the loop is done

7 Conceptually Simple
Messages can't be received before they're sent
So the receiving process can't start its computation till after the sender sends the message
The sender doesn't send until he's ready for the receiver to go ahead

8 Tricky Issues
Delivery isn't reliable: what if the message doesn't get there?
Delivery speed is not predictable
So what? But . . .
This assumes the receiver follows the rules:
- Doesn't start till he gets the message
- Does receive the message some reasonable time after its delivery
Security issues

9 What If the Sender Needs Feedback?
Suppose A should do its next task once B starts
Machine A runs its loop, then tells Machine B that the loop is done (“Done!”)
This will require a second message (“I started”), from B to A this time

10 So What?
There is now a round-trip delay: A to B, then B to A
Two opportunities for messages to get lost
Two chances for member processes to screw up
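The round trip above can be sketched the same way, with a second queue carrying the feedback message; again the queues simulating links and the message strings are illustrative assumptions:

```python
import threading
import queue

# Two one-way links: A -> B for "Done!", B -> A for the feedback.
a_to_b, b_to_a = queue.Queue(), queue.Queue()
log = []

def machine_a():
    log.append("A: loop done")
    a_to_b.put("Done!")            # first message
    ack = b_to_a.get()             # blocks for the second message: the round trip
    log.append("A: saw '%s', doing next task" % ack)

def machine_b():
    a_to_b.get()                   # wait until A's loop is done
    b_to_a.put("I started")        # feedback message, from B to A this time

ta = threading.Thread(target=machine_a)
tb = threading.Thread(target=machine_b)
tb.start(); ta.start()
ta.join(); tb.join()
```

A does not reach its next task until B's feedback arrives, which is exactly the extra delay (and the extra loss opportunity) the slide describes.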

11 So Why Wait?
Machine A runs its loop, sends “Done!” to Machine B, and immediately moves on to its next task
It does not wait for B to confirm that it started

12 Results of Not Waiting
Less delay: no round-trip delay at all
But no certainty about event ordering
Did A start its new code before or after B got the message?
What if messages are lost, B fails, etc.?

13 Lessons From Operating System Synchronization
Ensuring proper ordering of events is critical to predictable behavior
Thinking about proper synchronization is hard
Getting proper synchronization usually requires blocking
With bad performance implications and possible deadlocks

14 Distributed System Implications
Total correctness will be hard and expensive to achieve
Getting acceptable results without total correctness is desirable
For complex scenarios, there are likely to be many possible outcomes
Some likely to be unpredictable

15 Data Passing With Messages
Many distributed computations require local processes to share data, often with remote processes
That requires moving data to the other machine
Again, all you've got is messages

16 Moving Data
[Diagram: a data item (Alpha, Beta, Gamma, . . ., Omega) is copied from Machine A to Machine B]
Seems simple enough
But . . .

17 Some Complications
What if the data won't fit in one message?
We must use multiple messages
Leading to issues of ordering, losses, knowing when you're done, etc.
What if the receiver doesn't have room to store all of it?
Are we keeping a copy at the sender, as well?
If so, are both copies writeable?
If so, what happens if someone writes?
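One common way to cope with multi-message transfers is to number each part and carry the total count, so the receiver can detect ordering problems, losses, and completion. A minimal sketch (the function names and the `(seq, total, chunk)` tuple format are assumptions for illustration):

```python
def split_into_messages(data: bytes, max_payload: int):
    """Split one data item into numbered messages: (seq, total, chunk)."""
    parts = [data[i:i + max_payload] for i in range(0, len(data), max_payload)]
    total = len(parts)
    return [(seq, total, chunk) for seq, chunk in enumerate(parts)]

def reassemble(messages):
    """Rebuild the data item, in any arrival order.
    Returns None while any part is still missing."""
    if not messages:
        return None
    total = messages[0][1]
    by_seq = {seq: chunk for seq, _, chunk in messages}
    if len(by_seq) < total:
        return None                 # a part was lost or hasn't arrived yet
    return b"".join(by_seq[i] for i in range(total))

msgs = split_into_messages(b"abcdefghij", max_payload=4)
```

The receiver "knows when it's done" because every message carries the total, and misordering is harmless because parts are indexed by sequence number.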

18 File Systems With Messages
Like data passing, in most ways
But the inherent persistence of the data raises issues
Particularly for data that can be written, either on the original site or the accessing site

19 Accessing Files With Messages
[Diagram: Machine B sends “Open file X for read” to Machine A, which holds file X; someone on B now has file X open]
Now what?

20 Reading
[Diagram: Machine B, with file X open, sends “Read file X” to Machine A, which holds file X]
Note a few things:
Either machine can fail at any time
The data is still on Machine A
But there's a copy on Machine B, maybe more than one

21 Writing
[Diagram: Machine B, with file X open, sends “Write file X” to Machine A, which holds file X]
What about Machine A's original copy?

22 Some Obvious Complexities
What if the write message is lost?
What if A reads the file before the write arrives, but after B does the write?
Is that good, bad, or ambiguous?
What if A writes the file after B reads it, but before the write message arrives?
There are many other complexities

23 Some Security Issues
A message arrives purporting to come from machine X
Perhaps you would do the requested action for machine X, but . . .
Is it really machine X asking?
If machine X is asking for something, is it really what the message says?
If you send a response, how can you be sure only machine X gets it?

24 Basic Solution
Authenticate the message
Obtain evidence that the message came from its purported sender
And that it wasn't altered in transit
Don't take actions without suitably strong authentication

25 Message Authentication
In rare cases, the network is closed
E.g., it's a direct wire to machine X
Nobody else can inject the message, so it must come from X
More commonly, use cryptographic methods
Which we won't cover in detail here
But typically require secure key distribution
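One standard cryptographic method of this kind is a message authentication code (MAC) over a shared key. The sketch below uses Python's `hmac` module; the shared key and message contents are placeholders, and, as the slide notes, distributing that key securely is a separate problem assumed solved here:

```python
import hashlib
import hmac

# Assumption: both machines already hold this key, distributed securely.
SHARED_KEY = b"key-distributed-securely-in-advance"

def tag_message(payload: bytes) -> bytes:
    """Sender: compute an authentication tag to send with the message."""
    return hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()

def accept_message(payload: bytes, tag: bytes) -> bool:
    """Receiver: accept only if the tag matches, i.e. the message came
    from a key holder and wasn't altered in transit."""
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)   # constant-time compare

payload = b"machine X asks: read file Q"
tag = tag_message(payload)
```

An altered payload (or a tag forged without the key) fails the check, which is exactly the evidence the previous slide asks for.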

26 One Unhandled Issue
“If you send a response, how can you be sure only machine X gets it?”
Authentication doesn't help here
If you're sending out secret information, it isn't enough that X asked
Only X should see the answer
Usually handled by encrypting the message
With the same key-distribution issues as above

27 One Other Security Issue
Can an attacker prevent a given message from being received?
A problem of denial of service
Often achieved by causing congestion
Cryptography doesn't help here

28 Basic Data Transport
Machine A asks Machine B for some data
In general, more than can be held in one message
How do we handle that?

29 A Simple Approach
[Diagram: Machine A sends “Send me X” to Machine B, which returns data item X as parts 1 through 4]
Assuming everything goes well . . .
In distributed systems, it rarely does

30 One Possible Problem
[Diagram: Machine B sends parts 1 through 4 of X, but one part is lost on the way to Machine A]
How do we know something bad happened?
What remedial action can we take?

31 Options For Lost Messages
- Detect loss and cancel the operation
- Detect loss and request retransmission
- Go ahead without the lost data
- Ensure redundancy in the data sent and regenerate lost data on that basis
The proper choice depends on system specifics
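The second option, detect-and-retransmit, can be sketched with numbered parts: the receiver finds the gaps in the sequence numbers it holds and re-requests them. The function names are hypothetical, and the "sender" here is just a local list standing in for the remote machine:

```python
def missing_parts(received_seqs, total):
    """Detect loss: which sequence numbers (of 0..total-1) never arrived?"""
    return sorted(set(range(total)) - set(received_seqs))

def receive_with_retransmit(sender_parts, dropped_first_time):
    """Simulate a transfer where some parts are dropped on the first
    attempt and then explicitly re-requested from the sender."""
    total = len(sender_parts)
    got = {seq: sender_parts[seq] for seq in range(total)
           if seq not in dropped_first_time}
    for seq in missing_parts(got, total):
        got[seq] = sender_parts[seq]     # "please retransmit part seq"
    return [got[i] for i in range(total)]

parts = [b"part0", b"part1", b"part2", b"part3"]
```

Detecting the gap requires knowing `total`, which is why transfers like this typically announce the part count up front.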

32 Another Possible Problem
[Diagram: Machine B sends parts 1 through 4 of X, but they arrive at Machine A out of order]
How do we handle out-of-order delivery?

33 Options for Misordered Delivery
- Wait
  And, possibly, wait, and wait, and wait
  How long do you wait?
- Assume the misordered message was dropped
  And take an action appropriate for drops
  If it comes in eventually, ignore it
- On detection, ask the sender to retransmit
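The "wait" option is usually implemented as a hold-back buffer: out-of-order messages are parked until the gap before them fills, and only the longest in-order prefix is released. A minimal sketch (class and method names are assumptions; real transports such as TCP add timeouts and retransmission on top of this idea):

```python
class ReorderBuffer:
    """Hold back out-of-order messages; release them in sequence order."""
    def __init__(self):
        self.next_seq = 0      # the next sequence number we can deliver
        self.pending = {}      # parked messages that arrived early

    def receive(self, seq, payload):
        """Accept one arriving message; return the list of messages
        that can now be delivered, in order (possibly empty)."""
        if seq < self.next_seq:
            return []          # late duplicate of something already delivered
        self.pending[seq] = payload
        released = []
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released

buf = ReorderBuffer()
```

Note the open question from the slide remains: this sketch waits forever for a gap, so a real system pairs it with a timeout or a retransmission request.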

34 Another Potential Problem
What if X is big? Really big
Perhaps Machine A doesn't know how big
What's really going on at Machine A?

35 Handling the Incoming Data
Data must be stored somewhere
In RAM, at first; perhaps on disk/flash later
As each message holding part of X arrives, its content must be stored
Machine A must allocate buffer space for that data
How much, and for how long?

36 Illustrating the Problem
[Diagram: Machine A sends “Send me X”, and parts 1 through 9 of X stream in, overflowing Machine A's buffer for X]
Now what?

37 Another Wrinkle
The problem is actually worse at the system level
Every incoming message must be put in a buffer
In the OS or device driver or elsewhere
It stays in the buffer until the application process “receives” it
What if messages arrive too fast and fill all the buffers?

38 What Are Your Options?
When you have used up either the application or system buffers, the options are the same as in the network:
- Store until you run out of storage
- Send it back to where it came from
- Drop something
You can make things better by asking for help, though

39 Asking For Help
IP doesn't ask for help
It must deal with each packet on its own
At a higher level, though, help is possible
Flow control: ask the sender to slow down

40 Where Do We Do Flow Control?
Almost always end-to-end
But “end” has a flexible definition
Could be “end machine”: TCP flow control
Could be “end application”: application-level flow control
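The essence of end-to-end flow control can be shown with a credit (window) scheme: the sender may have only a bounded number of unconsumed messages outstanding, and the receiver grants a new credit each time it consumes one. This is a simplified sketch of the idea behind windowed flow control such as TCP's, not TCP itself; the class name and window size are assumptions:

```python
class CreditFlowControl:
    """Window-based flow control: at most `window` messages may be
    outstanding (sent but not yet consumed by the receiver)."""
    def __init__(self, window):
        self.credits = window   # how many more messages the sender may send
        self.in_flight = []     # sent but not yet consumed

    def try_send(self, msg):
        """Sender side: send only if a credit is available."""
        if self.credits == 0:
            return False        # the sender must slow down and wait
        self.credits -= 1
        self.in_flight.append(msg)
        return True

    def consume(self):
        """Receiver side: consuming a message grants the sender a credit."""
        msg = self.in_flight.pop(0)
        self.credits += 1
        return msg

fc = CreditFlowControl(window=2)
```

The sender can never overrun the receiver's buffers by more than the window, which bounds exactly the buffer-exhaustion problem of the earlier slides.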

41 A Brief Diversion
Why not flow control in IP?
Based, in part, on the end-to-end argument
An argument in network and system design:
Put application functionality into the application end points, not the network
E.g., if you need flow control for your application, put it in the application
Important caveat: IF the endpoints can actually do it

42 Returning to the Core Problem
We are receiving data We need to store (at least temporarily) the data we receive How do we make sure we can do so? Ideally without wasting resources

43 Possible Answers
- Figure out how much data will be sent before you set aside space
  So you definitely have enough space
- “Use” some of the data as you go along
  Allowing you to reuse buffer space
- Allocate more space as you need it
- Tell the sender to stop when you can't hold more data

44 Making Life Even Worse
The problems we've seen occur with only two participating nodes
One is a wonderful magic number in distributed systems
Everything is easier when there's only one of whatever we're considering
Two is a fairly wonderful magic number
It's either here or there, which simplifies life
Three and above aren't wonderful or magic
They make life hard

45 One Example
[Diagram: Machine A holds file X; Machines B and C each send “Read file X” to A, so both hold copies]

46 So Far, So Good, But . . .
[Diagram: Machine B sends “Write file X” to Machine A, which holds file X]
What about Machine C's copy?

47 Some Choices
- Send a message to C invalidating its copy
  Either from A or B
- Send a message to C updating its copy
- Don't allow the situation at all
  E.g., don't allow C to read X
- Don't worry about it
  It's C's problem, if C cares

48 Implications of This Choice
Making sure everyone knows what's happening requires more messages
Which could be delayed, lost, etc.
Generally more processing and delay
Not worrying about the state of other nodes requires fewer messages
But those nodes may act on old data
Could get weird behaviors
Preventing the problem from occurring reduces the system's utility and power
But at low cost and with fewer surprises

49 Which Choice Is Best?
It all depends
What is your model of system behavior?
How important is consistency, and what are the effects of inconsistency?
Remember to consider failure models
What if some message is lost/delayed?
Different distributed systems deal with the issue in different ways

50 Another Complexity
“When sorrows come, they come not single spies, but in battalions”
(William Shakespeare, Hamlet, Act 4, Scene 5)
Distributed systems suffer that problem, too
You can't be sure there's only one fault
Multiple independent or related faults are possible
And will happen, sooner or later
Will your solution to one be sabotaged by another?

51 Tying It Back To Distributed Systems
Even the relatively simple issue of moving data around proves complex
It is highly desirable to understand how your system behaves
Transparent solutions are nice, but often too expensive
An important design choice

