COSC 3407: Operating Systems Lecture 20: Networks and Distributed Systems.

COSC 3407: Operating Systems Lecture 20: Networks and Distributed Systems

This lecture… u Motivation for distributed vs. centralized systems u Survey of network technologies

Technology Trends DecadeTechnology$ per machineSales volumeUsers per machine 50’s-$10M1001000’s 60’sMainframes$1M10K100’s 70’sMinicomputers$100K1M10’s 80’sPCs$10K100M1 90’sLaptops$1K - $5K100M1 00’sPDA’s, Smart Phones $1 - $1k1B1/10

Centralized vs. Distributed Systems u Centralized System: System in which major functions are performed by a single physical computer – Originally, everything on single computer – Later: client/server model u Distributed System: physically separate computers working together on some task – Early model: multiple servers working together » Probably in the same room or building » Often called a “cluster” – Later models: peer-to-peer/wide-spread collaboration

Centralized vs. Distributed Systems u Why do we need distributed systems? – Cheaper and easier to build lots of simple computers – Easier to add power incrementally – Principle of bureaucracy avoidance – Collaboration: Much easier for users to collaborate through network resources (such as network file systems) u The added promise of distributed systems: – Higher availability – one machine goes down, use another – Better reliability – store data in multiple locations – More security – each piece easier to make secure

Centralized vs. Distributed Systems u Reality has been disappointing: – Worse availability – depend on every machine being up – Worse reliability – can lose data if any machine crashes – Worse security – anyone in world can break into system u Key idea: Coordination is more difficult, because we have to coordinate multiple copies of shared state information (using only the network). u What would be easy in a centralized system becomes a lot more difficult.

Goals: Transparencies and “ilities” u One important goal behind much distributed systems design is “Transparency”. u That is, distributed systems are complex, so you would like to shield the user from much of this complexity. (Or would you???). Two questions to ask: – Is it desirable? – Is it achieved? u Various types of transparency are: – Location – Can’t tell where resources are located – Migration – Resources may move throughout the system without the user knowing. – Replication – Can’t tell how many copies of a resource exist. – Concurrency – Can’t tell how many users there are – Parallelism – System may be able to speed up large jobs by splitting them into smaller pieces that run at the same time. – Failure – System may hide various things that go wrong in the system.

Goals: Transparencies u In addition, there are certain properties that are desirable of a distributed system. u These include: – Availability – what is the probability of the system being able to accept and process requests? – Reliability – This is usually stronger than simply availability. » It means not only is the system up, but that it is working correctly. » Thus, it includes availability, security and fault-tolerance. Fault- tolerance can be: u fail-fast, i.e. stop immediately if data starts to corrupt, difficult to achieve). u Recover, restart u Mask – hide (say, fault during plane landing)

Goals: Transparencies u Goals contd. – Performance - various metrics here: » latency, throughput, time to first byte, time to last byte, scalability, etc. » The key to performance in a distributed system is to avoid “bottlenecks”. » Possible bottlenecks include: centralized components, centralized state (e.g., system tables), centralized control, limited connectivity, and need for consistent “global” state. – Manageability – Flexibility u A great example of a system that meets many of these criteria is the Internet as it has evolved into its current form.

Networking Definitions u Network: physical connection that allows two computers to communicate. u Packet: unit of transfer, sequence of bits carried over network. – Network carries packets from one CPU to another. – Destination gets interrupt when packet arrives. u Protocol: agreement between two parties as to how information is to be transmitted, i.e. format and semantics. CPU network

Broadcast Networks u Broadcast networks: shared communication medium. u For example, shared medium can be a wire – all hosts listen to wire. – Inside a computer, this is called a “bus” – a shared set of wires between the CPU and the memory modules u Ethernet is an example broadcast network (10 Mbits/sec – 10Gbit/s) u More examples: cellular phones, GSM GPRS and EDGE, CDMA 1xRTT and 1EvDO (10Kbit/s – 2 Mbit/sec).

Delivery u When you broadcast a packet, how does receiver know who it is for? u Put header on front of packet: – Destination | Packet u For example, header would contain unique machine # (network address) of target. u Everyone gets packet, discards if not the target. u In Ethernet, this check is done in hardware; no OS interrupt if not for you. u This is an example of layering: we’re going to build up complex network protocols by layering more and more stuff on top of the packet.

Arbitration u How do your machines arbitrate for use of shared medium? u Aloha network (70’s) – packet radio between Hawaiian islands u Arbitration: blind broadcast, with checksum at end of packet. – If received ok (not garbled), send back an acknowledgement. – If not received ok, discard. u Need checksum anyway, in case airplane flies overhead (or maybe a surfer goes by), and packet gets garbled.

Arbitration u Sender waits for a while, and if doesn’t get an acknowledgement, re-transmits. u So if two senders try to send at same time, both get garbled, both simply re-send later. u Problem: stability. What if load increases? » More collisions  less gets through  more resent  more load…  More collisions… » Unfortunately: some sender may have started in clear, get scrambled without finishing u Ethernet – early 80’s, first practical local area network (Xerox PARC). 10 Mb/s – 100 Mb/s. u Most common LAN for UNIX environments, becoming more common in PC’s. u What we have in the department.

Arbitration u Used wire instead of radio, but still broadcast: all machines tap into single wire, listen to all packets. u Key advance was in a new way to do arbitration called CSMA/CD: Carrier sense multiple access/ collision detection. u Three pieces: 1.Carrier sense – don’t send unless idle 2.Collision detect – sender checks if packet is trampled. If so, abort, wait, and retry. » How long should I wait, after trying to send and failing? » What if everyone waits the same length of time? » We’d all keep colliding forever. » Also, what if load increases? More likely to get collisions. 3.Adaptive randomized waiting strategy – if try to send, and get collision, means there are a number of people trying to send, so pick bigger mean wait time. – If collide again, pick even bigger wait.

Point-to-point networks u Here’s a different way of thinking about all this – why have a shared bus for Ethernet at all, why not simplify and only have point-to-point links, plus routers/switches? u Central idea behind ATM (asynchronous transfer mode), the first commercial point-to-point LAN. u Inspiration for ATM taken from telephone network. u Point-to-point network: a network in which every physical wire is connected to only two computers u Switch: a bridge that transforms a shared-bus configuration into a point-to-point network. u Router: a device that acts as a junction between two networks to transfer data packets among them.

Point-to-point networks u A bunch of advantages: 1.Higher link performance – can drive point-to-point connection faster than broadcast link 2.Greater aggregate bandwidth than single link 3.Can add capacity incrementally – add more links/switches to get more capacity 4.Better fault tolerance (as in Internet) 5.Lower latency (no need for arbitration to send, although you do need a buffer in the switch)

Point-to-point networks u Disadvantage: more expensive than having everyone share one bus. – But, technology has been relentlessly driving the costs down! u As a result, point-to-point communications are starting to be used in everything from workstations, to local area networks, to the Internet. u Workstations: In a few years, open up a workstation, will see its CPU connected to memory and graphics engine by a switched network, instead of a bus. u Multiprocessors are already connected by hooking together lots of small-scale switches. – For instance, in a 2-D mesh, or in a hypercube. u In LAN’s, not only ATM, but now a version of Ethernet called “switched Ethernet” – uses same packet format, analog signaling as Ethernet, but only two machines on each Ethernet.

Flow Control Options u Switches look just like computers: inputs, memory, and outputs. u What if everyone sends to the same output? Congestion. What happens when buffers fill up? u Option 1: no flow control. Packets get dropped if they arrive and there’s no space. buffers CrossbarCrossbar switch Inputs Outputs

Flow Control Options (con’t) u If I send a lot, I’ll grab the buffers, and then everyone else is hosed. u Option 2: flow control between switches. Don’t send me more, unless I have room. u Problem: cross-traffic. What if path from source to destination is completely unused, but goes through some switch that has buffers filled up with unrelated traffic? A, B A C B, C, D D

Flow Control Options (con’t) u Option 3: per flow control. Allocate a separate set of buffers to each end-to-end stream, and use “don’t send me more” on each separate end-to-end stream. u Problem: fairness – Throughput of each stream is entirely dependent on topology, and relationship to bottleneck. aaaaa ababab acbcacbc dadcdbdc bbbbcccccc ddddd

Flow Control Options (con’t) u Automobile Analogy – At traffic jam, one strategy is merge closest to the bottleneck » Why people get off at one exit, drive 50 feet, merge back into flow » Ends up slowing everybody else a huge amount – Also why have control lights on-ramps » Try to keep from injecting more cars than capacity of road (and thus avoid congestion)

The Internet u What happens if you need more bandwidth than a single Ethernet? u For example, SUN has > 10000 workstations. u Buy two Ethernets? If so, how do machines on each network talk to each other?

The Internet u Put machine that straddles both networks. u Lots of different words for this function: router, gateway, bridge, repeater. u But basically, acts like a switch. u Machine watches all packets on each Ethernet, and if packet is for machine on other one, then copies packet over. Ethernet

The Internet u The Internet is just a generalization of this. – Internet = interconnecting local area networks. u Local networks can be anything – Ethernet, AppleTalk, FDDI, even phone company wires, but building blocks are machines that straddle two or more networks. AT&T UoT Ethernet UBC Ethernet Waterloo Ethernet

The Internet Protocol: “IP” u IP Packet: a network packet on the internet u IP Address: a 32-bit integer used as the destination of an IP packet – Often written as four dot-separated integers, with each integer from 0—255 (thus representing 8x4=32 bits) – Example: 142.51.23.217 u Internet Host: a computer connected to the Internet – Host has one or more IP addresses used for routing » Some of these may be private and unavailable for routing – Not every computer has a unique IP address » Groups of machines may share a single IP address » In this case, machines have private addresses behind a “Network Address Translation” (NAT) gateway

Address ranges and subnets u Address Ranges in IP – IP address space divided into prefix-delimited ranges: » NN.0.0.0/8: Class A (10.0.0.0 private) » NN.NN.0.0/16:Class B (172.[16-31].0.0 private) » NN.NN.NN.0/24:Class C (192.168.xx.0 private) – Such ranges often owned by organizations u Subnet: A network connecting a set of hosts with related destination addresses – With IP, all the addresses in subnet are related by a prefix of bits » The number of matching bits is called the “mask” and can be expressed as a single value (e.g. 24) or a set of ones in a 32-bit value (e.g. 255.255.255.0)

Subnets u A subnet is identified by 32-bit value, with the bits which differ set to zero, followed by a slash and a mask » E.g.: 128.32.131.0/24 designates a subnet in which all the addresses look like 128.32.131.XX » Same subnet: 128.32.131.0/255.255.255.0 u Difference between subnet and complete network range – Subnet is always a subset of address range – Once, subnet meant single physical broadcast wire; now, less clear exactly what it means (virtualized by switches)

Hierarchical Networking: The Internet u How can we build a network with millions of hosts? – Hierarchy! Not every host connected to every other one – Use a network of Routers to connect subnets together » Routing is often by prefix: e.g. first router matches first 8 bits of address, next router matches more, etc.

Simple Network Terminology u Local-Area Network (LAN) – designed to cover small geographical area. – Multiaccess bus, ring, or star network – Speed  10 – 1000 Megabits/second – Broadcast is fast and cheap – In small organization, a LAN could consist of a single subnet. In large organizations, a LAN contains many subnets. u Wide-Area Network (WAN) – links geographically separated sites – Point-to-point connections over long-haul lines (often leased from a phone company) – Speed  1.544 – 45 Megabits/second – Broadcast usually requires multiple messages

Routing u How do packets get to their destination? Simple if there’s a single machine that straddles all networks, but that won’t work in the Internet! u If packet has to go several hops before it gets to destination, and router straddles several networks, how do routers know how to forward packets? u To answer this, some definitions: – Name – Mom, Fraser – Address – phone #, network address – Route – how do we get there from here

Routing u Internet solution: routing tables. – Each router looks at packet header, does table lookup to decide which link to use to get it closer to destination. – Also, all machines on the same subnet, have common portion of address/machine id. u Routing table contains: – Destination address range –> output link that gets closer to destination – Default entry (for subnets without explicit entries) u How do you set up the table? – Internet has no centralized state! – No single machine knows entire topology, and topology is constantly changing! – Topology constantly changing (faults, reconfiguration, etc)

Routing u Instead: 1.Routing table has “cost” – number of hops to destination (in practice, also considers how heavily used each link is) 2.Neighbors periodically exchange routing tables 3.If neighbor has cheaper route, use that one u So: – Initially, routers don’t know about any destination Loop Get routing table from neighbors Update routing table

Routing u Note that since the Internet is made up of many individual networks, it’s routing is similar but more complicated. u Basically these types of algorithms are performed at different levels of the network, such as within a subnet and then among the routers that route between the subnets. A B C D E F G H

COSC 3407: Operating Systems Lecture 20: Networks and Distributed Systems.

Similar presentations

Presentation on theme: "COSC 3407: Operating Systems Lecture 20: Networks and Distributed Systems."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

COSC 3407: Operating Systems Lecture 20: Networks and Distributed Systems.

Similar presentations

Presentation on theme: "COSC 3407: Operating Systems Lecture 20: Networks and Distributed Systems."— Presentation transcript:

Similar presentations

About project

Feedback