Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Chapter 5 Naming DISTRIBUTED SYSTEMS (dDist) 2014

Plan
Terminology
Types of naming
– Flat naming
   Example: DHT
– Structured naming
   Example: DNS
– Attribute-based naming
   Example: LDAP

Terminology (1/4)
Entity (separate and distinct existence)
– A machine, a process, a file, a Java object, a network card, …
Access point
– An entity at which other entities can be located
   E.g., a process can be located at a given machine
Name
– A string of bits or characters that is used to refer to an entity
Address
– A name that refers to an access point of an entity
Identifier
– A name that uniquely identifies an entity
Human-friendly name
– A character string name that is understandable by a human

Terminology (2/4)
An entity can be anything with a flavor of separate and distinct existence, e.g.:
Replicated entity
– An entity that might exist in several copies, typically at different access points
– The copies must be kept consistent for it to make sense to talk about one entity
– A replicated entity might have a single access point, typically referring to any of the other access points
We return to replication later in the course

Terminology (3/4)
How do we map names to addresses so that we can locate entities?
Naming system
– Maintains a name-to-address binding
– E.g., IP no As of , 09:10.

Terminology (4/4)
We work with three naming types
– Flat naming
   Name gives no hints on location
   E.g.: 00:14:22:72:D2:F7
– Structured naming
   Name describes how to locate object
   E.g.:
– Attribute-based naming
   Name describes properties of the entity
   E.g.: /C=DK/O=Aarhus Universitet/OU=Comp. Sc./CN=Web server

Plan
We look at implementing naming systems
– First for flat names
– Then for attribute-based names
The DNS system as an example of implementing structured naming

Flat Names
Identifiers are often just “random” strings of bits
No information on how to locate the entity has been embedded in the name
– E.g., identifiers in Chord on the application layer
– E.g., Media Access Control (MAC) addresses on the data link layer
   Unique hardware addresses for network interfaces

ARP (1/2)

ARP (2/2)
Address Resolution Protocol (ARP)
Name: IP address
– The “logical name” at this level
Address: MAC address
– The access point on the Ethernet network
Method:
– Machines know their own IP address and MAC address
– To locate IP address P, a machine broadcasts a packet asking for P
   Ethernet: Send on FF:FF:FF:FF:FF:FF to broadcast
– Receivers check whether they are listening to the IP address P, and if so report back with their MAC address
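
The ARP method above can be illustrated with a toy simulation of one Ethernet segment. This is only a sketch under simplifying assumptions: there is no real networking, no ARP cache, and the names Host, Segment and resolve as well as the IP addresses and two of the MAC addresses are hypothetical.

```python
BROADCAST_MAC = "FF:FF:FF:FF:FF:FF"  # Ethernet broadcast address from the slide


class Host:
    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac

    def on_arp_request(self, target_ip):
        # Reply with our MAC address only if we are listening on target_ip.
        return self.mac if target_ip == self.ip else None


class Segment:
    """A single broadcast domain: a frame sent to BROADCAST_MAC reaches all hosts."""

    def __init__(self, hosts):
        self.hosts = hosts
        self.frames_delivered = 0  # counts how many hosts had to inspect a frame

    def send(self, sender, dst_mac, target_ip):
        replies = []
        for host in self.hosts:
            if host is sender:
                continue
            if dst_mac in (BROADCAST_MAC, host.mac):
                self.frames_delivered += 1
                reply = host.on_arp_request(target_ip)
                if reply is not None:
                    replies.append(reply)
        return replies


def resolve(segment, sender, target_ip):
    """Return the MAC address bound to target_ip, or None if nobody answers."""
    replies = segment.send(sender, BROADCAST_MAC, target_ip)
    return replies[0] if replies else None


# Hypothetical hosts: the IP addresses and the MACs of b and c are made up.
a = Host("10.0.0.1", "00:14:22:72:D2:F7")
b = Host("10.0.0.2", "00:14:22:72:D2:F8")
c = Host("10.0.0.3", "00:14:22:72:D2:F9")
lan = Segment([a, b, c])
mac_of_b = resolve(lan, a, "10.0.0.2")
```

Every host other than the sender has to inspect the request, which is why this style of resolution only suits small networks.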

Mobility – Problem
Mobility
– Name constant
– Access point changes
Examples:
– Machine gets new network card, but we want to name it using the same IP address
– NFS file is moved from one server to another, but we want to name it using the same file name
– Mobile phone in a car

Mobility – Solutions
Multicast groups could be used with an ARP-like protocol
– Node broadcasts for new location of entity when needed, or
– Entity multicasts new location to group when it has moved
Forwarding pointers
– We look at RPC as an example
Home-based approaches
– We look at Mobile IP

Broadcast
Broadcast is inefficient for large networks
Number of messages seen by each machine in a network with n machines grows linearly in n

Forwarding Pointers (1/4)
Figure 5-1. The principle of forwarding pointers using (client stub, server stub) pairs.

Forwarding Pointers (2/4)
Figure 5-2. Redirecting a forwarding pointer by storing a shortcut in a client stub.

Forwarding Pointers (3/4)
Figure 5-2. Redirecting a forwarding pointer by storing a shortcut in a client stub.

Forwarding Pointers (4/4)
Transparency as advantage
– Specifically migration transparency
Disadvantages
– Chain may grow very large
– Intermediary nodes may need to maintain links for a long time
– Multiple points of failure
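
A minimal sketch of a forwarding-pointer chain with the shortcut from Figure 5-2, assuming a much simplified model: each location holds at most one forwarding pointer, and the client stub stores the shortcut itself after a successful invocation. The names Location, ClientStub and migrate are hypothetical.

```python
class Location:
    """A place the object has lived; 'forward' points to where it moved next."""

    def __init__(self, name):
        self.name = name
        self.forward = None
        self.has_object = False


def migrate(src, dst):
    # The object leaves src and leaves a forwarding pointer behind.
    src.has_object = False
    dst.has_object = True
    src.forward = dst


class ClientStub:
    def __init__(self, location):
        self.location = location

    def invoke(self):
        """Follow the chain to the object; return (location name, hops taken)."""
        hops = 0
        current = self.location
        while not current.has_object:
            if current.forward is None:
                # A broken link anywhere in the chain makes the object unreachable.
                raise LookupError("forwarding chain is broken at " + current.name)
            current = current.forward
            hops += 1
        self.location = current  # store the shortcut for later invocations
        return current.name, hops


a, b, c = Location("a"), Location("b"), Location("c")
a.has_object = True
stub = ClientStub(a)
migrate(a, b)
migrate(b, c)
first = stub.invoke()   # follows a -> b -> c
second = stub.invoke()  # uses the stored shortcut
```

The first call pays for the whole chain, the second goes straight to the current location.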

Home-Based Approaches (1/3)
A mobile entity maintains a fixed address while it moves
The fixed address is called the home address
At the home address an entity keeps track of the mobile entity’s current address
Example: Mobile IP
– Home address
   The address of the mobile node
– Home agent
   Keeps current location information of the node
   Tunnels datagrams to the mobile node
– Care-of-address
   Termination point of the tunnel to a mobile node
– Part of network/IP layer in IPv6
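
The roles of home address, home agent and care-of address can be sketched as follows. This is a toy model and not the Mobile IP protocol: datagrams are plain dictionaries, tunnelling is shown as simple encapsulation, and the class name HomeAgent, its methods and the addresses are assumptions made for illustration.

```python
class HomeAgent:
    def __init__(self):
        # home address -> current care-of address of the mobile node
        self.care_of = {}

    def register(self, home_address, care_of_address):
        # Called when the mobile node has moved to a new network.
        self.care_of[home_address] = care_of_address

    def forward(self, datagram):
        """Tunnel a datagram addressed to a home address to the care-of address."""
        destination = datagram["dst"]
        care_of_address = self.care_of.get(destination)
        if care_of_address is None:
            return datagram  # node is at home: deliver unchanged
        # Encapsulate the original datagram; the tunnel ends at the care-of address.
        return {"dst": care_of_address, "inner": datagram}


agent = HomeAgent()
original = {"dst": "home-addr", "payload": "hello"}
at_home = agent.forward(original)
agent.register("home-addr", "care-of-addr")
tunnelled = agent.forward(original)
```

Senders keep using the home address throughout; only the home agent needs to learn about the move.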

Home-Based Approaches (2/3)
Figure 5-3. The principle of Mobile IP.

Home-Based Approaches (3/3)
Disadvantages
– Single point of failure
– Increased communication latency for first packets

The Chord Naming System
A Distributed Hash Table (DHT) system
– Peers and resources have names like IP addresses and file names
– These are mapped to m-bit keys using a hash function
   m=160 for SHA1
   Also called identifiers for peers
– Each peer stores the keys (and values) for which its identifier is the smallest identifier greater than or equal to the key
– Can store and look up key/value pairs in O(log N) time when there are N peers

Lookup
succ(key) = smallest identifier greater than or equal to key
– With wrap-around in the identifier space from 0 to 2^m - 1
E.g., if identifiers = {1,4,7,12,15} then
– succ(9) = 12
– succ(12) = 12
Lookup: Given key, find succ(key) and the address of the peer with identifier succ(key)
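
The definition of succ can be written down directly. The sketch below assumes global knowledge of all identifiers, which a real peer does not have; it only pins down what a lookup must return. The function signature is an assumption, and m = 4 is chosen so that the example identifiers fit in the ring.

```python
from bisect import bisect_left


def succ(key, identifiers, m):
    """Smallest identifier >= key on a ring of size 2**m, wrapping around to the smallest one."""
    ring = sorted(identifiers)
    key %= 2 ** m
    i = bisect_left(ring, key)  # index of the first identifier >= key
    return ring[i] if i < len(ring) else ring[0]


identifiers = {1, 4, 7, 12, 15}
results = [succ(k, identifiers, 4) for k in (9, 12, 0)]
```

For keys 9 and 12 this gives 12, as on the slide; key 0 shows the wrap-around and maps to identifier 1.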

Simple Lookup
succ(key) = smallest identifier greater than or equal to key
– With wrap-around in the identifier space from 0 to 2^m - 1
E.g., if identifiers = {1,4,7,12,15} then
– succ(9) = 12
– succ(12) = 12
Simple lookup: Ask peers in the ring in round-robin if they are succ(key) until found
Inefficient: Takes time O(N)

Finger Tables
Main ideas to get efficient lookup:
1. Instead of holding a pointer to only the next peer and the previous peer in the ring, also hold pointers to peers who are far away
2. When looking up a key which is far away, jump directly to the known peer which has the largest identifier smaller than key

Definitions
Definitions of state for node n, using m-bit identifiers

Distributed Hash Tables: General Mechanism
Figure 5-4. Resolving key 26 from node 1 and key 12 from node 28 in a Chord system.
Finger table of node 18:
offset   target   succ(target)
+1       19       20
+2       20       20
+4       22       28
+8       26       28
+16      2        4

Scalable Lookup
lookup(n, id)    // at the peer with identifier n, ignoring wrap-around
  if (n.predecessor < id <= n)
    return n
  else if (id <= finger[1])        // id lies between n and its successor
    return finger[1]
  else
    k = max { j | finger[j] <= id }
    return lookup(finger[k], id)
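
A runnable sketch of finger-table lookup that also handles wrap-around. It is a single-process simulation, not a networked implementation: the Ring class holds the full identifier set only so that finger tables can be computed on the fly, and Ring, in_interval and the returned path are assumptions made for illustration. Fingers are indexed from 0 here, so fingers[i] = succ(n + 2**i).

```python
from bisect import bisect_left


def in_interval(x, a, b, m):
    """True if x lies in the ring interval (a, b] modulo 2**m."""
    size = 2 ** m
    x, a, b = x % size, a % size, b % size
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past 0; a == b covers the whole ring


class Ring:
    def __init__(self, identifiers, m):
        self.m = m
        self.ids = sorted(identifiers)

    def succ(self, key):
        key %= 2 ** self.m
        i = bisect_left(self.ids, key)
        return self.ids[i] if i < len(self.ids) else self.ids[0]

    def predecessor(self, n):
        return self.ids[self.ids.index(n) - 1]

    def fingers(self, n):
        return [self.succ(n + 2 ** i) for i in range(self.m)]

    def lookup(self, n, key, path=None):
        """Resolve key starting at peer n; return (responsible peer, peers visited)."""
        path = (path or []) + [n]
        if in_interval(key, self.predecessor(n), n, self.m):
            return n, path  # n itself is succ(key)
        fingers = self.fingers(n)
        if in_interval(key, n, fingers[0], self.m):
            return fingers[0], path + [fingers[0]]  # the successor is succ(key)
        # Jump to the farthest finger that still precedes key, i.e. lies in (n, key).
        for f in reversed(fingers):
            if in_interval(f, n, key - 1, self.m):
                return self.lookup(f, key, path)
        return self.lookup(fingers[0], key, path)


ring = Ring({1, 4, 7, 12, 15}, m=4)
owner, path = ring.lookup(1, 9)
wrapped_owner, wrapped_path = ring.lookup(12, 0)
```

With the identifiers from the earlier example, resolving key 9 from peer 1 visits peers 1, 7 and 12, and resolving key 0 from peer 12 wraps around to peer 1.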

Lookup Complexity (1/4)
Time is O(log N) (with high probability)
[Figure labels: n, id, finger[k], finger[k+1]]

Lookup Complexity (2/4)
Time is O(log N) (with high probability)
[Figure labels: n, id, finger[k], finger[k+1]]

Lookup Complexity (3/4)
Time is O(log N) (with high probability)
Each jump brings us at least halfway there
[Figure labels: n, id, finger[k], finger[k+1]]

Lookup Complexity (4/4)
Each jump brings us at least halfway there
Start distance is id-n
If m is the number of bits used for keys, then id-n < 2^m
Worst case time is O(m)
After log N jumps, the distance to id is at most 2^m/N
In expectation there is one node in an interval of this size, and with high probability only O(log N)
Time is O(log N) (with high probability)
[Figure labels: n, id, finger[k], finger[k+1]]
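
The halving argument can be written out as a short derivation. The symbol d_k, the remaining distance to id after k jumps, is introduced here for illustration and does not appear on the slides.

```latex
\begin{align*}
d_0 &= id - n < 2^m, \\
d_{k+1} &\le \tfrac{1}{2}\, d_k \quad\Longrightarrow\quad d_k < \frac{2^m}{2^k}, \\
k = m &\;\Longrightarrow\; d_m < 1 \quad \text{(at most $m$ jumps in the worst case)}, \\
k = \log_2 N &\;\Longrightarrow\; d_k < \frac{2^m}{N}.
\end{align*}
```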

Joining
p joining a network from node n
– n’’ := n.lookup(p)
– p.successor := n’’
– p.predecessor := n’’.predecessor (call this node n’)
– p.finger[1] := n’’
– …
– p.finger[m] := n.lookup(p + 2^(m-1))
– Copy data as necessary from n’’
– n’’.predecessor := p
– n’.successor := p
Rest of the network does not learn about p joining
[Figure labels: n, n’, p, n’’, successor, predecessor]

Stabilize
Must also update finger tables
Done using a stabilize() procedure, which also helps clean up after crashes
n.stabilize():
– Check whether n.successor.predecessor == n
– Update n and n.successor as necessary
– Check and update finger table similarly
– …
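
A sketch of the successor and predecessor bookkeeping for joining and stabilizing, under several assumptions: finger tables and data copying are left out, find_successor is the simple O(N) walk rather than the scalable lookup, and the names Node, between, join and stabilize are chosen for illustration.

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self    # a lone node is its own successor
        self.predecessor = self  # and its own predecessor


def between(x, a, b):
    """True if x lies strictly inside the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past 0


def find_successor(start, key):
    """Walk the ring from start until the node responsible for key is found."""
    node = start
    while not (between(key, node.id, node.successor.id) or key == node.successor.id):
        node = node.successor
    return node.successor


def join(p, n):
    """p joins the ring that n belongs to, following the steps on the Joining slide."""
    succ_node = find_successor(n, p.id)   # n''
    pred_node = succ_node.predecessor     # n'
    p.successor = succ_node
    p.predecessor = pred_node
    succ_node.predecessor = p
    pred_node.successor = p


def stabilize(n):
    """Repair n.successor if another node has slipped in between, then notify it."""
    x = n.successor.predecessor
    if x is not n and between(x.id, n.id, n.successor.id):
        n.successor = x
    s = n.successor
    if s.predecessor is s or between(n.id, s.predecessor.id, s.id):
        s.predecessor = n


n1, n4, n7 = Node(1), Node(4), Node(7)
join(n7, n1)
join(n4, n1)

# Simulate a join where the update of the predecessor's successor pointer was lost.
n5 = Node(5)
n5.successor, n5.predecessor = n7, n4
n7.predecessor = n5
stabilize(n4)  # n4 discovers n5 through n7.predecessor
```

After the two joins the ring is 1, 4, 7. The stale pointer at node 4 is repaired by stabilize, so that node 5 becomes reachable by walking the ring.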

Hierarchical (Flat) Naming
Network is divided into domains
– Single top-level/root domain
– Multiple non-overlapping subdomains
– Leaf domains
Each domain has an associated directory node
– Directory holds all names known by its children

Globe (1/5)
Full name: Globe Infrastructure Directory Service
Stores and looks up contact addresses for possibly replicated entities, like web servers
Each entity provides one or more contact addresses to Globe
Processes can look up contact addresses of entities and Globe tries to return the geographically nearest contact address
– Useful for caching and replication

Globe (2/5)
Directory nodes have location records for their contents
– In leaf nodes a record holds an address
– In other nodes a record holds pointers to child nodes

Globe (3/5)
Look up the domain containing E using expanding ring search
– Follow pointers from directory of that domain until addresses for E are found
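
The expanding search and the pointer chain can be sketched with a small tree of directory nodes. This is a simplification and not Globe's actual interface: each entity has a single address, records are plain dictionary entries, and the names DirNode, insert and lookup are hypothetical.

```python
class DirNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # entity -> address (in a leaf node) or child DirNode (in other nodes)
        self.records = {}


def insert(leaf, entity, address):
    """Store the address in the leaf and create pointers on the path towards the root."""
    leaf.records[entity] = address
    child, node = leaf, leaf.parent
    while node is not None and entity not in node.records:
        node.records[entity] = child
        child, node = node, node.parent


def lookup(leaf, entity):
    """Search upwards until a node knows the entity, then follow pointers down."""
    node = leaf
    while node is not None and entity not in node.records:
        node = node.parent  # widen the search to the enclosing domain
    if node is None:
        return None  # not even the root knows the entity
    record = node.records[entity]
    while isinstance(record, DirNode):
        record = record.records[entity]
    return record


root = DirNode("root")
domain_a, domain_b = DirNode("A", root), DirNode("B", root)
leaf_a, leaf_b = DirNode("A1", domain_a), DirNode("B1", domain_b)
insert(leaf_a, "E", "addr-1")
found_remote = lookup(leaf_b, "E")  # climbs to the root, then descends to A1
found_local = lookup(leaf_a, "E")   # answered in the leaf itself
```

A lookup issued near the entity never leaves the local domain, which is the locality property the hierarchy is meant to provide.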

Globe (4/5)
Figure 5-8. (a) An insert request is forwarded to the first node that knows about entity E.

Globe (5/5)
Figure 5-8. (b) A chain of forwarding pointers to the leaf node is created.

Attribute-Based Naming
Lightweight Directory Access Protocol (LDAP) is a popular distributed directory service
– Richer and more general than DNS (which we will see soon)
   Has generalized attribute/value scheme
   Can search on attribute, not just name
– Not a global directory service like DNS
   Its predecessor, X.500, was meant to be global
   But “local” LDAP services can point to each other
   Typically used only enterprise-wide

Attribute-Based Naming: LDAP
Figure. A simple example of an LDAP directory entry using LDAP naming conventions.

Hierarchical Implementations
Figure. (a) Part of a directory information tree.

Searching in LDAP
May specify criterion over attributes
Search(“&(C=NL)(O=Vrije Universiteit)(OU=*)(CN=Main server)”)
May be expensive
– Need to iterate over all OUs under O in this case
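
Why the wildcard makes the search expensive can be seen in a small in-memory sketch. It does not use an LDAP library or the LDAP filter syntax: entries are plain dictionaries, the criterion is a dictionary in which "*" matches any value, and the directory contents (including the OU and Host values) are made up for illustration.

```python
def matches(entry, criteria):
    """True if entry has every attribute in criteria with the wanted value ('*' = any)."""
    for attribute, wanted in criteria.items():
        if attribute not in entry:
            return False
        if wanted != "*" and entry[attribute] != wanted:
            return False
    return True


def search(entries, criteria):
    # With OU unconstrained, every entry under the organization has to be examined.
    return [entry for entry in entries if matches(entry, criteria)]


# Hypothetical directory contents.
directory = [
    {"C": "NL", "O": "Vrije Universiteit", "OU": "Comp. Sc.", "CN": "Main server", "Host": "h1"},
    {"C": "NL", "O": "Vrije Universiteit", "OU": "Math.", "CN": "Main server", "Host": "h2"},
    {"C": "NL", "O": "Vrije Universiteit", "OU": "Comp. Sc.", "CN": "Mail server", "Host": "h3"},
    {"C": "DK", "O": "Aarhus Universitet", "OU": "Comp. Sc.", "CN": "Web server", "Host": "h4"},
]
criteria = {"C": "NL", "O": "Vrije Universiteit", "OU": "*", "CN": "Main server"}
hosts = [entry["Host"] for entry in search(directory, criteria)]
```

The search returns the main servers of both organizational units, and it can only do so by visiting each of them.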

Attribute-Value Trees
Figure. (a) A general description of a resource. (b) Its representation as an AVTree.

Mapping to Distributed Hash Tables
Need to transform AV tree into keys
Hash every (sub)path in tree to a key
– hash(type-book)
– hash(type-book-author)
– hash(type-book-author-Tolkien)
– …
– hash(genre-fantasy)
Nodes responsible for keys will all store pointers to the resource, here LoTR

Mapping to Distributed Hash Tables
Lookup hash(type-book-author-Tolkien)
Will return all books by Tolkien
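
The mapping from an AVTree to DHT keys can be sketched as below. Several things are assumptions: the AVTree is a nested dictionary, only paths of length two or more are hashed (matching the examples listed), SHA-1 is used because the Chord slide mentions it, ToyDHT is a single in-memory table standing in for the distributed one, and the second resource is invented to show a lookup returning more than one result.

```python
import hashlib


def key_for(path, m=160):
    """Hash a path such as ('type', 'book') to an m-bit key."""
    digest = hashlib.sha1("-".join(path).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def subpaths(tree, prefix=()):
    """Yield every path of length >= 2 that starts at a top-level attribute."""
    for label, child in tree.items():
        path = prefix + (label,)
        if len(path) >= 2:
            yield path
        yield from subpaths(child, path)


class ToyDHT:
    """Stand-in for the DHT: maps each key to a set of resource pointers."""

    def __init__(self):
        self.table = {}

    def put(self, key, resource):
        self.table.setdefault(key, set()).add(resource)

    def get(self, key):
        return self.table.get(key, set())


def publish(dht, tree, resource):
    # Every key derived from the tree ends up pointing to the resource.
    for path in subpaths(tree):
        dht.put(key_for(path), resource)


avtree = {
    "type": {"book": {"author": {"Tolkien": {}}}},
    "genre": {"fantasy": {}},
}
dht = ToyDHT()
publish(dht, avtree, "LoTR")
publish(dht, avtree, "The Hobbit")  # hypothetical second resource, same description
by_tolkien = dht.get(key_for(("type", "book", "author", "Tolkien")))
paths = list(subpaths(avtree))
```

Because every prefix is stored, a query may stop at any level of the tree, for example at type-book to find all books.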

Summary
Naming is fundamental to distributed systems
Different types of names may be used
– Flat naming
   E.g., DHT
– Structured naming
   E.g., DNS
– Attribute-based naming
   E.g., LDAP