Chapter 4: Naming
Names, Addresses, and Identifiers
– Name: a string (of bits or characters) that refers to an entity (e.g. a process, file, device, …).
– Access point: each entity has an access point that allows communication with that entity.
– Address: an access point is itself an entity and therefore has a name, the so-called address.
– Access points and entities stand in an n-to-n relationship:
  a) one person (i.e. entity) may be reachable via different telephone sets (i.e. access points);
  b) different entities may share a single access point.
– Reason for separating names and addresses: flexibility.
  e.g. after code migration the address of a server changes but not its name, so no references need to be invalidated;
  e.g. if an entity has different access points (e.g. a horizontally organized web server), a single name can be used for the different addresses;
  in general, the access-point-to-entity mapping is independent of the names used.
– Identifier: a name that uniquely identifies an entity:
  1. At most one entity per ID.
  2. At most one ID per entity.
  3. IDs are never reused.
  The nice feature of IDs is that a test of identity becomes a test of ID equality!
– Human-friendly names: reasonably meaningful character strings.
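A minimal sketch of the three identifier properties in Python (the registry class and entity names are hypothetical, not from the slides):

```python
import itertools

class IdentifierRegistry:
    """Issues identifiers satisfying the three properties above: at most one
    entity per ID, at most one ID per entity, and IDs are never reused."""
    def __init__(self):
        self._counter = itertools.count()   # monotonically increasing -> never reused
        self._id_of = {}                    # entity -> ID (at most one ID per entity)

    def identify(self, entity):
        if entity not in self._id_of:       # never hand out a second ID
            self._id_of[entity] = next(self._counter)
        return self._id_of[entity]

# The test of identity becomes a test of ID equality:
reg = IdentifierRegistry()
assert reg.identify("server-A") == reg.identify("server-A")
assert reg.identify("server-A") != reg.identify("server-B")
```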
Name Spaces (1)
[Figure: a general naming graph with a single root node; two path names lead to one entity, i.e. a hard link acting as an alias.]
– Name space: reflects the structure of the naming scheme (e.g. graph, tree, …).
– Name resolution: the process of mapping a name to its corresponding entity.
– Knowing how and where to initiate name resolution (e.g. starting at the root) is known as the closure mechanism.
– Examples: resolution starting at n1 would return the content of n5; resolution starting at n0 would return the table stored in n1.
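A small sketch of resolution in such a naming graph; node names n0, n1, n2, n5 follow the figure, while the edge labels and stored contents are hypothetical:

```python
# Directory nodes store tables mapping edge labels to node IDs; leaf n5
# stores the entity itself. Two path names reach n5 (a hard link).
graph = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n2"},
    "n2": {"keys": "n5"},
    "n5": "contents of the file",
}

def resolve(path, start="n0"):
    """Closure mechanism: resolution always starts at a known node (root n0)."""
    node = start
    for label in path:
        node = graph[node][label]      # follow one labeled edge per component
    return graph[node]

# Both path names denote the same entity:
assert resolve(["home", "steen", "keys"]) == resolve(["keys"])
```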
Name Spaces (2)
[Figure: the general organization of the UNIX file system implementation on a logical disk of contiguous disk blocks: boot block (OS initialization routine), superblock (free blocks, free inodes, …), inode area, and file data blocks.]
– Inodes (index nodes) are numbered from 0 (for the root) to some maximum; an inode stores the address of the file on disk, access rights, time of last modification, …
– Directory nodes are implemented like file nodes.
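A hedged sketch of the key idea that a directory is just a file whose content is a table of (name, inode number) pairs; the block layout and all contents here are hypothetical placeholders:

```python
inodes = {
    0: {"type": "dir",  "data": {"home": 1}},      # inode 0: the root directory
    1: {"type": "dir",  "data": {"steen": 2}},
    2: {"type": "dir",  "data": {"mbox": 3}},
    3: {"type": "file", "data": "mail...",          # file inode: data + metadata
        "meta": "disk address, access rights, mtime, ..."},
}

def namei(path):
    """Resolve e.g. /home/steen/mbox by repeated directory-table lookups."""
    ino = 0                                         # closure: start at root inode 0
    for component in path.strip("/").split("/"):
        ino = inodes[ino]["data"][component]
    return inodes[ino]
```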
Linking and Mounting (1)
[Figure: the concept of a symbolic link explained in a naming graph; resolution of the stored name continues from n0.]
– Alias: another name for the same entity (e.g. /home/steen/keys = /keys).
– Alias implementations:
  – Hard links: allow multiple incoming edges for a node (see slide Name Spaces (1)).
  – Symbolic links: keep a tree structure and store additional information (a name) in the referencing node (see above); resolution of that name continues from the root n0.
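A minimal sketch of the difference, using the slide's example paths; the stored contents are hypothetical:

```python
# A hard link is simply a second edge to the same node (no extra step at
# resolution time). A symbolic link is a node whose content is another
# path name, which is then resolved again from the root.
nodes = {
    "/keys":            ("symlink", "/home/steen/keys"),  # alias stored as a name
    "/home/steen/keys": ("file", "key data"),
}

def resolve(path):
    kind, content = nodes[path]
    while kind == "symlink":           # follow the stored path name from the root
        kind, content = nodes[content]
    return content

assert resolve("/keys") == resolve("/home/steen/keys")
```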
Linking and Mounting (2)
[Figure: mounting remote name spaces through a specific protocol; the mount point is the local node, the mounting point is the node in the foreign name space.]
Resolution of /remote/vu/mbox:
1. Local resolution up to the node /remote/vu (the mount point).
2. Use of the NFS protocol to contact the server flits.cs.vu.nl and continue resolution in the foreign directory /home/steen (the mounting point).
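A hedged sketch of this mechanism: a mount point stores the protocol, server, and foreign directory needed to continue resolution remotely (the table below mirrors the slide's example; the return format is an illustration only):

```python
mounts = {"/remote/vu": ("nfs", "flits.cs.vu.nl", "/home/steen")}

def resolve(path):
    for mount_point, (proto, server, foreign_dir) in mounts.items():
        if path.startswith(mount_point):
            rest = path[len(mount_point):]      # e.g. "/mbox"
            # 1. local resolution stopped at the mount point;
            # 2. the remainder is handed to the remote server:
            return f"{proto}://{server}{foreign_dir}{rest}"
    return f"local:{path}"

assert resolve("/remote/vu/mbox") == "nfs://flits.cs.vu.nl/home/steen/mbox"
```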
Linking and Mounting (3)
[Figure: organization of the DEC Global Name Service.]
Name Space Distribution (1)
[Figure: an example partitioning of the DNS name space, including Internet-accessible files, into three layers (global, administrational, managerial).]
– Caching: used for performance and availability.
Name Space Distribution (2)
A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer:

Item                            | Global    | Administrational | Managerial
--------------------------------|-----------|------------------|-------------
Geographical scale of network   | Worldwide | Organization     | Department
Total number of nodes           | Few       | Many             | Vast numbers
Responsiveness to lookups       | Seconds   | Milliseconds     | Immediate
Update propagation              | Lazy      | Immediate        | Immediate
Number of replicas              | Many      | None or few      | None
Is client-side caching applied? | Yes       | Yes              | Sometimes
Implementation of Name Resolution (1)
[Figure: the principle of iterative name resolution.]
+ : lower performance requirements for the name servers
- : effective caching is possible only at the client
- : may induce high communication overhead
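A sketch of iterative resolution of the path <nl, vu, cs, ftp>: the client contacts each name server in turn, and each server answers only one step. The server tables are hypothetical stand-ins for the figure:

```python
servers = {
    "root":  {"nl": "ns-nl"},
    "ns-nl": {"vu": "ns-vu"},
    "ns-vu": {"cs": "ns-cs"},
    "ns-cs": {"ftp": "address-of-ftp"},
}

def resolve_iterative(labels):
    server = "root"
    for label in labels:
        server = servers[server][label]   # one round trip per label, all paid by the client
    return server

assert resolve_iterative(["nl", "vu", "cs", "ftp"]) == "address-of-ftp"
```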
Implementation of Name Resolution (2)
[Figure: the principle of recursive name resolution.]
- : high performance demands on the name servers
+ : more effective caching is possible (in different places)
+ : high availability due to cached information (e.g. if a name server is down, a cached address can be used instead)
+ : may reduce communication costs
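A matching sketch of recursive resolution: each server resolves the first label itself and forwards the rest to its child, so every intermediate server sees (and can cache) the partial results. Same hypothetical tables as the iterative sketch:

```python
servers = {
    "root":  {"nl": "ns-nl"},
    "ns-nl": {"vu": "ns-vu"},
    "ns-vu": {"cs": "ns-cs"},
    "ns-cs": {"ftp": "address-of-ftp"},
}
cache = {}                                    # (server, remaining path) -> result

def resolve_recursive(server, labels):
    head, *rest = labels
    child = servers[server][head]             # resolve one label locally
    if not rest:
        return child
    result = resolve_recursive(child, rest)   # hand the remainder to the child
    cache[(server, tuple(rest))] = result     # caching happens at every hop
    return result

assert resolve_recursive("root", ["nl", "vu", "cs", "ftp"]) == "address-of-ftp"
```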
Implementation of Name Resolution (3)
Recursive name resolution of <nl, vu, cs, ftp>. Name servers cache intermediate results for subsequent lookups. (Assumption: name servers hand back more than one result to the caller; #<…> denotes the address of the node reached via path <…>.)

Server for node | Should resolve | Looks up | Passes to child | Receives and caches           | Returns to requester
----------------|----------------|----------|-----------------|-------------------------------|---------------------
cs              | <ftp>          | #<ftp>   | --              | --                            | #<ftp>
vu              | <cs,ftp>       | #<cs>    | <ftp>           | #<ftp>                        | #<cs>, #<cs,ftp>
nl              | <vu,cs,ftp>    | #<vu>    | <cs,ftp>        | #<cs>, #<cs,ftp>              | #<vu>, #<vu,cs>, #<vu,cs,ftp>
root            | <nl,vu,cs,ftp> | #<nl>    | <vu,cs,ftp>     | #<vu>, #<vu,cs>, #<vu,cs,ftp> | #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>
Implementation of Name Resolution (4)
[Figure: comparison of recursive and iterative name resolution with respect to communication costs; the client is located e.g. in America while the name servers are e.g. in Europe, so iterative resolution pays the long-distance cost on every step.]
The DNS Name Space
The most important types of resource records forming the contents of nodes in the DNS name space:

Type of record           | Associated entity | Description
-------------------------|-------------------|------------
SOA (start of authority) | Zone              | Holds information on the represented zone, e.g. the administrator's address and the host name for this zone.
A (address)              | Host              | Contains an IP address of the host this node represents; a host with multiple addresses has multiple A records.
MX (mail exchange)       | Domain            | Refers (like a symbolic link) to a mail server that handles mail addressed to this node; e.g. the domain (i.e. subtree) cs.uleth.ca may name machineX.cs.uleth.ca (multiple MX records are allowed).
SRV (server)             | Domain            | Refers to a server handling a specific service, e.g. http.tcp.cs.vu.nl for a Web server.
NS (name server)         | Zone              | Refers to a name server that implements the represented zone (the node represents a zone).
CNAME (canonical name)   | Node              | Symbolic link carrying the primary name of the represented node; the canonical name is the primary name of an entity (host).
PTR (pointer)            | Host              | Contains the canonical name of a host, e.g. to allow inverse mapping (IP to name): a node under in-addr.arpa stores the canonical name.
HINFO (host info)        | Host              | Holds information on the host this node represents (e.g. OS, architecture, …).
TXT (text)               | Any kind          | Contains any entity-specific information considered useful.
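A small illustration of A-record lookup with Python's standard library (the socket module only resolves addresses; MX, SRV, and NS records require a dedicated DNS library and are not shown). The host name is taken from the slides:

```python
import socket

# One sockaddr per A record if the host has several addresses.
for *_, sockaddr in socket.getaddrinfo("cs.vu.nl", None,
                                       family=socket.AF_INET,
                                       type=socket.SOCK_STREAM):
    print(sockaddr[0])
```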
DNS Implementation (1)
[Figure: an excerpt from the DNS database for the zone cs.vu.nl, with nodes such as star, ftp, www, soling, vucs, laser, and zephyr. Annotations: three name servers for the zone; three mail servers (with priorities!); one name server has two addresses (reliability!); a backup for one mail server; symbolic links mapping several names to the same host; one host serving both FTP and Web; an IP-to-canonical-name mapping; a laser printer as a named entity.]
DNS Implementation (2)
Part of the description for the vu.nl domain, which contains the cs.vu.nl domain (the cs and ee subtrees are delegated to their own zones):

Name     | Record type | Record value
---------|-------------|-------------
cs.vu.nl | NS          | solo.cs.vu.nl

(The vu.nl zone also holds an A record for solo.cs.vu.nl, so that the delegated name server can actually be contacted.)
Naming versus Locating Entities
[Figure: a) direct, single-level mapping between names and addresses; b) 2-level mapping using identifiers: the name service (NS) maps names to IDs, the location service (LS) maps IDs to addresses; this promotes mobility.]
Problem: what to do if an entity is moved?
Examples:
a) Within the same domain: ftp.cs.vu.nl -> ftp.research.cs.vu.nl.
   Solution: a local update of the DNS database (efficient).
b) To a different domain: ftp.cs.vu.nl -> ftp.informatik.unibw-muenchen.de.
   Two solutions:
   (1) update the address in the local DNS database — updates become slower when the entity moves again (the entry is no longer local);
   (2) use symbolic links — lookups become slower.
Both solutions are unsatisfactory, especially for mobile entities, which change their locations often! This motivates the 2-level mapping, sketched below.
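A minimal sketch of the 2-level mapping under hypothetical table contents: the NS entry is stable across moves, and a migration touches only the LS entry:

```python
ns = {"ftp.cs.vu.nl": "id-42"}        # name -> ID: stable, survives migration
ls = {"id-42": "old-address"}         # ID -> address: one update per move

def lookup(name):
    return ls[ns[name]]

ls["id-42"] = "new-address"           # entity moved; the NS entry is untouched
assert lookup("ftp.cs.vu.nl") == "new-address"
```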
Forwarding Pointers (1)
[Figure: the principle of forwarding pointers using (proxy, skeleton) pairs; the chain leads from the old object location to the new one.]
Examples of location services in LANs:
a) Address Resolution Protocol (ARP): a host broadcasts an IP address (i.e. an ID) and the owning host returns its data-link-layer address (e.g. its Ethernet address).
b) Multicast to locate laptops: a laptop is assigned a dynamic IP address and is a member of a multicast group; a host locates the laptop by multicasting a message containing its ID (e.g. its computer name) and receiving its current IP address.
Forwarding pointers to (mobile) distributed objects: a chain of pointers, i.e. of (proxy, skeleton) pairs.
Forwarding Pointers (2)
[Figure: redirecting a forwarding pointer by storing a shortcut in a proxy; only the first request travels through the (current) chain, subsequent requests go directly to the object.]
– After shortcutting, an intermediate skeleton S may no longer be referred to (garbage collection is needed).
– Problem: broken chains!
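A sketch of a forwarding-pointer chain with shortcutting (locations and the forwarding table are hypothetical): the first request follows the chain, and the proxy then stores the current location so subsequent requests bypass the chain:

```python
forward = {"loc1": "loc2", "loc2": "loc3"}   # old skeletons forward to the next hop

class Proxy:
    def __init__(self, location):
        self.location = location

    def invoke(self, request):
        loc = self.location
        while loc in forward:                # traverse the chain once
            loc = forward[loc]
        self.location = loc                  # shortcut: bypass the chain next time;
        return f"{request} handled at {loc}" # intermediate skeletons become
                                             # unreferenced (garbage collection)

p = Proxy("loc1")
p.invoke("first request")                    # walks loc1 -> loc2 -> loc3
assert p.location == "loc3"                  # subsequent requests go directly
```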
Home-Based Approaches
[Figure: the principle of Mobile IP; a host wants to communicate with mobile host A.]
The mobile host A has a fixed IP address (an ID) and registers its dynamic (care-of) IP address with a home agent running at its home location.
Problems:
a) The home agent may be far from the client even while the mobile host is in the client's proximity!
b) The host may move and then stay at the new location for a long time; in this case it is better to move the home agent as well.
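A hedged sketch of the Mobile IP idea (all addresses hypothetical): the first packet takes the detour via the home agent, which reveals the care-of address for subsequent direct communication:

```python
home_agent = {}                               # fixed address -> care-of address

def register(fixed_ip, care_of_ip):
    home_agent[fixed_ip] = care_of_ip         # updated each time the host moves

def first_send(packet, fixed_ip):
    care_of = home_agent[fixed_ip]            # detour via the home agent...
    return care_of                            # ...which yields the care-of address
                                              # for later, direct packets

register("fixed-ip-A", "care-of-ip-1")
assert first_send("hello", "fixed-ip-A") == "care-of-ip-1"
```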
Hierarchical Approaches (1)
[Figure: as an alternative to home agents (HA 1, HA 2, HA 3 serving a client and a mobile host), a hierarchical organization of a location service into domains, each having a directory node.]
Main idea: exploit locality; e.g. in mobile telephony, a phone is first looked up in the local network, and only then is a request sent to the home agent.
Hierarchical Approaches (2)
[Figure: an example of storing information on an entity that has two addresses in different leaf domains (replica 1 and replica 2), as used e.g. for replicated entities. M is the node for the smallest sub-domain containing the two replicas; two pointers are needed there!]
Hierarchical Approaches (3)
[Figure: looking up a location in a hierarchically organized location service.]
Lookups work bottom-up (locality!).
Hierarchical Approaches (4)
[Figure: a) an insert request is forwarded to the first node that knows about entity E (here a newly created replica); b) a chain of forwarding pointers down to the leaf node is created.]
The chain of pointers may alternatively be created bottom-up (while traversing the nodes) for efficiency and availability. A sketch of lookups in such a hierarchy follows.
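A minimal sketch of the hierarchical location service of the last four slides; the tree and record contents are hypothetical stand-ins for the figures. Each node stores, per entity, pointers to the child sub-domains that know the entity; leaves store the address itself:

```python
parent = {"leaf-A": "M", "leaf-B": "M", "M": "root"}
records = {                                   # entity E has replicas in two leaf domains
    "root":   {"E": ["M"]},
    "M":      {"E": ["leaf-A", "leaf-B"]},    # M: smallest domain containing both
    "leaf-A": {"E": ["address-1"]},
    "leaf-B": {"E": ["address-2"]},
}

def lookup(entity, node):
    """Bottom-up until some node knows the entity (locality!), then top-down."""
    while entity not in records.get(node, {}):
        node = parent[node]                   # escalate to the parent domain
    ptr = records[node][entity][0]
    return ptr if ptr.startswith("address") else lookup(entity, ptr)

assert lookup("E", "leaf-B") == "address-2"   # resolved without asking the root
```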
Pointer Caches (1)
[Figure: caching a reference to the directory node of the lowest-level domain in which an entity will reside most of the time.]
It does not make sense to cache entity addresses themselves (they change regularly); instead, references to the sub-domains where the entity is assumed to be are cached.
Pointer Caches (2)
[Figure: a cache entry that needs to be invalidated because it returns a nonlocal address while a local address (replica X) is available.]
– For efficiency, an insertion triggers cache invalidation when a replica is created.
– Scalability remains a challenging problem: e.g. the root node must store references to every entity in the network (a bottleneck). Possible remedies: use parallel machines and/or distributed servers to implement the root (and other high-level nodes).
The Problem of Unreferenced Objects
[Figure: an example of a graph representing objects containing references to each other.]
Reference Counting (1)
[Figure: the problem of maintaining a proper reference count in the presence of unreliable communication; immediately after installation, proxy p sends a (+1) message to the skeleton.]
– If the (+1) message is retransmitted, detection of duplicates is necessary (and easy, e.g. by using message identifiers).
– The same problem arises when deleting a remote reference (i.e. when decrementing).
Reference Counting (2)
[Figure: a) copying a reference to another process while incrementing the counter too late; b) a solution.]
– In a), the delete must not proceed yet (the proxy will deny it, because no ACK for the copied reference has been received).
– In b), the delete is allowed, as shown in the sketch below.
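A minimal sketch of the safe ordering, combined with the duplicate detection of the previous slide (class and message IDs are hypothetical): P1 may delete its reference only after P2's copy has been registered at the skeleton, so the count never hits zero while a live reference exists:

```python
class Skeleton:
    def __init__(self):
        self.count = 0
        self.seen = set()                 # duplicate detection for unreliable links

    def increment(self, msg_id):
        if msg_id not in self.seen:       # a retransmitted (+1) must not count twice
            self.seen.add(msg_id)
            self.count += 1

    def decrement(self):
        self.count -= 1
        if self.count == 0:
            print("object reclaimed")

s = Skeleton()
s.increment("p1-install")                 # P1 installs its proxy
s.increment("p2-install")                 # P2's copy is registered (ACKed) FIRST...
s.decrement()                             # ...only then may P1 delete its reference
```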
Advanced Reference Counting (1)
[Figure: a) the initial assignment of weights in weighted reference counting; b) weight assignment when creating a new reference; c) weight assignment when copying a reference.]
When copying a reference, P2 does not need to contact the server (no +1 messages); only when a reference is deleted is a decrement sent to the server.
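A sketch of weighted reference counting under a hypothetical total weight of 128: copying halves the copier's partial weight locally, and only deletion contacts the server:

```python
TOTAL = 128                               # fixed total weight (the a priori limit)

class WeightedSkeleton:
    def __init__(self):
        self.total = TOTAL

    def release(self, partial):           # proxy deleted: return its partial weight
        self.total -= partial
        if self.total == 0:
            print("object reclaimed")

def copy_reference(partial_weight):
    """P1 -> P2: split the weight locally, no server contact needed."""
    half = partial_weight // 2
    return partial_weight - half, half    # a weight of 1 cannot be split:
                                          # create an indirection (next slide)

s = WeightedSkeleton()
p1 = TOTAL                                # initially one proxy holds all weight
p1, p2 = copy_reference(p1)               # 64 / 64
s.release(p2); s.release(p1)              # prints "object reclaimed"
```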
Advanced Reference Counting (2)
[Figure: creating an indirection when the partial weight of a reference has reached 1.]
– Problem with weighted reference counting: the maximum total number of references must be known a priori!
– Solution: indirection (like forwarding pointers, see above).
– Other methods: reference lists instead of reference counters; the skeleton holds a list of all proxies that use it (as in Java RMI).
  + : more robust against duplication (delete/insert are idempotent).
  - : non-scalable (all proxies end up in one skeleton list); remedy: proxies re-register with the skeleton after some time, so stale entries can expire.
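A brief sketch of reference listing with re-registration; the lease duration and API are hypothetical, not the actual Java RMI interface:

```python
import time

class ListingSkeleton:
    LEASE = 30.0                               # seconds; proxies must renew in time

    def __init__(self):
        self.proxies = {}                      # proxy ID -> last registration time

    def register(self, proxy_id):              # idempotent: duplicates are harmless
        self.proxies[proxy_id] = time.time()

    def unregister(self, proxy_id):            # idempotent as well
        self.proxies.pop(proxy_id, None)

    def expire(self):                          # bounds the list size over time
        now = time.time()
        self.proxies = {p: t for p, t in self.proxies.items()
                        if now - t < self.LEASE}
```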
DGC: Distributed Garbage Collection
Legend (for the figures): a process (on a machine); a group of processes (on different machines); a proxy (initially unmarked); a skeleton (initially unmarked); a reference from another group (or from root entities); a reference from a proxy to a remote skeleton; a reference from a skeleton to a local proxy; proxies marked "hard", "soft", or "none"; skeletons marked "hard" or "soft".

DGC ALGORITHM
1. Initial marking: mark the skeletons that are accessible from the outside "hard" and the rest "soft"; mark all proxies "none".
2. Intra-process mark propagation: if a proxy is accessible from a local skeleton marked "hard", or from the outside, mark it "hard". If a proxy is accessible from a local skeleton marked "soft", mark the proxy "soft" iff it has not been marked "hard" before.
3. Inter-process mark propagation: any skeleton still marked "soft" is marked "hard" if it is accessible from a proxy marked "hard".
4. Stabilization: repeat steps 2 and 3 until no more marks can be propagated.
5. Garbage collection: remove the unreferenced objects (proxies marked "none" or "soft", skeletons marked "soft", and their corresponding objects).
[Figure sequence: a trace of the algorithm — step 1; step 2; step 3; steps 2 and 3 repeated three times until no mark can be propagated (step 4 is reached); step 5 removes the unreferenced objects.]
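A compact sketch of the five DGC steps above; the object graph and the accessibility relations are hypothetical stand-ins for the figures (S4 plays the role of an unreferenced skeleton):

```python
externally_reachable = {"S1"}                     # skeletons referenced from outside
reaches = {                                       # skeleton -> locally reachable proxies
    "S1": ["P1"], "S2": ["P2"], "S3": [], "S4": []}
points_to = {"P1": "S2", "P2": "S3"}              # proxy -> remote skeleton

# Step 1: initial marking.
skel = {s: "hard" if s in externally_reachable else "soft" for s in reaches}
prox = {p: "none" for p in points_to}

changed = True
while changed:                                    # Step 4: stabilize
    changed = False
    for s, ps in reaches.items():                 # Step 2: intra-process propagation
        for p in ps:
            new = "hard" if skel[s] == "hard" else "soft"
            if prox[p] != "hard" and prox[p] != new:
                prox[p] = new; changed = True
    for p, s in points_to.items():                # Step 3: inter-process propagation
        if prox[p] == "hard" and skel[s] == "soft":
            skel[s] = "hard"; changed = True

# Step 5: everything not marked "hard" is garbage.
garbage = [x for x, m in {**skel, **prox}.items() if m != "hard"]
assert garbage == ["S4"]                          # S4 and its object are removed
```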