Naming (1) Chapter 4.

Chapter 4 topics
- What’s in a name?
- Approaches for naming schemes
- Directories and location services
- Distributed garbage collection

Name Space Name space is a general term for the “space” of all possible names that can be formed under given rules or constructions. For example, suppose “names” are 4-letter words with no numbers or other characters. With a 26-letter alphabet, the name space contains 26^4 = 456,976 names, so we can name that many objects in this name space.
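As a quick check of the arithmetic, the size of a name space of 4-letter names can be computed directly. The sketch below assumes the 26 lowercase letters a to z as the alphabet. The function names are illustrative only.

```python
from itertools import islice, product
from string import ascii_lowercase


def name_space_size(alphabet_size: int, length: int) -> int:
    """Number of distinct fixed-length names over an alphabet."""
    return alphabet_size ** length


def names(length: int = 4):
    """Lazily enumerate every name in the name space, in dictionary order."""
    for letters in product(ascii_lowercase, repeat=length):
        yield "".join(letters)


print(name_space_size(26, 4))    # 456976
print(list(islice(names(), 3)))  # ['aaaa', 'aaab', 'aaac']
```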

Name Spaces Tanenbaum uses “name space” to mean a graph with a single root node that can store all the names in the name space.

What is a Name? An identifier that:
- Identifies a resource
  - Uniquely?
  - Describes the resource?
- Enables us to locate that resource
  - Directly?
  - With help?
How is the name used?
- Disambiguate?
- Access?
- Locate?

Names
- Must humans remember or recognize it?
- Is the resource static?
  - Never moves
  - Change in location should change name
- Resource may move
- Resource is mobile
- Name vs. identifier vs. address

Approaches to Naming
- Globally unique identifier
  - Example: Ethernet (MAC) address
  - Solves identification, but not description or location
- Hierarchically assigned globally unique identifier (hierarchy is location-based)
  - Examples: telephone number, IP address
  - Solves identification, not description
  - Helps with location

Approaches to Naming
- Hierarchically assigned name (hierarchy is description-based)
  - Examples: Domain Name System (DNS), URL
  - Solves identification
  - Helps with description
  - Still problems with location
- Globally unique name
  - Example: TCP/IP protocol ports
  - Extensibility problems

URI, URL, URN
- URI: Uniform Resource Identifier
  - IETF meta-standard
  - Defines naming schemes / protocols
  - Each naming scheme has its own mechanism
- URL: Uniform Resource Locator
  - Uses DNS to map to host
  - Host knows how to map remainder to resource
- URN: Uniform Resource Name
  - Idea: Permanent URL
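To make the URL split concrete, the sketch below separates the part of a URL that is resolved through DNS (the host) from the remainder that only the host knows how to map to a resource. It uses Python's standard `urllib.parse.urlsplit`. The helper name `split_url` and the example URL are placeholders, not part of the slides.

```python
from urllib.parse import urlsplit


def split_url(url: str):
    """Separate the DNS-resolved host from the part the host itself resolves."""
    parts = urlsplit(url)
    remainder = parts.path
    if parts.query:
        remainder += "?" + parts.query
    return parts.scheme, parts.hostname, remainder


# www.example.com is a reserved example domain, used here as a placeholder.
print(split_url("http://www.example.com/docs/ch4.html?page=2"))
# ('http', 'www.example.com', '/docs/ch4.html?page=2')
```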

Naming: Why an Issue for Application Developers?
- DNS is widely accepted standard
  - Only names machines
  - Doesn’t handle mobility
- URI / URN will become standard
  - Can be descriptive
  - Globally unique, uses registry
  - Persistent
  - But expensive to create

Distributed Database Example: R*
- R* developed at IBM Almaden Research, one of the first distributed relational databases
- Wanted mobility of resources
  - Supports fault tolerance
  - But movement rare
  - Performance is critical
- Solution: Two components to name
  - Unique ID assigned by “birthplace”
  - Local catalog maps ID to:
    - Birthplace (maintains current location)
    - Presumed current location
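The two-component name can be illustrated with a small simulation. This is a simplified sketch of the idea on the slide, not the actual R* implementation: the classes `ObjectId` and `Site`, the hop counting, and the in-memory "network" dictionary are all assumptions made for illustration. A lookup first tries the presumed location from the local catalog and falls back to the birthplace only when that guess is stale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectId:
    birthplace: str  # site that created the object
    serial: int      # unique only within the birthplace


class Site:
    def __init__(self, name, network):
        self.name = name
        self.network = network  # shared dict: site name -> Site
        network[name] = self
        self.store = {}         # ObjectId -> object currently held here
        self.catalog = {}       # ObjectId -> presumed current site (may be stale)
        self.born_here = {}     # ObjectId -> authoritative current site
        self._next_serial = 0

    def create(self, value):
        oid = ObjectId(self.name, self._next_serial)
        self._next_serial += 1
        self.store[oid] = value
        self.born_here[oid] = self.name
        return oid

    def move(self, oid, dest):
        """Move an object held here to dest and inform its birthplace."""
        value = self.store.pop(oid)
        self.network[dest].store[oid] = value
        self.network[oid.birthplace].born_here[oid] = dest

    def lookup(self, oid):
        """Return (value, number of sites contacted)."""
        guess = self.catalog.get(oid, oid.birthplace)
        site, hops = self.network[guess], 1
        if oid not in site.store:
            # Presumed location is stale: ask the birthplace, then go there.
            current = self.network[oid.birthplace].born_here[oid]
            site, hops = self.network[current], hops + 2
        self.catalog[oid] = site.name  # remember for the next lookup
        return site.store[oid], hops


network = {}
a, b, c = Site("A", network), Site("B", network), Site("C", network)
oid = a.create("table T")
print(c.lookup(oid))  # ('table T', 1)  found at the birthplace
a.move(oid, "B")
print(c.lookup(oid))  # ('table T', 3)  stale guess, birthplace, new site
print(c.lookup(oid))  # ('table T', 1)  catalog now points at B
```

Because movement is rare, the common case costs a single contact, which matches the slide's emphasis on performance.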

Security Considerations
- Does name give away information?
  - Social Security Numbers
  - URL
  - Batched IDs (e.g., Ethernet)
  - Sequentially assigned IDs
- Solution:
  - Define what name SHOULD do
  - Ensure it meets goals
  - Look for reasons it doesn’t

Directories Unless you use physical locations for names and objects never move, you will need directories. How to organize? Who uses it? How often? How to modify (when object moves)?

X.500: What is it?
- Goal: Global “white pages”
  - Look up anyone, anywhere
  - Developed by telecommunications industry
  - ISO standard directory for OSI networks
- Idea: Distributed directory
  - Application uses Directory User Agent to access a Directory Access Point

Directory Information Base (X.501)
- Tree structure
  - Root is entire directory
  - Levels are “groups”: country, organization, individual
- Entry structure
  - Unique name, built from tree
  - Attributes: type/value pairs
  - Schema enforces type rules
- Alias entries
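A toy version of such a tree shows how an entry's unique name is built from its position. This is a rough sketch only: the `Entry` class, the tiny `SCHEMA` table, the slash-separated name notation, and the sample values are assumptions for illustration and do not follow the X.500 protocols themselves. The attribute types C, O, and CN stand for country, organization, and common name.

```python
# Assumed miniature schema: attribute type -> required Python type.
SCHEMA = {"C": str, "O": str, "CN": str, "telephoneNumber": str}


class Entry:
    def __init__(self, rdn=None, attributes=None, parent=None):
        self.rdn = rdn  # (type, value) naming this entry under its parent
        self.attributes = dict(attributes or {})
        for attr_type, value in self.attributes.items():
            expected = SCHEMA.get(attr_type)
            if expected is None or not isinstance(value, expected):
                raise ValueError(f"schema violation for attribute {attr_type!r}")
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[rdn] = self

    def distinguished_name(self):
        """Build the unique name by walking from this entry up to the root."""
        parts, node = [], self
        while node.parent is not None:
            parts.append(f"{node.rdn[0]}={node.rdn[1]}")
            node = node.parent
        return "/" + "/".join(reversed(parts))


def lookup(root, name):
    """Follow a name such as /C=NL/O=Example Org down the tree."""
    node = root
    for part in name.strip("/").split("/"):
        attr_type, value = part.split("=", 1)
        node = node.children[(attr_type, value)]
    return node


root = Entry()
country = Entry(("C", "NL"), parent=root)
org = Entry(("O", "Example Org"), parent=country)
person = Entry(("CN", "Example Person"),
               {"telephoneNumber": "000-0000"}, parent=org)
print(person.distinguished_name())  # /C=NL/O=Example Org/CN=Example Person
```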

Linking and Mounting (book)
Position in hierarchy affects performance - search time

Name Space Distribution
An example partitioning of the DNS name space, including Internet-accessible files, into three layers.

Implementation of Name Resolution (1)
Iterative name resolution.

Recursive name resolution. Root name server does more work, but can cache intermediate results for future requests.

The comparison between recursive and iterative name resolution with respect to communication costs.
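The difference in communication pattern can be seen in a small simulation. The sketch below is an illustration under assumptions: the `NameServer` class, the message log, and the labels nl, vu, cs, ftp are placeholders, and real name servers also cache and handle errors. Each log entry records who sent a request to whom.

```python
class NameServer:
    def __init__(self, name):
        self.name = name
        self.table = {}  # label -> NameServer (next level) or str (address)

    def step(self, labels):
        """Resolve the first label; return (result, remaining labels)."""
        return self.table[labels[0]], labels[1:]

    def resolve_recursive(self, labels, log):
        result, rest = self.step(labels)
        if not rest:
            return result
        log.append((self.name, result.name))  # this server asks the next one
        return result.resolve_recursive(rest, log)


def resolve_iterative(root, labels, log):
    """The client itself contacts every server along the path."""
    server = root
    while True:
        log.append(("client", server.name))
        result, labels = server.step(labels)
        if not labels:
            return result
        server = result


def resolve_recursive(root, labels, log):
    """The client contacts only the root; servers forward the request."""
    log.append(("client", root.name))
    return root.resolve_recursive(labels, log)


root, nl, vu, cs = (NameServer(n) for n in ("root", "nl", "vu", "cs"))
root.table["nl"], nl.table["vu"], vu.table["cs"] = nl, vu, cs
cs.table["ftp"] = "address-of-ftp"  # placeholder address

path = ["nl", "vu", "cs", "ftp"]
it_log, rec_log = [], []
print(resolve_iterative(root, path, it_log))   # address-of-ftp
print(resolve_recursive(root, path, rec_log))  # address-of-ftp
print(sum(1 for src, _ in it_log if src == "client"))   # 4
print(sum(1 for src, _ in rec_log if src == "client"))  # 1
```

Both variants exchange the same number of requests here, but in the recursive case only one of them crosses the (possibly long-distance) link between the client and the servers.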

Naming versus Locating Entities
Traditional name service has a direct, single-level mapping between names and addresses. This works if addresses do not change frequently and names are identifiers. The alternative is a two-level mapping: a name service maps a human-friendly name to an identifier, and a location service maps that identifier to an address.
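A minimal sketch of the two-level mapping is shown here. The class names, the identifier `id-0001`, and the addresses (taken from the IP ranges reserved for documentation) are assumptions for illustration. The point is that a move touches only the location service, while the name-to-identifier binding stays fixed.

```python
class NameService:
    """Maps human-friendly names to identifiers (rarely changes)."""

    def __init__(self):
        self._ids = {}

    def register(self, name, entity_id):
        self._ids[name] = entity_id

    def lookup(self, name):
        return self._ids[name]


class LocationService:
    """Maps identifiers to current addresses (may change often)."""

    def __init__(self):
        self._addresses = {}

    def update(self, entity_id, address):
        self._addresses[entity_id] = address

    def lookup(self, entity_id):
        return self._addresses[entity_id]


def resolve(name, names, locations):
    """Two-level resolution: name -> identifier -> address."""
    return locations.lookup(names.lookup(name))


names, locations = NameService(), LocationService()
names.register("printer", "id-0001")
locations.update("id-0001", "192.0.2.10")
print(resolve("printer", names, locations))  # 192.0.2.10

locations.update("id-0001", "198.51.100.7")  # the entity moves
print(resolve("printer", names, locations))  # 198.51.100.7
```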

Home-Based Approaches
The principle of Mobile IP.
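The home-based idea can be sketched as follows: a mobile host keeps a fixed home address, registers its current care-of address with a home agent, and the home agent forwards (tunnels) traffic to wherever the host currently is. The class below is a simplified illustration, not an implementation of the Mobile IP protocol. The class name, method names, and addresses are assumptions.

```python
class HomeAgent:
    def __init__(self, home_address):
        self.home_address = home_address
        self.care_of_address = None  # None while the host is at home

    def register(self, care_of_address):
        """Called by the mobile host after it attaches to a foreign network."""
        self.care_of_address = care_of_address

    def deregister(self):
        """Called when the mobile host returns to its home network."""
        self.care_of_address = None

    def forward(self):
        """Return (address a packet is delivered to, whether it is tunneled)."""
        if self.care_of_address is None:
            return self.home_address, False
        return self.care_of_address, True


agent = HomeAgent("192.0.2.5")
print(agent.forward())            # ('192.0.2.5', False)
agent.register("198.51.100.20")
print(agent.forward())            # ('198.51.100.20', True)
```

Senders keep using the home address throughout, so the host's name and identifier never change when it moves.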

4.3 Removing Unreferenced Entities
Also called distributed garbage collection. Many languages (e.g., Java) and distributed middleware systems provide for the recycling of memory objects that are no longer referenced. This section uses the Java RMI terminology of skeletons and proxies.

The Problem of Unreferenced Objects
An example of a graph representing objects containing references to each other.

Garbage – Unreferenced Memory
int *p1; int *p2; p1 = new int; [Figure: pointer p1]

Garbage – Unreachable Objects
Unneeded memory not referenced by objects currently active in the program. [Figure: objects p1, p2, p3]

Garbage Collection in a Centralized System
Simple solution: stop allocation of new objects and deallocation, then:
1. Mark all objects as unreferenced.
2. Go through all pointers and mark referenced objects as referenced.
3. Delete unreferenced objects and resume processing.
Works if all objects and references are on the same machine, but time consuming.

Mark and Sweep - 1 Mark all objects as unreferenced. [Figure: objects p1, p2, p3, p4, p5]

Mark and Sweep - 2 Go through program and unmark referenced objects (that is, mark them as referenced). [Figure: objects p1, p2, p3, p4, p5]

Mark and Sweep - 3 Delete unreferenced objects. [Figure: remaining objects p1, p4, p5]
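The three mark-and-sweep steps can be sketched in a few lines. This is a minimal single-machine illustration. The `Obj` class and the specific reference edges among p1 to p5 are assumptions, chosen so that p1, p4, and p5 survive as in the slide titles.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []       # objects this object points to
        self.marked = False  # True means "referenced"


def mark_and_sweep(heap, roots):
    # Step 1: mark all objects as unreferenced.
    for obj in heap:
        obj.marked = False
    # Step 2: follow pointers from the roots and mark what is reachable.
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)
    # Step 3: delete unreferenced objects (keep only the marked ones).
    return [obj for obj in heap if obj.marked]


p1, p2, p3, p4, p5 = (Obj(f"p{i}") for i in range(1, 6))
p1.refs = [p4]
p4.refs = [p5]
p2.refs = [p3]  # p2 and p3 reference each other,
p3.refs = [p2]  # but nothing reachable from the roots points to them

survivors = mark_and_sweep([p1, p2, p3, p4, p5], roots=[p1])
print([obj.name for obj in survivors])  # ['p1', 'p4', 'p5']
```

Note that the mutually referencing pair p2 and p3 is collected, because reachability from the roots is what counts, not whether anything at all points to an object.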

A Little More Efficient
Reference Counting (not distributed) The object maintains a reference counter. When an object is created with a reference pointer, its reference counter is set to one. The reference counter is increased or decreased as additional pointers are created or removed. If the reference counter (RC) goes to zero, the object is GC’ed. Problem with scheme: unreachable objects that reference each other (a cycle) never reach a count of zero, so they are never collected.
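The counting rules, and the cycle problem, can be demonstrated with a small sketch. The `Counted` class, the helper names, and the `freed` list used to record collected objects are assumptions for illustration.

```python
class Counted:
    def __init__(self, name, freed):
        self.name = name
        self.rc = 1         # created together with its first reference
        self.fields = {}    # references this object holds to other objects
        self.freed = freed  # shared list recording names of collected objects


def add_ref(obj):
    """A new pointer to obj is created."""
    obj.rc += 1
    return obj


def drop_ref(obj):
    """A pointer to obj is removed; collect obj if the count reaches zero."""
    obj.rc -= 1
    if obj.rc == 0:
        obj.freed.append(obj.name)
        for child in obj.fields.values():
            drop_ref(child)  # the collected object's own pointers go away too
        obj.fields.clear()


freed = []

# Acyclic case: a -> b. Both are collected once the program lets go.
a, b = Counted("a", freed), Counted("b", freed)
a.fields["next"] = add_ref(b)
drop_ref(b)   # program drops its own pointer to b; a still holds one
drop_ref(a)   # a is collected, which in turn releases b
print(freed)  # ['a', 'b']

# Cyclic case: x <-> y. Unreachable from the program, yet never collected.
x, y = Counted("x", freed), Counted("y", freed)
x.fields["peer"] = add_ref(y)
y.fields["peer"] = add_ref(x)
drop_ref(x)
drop_ref(y)
print(freed, x.rc, y.rc)  # ['a', 'b'] 1 1
```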

Distributed Garbage Collection
In a distributed system (DS), the objects and pointers may be on different machines. It is difficult to stop processing on one machine, let alone across a DS. Proxies and skeletons: When a distributed object is created, a skeleton is created for it. When it is referenced, a proxy is created at the client machine (the referencer) to talk to the skeleton at the object site.

Proxies and Skeletons [Figure labels: skeleton, proxy, object, process]

Distributed Reference Counting (1)
Where are the reference counters maintained? How to increase and decrease RC from a remote proxy? Solution: The object skeleton will maintain the RC. Messages to the RC must be protected against duplication or loss.

Reference Counting (2) The problem of maintaining a proper reference count in the presence of unreliable communication.
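One common way to protect the counter is to make increments and decrements idempotent: each message carries a unique id, the skeleton applies a given id only once, and the proxy retransmits until it sees an acknowledgement. The sketch below assumes this approach. The class, the message-id format, and the simulated ack loss are illustrative, not taken from the slides.

```python
class Skeleton:
    def __init__(self):
        self.rc = 0
        self.seen = set()  # ids of messages that were already applied

    def handle(self, msg_id, delta):
        """Apply an increment or decrement at most once; always acknowledge."""
        if msg_id not in self.seen:
            self.seen.add(msg_id)
            self.rc += delta
        return "ack"


def send_reliably(skeleton, msg_id, delta, lost_acks):
    """Retransmit until an ack arrives; the first lost_acks acks are dropped.

    Returns the number of transmissions that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        skeleton.handle(msg_id, delta)  # the request itself arrives
        if lost_acks > 0:
            lost_acks -= 1              # ack lost: proxy times out, resends
            continue
        return attempts


skeleton = Skeleton()
print(send_reliably(skeleton, ("P1", 1), +1, lost_acks=2))  # 3
print(skeleton.rc)                                          # 1, not 3
print(send_reliably(skeleton, ("P1", 2), -1, lost_acks=0))  # 1
print(skeleton.rc)                                          # 0
```

Without the `seen` set, the two retransmissions would have pushed the counter to 3 and the object would never be collected.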

Reference Counting (3) (a) Copying a reference to another process and incrementing the counter too late. (b) A solution, but increased message traffic.

Advanced Reference Counting (1)
Idea: Eliminate race between increase and decrease messages by having only messages which decrease the count. Also, make it possible to copy references without communicating with object. This has advantages and disadvantages.

1. When object O is created, it has a TOTAL WEIGHT (TW) and a PARTIAL WEIGHT (PW), both stored at the skeleton. Initially TW = PW = 2^N.
2. When a reference is created, half the PW of the object is assigned to the proxy at process P1.
3. If a remote reference is duplicated, half the PW at P1 is passed to P2. (The skeleton is not aware of this.)
4. If a remote reference is passed to P2, P2 gets all of the PW. Again, object O doesn’t need to know.
5. When a reference is destroyed, a message is sent to the object’s skeleton to decrement TW by the process’s PW.
6. When O’s TW equals its PW, the object can be GC’ed.
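The weighted scheme in the steps above can be traced with a short sketch. It follows the rules as stated on the slide (the object is collectable when the skeleton's TW equals its own PW). The class names `WSkeleton` and `WProxy` and the choice N = 7 are assumptions for illustration.

```python
class WSkeleton:
    def __init__(self, n):
        self.total = 2 ** n    # TW
        self.partial = 2 ** n  # PW kept at the skeleton

    def create_reference(self):
        """Step 2: hand half of the skeleton's PW to a new proxy."""
        if self.partial < 2:
            raise RuntimeError("partial weight exhausted")
        half = self.partial // 2
        self.partial -= half
        return WProxy(self, half)

    def decrement(self, weight):
        """Step 5: the only message the skeleton ever receives."""
        self.total -= weight

    def collectable(self):
        """Step 6: no weight is held outside the skeleton any more."""
        return self.total == self.partial


class WProxy:
    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight

    def duplicate(self):
        """Step 3: split the weight locally, with no message to the skeleton."""
        if self.weight < 2:
            raise RuntimeError("partial weight is 1; an extra scheme is needed")
        half = self.weight // 2
        self.weight -= half
        return WProxy(self.skeleton, half)

    def destroy(self):
        self.skeleton.decrement(self.weight)
        self.weight = 0


obj = WSkeleton(7)           # TW = PW = 128
p1 = obj.create_reference()  # skeleton PW 64, p1 weight 64
p2 = p1.duplicate()          # p1 weight 32, p2 weight 32
print(obj.total, obj.partial, p1.weight, p2.weight)  # 128 64 32 32
p1.destroy()
p2.destroy()
print(obj.total, obj.collectable())                  # 64 True
```

The invariant is that TW always equals the skeleton's PW plus the weights of all live proxies, which is why only decrement messages are needed.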

Advanced Reference Counting (3)
(a) The initial assignment of weights in weighted reference counting (b) Weight assignment when creating a new reference.

Weight assignment when copying a reference from P1 for P2.

Problem: a partial weight can be halved only a limited number of times (the N in 2^N), so only a limited number of references can be copied in this way without resorting to an additional scheme. Additional scheme: a process holding a reference, such as P1, can create its own skeleton so that it can duplicate more references on its own. However, this gives rise to extra indirections, which degrade performance.

Creating an indirection when the partial weight of a reference has reached 1.

Generation Reference Counting (1)
This scheme solves the problem of allowing a process to create an endless number of copies of references. Advantages: copies can create copies forever without communicating with the object. Reference creation and destruction require communication with the object or the creator, but not both. Disadvantage: still requires reliable communication.

Generation Reference Counting (2)
- The object skeleton keeps a table G, where G[i] is the number of outstanding references for generation i.
- When an object is created with a reference, that reference is considered generation 1. Any reference created by the object is generation 1.
- When a new reference is created, it is told its generation, and its copy counter is initially zero.
- If it copies the reference for P2, it increments its copy counter. The copy at P2 is given a generation one higher than that of the reference it was copied from.
- When a remote reference with copy counter X and generation Y is deleted, a message carrying X and Y is sent to the object skeleton. G[Y] is decremented by one for the removed reference, and G[Y+1] is increased by X for the X copies made by the process at generation Y.
- Note: at any given time, G probably will not accurately reflect reality. Also, the counts may be negative.
- When all entries G[i] are zero, the object can be GC’ed.
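The bookkeeping for generation reference counting can be traced with a small sketch. It follows the slide's numbering, in which the first reference is generation 1. The class names `GSkeleton` and `GProxy` are assumptions, and message delivery is modeled as a direct, reliable method call.

```python
from collections import defaultdict


class GSkeleton:
    def __init__(self):
        self.G = defaultdict(int)  # generation -> outstanding references

    def create_reference(self):
        """A reference created by the object itself is generation 1."""
        self.G[1] += 1
        return GProxy(self, generation=1)

    def on_delete(self, generation, copies):
        """Handle the single message sent when a reference is deleted."""
        self.G[generation] -= 1           # the deleted reference itself
        self.G[generation + 1] += copies  # copies it made, one generation up

    def collectable(self):
        return all(count == 0 for count in self.G.values())


class GProxy:
    def __init__(self, skeleton, generation):
        self.skeleton = skeleton
        self.generation = generation
        self.copies = 0  # copy counter, initially zero

    def copy(self):
        """Copy the reference without contacting the object."""
        self.copies += 1
        return GProxy(self.skeleton, self.generation + 1)

    def delete(self):
        self.skeleton.on_delete(self.generation, self.copies)


obj = GSkeleton()
p1 = obj.create_reference()  # generation 1
p2 = p1.copy()               # generation 2, skeleton not informed
p3 = p2.copy()               # generation 3, skeleton not informed

p3.delete()
print(obj.G[3], obj.collectable())  # -1 False  (a count can go negative)
p1.delete()
p2.delete()
print(obj.G[3], obj.collectable())  # 0 True
```

The temporarily negative entry for generation 3 shows why G does not reflect reality at every moment: the skeleton learns about a copy only when the copier itself is deleted.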

Generation Reference Counting (3)
Creating and copying a remote reference in generation reference counting.
