1
Processes: after today's lecture, you should know
The basic concepts of thread and process.
What are the advantages of multithreaded clients and servers?
How does the client deal with access transparency and replication transparency?
What is a stateless server and what is a stateful server?
What are iterative and concurrent servers?
What are the reasons for migrating code?
Name the three segments of a process.
What are the weak mobility model and the strong mobility model?
2
Naming
Names in computer systems are used to share resources, to uniquely identify entities, to refer to locations, and so on. An important issue with naming is that a name can be resolved to the entity it refers to. To resolve names, it is necessary to implement a naming system. In a distributed system, the implementation of a naming system is itself often distributed across multiple machines. Two things to consider for a naming system are efficiency and scalability.
Contents of this section: general issues with respect to naming; organization and implementation of human-friendly names, for example DNS.
3
Process
A process is often defined as a program in execution. To execute programs, an operating system creates a number of virtual processors, each one running a different program. To keep track of these virtual processors, the operating system has a process table, containing entries to store CPU register values, memory maps, open files, accounting information, privileges, etc.
Processes in a modern OS have: management information; resources; a unique process identifier (PID).
4
Process
Management information covers: allocated resources; event handling; permissions; scheduling information; utilisation of resources.
Resources include: allocated memory (address space); open files, I/O channels, devices; secondary storage allocations (swap space, memory mapping).
The address space is protected so that only the owning process may read or write it. Memory protection stops a process from writing to the address space of other processes or of the kernel, so that a process that goes wrong cannot crash the OS or other processes.
5
Process
To create a process, the OS must create a completely independent address space, which is expensive. Switching the CPU between two processes is also costly, because the OS has to modify the registers of the memory management unit (MMU) and invalidate address translation caches such as the translation lookaside buffer (TLB).
Requirements: changing the memory map in the MMU; flushing the TLB.
6
Process Creation in Java
Class Run.java uses:
getRuntime(): obtain the context for spawning an external process
exec(): get the OS command interpreter to run a command
getInputStream(): get an input stream to read the output from the process
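The three calls listed on this slide can be combined roughly as follows. This is only a minimal sketch, not the original Run.java: the class name RunSketch, the helper run(), and the use of a Unix shell (sh -c) as the command interpreter are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the Run.java class named on the slide.
final class RunSketch {

    // Runs a command line through the Unix shell (assumed to be available as "sh")
    // and returns everything the child process wrote to its standard output.
    static String run(String commandLine) {
        try {
            // getRuntime() gives the context for spawning external processes;
            // exec() starts "sh -c <commandLine>" as a child process.
            Process child = Runtime.getRuntime().exec(new String[] {"sh", "-c", commandLine});
            StringBuilder output = new StringBuilder();
            // getInputStream() is the parent's view of the child's standard output.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(child.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            child.waitFor(); // wait until the child has terminated
            return output.toString();
        } catch (IOException e) {
            throw new RuntimeException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(run("echo hello from a child process"));
    }
}
```

Note that the child runs in its own address space, so the only data the parent sees is what arrives over the stream.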
7
Thread
A thread is very similar to a process in the sense that it can also be seen as the execution of a (part of a) program on a virtual processor. A thread context often consists of nothing more than the CPU context, along with some other information for thread management.
Threads are sometimes described as lightweight subcomputations running in a process that: have their own flow of control and execution state; share their resource context (address space, open files).
8
Thread
Compared to processes, threads: are quick to create; are quick to context switch; can readily share memory, files and sockets.
Threads are useful where: many concurrent computation units are needed; computation units need to share an address space easily.
9
Thread Usage in Non-distributed Systems
For a single-threaded process, whenever a blocking system call is executed, the process as a whole is blocked.
With a multithreaded process, a program can work on several tasks at the same time, as in a spreadsheet program.
Multithreading also makes it possible to exploit parallelism when executing the program on a multiprocessor system.
Thread switching can sometimes be done entirely in user space.
10
Java Thread Wire.java creates 2 threads that compete to print their IDs
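A minimal sketch of two competing threads, in the spirit of the Wire.java example mentioned on this slide. The class name WireSketch, the thread names used as IDs, and the shared log are assumptions for illustration; the original Wire.java is not shown in the slides.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the Wire.java class named on the slide.
final class WireSketch {

    // Starts two threads that each print their ID linesPerThread times.
    // Returns the order in which the IDs were logged; the interleaving
    // depends on the scheduler and may differ from run to run.
    static List<String> race(int linesPerThread) {
        List<String> log = new CopyOnWriteArrayList<>(); // thread-safe shared list
        Runnable body = () -> {
            String id = Thread.currentThread().getName();
            for (int i = 0; i < linesPerThread; i++) {
                log.add(id);
                System.out.println(id);
            }
        };
        Thread a = new Thread(body, "thread-A");
        Thread b = new Thread(body, "thread-B");
        a.start();
        b.start();
        try {
            a.join(); // wait for both threads before reading the log
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        race(5);
    }
}
```

Both threads share the same list object without any copying, which illustrates how readily threads in one process share memory.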
11
Threads in Distributed Systems
Multithreaded Clients
Example: a Web browser performs a number of tasks simultaneously, so it is designed as a multithreaded client program. Each thread sets up a separate connection to the server and pulls in the data.
Advantages:
Communication latencies are hidden as much as possible by delivering text content first, then images and other data.
Several connections can be opened simultaneously.
When a Web server is replicated across multiple machines, a multithreaded client may set up connections to different replicas, allowing data to be transferred in parallel.
12
Multithreaded Servers (1)
A multithreaded server organized in a dispatcher/worker model.
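The dispatcher/worker organization can be sketched with a fixed thread pool. This is an illustrative sketch only: the class name DispatcherSketch, the placeholder handle() method, and the use of an in-memory list instead of a network socket are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical dispatcher/worker server; requests come from a list, not a socket.
final class DispatcherSketch {

    // Worker logic: a placeholder for a possibly blocking operation such as a disk read.
    static String handle(String request) {
        return "reply to " + request;
    }

    // The dispatcher hands every incoming request to a worker thread from the pool
    // and collects the replies in request order.
    static List<String> serve(List<String> requests, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String request : requests) {
                Callable<String> task = () -> handle(request);
                pending.add(pool.submit(task)); // dispatcher does not block here
            }
            List<String> replies = new ArrayList<>();
            for (Future<String> reply : pending) {
                replies.add(reply.get()); // wait for the worker to finish
            }
            return replies;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> requests = new ArrayList<>();
        requests.add("GET /a");
        requests.add("GET /b");
        System.out.println(serve(requests, 2));
    }
}
```

A worker that blocks only stalls its own thread, so the dispatcher can keep accepting requests.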
13
Multithreaded Servers (2)
Model | Characteristics
Threads | Parallelism, blocking system calls
Single-threaded process | No parallelism, blocking system calls
Finite-state machine | Parallelism, nonblocking system calls
Three ways to construct a server.
14
Clients Client-Side Software for Distribution Transparency
Besides the user interface and other application-related software, client software comprises components for achieving distribution transparency.
Access transparency is generally handled through the generation of a client stub from an interface definition of what the server has to offer.
Replication transparency is handled in many distributed systems by means of a client-side solution. One approach is for a client proxy to forward an invocation request to each replica, collect all responses transparently, and pass a single return value to the client application.
15
Client-Side Software for Distribution Transparency
A possible approach to transparent replication of a remote object using a client-side solution.
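A client-side proxy of this kind might look as follows. This is a sketch under stated assumptions: the class name ReplicaProxySketch is hypothetical, replicas are modelled as plain functions rather than remote objects, and picking the most frequent reply is just one possible way to reduce all responses to a single return value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical client proxy that hides the fact that the server is replicated.
final class ReplicaProxySketch {
    private final List<Function<String, String>> replicas;

    ReplicaProxySketch(List<Function<String, String>> replicas) {
        this.replicas = new ArrayList<>(replicas);
    }

    // Forwards the request to every replica, collects all replies, and returns
    // the reply seen most often (assumed policy). Returns null without replicas.
    String invoke(String request) {
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (Function<String, String> replica : replicas) {
            String reply = replica.apply(request);
            int count = votes.merge(reply, 1, Integer::sum);
            if (count > bestCount) {
                best = reply;
                bestCount = count;
            }
        }
        return best;
    }

    // Example with three replicas, one of which returns a stale value.
    static String demo() {
        List<Function<String, String>> replicas = new ArrayList<>();
        replicas.add(request -> "balance=10");
        replicas.add(request -> "balance=10");
        replicas.add(request -> "balance=9");
        return new ReplicaProxySketch(replicas).invoke("getBalance");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The client application calls invoke() once and never sees how many replicas exist.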
16
General Server Design Issues
A server is a process implementing a specific service on behalf of a collection of clients. It is organized as follows: it waits for an incoming request from a client, ensures that the request is taken care of, and then waits for the next incoming request.
Issues:
Iterative server: the server itself handles the request and, if necessary, returns a response to the requesting client.
Concurrent server: the server does not handle the request itself, but passes it to a separate thread or another process, after which it immediately waits for the next incoming request. Examples are a multithreaded server and the UNIX approach of forking a new process for each incoming request.
The endpoint (port) and how to manage it.
Whether or not the server is stateless. A stateless server does not keep information on the state of its clients, and can change its own state without having to inform any client, e.g. a Web server. A stateful server does maintain information on its clients, e.g. a file server that allows a client to keep a local copy of a file.
17
Servers: General Design Issues
3.7 Client-to-server binding using a daemon, as in DCE. Client-to-server binding using a superserver, as in UNIX (e.g. inetd).
18
CODE MIGRATION Reasons for Migrating Code
Traditionally, code migration in distributed systems took place in the form of process migration. The main reason has always been performance: a process should run close to where the data reside.
A. Migrating parts of the client to the server when performing database operations.
B. Migrating parts of the server to the client in interactive database applications.
Code migration can also be used to improve performance by exploiting parallelism.
19
Reasons for Migrating Code
The principle of dynamically configuring a client to communicate with a server. The client first fetches the necessary software, and then invokes the server.
20
Models for Code Migration
As in process migration, the execution status of a program, pending signals and other parts of the environment must be moved as well. According to Fuggetta's framework, a process consists of three segments:
The code segment is the part that contains the set of instructions that make up the program being executed.
The resource segment contains references to external resources needed by the process, such as files, printers, devices, other processes, and so on.
The execution segment is used to store the current execution state of a process, consisting of private data, the stack, and the program counter.
Weak mobility model: it is possible to transfer only the code segment, along with perhaps some initialization data. Feature: a transferred program is always started from its initial state, e.g. Java applets.
Strong mobility model: besides the code segment, the execution segment can be transferred as well. Feature: a running process can be stopped, subsequently moved to another machine, and then resume execution where it left off.
21
Models for Code Migration
For both models, a further distinction can be made between sender-initiated and receiver-initiated migration.
In sender-initiated migration, migration is initiated at the machine where the code currently resides or is being executed.
In receiver-initiated migration, the initiative for code migration is taken by the target machine.
In the case of weak mobility, it also makes a difference whether the migrated code is executed by the target process or whether a separate process is started, e.g. Java applets are executed in the browser's address space.
Strong mobility can also be supported by remote cloning, instead of moving a running process.
22
Models for Code Migration
Alternatives for code migration.
23
Migration and Local Resources
Three types of process-to-resource bindings:
The strongest binding, binding by identifier, is when a process refers to a resource by its identifier, e.g. when a process uses a URL to refer to a specific Web site, or refers to a server by means of that server's IP address.
A weaker form of binding is when only the value of a resource is needed. It is also called binding by value. The execution of the process would not be affected if another resource provided the same value, e.g. a program that relies on standard libraries.
The weakest form of binding is when a process indicates it needs only a resource of a specific type. This binding by type is exemplified by references to local devices, such as monitors, printers, and so on.
24
Migration and Local Resources
Process-to-resource binding (rows) against resource-to-machine binding (columns):
Binding | Unattached | Fastened | Fixed
By identifier | MV (or GR) | GR (or MV) | GR
By value | CP (or MV, GR) | GR (or CP) | GR
By type | RB (or GR, CP) | RB (or GR, CP) | RB (or GR)
Actions to be taken with respect to the references to local resources when migrating code to another machine.
GR: establish a global system-wide reference. MV: move the resource. CP: copy the value of the resource. RB: rebind the process to a locally available resource.
Three types of resource-to-machine bindings:
Unattached resources can easily be moved between different machines (e.g. data).
Fastened resources: moving or copying may be possible.
Fixed resources often refer to local devices.
25
Naming Entities Names, Identifiers, and Addresses
A name in a distributed system is a string of bits or characters that is used to refer to an entity. An entity can be practically anything that can be operated on: a process, printer, mailbox, Web page, host, disk, and so on.
The name of an access point is called an address.
An identifier is a name that has the following properties:
An identifier refers to at most one entity.
Each entity is referred to by at most one identifier.
An identifier always refers to the same entity (i.e. it is never reused).
26
Name space
Names in a distributed system are organized into a name space. A name space can be represented as a labelled, directed graph with two types of nodes:
A leaf node represents a named entity and has no outgoing edges.
A directory node has a number of outgoing edges, each labelled with a name. A directory node stores a directory table in which each outgoing edge is represented as a pair (edge label, node identifier).
Each path in a naming graph can be referred to by the sequence of labels corresponding to the edges in that path, such as N:<label-1, label-2, …, label-n>. If N is the root of the naming graph, the path is called an absolute path name; otherwise, it is called a relative path name.
Global name and local name.
27
Name Spaces (1) A general naming graph with a single root node.
28
Name Spaces (2) The general organization of the UNIX file system implementation on a logical disk of contiguous disk blocks.
29
Name Resolution
The process of looking up a name is called name resolution. To explain how name resolution works, consider a path name such as N:<label-1, label-2, …, label-n>. Resolution of this name starts at node N of the naming graph, where the name label-1 is looked up in the directory table, which returns the identifier of the node to which label-1 refers. Resolution then continues at that node with label-2, and so on, until label-n has been looked up and the content of the node it refers to is returned.
Name resolution includes the topics: closure mechanism; linking and mounting.
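The lookup procedure described above can be sketched on a small in-memory naming graph. Everything named here is an assumption for illustration: the class NamingGraphSketch, the Node type, and the example graph, in which the same leaf is reachable through two absolute path names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory naming graph with directory nodes and leaf nodes.
final class NamingGraphSketch {

    static final class Node {
        final String content; // non-null only for leaf nodes
        final Map<String, Node> table = new HashMap<>(); // directory table: edge label -> node

        Node(String content) {
            this.content = content;
        }

        Node add(String label, Node child) {
            table.put(label, child);
            return this;
        }
    }

    // Resolves start:<label-1, ..., label-n> by following one edge per label.
    // Returns the content of the node reached, or null if a label is not found.
    static String resolve(Node start, String... labels) {
        Node current = start;
        for (String label : labels) {
            current = current.table.get(label);
            if (current == null) {
                return null;
            }
        }
        return current.content;
    }

    // Example graph: the leaf "keys" is reachable as /keys and as /home/steen/keys.
    static Node example() {
        Node keys = new Node("contents of keys");
        Node steen = new Node(null).add("keys", keys);
        Node home = new Node(null).add("steen", steen);
        return new Node(null).add("home", home).add("keys", keys);
    }

    public static void main(String[] args) {
        System.out.println(resolve(example(), "home", "steen", "keys"));
    }
}
```

Passing the root as the start node gives absolute path names; passing any other node gives relative ones.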
30
Closure Mechanism
Knowing how and where to start name resolution is generally referred to as a closure mechanism. Essentially, a closure mechanism deals with selecting the initial node in a name space from which name resolution is to start. Example: HOME in UNIX.
31
Linking and Mounting
Strongly related to name resolution is the use of aliases. An alias is another name for the same entity. There are two approaches to implementing aliases:
The first approach is simply to allow multiple absolute path names to refer to the same node in a naming graph (Fig 4.1); these are hard links.
The second approach is to represent an entity by a leaf node, say N, but instead of storing the address or state of that entity, the node stores an absolute path name (Fig 4.3). For example, the path name /home/steen/keys, which refers to a node containing the absolute path name /keys, is a symbolic link to node n5.
Mounting is one way to merge different name spaces. The directory node storing the node identifier is called a mount point. The directory node in the foreign name space is called a mounting point.
Mounting a foreign name space in a distributed system requires at least the following information: the name of an access protocol; the name of the server; the name of the mounting point in the foreign name space.
32
Linking and Mounting (1)
The concept of a symbolic link explained in a naming graph.
33
Linking and Mounting (2)
Mounting remote name spaces through a specific access protocol.
34
The implementation of a Name Space
A name space forms the heart of a naming service, that is, a service that allows users and processes to add, remove, and look up names. A naming service is implemented by name servers.
This part covers: name space distribution; implementation of name resolution.
35
Name Space Distribution
Why should name spaces be arranged hierarchically? To decrease the possibility of name conflicts, reduce the size of naming contexts, make name bindings more meaningful, make lookups more efficient, and enable federation of name servers.
36
Name Space Distribution
Name spaces for a large-scale, possibly worldwide distributed system are usually organized hierarchically. The name space is partitioned into three logical layers:
The global layer is formed by the highest-level nodes. This layer is often characterized by its stability; the directory tables in this layer are rarely changed (19).
The administrational layer is formed by directory nodes that together are managed within a single organization. A characteristic feature of the directory nodes in the administrational layer is that they represent groups of entities that belong to the same organization or administrational unit.
The managerial layer consists of nodes that may typically change regularly. The nodes in this layer are maintained not only by system administrators, but also by individual end users of a distributed system.
37
Name Space Distribution
The name space is divided into nonoverlapping parts, called zones in DNS. A zone is a part of the name space that is implemented by a separate name server. Name servers in each layer have to meet different requirements.
38
Name Space Distribution (1)
An example partitioning of the DNS name space, including Internet-accessible files, into three layers.
39
Name Space Distribution (2)
Item | Global | Administrational | Managerial
Geographical scale of network | Worldwide | Organization | Department
Total number of nodes | Few | Many | Vast numbers
Responsiveness to lookups | Seconds | Milliseconds | Immediate
Update propagation | Lazy | Immediate | Immediate
Number of replicas | Many | None or few | None
Is client-side caching applied? | Yes | Yes | Sometimes
A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer.
40
Implementation of Name Resolution
Each client has access to a local name resolver, which is responsible for ensuring that the name resolution process is carried out. Assume the (absolute) path name root:<nl,vu,cs,ftp,pub,globe,index.txt> is to be resolved. In URL notation, this path name corresponds to ftp://ftp.cs.vu.nl/pub/globe/index.txt. There are two ways to implement name resolution:
In iterative name resolution, a name resolver hands over the complete name to the root name server. The root server resolves the name as far as it can and returns the address of the next name server to the resolver, which then contacts that server itself.
With recursive name resolution, a name server does not return each intermediate result to the client's resolver, but passes it to the next name server it finds.
The drawback of recursive name resolution is that it puts a higher performance demand on each name server. Its two important advantages are that caching results is more effective than with iterative name resolution, and that communication costs may be reduced.
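The difference between the two schemes can be sketched with in-memory name servers for the path <nl, vu, cs, ftp>. This is a simplified illustration: the class ResolutionSketch, the NameServer type, the trace format, and the placeholder address string are assumptions, and real name servers would exchange network messages rather than method calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of iterative versus recursive name resolution.
final class ResolutionSketch {

    static final class NameServer {
        final String name;
        final Map<String, NameServer> children = new HashMap<>(); // label -> next name server
        final Map<String, String> addresses = new HashMap<>();    // label -> final address

        NameServer(String name) {
            this.name = name;
        }
    }

    // Iterative: the client's resolver contacts every name server itself.
    static String iterative(NameServer root, List<String> path, List<String> trace) {
        NameServer server = root;
        for (int i = 0; i < path.size(); i++) {
            if (server == null) {
                return null; // previous server did not know the label
            }
            trace.add("client -> " + server.name);
            String label = path.get(i);
            if (i == path.size() - 1) {
                return server.addresses.get(label);
            }
            server = server.children.get(label); // referral returned to the client
        }
        return null;
    }

    // Recursive: each name server forwards the rest of the path to the next server.
    static String recursive(String caller, NameServer server, List<String> path, List<String> trace) {
        if (server == null || path.isEmpty()) {
            return null;
        }
        trace.add(caller + " -> " + server.name);
        String label = path.get(0);
        if (path.size() == 1) {
            return server.addresses.get(label);
        }
        return recursive(server.name, server.children.get(label), path.subList(1, path.size()), trace);
    }

    // Example hierarchy root -> nl -> vu -> cs, where cs knows the address of ftp.
    static NameServer example() {
        NameServer root = new NameServer("root");
        NameServer nl = new NameServer("nl");
        NameServer vu = new NameServer("vu");
        NameServer cs = new NameServer("cs");
        root.children.put("nl", nl);
        nl.children.put("vu", vu);
        vu.children.put("cs", cs);
        cs.addresses.put("ftp", "address-of-ftp-server"); // placeholder value
        return root;
    }

    public static void main(String[] args) {
        List<String> path = Arrays.asList("nl", "vu", "cs", "ftp");
        List<String> trace = new ArrayList<>();
        System.out.println(recursive("client", example(), path, trace) + " via " + trace);
    }
}
```

In the recursive trace the client appears only once, which is where the potential saving in communication cost comes from; the work of forwarding moves to the name servers.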
41
Implementation of Name Resolution (1)
The principle of iterative name resolution.
42
Implementation of Name Resolution (2)
The principle of recursive name resolution.
43
Implementation of Name Resolution (3)
Server for node | Should resolve | Looks up | Passes to child | Receives and caches | Returns to requester
cs | <ftp> | #<ftp> | -- | -- | #<ftp>
vu | <cs,ftp> | #<cs> | <ftp> | #<ftp> | #<cs> #<cs,ftp>
nl | <vu,cs,ftp> | #<vu> | <cs,ftp> | #<cs> #<cs,ftp> | #<vu> #<vu,cs> #<vu,cs,ftp>
root | <nl,vu,cs,ftp> | #<nl> | <vu,cs,ftp> | #<vu> #<vu,cs> #<vu,cs,ftp> | #<nl> #<nl,vu> #<nl,vu,cs> #<nl,vu,cs,ftp>
Recursive name resolution of <nl, vu, cs, ftp>. Name servers cache intermediate results for subsequent lookups.
44
Example: The Domain Name System
The DNS Name Space
The DNS name space is hierarchically organized as a rooted tree.
A label is a case-insensitive string made up of alphanumeric characters. A label has a maximum length of 63 characters; the length of a complete path name is restricted to 255 characters.
The label attached to a node's incoming edge is also used as the name for that node.
A subtree is called a domain; a path name to its root node is called a domain name.
The contents of a node are formed by a collection of resource records.
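The length and character rules stated on this slide can be checked with a few lines of code. This is a sketch of the slide's simplified rules only: the class DnsNameSketch and its methods are hypothetical, lengths are counted on the dotted text form, and real host names additionally permit hyphens inside labels.

```java
// Hypothetical checker for the simplified DNS naming rules given on the slide.
final class DnsNameSketch {
    static final int MAX_LABEL_LENGTH = 63;
    static final int MAX_NAME_LENGTH = 255;

    // True if every dot-separated label is 1..63 ASCII alphanumeric characters
    // and the whole name is at most 255 characters long.
    static boolean isValid(String name) {
        if (name.isEmpty() || name.length() > MAX_NAME_LENGTH) {
            return false;
        }
        for (String label : name.split("\\.", -1)) { // -1 keeps empty trailing labels
            if (label.isEmpty() || label.length() > MAX_LABEL_LENGTH) {
                return false;
            }
            for (int i = 0; i < label.length(); i++) {
                char c = label.charAt(i);
                boolean alphanumeric = (c >= 'a' && c <= 'z')
                        || (c >= 'A' && c <= 'Z')
                        || (c >= '0' && c <= '9');
                if (!alphanumeric) {
                    return false;
                }
            }
        }
        return true;
    }

    // Labels are case-insensitive, so names are compared ignoring case.
    static boolean sameName(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        System.out.println(isValid("ftp.cs.vu.nl"));
        System.out.println(sameName("FTP.cs.vu.nl", "ftp.CS.VU.NL"));
    }
}
```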
45
The DNS Name Space
Type of record | Associated entity | Description
SOA | Zone | Holds information on the represented zone
A | Host | Contains an IP address of the host this node represents
MX | Domain | Refers to a mail server to handle mail addressed to this node
SRV | Domain | Refers to a server handling a specific service
NS | Zone | Refers to a name server that implements the represented zone
CNAME | Node | Symbolic link with the primary name of the represented node
PTR | Host | Contains the canonical name of a host
HINFO | Host | Holds information on the host this node represents
TXT | Any kind | Contains any entity-specific information considered useful
The most important types of resource records forming the contents of nodes in the DNS name space.
46
DNS Implementation (1) An excerpt from the DNS database for the zone cs.vu.nl.