Operating Systems Distributed-System Structures. Topics –Network-Operating Systems –Distributed-Operating Systems –Remote Services –Robustness –Design.

Slides:



Advertisements
Similar presentations
DISTRIBUTED COMPUTING PARADIGMS
Advertisements

Multiple Processor Systems
Remus: High Availability via Asynchronous Virtual Machine Replication
Categories of I/O Devices
Distributed Systems Major Design Issues Presented by: Christopher Hector CS8320 – Advanced Operating Systems Spring 2007 – Section 2.6 Presentation Dr.
More on Processes Chapter 3. Process image _the physical representation of a process in the OS _an address space consisting of code, data and stack segments.
Dr. Kalpakis CMSC 621, Advanced Operating Systems. Fall 2003 URL: Distributed System Architectures.
Silberschatz and Galvin  Operating System Concepts Module 16: Distributed-System Structures Network-Operating Systems Distributed-Operating.
Distributed System Structures Network Operating Systems –provide an environment where users can access remote resources through remote login or file transfer.
Remote Procedure Call (RPC)
Remote Procedure Call Design issues Implementation RPC programming
Implementing A Simple Storage Case Consider a simple case for distributed storage – I want to back up files from machine A on machine B Avoids many tricky.
Silberschatz, Galvin and Gagne  Operating System Concepts Chap15: Distributed System Structures Background Topology Network Types Communication.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
Serverless Network File Systems. Network File Systems Allow sharing among independent file systems in a transparent manner Mounting a remote directory.
Fast Communication Firefly RPC Lightweight RPC  CS 614  Tuesday March 13, 2001  Jeff Hoy.
Using DSVM to Implement a Distributed File System Ramon Lawrence Dept. of Computer Science
Ameoba Designed by: Prof Andrew S. Tanenbaum at Vrija University since 1981.
Distributed components
Network Operating Systems Users are aware of multiplicity of machines. Access to resources of various machines is done explicitly by: –Logging into the.
Distributed System Structures CS 3100 Distributed System Structures1.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition Module 16: Distributed System Structures.
Tutorials 2 A programmer can use two approaches when designing a distributed application. Describe what are they? Communication-Oriented Design Begin with.
CS444/CS544 Operating Systems Finale 4/27/2007 Prof. Searleman
Course Overview Principles of Operating Systems
Learning Objectives Understanding the difference between processes and threads. Understanding process migration and load distribution. Understanding Process.
Centralized Architectures
16: Distributed Systems1 DISTRIBUTED SYSTEM STRUCTURES NETWORK OPERATING SYSTEMS The users are aware of the physical structure of the network. Each site.
PRASHANTHI NARAYAN NETTEM.
1 Distributed Systems: Distributed Process Management – Process Migration.
Distributed Process Implementation Hima Mandava. OUTLINE Logical Model Of Local And Remote Processes Application scenarios Remote Service Remote Execution.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Networked File System CS Introduction to Operating Systems.
1 Distributed Operating Systems and Process Scheduling Brett O’Neill CSE 8343 – Group A6.
Silberschatz, Galvin and Gagne  Operating System Concepts Chapter 3: Operating-System Structures System Components Operating System Services.
DISTRIBUTED COMPUTING PARADIGMS. Paradigm? A MODEL 2for notes
Advanced Computer Networks Topic 2: Characterization of Distributed Systems.
Multiprossesors Systems.. What are Distributed Databases ? “ A Logically interrelated collection of shared data ( and a description of this data) physically.
Lecture 4: Sun: 23/4/1435 Distributed Operating Systems Lecturer/ Kawther Abas CS- 492 : Distributed system & Parallel Processing.
Dr. M. Munlin Network and Distributed System Structures 1 NETE0516 Operating Systems Instructor: ผ. ศ. ดร. หมัดอามีน หมัน หลิน Faculty of Information.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved RPC Tanenbaum.
Module 16: Distributed System Structures
 Distributed file systems having transaction facility need to support distributed transaction service.  A distributed transaction service is an extension.
Distributed System Concepts and Architectures Services
Shuman Guo CSc 8320 Advanced Operating Systems
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 16: Distributed System Structures.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Module 16: Distributed System Structures.
GLOBAL EDGE SOFTWERE LTD1 R EMOTE F ILE S HARING - Ardhanareesh Aradhyamath.
Workshop 6 Agenda Homework review: 15.1, 16.5, 17.4 Study group project milestone Lecture & discussion on network structures Group activity on distributed.
Acknowledgement: These slides are adapted from slides provided in Thißen & Spaniol's course Distributed Systems and Middleware, RWTH Aachen Processes Distributed.
Lecture 4 Mechanisms & Kernel for NOSs. Mechanisms for Network Operating Systems  Network operating systems provide three basic mechanisms that support.
Chapter 14: Distributed Operating Systems Silberschatz, Galvin and Gagne ©2005 Operating System Concepts – 7 th Edition, Apr 4, 2005 Outline n Motivation.
Implementing Remote Procedure Calls Andrew D. Birrell and Bruce Jay Nelson Xerox Palo Alto Research Center Published: ACM Transactions on Computer Systems,
Module 16: Distributed System Structures Silberschatz, Galvin and Gagne ©2005 Operating System Concepts – 7 th Edition, Apr 4, 2005 Distributed.
Section 2.1 Distributed System Design Goals Alex De Ruiter
CS307 Operating Systems Distributed System Structures Guihai Chen Department of Computer Science and Engineering Shanghai Jiao Tong University Spring 2012.
Distributed File System. Outline Basic Concepts Current project Hadoop Distributed File System Future work Reference.
© Oxford University Press 2011 DISTRIBUTED COMPUTING Sunita Mahajan Sunita Mahajan, Principal, Institute of Computer Science, MET League of Colleges, Mumbai.
Module 16: Distributed System Structures
#01 Client/Server Computing
Chapter 16: Distributed System Structures
Distributed System Structures 16: Distributed Structures
DISTRIBUTED COMPUTING
Mobile Agents.
Sarah Diesburg Operating Systems COP 4610
Fault Tolerance Distributed Web-based Systems
Distributed Systems and Concurrency: Distributed Systems
#01 Client/Server Computing
Presentation transcript:

Operating Systems Distributed-System Structures

Topics –Network-Operating Systems –Distributed-Operating Systems –Remote Services –Robustness –Design Issues Topics –Network-Operating Systems –Distributed-Operating Systems –Remote Services –Robustness –Design Issues

Network-Operating Systems Users are aware of multiplicity of machines. Access to resources of various machines is done explicitly by: –Remote logging into the appropriate remote machine. –Transferring data from remote machines to local machines, via the File Transfer Protocol (FTP) mechanism. Users are aware of multiplicity of machines. Access to resources of various machines is done explicitly by: –Remote logging into the appropriate remote machine. –Transferring data from remote machines to local machines, via the File Transfer Protocol (FTP) mechanism.

Distributed-Operating Systems Users not aware of multiplicity of machines. Access to remote resources similar to access to local resources. Data Migration - transfer data by transferring entire file, or transferring only those portions of the file necessary for the immediate task. Computation Migration - transfer the computation, rather than the data, across the system. Users not aware of multiplicity of machines. Access to remote resources similar to access to local resources. Data Migration - transfer data by transferring entire file, or transferring only those portions of the file necessary for the immediate task. Computation Migration - transfer the computation, rather than the data, across the system.

Distributed-Operating Systems (continued) Process Migration - execute an entire process, or parts of it at different sites. –Load balancing - distribute processes across network to even the workload. –Computation speedup - subprocesses can run concurrently on different sites. –Hardware preference - process execution may require specialized processor. Process Migration - execute an entire process, or parts of it at different sites. –Load balancing - distribute processes across network to even the workload. –Computation speedup - subprocesses can run concurrently on different sites. –Hardware preference - process execution may require specialized processor.

Distributed-Operating Systems (continued) –Software preference - required software may be available at only a particular site. –Data access - run process remotely, rather than transfer all data locally. –Software preference - required software may be available at only a particular site. –Data access - run process remotely, rather than transfer all data locally.

Remote Services Requests for access to a remote file system are delivered to the server. Access requests are translated to messages for the server, and the server replies are packed as messages and sent back to the user. A common way to achieve this is via the Remote Procedure Call (RPC) paradigm. Requests for access to a remote file system are delivered to the server. Access requests are translated to messages for the server, and the server replies are packed as messages and sent back to the user. A common way to achieve this is via the Remote Procedure Call (RPC) paradigm.

Remote Services Messages addressed to an RPC daemon listening to a port on the remote system contain the name of a process to run and the parameters to pass to that process. The process is executed as requested, and any output is sent back to the requester in a separate message.

Remote Services (continued) A port is a number included at the start of a message packet. A system can have many ports within its one network address to differentiate the network services it supports.

RPC Scheme Binds Client and Server Port Binding information may be predecided, in the form of fixed port addresses. –At compile time, an RPC call has fixed port number associated with it. –Once a program is compiled, the server cannot change the port number of the requested service. Binding information may be predecided, in the form of fixed port addresses. –At compile time, an RPC call has fixed port number associated with it. –Once a program is compiled, the server cannot change the port number of the requested service.

RPC Scheme Binds Client and Server Port (continued) Binding can be done dynamically by a rendezvous mechanism. –Operating system provides a rendezvous daemon on a fixed RPC port. –Client then sends a message to the rendezvous daemon requesting the port address of the RPC it needs to execute. Binding can be done dynamically by a rendezvous mechanism. –Operating system provides a rendezvous daemon on a fixed RPC port. –Client then sends a message to the rendezvous daemon requesting the port address of the RPC it needs to execute.

RPC Scheme Binds Client and Server Port (continued) A distributed file system (DFS) can be implemented as a set of RPC daemons and clients. –The messages are addressed to the DFS port on a server on which a file operation is to take place. –The message contains the disk operations to be performed (i.e., read, write, rename, delete, or status). A distributed file system (DFS) can be implemented as a set of RPC daemons and clients. –The messages are addressed to the DFS port on a server on which a file operation is to take place. –The message contains the disk operations to be performed (i.e., read, write, rename, delete, or status).

RPC Scheme Binds Client and Server Port (continued) –The return message contains any data resulting from that call, which is executed by the DFS daemon on behalf of the client.

Threads Threads can send and receive messages while other operations within the task continue asynchronously. Pop-up thread - created on “as needed” basis to respond to new RPC. –Cheaper to start new thread than to restore existing one. –No threads block waiting for new work; no context has to be saved, or restored. Threads can send and receive messages while other operations within the task continue asynchronously. Pop-up thread - created on “as needed” basis to respond to new RPC. –Cheaper to start new thread than to restore existing one. –No threads block waiting for new work; no context has to be saved, or restored.

Threads (continued) –Incoming RPCs do not have to be copied to a buffer within a server thread. RPCs to processes on the same machine as the caller made more lightweight via shared memory between threads in different processes running on same machine. –Incoming RPCs do not have to be copied to a buffer within a server thread. RPCs to processes on the same machine as the caller made more lightweight via shared memory between threads in different processes running on same machine.

DCE Thread Calls Thread - Management: create, exit, join, detach Synchronization: mutex_init, mutex_destroy, mutex_lock, mutex_trylock, mutex_unlock Condition-variable: cond_init, cond_destroy, cond_wait, cond_signal, cond_broadcast Thread - Management: create, exit, join, detach Synchronization: mutex_init, mutex_destroy, mutex_lock, mutex_trylock, mutex_unlock Condition-variable: cond_init, cond_destroy, cond_wait, cond_signal, cond_broadcast

DCE Thread Calls (continued) Scheduling: setscheduler, getscheduler, setprio, getprio Kill-thread: cancel, setcancel Scheduling: setscheduler, getscheduler, setprio, getprio Kill-thread: cancel, setcancel

Robustness To ensure that the system is robust, we must: Detect failures. –Link –Site Reconfigure the system so that computation may continue. Recover when a site or a link is repaired. To ensure that the system is robust, we must: Detect failures. –Link –Site Reconfigure the system so that computation may continue. Recover when a site or a link is repaired.

Failure Detection - Handshaking Procedure At fixed intervals, sites A and B send each other an I- am-up message. If site A does not receive this message within a predetermined time period, it can assume that site B has failed, that the link between A and B has failed, or that the message from B has been lost.

Failure Detection - Handshaking Procedure (continued) At the time site A sends the Are-you-up? message, it specifies a time interval during which it is willing to wait for the reply from B. If A does not receive B’s reply message within the time interval, A may conclude that one or more of the following situations has occurred:

Failure Detection - Handshaking Procedure (continued) –Site B is down. –The direct link (if one exists) from A to B is down. –The alternative path from A to B is down. –The message has been lost. –Site B is down. –The direct link (if one exists) from A to B is down. –The alternative path from A to B is down. –The message has been lost.

Recovery From Failure When a failed link or site is repaired, it must be integrated into the system gracefully and smoothly. Suppose that a link between A and B has failed. When it is repaired, both A and B must be notified. We can accomplish this notification by continuously repeating the handshaking procedure. When a failed link or site is repaired, it must be integrated into the system gracefully and smoothly. Suppose that a link between A and B has failed. When it is repaired, both A and B must be notified. We can accomplish this notification by continuously repeating the handshaking procedure.

Recovery From Failure Suppose that site B has failed. When it recovers, it must notify all other sites that it is up again. Site B then may have to receive from the other sites various information to update its local tables.

Design Issues Transparency and locality - distributed system should look like conventional, centralized system and not distinguish between local and remote resources. User mobility - brings user’s environment (i.e., home directory) to wherever the user logs in. Fault tolerance - system should continue functioning, perhaps in a degraded form, when faced with various types of failures. Transparency and locality - distributed system should look like conventional, centralized system and not distinguish between local and remote resources. User mobility - brings user’s environment (i.e., home directory) to wherever the user logs in. Fault tolerance - system should continue functioning, perhaps in a degraded form, when faced with various types of failures.

Design Issues (continued) Scalability - system should adapt to increased service load time. Large-scale systems - service demand from any system component should be bounded by a constant that is independent of the number of nodes. Servers’ process structure - servers should operate efficiently in peak periods; use lightweight processes or threads. Scalability - system should adapt to increased service load time. Large-scale systems - service demand from any system component should be bounded by a constant that is independent of the number of nodes. Servers’ process structure - servers should operate efficiently in peak periods; use lightweight processes or threads.