Computer Engineering Department Distributed Systems Course

Name: Computer Engineering Department Distributed Systems Course
Uploaded: 2017-08-24T17:08:02+00:00
Duration: PTM25S3
Channel: Jemimah Manning
Description: Computer Engineering Department Distributed Systems Course

Distributed (Operating) Systems -Virtualization- -Server Design Issues- -Process and Code Migration-
Computer Engineering Department Distributed Systems Course Asst. Prof. Dr. Ahmet Sayar Kocaeli University - Fall 2014

-Virtualization-

Resource Virtualization
Threads and processes can be seen as a way to do more things at the same time. In effect, they allow us to build (pieces of) programs that appear to be executed simultaneously. On a single-processor computer, this simultaneous execution is, of course, an illusion. As there is only a single CPU, only an instruction from a single thread or process will be executed at a time. By rapidly switching between threads and processes, the illusion of parallelism is created. This separation between having a single CPU and being able to pretend there are more can be extended to other resources as well, leading to what is known as resource virtualization.
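The switching described above can be sketched in a few lines. This is only an illustration, assuming cooperative tasks modelled as Python generators; the names `task` and `run_round_robin` are hypothetical, and a real OS scheduler preempts threads rather than waiting for them to yield.

```python
from collections import deque

def task(name, steps):
    """A 'thread' that hands control back to the scheduler after each step."""
    for i in range(steps):
        yield f"{name}:{i}"

def run_round_robin(tasks):
    """Run one step of each task in turn on a single 'CPU'; return the trace."""
    ready = deque(tasks)
    trace = []
    while ready:
        current = ready.popleft()
        try:
            trace.append(next(current))  # execute exactly one step
            ready.append(current)        # switch: task goes to the back of the queue
        except StopIteration:
            pass                         # task finished; drop it
    return trace

trace = run_round_robin([task("A", 3), task("B", 2)])
print(trace)
```

Only one step runs at any moment, yet the trace interleaves A and B, which is the illusion of parallelism the slide refers to.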

Virtualization
Virtualization: extend or replace an existing interface to mimic the behavior of another system.
Introduced in the 1970s: run legacy software on newer mainframe hardware
Handle platform diversity by running apps in VMs
  Portability and flexibility
What is the reason for doing virtualization?
How does virtualization work in practice?
There are many different types of interfaces, ranging from the basic instruction set offered by a CPU to the vast collection of APIs that are shipped with many current middleware systems.

Virtualization Hardware - instruction sets Software - API Platform
Windows MI running on Windows 7 Much higher level: Middleware and its applications

Types of Interfaces Different types of interfaces
Computer systems generally offer four different types of interfaces at four different levels:
  Between hardware and software: assembly instructions that can be invoked by any program
  Between OS and hardware: assembly instructions that can be invoked only by privileged programs
  System calls: offered by the OS
  Library functions: APIs
Depending on what is replaced or mimicked, we obtain different forms of virtualization.
Emulating: emulator

Types of Virtualization 1. Process Virtual Machines
A runtime system that essentially provides an abstract instruction set to be used for executing applications. Instructions can be interpreted, but they can also be emulated, as is done for running Windows applications on Unix platforms. In the latter case, the emulator also mimics the system calls. Example: JVM.
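To make "abstract instruction set" concrete, here is a minimal sketch of an interpreter for a made-up stack machine. The opcodes `PUSH`, `ADD`, and `MUL` are assumptions chosen for illustration; they are not JVM bytecode.

```python
def run(program):
    """Interpret a list of (opcode, operand) pairs on a tiny stack machine."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()

# Hypothetical program computing (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)
print(result)
```

The same `program` would run unchanged on any machine that has this interpreter, which is the portability argument for process virtual machines.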

Types of Virtualization 2. Virtual Machine Monitor VMM
Typical examples are VMware and Xen.
Virtualization is implemented as a layer that completely shields the hardware.
It offers the complete instruction set as an interface.
This interface can be offered simultaneously to different programs.
It is possible to have multiple, different operating systems run independently and concurrently on the same platform.

-Server Design Issues-

Server Design Issues How to locate an end-point (port #)?
Well-known port numbers, assigned by IANA (Internet Assigned Numbers Authority)
Directory service (port mapper in Unix)
Super server (inetd daemon in Unix)
With assigned end points, the client only needs to find the network address of the machine.

A. Client to server binding using a daemon
Daemon runs servers and keeps track of current end point of each service implemented by a co-located server. The daemon itself listens to a well known end point. A client will first contact the daemon, request the end point, and then contact the specific server.
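The two-step binding can be sketched without any networking. In this minimal illustration the class `EndpointDaemon`, the function `client_bind`, the host name, and the port number are all hypothetical; a real port mapper answers lookups over the network at its own well-known end point.

```python
class EndpointDaemon:
    """Keeps track of the current end point (port) of each co-located service."""

    def __init__(self):
        self._table = {}

    def register(self, service, port):
        self._table[service] = port

    def lookup(self, service):
        return self._table.get(service)  # None if the service is unknown


def client_bind(daemon, host, service):
    """Step 1: ask the daemon for the port. Step 2: address the server itself."""
    port = daemon.lookup(service)
    if port is None:
        raise LookupError(f"no such service: {service}")
    return (host, port)


daemon = EndpointDaemon()
daemon.register("time", 40123)  # the server announces its dynamically chosen port
address = client_bind(daemon, "server.example", "time")
print(address)
```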

B- Client to server binding using super server
A super server is itself a daemon. Implementing each service by means of a separate server may be a waste of resources. Instead of having to keep track of so many passive processes, it is often more efficient to have a single super server listening to each end point associated with a specific service. When a request comes in, the super server forks a process to take further care of the request. That process exits after it finishes.

Server Designs -more- Iterative or sequential Multithreaded
A server is a process implementing a specific service.
Iterative (sequential) server
  Single process, no threads in that process
  Services one request at a time
  No concurrency
  Easy to implement
Concurrent server
  Does not handle the request itself but passes it to a separate thread or another process, then waits for the next request
  Multithreaded: every time a new request comes in, it is handed to a new thread
  Full concurrency
Event-based server
  Sits in between 1 and 2
  Single process with a single thread
  Concurrency is emulated by making all calls non-blocking (non-blocking I/O)
  Sequential calls and a single process, but you still get concurrency
How efficient are these server designs? Rank them.
  What is efficiency when it comes to servers, and what metric do we use to measure it?
  Throughput: the number of requests you can service per unit time
Answer, from least efficient to most efficient:
  Pure sequential is the least efficient.
  Then process-based: you can service more than one request at a time, but because you have more than one process, context switching between processes consumes a lot of CPU cycles.
  To order the remaining designs we need extra information: is the server multi-core or single-core? Assume single-core.
  Then multithreaded: thread-based context switching is still overhead.
  Then event-based: no context switch.
In terms of code complexity, however, the event-based model is the most complex one.
What if the server is multi-core? Do 2 and 3 swap places in that case?
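The three designs can be put side by side in a small sketch. It assumes a toy request handler whose "I/O" is a short sleep; `IO_DELAY`, `handle`, and the three `*_server` functions are hypothetical names, and the thread pool stands in for "one new thread per request".

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

IO_DELAY = 0.01  # stand-in for a blocking disk or network operation

def handle(request):
    time.sleep(IO_DELAY)            # blocking call
    return request.upper()

async def handle_async(request):
    await asyncio.sleep(IO_DELAY)   # non-blocking call: yields to the event loop
    return request.upper()

def iterative_server(requests):
    """One process, no threads: requests are served strictly one at a time."""
    return [handle(r) for r in requests]

def multithreaded_server(requests):
    """Each request is handed to its own worker thread."""
    with ThreadPoolExecutor(max_workers=max(1, len(requests))) as pool:
        return list(pool.map(handle, requests))

def event_based_server(requests):
    """Single thread; concurrency comes from non-blocking calls only."""
    async def main():
        return await asyncio.gather(*(handle_async(r) for r in requests))
    return asyncio.run(main())

requests = ["a", "b", "c"]
print(iterative_server(requests))
print(multithreaded_server(requests))
print(event_based_server(requests))
```

All three produce the same responses. They differ in how the waiting time overlaps: the iterative version waits for each request in turn, while the other two overlap the waits, with threads in one case and with a single event loop in the other.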

Server Designs -more- Stateful or Stateless?
Stateful server
  Maintains state of connected clients
  Sessions in web servers
Stateless server
  No state for clients
Soft state
  Maintains state for a limited time; discarding state does not impact correctness
Compare stateful vs. stateless:
  What if the server crashes?
  Performance, in terms of different metrics
  Session: there is no session in stateless servers
Web servers are stateless: they merely respond to incoming HTTP requests.
Stateful server example: file servers, which track which files belong to which users.
Disadvantage: if a stateful server crashes, it has to recover.
Advantage: performance improvement.
A stateful server has a notion of a user session. Whenever a new client connects, a new session is opened. Example: shopping cart.
In a stateless server, every request is treated as a new request.
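The shopping cart example shows the difference in a few lines. This is a minimal sketch with hypothetical class names, not a real web framework; the `crash` method simply simulates losing in-memory state.

```python
class StatefulCartServer:
    """Keeps one session (cart) per client in server memory."""

    def __init__(self):
        self.sessions = {}

    def add_item(self, client_id, item):
        self.sessions.setdefault(client_id, []).append(item)
        return list(self.sessions[client_id])

    def crash(self):
        self.sessions = {}  # in-memory state is lost unless it is recovered


class StatelessCartServer:
    """Keeps nothing between requests; the client sends its cart every time."""

    def add_item(self, cart, item):
        return cart + [item]


stateful = StatefulCartServer()
stateful.add_item("alice", "book")
stateful.crash()
after_crash = stateful.add_item("alice", "pen")  # the earlier "book" is gone

stateless = StatelessCartServer()
cart = stateless.add_item([], "book")
cart = stateless.add_item(cart, "pen")           # the client carries the state
print(after_crash, cart)
```

The stateless version survives a server restart trivially but pays for it with larger requests, which is the performance trade-off listed above.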

Server Clusters
A collection of machines connected through a network, where each machine runs one or more servers.
Mostly connected by a LAN with high bandwidth and low latency.
Logically organized into three tiers:
  First tier: HTTP rendering
  Second tier: EJB, Python, Java
  Third tier: database
Each tier may optionally be replicated; uses a dispatcher.
Uses TCP handoffs.
The switch forms the entry point for the server cluster, offering a single network address. Transport-level switches accept incoming TCP connection requests and pass requests on to one of the servers in the cluster.
Multiple tiers of multiple components: the server application is split into smaller components, each responsible for a certain piece of the server functionality. The most common way to split is tiering.
Of course, not all server clusters follow this strict separation. It is frequently the case that each machine is equipped with its own local storage, often integrating application and data processing in a single server, leading to a two-tiered architecture.
When a server cluster offers multiple services, different machines may run different application servers. As a consequence, the switch has to be able to distinguish services.

TCP hand off Standard way of accessing server cluster – TCP connection
Transport-layer switch
Switches accept incoming TCP connection requests and hand off connections to one of the servers. When the switch receives a TCP connection request, it identifies the best server for handling that request and forwards the request packet to that server. The server, in turn, sends an acknowledgement back to the requesting client, but inserts the switch's IP address in the source field of the header of the IP packet carrying the TCP segment.
Most server clusters offer a single access point. What if that point fails? To eliminate that problem, several access points can be provided. For example, DNS can return several addresses, all belonging to the same host name.

Switches and single access point
The switch (load balancer) can play an important role in distributing the load among the various servers.
The simplest load-balancing policy the switch can follow is round robin. More advanced server selection criteria can be deployed as well.
The switch can also inspect the payload of the incoming request: content-aware request distribution.
Server Cluster Implementations
1. HTTP redirecting: when a new connection comes in, the entry point tells the browser which server is available for the client; the browser then talks to that replica to get the service. This is not transparent from the browser's point of view (the presence of replicas is exposed to the client/browser).
2. TCP handoff (see previous slide)
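Both selection policies fit in a short sketch. The class names, server names, and the path-prefix routing rule below are assumptions made for illustration; real switches apply such policies to TCP connections or HTTP payloads rather than to Python strings.

```python
import itertools

class RoundRobinSwitch:
    """Hands each new request to the next server in a fixed rotation."""

    def __init__(self, servers):
        self._next = itertools.cycle(servers)

    def pick(self, request=None):
        return next(self._next)


class ContentAwareSwitch:
    """Inspects the request path and routes by prefix; otherwise round robin."""

    def __init__(self, routes, default_servers):
        self._routes = routes
        self._fallback = RoundRobinSwitch(default_servers)

    def pick(self, request):
        for prefix, server in self._routes.items():
            if request.startswith(prefix):
                return server
        return self._fallback.pick()


rr = RoundRobinSwitch(["s1", "s2", "s3"])
print([rr.pick() for _ in range(4)])

switch = ContentAwareSwitch({"/images/": "img-server"}, ["s1", "s2"])
print(switch.pick("/images/logo.png"), switch.pick("/index.html"))
```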

Scalability Question : How can you scale the server capacity?
Buy a bigger machine!
Replicate
Distribute data and/or algorithms
Ship code instead of data
Cache

-Process and Code Migration-

Code and Process Migration
Motivation
How does migration occur?
Resource migration
Agent-based systems
Heterogeneous vs. homogeneous systems: which one is more complicated?
Code vs. process migration
There are situations in which passing programs, sometimes even while they are being executed, simplifies the design of a distributed system.
An agent-based model (ABM) is one of a class of computational models for simulating the actions and interactions of autonomous agents (both individual or collective entities such as organizations or groups) with a view to assessing their effects on the system as a whole. It combines elements of game theory, complex systems, emergence, computational sociology, multi-agent systems, and evolutionary programming. Monte Carlo methods are used to introduce randomness.
Agent: an entity that functions continuously and autonomously in an environment in which other processes take place and other agents exist. Agents individually assess their situation in the environment and make decisions on the basis of a set of rules.
General characteristics: autonomy, pro-activeness, reactivity, "social" ability

Process migration - Strong Mobility -
Key reasons: performance and flexibility
The entire process is moved from one machine to another.
Overall system performance can be improved if processes are moved from heavily loaded to lightly loaded machines.
Better utilization of system-wide resources
Distributed scheduling
Example: Condor
Condor: workload management system for compute-intensive jobs
  Harnesses a collection of dedicated or non-dedicated hardware under distributed ownership
  Developed by the University of Wisconsin-Madison Computer Science Department
  Originally developed for "cycle stealing" from idle machines
  Checkpointing saves the complete running process and I/O state to disk. This allows recovery from failures and rollback to the last saved state. It also allows process migration: move the saved state and restart.
Condor components: job queueing, scheduling policy, priority mechanism, resource monitoring, resource management
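The checkpoint, move, and restart cycle can be illustrated at application level. This sketch assumes the job keeps all of its progress in an explicit `state` dictionary and uses `pickle` for the saved image; it is not how Condor itself checkpoints a process, and all names are hypothetical.

```python
import os
import pickle
import tempfile

def run_job(state, steps):
    """Advance a long-running sum by `steps` iterations; progress lives in `state`."""
    for _ in range(steps):
        state["total"] += state["i"]
        state["i"] += 1
    return state

def checkpoint(state, path):
    """Save the job's state to disk."""
    with open(path, "wb") as f:
        pickle.dump(state, f)

def restart(path):
    """Load the last saved state so the job can continue from there."""
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.mkdtemp(), "job.ckpt")
state = run_job({"i": 0, "total": 0}, 5)  # runs on the "source machine"
checkpoint(state, path)                   # the saved file could now be moved
resumed = run_job(restart(path), 5)       # continues on the "target machine"
print(resumed)
```

Because the resumed job continues where the saved one stopped instead of starting over, this corresponds to strong mobility, and the same saved file doubles as a recovery point after a failure.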

Code Migration - Weak Mobility -
Basic motivation: process data close to where the data reside
Server-to-client shipment: ship server code to the client, e.g., for filling in forms (reduces communication, no need to pre-link stubs with the client)
Client-to-server shipment: ship parts of the client application to the server instead of shipping data from server to client (e.g., databases)
Improve parallelism: agent-based web searches
Examples: Java applets, search engines (Google)
Code migration improves parallelism: a typical example is searching for information on the Web. It is relatively simple to implement a search query in the form of a small mobile program, called a mobile agent, that moves from site to site. By making several copies of such a program and sending each off to different sites, we may be able to achieve a linear speed-up compared to using just a single program.
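The parallel search idea can be mimicked locally. In this sketch the "sites" are entries in a dictionary and the "agent" is a function run by worker threads; `SITES`, `search_agent`, and `dispatch_agents` are hypothetical names, and no code actually moves between machines.

```python
from concurrent.futures import ThreadPoolExecutor

SITES = {  # hypothetical sites, each holding its own local documents
    "site-a": ["distributed systems notes", "cooking recipes"],
    "site-b": ["operating systems", "distributed databases"],
    "site-c": ["travel guide"],
}

def search_agent(site, keyword):
    """Runs 'at' the site: scans local data and returns only the matches."""
    return [doc for doc in SITES[site] if keyword in doc]

def dispatch_agents(keyword):
    """Send one copy of the agent to every site and gather the results."""
    with ThreadPoolExecutor(max_workers=len(SITES)) as pool:
        results = pool.map(lambda site: search_agent(site, keyword), SITES)
        return dict(zip(SITES, results))

print(dispatch_agents("distributed"))
```

Each copy returns only its matches, so the data stay where they are and only the small results travel back.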

Motivation - Code migration
Flexibility
  Dynamic configuration of distributed systems
  Clients don't need preinstalled software: download on demand
  The big area where this is becoming popular is drivers

Migration models
Process = code segment + resource segment + execution segment
  Code segment: the set of instructions that make up the program being executed
  Resource segment: references to external resources such as files, printers, devices, and other processes
  Execution segment: the current execution state of a process (stack, program counter, etc.)
Weak versus strong mobility
  Weak mobility: the transferred program starts from its initial state. Example: Java applets, which always start execution from the beginning. The benefit of this approach is its simplicity.
  Strong mobility: the execution segment can be transferred as well. A running process can be stopped, moved to another machine, and then resume execution where it left off. Much harder to implement.
Sender-initiated versus receiver-initiated
  Sender-initiated: migration is initiated by the machine where the code resides. Example: a client sending query code to a database server. The client should be pre-registered.
  Receiver-initiated: migration is initiated by the machine that receives the code. Example: Java applets. The receiver can be anonymous.
Receiver-initiated migration is simpler than sender-initiated migration: clients take the initiative for migration. Securely uploading code to a server, as is done in sender-initiated migration, often requires that the client has previously been registered and authenticated at that server.
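Weak mobility can be sketched by shipping source text and starting it from scratch on the receiving side. The snippet assumes the code segment travels as a Python string; `CODE_SEGMENT`, `receive_and_run`, and the sample rows are hypothetical. Calling `exec` on code from an untrusted sender is unsafe, which is exactly why sender-initiated migration usually requires registration and authentication.

```python
# The "code segment" as it would be shipped: source text only, no execution state.
CODE_SEGMENT = """
def query(rows):
    return [r for r in rows if r["age"] >= 18]
"""

def receive_and_run(source, entry_point, argument):
    """Target side: load the shipped code into a fresh namespace and start it."""
    namespace = {}
    exec(source, namespace)  # the program starts from its initial state
    return namespace[entry_point](argument)

rows = [{"name": "Ada", "age": 36}, {"name": "Tim", "age": 12}]
adults = receive_and_run(CODE_SEGMENT, "query", rows)
print(adults)
```

Here the query runs next to the data and only the filtered rows would go back to the sender, matching the "client sending query code to a database server" example.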

Who executes migrated entity?
Code migration (weak mobility)
  Execute in a separate process [applets]
  Execute in the target process: the same process that downloaded the code
    The process needs to be protected against malicious code
Process migration (strong mobility)
  Migrate the same process
  Create a clone and migrate it: remote cloning (remote fork)
Code migration:
  Migrated code runs in the target process: you can execute it in the same process that downloaded the code.
  Migrated code runs in a separate process: safety.
For example, Java applets are simply downloaded by a Web browser and executed in the browser's address space. There is no need to start a separate process. The main drawback is that the target process needs to be protected against malicious or inadvertent code execution. A simple solution is to let the OS take care of that by creating a separate process to execute the migrated code.

Models for Code Migration

What about resource segment Do Resources Migrate?
So far, migration of the code and execution segments has been discussed. What about the resource segment?
Depends on the resource-to-process binding:
  By identifier (strongest): specific web site, FTP server
  By value: Java libraries
  By type (weakest): printers, local devices
Depends on the type of "attachment":
  Unattached to any node: data files
  Fastened resources (can be moved only at high cost): local databases, web sites
  Fixed resources (cannot be moved): local devices, communication end points

Resource Migration Actions
Actions to be taken with respect to the references to local resources when migrating code to another machine:
  GR: establish a global system-wide reference
  MV: move the resource
  CP: copy the resource
  RB: rebind the process to a locally available resource
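Choosing an action depends on both the binding and the attachment type, which can be written as a lookup table. The slides list only the four actions, so the mapping below is an assumption: it follows the table commonly given in Tanenbaum and van Steen's textbook treatment of code migration, with the first entry as the preferred action and the rest as alternatives. `ACTIONS` and `preferred_action` are hypothetical names.

```python
# (binding, attachment) -> actions in order of preference.
# Assumed from the textbook table, not stated in the slides.
ACTIONS = {
    ("identifier", "unattached"): ["MV", "GR"],
    ("identifier", "fastened"):   ["GR", "MV"],
    ("identifier", "fixed"):      ["GR"],
    ("value", "unattached"):      ["CP", "MV", "GR"],
    ("value", "fastened"):        ["GR", "CP"],
    ("value", "fixed"):           ["GR"],
    ("type", "unattached"):       ["RB", "MV", "CP"],
    ("type", "fastened"):         ["RB", "GR", "CP"],
    ("type", "fixed"):            ["RB", "GR"],
}

def preferred_action(binding, attachment):
    """Return the first-choice action for a resource reference."""
    return ACTIONS[(binding, attachment)][0]

print(preferred_action("identifier", "unattached"))  # a data file bound by identifier
print(preferred_action("type", "fixed"))             # e.g. a local printer
```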

Migration in Heterogeneous Systems
Until now we have implicitly assumed that the source machine and the target machine are homogeneous.
Systems can be heterogeneous (different architecture, OS):
  Support only weak mobility: recompile code, no run-time information
  Strong mobility: recompile code segment, transfer execution segment [migration stack] (figure)
  Virtual machines: interpret source (scripts) or intermediate code [Java]
1st case: In heterogeneous environments, one solution is to give up process migration (strong mobility) and do only code migration (weak mobility). Because the architectures differ, you recompile and run the code from the beginning, and no problem remains. You cannot migrate binary code; you move the source code and then rebuild the binary for the target platform and compiler. Think of Java.
2nd case: The figure shows process migration. We assume that source code is available for the process; otherwise moving it is impossible. On the destination machine, recompile the source code and recreate the code segment. Then take the data segment and do the appropriate translations (32-bit to 64-bit, etc.).
3rd case: Using virtual machines:
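The "appropriate translations" in the second case amount to agreeing on a machine-independent encoding for the data that moves. A minimal sketch, assuming the data segment is just a list of integers and that both sides agree on 64-bit big-endian values; `marshal` and `unmarshal` are hypothetical helper names built on the standard `struct` module.

```python
import struct

def marshal(values):
    """Source side: encode integers as 64-bit big-endian, whatever the local CPU is."""
    return struct.pack(f">{len(values)}q", *values)

def unmarshal(data):
    """Target side: decode the wire format back into native integers."""
    count = len(data) // 8
    return list(struct.unpack(f">{count}q", data))

wire = marshal([1, -2, 2**40])
print(len(wire), unmarshal(wire))
```

Fixing the width and byte order on the wire means a 32-bit little-endian sender and a 64-bit big-endian receiver both see the same values.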

Case Studies Back up slides

Case study: Agents
Software agents
  Autonomous process capable of reacting to, and initiating changes in, its environment, possibly in collaboration
  More than a "process": can act on its own
Mobile agent
  Capability to move between machines
  Needs support for strong mobility
  Example: D'Agents (aka Agent TCL)
  Support for heterogeneous systems, uses interpreted languages

Case Study: Viruses and Malware
Viruses and malware are examples of mobile code: malicious code spreads from one machine to another.
Sender-initiated: proactive viruses that look for machines to infect (autonomous code)
Receiver-initiated: the user (receiver) clicks on an infected web URL or opens an infected attachment

Case Study: PlanetLab
Distributed cluster across universities
  Used for experimental research by students and faculty in networking and distributed systems
Uses a virtualized architecture
  Linux Vservers
  Node manager per machine
  Obtain a "slice" for an experiment: slice creation service

Case Study: ISOS
Internet-scale operating system
  Harness compute cycles of thousands of PCs on the Internet
  PCs owned by different individuals
  Donate CPU cycles/storage when not in use (pool resources)
  Contact coordinator for work
  Coordinator: partition large parallel app into small tasks
  Assign compute/storage tasks to PCs
Examples: P2P backups

Case study: Condor
Condor: use idle cycles on workstations in a LAN
  Used to run large batch jobs, long simulations
  Idle machines contact Condor for work
  Condor assigns a waiting job
  User returns to workstation => suspend job, migrate
  Flexible job scheduling policies

backup

Types of Virtualization
1. Emulation
  The VM emulates/simulates complete hardware
  An unmodified guest OS for a different PC can be run
  Examples: Bochs, VirtualPC for Mac, QEMU
  Emulation is the most basic form of virtualization. Here we are talking about hardware-level virtualization: we are trying to emulate one CPU on another. The emulating software implements the entire machine in software. VirtualPC for Mac runs on the PowerPC and emulates a PC. Bochs emulates an Intel PC: it includes emulation of the Intel x86 CPU, common I/O devices, and a custom BIOS.
2. Full/native virtualization
  The VM simulates "enough" hardware to allow an unmodified guest OS to be run in isolation
  Same hardware CPU
  Examples: IBM VM family, VMware Workstation, Parallels, ...
  In emulation, interfaces A and B are essentially different. Here, interface A and interface B are both whatever the native machine offers: you take the native interface B and write software that implements the same interface. You do not need to emulate one CPU on top of another, because both CPUs are the same.
