Ch. 2-2 Computer Clusters

2.3 Design Principles of Computer Clusters

2.3.1 Single-System Image Features
• A single-system image (SSI) is the illusion of a single system, single control, symmetry, and transparency.
  - Single system: users view the entire cluster as one system that has multiple processors.
  - Single control: logically, a user or system user uses services from one place through a single interface.
  - Symmetry: all cluster services and functionalities are symmetric to all nodes and all users, except those protected by access rights.
  - Location transparency: the user is not aware of the whereabouts of the physical device that eventually provides a service.
• Cluster nodes
  - home node
  - local node
  - remote nodes
• The illusion of an SSI can be obtained at several layers: the application software layer, the hardware or kernel layer, and the middleware layer.

• Single Entry Point
  - The single entry point enables users to log in to a cluster as one virtual host.
  - The system transparently distributes the user's login and connection requests to different physical hosts to balance the load.
  - Realizing a single entry point in a cluster of computers (see figure)
• Single File Hierarchy
  - From the viewpoint of any process, files can reside on three types of locations in a cluster, as shown in the figure.
  - Stable storage requires two properties: persistence and fault tolerance.
  - Stable storage (global files) could be implemented as one centralized, large RAID disk, but it could also be distributed across the local disks of cluster nodes.
• Single I/O space over distributed RAID for I/O-centric clusters (Fig. 2.16)
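The load-balancing behavior of a single entry point can be illustrated with a minimal sketch. The slides do not say which balancing policy is used, so the least-sessions policy below is an assumption, and the class name `SingleEntryPoint`, the host names, and the user names are all hypothetical.

```python
class SingleEntryPoint:
    """Hypothetical dispatcher sitting behind one virtual host name."""

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one physical host is required")
        # Active session count per physical host (all start at zero).
        self.sessions = {host: 0 for host in hosts}
        # Remembers which physical host each user was sent to.
        self.placement = {}

    def login(self, user):
        """Send the login to the host with the fewest active sessions.

        Assumed policy: least sessions first; the slides only say that
        requests are distributed to balance the load.
        """
        # min() returns the first host among ties, in insertion order.
        target = min(self.sessions, key=self.sessions.get)
        self.sessions[target] += 1
        self.placement[user] = target
        return target


# Users always contact the same virtual host; the physical host varies.
cluster = SingleEntryPoint(["node1", "node2", "node3"])
placements = [cluster.login(u) for u in ["ann", "bob", "cho", "dee"]]
print(placements)
```

From the user's side there is only one login target; the mapping to a physical node stays hidden inside the dispatcher.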

• RAID

High Availability through Redundancy
• When designing robust, highly available systems, three terms are often used together: reliability, availability, and serviceability (RAS).
  - Reliability: measures how long a system can operate without a breakdown.
  - Availability: the percentage of time that the system is available to users.
  - Serviceability: how easy it is to service the system (maintenance, repair, and upgrades).

• Availability and Failure Rate
  - Availability = MTTF / (MTTF + MTTR)
  - MTTF: mean time to failure
  - MTTR: mean time to repair
• Planned vs. unplanned failures
• Transient vs. permanent failures
• Partial vs. total failures
• Single point of failure in an SMP and in clusters of computers (see figure)
• Redundancy Techniques
  - Table 2.5: Availability of Computer System Types
• Isolated Redundancy
  - When a component (the primary component) fails, the service it provided is taken over by another component (the backup component).
  - The primary and the backup components should be isolated from each other.
  - Benefits:
    - The component is not a single point of failure.
    - A failed component can be repaired while the rest of the system is still working.
    - The primary and the backup components can test and debug each other.
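As a worked example of the availability formula on this slide, the sketch below computes availability from MTTF and MTTR and converts it to expected downtime per year. The MTTF and MTTR values are made-up illustration numbers, not figures from Table 2.5, and the helper names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24


def availability(mttf_hours, mttr_hours):
    """Availability = MTTF / (MTTF + MTTR), both in the same time unit."""
    if mttf_hours < 0 or mttr_hours < 0 or mttf_hours + mttr_hours == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_hours_per_year(avail):
    """Expected hours per year during which the system is unavailable."""
    return (1.0 - avail) * HOURS_PER_YEAR


# Hypothetical system: fails every 1000 hours on average, 10 hours to repair.
a = availability(1000, 10)
print(f"availability = {a:.4%}")
print(f"downtime     = {downtime_hours_per_year(a):.1f} hours/year")

# Halving the repair time raises availability without touching MTTF.
a_fast_repair = availability(1000, 5)
```

The second call shows why both terms matter: availability can be improved either by making failures rarer (larger MTTF) or by repairing faster (smaller MTTR).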

• N-Version Programming to Enhance Software Reliability
  - The software is implemented by N isolated teams who may not even know the others exist.
  - Different teams are asked to implement the software using different algorithms, programming languages, environment tools, and even platforms.
  - In a fault-tolerant system, the N versions all run simultaneously and their results are constantly compared. If the results differ, the system is notified that a fault has occurred.

Fault-Tolerant Cluster Configurations
• Three ascending levels of availability
  - Hot standby server clusters
  - Active-takeover clusters
  - Failover clusters: failover must provide several functions, namely failure diagnosis, failure notification, and failure recovery.
• Recovery schemes
  - Backward recovery
    - Checkpoint
    - Rollback
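The result comparison used in N-version programming, described earlier on this slide, can be sketched as follows. This is a minimal illustration with several assumptions: the three "versions" are hypothetical stand-ins for independently written implementations, they are called one after another rather than simultaneously, and the majority vote used to pick a result is an added assumption (the slides only say that a fault is reported when results differ).

```python
from collections import Counter


def sum_formula(n):
    """Version 1: closed-form sum of 1..n."""
    return n * (n + 1) // 2


def sum_loop(n):
    """Version 2: explicit loop over 1..n."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_faulty(n):
    """Version 3: deliberately wrong (off by one) to simulate a fault."""
    return sum(range(n))


def run_n_versions(versions, argument):
    """Run every version, compare the results, and report any disagreement.

    Returns (majority_value, fault_detected). Raises RuntimeError when
    no value is produced by more than half of the versions.
    """
    results = [version(argument) for version in versions]
    counts = Counter(results)
    value, votes = counts.most_common(1)[0]
    fault_detected = len(counts) > 1  # any differing result signals a fault
    if votes * 2 <= len(results):
        raise RuntimeError("versions disagree and no majority exists")
    return value, fault_detected


print(run_n_versions([sum_formula, sum_loop, sum_faulty], 10))
```

With three versions, a single faulty implementation is both detected and outvoted; with only two versions a disagreement can be detected but not resolved.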

2.4 Cluster Job and Resource Management

Cluster Job Scheduling Methods
• Cluster jobs may be scheduled to run at a specific time (calendar scheduling) or when a particular event happens (event scheduling).
• Table 2.6: Job Scheduling Issues and Schemes for Cluster Nodes
• Space Sharing
  - Multiple jobs can run on disjoint partitions of nodes simultaneously.
  - At most one process is assigned to a node at a time.
  - Job scheduling by tiling over cluster nodes (see figure)
• Time Sharing
  - Independent scheduling (local scheduling)
  - Gang scheduling: the gang scheduling scheme schedules all processes of a parallel job together. When one process is active, all processes are active.
  - Competition with foreign jobs
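Gang scheduling, as described under Time Sharing, can be pictured as a table of time slots in which every process of one job runs at the same moment. The sketch below builds such a table under simplifying assumptions: jobs take turns in round-robin order, one job owns each time slot, and the function name `gang_schedule` and the job names are hypothetical.

```python
def gang_schedule(jobs, num_nodes, num_slots):
    """Build a time-slot table where all processes of a job run together.

    jobs maps a job name to its number of processes. Each row of the
    result is one time slot; each column is one node. None marks a node
    left idle in that slot.
    """
    if not jobs:
        raise ValueError("at least one job is required")
    for name, procs in jobs.items():
        if procs > num_nodes:
            raise ValueError(f"job {name} needs more nodes than available")

    names = list(jobs)
    table = []
    for slot in range(num_slots):
        # Assumed policy: round-robin over jobs, one job per time slot.
        name = names[slot % len(names)]
        row = [f"{name}.{p}" for p in range(jobs[name])]
        row += [None] * (num_nodes - len(row))
        table.append(row)
    return table


# Two parallel jobs sharing a four-node cluster over three time slots.
table = gang_schedule({"A": 4, "B": 2}, num_nodes=4, num_slots=3)
for slot, row in enumerate(table):
    print(slot, row)
```

Every row contains processes of a single job only, which is the defining property: when one process of a job is active, all of its processes are active.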

Cluster Job Management Systems
• A job management system (JMS) should have three parts:
  - user server
  - job scheduler
  - resource manager: resource allocation and monitoring, scheduling policy enforcement, and accounting information collection
• JMS Administration
• Cluster Job Types
• Characteristics of a Cluster Workload
  - Workload characteristics based on experience with the NAS benchmarks; see p. 108.
• Migration Schemes
  - Node availability
  - Migration overhead
  - Recruitment threshold: the amount of time a workstation stays unused before the cluster considers it an idle node.

Load Sharing Facility for Cluster Computing
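Returning to the recruitment threshold listed under Migration Schemes, the following minimal sketch shows how a cluster might decide which workstations count as idle. The function name `idle_nodes`, the workstation names, the timestamps, and the use of "greater than or equal" at the boundary are all assumptions made for illustration.

```python
def idle_nodes(last_activity, now, recruitment_threshold):
    """Return workstations unused for at least the recruitment threshold.

    last_activity maps a workstation name to the time (in seconds) of
    its most recent user activity; now is the current time in the same
    units.
    """
    return sorted(
        name
        for name, last_used in last_activity.items()
        if now - last_used >= recruitment_threshold
    )


# Hypothetical snapshot taken at t = 900 s with a 300 s threshold.
last_activity = {"ws1": 0.0, "ws2": 500.0, "ws3": 890.0}
recruited = idle_nodes(last_activity, now=900.0, recruitment_threshold=300.0)
print(recruited)
```

A longer threshold recruits fewer workstations but lowers the chance that a job has to migrate away when the owner returns.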