Published by Branden George. Modified over 8 years ago.
1
Ch. 2-2 Computer Clusters
2.3 Design Principles of Computer Clusters
2.3.1 Single-System Image Features
A single-system image (SSI) means the illusion of a single system, single control, symmetry, and transparency.
Single system: Users view the entire cluster as one system that has multiple processors.
Single control: Logically, an end user or system user utilizes services from one place with a single interface.
Symmetry: All cluster services and functionalities are symmetric to all nodes and all users, except those protected by access rights.
Location transparency: The user is not aware of the whereabouts of the physical device that eventually provides a service.
Cluster nodes: home node, local node, remote nodes.
The illusion of an SSI can be obtained at several layers: the application software layer, the hardware or kernel layer, and the middleware layer.
2
Single Entry Point
The single entry point enables users to log in to a cluster as one virtual host. The system transparently distributes the user's login and connection requests to different physical hosts to balance the load.
Fig. 2.13: Realizing a Single Entry Point in a Cluster of Computers
Single File Hierarchy
From the viewpoint of any process, files can reside on three types of locations in a cluster, as shown in Fig. 2.14.
Stable storage must have two properties: it must be persistent and fault-tolerant.
Stable storage (global files) could be implemented as one centralized, large RAID disk, but it could also be distributed across the local disks of the cluster nodes.
Fig. 2.16: Single I/O Space over Distributed RAID for I/O-Centric Clusters
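The load balancing performed by the single entry point on this slide can be illustrated with a minimal sketch. The slide does not say which distribution policy is used; the "fewest active sessions" rule, the class name SingleEntryPoint, and the host names below are assumptions made only for illustration.

```python
from typing import Dict, List


class SingleEntryPoint:
    """Hypothetical dispatcher: users see one virtual host, logins go to physical hosts."""

    def __init__(self, hosts: List[str]) -> None:
        if not hosts:
            raise ValueError("at least one physical host is required")
        # Number of sessions currently assigned to each physical host.
        self.sessions: Dict[str, int] = {host: 0 for host in hosts}

    def login(self, user: str) -> str:
        """Assign the login to the host with the fewest sessions (assumed policy)."""
        # Ties are broken by host name so the result is deterministic.
        host = min(self.sessions, key=lambda h: (self.sessions[h], h))
        self.sessions[host] += 1
        return host


if __name__ == "__main__":
    entry = SingleEntryPoint(["node1", "node2", "node3"])
    for user in ["alice", "bob", "carol", "dave"]:
        print(user, "->", entry.login(user))
```

The user only ever addresses the entry point; which physical host serves the session stays hidden, which is the transparency the slide describes.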
3
RAID
2.3.2 High Availability through Redundancy
When designing robust, highly available systems, three terms are often used together: reliability, availability, and serviceability (RAS).
Reliability: measures how long a system can operate without a breakdown.
Availability: the percentage of time that a system is available to the user.
Serviceability: how easy it is to service the system, including maintenance, repair, and upgrades.
4
Availability and Failure Rate
Availability = MTTF / (MTTF + MTTR)
MTTF: mean time to failure
MTTR: mean time to repair
Planned vs. Unplanned Failures
Transient vs. Permanent Failures
Partial vs. Total Failures
Fig. 2.19: Single Point of Failure in an SMP and in Clusters of Computers
Redundancy Techniques
Table 2.5: Availability of Computer System Types
Isolated Redundancy
When a component (the primary component) fails, the service it provided is taken over by another component (the backup component). The primary and the backup components should be isolated from each other.
Benefits:
A component protected by isolated redundancy is not a single point of failure.
The failed component can be repaired while the rest of the system is still working.
The primary and the backup components can test and debug each other.
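To make the availability formula on this slide concrete, the following minimal sketch evaluates Availability = MTTF / (MTTF + MTTR) and converts the result into expected downtime per year. The function names and the numbers (an MTTF of 1000 hours, an MTTR of 10 hours) are hypothetical and are not taken from Table 2.5.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Availability = MTTF / (MTTF + MTTR), with both times in the same unit."""
    if mttf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTTF and MTTR must be non-negative")
    total = mttf_hours + mttr_hours
    if total == 0:
        raise ValueError("MTTF + MTTR must be positive")
    return mttf_hours / total


def downtime_hours_per_year(avail: float, hours_per_year: float = 8760.0) -> float:
    """Expected unavailable hours in a year of 365 days (8760 hours)."""
    return (1.0 - avail) * hours_per_year


if __name__ == "__main__":
    # Hypothetical system: fails every 1000 hours on average, takes 10 hours to repair.
    a = availability(1000.0, 10.0)
    print(f"availability: {a:.4%}")
    print(f"downtime per year: {downtime_hours_per_year(a):.1f} hours")
```

The formula shows two ways to raise availability: increase MTTF (more reliable components) or decrease MTTR (faster repair or takeover), which is what redundancy techniques aim at.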
5
N-Version Programming to Enhance Software Reliability
The software is implemented by N isolated teams who may not even know that the others exist. The teams are asked to implement the software using different algorithms, programming languages, environment tools, and even platforms. In a fault-tolerant system, the N versions all run simultaneously and their results are constantly compared. If the results differ, the system is notified that a fault has occurred.
2.3.3 Fault-Tolerant Cluster Configurations
Three ascending levels of availability: hot standby server clusters, active-takeover clusters, and failover clusters.
Failover must provide several functions: failure diagnosis, failure notification, and failure recovery.
Recovery Scheme
Backward recovery: checkpoint, rollback
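The result comparison in N-version programming, described earlier on this slide, can be sketched as follows. This is only an illustration under stated assumptions: the three version functions are hypothetical stand-ins for independently written implementations, they run one after another here instead of simultaneously, and the majority vote used to pick a result is an added assumption (the slide only says that a differing result signals a fault).

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def version_loop(n: int) -> int:
    """Hypothetical team 1: sum of squares 0^2 + ... + (n-1)^2 by iteration."""
    return sum(i * i for i in range(n))


def version_formula(n: int) -> int:
    """Hypothetical team 2: the same quantity via a closed-form formula."""
    return (n - 1) * n * (2 * n - 1) // 6


def version_buggy(n: int) -> int:
    """Hypothetical team 3: contains an off-by-one error (sums 1^2 + ... + n^2)."""
    return n * (n + 1) * (2 * n + 1) // 6


def run_n_versions(versions: Sequence[Callable[[int], int]], x: int) -> Tuple[int, bool]:
    """Run every version on the same input and compare the results.

    Returns (majority_result, fault_detected). Results must be hashable.
    """
    results = [version(x) for version in versions]
    counts = Counter(results)
    majority_result, _votes = counts.most_common(1)[0]
    fault_detected = len(counts) > 1  # any disagreement is reported as a fault
    return majority_result, fault_detected


if __name__ == "__main__":
    result, fault = run_n_versions([version_loop, version_formula, version_buggy], 4)
    print("result:", result, "fault detected:", fault)
```

Because the versions are developed in isolation, a design error in one of them is unlikely to be repeated in the others, so a disagreement exposes it.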
6
2.4 Cluster Job and Resource Management
2.4.1 Cluster Job Scheduling Methods
Cluster jobs may be scheduled to run at a specific time (calendar scheduling) or when a particular event happens (event scheduling).
Table 2.6: Job Scheduling Issues and Schemes for Cluster Nodes
Space Sharing
Multiple jobs can run on disjoint partitions of nodes simultaneously. At most one process is assigned to a node at a time.
Fig. 2.22: Job Scheduling by Tiling over Cluster Nodes
Time Sharing
Independent scheduling (local scheduling)
Gang scheduling: The gang scheduling scheme schedules all processes of a parallel job together. When one process is active, all processes are active.
Competition with foreign jobs
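The gang scheduling rule on this slide (when one process of a parallel job is active, all of its processes are active) can be illustrated with a deliberately simplified sketch. It assumes one job per time slice in round-robin order, which is a simplification chosen for illustration; the function name, job names, and process labels are hypothetical.

```python
from typing import Dict, List


def gang_schedule(jobs: Dict[str, int], num_nodes: int, num_slices: int) -> List[List[str]]:
    """Build a timeline with one row per time slice and one column per node.

    jobs maps a job name to its number of processes. In each slice, ALL
    processes of exactly one job run together; unused nodes are marked 'idle'.
    """
    for name, nprocs in jobs.items():
        if nprocs < 1 or nprocs > num_nodes:
            raise ValueError(f"job {name!r} needs between 1 and {num_nodes} processes")

    order = list(jobs)  # round-robin order over jobs (assumed policy)
    timeline: List[List[str]] = []
    for t in range(num_slices):
        if not order:
            timeline.append(["idle"] * num_nodes)
            continue
        job = order[t % len(order)]
        # Every process of the selected job is placed in the same time slice.
        row = [f"{job}.p{i}" for i in range(jobs[job])]
        row += ["idle"] * (num_nodes - len(row))
        timeline.append(row)
    return timeline


if __name__ == "__main__":
    for t, row in enumerate(gang_schedule({"A": 2, "B": 3}, num_nodes=4, num_slices=4)):
        print(f"slice {t}: {row}")
```

Under independent (local) scheduling, by contrast, each node would pick its next process on its own, so the processes of one parallel job could be active at different times and wait on each other.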
7
2.4.2 Cluster Job Management Systems
A Job Management System (JMS) should have three parts:
User server
Job scheduler
Resource manager: allocates and monitors resources, enforces scheduling policies, and collects accounting information
JMS Administration
Cluster Job Types
Characteristics of a Cluster Workload: workload characteristics based on experience with the NAS benchmark; see p. 108.
Migration Schemes: node availability, migration overhead, recruitment threshold
The recruitment threshold is the amount of time a workstation stays unused before the cluster considers it an idle node.
2.4.3 Load Sharing Facility for Cluster Computing