slide1
2.3 Design Principles of Computer Clusters

PowerPoint Slideshow about '2.3 Design Principles of Computer Clusters' - amir-clark

2.3.1 Single-System Image Features

  • A single-system image (SSI) is the illusion of a single system, single control, symmetry, and transparency.
    • Single system: Users view the entire cluster as one system that has multiple processors.
    • Single control: Logically, an end user or system user uses services from one place through a single interface.
    • Symmetry: All cluster services and functionalities are symmetric to all nodes and all users, except those protected by access rights.
    • Location-transparent: The user is not aware of the whereabouts of the physical device that eventually provides a service.
  • Cluster nodes
    • home node
    • local node
    • remote nodes
  • The illusion of an SSI can be obtained at several layers: application software layer, hardware or kernel layer, middleware layer.

Ch. 2-2 Computer Clusters

slide2
Single Entry Point
    • The single entry point enables users to log in to a cluster as one virtual host.
    • The system transparently distributes the user’s login and connection requests to different physical hosts to balance the load.
    • Realizing a Single Entry Point in a Cluster of Computers
      • Fig. 2.13
    • Single File Hierarchy
      • From the viewpoint of any process, files can reside in three types of locations in a cluster, as shown in Fig. 2.14.
      • Stable storage must satisfy two requirements: it must be persistent and fault-tolerant.
      • Stable storage (global files) could be implemented as one centralized, large RAID disk, or it could be distributed across the local disks of cluster nodes.
    • Single I/O Space over Distributed RAID for I/O-Centric Clusters
      • Fig. 2.16
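
The load balancing performed behind a single entry point can be sketched as follows. This is only an illustration under stated assumptions: the slides do not name a balancing policy, so a least-connections policy is assumed here, and the `SingleEntryPoint` class and the host names are hypothetical.

```python
class SingleEntryPoint:
    """Hypothetical dispatcher that presents many hosts as one virtual host."""

    def __init__(self, hosts):
        # Track the number of active sessions on each physical host.
        self.sessions = {host: 0 for host in hosts}

    def login(self, user):
        # Assumed policy: choose the host with the fewest active sessions.
        # Ties go to the host that was registered first.
        host = min(self.sessions, key=self.sessions.get)
        self.sessions[host] += 1
        return host

    def logout(self, host):
        # Release one session on the given physical host.
        self.sessions[host] -= 1


entry = SingleEntryPoint(["node1", "node2", "node3"])
assignments = [entry.login(u) for u in ["alice", "bob", "carol", "dave"]]
print(assignments)
```

Users only ever address `entry`; which physical host serves them is decided inside `login`, which is the transparency the slide describes.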

slide3
RAID

2.3.2 High Availability through Redundancy

  • When designing robust, highly available systems, three terms are often used together: reliability, availability, and serviceability (RAS).
    • Reliability: measures how long a system can operate without a breakdown.
    • Availability: the percentage of time that a system is available to the user.
    • Serviceability: how easy it is to service the system (maintenance, repair, upgrades).

slide4
Availability and Failure Rate
    • Availability = MTTF / (MTTF + MTTR)
    • MTTF (mean time to failure)
    • MTTR (mean time to repair)
  • Planned vs. Unplanned Failures
  • Transient vs. Permanent Failures
  • Partial vs. Total Failures
    • Single Point of failure in an SMP and in Clusters of Computers, Fig. 2.19.
  • Redundancy Techniques
    • Table 2.5 Availability of Computer System Types
  • Isolated Redundancy
    • When a component (the primary component) fails, the service it provided is taken over by another component (the backup component).
    • The primary and the backup components should be isolated from each other.
    • Benefits
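      • The protected component is no longer a single point of failure.
      • A failed component can be repaired while the rest of the system is still working.
      • The primary and the backup components can test and debug each other.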
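
The formula Availability = MTTF / (MTTF + MTTR) from this slide can be checked numerically. The sketch below is a minimal example; the MTTF and MTTR values are made-up inputs, not figures from Table 2.5, and the helper names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24


def availability(mttf_hours, mttr_hours):
    """Availability = MTTF / (MTTF + MTTR), as a fraction between 0 and 1."""
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_per_year(avail):
    """Expected hours of downtime in one (non-leap) year."""
    return (1.0 - avail) * HOURS_PER_YEAR


# Illustrative inputs only: fails every 1000 hours, takes 10 hours to repair.
a = availability(mttf_hours=1000.0, mttr_hours=10.0)
print(f"availability = {a:.4f}")
print(f"expected downtime = {downtime_per_year(a):.1f} hours/year")
```

The formula shows two ways to raise availability: increase MTTF (more reliable components) or reduce MTTR (faster repair or takeover by a backup).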

slide5
N-Version Programming to Enhance Software Reliability
    • The software is implemented by N isolated teams who may not even know that the others exist.
    • Different teams are asked to implement the software using different algorithms, programming languages, environment tools, and even platforms.
    • In a fault-tolerant system, the N versions all run simultaneously and their results are constantly compared. If the results differ, the system is notified that a fault has occurred.
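
The comparison step described above can be sketched in a few lines. The slides only say that results are compared and a fault is reported when they differ; returning the majority value is an added assumption, and the three `version_*` functions are hypothetical stand-ins for independently written implementations.

```python
from collections import Counter


def run_n_versions(versions, x):
    """Run every version on the same input and compare the results."""
    results = [version(x) for version in versions]
    counts = Counter(results)
    # Assumption: the most common result is taken as the answer.
    majority_value, _votes = counts.most_common(1)[0]
    # A fault is flagged whenever the versions disagree.
    fault_detected = len(counts) > 1
    return majority_value, fault_detected


# Three hypothetical implementations of "x squared".
def version_a(x):
    return x * x


def version_b(x):
    return x ** 2


def version_c(x):
    # Repeated addition; deliberately wrong for negative x.
    return sum(x for _ in range(x))


versions = [version_a, version_b, version_c]
print(run_n_versions(versions, 4))
print(run_n_versions(versions, -3))
```

For the input 4 all versions agree; for -3 the third version disagrees, so the fault flag is set while the majority value is still returned.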

2.3.3 Fault-Tolerant Cluster Configurations

  • Three ascending levels of availability
    • Hot standby server clusters
    • Active-takeover clusters
    • Failover cluster
      • Failover must provide several functions: failure diagnosis, failure notification, and failure recovery.
  • Recovery Scheme
    • Backward recovery
      • Checkpoint
      • Rollback
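
Backward recovery with checkpoint and rollback can be illustrated with a small in-memory sketch. It assumes the saved state fits in memory and is copied with `copy.deepcopy`; a real cluster would write checkpoints to stable storage. The `CheckpointedJob` class is hypothetical.

```python
import copy


class CheckpointedJob:
    """Hypothetical job whose state can be saved and restored."""

    def __init__(self, state):
        self.state = state
        # The initial state serves as the first checkpoint.
        self._saved = copy.deepcopy(state)

    def checkpoint(self):
        # Save a consistent copy of the current state.
        self._saved = copy.deepcopy(self.state)

    def rollback(self):
        # Discard the current state and return to the last checkpoint.
        self.state = copy.deepcopy(self._saved)


job = CheckpointedJob({"step": 0, "total": 0})
for step in range(1, 4):
    job.state["step"] = step
    job.state["total"] += step
job.checkpoint()

job.state["total"] = -999  # simulate a fault that corrupts the state
job.rollback()             # backward recovery to the saved state
print(job.state)
```

Work done after the last checkpoint is lost on rollback, so checkpoint frequency trades runtime overhead against the amount of recomputation.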

slide6
2.4 Cluster Job and Resource Management

2.4.1 Cluster Job Scheduling Methods

  • Cluster jobs may be scheduled to run at a specific time (calendar scheduling) or when a particular event happens (event scheduling).
  • Table 2.6 Job Scheduling Issues and Schemes for Cluster Nodes
  • Space Sharing
    • Multiple jobs can run on disjoint partitions of nodes simultaneously.
    • At most one process is assigned to a node at a time.
    • Job Scheduling by Tiling over Cluster Nodes, Fig. 2.22
  • Time Sharing
    • Independent scheduling (local scheduling)
    • Gang scheduling
      • The gang scheduling scheme schedules all processes of a parallel job together.
      • When one process is active, all processes are active.
    • Competition with foreign jobs
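
Gang scheduling can be sketched as a table of time slots in which all processes of one parallel job run together. This is a simplified illustration, assuming round-robin rotation with one job per slot and one process per node; the `gang_schedule` function and the job sizes are hypothetical.

```python
def gang_schedule(jobs, num_nodes, num_slots):
    """Return a list of time slots; each slot maps node -> (job, rank).

    jobs: dict of job name -> number of processes.
    Every process of the chosen job runs in the same slot.
    """
    for name, size in jobs.items():
        if size > num_nodes:
            raise ValueError(f"job {name} needs {size} nodes, have {num_nodes}")
    names = list(jobs)
    schedule = []
    for slot in range(num_slots):
        # Jobs take turns; the whole gang is scheduled at once.
        job = names[slot % len(names)]
        schedule.append({node: (job, node) for node in range(jobs[job])})
    return schedule


slots = gang_schedule({"A": 4, "B": 2}, num_nodes=4, num_slots=4)
for t, slot in enumerate(slots):
    row = [slot[n][0] if n in slot else "-" for n in range(4)]
    print(t, row)
```

In each printed row either every process of a job is running or none is, which matches "when one process is active, all processes are active". Nodes left idle in job B's slots show the cost of this simple variant.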

slide7
2.4.2 Cluster Job Management Systems
  • A Job Management System (JMS) should have three parts:
    • user server
    • job scheduler
    • resource manager: resource allocation and monitoring, scheduling policy enforcement, and collection of accounting information
  • JMS Administration
  • Cluster Job Types
  • Characteristics of a Cluster Workload
    • Workload characteristics based on experience with the NAS benchmark; see p. 108.
  • Migration Schemes
    • Node availability
    • Migration overhead
    • Recruitment threshold
      • The recruitment threshold is the amount of time a workstation stays unused before the cluster considers it an idle node.
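
The recruitment threshold can be expressed as a simple test on how long each workstation has been unused. The sketch below assumes that the time of the last owner activity is known per workstation, in seconds; the `idle_nodes` function and the sample values are hypothetical.

```python
def idle_nodes(last_activity, now, recruitment_threshold):
    """Return workstations unused for at least the recruitment threshold.

    last_activity: dict of node name -> time of last owner activity (seconds).
    """
    return sorted(
        node
        for node, t in last_activity.items()
        if now - t >= recruitment_threshold
    )


last_activity = {"ws1": 100.0, "ws2": 550.0, "ws3": 0.0}
recruitable = idle_nodes(last_activity, now=600.0, recruitment_threshold=300.0)
print(recruitable)
```

A low threshold recruits workstations sooner but raises the chance that the owner returns and the job has to be migrated; a high threshold leaves idle capacity unused for longer.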

2.4.3 Load Sharing Facility for Cluster Computing
