Distributed Systems

Luai Malhis, Ph.D., Computer Engineering Department, An-Najah N. University, Fall 2013

Introduction
• What is a distributed system?
• Why do we need distributed systems?
• History of distributed systems
• Applications of distributed systems
• Goals and objectives

Definition of a Distributed System (Tanenbaum):
• A distributed system is a collection of independent computers that appears to its users as a single coherent system.

Centralized Systems:
• System shared by users all the time
• All resources accessible
• Software runs in a single process
• Single physical location
• Single clock time (global time)
• Single point of control
• Single point of failure

Decentralized Systems:
• Multiple autonomous components
• Components shared by users
• Some resources may not be accessible
• Software can run in concurrent processes on different processors
• Multiple physical locations
• Multiple points of control
• Multiple points of failure
• Multiple clock times (no global time)
• No shared memory (in most cases)

Basics of Distributed Systems:
• Networked computers (closely or loosely coupled) that provide a degree of operation transparency
• Distributed computer system = independent processors + networking infrastructure
• Communication between processes (on the same or different computers) using message passing technologies is the basis of distributed computing

Advantages of Distributed Systems over Centralized Systems
• Economics: a collection of microprocessors offers better price/performance than a mainframe, which makes it a cost-effective way to increase computing power.
• Speed: a distributed system may have more total computing power than a mainframe. Example: 10,000 CPU chips, each running at 50 MIPS, deliver 500,000 MIPS in total. A single 500,000 MIPS processor cannot be built, since it would require an instruction cycle of 0.002 nsec. Load distribution also enhances performance.
• Inherent distribution: some applications are inherently distributed, e.g., a supermarket chain.
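The arithmetic behind the speed example can be checked with a few lines of Python. This is only a worked calculation of the figures quoted on the slide; the variable names are illustrative.

```python
# Aggregate throughput of the distributed system in the slide's example.
cpus = 10_000
mips_per_cpu = 50
total_mips = cpus * mips_per_cpu  # millions of instructions per second

# A single processor with the same throughput would have to complete
# one instruction every 1 / (total_mips * 10^6) seconds.
instructions_per_second = total_mips * 1_000_000
cycle_time_ns = 1e9 / instructions_per_second  # nanoseconds per instruction

print(total_mips)      # 500000
print(cycle_time_ns)   # 0.002
```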

Advantages (continued)
• Reliability: if one machine crashes, the system as a whole can still survive. This gives higher availability and improved reliability.
• Incremental growth: computing power can be added in small increments (modular expandability).
• Another driving force: the existence of a large number of personal computers and the need for people to collaborate and share information.

Advantages (continued)
• Data sharing: allows many users to access a common database
• Resource sharing: expensive peripherals such as color printers
• Communication: enhances human-to-human communication, e.g., email, chat
• Flexibility: spreads the workload over the available machines

Disadvantages of Distributed Systems
• Software: it is difficult to develop software for distributed systems
• Network: saturation, lossy transmissions
• Security: easy access also applies to secret data

Goals of Distributed Systems:
• Resource sharing: make it easy for users to access remote resources.
• Transparency: hide the fact that processes and resources are physically distributed across multiple computers.
• Openness: offer services according to standard rules.
• Scalability: make the system easy to expand and manage.

Forms of transparency in ISO RM-ODP (Reference Model of Open Distributed Processing)

Scalability Problems: Centralized paradigm

Scalability Problems: Decentralized paradigm
• No machine has complete information about the system's state.
• Machines make decisions based only on local information.
• Failure of one machine does not ruin the algorithm.
• There is no implicit assumption that a global clock exists.

User Requirements:
• What services can the system provide?
• How easy is the system to use and manage?
• What benefits can the system offer?
• What is the performance/cost ratio?
• How reliable is the system?
• How much security can the system guarantee?

Distributed Computer System Metrics
• Latency – network delay before any data is sent
• Bandwidth – maximum channel capacity (Hz for analogue communication, bps for digital communication)
• Granularity – relative size of the units of processing required. Distributed systems generally operate best with coarse-grained units, because communication is slow compared to processing speed.
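A rough way to see why coarse granularity suits distributed systems is to compare the computation time of a unit of work with the time needed to ship its data. The sketch below assumes a simple cost model (message time = latency + size / bandwidth) and a hypothetical link of 1 ms latency and 100 Mbit/s; neither the model nor the numbers come from the slides.

```python
def transfer_time(size_bits, latency_s, bandwidth_bps):
    """Time to deliver one message: fixed latency plus transmission time."""
    return latency_s + size_bits / bandwidth_bps


def compute_to_comm_ratio(compute_s, size_bits, latency_s, bandwidth_bps):
    """Seconds of computation per second spent communicating."""
    return compute_s / transfer_time(size_bits, latency_s, bandwidth_bps)


# Hypothetical link parameters (assumptions for illustration only).
LATENCY = 1e-3      # 1 ms
BANDWIDTH = 100e6   # 100 Mbit/s

# Same 8,000-bit message, but very different amounts of work per message.
fine = compute_to_comm_ratio(1e-4, 8_000, LATENCY, BANDWIDTH)    # 0.1 ms of work
coarse = compute_to_comm_ratio(1.0, 8_000, LATENCY, BANDWIDTH)   # 1 s of work

print(f"fine-grained ratio:   {fine:.3f}")
print(f"coarse-grained ratio: {coarse:.1f}")
```

A ratio below 1 means the unit spends more time communicating than computing, which is the situation coarse-grained decomposition avoids.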

Metrics (continued)
• Reliability – ability to continue operating correctly for a given time
• Fault tolerance – resilience to partial system failure
• Security – policy to deal with threats to the communication or processing of data in a system
• Processor speed – MIPS, FLOPS
• Administrative/management domains – issues concerning the ownership of and access to distributed system components

Chapter 2: Concepts and Architectures
Traditional Computer Architecture (figure: CPU, memory, I/O, disk(s))

Computer System Architectures
• Flynn's classification (1966 and 1972) of computer systems in terms of instruction and data stream organizations
• Based on the von Neumann model (separate processor and memory units)
• 4 machine organizations:
  • SISD - Single Instruction, Single Data (PC, MIPS, etc.)
  • SIMD - Single Instruction, Multiple Data (array processors)
  • MISD - Multiple Instruction, Single Data (not available)
  • MIMD - Multiple Instruction, Multiple Data (parallel processors)

Flynn Architectures (1)
• SISD: serial processor
• SIMD: array processor
• Legend: CU – control unit, PU – processor unit, I – instruction stream, D – data stream

Flynn Architectures (2)
• MIMD: multiprocessor and multicomputer
• MISD: no real examples, possibly some pipeline architectures

Processor-Memory Interconnection Network

Crossbar switch

Multiple stage switch (figure: processors P connected to memories M)

Homogeneous Multicomputer Systems – Processor Arrays
• Grid
• Hypercube

Loosely coupled multi-computer systems (figure: each processor P with its own memory M)
• Distributed memory multi-computer
• IPC by message passing
• Typically PC or workstation clusters
• Physically distributed components
• Characterized by longer message delays and a limited-bandwidth network

Closely coupled multi-computer systems (figure: processors P with memories M, shared memory, and I/O)
• Shared memory multiprocessor
• Processors connected via a common bus or fast network
• Characterized by short message delays and high bandwidth
• IPC via shared memory

Network based Systems
• Network size: number of nodes N
• Node: n_i, 1 ≤ i ≤ N
• Distance: d(n_i, n_j) is the number of links on the shortest path between n_i and n_j
• Network distance: D = max(d(n_i, n_j)) over all pairs of nodes
• Degree: degree(n_i) is the number of links from/to n_i
• Network topology is an abstract graph that represents the architecture of a network
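These definitions translate directly into code. The following sketch represents a topology as an adjacency list and uses breadth-first search to obtain distances; it assumes a connected, undirected network, and the helper names (`distances_from`, `network_distance`, `ring`, `star`) are illustrative rather than taken from the slides.

```python
from collections import deque


def distances_from(adj, source):
    """Number of links on the shortest path from source to every node (BFS)."""
    dist = {source: 0}
    pending = deque([source])
    while pending:
        node = pending.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                pending.append(neighbour)
    return dist


def network_distance(adj):
    """D = max d(n_i, n_j) over all node pairs (assumes a connected network)."""
    return max(max(distances_from(adj, n).values()) for n in adj)


def degree(adj, node):
    """Number of links from/to a node."""
    return len(adj[node])


def ring(n):
    """Ring of n >= 3 nodes: each node is linked to its two neighbours."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def star(n):
    """Star of n nodes: node 0 is the hub, all other nodes link only to it."""
    adj = {0: list(range(1, n))}
    adj.update({i: [0] for i in range(1, n)})
    return adj


print(network_distance(ring(8)), degree(ring(8), 0))   # 4 2
print(network_distance(star(8)), degree(star(8), 0))   # 2 7
```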

Desired Properties:
• (1) When the network size grows arbitrarily, the network distance (the maximum distance between two nodes) increases only slowly
• (2) There exists a constant k such that degree(n_i) ≤ k for every node
• (3) The routing algorithm is easy to implement and independent of the network size
• (4) When some nodes or links fail, the network is still connected (with lower performance)
• (5) Traffic loads are evenly distributed over the network
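Properties (1) and (2) tend to pull in opposite directions. As an illustration, the sketch below compares the well-known closed-form values for a ring and a hypercube: the ring keeps the degree constant but its distance grows linearly with N, while the hypercube keeps the distance at log2 N at the price of a degree that also grows as log2 N. The function names are illustrative.

```python
def ring_metrics(n):
    """(network distance, degree) of a ring with n >= 3 nodes."""
    return n // 2, 2


def hypercube_metrics(n):
    """(network distance, degree) of a hypercube with n nodes, n a power of 2."""
    dimension = n.bit_length() - 1  # log2(n) for a power of two
    return dimension, dimension


for n in (8, 64, 1024):
    print(n, ring_metrics(n), hypercube_metrics(n))
```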

Typical network topologies: star, ring, B-tree, complete, regular, arbitrary

Software Concepts
• DOS (Distributed Operating Systems)
• NOS (Network Operating Systems)
• Middleware

Uniprocessor Operating System
Separating applications from operating system code through a microkernel.

Distributed Operating System
Tightly-coupled operating system for multi-processors and homogeneous multi-computers. Strong transparency.

DOS: characteristics (1)
• Distributed Operating System
• Allows the resources of a multiprocessor or multicomputer network to be integrated as a single system image
• Hides and manages hardware and software resources
• Provides transparency support
• Provides heterogeneity support
• Controls the network in the most effective way
• Consists of low-level commands + local operating systems + distributed features
• Inter-process communication (IPC)

DOS: characteristics (2)
• Remote file and device access
• Global addressing and naming
• Trading and naming services
• Synchronization and deadlock avoidance
• Resource allocation and protection
• Global resource sharing
• Communication security
• No examples in general use, but many research systems exist: Amoeba, Chorus, etc. (search Google for “distributed systems research”)

Network Operating System
Loosely-coupled operating system for heterogeneous multi-computers (LAN and WAN). Weak transparency.

NOS: characteristics
• Network Operating System
• Extension of centralized operating systems
• Offers local services to remote clients
• Each processor has its own operating system
• A user owns a machine, but can access others (e.g. rlogin, telnet)
• No global naming of resources
• System has little fault tolerance
• E.g. UNIX, Windows NT, Windows 2000

Middleware System
Additional layer on top of the NOS implementing general-purpose services. Better transparency.

Middleware Examples
• Examples: Sun RPC, CORBA, DCOM, Java RMI (distributed object technology)
• Built on top of the transport layer in the ISO/OSI 7-layer reference model: application (protocol), presentation (semantic), session (dialogue), transport (e.g. TCP or UDP), network (IP, ATM, etc.), data link (frames, checksum), physical (bits and bytes)
• Most are implemented over the internet protocols
• Masks the heterogeneity of underlying networks, hardware, operating systems and programming languages, and so provides a uniform programming model with standard services
• 3 types of middleware:
  • transaction oriented (for distributed database applications)
  • message oriented (for reliable asynchronous communication)
  • remote procedure calls (RPC) – the original OO middleware
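The idea behind RPC-style middleware can be sketched in a few lines: a client stub marshals the procedure name and arguments into a neutral wire format, a transport carries the bytes, and a server dispatcher unmarshals and executes the call. The sketch below is a toy illustration only. It does not use the API of Sun RPC, CORBA, DCOM or Java RMI; JSON as the wire format, the `PROCEDURES` table and the in-process "transport" are all assumptions made for the example.

```python
import json


def add(a, b):
    """Example procedure exported by the hypothetical server."""
    return a + b


# Hypothetical table of procedures the server is willing to execute.
PROCEDURES = {"add": add}


def server_dispatch(request_bytes):
    """Server side: unmarshal the request, run the procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def remote_call(proc, *args, transport=server_dispatch):
    """Client stub: marshal the call, hand it to the transport, unmarshal the reply.

    The transport here is a direct function call; a real system would send
    the bytes over TCP or UDP instead.
    """
    request_bytes = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
    reply_bytes = transport(request_bytes)
    return json.loads(reply_bytes.decode("utf-8"))["result"]


print(remote_call("add", 2, 3))  # 5
```

Because only bytes in an agreed format cross the boundary, the caller does not need to know the server's hardware, operating system or implementation language, which is the heterogeneity-masking role described above.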

Types of communication
Message passing is the general basis of communication in a distributed system: transferring a set of data from a sender to a receiver.

Point-to-point Message passing
• The sender calls the send primitive to pass the message to the sender's buffer
• The communication module transmits the message to the destination
• The destination's communication module puts the message into the receiver's buffer
• The receiver calls the receive primitive to get the message
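The four steps can be imitated inside a single Python process. In this sketch each side has a buffer (a `queue.Queue`), and one background thread stands in for both communication modules, moving messages from the sender's buffer to the receiver's buffer. The class and function names, the `None` stop sentinel and the use of a thread in place of a network are assumptions for illustration.

```python
import queue
import threading


class Endpoint:
    """One side of a point-to-point channel, with its own message buffer."""

    def __init__(self):
        self.buffer = queue.Queue()


def send(sender, message):
    """Send primitive: place the message in the sender's buffer."""
    sender.buffer.put(message)


def communication_module(sender, receiver):
    """Move messages from the sender's buffer to the receiver's buffer.

    Stands in for the network transfer; stops when it sees the None sentinel.
    """
    while True:
        message = sender.buffer.get()
        if message is None:
            break
        receiver.buffer.put(message)


def receive(receiver, timeout=1.0):
    """Receive primitive: block until a message is in the receiver's buffer."""
    return receiver.buffer.get(timeout=timeout)


sender, receiver = Endpoint(), Endpoint()
transport = threading.Thread(target=communication_module, args=(sender, receiver))
transport.start()

send(sender, "hello")
received = receive(receiver)
print(received)  # hello

send(sender, None)  # tell the communication module to stop
transport.join()
```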

Distributed Shared Memory
• Pages of address space distributed among four machines
• Situation after CPU 1 references page 10
• Situation if page 10 is read only and replication is used
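The two situations on this slide, migration of a page to the referencing CPU and replication of a read-only page, can be modelled with a small bookkeeping class. This is a simplified sketch: the `DSM` class, its method names and the initial placement of page 10 on CPU 2 are hypothetical, and no real memory is moved.

```python
class DSM:
    """Toy model of page placement in a distributed shared memory."""

    def __init__(self, placement):
        # placement maps page number -> CPU that initially holds the page.
        self.holders = {page: {cpu} for page, cpu in placement.items()}

    def reference(self, cpu, page, read_only=False):
        """Record that cpu touches page and report what had to happen."""
        holders = self.holders[page]
        if cpu in holders:
            return "local"            # page already present, no traffic
        if read_only:
            holders.add(cpu)          # copy the page, the original stays put
            return "replicated"
        self.holders[page] = {cpu}    # move the page to the referencing CPU
        return "migrated"


# Hypothetical starting point: page 10 lives on CPU 2.
migrating = DSM({10: 2})
print(migrating.reference(1, 10), migrating.holders[10])

replicating = DSM({10: 2})
print(replicating.reference(1, 10, read_only=True), replicating.holders[10])
```

After migration only CPU 1 holds page 10; with read-only replication both CPU 1 and CPU 2 hold a copy, so later reads on either machine are local.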

Comparison between Systems

Client-Server Model

Timing interaction between client and server

Design Issues of Distributed Systems
• Transparency
• Flexibility
• Reliability
• Performance
• Scalability

1. Transparency
• How to achieve the single-system image, i.e., how to make a collection of computers appear as a single computer.
• Hiding the distribution from users as well as from application programs can be achieved at two levels:
  1) hide the distribution from users
  2) at a lower level, make the system look transparent to programs
• Both 1) and 2) require uniform interfaces, such as for file access and communication.

Types of transparency
• Location transparency: users cannot tell where hardware and software resources such as CPUs, printers, files and databases are located.
• Migration transparency: resources must be free to move from one location to another without their names changing.
• Replication transparency: the OS can make additional copies of files and resources without users noticing.
• Concurrency transparency: users are not aware of the existence of other users. Multiple users must be allowed to access the same resource concurrently, with lock and unlock operations providing mutual exclusion.
• Parallelism transparency: automatic use of parallelism without having to program it explicitly. The holy grail for distributed and parallel system designers.
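The lock/unlock mechanism mentioned under concurrency transparency can be illustrated with threads standing in for users. In this sketch the lock is hidden inside the shared resource, so each caller uses it as if it were alone; the `SharedCounter` class and the thread counts are illustrative assumptions.

```python
import threading


class SharedCounter:
    """A shared resource whose locking is hidden from its users."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Lock on entry, unlock on exit: only one user updates at a time.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def worker(counter, times):
    """One 'user' performing repeated updates on the shared resource."""
    for _ in range(times):
        counter.increment()


counter = SharedCounter()
threads = [threading.Thread(target=worker, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000
```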

2. Flexibility
• Make the system easier to change
• Monolithic kernel: system calls are trapped and executed by the kernel. All system calls are served by the kernel, e.g., UNIX.
• Microkernel: provides minimal services (Fig 9-15):
  1) IPC
  2) some memory management
  3) some low-level process management and scheduling
  4) low-level I/O
  E.g., Mach can support multiple file systems and multiple system interfaces.

3. Reliability
• A distributed system should be more reliable than a single system. Example: with 3 machines, each up with probability 0.95, the probability that at least one machine is up is 1 − 0.05^3 = 0.999875 (assuming independent failures).
• Availability: fraction of time the system is usable. Redundancy improves it.
• Need to maintain consistency
• Need to be secure
• Fault tolerance: need to mask failures and recover from errors.
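The reliability example generalizes to any number of machines. The sketch below assumes that machines fail independently and that the system counts as "up" when at least one machine is up; both are simplifying assumptions, and the function name is illustrative.

```python
def prob_at_least_one_up(p_up, machines):
    """Probability that at least one of several independent machines is up."""
    p_all_down = (1 - p_up) ** machines
    return 1 - p_all_down


for machines in (1, 2, 3):
    print(machines, round(prob_at_least_one_up(0.95, machines), 6))
```

Each added machine multiplies the probability of total failure by 0.05, which is why redundancy improves availability so quickly under these assumptions.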
