ITEC801 Distributed Systems

Goals of a Distributed System

Important Things
• Things to get from this set of slides:
  • What are some of the possible goals to aim for when designing a distributed system
  • Some understanding of each one
  • Some understanding of why achieving some goals might make it harder to achieve others

Goals
• openness
• transparency
• flexibility
• reliability
• performance
• scalability

Openness
• openness: offering services according to standard rules that describe the syntax and semantics of those services
  • example: network protocol rules
• in distributed systems, services are specified through interfaces
  • e.g., an interface definition language (IDL)
  • IDL: specifies the names of the available functions together with the types of their parameters, return values, possible exceptions that can be raised, and so on
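
As a rough analogue of what an IDL specification captures, the sketch below uses Python's `typing.Protocol` to state function names, parameter types, return types, and a declared exception, separately from any implementation. The file service and all names in it (`FileService`, `FileNotFound`, `InMemoryFileService`, `copy`) are hypothetical and chosen only for illustration; a real IDL would also be language-neutral, which Python type hints are not.

```python
from typing import Protocol


class FileNotFound(Exception):
    """Exception the interface declares for a missing file (hypothetical)."""


class FileService(Protocol):
    """Interface only: function names, parameter types, return types."""

    def read(self, name: str) -> bytes:
        """Return the file's contents; raises FileNotFound if absent."""
        ...

    def write(self, name: str, data: bytes) -> int:
        """Store data under name; return the number of bytes written."""
        ...


class InMemoryFileService:
    """One possible implementation; clients depend only on FileService."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def read(self, name: str) -> bytes:
        if name not in self._files:
            raise FileNotFound(name)
        return self._files[name]

    def write(self, name: str, data: bytes) -> int:
        self._files[name] = data
        return len(data)


def copy(service: FileService, src: str, dst: str) -> int:
    """Client code written against the interface, not an implementation."""
    return service.write(dst, service.read(src))
```

Because `copy` relies only on the interface, any other implementation with the same signatures could be substituted without changing the client.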

Openness
• interface specifications must be complete and neutral
• interface specifications must support:
  • interoperability
  • portability
• openness must permit flexibility
• achieving flexibility:
  • system composed of small, easily replaceable components
  • separation of policy and mechanism
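
One way to picture the separation of policy and mechanism is a cache whose storage and eviction machinery (the mechanism) is fixed while the choice of which entry to evict (the policy) is supplied from outside. This is a minimal sketch; the cache, its names, and the two policies are illustrative assumptions, and it assumes a capacity of at least one.

```python
from typing import Callable, Dict

# A policy decides *which* key to evict, given each key's last-use time.
Policy = Callable[[Dict[str, int]], str]


def evict_least_recently_used(last_used: Dict[str, int]) -> str:
    return min(last_used, key=last_used.__getitem__)


def evict_most_recently_used(last_used: Dict[str, int]) -> str:
    return max(last_used, key=last_used.__getitem__)


class Cache:
    """Mechanism: stores entries and evicts when full. Policy is injected."""

    def __init__(self, capacity: int, policy: Policy) -> None:
        self.capacity = capacity
        self.policy = policy
        self.data: Dict[str, str] = {}
        self.last_used: Dict[str, int] = {}
        self.clock = 0  # logical counter, not wall-clock time

    def _touch(self, key: str) -> None:
        self.clock += 1
        self.last_used[key] = self.clock

    def get(self, key: str) -> str:
        value = self.data[key]  # raises KeyError if the key is absent
        self._touch(key)
        return value

    def put(self, key: str, value: str) -> None:
        if key not in self.data and len(self.data) >= self.capacity:
            victim = self.policy(self.last_used)  # the policy chooses
            del self.data[victim]                 # the mechanism evicts
            del self.last_used[victim]
        self.data[key] = value
        self._touch(key)
```

Swapping `evict_least_recently_used` for `evict_most_recently_used` changes the behaviour without touching the `Cache` class, which is the kind of replaceability the slide asks for.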

Transparency
• Making the user believe that there is only a single, undivided system
  • i.e., not distributed
• Of course, it is not quite as simple as that
• Perhaps more accurate is:
  • Trying to make the user less aware that there is not a single, undivided system

Transparency
• transparency can be achieved at different levels
  • hide the distribution from the users
  • hide the distribution from programs
  • the latter is harder
• what does transparency really mean?
  • there are different kinds of transparency

Transparency
• Location Transparency: The users cannot tell where the resources are located
• Migration Transparency: Resources can move at will without changing their names
• Replication Transparency: The users cannot tell how many copies exist
• Concurrency Transparency: Multiple users can share resources automatically

Transparency
• Parallelism Transparency: Activities can happen in parallel without the users knowing
• Failure Transparency: Hide from the users the failure and recovery of a resource
• Persistence Transparency: Hide from the users whether a (software) resource is in memory or on disk

Tradeoff
• A high degree of transparency versus the performance of the system.

Flexibility
• it is important that distributed systems be flexible, as we are still learning about them - so any system may need to be changed
• while the need for flexibility may seem self-evident, the best means of achieving it are open to discussion
• no-one is going to argue for an inflexible system, but...
• just what does it mean to be flexible?

Flexibility
• Easily changed?
• Highly functional?
• Broadly useable?

Reliability
• one of the original ideas behind distributed systems was that if one machine went down another could do its job
• given that there are many machines, the chance of them all being unavailable is drastically less than that of a single machine going down
• so distributed systems should be more reliable
• unfortunately, again it is not that simple

More Points of Failure
• if a service depends upon a number of specific machines being available, then reliability may be less than in a non-distributed system
• "a distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed" (Leslie Lamport)
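
The two cases can be compared with a small calculation. The sketch below assumes every machine is available independently with the same probability `a`, which is a simplification (real failures are often correlated), and the function names are illustrative. If any one of `n` machines can do the job, the service is down only when all are down; if all `n` specific machines are needed, the service is down whenever any one is.

```python
def availability_any(a: float, n: int) -> float:
    """Service works if at least one of n machines is up: 1 - (1 - a)^n."""
    return 1.0 - (1.0 - a) ** n


def availability_all(a: float, n: int) -> float:
    """Service works only if all n machines are up: a^n."""
    return a ** n


if __name__ == "__main__":
    a, n = 0.9, 3
    print(f"any one of {n} suffices: {availability_any(a, n):.3f}")
    print(f"all {n} are required:    {availability_all(a, n):.3f}")
```

With three machines that are each up 90% of the time, the redundant design is available about 99.9% of the time, while the design that needs all three is available only about 72.9% of the time, which is worse than a single machine.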

Availability
• a highly reliable system is one in which the services are highly available
• availability can be enhanced by a design which does not depend on the simultaneous functioning of a substantial number of critical components

Availability, etc
• availability also applies to data - it must be available and consistent
• security is another aspect of reliability
• yet another is fault tolerance - specifically, what happens, in the case of crashes, to the information in a server, or to a communication if either or both ends crash

Performance
• it is said that users buy only two things - upward compatibility and performance
• a system that is flexible, transparent, and reliable will not be used if it is slower than a snail on barbiturates
• an application running on a distributed system should not be noticeably slower than on a uni-processor system

Performance Measures
• there are various measures of performance
  • response time
  • throughput
  • system utilisation
  • network capacity used
• performance in a distributed system is obviously affected by communication, especially protocol handling at each end

Scalability
• Scalable System: a system that can handle additional users/resources without suffering a noticeable loss of performance or increase in administrative complexity.
• Three metrics of a scalable system:
  • Number of users/objects that are part of the system.
  • Distance between the farthest nodes in the system.
  • Number of organisations that exert administrative control over pieces of the system.

Scalability Examples
• Issues w.r.t. size
  • Limitations of centralised services, data, and algorithms.
• Centralised Services: a single server for all users.
  • Often necessary.
• Centralised Data: a single database repository.
  • Saturation of communication lines.
  • DNS scalability problems.
• Centralised Algorithms: doing routing based on complete information.

Decentralised Algorithms
• decentralised algorithms have the following characteristics
  • no machine has complete information about the system state
  • machines make decisions based only on locally available data
  • failure of one machine does not abort execution of the algorithm
  • there is no assumption of a global system clock
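
These characteristics can be seen in a small simulation. In the sketch below, nodes on a ring each try to learn the largest value held by any live node, but each node only ever reads its two neighbours' current estimates. Nodes are activated one at a time rather than in lock-step, and a crashed node is simply skipped by its neighbours. The ring topology, the `Node` class, and the shared `network` dictionary (a stand-in for message passing) are all assumptions made for illustration.

```python
from typing import Dict, List, Set


class Node:
    """Holds only local state: its own estimate and its neighbours' ids."""

    def __init__(self, node_id: int, value: int, neighbours: List[int]) -> None:
        self.node_id = node_id
        self.estimate = value
        self.neighbours = neighbours

    def step(self, network: Dict[int, "Node"], crashed: Set[int]) -> None:
        """Pull estimates from reachable neighbours and keep the largest."""
        for n in self.neighbours:
            if n not in crashed:  # an unreachable neighbour is skipped
                self.estimate = max(self.estimate, network[n].estimate)


def make_ring(values: List[int]) -> Dict[int, Node]:
    """Build a ring in which node i knows only nodes i-1 and i+1."""
    size = len(values)
    return {
        i: Node(i, v, [(i - 1) % size, (i + 1) % size])
        for i, v in enumerate(values)
    }


def run(network: Dict[int, Node], crashed: Set[int], sweeps: int) -> None:
    """Activate live nodes one at a time; no node sees the whole system."""
    for _ in range(sweeps):
        for node_id, node in network.items():
            if node_id not in crashed:
                node.step(network, crashed)


if __name__ == "__main__":
    net = make_ring([3, 9, 4, 1, 7])
    run(net, crashed={2}, sweeps=len(net))
    print({i: n.estimate for i, n in net.items() if i != 2})
```

Even with node 2 crashed, the remaining nodes still agree on the largest value among them, because the value spreads hop by hop through local exchanges.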

Scalability Issues
• Other issues with geographical scalability
  • Problems due to synchronous communication.
  • Unreliable WANs.
  • Geographical scalability versus centralised solutions.
• Scaling the system across multiple independent administrative domains.
  • Conflicting policies w.r.t. resource usage (payment), management and security.

Scaling Techniques
• Hiding communication latencies: promoting asynchronous communication.
  • Reduce overall communication for interactive applications.
• Distribution: allows information maintained by a distributed service to be spread across multiple servers.
  • Example: DNS zones.
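
A rough illustration of hiding communication latency with asynchronous requests: instead of waiting for each reply before sending the next request, the client issues all requests first and then waits, so the delays overlap. The `fetch` function is a hypothetical stand-in for a remote call, with `asyncio.sleep` simulating network delay.

```python
import asyncio
import time
from typing import List, Tuple


async def fetch(server: str, latency: float) -> str:
    """Stand-in for a remote call; asyncio.sleep simulates network delay."""
    await asyncio.sleep(latency)
    return f"reply from {server}"


async def fetch_all(servers: List[str], latency: float) -> List[str]:
    """Issue every request before waiting, so the delays overlap."""
    return list(await asyncio.gather(*(fetch(s, latency) for s in servers)))


def timed_fetch(servers: List[str], latency: float) -> Tuple[List[str], float]:
    """Run the concurrent fetch and report the elapsed wall-clock time."""
    start = time.perf_counter()
    replies = asyncio.run(fetch_all(servers, latency))
    return replies, time.perf_counter() - start


if __name__ == "__main__":
    replies, elapsed = timed_fetch(["a", "b", "c"], latency=0.1)
    print(replies, f"{elapsed:.2f}s")
```

Three requests with a simulated delay of 0.1 s each complete in roughly the time of one, rather than the 0.3 s a strictly synchronous client would need.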

Scaling & Distribution Issues
• Placement of servers
  • information held close to users who often access it
  • names of nearby objects obtained from local name servers
• Finding the right server.
  • Mounts [Sun NFS, Locus, Plan 9]
  • Broadcast [Sprite]
  • Domain-based queries [Grapevine, X.500]
• Replication: having multiple instances or copies of components across the distributed system
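
The idea behind domain-based queries and zone-style distribution can be sketched as a lookup that picks the server responsible for the most specific zone matching a name. This is a deliberately simplified model: real DNS resolution follows referrals from server to server rather than consulting one table, and the zone names and server names below are made up for illustration.

```python
from typing import Dict

# Hypothetical zone table: each zone suffix maps to the responsible server.
# The empty suffix stands for the root of the name space.
ZONES: Dict[str, str] = {
    "": "root-server",
    "edu": "edu-server",
    "example.edu": "example-server",
    "cs.example.edu": "cs-server",
}


def responsible_server(name: str, zones: Dict[str, str]) -> str:
    """Return the server for the longest zone suffix that matches name."""
    labels = name.split(".")
    # Try the full name first, then drop leading labels one at a time.
    for start in range(len(labels) + 1):
        suffix = ".".join(labels[start:])
        if suffix in zones:
            return zones[suffix]
    raise LookupError(f"no zone covers {name!r}")


if __name__ == "__main__":
    for host in ("www.cs.example.edu", "lib.example.edu", "other.org"):
        print(host, "->", responsible_server(host, ZONES))
```

Because each server holds only its own zone, no single machine needs the whole name space, and queries about nearby names can be answered by a nearby server.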

Replication Issues
• placement of replicas: depends on the purpose for replicating the resource.
  • Case One: improving availability in case of network partitions/reducing network delays: replicas scattered.
  • Case Two: users local, improving reliability and availability: replicas placed close to each other.
• note: placement of replicas affects the choice of mechanism that maintains consistency of replicas

Replication Issues
• consistency
  • A replicated object is logically a single object.
  • At any given instant of time, replicas must be consistent.

Replication techniques
• replication of read-only information
• replication of immutable information
• replication of other information
  • updates sent to all replicas: ordering of updates essential.
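
Why ordering matters can be shown with a minimal sketch in which every update carries a sequence number and each replica applies updates strictly in sequence order, buffering any that arrive early. The sequence numbers are assumed to be assigned by some single source (for example a sequencer); that component, and the `Replica` class itself, are assumptions for illustration rather than a technique the slides prescribe.

```python
from typing import Dict, Tuple

Update = Tuple[int, str, str]  # (sequence number, key, value)


class Replica:
    """Applies updates in sequence order, whatever order they arrive in."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {}
        self.next_seq = 0  # sequence number of the next update to apply
        self.pending: Dict[int, Tuple[str, str]] = {}  # early arrivals

    def receive(self, update: Update) -> None:
        seq, key, value = update
        if seq < self.next_seq:
            return  # already applied; ignore the duplicate
        self.pending[seq] = (key, value)
        # Apply as many consecutive buffered updates as possible.
        while self.next_seq in self.pending:
            k, v = self.pending.pop(self.next_seq)
            self.state[k] = v
            self.next_seq += 1


if __name__ == "__main__":
    updates = [(0, "x", "1"), (1, "x", "2"), (2, "y", "3")]
    a, b = Replica(), Replica()
    for u in updates:
        a.receive(u)
    for u in reversed(updates):  # same updates, opposite arrival order
        b.receive(u)
    print(a.state == b.state, a.state)
```

Without the ordering step, replica `b` would finish with `x` set to `"1"` while `a` holds `"2"`, so the two copies would diverge.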

Scaling Techniques
• inconsistencies tolerated depending on the usage of a resource.
  • This observation is exploited by Grapevine, allowing it to guarantee only loose consistency.
  • Updates allowed even when the network is partitioned.
  • Conflicting updates resolved using timestamps.
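
Timestamp-based conflict resolution can be sketched as a "last writer wins" rule: each replica accepts writes independently, and when replicas exchange state the entry with the larger timestamp is kept. This is a generic illustration, not a description of Grapevine's actual implementation; the class name, the tie-break on replica id, and the assumption that timestamps from different replicas are comparable are all choices made for the example.

```python
from typing import Dict, Tuple

Versioned = Tuple[int, str, str]  # (timestamp, replica id, value)


class LwwReplica:
    """Accepts writes locally; merges by keeping the newest entry per key."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.entries: Dict[str, Versioned] = {}

    def _accept(self, key: str, candidate: Versioned) -> None:
        current = self.entries.get(key)
        # Compare (timestamp, replica id); the id breaks timestamp ties.
        if current is None or candidate[:2] > current[:2]:
            self.entries[key] = candidate

    def write(self, key: str, value: str, timestamp: int) -> None:
        self._accept(key, (timestamp, self.replica_id, value))

    def merge_from(self, other: "LwwReplica") -> None:
        """Pull the other replica's entries, e.g. after a partition heals."""
        for key, candidate in other.entries.items():
            self._accept(key, candidate)

    def read(self, key: str) -> str:
        return self.entries[key][2]


if __name__ == "__main__":
    a, b = LwwReplica("a"), LwwReplica("b")
    a.write("inbox", "server-1", timestamp=10)  # written during a partition
    b.write("inbox", "server-2", timestamp=12)  # conflicting write elsewhere
    a.merge_from(b)
    b.merge_from(a)
    print(a.read("inbox"), b.read("inbox"))
```

Both replicas end up with the later write, at the cost of silently discarding the earlier one, which is the kind of loose consistency the slide describes.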

Conclusions
• unfortunately these design considerations often conflict
• for example:
  • fault tolerance exacts its price on performance
  • making a flexible system reliable is not trivial
• however, none of these considerations can really be sacrificed for the others
