Distributed Systems


Overview • Definitions • Advantages of distributed systems • Disadvantages of distributed systems • Characteristics of distributed systems • Examples of distributed systems • Challenges arising from the construction of Distributed Systems • Fault tolerant distributed systems • Paradigm shift toward deeply distributed applications

Distributed computing and Distributed system • Early computing was performed on a single processor (centralized computing). • Two advances in technology began to change the situation: • The development of powerful microprocessors • The invention of high-speed computer networks • A distributed system is • One in which components located at networked computers communicate and coordinate their actions only by message passing • A software system running on top of a network • Distributed computing • Is computing performed in a distributed system. • Refers to the use of distributed systems to solve computational problems. • In distributed computing, a problem is divided into many tasks, each of which is solved by one or more components, which communicate with each other by message passing.
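
The idea of dividing a problem into tasks that are solved by components communicating only through messages can be sketched as follows. This is a minimal single-machine illustration, not a real distributed deployment: threads stand in for networked computers, Python's standard `queue.Queue` stands in for the network, and the names `worker` and `distributed_sum` are hypothetical.

```python
import queue
import threading

def worker(inbox, outbox):
    """A component that interacts with others only via its inbox and outbox."""
    while True:
        msg = inbox.get()
        if msg is None:              # sentinel message: shut down
            break
        task_id, numbers = msg
        outbox.put((task_id, sum(numbers)))   # reply with a partial result

def distributed_sum(data, n_workers=3):
    """Split one problem (a sum) into tasks and combine the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(inbox, outbox))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    chunks = [data[i::n_workers] for i in range(n_workers)]
    for task_id, chunk in enumerate(chunks):
        inbox.put((task_id, chunk))           # send one task per message
    partials = [outbox.get() for _ in chunks] # wait for every reply
    for _ in threads:
        inbox.put(None)                       # tell each worker to stop
    for t in threads:
        t.join()
    return sum(value for _, value in partials)

total = distributed_sum(list(range(1, 101)))
```

The coordinator never reads a worker's variables directly; everything it learns arrives as a message, which is the property the definition above emphasizes.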

Advantages of Distributed Systems • The motivation for constructing and using distributed systems stems from a desire to share resources. • Economics: distributed systems allow the pooling of resources, including CPU cycles, data storage, input/output devices, and services. • Distributed systems have a better price/performance ratio. • Concurrency: we can solve a problem more quickly by using several processors concurrently. • Some applications are inherently distributed.

Disadvantages of Distributed Computing • Multiple Points of Failure: the failure of one or more participating computers, or one or more network links, can spell trouble. • Security Concerns: in a distributed system, there are more opportunities for unauthorized attack.

Characteristics of Distributed Systems • Concurrency of components (or Parallel activities) • Autonomous components executing concurrent tasks • Lack of a global clock (only limited precision for processes to synchronize) • No global state: No single process can have knowledge of the current global state of the system

Examples of distributed systems • The following examples are based on familiar and widely used computer networks: the Internet, intranets and the emerging technology of networks based on mobile devices. • World Wide Web • e-mail • file transfer • Programs running on the computers connected to the Internet interact by passing messages. • Sometimes "Web" is incorrectly used to mean the Internet. • By default the terms ‘client’ and ‘server’ refer to processes rather than the computers that they execute upon, although in everyday parlance those terms also refer to the computers themselves. • An executing web browser is an example of a client.

Challenges • Heterogeneity • Heterogeneous components must be able to interoperate • Openness • Interfaces should be publicly available to ease adding new components • Security • The system should only be used in the way intended • Scalability • System should work efficiently with an increasing number of users • System performance should increase with inclusion of additional resources.

Challenges (Cont.) • Fault tolerance and handling • Failure of a component (partial failure) should not result in failure of the whole system • Concurrency • Shared access to resources must be possible • Distribution transparency • Distribution should be hidden from the user as much as possible

Heterogeneity (Cont.) • Middleware: a software layer that provides a programming abstraction as well as masking the heterogeneity of the underlying networks, hardware, operating systems and programming languages. • The Common Object Request Broker Architecture (CORBA, Ch4, 5 and 20) is an example. • Java Remote Method Invocation (RMI, Ch5) is an example of middleware that supports a single programming language.

Security • The resources are accessible to authorized users and used in the way they are intended. • Security for information resources has three components: • Confidentiality • Protection against disclosure to unauthorized individuals. • Integrity • Protection against alteration or corruption. • e.g. changing the account number or amount value in a money order • Availability • Protection against interference with the means to access the resources.

Security (Cont.) • Security mechanisms • Encryption and Authentication • The following two security challenges have not yet been fully met • Denial of service attack. • Achieved by bombarding the service with such a large number of pointless requests that the serious users are not able to use it. • Security of mobile code • e.g. receiving an executable program as an electronic mail attachment.

Fault Tolerant Distributed Systems

Distributed Predicate Detection • Writing correct distributed programs is hard. In spite of extensive testing and debugging, faults persist. • Many distributed systems, especially those employed in safety-critical environments, should be able to operate properly even in the presence of software faults. • Monitoring the execution of distributed systems and detecting failures is an important way to tolerate such faults. This gives rise to the predicate detection problem.

Distributed Predicate Detection (Cont.) • Predicate detection is a fundamental problem in distributed computing. This problem arises in many contexts such as testing and debugging of distributed systems. • Example: the detection of a global predicate arises in implementing the most basic command of a debugging system: "Stop the program when the predicate P is true". • This is a non-trivial task if P requires access to the global state, because no single process can observe that state directly.
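
One way to see why detecting a global predicate is non-trivial is the following small sketch. It assumes each process records its local states together with a vector clock (an assumption of this example; the slides do not prescribe a mechanism), and it checks by brute force whether the predicate held in some consistent global state. The trace, the variable names `x` and `y`, and the function names are all hypothetical.

```python
from itertools import product

# Recorded local states per process: (vector_clock, local_variables).
# Process 0 sends a message at clock (2, 0); process 1 receives it at (2, 2).
TRACE = {
    0: [((1, 0), {"x": 1}), ((2, 0), {"x": 1}), ((3, 0), {"x": 0})],
    1: [((0, 1), {"y": 0}), ((2, 2), {"y": 1})],
}

def is_consistent(cut):
    """A cut is consistent if no process has seen more events of process i
    than process i itself had executed in its chosen local state."""
    n = len(cut)
    return all(cut[j][0][i] <= cut[i][0][i]
               for i in range(n) for j in range(n))

def possibly(trace, predicate):
    """Return the clocks of every consistent global state satisfying the predicate."""
    pids = sorted(trace)
    hits = []
    for cut in product(*(trace[p] for p in pids)):
        if not is_consistent(cut):
            continue                      # this combination could never have occurred
        state = {}
        for _, variables in cut:
            state.update(variables)       # merge local states into one global view
        if predicate(state):
            hits.append(tuple(clock for clock, _ in cut))
    return hits

hits = possibly(TRACE, lambda s: s["x"] == 1 and s["y"] == 1)
```

The combination of process 0 at clock (1, 0) with process 1 at clock (2, 2) also has `x == 1` and `y == 1`, but it is rejected: process 1 would have received a message that process 0 had not yet sent. Enumerating all combinations grows exponentially with the number of processes, which hints at why the problem is hard in general.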

Checkpointing and Logging

Checkpointing and Logging • Message logging: every process must save a copy of every message it sends. • Checkpointing: periodically save the state of the distributed computation to stable storage. If any process fails, all processes are rolled back to the last checkpoint and the computation is restarted from there.
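
The checkpoint-and-rollback cycle described above can be sketched in a few lines. This is a toy, single-machine illustration under stated assumptions: an in-memory dictionary stands in for stable storage, the checkpoint is taken while no messages are in flight, and the `Process` and `Coordinator` classes are hypothetical names.

```python
import copy

class Process:
    """A toy process whose entire state is one dictionary."""
    def __init__(self, name):
        self.name = name
        self.state = {"counter": 0}

    def step(self):
        self.state["counter"] += 1

class Coordinator:
    """Saves and restores the state of all processes together."""
    def __init__(self, processes):
        self.processes = processes
        self.stable_storage = {}          # stand-in for real stable storage

    def checkpoint(self):
        # Deep copies so later steps cannot modify the saved state.
        self.stable_storage = {p.name: copy.deepcopy(p.state)
                               for p in self.processes}

    def rollback(self):
        # On any failure, every process returns to the last checkpoint.
        for p in self.processes:
            p.state = copy.deepcopy(self.stable_storage[p.name])

procs = [Process("p0"), Process("p1")]
coord = Coordinator(procs)
for _ in range(3):
    for p in procs:
        p.step()
coord.checkpoint()                        # both counters are 3 here
procs[0].step()
procs[0].step()
procs[1].step()                           # work done after the checkpoint
coord.rollback()                          # simulate recovery from a failure
counters = [p.state["counter"] for p in procs]
```

Note that the work done after the checkpoint is lost for both processes, even though only one of them is assumed to have failed; this is the cost of rolling every process back together.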

Paradigm shift toward deeply distributed applications • Thirty years ago • a central computing unit was accessed from terminals with a minimum of resources. • With the introduction of personal computers, • there was a tremendous increase in the resources available at the client side • services were tied to the client hardware • During the last ten years connectivity has increased • Thus, applications do not need to be standalone and can benefit from the available connectivity, for additional interaction or just to benefit from extra computational power deployed in data centers • Client machines only act as powerful terminals

Paradigm shift • We may quickly find scenarios that resemble the typical deployment forty years ago but now on a global Internet scale: available anytime, anywhere • elimination of initial investment in hardware needed to deploy a service, • the service can scale and support thousands or millions of users.

Paradigm shift • The infrastructure that supports these services is now an extremely complex distributed system. • We need professionals who are able to design and implement the mechanisms and the software that provide it: this is where PhD and master's students will find their role. • The main problem is scalability • The infrastructure should scale while maintaining the image of an increasingly powerful single computing machine • Cloud Computing and related computing paradigms and concepts like Grid Computing, Utility Computing or Volunteer Computing have been discussed in academia and industry

Cloud Computing • A cloud is an elastic execution environment of resources (computational and storage) involving multiple stakeholders and providing a metered service (pay only for what you use) at multiple granularities for a specified level of quality (of service). • Elasticity: the ability to scale resource usage up and down rapidly according to instantaneous demand.
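
Elasticity can be illustrated with a very small scaling rule. The sketch below is only one possible policy and makes several assumptions not found in the slides: demand is measured in requests per second, every instance has the same fixed capacity, and the function name `desired_instances` and its default limits are hypothetical.

```python
import math

def desired_instances(demand_rps, capacity_rps_per_instance,
                      min_instances=1, max_instances=20):
    """Return how many instances to run for the current demand.

    Scales up and down with demand, but never below min_instances
    or above max_instances.
    """
    needed = math.ceil(demand_rps / capacity_rps_per_instance)
    return max(min_instances, min(max_instances, needed))

# Demand rising and then falling over time (requests per second).
demand_over_time = [50, 950, 5000, 120, 0]
plan = [desired_instances(d, 100) for d in demand_over_time]
```

With a capacity of 100 requests per second per instance, the plan follows demand upward until it reaches the configured maximum and then shrinks back to the minimum when demand disappears.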

Volunteer Computing • Volunteer computing is a type of distributed computing in which computer owners donate their computing resources (such as processing power and storage) to one or more "projects". • The Berkeley Open Infrastructure for Network Computing (BOINC) is an open source middleware system for volunteer and grid computing. • It became useful as a platform for other distributed applications in areas such as mathematics, medicine, biology, climatology, and astrophysics. The intent of BOINC is to make it possible for researchers to tap into the enormous processing power of personal computers around the world. • BOINC is software that can use the unused CPU and GPU cycles on a computer to do scientific computing: whatever capacity an individual does not use on his or her computer, BOINC uses.

Grid Computing • Grid computing is the federation of computer resources from multiple locations to reach a common goal. What distinguishes grid computing from conventional high performance computing systems such as cluster computing is that grids tend to be more loosely coupled, heterogeneous, and geographically distributed. • It was originally driven by scientific applications, which are usually computation-intensive.

Utility Computing • Utility computing is the packaging of computing resources, such as computation, storage and services, as a metered service. This model has the advantage of a low or no initial cost to acquire computer resources; instead, computational resources are essentially rented. • This repackaging of computing services became the foundation of the shift to "on demand" computing, software as a service and cloud computing models that further propagated the idea of computing, application and network as a service.

Cloud Computing • a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources. • Cloud Computing is not a fundamentally new paradigm. It draws on existing technologies and approaches, such as Utility Computing, Software-as-a-Service, distributed computing, and centralized data centers. • What is new is that Cloud Computing combines and integrates these approaches. • While Grid Computing is partly defined by its decentralized control, Cloud Computing seems to be a step back towards centralizing IT in data centers again to economize on scale and scope.

Cloud Computing (Cont.) • The most frequently used pricing model of the presented service providers is pay-per-use, in which the user pays a fixed price per unit used, often per hour, GB, CPU-hour etc. • To make Cloud systems easy to use, a standardized Cloud API will be needed.
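
The pay-per-use model amounts to multiplying each metered quantity by its unit price and adding the results. The sketch below shows that arithmetic; the unit names and prices are invented for illustration and do not correspond to any real provider.

```python
from decimal import Decimal

# Hypothetical unit prices, chosen only for this example.
PRICES = {
    "cpu_hour": Decimal("0.10"),   # price per CPU-hour
    "gb_month": Decimal("0.02"),   # price per GB stored for a month
}

def bill(usage):
    """Return the total charge for a mapping of unit name to quantity used."""
    return sum((Decimal(str(quantity)) * PRICES[unit]
                for unit, quantity in usage.items()),
               Decimal("0"))

# One CPU running for a 30-day month (720 hours) plus 50 GB of storage.
total_cost = bill({"cpu_hour": 720, "gb_month": 50})
```

`Decimal` is used instead of floating point so that monetary amounts are not affected by binary rounding. A user who consumes nothing pays nothing, which is the contrast with an up-front hardware investment.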

Cloud Computing (Cont.) • Clouds also require new business models, especially with respect to the licensing of software. This has already become an issue in Grids. • Current licenses are bound to a single user or physical machine. • Usage-based licenses, penalties and pricing. • More sophisticated technologies for service composition and ad-hoc creation of situational applications are needed. • The hype about cloud computing accompanies the Software as a Service (SaaS) wave. Services built on top of Cloud infrastructures enable software providers to offer products at lower cost and simultaneously with a higher degree of customization.

Cloud Computing (Cont.) • Another crucial point for the eventual acceptance of Cloud technology in business will be the safety of critical data, both in transfer and in storage. • Reasons for this include foreign laws, which might allow foreign governments to access the data, and domestic insurance contracts, which may require the data to be stored only in certain regions. • How will the large Cloud vendors tackle this concern?

Mobile and ubiquitous computing • The integration of small and portable computing devices into distributed systems. These devices include: Laptop computers, Handheld devices (PDAs, mobile phones, pagers, video cameras and digital cameras), smart watches, devices embedded in appliances such as cars, washing machines. • Ubiquitous is intended to mean that small computing devices will eventually become so pervasive in everyday objects that they are scarcely noticed.

Related Topics • Performance analysis of distributed systems • Load balancing in clouds • Distributed AI • Distributed information retrieval • Distributed evolutionary algorithms
