EEC-681/781 Distributed Computing Systems

EEC-681/781 Distributed Computing Systems
Lecture 4
Wenbing Zhao
Department of Electrical and Computer Engineering
Cleveland State University
wenbing@ieee.org

Outline
• Architectural models
• Server load balancing
• End-to-end arguments in system design
• Inter-process communications (part 1)
EEC-681: Distributed Computing Systems

Architectural Models
• Client-server model
• Peer-to-peer model

Basic Client–Server Model

Servers
• Servers: generally provide services related to a shared resource:
  • Servers for file systems, databases, implementation repositories, etc.
  • Servers for shared, linked documents
  • Servers for shared applications
  • Servers for shared distributed objects

Clients
• Clients: allow remote service access
  • Programming interface transforming client’s local service calls to request/reply messages
  • Devices with (relatively simple) digital components (barcode readers, teller machines, hand-held phones)
  • Computers providing independent user interfaces for specific services
  • Computers providing an integrated user interface for related services (compound documents)

Application Layering
• Traditional three-layered view:
  • User-interface layer: contains the units for an application’s user interface
  • Processing layer: contains the functions of an application, i.e., the logic without any specific data
  • Data layer: contains the data that a client wants to manipulate through the application components

Multitiered Architecture

Horizontal Distribution

Modern Architectures
• Vertical distribution
  • According to application logic
• Horizontal distribution
  • Scalability and availability
• In practice, a system is distributed in both directions

Peer-to-Peer Model
• A different way to construct client-server systems, where most or all of the server functionality resides on the clients themselves
• Advantages
  • Scalable to very large numbers (millions)
  • Stable under very high load
  • Self-repairing when disruptive failures occur

Why Server Load Balancing?
• Scalable: upgrade out instead of up
• Easy to grow: spread your load across your existing low-cost servers as your needs grow
• HA: can work closely with High Availability solutions to add and remove servers as necessary
Slides on server load balancing taken from Dustin Puryear’s talk on “LVS – Load Balancing and High Availability for Free” at USENIX LISA 2003

SLB Layout

SLB Definitions
• Virtual IP (VIP): the IP used by clients to access a service
• Load Balancer (LB) or Server Load Balancer (SLB): the server that balances packets going to or from the servers providing the service
• Real Server: the server providing the actual service (e.g., the server running Apache)
• Real IP (RIP): the IP of each real server
• Schedule: how the SLB determines which real server gets the next connection
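
As an illustration of the "schedule" definition, here is a minimal sketch of one possible schedule, plain round-robin. The slides do not name a specific schedule, so this choice is an assumption; the class name `RoundRobinSchedule` and the RIPs (taken from the RFC 5737 documentation range) are hypothetical.

```python
from itertools import cycle


class RoundRobinSchedule:
    """Hand out real-server IPs (RIPs) in a fixed rotating order."""

    def __init__(self, real_ips):
        real_ips = list(real_ips)
        if not real_ips:
            raise ValueError("at least one real server is required")
        self._rotation = cycle(real_ips)

    def next_server(self):
        """Return the RIP that should receive the next connection."""
        return next(self._rotation)


# Hypothetical RIPs, for illustration only.
schedule = RoundRobinSchedule(["192.0.2.11", "192.0.2.12", "192.0.2.13"])
picks = [schedule.next_server() for _ in range(4)]
print(picks)  # the fourth pick wraps around to the first server
```

Round-robin ignores how busy each real server is; a production SLB would typically also offer weighted or least-connection schedules.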

SLB Transaction Walkthrough
• Client establishes a TCP connection with the SLB.
• SLB determines which real server is ready for a connection using the configured schedule, and creates a connection to that server.
• Client sends HTTP request.
• SLB forwards HTTP request to real server.
• Real server responds back to... whom?

So How Exactly Does the Real Server Respond?
• There are two traditional methods for solving this problem:
  • Network Address Translation (NAT)
  • Direct Routing/Direct Server Return

NAT-Based SLB

NAT-Based SLB
• SLB sits between the client and real servers for all incoming and outgoing packets, just like a router that does NAT
• The SLB can rewrite the IP packet in one of two ways:
  • Half-NAT: load balancer rewrites the destination IP
  • Full-NAT: load balancer rewrites the source and destination IP
• The VIP is only assigned on the load balancer

Half-NAT
• SLB gets packet from client
• SLB changes the destination IP in the header of the packet
• SLB forwards packet to real server
• Major benefit of this method is that the real server knows the real IP of the client
  • The real server can properly log client IPs
  • The real server can apply better security policies based on the IP address of the client

Full-NAT
• SLB gets packet from client.
• SLB changes the destination IP and source IP in the header of the packet.
• SLB forwards packet to real server.
• Major drawback to this technique:
  • The real server has no idea who sent the original request
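
The difference between the two rewrites can be sketched on a toy packet. This is a simplified model, not how any particular SLB is implemented: the `Packet` type, the function names, and all addresses are hypothetical, and the sketch assumes that Full-NAT replaces the source with an address owned by the SLB.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    """Toy IP packet: only the fields the SLB rewrites, plus a payload."""
    src_ip: str
    dst_ip: str
    payload: bytes


# Hypothetical addresses, for illustration only.
VIP = "203.0.113.10"    # virtual IP that clients connect to
SLB_IP = "192.0.2.1"    # SLB's own address on the server-side network
RIP = "192.0.2.11"      # real server chosen by the schedule


def half_nat(packet, real_ip):
    """Rewrite only the destination; the client's source IP is preserved."""
    return replace(packet, dst_ip=real_ip)


def full_nat(packet, real_ip, slb_ip):
    """Rewrite destination and source; the real server sees only the SLB."""
    return replace(packet, src_ip=slb_ip, dst_ip=real_ip)


incoming = Packet(src_ip="198.51.100.7", dst_ip=VIP, payload=b"GET / HTTP/1.0\r\n\r\n")
half = half_nat(incoming, RIP)
full = full_nat(incoming, RIP, SLB_IP)
print(half.src_ip, "->", half.dst_ip)  # client IP still visible to the real server
print(full.src_ip, "->", full.dst_ip)  # client IP hidden behind the SLB
```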

Direct Routing/Direct Server Return Layer 2 LB

Direct Routing/Direct Server Return
• SLB gets packet from client
• SLB rewrites the destination MAC address in the frame header; the destination IP stays the VIP
• SLB forwards packet to real server, which also has the VIP configured
• Real server responds directly to the client with the VIP as the source IP
• There are benefits and costs to this approach
  • Responses do not have to be pushed back through the SLB, which can decrease latency
  • How do you handle ARP if multiple devices have the same IP (the VIP)?

End-to-End Arguments in System Design
• Concerned with the placement of functions among the modules of a distributed system
• Layered systems are very common
  • Network protocols and middleware
• Claim
  • Functions placed at low levels of a system may be redundant or of little use
  • The argument favors moving a function upward in a layered system, closer to the application that uses the function

Careful File Transfer
• Objective: move a file from computer A’s storage to computer B’s storage
• Application program: file transfer program

Careful File Transfer
• Fault-free steps of file transfer
  • A reads the file from its disk
  • A passes the file to its networking protocol stack, which packetizes it
  • Communication network moves packets from A to B
  • At B, B’s networking protocol stack delivers the transmitted file to the file transfer program running in B
  • B’s file transfer program asks its file system to write the file to B’s disk

Careful File Transfer
• What can go wrong in each step
  • File corrupted due to disk error
  • Software bugs in buffering and copying the file, in A or B
  • Transient processor or memory error during buffering or copying
  • Communication system error, e.g., dropped or changed bits, or duplicate delivery
  • Either host might crash

Careful File Transfer
• How to cope with the threats
  • Reinforce each step along the way using time-out and retry, error detection, crash recovery, etc.
  • For example, doing everything three times

End-to-End Check and Retry
• Assumes a low probability of the threats
• As a final additional step
  • B reads back the file from disk into memory
  • B recalculates the checksum
  • B sends this value back to A
  • A compares the value with the checksum of the original file
• If the two checksums agree, the file transfer completes. Otherwise, retry from the beginning
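
The check-and-retry steps above can be sketched in a single process. This is a simplified illustration under stated assumptions: "A" and "B" are just two file paths on one machine, SHA-256 stands in for the checksum, and `careful_transfer`, `file_checksum`, and `flaky_copy` are hypothetical names. The `flaky_copy` helper deliberately corrupts the first attempt to exercise the retry path.

```python
import hashlib
import os
import shutil
import tempfile


def file_checksum(path):
    """Read the file back from disk and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def careful_transfer(src, dst, transfer=shutil.copyfile, max_attempts=3):
    """Transfer src to dst, verify end to end, and retry from the beginning
    on a checksum mismatch. Returns the number of attempts used."""
    expected = file_checksum(src)             # A's checksum of the original
    for attempt in range(1, max_attempts + 1):
        transfer(src, dst)
        if file_checksum(dst) == expected:    # B's read-back checksum agrees
            return attempt
    raise OSError(f"checksums still differ after {max_attempts} attempts")


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "a.bin")
    dst = os.path.join(tmp, "b.bin")
    with open(src, "wb") as f:
        f.write(b"distributed systems" * 1000)

    calls = {"n": 0}

    def flaky_copy(s, d):
        """Copy the file, but corrupt the first byte on the first attempt."""
        calls["n"] += 1
        shutil.copyfile(s, d)
        if calls["n"] == 1:
            with open(d, "r+b") as f:
                f.write(b"X")

    attempts = careful_transfer(src, dst, transfer=flaky_copy)
    print("attempts needed:", attempts)
```

Note that the check covers every step between the two disks, not just the network hop, which is the point of placing it at the application level.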

Is Reliable Communication Useful?
• The communication system can be made reliable with packet checksums, sequence numbers, internal retry, etc.
• Is it enough to ensure correct file transfer?
  • No. Other threats can still corrupt the file
  • The extra end-to-end check and retry is still necessary
• Usefulness: reduces the frequency of retries

Is Reliable Communication Useful?
• Conclusion
  • Having the communication system go out of its way to be extraordinarily reliable does not reduce the burden on the application program to ensure reliability

Performance Aspects
• We cannot conclude that lower levels should play no part in obtaining reliability
• If the file is long, retrying the transfer of the whole file is too expensive
• Providing reliable communication can significantly improve performance
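
A back-of-the-envelope sketch of why whole-file retry becomes expensive for long files. It assumes, purely for illustration, that each packet is corrupted or lost independently with the same probability and that the lower layers do no retry of their own, so any single bad packet forces the end-to-end retry. The function name and the error rate are hypothetical.

```python
def expected_attempts(packet_error_rate, packets_in_file):
    """Expected number of whole-file transfers until one succeeds.

    A transfer succeeds only if every packet arrives intact, so the success
    probability is (1 - p) ** n, and the number of attempts follows a
    geometric distribution with mean 1 / success probability.
    """
    if not 0.0 <= packet_error_rate < 1.0:
        raise ValueError("packet_error_rate must be in [0, 1)")
    success = (1.0 - packet_error_rate) ** packets_in_file
    return 1.0 / success


# Hypothetical error rate of one bad packet in ten thousand.
for n in (10, 1_000, 100_000):
    print(n, "packets:", expected_attempts(1e-4, n), "expected attempts")
```

Under these assumptions the expected number of attempts grows exponentially with file length, which is why per-packet retry at a lower layer is a worthwhile performance optimization even though it is not needed for correctness.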

Performance Aspects
• Putting reliability measures in a lower level is
  • An engineering trade-off based on performance
  • Not a requirement for correctness
• Must be careful: it might not always improve performance
  • Other applications might not need the extra functionality
  • The lower level might not have enough information to operate efficiently

Inter-Process Communications
• Techniques:
  • Shared memory
  • Message passing
• Objectives:
  • Data exchange
  • Synchronization: processes on different hosts, executing at different rates, need to influence the overall execution pattern, which imposes constraints on the order of events

The OSI Network Architecture

Low-level Layers
• Physical layer: contains the specification and implementation of bits, and their transmission between sender and receiver
• Data link layer: prescribes how a series of bits is grouped into a frame for transmission, to allow for error and flow control
• Network layer: describes how packets in a network of computers are to be routed

Transport Layer
• The transport layer provides the actual communication facilities for most distributed systems.
  • TCP: connection-oriented, reliable, stream-oriented communication
  • UDP: unreliable (best-effort) datagram communication
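
The two transport services can be contrasted with Python's standard `socket` module. This is a minimal single-process sketch over the loopback interface, where datagram loss is not expected in practice; on a real network the UDP datagram could be lost, while TCP would retransmit. The variable names and payloads are illustrative.

```python
import socket

LOOPBACK = "127.0.0.1"

# UDP: connectionless, each sendto() is one self-contained datagram.
udp_receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_receiver.bind((LOOPBACK, 0))            # port 0: let the OS pick a free port
udp_receiver.settimeout(2.0)
udp_sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_sender.sendto(b"datagram", udp_receiver.getsockname())
udp_data, _ = udp_receiver.recvfrom(1024)
udp_sender.close()
udp_receiver.close()

# TCP: a connection is set up first, then bytes flow as an ordered stream.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind((LOOPBACK, 0))
listener.listen(1)
listener.settimeout(2.0)
client = socket.create_connection(listener.getsockname(), timeout=2.0)
conn, _ = listener.accept()
conn.settimeout(2.0)

client.sendall(b"byte ")                    # two separate writes...
client.sendall(b"stream")
client.shutdown(socket.SHUT_WR)             # signal end of stream to the reader

chunks = []
while True:                                 # ...read back as one byte stream
    chunk = conn.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
tcp_data = b"".join(chunks)

conn.close()
client.close()
listener.close()

print(udp_data, tcp_data)
```

The TCP reader loops until end of stream because TCP does not preserve the boundaries of the individual `sendall()` calls.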

Application Layer
• Many application protocols are directly implemented on top of transport protocols that do a lot of application-independent work

Message Layout

Middleware Layer
• Middleware was invented to provide common services and protocols that can be used by many different applications:
  • A rich set of communication protocols
  • Marshaling and unmarshaling of data
  • Naming protocols
  • Security protocols
  • Scaling mechanisms
• What remains are truly application-specific protocols
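
To make "marshaling and unmarshaling of data" concrete, here is a minimal sketch using Python's standard `struct` module. The wire format is invented for illustration and is not taken from any particular middleware: a 4-byte request id and a 2-byte length in network byte order, followed by a UTF-8 operation name.

```python
import struct

# Hypothetical header: unsigned 32-bit request id, unsigned 16-bit body length.
# "!" selects network (big-endian) byte order with no padding.
HEADER = struct.Struct("!IH")


def marshal(request_id, operation):
    """Flatten a (request id, operation name) pair into bytes for the wire."""
    body = operation.encode("utf-8")
    return HEADER.pack(request_id, len(body)) + body


def unmarshal(data):
    """Rebuild the (request id, operation name) pair from received bytes."""
    request_id, length = HEADER.unpack_from(data)
    body = data[HEADER.size:HEADER.size + length]
    return request_id, body.decode("utf-8")


wire = marshal(42, "read_file")
print(wire)
print(unmarshal(wire))
```

Fixing the byte order in the format is what lets hosts with different native representations agree on the meaning of the bytes.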

An Adapted Reference Model with Middleware Layer
