
Scalability & Availability

This article discusses the importance of scalability and availability in building real systems, including strategies for scaling up and out, achieving high availability, and optimizing performance.

Presentation Transcript


  1. Scalability & Availability Paul Greenfield CSIRO

  2. Building Real Systems • Scalable • Fast enough to handle expected load • Grow easily when load grows • Available • Available enough of the time • Performance and availability cost • Aim for ‘enough’ of each but not more

  3. Scalable • Scale-up • Bigger and faster systems • Scale-out • Multiple systems working together to handle the load • Server farms • Clusters • Implications for application design

  4. Available • Goal is 100% availability • 24x7 operations • Redundancy is the key • No single points of failure • Spare everything • Disks, disk channels, processors, power supplies, fans, memory, .. • Automated fail-over and recovery

  5. Performance • How fast is this system? • Not the same as scalability but related • Scalability is concerned with the limits to possible performance • Measured by response time and throughput • Aim for enough performance • Have a performance target • Tune and add hardware until target hit • Then worry about tomorrow…

  6. Performance Measures • Response time • What delay does the user see? • Instantaneous is good but 95% under 2 seconds is acceptable • Response time varies with ‘heaviness’ of transactions • Fast read-only transactions • Slower update transactions • Effects of database contention
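
As a rough illustration of the percentile target above, here is a minimal sketch (the latency samples are invented) that computes a 95th-percentile response time with the nearest-rank method and compares it to the 2-second goal:

    # Sketch: 95th-percentile response time from measured samples.
    # The latency values below are invented for illustration only.
    latencies = [0.21, 0.35, 0.42, 0.58, 0.61, 0.77, 0.93, 1.20, 1.85, 3.40]  # seconds

    def percentile(samples, pct):
        """pct-th percentile using the nearest-rank method."""
        ordered = sorted(samples)
        rank = max(1, round(pct / 100 * len(ordered)))
        return ordered[rank - 1]

    p95 = percentile(latencies, 95)
    print(f"95th percentile: {p95:.2f} s "
          f"({'meets' if p95 <= 2.0 else 'misses'} the 2-second target)")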

  7. Response Times

  8. Response Times

  9. Response Times

  10. Throughput • How many transactions can be handled in some period of time • Transactions/second or tpm, tph or tpd • A measure of overall capacity • Transaction Processing Performance Council • Standard benchmarks for TP systems • TPC-C for a typical transaction system • www.tpc.org • Current record is 227,000 tpmC
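
For a sense of scale, tpmC counts transactions per minute, so the record quoted above converts to a per-second rate by simple division:

    # 227,000 transactions per minute expressed per second
    tpmc = 227_000
    print(f"{tpmc} tpmC is roughly {tpmc / 60:,.0f} transactions per second")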

  11. Throughput • Throughput increases until some resource limit is hit • Adding more clients just increases the response time • Run out of processor, disk bandwidth, network bandwidth • Some resources overload badly • Ethernet network performance degrades
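
A rough way to see this saturation: if every transaction needs a fixed slice of some bottleneck resource, throughput is capped by that resource and extra clients only add queueing. A minimal sketch with invented numbers (one CPU, 50 ms of CPU per transaction, 1-second client think time):

    # Sketch: throughput capped by a bottleneck resource (all numbers invented).
    cpu_per_txn = 0.05          # seconds of CPU each transaction needs
    think_time = 1.0            # seconds each client waits between requests
    max_tps = 1 / cpu_per_txn   # one CPU -> at most 20 transactions/second

    for clients in (5, 10, 20, 40, 80):
        offered = clients / (think_time + cpu_per_txn)   # closed-loop approximation
        throughput = min(offered, max_tps)
        # Past the cap, more clients only lengthen the queue (and the response time).
        print(f"{clients:3d} clients -> ~{throughput:5.1f} tps "
              f"({'saturated' if offered > max_tps else 'below capacity'})")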

  12. Throughput

  13. System Capacity • How many clients can you support? • Name an acceptable response time • Average 95% under 2 secs is common • And what is ‘average’? • Plot response time vs # of clients • Great if you can run benchmarks • Reason for prototyping and proving proposed architectures before leaping into full-scale implementation
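
In that spirit, a minimal benchmarking sketch that produces response-time-vs-clients data points; the client counts and the stand-in transaction (a short sleep) are assumptions to be replaced with the real workload:

    # Sketch: 95th-percentile response time as the number of clients grows.
    import concurrent.futures, statistics, time

    def transaction():
        start = time.perf_counter()
        time.sleep(0.05)              # stand-in for the real request
        return time.perf_counter() - start

    def p95_for(clients, requests_per_client=20):
        with concurrent.futures.ThreadPoolExecutor(max_workers=clients) as pool:
            times = list(pool.map(lambda _: transaction(),
                                  range(clients * requests_per_client)))
        return statistics.quantiles(times, n=20)[-1]   # ~95th percentile

    if __name__ == "__main__":
        for clients in (1, 5, 10, 20, 50):
            print(f"{clients:3d} clients -> p95 {p95_for(clients) * 1000:.0f} ms")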

  14. System Capacity

  15. Load Balancing I • A few different but related meanings • 1. Balancing across server processes • CORBA-style where clients use objects that live inside server processes • Want all server processes to be busy • Client calls have to go to the process containing their object, even if this process is busy and others are idle

  16. Load Balancing I

  17. Load Balancing I • Client calls on name server to find the location of a suitable server • Name server can spread client objects across multiple servers • Often ‘round robin’ • Client is bound to server and stays bound forever • Can lead to performance problems
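
A minimal sketch of that round-robin behaviour (the NameServer class and its advertise/lookup methods are invented for illustration, not a real CORBA API): references are handed out in rotation and each client keeps whatever it was given.

    # Sketch: round-robin name server; clients keep the reference they are given.
    import itertools
    from collections import Counter

    class NameServer:
        def __init__(self):
            self._servers = []
            self._rotation = None

        def advertise(self, server_ref):
            """A server process registers itself when it comes up."""
            self._servers.append(server_ref)
            self._rotation = itertools.cycle(self._servers)

        def lookup(self):
            """Hand out server references round-robin."""
            return next(self._rotation)

    ns = NameServer()
    for name in ("serverA", "serverB", "serverC"):
        ns.advertise(name)

    # 500 clients bind once and then stay bound to that server.
    bindings = {client: ns.lookup() for client in range(1, 501)}
    print(Counter(bindings.values()))   # roughly equal counts, but only at this moment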

  18. Load Balancing I • Example: clients per server object under static binding

     Initial allocation:
       Server object 1: clients 1-100    (100)
       Server object 2: clients 101-200  (100)
       Server object 3: clients 201-300  (100)
       Server object 4: clients 301-400  (100)
       Server object 5: clients 401-500  (100)

     Later allocation:
       Server object 1: clients 1-100, 201, 206, 211, …, 496    (160)
       Server object 2: clients 101-200, 202, 207, 212, …, 497  (160)
       Server object 3: clients 203, 208, 213, …, 498           (60)
       Server object 4: clients 204, 209, 214, …, 499           (60)
       Server object 5: clients 205, 210, 215, …, 500           (60)

  19. Load Balancing I • Solution to the static allocation problem is for clients to throw away their server objects and get new ones every now and again • Application coding problem • And can the objects be discarded? • What kind of ‘objects’ are they if they can be discarded?
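
The re-binding idea can be sketched as below (hypothetical names again, building on the NameServer sketch above or any object exposing a lookup() method): the client drops its server reference after a fixed number of calls and asks the name server for a new one.

    # Sketch: client discards its server reference every N calls and re-resolves.
    class RebindingClient:
        def __init__(self, name_server, calls_per_binding=100):
            self._ns = name_server
            self._limit = calls_per_binding
            self._server = None
            self._calls = 0

        def call(self, request):
            if self._server is None or self._calls >= self._limit:
                self._server = self._ns.lookup()   # fresh round-robin binding
                self._calls = 0
            self._calls += 1
            return f"{self._server} handled {request}"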

  20. Name Servers • Server processes call name server when they come up • Advertising their services • Clients call name server to find the location of a server process • Up to the name server to match clients to servers • Client calls server process to create objects

  21. Load Balancing I • Diagram: load balancing across processes within a server • Server processes advertise their service to the name server • A client requests and receives a server reference, gets a server object reference from the server process, then calls the server object’s methods directly

  22. Load Balancing II • What happens when our single system is full? • Use faster systems • Scale-up • Use additional systems • Scale-out • Now load-balancing is used to spread load across systems

  23. Load Balancing II • CORBA world… • Name server can distribute across server processes running on different systems • Scales well… • Name server only involved when handing out a reference to a server, not on every method call

  24. Load Balancing II • Diagram: load balancing across multiple systems • Server processes on different systems advertise their service to the name server • A client requests and receives a server reference, gets a server object reference, then calls the server object’s methods directly

  25. Load Balancing II • COM+ world… • No need for load-balancing within a system • Multithreaded server process • All objects live in a single process space • Component load balancing across systems • Client calls router when creating object • Router returns reference to an object in a COM+ server process • Load balanced at time of object creation

  26. Load Balancing II • Diagram: COM+/MTS using thread pools rather than load balancing within a single system • Clients call into a single MTS process over DCOM • The application DLL’s code runs on a thread pool within a shared object space

  27. COM+ Component Load Balancing • Diagram: COM+ CLB balancing load across multiple systems • A client asks the router to create an object • The router, guided by its response time tracker, passes the request to a server, which creates the object and passes back a reference • The client then calls the object’s methods directly

  28. Load Balancing II • COM+ scales well… • Router only involved when object is created • May change in later release to support dynamic re-balancing as server load changes • Method calls direct from client to server • Allocation based on response time rather than round-robin • Allocate to least-loaded server
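
A minimal sketch of the least-loaded idea (invented names; this is not the actual COM+ CLB implementation): the router keeps each server's recent response times and creates new objects on whichever server currently looks fastest.

    # Sketch: route object creation to the server with the best recent response time.
    from collections import deque

    class ResponseTimeRouter:
        def __init__(self, servers, window=50):
            self._history = {s: deque([0.0], maxlen=window) for s in servers}

        def record(self, server, seconds):
            """Called as method-call timings are reported for that server."""
            self._history[server].append(seconds)

        def choose_server(self):
            """Pick the server with the lowest average recent response time."""
            return min(self._history,
                       key=lambda s: sum(self._history[s]) / len(self._history[s]))

    router = ResponseTimeRouter(["app1", "app2", "app3"])
    router.record("app1", 0.40)
    router.record("app2", 0.15)
    router.record("app3", 0.90)
    print("create next object on:", router.choose_server())   # -> app2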

  29. Load Balancing II • No name server in COM world? • COM/MTS clients ‘know’ the name of the server • Set at client installation time • Can change using GUI tools • Admin problem if server app is moved • COM+ uses Active Directory to find services

  30. Load Balancing II • Some systems involve the router in every method call/request • The request goes to the router process, which then passes it on to a server process • Scales poorly as the router can become a major bottleneck • Some availability concerns as well • What happens if the router fails?

  31. Load Balancing II • Diagram: load balancing with the router in the main call path • Every client request goes to the router process, which passes it on to one of the server processes

  32. Scale-up • No need for load-balancing across systems • Just use a bigger box • Add processors, memory, …. • SMP (symmetric multiprocessing) • Runs into limits eventually • Could be less available

  33. Scale-up • Example from the Web • Large auction site • Server farm of NT boxes (scale-out) • Single database server (scale-up) • 64-processor SUN box • More capacity needed? • Add more NT boxes easily • SUN box is full so have to shift some databases to another box

  34. Clusters • A group of independent computers acting like a single system • Shared disks • Single IP address • Single set of services • Fail-over to other members of cluster • Load sharing within the cluster • DEC, IBM, MS, …

  35. Clusters • Diagram: client PCs connect to Server A and Server B, which are joined by a heartbeat / cluster management link and share access to disk cabinets A and B

  36. Clusters • Address scalability • Add more boxes to the cluster • Address availability • Fail-over • Add & remove boxes from the cluster for upgrades and maintenance • Can be used as one element of a highly-available system

  37. Web Server Farms • Web servers are highly scalable • Web applications are normally stateless • Next request can go to any Web server • State comes from client or database • Just need to spread incoming requests • IP sprayers (hardware, software) • >1 Web server looking at same IP address with some coordination (see MS WLB docs) • Same technique for other network apps
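
Because the requests are stateless, the spreader only has to pick a server; a minimal sketch (server names invented) of round-robin spreading, which is roughly what an IP sprayer or software load balancer does below the application layer:

    # Sketch: spreading stateless HTTP requests across a web farm, round-robin.
    import itertools

    web_farm = ["web1.example.com", "web2.example.com", "web3.example.com"]
    next_server = itertools.cycle(web_farm).__next__

    def dispatch(request_path):
        """Any server can take any request because no session state lives on it."""
        return f"forward GET {request_path} to {next_server()}"

    for path in ("/", "/search?q=books", "/cart", "/checkout"):
        print(dispatch(path))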

  38. Available System • Diagram: Web clients call Web servers load balanced using Convoy • The Web servers call app servers balanced with COM+ load balancing via a COM+ LBS router node • The database is installed on a Wolfpack cluster for high availability

  39. Availability • How much? • 99% = 87.6 hours of downtime a year • 99.9% = 8.76 hours of downtime a year • 99.99% = 0.876 hours of downtime a year • Need to consider operations as well • Maintenance, software upgrades, backups, application changes • Not just faults and recovery time
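
Those downtime figures are just the uncovered fraction of an 8,760-hour year; a quick check:

    # Downtime per (non-leap) year implied by an availability percentage.
    hours_per_year = 365 * 24          # 8760
    for availability in (0.99, 0.999, 0.9999):
        downtime = (1 - availability) * hours_per_year
        print(f"{availability:.2%} available -> {downtime:.3g} hours of downtime a year")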

  40. Availability and Scalability • Often a question of application design • Stateful vs stateless • What happens if a server fails? • Can requests go to any server? • What language and database API • Balance cost vs speed – VB/C++ - ODBC/ADO • Synchronous method calls or asynchronous messaging? • Reduce dependency between components • Failure tolerant designs

  41. Next Week • Distributed application architectures • How to design systems that will work, scale and be available • Web-based systems • Web technology
