Advanced Load Balancing/Web Systems

Presentation Transcript

  1. Advanced Load Balancing/Web Systems. Edward Chow, Department of Computer Science, University of Colorado at Colorado Springs

  2. Outline of the Talk • Trends in Web Systems • Web switches and the support for advanced web system features • Load balancing research • Load balancing algorithms research • Network bandwidth measurement research • Web server status research

  3. Readings/References • Application-level solution: Apache/Jserv/Servlet • Kernel-level load balancing solution: http://www.linuxvirtualserver.org/ • Joseph Mark’s presentation • LVS-NAT (Network Address Translation) web page • LVS-IP Tunnel web page • LVS-DR (Direct Routing) web page • Hardware solution: Foundry ServerIron Installation and Configuration Guide, May 2000.

  4. Trends in Web Systems • Increasing the availability, performance, and manageability of web sites. • High performance through multiple servers connected by high-speed networks. • High Availability (HA): 7x24 network services • Reliable/efficient content routing and content distribution • Emerging network-centric storage networks • Emerging Linux Virtual Server library for low-cost HA web systems.

  5. Networkshop’s Prediction • Already, load-balancers are overcoming the inherent one-to-one nature of the network and distributing queries across tuned servers -- GIFs to a machine with a huge RAM cache, processing to servers with fibre-channel-attached databases. • I suspect we'll see content routing as a full-fledged concept in Las Vegas next spring. (By Networkshop News, 10/1999.)

  6. Load Balancing Systems • Cheap solution: Linux/LVS as load balancer for distributing requests to backend real servers. • Medium-price solution: Microsoft Server Cluster; Zeus Load Balancer • High performance: Web switches (special hardware) from ArrowPoint (Cisco), Foundry ServerIron, Alteon WebSystems, Intel XML distributor.

  7. Virtual Resource Management • Also called Server load balancing or Internet Traffic Management. • Goal: Increasing the availability, performance, and manageability of web sites. • April 2000 Acuitive Report on 1999 VRM market share

  8. VRM Market Prediction

  9. F5 VRM Solution [Figure: Site I (newyork.domain.com), Site II (losangeles.domain.com), and Site III (tokyo.domain.com) connected through the Internet; labeled components are Router, 3-DNS, BIG-IP, Local DNS, GLOBAL-SITE, Webmaster, Server Array, and a user at london.domain.com.]

  10. BIG/ip - Delivers High Availability • E-commerce - ensures sites are not only up and running, but taking orders • Fault-tolerance - eliminates single points of failure • Content Availability - verifies servers are responding with the correct content • Directory & Authentication - load balances multiple directory and/or authentication services (LDAP, RADIUS, and NDS) • Portals/Search Engines - using EAV, administrators can perform keyword searches • Legacy Systems - load balances services to multiple interactive services • Gateways - load balances gateways (SAA, SNA, etc.) • E-mail (POP, IMAP, SendMail) - balances traffic across a large number of mail servers

  11. 3DNS Intelligent Load Balancing • Intelligent Load Balancing • QoS Load Balancing: Quality of Service load balancing is the ability to select and apply different load balancing methods for different users or request types • Modes of Load Balancing: Round Robin, Ratio, Least Connections, Random, User-defined Quality-of-Service, Round Trip Time, Completion Rate (Packet Loss), BIG/ip Packet Rate, Global Availability, HOPS, Topology Distribution, Access Control, LDNS Round Robin, Dynamic Ratio, E-Commerce

  12. GLOBAL-SITE Replicate Multiple Servers and Sites • File archiving engine and scheduler for automated site and server replication • BIG-IP controls server availability during replication and synchronization • Graceful shutdown for update • Update in a group/scheduled manner • FTP transfers files from GLOBAL-SITE to target servers (agent free, scalable) • RCE for source control • No client-side software • Complete, turnkey system (appliance) (adapted from F5 presentation)

  13. Content Distribution • Secure, automated content/application distribution to single-site (multiple server) or wide-area Internet sites. • Provide replication, synchronization, staged rollout, and rollback. • With revision control, transmit only updates. • User-defined file distribution profiles/rules

  14. Intel NetStructure • Routing based on XML tags (e.g., giving preferred treatment to buyers or large-volume orders) • http://www.intel.com/network/solutions/xml.htm

  15. 1. Compared to SUN E450 server

  16. Phobos IPXpress • Balances web traffic among all 4 servers. • Easily connects to any Ethernet network. • Quick setup and remote configuration. • Choose from six different load balancing algorithms: Round Robin, Least Connections, Weighted Least Connections, Fastest Response Time, Adaptive, Fixed • Hot standby failover port for web site uptime. • U.S. Retail $3495.00

  17. Phobos In-Switch • Only load balancing switch in a PCI card form factor • Plugs directly into any server PCI slot • Supports up to 8,192 servers, ensuring availability and maximum performance • Six different algorithms are available for optimum performance: Round Robin, Weighted Percentage, Least Connections, Fastest Response Time, Adaptive and Fixed. • Provides failover to other servers for high-availability of the web site • U.S. Retail $1995.00

  18. Foundry Networks ServerIron Internet Traffic Management Switches • One million concurrent connections • SwitchBack™ - also known as direct server return • Throughput: 64 Gbps with BigServerIron • Session processing: leads with 80,000 connections/sec. • Symmetric LB: picks up the full load where the failed switch left off without losing stateful information. • Switching capacity: BigServerIron delivers 256 Gbps of total switching capacity.

  19. BigServerIron • BigServerIron supports up to 168 10/100Base-TX ports or 64 Gigabit Ethernet ports. • Internet IronWare supports unlimited virtual server addresses, up to 64,000 Virtual IP (VIP) addresses and 1,024 real servers. • Web Hosting: enables network managers to define multiple VIPs and track service usage by VIP. • Health Checks: provides Layer 3, 4, and 7 health checks, including HTTP, DNS, SMTP, POP3, IMAP4, LDAPv3, NNTP, FTP, Telnet, and RADIUS

  20. BigServerIron LB Algorithms • Round Robin • Least Connections • Weighted Percentage (assign a performance weight to each server) • Slow Start - protects the server from a surging flow of traffic at startup. It can really happen! “Ya, LVS has performed for us like a champ.. under higher volumes, I have had some problems with wlc.... for some reason LVS freaks and starts binding all traffic to one box... or at least the majority of it.. it is really wierd... but as soon as you switch to using wrr then everything worked fine... I have been using LVS for about 4 months to manage our E-Commerce cluster and I haven't had any problems other than the wlc vs wrr problem…” -- Jeremy Johnson <jjohnson@real.com> 6/1/2000
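The selection rules named on this slide can be sketched in a few lines of Python. This is only an illustration of the general ideas, not the ServerIron or LVS implementation; the server names, connection counts, and weights are made-up values.

```python
import itertools

def round_robin(servers):
    """Return an iterator that hands out servers in a fixed rotating order."""
    return itertools.cycle(servers)

def least_connections(active):
    """Pick the server with the fewest active connections.

    `active` maps server name -> current connection count.
    """
    return min(active, key=active.get)

def weighted_least_connections(active, weights):
    """Pick the server with the lowest connections-per-weight ratio,
    so a server with a larger weight is allowed to carry more connections."""
    return min(active, key=lambda s: active[s] / weights[s])

# Hypothetical cluster state, for illustration only.
servers = ["web1", "web2", "web3"]
rr = round_robin(servers)
first_four = [next(rr) for _ in range(4)]

active = {"web1": 12, "web2": 7, "web3": 9}
weights = {"web1": 4, "web2": 1, "web3": 2}

print(first_four)
print(least_connections(active))
print(weighted_least_connections(active, weights))
```

With these numbers, plain least connections picks web2, while the weighted variant picks web1 because its ratio 12/4 is the lowest. The two rules can disagree whenever the weights differ.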

  21. BigServerIron LB Features • Set a max connection limit for each server • Cookie Switching - directs HTTP requests to a server group based on cookie value; useful for client persistence and servlets • URL Switching - directs requests based on the text of a URL string using defined policies, so different web content can be placed on different servers • URL Hashing - maps the hash value of the Cookie header or the URL string to one of the real servers bound to the virtual server. This HTTP request and all future HTTP requests that contain this information then always go to the same real server. • URL Parsing - selects the real server by applying a pattern matching expression to the entire URL. ServerIron supports up to 256 URL rules • SSL Session ID Switching - ensures that all the traffic for an SSL transaction with a given SSL session ID always goes to the same server.
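URL hashing as described above amounts to a deterministic map from a string to a server index. The sketch below shows the idea using CRC32 as the hash; the hash function ServerIron actually uses is not stated in the slides, so the choice of CRC32, the function name, and the server names are assumptions.

```python
import zlib

def pick_server_by_hash(key, real_servers):
    """Map a URL string or cookie value to one real server.

    The same key always maps to the same server as long as the
    server list is unchanged. CRC32 is an assumed hash function.
    """
    h = zlib.crc32(key.encode("utf-8"))
    return real_servers[h % len(real_servers)]

# Hypothetical real servers bound to one virtual server.
real_servers = ["rs1", "rs2", "rs3"]
chosen = pick_server_by_hash("/catalog/item?id=42", real_servers)
print(chosen)
```

One consequence of the modulo step is that adding or removing a real server remaps most keys, which is worth keeping in mind when persistence matters.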

  22. IronClad Security • NAT • TCP SYN attack protection: stops binding new sessions for a user-definable timeframe when the rate of incoming TCP SYN packets exceeds a certain threshold. • Guards against Denial of Service (DoS) attacks - against massive numbers of uncompleted handshakes, also known as TCP SYN attacks, by monitoring and tracking unfinished connections • High Performance Access Control Lists (ACLs) and Extended ACLs - by using ACLs, network administrators can restrict access to specific applications from/to a given address, subnet, or port number. • Cisco-syntax ACLs - ServerIron supports Cisco-syntax ACLs, which enables network administrators to cut/copy/paste ACLs from their existing Cisco products.

  23. Session Persistence for eCommerce Transactions • Port Tracking: some web applications define a lead port (HTTP) and follower (SSL) ports. ServerIron ensures connections to the follower ports arrive at the same server • Sticky Ports - ServerIron supports a wide variety of 'sticky' connections: a client’s requests for the next port or all ports go to the same server • Supports a large range of user-programmable options • Mega Proxy Server Persistence - treats a range of source IP addresses as a single source to solve the persistence problem caused by certain mega proxy sites in the Internet. • Uses the source IP address for session persistence when the cookie is missing.
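Treating a range of source addresses as a single source can be illustrated by masking the client address down to its network before using it as a persistence key. The sketch below assumes IPv4 and a /24 range; the prefix length and the function name are illustrative choices, not ServerIron settings.

```python
import ipaddress

def persistence_key(client_ip, prefix_len=24):
    """Collapse a client address to its enclosing network address,
    so all clients behind the same proxy range share one key."""
    net = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return str(net.network_address)

# Two requests leaving a large proxy farm from different addresses
# (documentation-range addresses, used here as placeholders).
key_a = persistence_key("203.0.113.17")
key_b = persistence_key("203.0.113.200")
print(key_a, key_b)
```

Both requests yield the key 203.0.113.0, so a sticky table indexed by this key would send them to the same real server.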

  24. High Availability Services • Remote Backup Servers - If no local servers or applications are available, ServerIron sends client requests to remote servers. • HTTP Re-direct - ServerIron can also use HTTP redirect to send traffic to remote servers if the requested application is not available on the local server farm. • Active/Standby - When deployed in Active/Standby mode, the standby load-balancing device will assume control and preserve the state of existing sessions in the event the primary load-balancing device fails • Active/Active - When deployed in Active/Active mode, both load-balancing devices work simultaneously and provide a backup for each other while supporting stateful fail-over. • Quality of Service - Network administrators can prioritize traffic based on ports, MAC, VLAN, and 802.1p attributes, grant priority to HTTP traffic over FTP • Redundant hot-swappable power supplies

  25. Linux Virtual Server (LVS) • Virtual server is a highly scalable and highly available server built on a cluster of real servers. The architecture of the cluster is transparent to end users, and the users see only a single virtual server.

  26. LVS-NAT Configuration • All return traffic goes through the load balancer

  27. LVS-Tunnel Configuration • Real servers need to be reconfigured to handle IP-IP packets • Real servers can be geographically separated, and return traffic can go through different routes

  28. LVS-Direct Routing Configuration • Similar to the one implemented in IBM's NetDispatcher • Real servers need to configure a non-ARP alias interface with the virtual IP address, and that interface must share the same physical segment with the load balancer. • The load balancer only rewrites the server MAC address; the IP packet is not changed

  29. HA-LVS Configuration

  30. Persistence Handling in LVS • Sticky connections. Examples: • FTP control (port 21), data (port 20). For passive FTP, the server tells the client the port it listens on, and the client initiates the data connection to that port. For LVS/TUN and LVS/DR, LinuxDirector sees only the client-to-server half of the connection, so it is impossible for LinuxDirector to get the port from the packet that goes directly to the client. • SSL session: port 443 for secure web servers and port 465 for secure mail servers; the key for the connection must be chosen/exchanged. • Persistent port solution: • When a client first accesses the service, LinuxDirector creates a template between the given client and the selected server, then creates an entry for the connection in the hash table. • The template expires after a configurable time, but it won't expire until all its connections expire. • Connections for any port from the client will be sent to the same server until the template expires. • The timeout of persistent templates can be configured by users; the default is 300 seconds
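The persistent-template idea can be sketched as a small table keyed by client address. This is a simplified illustration and not the LVS kernel code: the class and method names are hypothetical, time is passed in explicitly to keep the example deterministic, and the template is simply refreshed on every new connection rather than tracking each connection's own expiry.

```python
class PersistenceTable:
    """Toy model of client -> server templates with a timeout."""

    def __init__(self, timeout=300.0):
        self.timeout = timeout      # seconds; 300 is the default cited above
        self.templates = {}         # client_ip -> (server, expiry_time)

    def lookup(self, client_ip, choose_server, now):
        """Return the server for this client, creating a template if needed.

        `choose_server` is any zero-argument scheduler callback and is
        only consulted when no live template exists for the client.
        """
        entry = self.templates.get(client_ip)
        if entry is not None and now < entry[1]:
            server = entry[0]       # live template: stick to the same server
        else:
            server = choose_server()
        # Simplification: each new connection pushes the expiry forward.
        self.templates[client_ip] = (server, now + self.timeout)
        return server

table = PersistenceTable(timeout=300.0)
s1 = table.lookup("10.0.0.1", lambda: "rs1", now=0.0)
s2 = table.lookup("10.0.0.1", lambda: "rs2", now=100.0)   # template still live
s3 = table.lookup("10.0.0.1", lambda: "rs2", now=500.0)   # expired at t=400
print(s1, s2, s3)
```

The second lookup ignores the scheduler and returns rs1, which is how a passive FTP data connection or a follow-up SSL connection would land on the same real server as the first one.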

  31. Performance of LVS-based Systems “We ran a very simple LVS-DR arrangement with one PII-400 (2.2.14 kernel) directing about 20,000 HTTP requests/second to a bank of about 20 Web servers answering with tiny identical dummy responses for a few minutes. Worked just fine.” -- Jerry Glomph Black, Director, Internet & Technical Operations, RealNetworks “I had basically (1024) four class-Cs of virtual servers which were loadbalanced through a LinuxDirector (two, actually -- I used redundant directors) onto four real servers which each had the four different class-Cs aliased on them.” -- "Ted Pavlic" <tpavlic@netwalk.com>

  32. What is Content Intelligence? (By Erv Johnson, ArrowPoint) • Layer 5-7 (content): content routing based on host tag, entire URL, dynamic cookie location, file extension, # of rules, # of services, # of services per content rule • Layer 4 (TCP): session load balancing based on IP address and TCP port, Network Address Translation (NAT), policies based on TCP port • Layer 3 (IP): switching on MAC address, VLANs, IP routing, 802.1 P/Q policy

  33. ArrowPoint’s Content Smart Web Switch Architecture (from CCL viewgraph) [Figure: Control Plane (content policy services) with 4 MIPS RISC CPUs and 512 MB memory, covering Content Location Services, Flowwall Security, Flow Managers, Content Based QoS, and Site & Server Selection; Forwarding Plane with Switch Fabric, Shared Memory, and up to 16 ports of LAN I/O with Mapped Row Cache (8 Mb memory). Note: system processing capacity is 1B hits per day.]

  34. Load Balancing Study • Current web switches do not take server load or network bandwidth directly into consideration. How can we improve them? • The node with the fewest connections may have the heaviest load. • Current wide-area load balancing does not consider the available/bottleneck bandwidth. • Lack of simulation and network planning tools for suggesting network configurations.

  35. Server Load Status Collection • Three basic approaches: • Observe the response time of requests • Modify web servers to report current queue/processing speed • Use a web server agent to collect system data • The 2nd approach requires access to web server code/internals • We have modified the Apache code (v1.3.9) to accumulate the size of pending requests (in bytes) in active child servers and divide it by the estimated processing speed. • Note that it is harder to estimate CGI script or servlet processing.
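The load estimate described in the Apache modification is a simple ratio: total pending bytes divided by estimated processing speed. The sketch below shows that arithmetic in Python. It is not the modified Apache code, and the function name and sample numbers are hypothetical.

```python
def estimated_drain_time(pending_bytes_per_child, bytes_per_second):
    """Estimate how long the server needs to finish its pending work.

    pending_bytes_per_child: pending request sizes (bytes), one per
        active child server process.
    bytes_per_second: estimated processing speed of the server.
    """
    if bytes_per_second <= 0:
        raise ValueError("processing speed must be positive")
    return sum(pending_bytes_per_child) / bytes_per_second

# Hypothetical snapshot: three active children, 8000 bytes/s throughput.
delay = estimated_drain_time([12000, 4000, 0], 8000.0)
print(delay)
```

Here 16000 pending bytes at 8000 bytes per second gives an estimated 2.0 seconds, a figure a load balancer could compare across servers.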

  36. Apache Server Status Report Apache Server Status for gandalf.uccs.edu Current Time: Wed Dec 10 00:32:51 1997 Restart Time: Wed Dec 10 00:32:27 1997 Server uptime: 24 seconds Total accesses: 0 - Total Traffic: 0 kB CPU Usage: u0 s0 cu0 cs0 0 requests/sec - 0 B/second 1 requests currently being processed, 4 idle servers ... • Forked web server processes with no work (idle servers) • Requests per second (history)

  37. Collecting System Statistics • Web server agent collects system data • Run queue (#) • CPU idle time (%) • Pages scanned by page daemon (pages/s) • Web server agent uses vmstat 1 2, which collects 2 samples at a 1-second interval
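An agent of this kind mostly needs to pull three columns out of the vmstat text. The sketch below parses a captured output string by column name. The sample text is a made-up Solaris-style report, since column sets differ between operating systems, and the function name is hypothetical. It reads the last line because the first sample vmstat prints reports averages since boot.

```python
SAMPLE = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 520000 81000 0 2 0 0 0 0 0 0 0 0 0 110 240 130 4 1 95
 3 0 0 519000 64000 1 9 0 0 0 0 7 0 0 0 0 180 900 310 61 14 25
"""

def parse_vmstat(output):
    """Extract run queue, scan rate, and CPU idle from vmstat text.

    Assumes the output contains a header row with 'r', 'sr', and 'id'
    columns, and that the last row is the most recent sample.
    """
    rows = [line.split() for line in output.strip().splitlines()]
    header = next(row for row in rows if "r" in row and "id" in row)
    latest = rows[-1]
    # The duplicated 'sy' column name is harmless: r, sr, id are unique.
    by_name = dict(zip(header, latest))
    return {
        "run_queue": int(by_name["r"]),
        "scan_rate": int(by_name["sr"]),
        "cpu_idle": int(by_name["id"]),
    }

stats = parse_vmstat(SAMPLE)
print(stats)
```

In a real agent the string would come from running the command, for example with `subprocess.run(["vmstat", "1", "2"], capture_output=True, text=True)`, and on systems without an `sr` column the lookup would need a fallback.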

  38. Vmstat Output and Meaning • r - # of processes waiting to run (extent) • sr - # of pages scanned by page daemon to put back on the free list • id - % of CPU idle time: id = 100 - (us + sy) (discrete)

  39. Network Bandwidth Measurement • Bottleneck bandwidth BBw can be measured by sending a burst of packets (of size S) and measuring the time gap (Tg) between the returns: BBw = S/Tg if there is no interference • Available bandwidth ABw is harder to measure. • Cprobe (Boston University) sends a burst of packets and measures the time gap between the 1st and last message. • Estimate ABw based on packet round trip time or comparison with the history of round trip times.
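The relation BBw = S/Tg is easy to try out on measured gaps. The sketch below applies it to a list of observed inter-packet gaps and takes the median gap to reduce the effect of interference from cross traffic. Using the median is an illustrative choice and not something the slides prescribe; the function name and sample values are hypothetical.

```python
import statistics

def bottleneck_bandwidth_bps(packet_size_bytes, gaps_seconds):
    """Estimate bottleneck bandwidth in bits per second as S / Tg.

    packet_size_bytes: size S of each probe packet.
    gaps_seconds: observed time gaps Tg between consecutive returns.
    """
    gap = statistics.median(gaps_seconds)   # damp outliers from cross traffic
    return packet_size_bytes * 8 / gap

# Hypothetical probe: 1500-byte packets, one gap stretched by interference.
bbw = bottleneck_bandwidth_bps(1500, [0.0012, 0.0012, 0.0030])
print(round(bbw / 1e6, 3), "Mbit/s")
```

With a median gap of 1.2 ms, 1500-byte packets imply a bottleneck of roughly 10 Mbit/s, while the stretched 3 ms gap on its own would have suggested only 4 Mbit/s.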

  40. Smart Probe Simulation Results

  41. Weight Calculation • Rate each web server with a weight based on statistics sent from the web server agents: weight of server = (19.68*rid) + (19.58*rcpu) + (19.60*rrq) + (19.64*rrps) + (17.24*rap) + (4.23*rsr)
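The weight formula is a fixed linear combination, which the sketch below writes out in Python. The coefficient values and the names rid, rcpu, rrq, rrps, rap, and rsr are taken from the slide. The slide does not define each term, so treating every input as a ratio between 0 and 1 (actual value divided by a threshold) is an assumption.

```python
# Coefficients as given on the slide; they sum to roughly 100.
COEFFICIENTS = {
    "rid": 19.68,
    "rcpu": 19.58,
    "rrq": 19.60,
    "rrps": 19.64,
    "rap": 17.24,
    "rsr": 4.23,
}

def server_weight(ratios):
    """Combine per-characteristic ratios into one server weight.

    `ratios` maps each characteristic name to a value assumed to lie
    in [0, 1]; a missing characteristic raises KeyError.
    """
    return sum(COEFFICIENTS[name] * ratios[name] for name in COEFFICIENTS)

# Hypothetical server that scores 1.0 on everything except CPU (0.5).
ratios = {name: 1.0 for name in COEFFICIENTS}
ratios["rcpu"] = 0.5
print(round(server_weight(ratios), 2))
```

A server scoring 1.0 on every characteristic gets a weight of about 99.97, so the result can be read roughly as a percentage of an ideal server.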

  42. Weight Calculations (Example) • CPU idle time had an average throughput of 51.92, and the sum of the averages over all characteristics was 265.18. • Its relative share is therefore 51.92/265.18 = 0.1958 = 19.58%. • This share is multiplied by the actual CPU percent idle divided by the approximate threshold (found to be 100% during the benchmarks) to get the weight: <cpu weight> = 19.58*(<actual cpu>/100)

  43. Network Design/Planning Tool • Need realistic (self-similar) network traffic load to exercise the simulator. • Need tools for • specifying network topology, • detecting bottlenecks in the web systems, • suggesting new topologies and configurations

  44. Why is the Internet hard to model? • It’s BIG • January 2000: > 72 Million Hosts1 • Growing Rapidly • > 67% per year • Constantly Changing • Traffic patterns have high variability • Causes of High variability • Client Request Rates • Server Responses • Network Topology

  45. Characteristics of Client Request Rate [1] • Client Sleep Time • Inactive Off Time • Active Off Time • Embedded References [1] Barford and Crovella, Generating Representative Web Workloads for Network and Server Performance Evaluation, Boston University, BU-CS-97-006, 1997

  46. Internet Traffic Request Pattern

  47. Inactive Off Time • Time between requests (Think Time) • Uses a Pareto distribution • Shape parameter: a = 1.5 • Lower bound: k = 1.0 • To create a random variable x: • u ~ U(0,1) • x = k / (1.0 - u)^(1.0/a)
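The inverse-transform recipe above translates directly into Python. The sketch uses the slide's parameters (a = 1.5, k = 1.0); the function names are hypothetical, and the deterministic helper exists only so the formula can be checked at known values of u.

```python
import random

def pareto_from_u(u, k=1.0, a=1.5):
    """Apply x = k / (1 - u)^(1/a) to a uniform value u in [0, 1)."""
    return k / (1.0 - u) ** (1.0 / a)

def sample_inactive_off_time(rng, k=1.0, a=1.5):
    """Draw one think time; rng.random() returns a value in [0, 1)."""
    return pareto_from_u(rng.random(), k, a)

rng = random.Random(2000)          # fixed seed for a repeatable run
samples = [sample_inactive_off_time(rng) for _ in range(5)]
print(samples)
```

Every sample is at least k, and small values of 1 - u produce the occasional very long think time that gives the distribution its heavy tail. For k = 1, the standard library's `random.paretovariate(a)` draws from the same distribution.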

  48. Inactive Off Time

  49. Active Off Time • Time between embedded references • Uses a Weibull distribution • alpha: a = 1.46 (scale parameter) • beta: b = 0.382 (shape parameter) • To create a random variable x: • u ~ U(0,1) • x = a * (-ln(1.0 - u))^(1.0/b)
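The Weibull recipe can be sketched the same way, using the slide's parameters (scale a = 1.46, shape b = 0.382). The function names are hypothetical, and the deterministic helper is there so the formula can be checked at a known value of u.

```python
import math
import random

def weibull_from_u(u, a=1.46, b=0.382):
    """Apply x = a * (-ln(1 - u))^(1/b) to a uniform value u in [0, 1)."""
    return a * (-math.log(1.0 - u)) ** (1.0 / b)

def sample_active_off_time(rng, a=1.46, b=0.382):
    """Draw one gap between embedded references."""
    return weibull_from_u(rng.random(), a, b)

rng = random.Random(2000)          # fixed seed for a repeatable run
gaps = [sample_active_off_time(rng) for _ in range(5)]
print(gaps)
```

When u = 1 - 1/e the logarithm term equals 1, so the sample equals the scale parameter a, which is a convenient sanity check. The standard library's `random.weibullvariate(alpha, beta)` takes scale and shape in the same order as the slide.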

  50. Active Off Time