Performance Issues of Web Services

Performance Issues of Web Services CSCI 8710 November 29-30, 2006 Kraemer

Web Services • Services available via the Internet that complete tasks or conduct transactions. • Self-contained, modular applications that can be described, published, and invoked over the Internet. • Can be automatically invoked by application programs.

Web Services • May be invoked at one site or may combine results of several services executed at different sites.

Performance concerns differ from standard C/S • May involve both web service processing and network delays • May be accessed by wide variety of devices -- desktop computers, PDAs, mobile phones, other servers • Access via wireless communication networks: dynamic connectivity, low bandwidth, high latency

Performance concerns differ from standard C/S • Unpredictable nature of requests • Highly bursty • Varies with geographical location of clients, day of week, time of day • Highly variable size of requested objects • “Robot” access • Autonomous software agents that can consume significant amounts of system resources

Types of servers providing Web Services • Web servers • Transaction servers • Proxy servers • Cache servers • Wireless gateway servers • Mirror servers

Common problems • Insufficient bandwidth at peak times • Overloaded servers • Uneven server loads • Delivery of dynamic content • Shortage of connections between application servers and database servers • Failure of third-party servers • Delivery of multi-media content

Example: Bill Paying Service • Portal offers bill paying service • Customers can pay variety of bills through the service • Uses services provided by others: • Debit authorization (100 tps capability) • Electronic funds transfer • Customer authentication


Example: Bill Paying Service • Portal B is bill paying service • Treat overall web service as ‘system’ • Treat component services as ‘devices’ • What is the capacity of B, given that the debit authorization service can support 100 tps and that each payment transaction requires 2 visits to the debit authorization service? • Forced Flow Law: Xi = Vi * X0 • 100 = 2 * X0 • X0 = 50 tps
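
The capacity calculation above can be checked with a short Python sketch. It only rearranges Xi = Vi * X0 into X0 = Xi / Vi; the function names are illustrative, not part of the original slides, and the multi-device helper assumes every component service has a known capacity and visit count.

```python
def system_capacity(device_capacity_tps, visits_per_transaction):
    """Upper bound on system throughput X0 imposed by one device.

    Rearranges Xi = Vi * X0 into X0 = Xi / Vi.
    """
    if visits_per_transaction <= 0:
        raise ValueError("visits_per_transaction must be positive")
    return device_capacity_tps / visits_per_transaction


def tightest_capacity(devices):
    """Smallest X0 bound over several devices.

    `devices` maps a device name to (capacity_tps, visits_per_transaction).
    The device that yields the smallest bound limits the whole system.
    """
    return min(system_capacity(cap, visits) for cap, visits in devices.values())


# Debit authorization: 100 tps capacity, 2 visits per payment transaction.
print(system_capacity(100, 2))
```

With the slide's numbers this yields 50 tps for portal B. The other component services (funds transfer, authentication) would tighten the bound only if their own Xi / Vi were lower.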

Web server elements

HTML and XML • Most documents on the Web written using HTML “markup language” • Most consist of text and inline images • Can also include other multimedia objects • Generates multiple requests: for document and for each inline image -- single click by user may generate series of requests • XML uses tags and attributes to define/delimit data • Application must interpret meaning of the tags

Hardware and Operating System • Hardware view: performance a function of: • Number and speed of processors • Amount of main memory • Bandwidth and storage capacity of disk subsystem • Bandwidth of the NIC • OS considerations: • Performance, scalability, reliability, robustness

Content • Performance affected by: • Content size • Content structure • Hyperlinks • Popularity of content

Perception of Performance • User view: • Fast response time; no connections refused • Management view: • High throughput; high availability • Need to have quantitative measurements that describe behavior of Web service

Metrics • Two most important: • Response time -- seconds • Throughput -- http_ops/sec, also bits/sec

Other metrics • Hit • any connection to a web site, including in-line requests and errors • difficult to compare across sites • Visit • Series of page requests by a user at a single site • Inter-request times < timeout_value • Session • Series of consecutive and related requests made during a single visit • Inter-request times < timeout_value
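
As a rough illustration of the timeout rule above, the following Python sketch groups one user's request timestamps into visits (or sessions): a request joins the current group only if it arrives less than timeout_value after the previous one. The function name and the 1800-second timeout in the usage line are assumptions chosen for illustration; the slides do not give a value.

```python
def split_by_timeout(timestamps, timeout_value):
    """Group request timestamps (in seconds) into visits.

    A request belongs to the current visit when its inter-request time
    is strictly less than timeout_value; otherwise it starts a new visit.
    """
    visits = []
    for t in sorted(timestamps):
        if visits and t - visits[-1][-1] < timeout_value:
            visits[-1].append(t)
        else:
            visits.append([t])
    return visits


# Hypothetical request times for one user, with an assumed 30-minute timeout.
print(split_by_timeout([0, 10, 20, 2000, 2010], timeout_value=1800))
```

Counting the resulting groups per day gives a visits/day figure. Different sites may pick different timeout values, which is one reason such counts are hard to compare across sites.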

Other metrics • User-perceived response time • Set of geographically distributed agents poll the WS • Error rate • Increase indicates degrading performance • Examples: • Overflow of pending connection queue • For streaming services: • Jitter • Startup latency

Most common measurements of Web service performance • End-to-end response time • Site response time • Throughput (req/sec) • Throughput (Mbps) • Errors/sec • Visitors/day • Unique visitors/day

Example - Travel Agency • Monitor for 30 minutes: • 9000 HTTP requests • Three types of objects delivered: • Html pages (30%, avg. size 11,200 bytes) • Images (65%, avg. size 17,200 bytes) • Video clips (5%, avg. size 439,000 bytes) • What is the throughput: • 9000 requests/1800 sec = 5 req/sec • What is the throughput in Kbps?

Throughput in Kbps? • Xr = (total_req * class% * avg. size * 8 bits/byte)/time, with 1 Kbit = 1024 bits • Xhtml = (9000 * 0.30 * 11,200*8)/(1800 * 1024) = 131.25 Kbps • Ximage = (9000 * 0.65 * 17,200*8)/(1800 * 1024) = 436.72 Kbps • Xvideo = (9000 * 0.05 * 439,000*8)/(1800 * 1024) = 857.42 Kbps • X0 = 131.25 + 436.72 + 857.42 • X0 = 1425.39 Kbps • To support the Web traffic, the network connection should be at least a T1 line (1.544 Mbit/s).
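
The per-class throughput arithmetic can be reproduced with a small Python sketch. It assumes 1 Kbit = 1024 bits, which is the convention that matches the slide's figures; the names are illustrative.

```python
BITS_PER_KBIT = 1024  # assumption: matches the figures on the slide


def class_throughput_kbps(total_requests, fraction, avg_size_bytes, seconds):
    """Throughput of one object class in Kbps over the monitoring period."""
    bits = total_requests * fraction * avg_size_bytes * 8
    return bits / seconds / BITS_PER_KBIT


# (fraction of requests, average size in bytes) from the travel agency example
classes = {
    "html": (0.30, 11_200),
    "image": (0.65, 17_200),
    "video": (0.05, 439_000),
}

per_class = {
    name: class_throughput_kbps(9000, fraction, size, 1800)
    for name, (fraction, size) in classes.items()
}
total_kbps = sum(per_class.values())

for name, kbps in per_class.items():
    print(f"{name}: {kbps:.2f} Kbps")
print(f"total: {total_kbps:.2f} Kbps")
```

Video clips are only 5% of the requests but account for more than half of the bandwidth, which is why request rate alone is a poor sizing guide.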

QoS indicators for Web Services • Response time • Availability • Percentage of time a service is ‘live’ (serving customer requests) • Reliability • Probability that WS will perform in satisfactory manner for a given period of time under specified operating and load conditions • Predictability • Cost

Input data needed to monitor QoS • Traffic • Performance • Usage patterns • Knowledge of average and peak load

Where are the delays?

Where are the delays? • Four categories: • DNS lookup phase • TCP connection set-up phase • Server execution time • Network time

DNS lookup phase • Browser converts server name in URL into an IP address to establish the TCP connection • If server name can’t be resolved by local cache, send query to higher-level DNS server • For leading e-commerce sites, avg. lookup times are between 0.01 and 0.11 sec. Fastest sites achieve 0.001 sec.

Anatomy of a Web Transaction

Anatomy of a Web transaction • Browser • Network • Server

Anatomy of a Web Transaction: the Browser • User clicks on hyperlink; requests document • Client (browser) checks local cache for document; • in case of hit: • returns document; user response time R’Browser,hit* • In case of miss • Browser asks DNS to map server hostname to IP address • Client opens a TCP connection to the server defined by the URL of the link • Client sends an HTTP request to the server • Browser formats and displays document and renders images • Returned document is stored in browser cache • User response time: R’Browser,miss*

Anatomy of a Web Transaction: the Network • Imposes delays in delivering info from client to server (R’N1) and from server to client (R’N2). • Delays a function of components on path between them: • Modems, routers, comm links, bridges, relays • R’Network • = total time HTTP request spends in the network • = R’N1 + R’N2

Anatomy of a Web transaction: the Server • request arrives from client • server parses the request according to the HTTP protocol • server executes requested method (GET, HEAD, etc.) • if GET • server looks up file in its document tree by using the file system; file may be in cache or on disk • server reads contents of file from disk or cache and writes it to network port • when file transfer is complete, server closes the connection (if non-persistent HTTP) • R’server = time spent in execution of HTTP request • includes service time and waiting time at the server

Anatomy of a Web transaction • If document not found in client’s cache: • response time is sum of residence time at all resources • Rmiss = R’Browser, miss + R’Network + R’Server • If a hit • Rhit = R’Browser, hit • Typically: • Rhit << Rmiss • Average response time, R, over NT requests, where pc is the fraction of requests served from the browser cache: • R = pc * Rhit + (1-pc) * Rmiss

Example • User wants to analyze impact of local cache size of browser on Web response time perceived by user • 20% of requests serviced by local cache with R=400 msec • R for remotely serviced requests = 3 sec • Previous expts. indicate that 3x cache size results in hit rate of 45% • R_orig=0.20 * 0.4 + 0.80 * 3.0 = 2.48 sec • R_new = 0.45 * 0.4 + 0.55 * 3.0 = 1.83 sec
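
The weighted average used in this example can be written as a short Python sketch. The function name is illustrative, and the code simply evaluates R = pc * Rhit + (1 - pc) * Rmiss for the two hit rates given on the slide.

```python
def average_response_time(hit_ratio, r_hit, r_miss):
    """Average response time as a hit-ratio-weighted mix of hit and miss times."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * r_hit + (1.0 - hit_ratio) * r_miss


r_orig = average_response_time(0.20, r_hit=0.4, r_miss=3.0)
r_new = average_response_time(0.45, r_hit=0.4, r_miss=3.0)
print(f"original cache: {r_orig:.2f} sec, 3x cache: {r_new:.2f} sec")
```

Tripling the cache size cuts the average from 2.48 sec to 1.83 sec, roughly a 26% reduction, because each extra hit replaces a 3-second remote fetch with a 0.4-second local one.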

Bottlenecks • bottleneck = the component that limits system performance • Need to identify the bottleneck to improve performance

Example • home user • takes too long to download medium-size page (avg. size 20KB) • considering upgrading to processor w/2X faster CPU • How will this affect response time?

Example, continued • Assume: • R’network = 7.5 sec • R’server = 3.6 sec • R’Browser, miss = 0.3 sec • R = R’network + R’server +R’Browser, miss • R = 7.5 sec + 3.6 sec + 0.3 sec = 11.4 sec • Rnew = 7.5 + 3.6 + 0.15 = 11.25 sec • not much difference … CPU not the bottleneck
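
A minimal Python sketch of the same reasoning: sum the residence times, find the largest component, and see what a 2X speedup of one component buys. The dictionary keys and function names are illustrative, and the sketch assumes a 2X faster CPU halves only the browser-side time, as the slide does.

```python
def total_response_time(components):
    """Response time as the sum of residence times at each resource."""
    return sum(components.values())


def with_speedup(components, name, factor):
    """Copy of the components with one residence time divided by `factor`."""
    updated = dict(components)
    updated[name] = components[name] / factor
    return updated


components = {"network": 7.5, "server": 3.6, "browser": 0.3}  # seconds

r_before = total_response_time(components)
r_after = total_response_time(with_speedup(components, "browser", 2))
bottleneck = max(components, key=components.get)

print(f"before: {r_before:.2f} sec, after CPU upgrade: {r_after:.2f} sec")
print(f"largest component: {bottleneck}")
```

The upgrade saves 0.15 sec out of 11.4 sec, while the network alone accounts for about two thirds of the total, so that is where an improvement would pay off.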

Example • Pharma co. plans intranet for training and display of images of molecules • training sessions have 100 people • assume 80% active at any one time • Each user performs avg. of 100 ops/hour • Each op requests avg. of 5 images • Avg. size of requested image is 25600 bytes • What is minimum bandwidth of network connection to image server?

Example, continued • 100 users * 0.80 active * 100 ops/hour * 5 images/op * 25600 bytes/image * 8 bits/byte * 1 hr/3600 sec • (100 * 0.80 * 100 * 5 * 25600 * 8)/3600 = 2.28 Mbps
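
The bandwidth estimate can be reproduced with the following Python sketch. The function and parameter names are illustrative, and it assumes 1 Mbit = 1,000,000 bits, which matches the 2.28 Mbps figure.

```python
def required_bandwidth_mbps(users, active_fraction, ops_per_hour,
                            images_per_op, image_bytes):
    """Average bandwidth (Mbps) needed to serve the image requests."""
    images_per_hour = users * active_fraction * ops_per_hour * images_per_op
    bits_per_hour = images_per_hour * image_bytes * 8
    return bits_per_hour / 3600 / 1_000_000


bandwidth = required_bandwidth_mbps(
    users=100, active_fraction=0.80, ops_per_hour=100,
    images_per_op=5, image_bytes=25_600,
)
print(f"{bandwidth:.2f} Mbps")
```

This is an average over the hour. Since the slides note that Web traffic is highly bursty, the link would likely need headroom above this minimum.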

Web Infrastructure

Web infrastructure • Three major delay sources: • “last mile” • Link between end user and phone company switch, or DSL or cable connection to service provider • ISPs • Recently, more bandwidth added • Improvements via caching, load balancing, more servers • ‘backbone’ of network • Collection of interconnected network providers • Connect to each other to exchange traffic (peering) • Public peering: at major interconnection points (NAPs, Network Access Points; MAEs, Metropolitan Area Exchanges) • Delays may occur at peering points

Basic Components • Servers • Browsers • Firewalls • protect data, programs, and computers on private network from the uncontrolled activities of untrusted users and software on other computers • Screens network traffic going through it, using • Software, network hardware, computers • Potential performance bottleneck

Proxy, Cache, Mirror • Techniques for improving web performance and security • Try to reduce • access time to web documents • Network bandwidth required for doc xfers • Demand on servers w/ very popular docs

Proxy server • Special type of web server that acts as an agent: server to the client, client to the server • Accepts requests from clients, forwards them to web servers • Receives responses from remote servers, forwards them back to the client • Originally designed to provide web access for users on private networks who had to go through a firewall

Proxy server • Can be configured to cache relayed responses • Benefits: • Improves access speed by bringing data closer to consumer • Cuts down on network traffic • Reduces server load • Increases availability in the web • Problems: • Ensuring that cached docs are up-to-date • What’s worth caching? For how long?


Caching • Used in the Web: • Client-side, at the browser • In the network, a caching proxy • Evaluating caching effectiveness: • Hit ratio = requests_satisfied/total_requests • Byte hit ratio = hit ratio weighted by doc size • Data transferred = bytes xferred/time
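
To make the difference between hit ratio and byte hit ratio concrete, here is a small Python sketch that computes both from a request log. The log format, a list of (size_bytes, was_hit) pairs, is a hypothetical choice for illustration, and byte hit ratio is taken to mean bytes served from the cache divided by total bytes requested.

```python
def cache_ratios(requests):
    """Return (hit_ratio, byte_hit_ratio) for a list of (size_bytes, was_hit)."""
    if not requests:
        raise ValueError("requests must not be empty")
    total_bytes = sum(size for size, _ in requests)
    if total_bytes == 0:
        raise ValueError("total requested bytes must be positive")
    hits = sum(1 for _, was_hit in requests if was_hit)
    hit_bytes = sum(size for size, was_hit in requests if was_hit)
    return hits / len(requests), hit_bytes / total_bytes


# Hypothetical log: one small document served from cache, one large miss.
hit_ratio, byte_hit_ratio = cache_ratios([(100, True), (900, False)])
print(hit_ratio, byte_hit_ratio)
```

In this made-up log half of the requests hit the cache, but only a tenth of the bytes do, so the two metrics can tell quite different stories about the same cache.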

Example • Manager wants to install caching proxy server on corporate intranet w/ > 2000 users • Use for 6 months -> then evaluate • Consider two cases: • Cache holds small documents, avg. size 4800 bytes, hit ratio 60% • Cache holds medium documents, avg. size 32500 bytes, hit ratio 20% • Monitor for one hour, observe 28800 requests

Cache efficiency • Saved_BW = • (num_req * hit_ratio * avg_size)/time • Saved_BW_small = • (28800 * 0.60 * 4800 * 8)/3600 sec = 184Kbps • Saved_BW_med = • (28800 * 0.20 * 32500*8)/3600 = 416 Kbps • Holding larger documents can save more BW
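
The two saved-bandwidth cases can be compared with a short Python sketch. The function name is illustrative, and it assumes 1 Kbit = 1000 bits, which matches the 184 and 416 Kbps figures on this slide.

```python
def saved_bandwidth_kbps(num_requests, hit_ratio, avg_size_bytes, seconds):
    """Bandwidth (Kbps) no longer needed because hits are served by the cache."""
    saved_bits = num_requests * hit_ratio * avg_size_bytes * 8
    return saved_bits / seconds / 1000


saved_small = saved_bandwidth_kbps(28_800, 0.60, 4_800, 3600)
saved_medium = saved_bandwidth_kbps(28_800, 0.20, 32_500, 3600)
print(f"small docs: {saved_small:.0f} Kbps, medium docs: {saved_medium:.0f} Kbps")
```

The medium-document policy has one third of the hit ratio but saves more than twice the bandwidth, because each hit avoids a much larger transfer.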

Mirroring • Replicating site content at other servers • Requires: • Regular updates • DNS to direct browsers to secondary sites when primary is busy • Goals: • Increase availability • Balance server load • Thus increasing quality of service

Example • Manufacturing co., employee portal, too slow for European users • Idea: install mirror site in Paris • What are the bandwidth savings ?
