Performance Issues of Web Services

Performance Issues of Web Services. CSCI 8710 November 29-30, 2006 Kraemer. Web Services. Services available via the Internet that complete tasks or conduct transactions. Self-contained, modular applications that can be described, published, and invoked over the Internet.


Performance Issues of Web Services

CSCI 8710

November 29-30, 2006

Kraemer

Web Services
  • Services available via the Internet that complete tasks or conduct transactions.
  • Self-contained, modular applications that can be described, published, and invoked over the Internet.
  • Can be automatically invoked by application programs.
Web Services
  • May be invoked at one site or may combine results of several services executed at different sites.
Performance concerns differ from standard C/S
  • May involve both web service processing and network delays
  • May be accessed by wide variety of devices -- desktop computers, PDAs, mobile phones, other servers
  • Access via wireless communication networks: dynamic connectivity, low bandwidth, high latency
Performance concerns differ from standard C/S
  • Unpredictable nature of requests
    • Highly bursty
    • Varies with geographical location of clients, day of week, time of day
  • Highly variable size of requested objects
  • “Robot” access
    • Autonomous software agents that can consume significant amounts of system resources
Types of servers providing Web Services
  • Web servers
  • Transaction servers
  • Proxy servers
  • Cache servers
  • Wireless gateway servers
  • Mirror servers
Common problems
  • Insufficient bandwidth at peak times
  • Overloaded servers
  • Uneven server loads
  • Delivery of dynamic content
  • Shortage of connections between application servers and database servers
  • Failure of third-party servers
  • Delivery of multi-media content
Example: Bill Paying Service
  • Portal offers bill paying service
  • Customers can pay variety of bills through the service
  • Uses services provided by others:
    • Debit authorization (100 tps capability)
    • Electronic funds transfer
    • Customer authentication
Example: Bill Paying Service
  • Portal B is bill paying service
  • Treat overall web service as ‘system’
  • Treat component services as ‘devices’
  • What is the capacity of B, given that the debit authorization service can support 100 tps and that each payment transaction requires 2 visits to the debit authorization service?
  • Forced Flow Law: Xi = Vi * X0
  • 100 = 2 * X0
  • X0 = 50 tps
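The capacity calculation above can be sketched in a few lines of Python. This is only an illustration of the Forced Flow Law solved for the system throughput; the function name `system_capacity` is a hypothetical helper, not something from the slides.

```python
def system_capacity(device_capacity_tps: float, visits_per_transaction: float) -> float:
    """Forced Flow Law, Xi = Vi * X0, solved for the system throughput X0."""
    if visits_per_transaction <= 0:
        raise ValueError("visits_per_transaction must be positive")
    return device_capacity_tps / visits_per_transaction


# Debit authorization: 100 tps capacity, 2 visits per payment transaction.
portal_capacity = system_capacity(100, 2)
print(portal_capacity)  # 50.0
```

If the portal used several component services, the overall capacity would be the minimum of this value over all of them.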
HTML and XML
  • Most documents on the Web written using HTML “markup language”
  • Most consist of text and inline images
  • Can also include other multimedia objects
  • Generates multiple requests: for document and for each inline image -- single click by user may generate series of requests
  • XML uses tags and attributes to define/delimit data
    • Application must interpret meaning of the tags
Hardware and Operating System
  • Hardware view: performance a function of:
    • Number and speed of processors
    • Amount of main memory
    • Bandwidth and storage capacity of disk subsystem
    • Bandwidth of the NIC
  • OS considerations:
    • Performance, scalability, reliability, robustness
Content
  • Performance affected by:
    • Content size
    • Content structure
    • Hyperlinks
    • Popularity of content
Perception of Performance
  • User view:
    • Fast response time; no connections refused
  • Management view:
    • High throughput; high availability
  • Need to have quantitative measurements that describe behavior of Web service
Metrics
  • Two most important:
    • Response time -- seconds
    • Throughput -- http_ops/sec, also bits/sec
Other metrics
  • Hit
    • any connection to a web site, including in-line requests and errors
    • difficult to compare across sites
  • Visit
    • Series of page requests by a user at a single site
    • Inter-request times < timeout_value
  • Session
    • Series of consecutive and related requests made during a single visit
    • Inter-request times < timeout_value
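As a rough sketch of the timeout rule for visits and sessions, the snippet below groups one user's request timestamps into visits: a new visit starts whenever the gap since the previous request reaches the timeout. The function name `split_into_visits` and the 1800-second timeout are assumptions chosen for illustration; the slides do not give a value for timeout_value.

```python
def split_into_visits(timestamps, timeout):
    """Group request times (in seconds) into visits.

    Consecutive requests stay in the same visit while the
    inter-request time is below `timeout`.
    """
    visits = []
    for t in sorted(timestamps):
        if visits and t - visits[-1][-1] < timeout:
            visits[-1].append(t)   # gap below timeout: same visit
        else:
            visits.append([t])     # first request, or gap too long: new visit
    return visits


# Hypothetical request times for one user, in seconds.
visits = split_into_visits([0, 40, 100, 2000, 2030, 5000], timeout=1800)
print(visits)  # [[0, 40, 100], [2000, 2030], [5000]]
```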
Other metrics
  • User-perceived response time
    • Set of geographically distributed agents poll the WS
  • Error rate
    • Increase indicates degrading performance
    • Examples:
      • Overflow of pending connection queue
  • For streaming services:
    • Jitter
    • Startup latency
Most common measurements of Web service performance
  • End-to-end response time
  • Site response time
  • Throughput (req/sec)
  • Throughput (Mbps)
  • Errors/sec
  • Visitors/day
  • Unique visitors/day
Example - Travel Agency
  • Monitor for 30 minutes:
    • 9000 HTTP requests
    • Three types of objects delivered:
      • Html pages (30%, avg. size 11,200 bytes)
      • Images (65%, avg. size 17,200 bytes)
      • Video clips (5%, avg. size 439,000 bytes)
  • What is the throughput?
    • 9000 requests/1800 sec = 5 req/sec
    • What is the throughput in Kbps?
Throughput in Kbps?
  • Xr = (total_req * class% * avg_size * 8 bits/byte)/(time * 1024), in Kbps
    • Xhtml = (9000 * 0.30 * 11,200 * 8)/(1800 * 1024) = 131.25 Kbps
    • Ximage = (9000 * 0.65 * 17,200 * 8)/(1800 * 1024) = 436.72 Kbps
    • Xvideo = (9000 * 0.05 * 439,000 * 8)/(1800 * 1024) = 857.42 Kbps
  • X0 = 131.25 + 436.72 + 857.42
  • X0 = 1425.39 Kbps

To support the Web traffic, the network connection should be at least a T1 line (1.544 Mbit/s ).
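The per-class throughput arithmetic for the travel agency can be checked with a short script. This is a sketch, assuming 1 Kbit = 1024 bits, which is the convention that reproduces the slide's figures; the names `class_throughput_kbps` and `classes` are illustrative only.

```python
BITS_PER_BYTE = 8
BITS_PER_KBIT = 1024  # assumption: this convention matches the slide's numbers


def class_throughput_kbps(total_requests, fraction, avg_size_bytes, interval_sec):
    """Throughput contributed by one object class, in Kbps."""
    bits = total_requests * fraction * avg_size_bytes * BITS_PER_BYTE
    return bits / interval_sec / BITS_PER_KBIT


# (share of requests, average size in bytes) for each object class
classes = {
    "html": (0.30, 11_200),
    "image": (0.65, 17_200),
    "video": (0.05, 439_000),
}

per_class = {
    name: class_throughput_kbps(9000, share, size, 1800)
    for name, (share, size) in classes.items()
}
total_kbps = sum(per_class.values())

for name, kbps in per_class.items():
    print(f"{name}: {kbps:.2f} Kbps")
print(f"total: {total_kbps:.2f} Kbps")
```

Note that video clips are only 5% of the requests but contribute well over half of the bits.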

QoS indicators for Web Services
  • Response time
  • Availability
    • Percentage of time a service is ‘live’ (serving customer requests)
  • Reliability
    • Probability that WS will perform in satisfactory manner for a given period of time under specified operating and load conditions
  • Predictability
  • Cost
Input data needed to monitor QoS
  • Traffic
  • Performance
  • Usage patterns
  • Knowledge of average and peak load
Where are the delays?
  • Four categories:
    • DNS lookup phase
    • TCP connection set-up phase
    • Server execution time
    • Network time
DNS lookup phase
  • Browser converts server name in URL into an IP address to establish the TCP connection
  • If server name can’t be resolved by local cache, send query to higher-level DNS server
  • For leading e-commerce sites, avg. lookup times are between 0.01 and 0.11 sec. Fastest sites achieve 0.001 sec.
Anatomy of a Web transaction
  • Browser
  • Network
  • Server
Anatomy of a Web Transaction: the Browser
  • User clicks on hyperlink; requests document
  • Client (browser) checks local cache for document;
    • in case of hit:
      • returns document; user response time R’Browser,hit*
    • In case of miss
      • Browser asks DNS to map server hostname to IP address
      • Client opens a TCP connection to the server defined by the URL of the link
      • Client sends an HTTP request to the server
      • Browser formats and displays document and renders images
      • Returned document is stored in browser cache
      • User response time: R’Browser,miss*
Anatomy of a Web Transaction: the Network
  • Imposes delays in delivering info from client to server (R’N1) and from server to client (R’N2).
    • Delays a function of components on path between them:
      • Modems, routers, comm links, bridges, relays
    • R’Network
      • = total time HTTP request spends in the network
      • = R’N1 + R’N2
Anatomy of a Web transaction: the Server
  • request arrives from client
  • server parses the request according to the HTTP protocol
  • server executes requested method (GET, HEAD, etc.)
    • if GET
      • server looks up file in its document tree by using the file system; file may be in cache or on disk
  • server reads contents of file from disk or cache and writes them to the network port
  • when the file has been sent, server closes the connection (if non-persistent HTTP)
  • R’server = time spent in execution of HTTP request
    • includes service time and waiting time at the server
Anatomy of a Web transaction
  • If document not found in client’s cache:
    • response time is sum of residence time at all resources
    • Rmiss = R’Browser, miss + R’Network + R’Server
  • If a hit
    • Rhit = R’Browser, hit
  • Typically:
    • Rhit << Rmiss
  • Average response time, R, over NT requests, of which a fraction pc are cache hits:
    • R = pc * Rhit + (1 - pc) * Rmiss
Example
  • User wants to analyze impact of local cache size of browser on Web response time perceived by user
    • 20% of requests serviced by local cache with R=400 msec
    • R for remotely serviced requests = 3 sec
    • Previous expts. indicate that 3x cache size results in hit rate of 45%
    • R_orig=0.20 * 0.4 + 0.80 * 3.0 = 2.48 sec
    • R_new = 0.45 * 0.4 + 0.55 * 3.0 = 1.83 sec
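The weighted average used in this example is easy to reproduce. The sketch below assumes, as the slide does, that the hit and miss response times stay the same when the cache grows; `average_response_time` is a hypothetical helper name.

```python
def average_response_time(hit_ratio, r_hit, r_miss):
    """R = pc * Rhit + (1 - pc) * Rmiss, all times in seconds."""
    return hit_ratio * r_hit + (1 - hit_ratio) * r_miss


r_orig = average_response_time(0.20, 0.4, 3.0)  # 20% hits with the current cache
r_new = average_response_time(0.45, 0.4, 3.0)   # 45% hits with a 3x larger cache

print(f"R_orig = {r_orig:.2f} sec")
print(f"R_new  = {r_new:.2f} sec")
print(f"reduction = {100 * (r_orig - r_new) / r_orig:.1f}%")
```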
Bottlenecks
  • bottleneck = the component that limits system performance
  • Need to identify the bottleneck to improve performance
Example
  • home user
    • takes too long to download medium-size page (avg. size 20KB)
    • considering upgrading to processor w/2X faster CPU
    • How will this affect response time?
Example, continued
  • Assume:
    • R’network = 7.5 sec
    • R’server = 3.6 sec
    • R’Browser, miss = 0.3 sec
  • R = R’network + R’server + R’Browser, miss
  • R = 7.5 sec + 3.6 sec + 0.3 sec = 11.4 sec
  • A 2X faster CPU halves only the browser time (0.3 -> 0.15 sec):
  • Rnew = 7.5 + 3.6 + 0.15 = 11.25 sec
    • not much difference … CPU not the bottleneck
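A small what-if script makes the bottleneck argument concrete. It is a sketch under the slide's assumption that a 2X faster CPU halves the browser residence time and nothing else; the helper names and the extra "network 2X" scenario are illustrative additions.

```python
def total_response_time(residence_times):
    """Response time is the sum of residence times at all resources."""
    return sum(residence_times.values())


def speed_up(residence_times, component, factor):
    """Return a copy with one component's residence time divided by `factor`."""
    upgraded = dict(residence_times)
    upgraded[component] = residence_times[component] / factor
    return upgraded


# Residence times in seconds, taken from the example.
residence = {"network": 7.5, "server": 3.6, "browser": 0.3}

r_before = total_response_time(residence)
r_after = total_response_time(speed_up(residence, "browser", 2))
bottleneck = max(residence, key=residence.get)

# Hypothetical comparison: the same 2X speedup applied to the bottleneck.
r_network_2x = total_response_time(speed_up(residence, "network", 2))

print(f"before: {r_before:.2f} sec, after CPU upgrade: {r_after:.2f} sec")
print(f"largest component: {bottleneck}")
print(f"with a 2X faster network instead: {r_network_2x:.2f} sec")
```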
Example
  • Pharma co. plans intranet for training and display of images of molecules
    • training sessions have 100 people
    • assume 80% active at any one time
    • Each user performs avg. of 100 ops/hour
    • Each op requests avg. of 5 images
    • Avg. size of requested image is 25600 bytes
  • What is minimum bandwidth of network connection to image server?
Example, continued
  • 100 * 0.80 * 100 ops/hour * 5 images/op * 25600 bytes/image * 8 bits /byte * 1 hr/3600 sec
  • (100 * 0.80 * 100 * 5 * 25600 * 8)/3600 = 2.28 Mbps
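The bandwidth estimate multiplies the number of active users by the per-user demand in bits per second. Here is a minimal sketch, assuming 1 Mbps = 10^6 bits/sec; `required_bandwidth_bps` is a hypothetical helper name.

```python
def required_bandwidth_bps(users, active_fraction, ops_per_hour,
                           objects_per_op, object_bytes):
    """Average bandwidth demand in bits per second."""
    bits_per_hour = (users * active_fraction * ops_per_hour
                     * objects_per_op * object_bytes * 8)
    return bits_per_hour / 3600  # 3600 seconds per hour


bw_bps = required_bandwidth_bps(
    users=100, active_fraction=0.80, ops_per_hour=100,
    objects_per_op=5, object_bytes=25_600,
)
bw_mbps = bw_bps / 1e6
print(f"{bw_mbps:.2f} Mbps")
```

This is an average; peak demand can be higher when user requests arrive in bursts.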
Web infrastructure
  • Three major delay sources:
    • “last mile”
      • Link between end user and phone company switch, or DSL or cable connection to service provider
    • ISPs
      • Recently, more bandwidth added
      • Improvements via caching, load balancing, more servers
    • ‘backbone’ of network
      • Collection of interconnected network providers
        • Connect to each other to exchange traffic (peering)
        • Public peering: at major interconnection points (NAPs, Network Access Points; MAEs, Metropolitan Area Exchanges)
        • Delays may occur at peering points
Basic Components
  • Servers
  • Browsers
  • Firewalls
    • protect data, programs, and computers on private network from the uncontrolled activities of untrusted users and software on other computers
    • Screens network traffic going through it, using
      • Software, network hardware, computers
    • Potential performance bottleneck
Proxy, Cache, Mirror
  • Techniques for improving web performance and security
  • Try to reduce
    • access time to web documents
    • Network bandwidth required for doc xfers
    • Demand on servers w/ very popular docs
Proxy server
  • Special type of web server that acts as an agent: server to the client, client to the server
  • Accepts requests from clients, forwards them to web servers
  • Receives responses from remote servers, forwards them back to the client
  • Originally designed to provide web access for users on private networks who had to go through a firewall
Proxy server
  • Can be configured to cache relayed responses
  • Benefits:
    • Improves access speed by bringing data closer to consumer
    • Cuts down on network traffic
    • Reduces server load
    • Increases availability in the web
  • Problems:
    • Ensuring that cached docs are up-to-date
    • What’s worth caching? For how long?
Caching
  • Used in the Web:
    • Client-side, at the browser
    • In the network, a caching proxy
  • Evaluating caching effectiveness:
    • Hit ratio = requests_satisfied/total_requests
    • Byte hit ratio = hit ratio weighted by doc size
    • Data transferred = bytes xferred/time
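The two cache metrics can diverge sharply, which the sketch below illustrates on a made-up request log. It takes the byte hit ratio to be bytes served from the cache divided by total bytes requested; the function name and the sample data are hypothetical.

```python
def cache_metrics(requests):
    """Compute (hit ratio, byte hit ratio).

    `requests` is a list of (size_bytes, was_hit) pairs.
    """
    total = len(requests)
    hits = sum(1 for _, was_hit in requests if was_hit)
    total_bytes = sum(size for size, _ in requests)
    hit_bytes = sum(size for size, was_hit in requests if was_hit)
    return hits / total, hit_bytes / total_bytes


# Three small documents served from cache, one large document missed.
sample = [(1000, True), (1000, True), (1000, True), (7000, False)]
hit_ratio, byte_hit_ratio = cache_metrics(sample)
print(f"hit ratio = {hit_ratio:.2f}, byte hit ratio = {byte_hit_ratio:.2f}")
```

A cache that mostly hits on small objects can show a high hit ratio while saving comparatively little bandwidth.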
Example
  • Manager wants to install caching proxy server on corporate intranet w/ > 2000 users
  • Use for 6 months -> then evaluate
  • Consider two cases:
    • Cache holds small documents, avg. size 4800 bytes, hit ratio 60%
    • Cache holds medium documents, avg. size 32500 bytes, hit ratio 20%
    • Monitor for one hour, observe 28800 requests
Cache efficiency
  • Saved_BW =
    • (num_req * hit_ratio * avg_size)/time
  • Saved_BW_small =
    • (28800 * 0.60 * 4800 * 8)/3600 sec = 184 Kbps
  • Saved_BW_med =
    • (28800 * 0.20 * 32500*8)/3600 = 416 Kbps
  • Holding larger documents can save more BW
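The saved-bandwidth comparison can be reproduced as follows. This sketch assumes 1 Kbps = 1000 bits/sec, which is the convention that matches the 184 and 416 Kbps figures; `saved_bandwidth_kbps` is a hypothetical helper name.

```python
def saved_bandwidth_kbps(num_requests, hit_ratio, avg_size_bytes, interval_sec):
    """Bandwidth no longer needed because hits are served from the cache."""
    saved_bits = num_requests * hit_ratio * avg_size_bytes * 8
    return saved_bits / interval_sec / 1000  # assumption: 1 Kbps = 1000 bps


saved_small = saved_bandwidth_kbps(28_800, 0.60, 4_800, 3600)
saved_medium = saved_bandwidth_kbps(28_800, 0.20, 32_500, 3600)

print(f"small documents:  {saved_small:.0f} Kbps")
print(f"medium documents: {saved_medium:.0f} Kbps")
```

Even with one third of the hit ratio, the medium-document cache saves more bandwidth because each hit avoids a much larger transfer.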
Mirroring
  • Replicating site content at other servers
  • Requires:
    • Regular updates
    • DNS to direct browsers to secondary sites when primary is busy
  • Goals:
    • Increase availability
    • Balance server load
      • Thus increasing quality of service
Example
  • Manufacturing co., employee portal, too slow for European users
  • Idea: install mirror site in Paris
  • What are the bandwidth savings ?
Example: Mirror site in Paris
  • Current avg. BW is 35 Mbps
  • 40% of load from Europe
  • 42% of traffic could be served from caching
  • Cacheable amount: 35 * 0.42 = 14.7 Mbps
  • Estimate cache hit ratio at 38%
  • Saved_BW = 14.7 Mbps * 0.38 = 5.6 Mbps
    • 40% of traffic from Europe, so:
      • 5.6 * 0.40 = 2.24 Mbps could be served from cache in Paris
      • 6.4% savings on current BW usage at server
      • improvement in perceived response time for European users
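The chain of percentages in the Paris example can be written out step by step. The sketch below carries full precision through the calculation, so the intermediate values differ slightly from the slide, which rounds 5.586 to 5.6 Mbps before continuing. Variable names are illustrative.

```python
total_bw_mbps = 35.0      # current average bandwidth at the server
cacheable_share = 0.42    # fraction of traffic that could be cached
hit_ratio = 0.38          # estimated cache hit ratio
europe_share = 0.40       # fraction of load coming from Europe

cacheable_mbps = total_bw_mbps * cacheable_share
saved_overall_mbps = cacheable_mbps * hit_ratio
saved_europe_mbps = saved_overall_mbps * europe_share
savings_fraction = saved_europe_mbps / total_bw_mbps

print(f"cacheable:         {cacheable_mbps:.1f} Mbps")
print(f"cache hits:        {saved_overall_mbps:.2f} Mbps")
print(f"served from Paris: {saved_europe_mbps:.2f} Mbps")
print(f"savings at server: {100 * savings_fraction:.1f}%")
```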
Content Delivery Networks (CDN)
  • cache or replicate content as needed to meet demands from clients over the Web
  • coordinated caching systems implemented through proprietary networks and data centers
  • employ a DNS-redirecting mechanism
  • tries to assign best location from which to serve the requested content
Content Delivery Networks (CDN)
  • DNS-redirecting mechanism:
    • client requests URL; browser generates a DNS request for the IP address corresponding to the domain name in the URL
    • CDN controls the DNS service for this domain name
    • CDN answers the DNS request with the IP address of a selected server rather than the IP address of the original server
    • uses a routing function to select “best” server:
      • client location, id of requested content, load of CDN network and servers, proximity of CDN servers to client are all considered
  • CDN should provide:
    • scalability, high availability, manageability, performance
The WAP Infrastructure
  • WAP = Wireless Application Protocol
    • architecture + set of protocols for wireless devices to access Web services at regular Web sites
    • wireless device communicates with WAP gateway, over wireless network
    • WAP gateway communicates with servers
The WAP Infrastructure
  • Docs for wireless devices written in form of XML known as WML (wireless markup language)
  • can also use WMLScript
  • WML docs
    • structured as set of “cards”, units of user interaction
    • deck = set of cards
    • users navigate between cards
The WAP Infrastructure
  • WML decks + WMLScripts
    • stored in regular web servers on internet
    • retrieved by WAP gateway via HTTP
    • Web server response is binary encoded by WAP gateway and sent to wireless device via lightweight protocols
      • designed to minimize BW requirements
Server Architectures
  • Web Server
  • Application Server
  • Transaction and Database Server
  • Streaming Server
  • Multi-tier Architecture
Web Server
  • listens for HTTP requests
  • establishes requested connection
  • sends requested file
  • returns to listening mode
  • can handle more than one request at a time
    • fork a copy of the HTTP process for each request
    • multi-threaded HTTP program
    • pool of running processes
Dynamic content
  • can use client-side or server-side programs
  • can improve performance by pushing to client-side
Application Server
  • software that handles all application operations between browser-based customers and back-end databases
    • receive client request
    • execute business logic, interacting with transaction and/or DB servers
  • can be implemented in many ways:
    • CGI scripts, FastCGIs, server-applications, server-side scripts
Transaction and Database Server
  • Transaction Processing (TP) monitor provides:
    • an application programming interface
    • a set of program development tools
    • a system to monitor and control execution of transaction programs
  • DB server:
    • executes and monitors transaction processing applications
Streaming Server
  • Initially, audio and video were “download and play” technologies
  • Streaming media begins to play “almost” immediately
    • client request arrives
    • server retrieves video and audio data and begins to deliver them over the network
    • video and audio are compressed (MPEG, MP3)
    • typically have control part and data part
Example
  • Company plans to offer MM online training
  • Employee retrieves lecture of video, audio, slides; 30 minute duration
  • What is the number of streaming servers needed to serve the lecture presentation during busiest period of the day: 4-5 pm
Example
  • 400 employees at peak
  • One MM server can stream presentations to 150 viewers simultaneously
  • What is the average number of simultaneous viewers during peak period?
    • Use Little’s Law: N = λ * R
    • λ = Req/time = 400 viewers/60 min
    • R = 30 min
    • N = 30 * 400/60 = 200
    • Need two MM servers
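Little's Law and the server count can be checked with the sketch below. It sizes for the average number of simultaneous viewers only, as the slide does; `littles_law_n` is a hypothetical helper name.

```python
import math


def littles_law_n(arrival_rate, residence_time):
    """Little's Law: N = lambda * R (rate and time must use the same time unit)."""
    return arrival_rate * residence_time


arrival_rate = 400 / 60      # viewers per minute during the peak hour
lecture_minutes = 30         # each viewer stays for the whole lecture
viewers_per_server = 150     # capacity of one streaming server

avg_viewers = littles_law_n(arrival_rate, lecture_minutes)
servers_needed = math.ceil(avg_viewers / viewers_per_server)

print(f"average simultaneous viewers: {avg_viewers:.0f}")
print(f"streaming servers needed: {servers_needed}")
```

Because this uses the average, an installation that must absorb bursts above 200 simultaneous viewers may need more headroom.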
Multi-tier Architecture
  • web-based apps usually in 3-tier architecture:
    • presentation layer
      • user interface (browser & HTML, XML, etc.)
    • application layer
      • business logic
        • collection of rules to implement application logic
        • may also contain Java applets, ActiveX controls, etc.
    • data service layer
      • persistent data
Example
  • application layer designed to support 400 simultaneous processes
  • app process:
    • receives client request
    • executes app logic, interacting with DB server
  • Monitoring shows:
    • app process executes for 150 msec between DB requests
    • DB server handles 550 req/sec
    • 400 app processes running during peak period
What if ...?
  • the application servers are replaced by new servers with 2X speed
  • Each application server characterized by Z, “think time” – time between receiving a reply from the DB server and submitting a new DB request
  • DB layer, characterized by throughput, X, in req/sec
  • R = N/X - Z
What if ...?
  • DB response time:
    • R = 400/550 – 0.15 = 577 msec = 0.577 sec
  • after cpu upgrade, app processing time should be 75 msec
  • DB response time now:
    • Rnew = 400/550 – 0.075 = 652 msec = 0.652 sec
  • Improvement in app layer may not lead to improvement overall
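The what-if result follows from the Interactive Response Time Law, R = N/X - Z. The sketch below uses the throughput of 550 req/sec that appears in the slide's arithmetic and assumes the DB server stays saturated at that rate after the upgrade; `db_response_time` is a hypothetical helper name.

```python
def db_response_time(n_processes, throughput, think_time):
    """Interactive Response Time Law: R = N / X - Z (times in seconds)."""
    return n_processes / throughput - think_time


N = 400   # application processes during the peak period
X = 550   # DB throughput in req/sec, assumed unchanged by the upgrade

r_before = db_response_time(N, X, think_time=0.150)  # 150 msec between DB requests
r_after = db_response_time(N, X, think_time=0.075)   # 2X faster application servers

print(f"DB response time before: {1000 * r_before:.0f} msec")
print(f"DB response time after:  {1000 * r_after:.0f} msec")
```

With X fixed, the cycle time N/X stays the same, so every millisecond removed from the application side reappears as waiting time at the DB server.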
Dynamic Load Balancing
  • heavy traffic load adversely impacting performance
    • add more servers
    • buy bigger (faster) servers
    • need to do cost-performance analysis
Dynamic Load Balancing
  • web cluster:
    • multiple web servers
    • single location addressed by one URL and a single virtual IP address
    • incoming requests routed among servers in user-transparent way
    • switch acts as dispatcher, mapping virtual IP address to actual address
Networks
  • Bandwidth
    • measures the rate at which data can be sent through the network
    • usually expressed in bps
  • Latency
    • time needed for a bit (or small packet) to travel across the network
Planning
  • Streaming service offers training videos
  • training session -> 15 min video at 300 Kbps
  • What impact if videos go to 25 min?
  • Service supports 35 simultaneous sessions
  • Average BW needed (now)
    • 35 * 300 Kbps = 10.5 Mbps
  • Average number simult. sessions (now)
    • N = 35
    • N = λ * R
    • 35 = λ * 15
    • λ = 35/15 = 2.33 sessions/min .. assume this remains the same
  • Nnew = λ * 25 = 35/15 * 25 = 58.33
  • Average BW needed (new)
    • 58.33 * 300 Kbps = 17.5 Mbps
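The planning calculation applies Little's Law twice: once to recover the arrival rate from today's load, and once to project the load for longer videos. A minimal sketch, assuming the arrival rate is unchanged and 1 Mbps = 1000 Kbps; variable names are illustrative.

```python
STREAM_KBPS = 300          # bit rate of one video session

sessions_now = 35          # average simultaneous sessions today
video_minutes_now = 15
video_minutes_new = 25

# Little's Law, N = lambda * R, solved for the arrival rate.
arrival_rate = sessions_now / video_minutes_now   # sessions per minute

# Same arrival rate, longer residence time.
sessions_new = arrival_rate * video_minutes_new

bw_now_mbps = sessions_now * STREAM_KBPS / 1000
bw_new_mbps = sessions_new * STREAM_KBPS / 1000

print(f"sessions: {sessions_now} -> {sessions_new:.2f}")
print(f"bandwidth: {bw_now_mbps:.1f} Mbps -> {bw_new_mbps:.1f} Mbps")
```

Both the session count and the bandwidth grow by the ratio 25/15, since neither the arrival rate nor the bit rate changes.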
Example
  • training videos, avg. size 950 MB
  • 100 students, 80% active at one time
  • Each user requests 2 clips/hour
  • BW needed to support:
    • ( 0.80 * 100) * 2 * (8 * 950)/3600 sec
    • 337.7 Mbps
    • Need a 622 Mbps ATM network to support
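The final estimate can be verified the same way. This sketch assumes 1 MB = 10^6 bytes and 1 Mbps = 10^6 bits/sec, which reproduces the slide's figure of roughly 337.7 Mbps; `clip_bandwidth_mbps` is a hypothetical helper name.

```python
def clip_bandwidth_mbps(students, active_fraction, clips_per_hour, clip_megabytes):
    """Average bandwidth in Mbps needed to deliver the requested clips."""
    megabits_per_hour = (students * active_fraction * clips_per_hour
                         * clip_megabytes * 8)
    return megabits_per_hour / 3600  # 3600 seconds per hour


bw_mbps = clip_bandwidth_mbps(
    students=100, active_fraction=0.80, clips_per_hour=2, clip_megabytes=950,
)
print(f"{bw_mbps:.1f} Mbps")
```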