Information Systems 337 Prof. Harry Plantinga Analysis and Performance
Analysis • Your boss says "Who's visiting our site? Where are they coming from? What pages are popular? What paths are users taking through the site? What are the problem pages where we lose visitors?" • Analysis options: • Read the raw logfile • Set up a logfile analysis program • Sign up for Google Analytics
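As a rough illustration of what a logfile analysis program does, here is a minimal Python sketch that counts popular pages and referrers. It assumes the server writes Apache's "combined" log format; the sample lines, addresses, and the `summarize` helper are made up for illustration.

```python
import re
from collections import Counter

# Assumed input: Apache "combined" log format. Adjust the pattern for other formats.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count requests per path and per referrer, skipping lines that do not parse."""
    pages, referrers = Counter(), Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue
        pages[match.group("path")] += 1
        referer = match.group("referer")
        if referer not in ("", "-"):
            referrers[referer] += 1
    return pages, referrers


# Hypothetical log lines, for illustration only.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2010:13:55:36 -0700] "GET /index.html HTTP/1.1" '
    '200 2326 "http://example.com/start" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2010:13:55:40 -0700] "GET /about.html HTTP/1.1" '
    '200 1200 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2010:13:56:01 -0700] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0"',
]

pages, referrers = summarize(SAMPLE)
for path, hits in pages.most_common():
    print(hits, path)
```

Answering "what paths are users taking" would additionally require grouping requests by visitor and ordering them by time, which this sketch does not attempt.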
Google Analytics • An amazing tool… • Where do my customers live? • What path do they take through the site? • What links are popular? • Would a certain change improve the site? (A/B testing) • How to add Google Analytics to a web site? • Sign up for GA, install the google_analytics module
Suppose… • You set up a new web site using Drupal • Performance is mediocre • A site that was in the top 5% (for speed) drops to near the bottom… • 5 seconds for a page load • 0.75 sec minimum for every file served • high server load averages • occasionally, load average shoots up and server crashes • What to do?
Second domain name • Add a second server on a different domain name • Serve static files (artwork, stylesheets, etc.) from the separate server • Browsers limit simultaneous downloads per domain name, so two domain names allow 4 files to be downloaded at once • Can turn on maximum caching for static files
Caching • Turn on Drupal's page cache • Make sure users' browser cache works as well as possible • Static content should never expire • Change filename for revised files • Static server: add an Expires or Cache-Control header so content doesn't expire • Still, 40-60% of visitors come with an empty cache…
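One common way to "change filename for revised files" is to put a hash of the file's content into its name, so a revised file automatically gets a new URL while unchanged files keep their cached copy. The sketch below is one possible approach, not something prescribed by these slides; `versioned_name`, `far_future_headers`, and the one-year lifetime are illustrative assumptions.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes) -> str:
    """Insert a short content hash before the extension, e.g. css/style.<hash>.css."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def far_future_headers(max_age_seconds: int = 365 * 24 * 3600) -> dict:
    """Header a static server could send so browsers keep the file (assumed: one year)."""
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}


old_name = versioned_name("css/style.css", b"body { color: black; }")
new_name = versioned_name("css/style.css", b"body { color: navy; }")
print(old_name)
print(new_name)  # differs from old_name because the content changed
print(far_future_headers())
```

Pages must then reference the versioned name, so the build or deployment step that renames the file also has to rewrite the links.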
Optimize artwork • Make sure artwork files are small • optimize in Photoshop or another program • Many small files == slow. How to reduce? • combine artwork into one background image • use jQuery to round corners • use CSS Sprites • use inline images in stylesheets with data: URLs
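To make the data: URL idea concrete, here is a small Python sketch that base64-encodes image bytes and embeds them in a stylesheet rule, saving one HTTP request per image. The helper names and the `.logo` selector are hypothetical, and the sample bytes are only a GIF file signature, not a complete image.

```python
import base64


def data_url(content: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URL."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def css_background_rule(selector: str, content: bytes, mime_type: str) -> str:
    """Build a CSS rule that inlines the image instead of linking to a file."""
    return f'{selector} {{ background-image: url("{data_url(content, mime_type)}"); }}'


# Placeholder bytes: just the 6-byte GIF signature, not a real image.
sample_bytes = b"GIF89a"
print(data_url(sample_bytes, "image/gif"))
print(css_background_rule(".logo", sample_bytes, "image/gif"))
```

Base64 makes the embedded data about a third larger than the original file, so this trade tends to pay off only for small images.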
Reduce download times • 80-90% of wait time is downloading content • More ideas • Compress components • Browser sends Accept-Encoding: gzip, deflate? Then deflate! • Apache: use mod_deflate • Use a content delivery network • E.g. Akamai • Can improve average response time 20% or more
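The compression step can be sketched in a few lines of Python: check the request's Accept-Encoding header and gzip the response body only if the client advertises support. This is a simplified illustration of what a module such as mod_deflate does, not its actual implementation; the function names are hypothetical and q-values in the header are ignored.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """True if the Accept-Encoding value lists gzip (simplification: q-values ignored)."""
    codings = [part.split(";")[0].strip().lower() for part in accept_encoding.split(",")]
    return "gzip" in codings


def maybe_compress(body: bytes, accept_encoding: str):
    """Return (body, extra_headers), compressing only when the client accepts gzip."""
    if accepts_gzip(accept_encoding):
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    return body, {}


page = b"<p>Hello, visitor!</p>\n" * 200  # repetitive markup compresses well
compressed, headers = maybe_compress(page, "gzip, deflate")
print(len(page), "bytes before,", len(compressed), "bytes after", headers)
```

Text formats such as HTML, CSS, and JavaScript benefit most; already-compressed artwork (JPEG, PNG) gains little from a second pass.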
Make pages appear faster • Stylesheets at the top, so pages can be rendered right away • Scripts at the bottom • Specify image dimensions • Minimize expensive page components
Page Speed • Google Page Speed tools help identify ways to improve performance • Firebug add-on for Firefox • Apache module mod_pagespeed
Problems remain? • At this point we still had very slow page display times and occasional load average spike/crash • How to debug? • What are the possible server performance bottlenecks?
Server Bottlenecks • Possible server bottlenecks • Is the CPU maxed out? • Running out of RAM • Disk speed, transactions per second • Network bandwidth • Database capacity • How to test?
Disk maxed out? • Check disk activity (reads vs. writes) • vmstat on Linux • Performance Monitor on Windows • or write your own… • If pegged, find cause • excessively verbose logging? • background process, e.g. backups? • poorly configured database? • too many database writes? • some other bad algorithm?
Running out of bandwidth? • Get a bigger pipe • Or, send less data • smaller artwork • better caching • gzip compression
Running out of RAM? • Check with top or free on Linux, Performance Monitor on Windows, etc. • Each server process takes 20-100 MB or more • Each Apache or Drupal module takes more • Our crashes were due to running out of RAM, then swap space • Solutions? • Set MaxClients appropriately • Turn down the KeepAlive timeout! (Default 15 sec, to 2?) • Set MaxRequestsPerChild to a lower number, e.g. 300 • Get more RAM
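A back-of-the-envelope way to "set MaxClients appropriately" is to divide the RAM left over for the web server by the size of one server process. The Python sketch below does that arithmetic; the 2048 MB server and 512 MB reserved for the OS and database are made-up example figures, and only the 20-100 MB per-process range comes from the slide above.

```python
def max_clients(total_ram_mb: float, reserved_mb: float, per_process_mb: float) -> int:
    """Largest number of server processes that fit in RAM without swapping."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    return max(0, int(available_mb // per_process_mb))


# Hypothetical server: 2048 MB total, 512 MB kept for the OS and database.
for size_mb in (20, 50, 100):
    print(f"{size_mb} MB per process -> MaxClients about "
          f"{max_clients(2048, 512, size_mb)}")
```

Measure the real per-process size with top on the running server rather than trusting an estimate, since loaded modules change it considerably.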
Running out of CPU? • Check with top (Linux/Mac) or Performance Monitor (Windows) • PHP is interpreted • Each request: load, compile all the code… • This was taking us something like 0.5 secs of CPU time per page • Use a PHP Accelerator (bytecode cache) • e.g. eAccelerator, Zend, Alternative PHP Cache • Application profiling • Get a faster CPU
Database swamped? • Use mysqladmin to figure out how many transactions per second you are using • How many transactions per second can a database process? • fully cached: lots • requiring a disk access: 100? 200? • Solutions: • optimize expensive queries • add indexes on tables • enlarge server's query cache / tune server • reduce database writes
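One way to get a current rate out of mysqladmin is to capture the output of `mysqladmin status` twice and divide the change in the Questions counter by the change in Uptime. The Python sketch below parses two such captures; the sample strings are invented, and note that Questions counts statements sent to the server, which only approximates transactions.

```python
import re


def parse_status(text: str):
    """Extract (uptime_seconds, questions) from one line of `mysqladmin status` output."""
    uptime = re.search(r"Uptime:\s*(\d+)", text)
    questions = re.search(r"Questions:\s*(\d+)", text)
    if uptime is None or questions is None:
        raise ValueError("not recognizable as mysqladmin status output")
    return int(uptime.group(1)), int(questions.group(1))


def queries_per_second(earlier: str, later: str) -> float:
    """Average statements per second between two status samples."""
    t1, q1 = parse_status(earlier)
    t2, q2 = parse_status(later)
    if t2 <= t1:
        raise ValueError("samples must be taken at increasing uptimes")
    return (q2 - q1) / (t2 - t1)


# Hypothetical samples taken 60 seconds apart.
first = "Uptime: 1000  Threads: 3  Questions: 50000  Slow queries: 2  Opens: 78"
second = "Uptime: 1060  Threads: 4  Questions: 59000  Slow queries: 2  Opens: 78"
print(queries_per_second(first, second), "queries per second")
```

Comparing this figure with the rough capacity numbers above (100-200 per second when each query needs a disk access) indicates whether the database is a likely bottleneck.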
Drupal-specific optimizations • Some improvements you can make in Drupal: • Turn on page caching • Stylesheet optimization • Prune the sessions table/make sessions shorter • Reduce time before garbage collection (cron) • Prune error reporting logs (watchdog table) • Automatic throttling
What next? • I've done all that. What else can I do? • Separate database server • Load balancer and additional servers • Database replication