WebTraff: A GUI for Web Proxy Cache Workload Modeling and Analysis
Nayden Markatchev, Carey Williamson
Department of Computer Science, University of Calgary
MASCOTS 2002
Introduction
• What is WebTraff? An extended and improved version of ProWGen (Proxy Workload Generator), with a GUI front end to a set of tools for Web traffic modeling and analysis
• Purpose: to facilitate the generation and analysis of controllable and representative workloads for Web caching simulations
Talk Overview
• WebTraff General Information
  • System Requirements, Data Formats, Assumptions, Inputs, Outputs, Usage
• Simple Demo
  • Using WebTraff to generate and analyze a workload, plus Web proxy cache simulation
• Questions and Discussion
System Requirements
• Software Requirements
  • Unix-based environment running X Windows
  • cc, gcc, g++, Tcl 8.0 or newer, Tk 8.0 or newer, wish, Perl 5.0 or newer, gnuplot, gs
• Hardware Requirements
  • 64 MB or more RAM
  • 100 MB hard disk space (for storing long workload traces)
• Future Work: Port to Windows (volunteers?)
Example of the Web Workload Trace Format Used in WebTraff
Overview of WebTraff
• The WebTraff toolkit provides three main functions:
  • Web workload trace generation
  • Web workload trace analysis
  • Web proxy cache simulation
• Graphs are displayed in PostScript format
WebTraff GUI Interface
Web Workload Generation
Web Workload Generation
• This portion of the tool provides a GUI to ProWGen [Busari/Williamson 2001]
• ProWGen models four key characteristics of Web proxy workloads:
  • Zipf-like document popularity distribution
  • High degree of “one-time” referencing
  • Heavy-tailed file and transfer size distributions
  • Temporal locality property in references
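As a rough illustration of the first two characteristics, the following Python sketch draws references from a Zipf-like popularity profile and counts how many objects end up referenced exactly once. It is not ProWGen's algorithm: the function names, the parameter values, and the use of independent sampling are assumptions made for the example, and ProWGen controls the one-timer percentage explicitly rather than leaving it to chance.

```python
import random
from collections import Counter

def zipf_weights(num_objects, slope):
    """Relative popularity of the object at rank r is proportional to 1 / r**slope."""
    return [1.0 / (rank ** slope) for rank in range(1, num_objects + 1)]

def generate_references(num_refs, num_objects, slope, seed=0):
    """Draw num_refs object ids (0 = most popular) independently of one another."""
    rng = random.Random(seed)
    weights = zipf_weights(num_objects, slope)
    return rng.choices(range(num_objects), weights=weights, k=num_refs)

# Hypothetical parameter values, chosen only for illustration.
refs = generate_references(num_refs=10_000, num_objects=500, slope=0.8)
counts = Counter(refs)
one_timers = sum(1 for c in counts.values() if c == 1)
print("distinct objects referenced:", len(counts))
print("objects referenced exactly once:", one_timers)
```

A steeper slope concentrates references on the top-ranked objects, which is the kind of effect the Zipf slope slider is meant to expose.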
Web Workload Generation (cont’d)
• Name of trace file being generated
• Sliding widgets for:
  • Number of references (lines) in a workload file
  • Number of distinct Web objects in workload
  • Percentage of objects that are “one-timers”
  • Slope of Zipf-like document popularity profile
  • Slope of Pareto tail for document size distribution
  • Degree of statistical correlation (if any) between size and popularity for Web objects
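To make the "slope of Pareto tail" parameter concrete, here is a minimal Python sketch that samples sizes from a Pareto distribution by inverse-transform sampling. It is a simplification: every size is drawn from a Pareto, whereas the slide only says the tail is Pareto. The names `pareto_size`, `alpha`, and `min_size`, and the values used, are assumptions for the example.

```python
import random

def pareto_size(rng, alpha, min_size):
    """Inverse-transform sample from a Pareto distribution.

    P[X > x] = (min_size / x) ** alpha for x >= min_size, so a smaller
    alpha gives a heavier tail (more very large documents).
    """
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids division by zero
    return min_size / (u ** (1.0 / alpha))

rng = random.Random(42)
# Hypothetical settings: tail slope 1.2, minimum size 1000 bytes.
sizes = [pareto_size(rng, alpha=1.2, min_size=1000.0) for _ in range(10_000)]
print("smallest size:", min(sizes))
print("largest size:", max(sizes))
```

On a log-log complementary distribution plot, samples like these fall roughly on a straight line whose slope is close to `-alpha`.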
Web Workload Generation (cont’d)
• The notion of “temporal locality” refers to temporal correlation in referencing behaviour (e.g., the recent past is a good predictor of the near future)
• Four models for referencing behaviour:
  • Independent Reference Model (IRM)
  • Static LRU Stack Model (SLRU)
  • Dynamic LRU Stack Model (DLRU)
  • New LRU Stack Model (NLRU)
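The general idea behind LRU stack models can be sketched in Python as follows. This is a generic textbook-style stack model, not the SLRU, DLRU, or NLRU variants as ProWGen implements them; the function name and the `depth_weights` parameter are assumptions for the example. Each reference picks a stack position from a fixed distribution, references the document found there, and moves it to the top, so weighting shallow positions heavily produces strong temporal locality.

```python
import random

def lru_stack_references(num_refs, num_objects, depth_weights, seed=0):
    """Generate references by repeatedly choosing a position in an LRU stack.

    depth_weights[d] is the relative probability of referencing the
    document currently at stack position d (0 = most recently used).
    """
    rng = random.Random(seed)
    stack = list(range(num_objects))      # initial stack order is arbitrary
    positions = range(num_objects)
    refs = []
    for _ in range(num_refs):
        depth = rng.choices(positions, weights=depth_weights, k=1)[0]
        doc = stack.pop(depth)            # take the document at that depth
        stack.insert(0, doc)              # and move it to the top of the stack
        refs.append(doc)
    return refs

# Hypothetical weights that favour recently used documents.
weights = [1.0 / (d + 1) for d in range(100)]
refs = lru_stack_references(num_refs=1000, num_objects=100, depth_weights=weights)
print(refs[:20])
```

Under the Independent Reference Model, by contrast, each reference is drawn from the popularity distribution without regard to what was referenced recently.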
Web Workload Generation (cont’d)
• “Popularity Bias” parameter (hack!)
  • This button was added to remedy a problem in an earlier version of ProWGen, which tended to choose one-timers early in the trace and popular documents late in the trace
  • Can now control this in workload generation
  • Can visually check for stationarity of cache hit ratio during simulations
Web Workload Analysis
Web Workload Analysis
• Two main categories of analysis functions:
  • Time series analysis (on the left)
  • Web workload analysis (on the right)
• Radio buttons, slide bars and text boxes are available to control plotting characteristics
Requests per Interval (time series plot)
Bytes per Interval (time series plot)
Popularity Distribution plot
Document Size Distribution (zoomed)
Log-Log Complementary Distribution (LLCD) plot (size)
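For readers unfamiliar with LLCD plots, the points plotted are log10(x) against log10(P[X > x]) for the observed sizes. The following Python sketch computes those points from a list of sizes; it is not WebTraff's own analysis code, and the function name is an assumption for the example.

```python
import math

def llcd_points(sizes):
    """Return (log10(x), log10(P[X > x])) pairs from the empirical distribution."""
    ordered = sorted(sizes)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered):
        # For repeated values, only the last occurrence gives the correct tail count.
        if i + 1 < n and ordered[i + 1] == x:
            continue
        tail = (n - (i + 1)) / n          # fraction of samples strictly larger than x
        if x > 0 and tail > 0:            # the logarithm is undefined at zero
            points.append((math.log10(x), math.log10(tail)))
    return points

for log_x, log_tail in llcd_points([1, 10, 10, 100, 1000]):
    print(round(log_x, 3), round(log_tail, 3))
```

A heavy-tailed size distribution shows up as an approximately straight line in the upper range of such a plot.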
LRU Stack Depth Analysis (time series plot)
LRU Stack Depth Analysis (marginal distribution)
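The quantity behind both plots is the LRU stack depth of each reference: the position at which the referenced document sits in a stack ordered by recency of use. Here is a minimal Python sketch of that computation, assuming a trace given as a plain list of document ids. The function name and the use of `None` for first-time references are choices made for the example, not WebTraff's conventions.

```python
def lru_stack_depths(refs):
    """Return the LRU stack depth of each reference (1 = most recently used).

    A first-time reference has no depth and is reported as None.
    """
    stack = []                            # most recently used document first
    depths = []
    for doc in refs:
        if doc in stack:
            depth = stack.index(doc) + 1
            stack.remove(doc)
        else:
            depth = None
        stack.insert(0, doc)              # the referenced document moves to the top
        depths.append(depth)
    return depths

print(lru_stack_depths(["a", "b", "a", "a", "c", "b"]))
# prints [None, None, 2, 1, None, 3]
```

Plotting the depths in trace order gives the time series view, and a histogram of the same values gives the marginal distribution. The list-based stack costs time proportional to the stack size per reference, which is fine for a sketch but slow for long traces.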
Web Proxy Cache Simulation
Web Proxy Cache Simulation
• Application-level caching simulation parameters:
  • Cache size
  • Cache replacement policy
• Five replacement policies currently available:
  • Random replacement (RAND)
  • First-In-First-Out (FIFO)
  • Least-Recently-Used (LRU) (default setting)
  • Least-Frequently-Used (LFU)
  • Greedy-Dual-Size (GDS)
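A trace-driven simulation of the default LRU policy can be sketched in a few lines of Python. This is not WebTraff's simulator: the trace is assumed to be a list of hypothetical `(doc_id, size_in_bytes)` pairs with a fixed size per document, and the function name is made up for the example. It reports the document hit ratio (DHR) and byte hit ratio (BHR).

```python
from collections import OrderedDict

def simulate_lru(trace, cache_bytes):
    """Replay (doc_id, size) requests through an LRU cache of cache_bytes capacity.

    Returns (document hit ratio, byte hit ratio).
    """
    cache = OrderedDict()                 # doc_id -> size, least recently used first
    used = 0
    hits = hit_bytes = total_bytes = 0
    for doc_id, size in trace:
        total_bytes += size
        if doc_id in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(doc_id)     # mark as most recently used
            continue
        if size > cache_bytes:
            continue                      # too large to cache at all
        while used + size > cache_bytes:
            _, evicted_size = cache.popitem(last=False)   # evict the LRU document
            used -= evicted_size
        cache[doc_id] = size
        used += size
    n = len(trace)
    dhr = hits / n if n else 0.0
    bhr = hit_bytes / total_bytes if total_bytes else 0.0
    return dhr, bhr

trace = [(1, 100), (2, 100), (1, 100), (3, 100), (2, 100)]
print(simulate_lru(trace, cache_bytes=200))   # one hit in five requests
print(simulate_lru(trace, cache_bytes=300))   # two hits in five requests
```

Running the same trace over a range of cache sizes, or swapping the eviction rule, mirrors in miniature what the “Run Sizes” and “Run Policies” options do.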
Document Hit Ratio (DHR) Results from “Run Sizes”
DHR Results from “Run Policies”
Byte Hit Ratio (BHR) Results from “Run Policies”
Assessing Cache “Steady State”
For More Information…
• WebTraff toolkit:
  • http://www.cpsc.ucalgary.ca/~carey/software.htm
• “ProWGen: A Synthetic Workload Generation Tool for the Simulation Evaluation of Web Proxy Caches”
  • Busari/Williamson, Computer Networks, Vol 38, No 6, June 2002
  • http://www.cpsc.ucalgary.ca/~carey/publications.htm
• Contact information:
  • Email: {carey,nayden}@cpsc.ucalgary.ca