
A Survey of Web Cache Replacement Strategies


Presentation Transcript


  1. A Survey of Web Cache Replacement Strategies Stefan Podlipnig, Laszlo Boszormenyi University of Klagenfurt ACM Computing Surveys, December 2003 Presenter: Junghwan Song 2012.04.25

  2. Outline • Introduction • Classification • Recency-based • Frequency-based • Recency/frequency-based • Function-based • Randomized • Discussions • Importance nowadays • Future research topics • Conclusions

  3. Why was caching born? • The Web has been growing • Load on the Internet and on web servers increases → Caching has been introduced

  4. Caching effects • Reducing network bandwidth usage • Reducing user-perceived delays • Reducing load on the origin server • Increasing robustness of web services • Providing a chance to analyze an organization’s usage pattern

  5. When the cache becomes full… • To insert new objects, old objects must be removed • Which objects do we select? → Cache replacement strategy

  6. General cache operation • Cache miss → Cache stores the new object • Cache hit → Cache serves the requested object • Cache full → Cache evicts old objects
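The three cases on this slide can be sketched in a few lines. This is a minimal illustration only; the class name and FIFO placeholder eviction are assumptions, not part of the survey, and the actual eviction policy is exactly what the following slides vary:

```python
from collections import OrderedDict

class SimpleCache:
    """Minimal sketch of the general cache operation:
    hit -> serve, miss -> store, full -> evict.
    The eviction policy here is a FIFO placeholder."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def request(self, key, value=None):
        if key in self.store:                  # cache hit: serve the object
            return self.store[key]
        if len(self.store) >= self.capacity:   # cache full: evict an old object
            self.store.popitem(last=False)     # FIFO placeholder eviction
        self.store[key] = value                # cache miss: store the new object
        return value
```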

  7. Outline • Introduction • Classification • Recency-based • Frequency-based • Recency/frequency-based • Function-based • Randomized • Discussions • Importance nowadays • Future research topics • Conclusions

  8. Classification factors • Important factors for classification • Recency • Time since the last reference • Frequency • The number of requests • Size • Modification • Time since the last modification • Expiration time • Time at which an object becomes stale

  9. Classification • Recency-based strategy • Frequency-based strategy • Recency/frequency-based strategy • Function-based strategy • Randomized strategy

  10. Recency-based strategy • Recency is the main factor • Based on temporal locality • Temporal locality: bursts of accesses within a short time period • The well-known LRU and its extensions belong here

  11. Recency-based schemes • LRU • Remove the least recently used object • LRU-Threshold • Don’t cache a new object whose size exceeds the threshold • SIZE • Remove the biggest object • LRU is used as a tie breaker
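LRU and LRU-Threshold can be sketched together. This is an assumed, simplified structure (capacity counted in objects, not bytes); the class and parameter names are illustrative, not from the paper:

```python
from collections import OrderedDict

class LRUCache:
    """Sketch of LRU eviction with an optional LRU-Threshold admission
    rule: objects larger than `threshold` are never cached."""
    def __init__(self, capacity, threshold=None):
        self.capacity = capacity      # max number of objects (simplified)
        self.threshold = threshold    # LRU-Threshold: max admissible size
        self.objects = OrderedDict()  # key -> size, least recently used first

    def access(self, key, size=1):
        if key in self.objects:
            self.objects.move_to_end(key)     # hit: mark most recently used
            return True
        if self.threshold is not None and size > self.threshold:
            return False                      # LRU-Threshold: don't cache
        if len(self.objects) >= self.capacity:
            self.objects.popitem(last=False)  # evict the LRU object
        self.objects[key] = size
        return False                          # miss
```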

  12. Recency-based schemes • PSS (Pyramidal Selection Scheme) • Classify objects depending upon their size • Class i holds objects of size in the range 2^(i-1) ~ 2^i − 1 • Each class has a separate LRU list • Whenever there is a replacement • Choose the object with the largest S_i · ΔT_i among the least recently used objects of each class • S_i: size of the object • ΔT_i: the number of accesses since the last request to object i
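A sketch of the PSS selection step, under the assumptions stated in code comments (a global access counter approximates "accesses since the last request"; bookkeeping is simplified and not the authors' implementation):

```python
class PSSCache:
    """Sketch of PSS: objects are grouped into classes by size
    (class i holds sizes in [2**(i-1), 2**i - 1]); each class keeps its
    own LRU list, and on replacement the victim is the object maximizing
    size * (accesses since its last request) among the LRU heads."""
    def __init__(self):
        self.classes = {}   # class index -> list of (key, size, last_access)
        self.clock = 0      # global access counter

    def insert(self, key, size):
        self.clock += 1
        i = max(1, size.bit_length())   # class i: 2**(i-1) <= size < 2**i
        self.classes.setdefault(i, []).append((key, size, self.clock))

    def evict(self):
        self.clock += 1
        best = None
        # Compare only the least recently used head of each class.
        for i, lru_list in self.classes.items():
            key, size, last = lru_list[0]
            cost = size * (self.clock - last)   # S_i * delta-T_i
            if best is None or cost > best[0]:
                best = (cost, i)
        _, i = best
        key, _, _ = self.classes[i].pop(0)
        if not self.classes[i]:
            del self.classes[i]
        return key
```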

  13. Characteristics • Pros • Suited when web request streams exhibit temporal locality • Simple to implement and fast • Cons • In general, size is not combined well with recency • PSS is an exception

  14. Frequency-based strategy • Use frequency as the main factor • Based on the popularity of web objects • Frequency represents popularity • The well-known LFU and its extensions belong here

  15. Two forms of LFU • Perfect LFU • Count all requests to an object i • Request counts persist across replacements • Represents all requests from the past • Space overhead • In-cache LFU (assumed in the following) • Count requests to cached objects only • Cannot represent all requests from the past • Less space overhead

  16. Frequency-based schemes • LFU • Remove the least frequently used object • LFU-Aging • If the average of all frequency counters exceeds a certain threshold, all counters are halved • LFU-DA (LFU with Dynamic Aging) • On each request for object i, compute K_i = F_i + L • F_i: frequency count of object i • L: an aging factor, initialized to zero • The object with the smallest K_i is replaced • The K_i-value of this object is assigned to L
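The LFU-DA rule K_i = F_i + L can be sketched compactly. This is an illustrative structure (capacity in objects, simple dict scan instead of a priority queue), not the paper's code:

```python
class LFUDACache:
    """Sketch of LFU with Dynamic Aging: each object carries
    K_i = F_i + L; the smallest K_i is evicted and becomes the new L,
    so newly inserted objects are not immediately evicted again."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.L = 0.0
        self.k = {}   # key -> K_i

    def access(self, key):
        if key in self.k:
            self.k[key] += 1                 # one more request: F_i grows by 1
            return True
        if len(self.k) >= self.capacity:
            victim = min(self.k, key=self.k.get)
            self.L = self.k.pop(victim)      # aging: L takes the victim's K
        self.k[key] = self.L + 1             # new object: K_i = F_i(=1) + L
        return False
```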

  17. Characteristics • Pros • Valuable in static environments • Popularity does not change over a time period • Cons • Complex to implement • Cache pollution • Old, once-popular objects are never removed • Overcome with aging

  18. Recency/frequency-based strategy • Use recency and frequency (and possibly further factors) • LRU* • If the least recently used object’s counter is zero, replace it • Otherwise, decrease its counter and move it to the beginning of the list (most recently used position)
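A sketch of LRU*, assuming (as the slide implies) that the counter is incremented on each hit and starts at zero on insertion; the class name and structure are illustrative:

```python
from collections import OrderedDict

class LRUStarCache:
    """Sketch of LRU*: an LRU list plus a per-object hit counter.
    Eviction inspects the LRU end: a zero counter means evict;
    otherwise decrement the counter, move the object to the MRU
    end, and try again."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.counters = OrderedDict()   # key -> counter, LRU first

    def access(self, key):
        if key in self.counters:
            self.counters[key] += 1            # hit: bump counter
            self.counters.move_to_end(key)     # and make most recently used
            return True
        if len(self.counters) >= self.capacity:
            self._evict()
        self.counters[key] = 0                 # miss: insert with counter 0
        return False

    def _evict(self):
        while True:
            key, count = next(iter(self.counters.items()))  # LRU end
            if count == 0:
                del self.counters[key]         # counter zero: replace it
                return
            self.counters[key] = count - 1     # otherwise decrement and
            self.counters.move_to_end(key)     # move to the MRU position
```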

  19. Characteristics • Pros • Can take advantage of both recency and frequency • Cons • Additional complexity is added • Simple schemes (e.g., LRU*) neglect size

  20. Function-based strategy • Use a potentially general cost function • GD-Size • H_i = L + c_i / s_i, where L is a running aging factor, c_i is the cost to fetch object i and s_i its size • The smallest-value object is selected • HYBRID • Value = (c_s + W_b / b_s) · n_i^W_n / s_i • c_s: time to contact the server, b_s: bandwidth to the server, n_i: number of references, W_b & W_n: weighting parameters
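The GD-Size value H_i = L + c_i/s_i can be sketched as follows; this is a simplified illustration (capacity in objects, linear minimum search instead of a priority queue), not the original implementation:

```python
class GDSizeCache:
    """Sketch of GreedyDual-Size: each cached object i has value
    H_i = L + c_i / s_i. The object with the smallest H is evicted
    and L is set to that H, so remaining objects effectively age
    relative to newly inserted ones."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.L = 0.0
        self.h = {}   # key -> H value

    def access(self, key, cost=1.0, size=1.0):
        if key in self.h:
            self.h[key] = self.L + cost / size   # hit: restore full value
            return True
        if len(self.h) >= self.capacity:
            victim = min(self.h, key=self.h.get)
            self.L = self.h.pop(victim)          # aging: L = evicted H
        self.h[key] = self.L + cost / size
        return False
```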

  21. Characteristics • Pros • Can control weighting parameters • Optimization is possible • Consider many factors • Can handle different workload situations • Cons • Choosing appropriate parameters is difficult • Using latency as a factor is dangerous • Latency varies over time

  22. Randomized strategy • Use randomized decisions • RAND • Remove a random object • HARMONIC • Evict with probability inversely proportional to the cost c_i/s_i (c_i: cost to fetch, s_i: size of the object)
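The HARMONIC-style weighted random choice can be sketched as below. The function name and the `objects` mapping are assumptions for illustration; the weights implement "probability inversely proportional to c_i/s_i":

```python
import random

def harmonic_evict(objects, rng=random):
    """Sketch of HARMONIC-style eviction: pick a victim with
    probability inversely proportional to its per-size cost c_i/s_i.
    `objects` maps key -> (cost, size)."""
    keys = list(objects)
    # weight_i proportional to 1 / (c_i / s_i) = s_i / c_i
    weights = [objects[k][1] / objects[k][0] for k in keys]
    r = rng.uniform(0, sum(weights))
    acc = 0.0
    for k, w in zip(keys, weights):
        acc += w
        if r <= acc:
            return k
    return keys[-1]
```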

  23. Characteristics • Pros • Simple to implement • Cons • Hard to evaluate • Repeated simulations on the same Web server trace yield slightly different results

  24. Outline • Introduction • Classification • Recency-based • Frequency-based • Recency/frequency-based • Function-based • Randomized • Discussions • Importance nowadays • Future research topics • Conclusions

  25. Importance nowadays • Questions on the importance of cache replacement strategies • Large caches • Reduction of cacheable traffic • Good-enough algorithms • Alternative models

  26. Large cache • The capacity of caches grows steadily • Replacement strategies are not seen as a limiting factor • Working set of the clients << cached objects [1] • Basic LRU is sufficient • But the set of cacheable objects will grow in the future • Multimedia files [1] Web Caching and Replication, Rabinovich and Spatscheck [2002]

  27. Reduction of cacheable traffic • Non-cacheable data accounts for a significant percentage of the total data • Around 40% of all requests • Overcome with active caches and server accelerators • Active cache: lets the proxy cache applets • Server accelerator: provides an API that controls cached data explicitly

  28. Good-enough algorithms • There are already many algorithms that are considered good enough • They give good results in different evaluations • PSS, etc. • Some function-based strategies with weighting parameters can be optimized further

  29. Alternative models • Static caching • The content of the cache is updated periodically • The popularity of objects is determined in the prior period • Give cached objects a TTL • Simple to implement • A large TTL causes large cache storage usage
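The periodic update step of static caching can be sketched in a few lines. The function name is an assumption; the idea is exactly the slide's: rank objects by request count in the previous period and pin the most popular ones (here, capacity is counted in objects for simplicity):

```python
from collections import Counter

def select_static_cache(request_log, capacity):
    """Sketch of the static-caching model: fill the cache with the
    objects most requested in the previous period; the content then
    stays fixed until the next periodic update."""
    counts = Counter(request_log)
    return {key for key, _ in counts.most_common(capacity)}
```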

  30. Future research topics • Adaptive replacement • Coordinated replacement • Replacement + coherence • Multimedia cache replacement • Differentiated cache replacement

  31. Adaptive replacement • Change replacement strategies (or parameters in function-based ones) depending on the actual workload • Workload with strong temporal locality → LRU • Workload with no request fluctuations → LFU • Problems • Needs smooth changes of strategies • Wrong changes make performance worse

  32. Coordinated replacement • Make decisions considering other caches’ status • Cooperative caching • There are some papers on cooperative caching in ICN • WAVE (2012) [2] • Age-based cooperative caching (2012) [3] [2] WAVE: Popularity-based and Collaborative In-network Caching for Content-Oriented Networks, K. Cho et al., 2012 [3] Age-based Cooperative Caching in Information-Centric Networks, Z. Ming et al., 2012

  33. Multimedia cache replacement • Multimedia caching research will be dominated by video • Videos are the biggest objects • How to cache these big files • Chunks, partial caching, quality adjustment, etc.

  34. Differentiated cache replacement • Support QoS in caching • Ex) Classify cached objects into different classes • Two kinds of differentiation • Using information given by servers • Handled by the proxy only • Both add some overhead • How to simplify?

  35. Conclusions • Gives an exhaustive survey of various cache replacement strategies • Shows that there are future research areas for cache replacement strategies

  36. APPENDIX

  37. Large cache • Rate a cache can handle: 1000 req/sec • Average object size: 10 KB • Resulting request rate: 82 Mbps • 60% cacheable, 40% hit rate → 16.4 Mbps (2.05 MB/s) • Disk capacity 200 GB: 21 million objects • Maximum stack distance of the working set: 15 million
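The slide's numbers can be reproduced with a few lines of arithmetic, under two assumptions that are mine, not stated on the slide: KB/GB are binary units (1 KB = 1024 B), and the 40% hit rate is a fraction of all requests, so cacheable misses make up 60% − 40% = 20% of the traffic:

```python
# Worked arithmetic for the large-cache example above (assumptions:
# binary units, and cacheable-miss traffic = 60% - 40% = 20%).
req_rate = 1000                 # requests per second
obj_size = 10 * 1024            # average object size in bytes

traffic_mbps = req_rate * obj_size * 8 / 1e6
print(round(traffic_mbps, 1))   # ~82 Mbps of total request traffic

miss_write_mbps = traffic_mbps * (0.6 - 0.4)
print(round(miss_write_mbps, 1))        # 16.4 Mbps of cacheable misses
print(round(miss_write_mbps / 8, 2))    # i.e. 2.05 MB/s written to disk

disk = 200 * 1024**3            # 200 GB of disk
print(round(disk / obj_size / 1e6, 1))  # ~21 million cacheable objects
```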
