Storage Allocation in Prefetching Techniques of Web Caches

Storage Allocation in Prefetching Techniques of Web Caches. D. Zeng, F. Wang, S. Ram. Appeared in the Proceedings of the ACM Conference on Electronic Commerce (EC’03), San Diego, June 9-12, 2003. Presented by Laura D. Goadrich.

Presentation Transcript


  1. Storage Allocation in Prefetching Techniques of Web Caches • D. Zeng, F. Wang, S. Ram • Appeared in the Proceedings of the ACM Conference on Electronic Commerce (EC’03), San Diego, June 9-12, 2003 • Presented by Laura D. Goadrich

  2. The Web • Large-scale distributed information system where data objects are published and accessible by users • Problems caused by the growing demand for web capacity: • Network traffic congestion • Web server overloads • Solution: web caching

  3. Web caching: • Benefits: • Improves web performance (reduces access latency) • Increases web capacity • Alleviates traffic congestion (reduces network bandwidth consumption) • Reduces the number of client requests reaching servers (workload) • May improve the failure tolerance and robustness of the Web (cached copies remain available when networks are unreachable) • Prefetching: • Anticipates users’ future needs • This research: • Focuses on cache-related storage capacity decisions (storage capacity limits the number of Web objects that can be prefetched) • Therefore addresses how to allocate cache storage in prefetching • The authors state that this question has not been studied before

  4. Ideas: • Current research: • Predicts user web accesses without considering the cache storage limit • This research: optimization-based models • Maximize hit rate • Maximize byte hit rate • Minimize access latency (maximizing the first two is the primary goal of web caching) • Benefit of this research: guides the operation of a prefetching system
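The two "maximize" objectives on this slide can be made concrete with a small calculation. The sketch below uses the standard definitions (fraction of requests served from cache, and fraction of requested bytes served from cache); the function name `hit_rates` and the sample trace are hypothetical and do not come from the paper.

```python
def hit_rates(requests, cached):
    """Return (hit rate, byte hit rate) for a request trace.

    requests: list of (url, size_in_bytes) tuples, one per request.
    cached:   set of URLs currently held in the cache.
    """
    hits = sum(1 for url, _ in requests if url in cached)
    hit_bytes = sum(size for url, size in requests if url in cached)
    total_bytes = sum(size for _, size in requests)
    return hits / len(requests), hit_bytes / total_bytes


# Illustrative trace: "/a" is requested twice and is the only cached object.
requests = [("/a", 100), ("/b", 300), ("/a", 100), ("/c", 500)]
hr, bhr = hit_rates(requests, {"/a"})
```

Here half of the requests hit the cache, but only a fifth of the bytes do, which is why the two metrics can favor different prefetching choices.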

  5. Web prefetching techniques • Client-initiated policies • Example: user A is likely to access URL U2 right after URL U1 • Patterns learned via Markov algorithms • Server-initiated policies • Anticipate future requests based on server logs and proactively send the corresponding Web objects to participating cache servers or client browsers • Top-n algorithm • Hybrid policies • Combine user access patterns from clients with general statistics from servers to improve the quality of prediction • Shortcoming of these policies: none addresses how to decide which Web objects to prefetch under limited storage capacity
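The "U2 right after U1" pattern on this slide can be illustrated with a first-order Markov model that counts URL-to-URL transitions. This is a minimal sketch, not the algorithm of any specific prefetching system; `train_markov`, `predict_next`, and the session data are hypothetical.

```python
from collections import Counter, defaultdict


def train_markov(sessions):
    """Count how often each URL is immediately followed by another URL."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, url):
    """Return estimated probabilities of the next URL given the current one."""
    followers = counts.get(url)
    if not followers:
        return {}  # URL never seen as a predecessor
    total = sum(followers.values())
    return {nxt: c / total for nxt, c in followers.items()}


# Illustrative browsing sessions for one client.
sessions = [["U1", "U2", "U3"], ["U1", "U2"], ["U1", "U4"]]
model = train_markov(sessions)
probs = predict_next(model, "U1")
```

In this toy data, U2 follows U1 in two of three sessions, so it would be the first prefetch candidate after a request for U1. Note that the predictor says nothing about object sizes or available storage, which is the gap the slide points out.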

  6. Assumptions/Notation

  7. Hit Rate (HR) Model • Objective (1) • Constraints (2) and (3)

  8. Byte Hit Rate (BHR) Model • Objective (4) • Constraints (2) and (3)

  9. Access Latency (AL) Model • Objective (7) • Constraints (2) and (3)
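The equations on slides 7 to 9 did not survive in this transcript. Because slide 10 maps all three models onto the knapsack problem, one plausible shape for them is sketched below. Every symbol here is an assumption rather than the paper's notation: $N$ candidate Web objects, a binary decision $x_i$ (1 if object $i$ is prefetched), an access probability $P_i$ (a symbol slide 12 mentions), an object size $S_i$, a storage capacity $C$ reserved for prefetching, and a hypothetical per-object retrieval latency $L_i$.

```latex
\begin{align*}
\text{HR:}\quad  & \max_{x} \sum_{i=1}^{N} P_i \, x_i \\
\text{BHR:}\quad & \max_{x} \sum_{i=1}^{N} P_i \, S_i \, x_i \\
\text{AL:}\quad  & \min_{x} \sum_{i=1}^{N} P_i \, L_i \, (1 - x_i) \\
\text{subject to}\quad & \sum_{i=1}^{N} S_i \, x_i \le C, \qquad x_i \in \{0, 1\}, \quad i = 1, \dots, N
\end{align*}
```

Under these assumptions, minimizing the AL objective is the same as maximizing $\sum_i P_i L_i x_i$, so each model is a 0/1 knapsack with weight $S_i$ and a model-specific value ($P_i$, $P_i S_i$, or $P_i L_i$).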

  10. Transforming HR, BHR & AL into the knapsack problem • Benefits of the knapsack formulation • Well studied • The “easiest” NP-hard problem • Can be solved optimally by a pseudo-polynomial algorithm based on dynamic programming • A fully polynomial-time approximation scheme exists • The paper focuses on a greedy algorithm (due to paper length limits)
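The pseudo-polynomial dynamic program mentioned on this slide is the textbook 0/1 knapsack algorithm, sketched below. It is not taken from the paper: the function name `knapsack_dp`, the integer size units, and the sample data are assumptions for illustration.

```python
def knapsack_dp(values, sizes, capacity):
    """Solve 0/1 knapsack exactly in O(n * capacity) time.

    values:   benefit of prefetching each object (e.g. access probability).
    sizes:    positive integer size of each object, in storage units.
    capacity: integer storage budget, in the same units.
    Returns (best total value, sorted indices of chosen objects).
    """
    n = len(values)
    best = [0.0] * (capacity + 1)  # best[c] = best value using at most c units
    keep = [[False] * (capacity + 1) for _ in range(n)]
    for i in range(n):
        # Iterate capacities downward so each object is used at most once.
        for c in range(capacity, sizes[i] - 1, -1):
            candidate = best[c - sizes[i]] + values[i]
            if candidate > best[c]:
                best[c] = candidate
                keep[i][c] = True
    # Walk backwards through the decisions to recover the chosen set.
    chosen, c = [], capacity
    for i in range(n - 1, -1, -1):
        if keep[i][c]:
            chosen.append(i)
            c -= sizes[i]
    return best[capacity], sorted(chosen)


# Three candidate objects, 5 units of prefetch storage.
total, chosen = knapsack_dp([0.6, 0.5, 0.4], [5, 3, 2], 5)
```

The running time grows with the numeric value of `capacity`, not just the number of objects, which is what "pseudo-polynomial" refers to. In the example the two smaller objects together beat the single most popular one.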

  11. Greedy Algorithm: • Sort all URLs into a sequence • Determine a threshold k • Prefetch the Web objects referred to by the first k URLs in the sequence
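The sorting criterion and the formula for the threshold k are missing from this transcript, so the sketch below fills them in with the usual greedy knapsack choices, stated here as assumptions: rank by value per unit of storage, and let k be the longest prefix of the ranking that fits in the capacity. The names `greedy_prefetch`, `hr_key`, `bhr_key`, and the object fields `p` and `size` are hypothetical.

```python
def greedy_prefetch(objects, capacity, key):
    """Sort objects by `key` (descending) and prefetch the longest prefix that fits."""
    ranked = sorted(objects, key=key, reverse=True)
    used, k = 0, 0
    for obj in ranked:
        if used + obj["size"] > capacity:
            break  # threshold k reached: the next object does not fit
        used += obj["size"]
        k += 1
    return [obj["url"] for obj in ranked[:k]]


# Assumed ranking keys (value per unit of storage for each objective).
def hr_key(obj):
    return obj["p"] / obj["size"]  # hit rate: probability per byte


def bhr_key(obj):
    return obj["p"]  # byte hit rate: (p * size) / size reduces to p


objects = [
    {"url": "/a", "p": 0.5, "size": 40},
    {"url": "/b", "p": 0.3, "size": 10},
    {"url": "/c", "p": 0.2, "size": 5},
]
hr_choice = greedy_prefetch(objects, 50, hr_key)
bhr_choice = greedy_prefetch(objects, 50, bhr_key)
```

With 50 units of storage, the hit-rate ranking prefers the two small objects, while the byte-hit-rate ranking starts with the large, popular one. Stopping at the first object that does not fit keeps the result a strict prefix, matching the slide's notion of a single threshold.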

  12. Other Allocation Policies Tested • Optimal policy using CPLEX • Disadvantages • Complex • Increased implementation time • Difficult to implement • Top-n • Developed for Web usage prediction • Can regulate storage allocation by setting n appropriately • Equivalent to Greedy BHR, which relies only on P_i
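To illustrate how Top-n can regulate storage through the choice of n, the sketch below ranks URLs by request count in a server log and then picks the largest n whose objects fit in the available storage. This is an assumed reading of the slide, not the published Top-n algorithm; `top_n`, `largest_n_that_fits`, and the sample log are hypothetical.

```python
from collections import Counter


def top_n(log, n):
    """Return the n most frequently requested URLs in a server log."""
    return [url for url, _ in Counter(log).most_common(n)]


def largest_n_that_fits(log, sizes, capacity):
    """Return the largest n such that the top-n objects fit in `capacity`."""
    ranked = top_n(log, len(set(log)))
    used = 0
    for n, url in enumerate(ranked):
        used += sizes[url]
        if used > capacity:
            return n  # objects 0..n-1 fit, object n does not
    return len(ranked)


# Illustrative log: "/a" requested 3 times, "/b" twice, "/c" once.
log = ["/a", "/b", "/a", "/c", "/a", "/b"]
sizes = {"/a": 40, "/b": 10, "/c": 5}
n = largest_n_that_fits(log, sizes, 50)
prefetch = top_n(log, n)
```

Because the ranking uses popularity alone and ignores size, the selection coincides with a greedy policy that sorts by access probability, which is consistent with the equivalence stated on the slide.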

  13. Simulations • LN(μ, σ) = lognormal distribution with median e^μ and shape σ
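The simulation parameters are not preserved in this transcript, but the lognormal notation can be illustrated with Python's standard library. The values of `mu`, `sigma`, and the sample count below are arbitrary assumptions, not the paper's settings.

```python
import math
import random
import statistics

# Hypothetical parameters: exp(mu) is the median of the distribution.
mu, sigma = 9.0, 1.0

rng = random.Random(42)  # fixed seed so the run is repeatable
sizes = [rng.lognormvariate(mu, sigma) for _ in range(20001)]

sample_median = statistics.median(sizes)
sample_mean = statistics.fmean(sizes)

# Theoretical values for comparison.
expected_median = math.exp(mu)
expected_mean = math.exp(mu + sigma ** 2 / 2)
```

For a lognormal distribution, e^μ is the median, while the mean is e^(μ + σ²/2) and is pulled upward by the heavy right tail. A large σ therefore produces objects that vary greatly in size, the regime in which the results report the widest gap between the optimal and heuristic policies.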

  14. Performance Comparison

  15. Results • Greedy algorithms and Top-n in general achieve reasonable performance • Greedy algorithms outperform Top-n with respect to hit rate and access latency • There exists a relatively large performance gap between an optimal approach and fast heuristic methods when Web objects vary greatly in size • Suggests the need for developing more sophisticated allocation policies such as a dynamic programming-based approach

  16. Contributions: • Focus: stresses the importance of effective storage allocation in prefetching • Paper contributions: • Provides new formulations for prefetching storage allocation • Derives computationally efficient allocation policies by casting storage allocation as a knapsack problem • The models lead to a more precise understanding of the applicability and effectiveness of the Top-n policy

  17. Future Work • Trace-based simulation • Actual web access logs • More realistic environment • Modeling • Integrate allocation models with cache storage management models, such as cache replacement

  18. Changes/Recommendations • Do not rename identical constraints across models • Cite more sources (currently 5 articles, 2 books) • Discuss feasible solve times for the optimal policy • Test or hypothesize implementation strategies for real applications
