A Survey of Web Caching Schemes for the Internet

A Survey of Web Caching Schemes for the Internet. Jia Wang. Agenda. The World Wide Web Problem and solution (caching) Proxy servers Advantages of web caching Disadvantages of web caching Elements of A WWW caching system Desirable properties of WWW caching system

Presentation Transcript


  1. A Survey of Web Caching Schemes for the Internet
     Jia Wang
     Web Caching Schemes

  2. Agenda
     • The World Wide Web
     • Problem and solution (caching)
     • Proxy servers
     • Advantages of web caching
     • Disadvantages of web caching
     • Elements of a WWW caching system
     • Desirable properties of a WWW caching system
     • Problems in designing caching systems for the WWW
     • Caching architecture

  3. The World Wide Web
     • The WWW can be considered a large distributed information system.
     • Exponential growth in size.
     • In May 1999 it included 600 million static web pages.
     • Growing by 15% per month.
     • Very popular.

  4. [Figure: size of distinct static web pages]

  5. The World Wide Web
     • Usage is relatively inexpensive
     • Accessing information is very fast
     • Documents appeal to a wide range of interests
     • But...

  6. The World Wide Web
     • Network congestion
     • Server overloading

  7. Problem
     • Internet backbone capacity increases 60% per year.
     • Bandwidth is not growing fast enough.
     • Without a solution, the WWW will become too congested and lose its entire appeal.
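
A rough way to see the gap is to compound the two growth figures quoted in the slides over one year. This is only a back-of-the-envelope sketch: it assumes that the 15% monthly growth in static pages and the 60% yearly growth in backbone capacity can be compared directly, which the slides do not claim.

```python
# Figures quoted in the slides; comparing them directly is an assumption.
MONTHLY_WEB_GROWTH = 0.15       # static web pages: +15% per month
YEARLY_BACKBONE_GROWTH = 0.60   # backbone capacity: +60% per year


def annual_factor(monthly_rate: float) -> float:
    """Compound a monthly growth rate over 12 months."""
    return (1.0 + monthly_rate) ** 12


web_factor = annual_factor(MONTHLY_WEB_GROWTH)
backbone_factor = 1.0 + YEARLY_BACKBONE_GROWTH

print(f"web size grows about {web_factor:.2f}x per year")
print(f"backbone capacity grows {backbone_factor:.2f}x per year")
```

Under these assumptions the web grows by a factor of roughly 5.35 per year against 1.6 for the backbone.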

  8. Solution
     • Caching: placing popular objects at locations close to the clients.

  9. Proxy servers
     • HTTP servers run by companies for security reasons.
     • A bottleneck on the connection between the clients and the Internet.
     • Shared by all clients inside the firewall.

  11. Proxy servers
     • Clients belonging to the same organization share common interests.
     • They probably access the same set of documents.

  12. Thus
     • Documents previously requested and cached on the proxy server are likely to result in future hits.

  13. Proxy servers
     • Caching the most popular web pages on the proxy server can:
         • Save network bandwidth
         • Lower access latency for the client
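
The effect of a shared proxy cache can be sketched with a toy model. Everything here is hypothetical: `ProxyCache`, `fake_origin`, and the request list are illustrative names, and the cache is unbounded and never expires entries, which a real proxy would not do.

```python
class ProxyCache:
    """Toy shared proxy cache: unbounded, entries never expire (a simplification)."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # hypothetical callable: url -> document
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1                # served locally, no trip to the origin
            return self._store[url]
        self.misses += 1
        document = self._fetch(url)       # go out to the remote server
        self._store[url] = document       # keep a copy for later requests
        return document


def fake_origin(url):
    """Stand-in for a remote web server."""
    return f"<html>content of {url}</html>"


proxy = ProxyCache(fake_origin)
# Requests from several clients of the same organization, sharing interests.
for url in ["/news", "/docs", "/news", "/news", "/docs", "/about"]:
    proxy.get(url)

print(f"hits={proxy.hits} misses={proxy.misses}")
```

In this run only the first request for each distinct URL reaches the origin; the three repeats are served from the proxy.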

  14. Advantages of web caching
     • Reduces bandwidth consumption
         • Decreases network traffic
         • Lessens network congestion
     • Reduces access latency:
         • Frequently used documents are cached nearby
         • Less traffic means a shorter delay for documents that are not cached

  15. Advantages of web caching (cont.)
     • Reduces workload of the remote server
     • Data can be accessed when the remote server is down (enhanced robustness).
     • Allows analysis of organization usage patterns
     • Cooperation between caches increases efficiency.

  16. Disadvantages of web caching
     • Data is not updated automatically, so clients may see stale copies.
     • A cache miss can increase latency (extra proxy processing).
     • Bottleneck effect: the number of clients per proxy must be limited.
     • A single proxy is a single point of failure.
     • Information providers cannot monitor the number of visits per site.

  17. Elements of a WWW caching system
     • Documents can be cached at the clients, the proxies and the servers.

  18. Elements of a WWW caching system

  19. Desirable properties of a WWW caching system
     • Fast access
     • Robustness
     • Transparency
     • Scalability
     • Efficiency
     • Adaptivity
     • Stability
     • Load balance
     • Ability to deal with heterogeneity
     • Simplicity

  20. Fast access
     • Reduce web access latency to a minimum.
     • Especially compared with systems that do not use caching techniques.

  21. Robustness
     • Robustness = availability to the user
     • Eliminate single points of failure
     • In case of failure, fail gracefully
     • Easy to recover from failure

  22. Transparency
     • Transparent to the user
     • The user should only notice:
         • Faster response
         • Higher availability

  23. Scalability
     • Scale well with the increasing size and density of the network.
     • All protocols should be as lightweight as possible.

  24. Efficiency
     • Impose minimal additional burden on the network (in control and data packets)
     • Do not adopt any scheme which leads to under-utilization of the network

  25. Adaptivity
     • Adapt to dynamic changes in user demand and the network environment
     • Achieve optimal performance

  26. Stability
     • Do not introduce instabilities into the network

  27. Load balancing
     • Distribute load evenly through the entire network
     • No bottlenecks or hot spots

  28. Ability to deal with heterogeneity
     • Adapt to a range of network architectures (hardware and software)

  29. Simplicity
     • The mechanism should be simple to deploy
     • Simpler schemes are easier to implement and more likely to be accepted as international standards

  30. What problems do we face in designing caching systems for the WWW?

  31. Problems in designing caching systems for the WWW
     • Caching system architecture
         • How cache proxies are organized: hierarchically, distributed or hybrid.

  32. Problems in designing caching systems for the WWW
     • Proxy placement
         • Where to place a cache proxy in order to optimize performance.

  33. Problems in designing caching systems for the WWW
     • Caching contents
         • What can be cached in the caching system.

  34. Problems in designing caching systems for the WWW
     • Proxy cooperation
         • How proxies cooperate with each other.

  35. Problems in designing caching systems for the WWW
     • Data sharing
         • What kind of data or information can be shared among cooperating proxies.

  36. Problems in designing caching systems for the WWW
     • Cache resolution/routing
         • How a proxy decides where to fetch a page requested by a client.

  37. Problems in designing caching systems for the WWW
     • Prefetching
         • How a proxy decides what and when to prefetch from web servers or other proxies to reduce access latency.

  38. Problems in designing caching systems for the WWW
     • Cache placement/replacement
         • How the proxy decides which pages to store in its cache and which pages to remove from it.
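
The slides do not name a particular replacement policy. As one common illustration, the sketch below uses least-recently-used (LRU) eviction; the `LRUCache` class and its method names are hypothetical, and LRU is swapped in here only as an example of a replacement decision.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity page cache that evicts the least recently used page."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._pages = OrderedDict()        # oldest entry first, newest last

    def get(self, url):
        """Return the cached page or None, marking the page as recently used."""
        if url not in self._pages:
            return None
        self._pages.move_to_end(url)
        return self._pages[url]

    def put(self, url, page):
        """Store a page (placement) and evict the oldest one if over capacity."""
        self._pages[url] = page
        self._pages.move_to_end(url)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)   # replacement: drop the LRU page


cache = LRUCache(capacity=2)
cache.put("/a", "page A")
cache.put("/b", "page B")
cache.get("/a")                # "/a" is now the most recently used page
cache.put("/c", "page C")      # over capacity: "/b" is evicted
print(cache.get("/b"))
```

Other policies weigh access frequency, object size or fetch cost instead of recency; the structure of the decision stays the same.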

  39. Problems in designing caching systems for the WWW
     • Cache coherency
         • How a proxy maintains data consistency.
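
One simple way to bound staleness is to attach a time-to-live (TTL) to every cached document and refetch once it expires. The slides do not prescribe this mechanism; the sketch below is an assumption chosen for illustration, and `TTLCache` with its injectable `clock` parameter is a hypothetical name.

```python
import time


class TTLCache:
    """Toy cache whose entries expire a fixed time after they are stored."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock        # injectable so the example can be tested
        self._store = {}           # url -> (document, expiry time)

    def put(self, url, document):
        self._store[url] = (document, self._clock() + self.ttl)

    def get(self, url):
        """Return the cached document, or None if it is absent or expired."""
        entry = self._store.get(url)
        if entry is None:
            return None
        document, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[url]   # stale: the caller must refetch from the origin
            return None
        return document
```

A short TTL keeps copies fresher at the cost of more traffic to the origin server; a long TTL does the opposite.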

  40. Problems in designing caching systems for the WWW
     • Control information distribution
         • How the control information (e.g. URLs) is distributed among proxies.

  41. Problems in designing caching systems for the WWW
     • Dynamic data caching
         • How to deal with data that is not cacheable.

  42. Caching architecture
     • Hierarchical
     • Caches are placed at multiple levels of the network.
     [Diagram: cache levels from top to bottom: national, regional, institutional, bottom]

  43. Hierarchical architecture
     • Bottom: client/browser caches.
     [Diagram: when a web page is not found at one level, the request moves up: bottom, institutional, regional, national]

  44. Hierarchical architecture
     • After the web page is found:
     [Diagram: the page is forwarded down, leaving a copy at each level: national, regional, institutional, bottom]
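
The hierarchical lookup described in the last two slides can be sketched as a chain of caches. This is a minimal model under stated assumptions: `CacheLevel`, `origin`, and the level names are hypothetical, caches are unbounded, and the top level is the only one that contacts the origin server.

```python
class CacheLevel:
    """One level in a cache hierarchy; a miss is passed up to the parent."""

    def __init__(self, name, parent=None, origin=None):
        self.name = name
        self.parent = parent      # next level up, or None for the top level
        self.origin = origin      # hypothetical callable url -> page (top level only)
        self.store = {}

    def get(self, url):
        """Return (page, name of the place that supplied it)."""
        if url in self.store:
            return self.store[url], self.name
        if self.parent is not None:
            page, source = self.parent.get(url)    # web page not found: go up
        elif self.origin is not None:
            page, source = self.origin(url), "origin server"
        else:
            raise LookupError(f"no parent or origin available for {url}")
        self.store[url] = page                     # forward page, leave a copy
        return page, source


def origin(url):
    """Stand-in for the original web server."""
    return "page:" + url


national = CacheLevel("national", origin=origin)
regional = CacheLevel("regional", parent=national)
campus_a = CacheLevel("institutional-a", parent=regional)
campus_b = CacheLevel("institutional-b", parent=regional)

print(campus_a.get("/x"))   # misses at every level, fetched from the origin
print(campus_b.get("/x"))   # found at the regional cache on the way up
```

The second request shows both the benefit (the regional copy serves a different institution) and one cost noted in the disadvantages: the same page now sits at several levels.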

  45. Hierarchical architecture
     • Advantages:
         • Bandwidth efficient, especially when some cache servers have slow network connections.
         • Popular web pages are efficiently diffused towards the demand.

  46. Hierarchical architecture
     • Disadvantages:
         • Cache servers need to be placed at key access points of the network, which requires coordination among caches.
         • Each level adds a delay.
         • High levels are bottlenecks.
         • Multiple copies of a document are stored at different cache levels.

  47. Distributed architecture
     • Caches at the bottom level only.
     • No other intermediate caching levels.
     • Each cache server contains meta-data on the data stored on other servers.
     • Hierarchy used only for distributing information about the location of the copy.
     • No copying of actual documents.

  48. Distributed architecture
     • Advantages:
         • Traffic flows through low network levels, which are less congested.
         • No additional disk space required for intermediate network levels.
         • Better load sharing.
         • More fault tolerant.

  49. Distributed architecture
     • Disadvantages:
         • High connection times
         • Higher bandwidth usage
         • Administrative issues

  50. Distributed architecture
     • Examples
         • ICP: Internet Cache Protocol (Harvest group)
             • Retrieves data from neighboring caches and parent caches.
         • CARP: Cache Array Routing Protocol
             • The URL space is divided among an array of caches.
             • Each cache stores only the documents whose URLs hash to it.
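
The idea of dividing the URL space by hashing can be sketched as follows. This is not the actual CARP hash function: it is a generic highest-score-wins scheme built on SHA-256, and `score`, `pick_cache`, and the cache names are hypothetical.

```python
import hashlib


def score(url, cache_name):
    """Combine the URL and a cache name into a deterministic integer score."""
    digest = hashlib.sha256(f"{cache_name}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_cache(url, caches):
    """Route a URL to the cache with the highest combined score."""
    if not caches:
        raise ValueError("at least one cache is required")
    return max(caches, key=lambda name: score(url, name))


caches = ["cache-a", "cache-b", "cache-c"]
urls = [f"http://example.com/page{i}" for i in range(50)]
owners = {url: pick_cache(url, caches) for url in urls}

for name in caches:
    count = sum(1 for owner in owners.values() if owner == name)
    print(name, "is responsible for", count, "URLs")
```

Because every client computes the same owner for a URL, no document is stored twice in the array. With this scoring scheme, removing one cache only remaps the URLs that cache owned.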
