
Cooperative Caching and Kill-Bots


Presentation Transcript


  1. Cooperative Caching and Kill-Bots Presented by: Michael Earnhart

  2. On the Scale and Performance of Cooperative Web Proxy Caching By: Alec Wolman, Geoffrey Voelker, Nitin Sharma, Neal Cardwell, Anna Karlin, Henry Levy

  3. Intuitive Benefits of Cooperative Caching • Larger population • Better coverage of web objects • Higher request rate • Less bandwidth consumed reaching third-party websites • More responsible use of the Internet • Distributing the web load across several proxies, as in the BitTorrent model

  4. Terms • Cacheable • Objects that can be cached with current proxy technology • Ideal Caching • Caching all shared objects, as if everything were cacheable • Popular • The most frequently visited objects, which account for 40% of all requests

  5. Network Traces • Physical connection in the network • UW - connected to the outgoing switches • MS - ? • Clients • UW - 22,984 • MS - 60,233 • Destination servers • UW - 244,211 • MS - 360,586 • Duration • 168 ± 18 hours

  6. Simple Cooperative Caching Algorithm (decision flow) • Is the request cached locally and current? Yes - return the object • Otherwise, is it cached at a cooperating proxy and current? Yes - return it from the co-op • Otherwise, retrieve the object from the source server (see the sketch below)
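  A minimal Python sketch of this decision flow. local_cache, coop_proxies, is_current, and fetch_from_source are hypothetical stand-ins, since the slide specifies only the control flow, not an implementation:

    def handle_request(url, local_cache, coop_proxies, fetch_from_source):
        # 1. Local cache: serve the object if it is cached and still current.
        obj = local_cache.get(url)
        if obj is not None and obj.is_current():
            return obj
        # 2. Cooperating proxies: serve the first current copy found.
        for proxy in coop_proxies:
            obj = proxy.get(url)
            if obj is not None and obj.is_current():
                local_cache.put(url, obj)  # keep a local copy for next time
                return obj
        # 3. Global miss: retrieve from the source server and cache it.
        obj = fetch_from_source(url)
        local_cache.put(url, obj)
        return obj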

  7. Hit Rate • Large benefits for small populations • Similar curve shape regardless of caching scheme • “Knee” at roughly 2,500 clients

  8. Latency • Slope is roughly zero • The mean is significantly higher than the median • Large delays dominate (as with DNS lookups) • Can a co-op proxy help? No

  9. Bandwidth • Caching helps conserve bandwidth • The savings are largely independent of population size

  10. When is Co-op Caching Useful? • Several small organizations cooperating • Ideal: +17% hit rate • Cacheable: +9% hit rate • The 978-client case is clearly losing • No clear winner among the rest

  11. Locality • Randomly assign clients to 15 organizations of equal size • Hit rate is only about 4% lower than with the real organizations

  12. Large Company Co-op Caching • Popularity is universal • UW cacheable hit rate increased 4.2% • MS cacheable hit rate increased 2.1% • A preloaded MS cache was used as a second-level proxy cache for UW

  13. Trace Data Conclusions • Cooperative caching is essentially only useful for populations under ~2,500 clients • Beyond 2,500 clients, cooperation improves the hit rate by only 2.7% (MS) • Specialized grouping for cooperative caching is also ineffective

  14. Analytical Model of Web Accesses • Long-term analysis of web caching • Infinite storage per proxy • In theory this proxy setup could cache 100% of the available Web objects • Optimal caching occurs when: Creation Rate + Change Rate < Request Rate

  15. The Model • N clients that act independently • Total number of objects is n • Zipf-like distribution, where pi denotes the popularity of object i • λ is the average client request rate • Time between changes of an object is exponential with parameter μ

  16. The Model (Cont.) • pc is the probability that an object is cacheable • Average object size is E(S) • Average last byte latency is E(L)
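  A back-of-the-envelope Python sketch of this model, assuming the standard steady-state result for an infinite cache with Poisson requests and exponential change times: an object requested at aggregate rate r and changing at rate mu is found cached and fresh with probability r / (r + mu). All parameter values below are illustrative, not the paper's:

    def zipf_popularity(n, alpha=0.8):
        # p_i proportional to 1 / i^alpha, normalized so the p_i sum to 1.
        weights = [1.0 / (i ** alpha) for i in range(1, n + 1)]
        total = sum(weights)
        return [w / total for w in weights]

    def steady_state_hit_rate(N, lam, mu, n, p_c, alpha=0.8):
        # N clients, per-client request rate lam, change rate mu,
        # n objects, fraction p_c of objects cacheable.
        hit = 0.0
        for p_i in zipf_popularity(n, alpha):
            r = N * lam * p_i          # aggregate request rate for object i
            hit += p_i * r / (r + mu)  # P(request i) * P(i is cached and fresh)
        return p_c * hit               # only cacheable objects can hit

    # Hit rate grows with population but saturates once even unpopular
    # objects are requested faster than they change (cf. slide 18).
    for N in (100, 2500, 250000):
        print(N, round(steady_state_hit_rate(N, lam=0.01, mu=1e-4,
                                             n=100000, p_c=0.6), 3))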

  17. Parameters for the Model

  18. Simulation Results • Initial stage (< 2,500 clients): the request rate is dominated by object changes • Middle region (< 250,000): unpopular documents begin to hit in the cache • Final region (> 250,000): the request rate dominates even fast-changing objects

  19. Latency • Hit rate determines latency • Assume a 10 ms response time for hits • Asymptotically approaches (1 − pc)·E(L)
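  In symbols, a one-line version of this asymptote, assuming hits are served in negligible time and misses cost the mean last-byte latency E(L):

    % mean latency as a function of hit rate h (h is bounded above by p_c)
    E[\text{latency}] \;=\; h \cdot 0 + (1 - h)\,E(L)
      \;\longrightarrow\; (1 - p_c)\,E(L) \quad \text{as } h \to p_c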

  20. Change Rate • Unpopular objects • 60% of requests • 99.7% of all objects • Change rate • Large impact on unpopular objects • A change interval of ≥ 1 day yields nearly perfect caching of popular objects • Notes (250,000 clients): the change interval is determined by HTTP headers; when change dominates the request rate, the hit percentage drops to its minimum

  21. Positive Conclusions • Hit rate depends on • Population size - cooperation can increase the effective population • Rate of change and creation of new objects • Request rate - a larger population increases the aggregate request rate

  22. Other Conclusions Cooperative Caching vs. Simple Caching • Bandwidth: no significant benefits • Object latency: no significant benefits • Specialized grouping: no significant benefits • Large populations (> 250k): no significant benefits

  23. Discussion • Is cooperative caching useful now? • Will it become more or less useful due to future Web traffic trends? • Rate of object change μ • Request rate λ • Number of clients N • Size of the web (in terms of objects) n

  24. Botz-4-Sale: Surviving Organized DDoS Attacks That Mimic Flash Crowds By: Srikanth Kandula, Dina Katabi, Matthias Jacob, Arthur Berger

  25. Botz Problem • Worms can spread to 30,000 new hosts per day • Botnets for hire have become a reality • HTTP servers must withstand highly sophisticated attacks by 10,000 or more coordinated attackers

  26. What is Kill-Bots • A kernel extension to a web server • Provides • Load-activated authentication • IP-address admission control • Load balancing between Kill-Bots authentication and HTTP service

  27. What is CAPTCHA • A graphical puzzle used to distinguish a human user from an automated program.

  28. Stage 1 • The server changes state to SUSPECTED_ATTACK when HTTP load exceeds a threshold K1 • In SUSPECTED_ATTACK, CAPTCHA puzzles are served to all admitted incoming connections • Puzzle serving is done at minimal cost • No dedicated sockets • No worker processes • One test per session, for usability • Cryptographic support to validate the client (sketched below) • Per-cookie fairness • Limit set to 8 HTTP requests per cookie
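  A minimal Python sketch of such a cryptographic cookie. The paper's exact cookie format is not reproduced here; SECRET_KEY, TTL, and the field layout are illustrative. The idea: the server validates returning clients statelessly, and a cookie is useless past its expiry or request budget:

    import hmac, hashlib, time

    SECRET_KEY = b"server-only-secret"  # illustrative; known only to the server
    TTL = 240                           # illustrative lifetime in seconds
    MAX_REQUESTS = 8                    # per-cookie request limit (slide 28)

    def issue_cookie(client_ip):
        # Issued once the client solves its one puzzle; binds IP and time.
        ts = str(int(time.time()))
        mac = hmac.new(SECRET_KEY, f"{client_ip}|{ts}".encode(), hashlib.sha256)
        return f"{ts}|{mac.hexdigest()}"

    def validate_cookie(cookie, client_ip, requests_seen):
        # Stateless check: MAC must verify, cookie must be fresh, budget unspent.
        ts, mac = cookie.split("|", 1)
        expected = hmac.new(SECRET_KEY, f"{client_ip}|{ts}".encode(), hashlib.sha256)
        if not hmac.compare_digest(mac, expected.hexdigest()):
            return False                     # forged, or replayed from another IP
        if time.time() - int(ts) > TTL:
            return False                     # expired: limits the replay window
        return requests_seen < MAX_REQUESTS  # per-cookie fairness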

  29. Stage 2 • Uses a counting Bloom filter to tally IP addresses that fail authentication (sketched after the next slide) • Once ALL of an IP address's counters in the filter reach a threshold, its packets are dropped • Dropping decreases server load, so the server ceases authentication when load ≤ K2 < K1

  30. What is a Bloom Filter • Hash a value to a fixed set of bit positions • Set those bits in the Bloom filter vector (or increment counters, in the counting variant used here) • A collision-resistant hash is required (see the sketch below)
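  A Python sketch of the counting variant, as Stage 2 uses it to tally per-IP authentication failures; the size, hash count, and threshold values are illustrative:

    import hashlib

    class CountingBloomFilter:
        def __init__(self, size=1 << 20, num_hashes=4, threshold=32):
            self.counters = [0] * size  # small counters instead of single bits
            self.size = size
            self.num_hashes = num_hashes
            self.threshold = threshold

        def _indexes(self, ip):
            # Derive num_hashes positions from a collision-resistant hash of the IP.
            digest = hashlib.sha256(ip.encode()).digest()
            for i in range(self.num_hashes):
                yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.size

        def record_failure(self, ip):
            for idx in self._indexes(ip):
                self.counters[idx] += 1

        def should_drop(self, ip):
            # Drop only if ALL of this IP's counters reached the threshold,
            # so one colliding counter cannot blacklist a legitimate IP.
            return all(self.counters[idx] >= self.threshold
                       for idx in self._indexes(ip))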

  31. Admission Control • Attempt authentication with probability α • Clearly admission control is required • Optimal admission control is highly desired • Adaptation is required

  32. Adaptive Admission Control • A balancing act between serving HTTP requests and serving authentication puzzles • Aim for point B, but its location is difficult to identify on the BC segment, where idle time is zero • Settle for point E • Maintain a small fraction of idle time by adapting α (see the sketch below)
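  A Python sketch of the adaptation loop. The controller below is illustrative (a simple multiplicative rule), not the paper's exact update equation, but it shows the idea of steering the admission probability α toward a small positive idle fraction:

    TARGET_IDLE = 0.05  # keep ~5% idle time (assumed target, point E)
    STEP = 0.05         # adaptation gain (assumed)

    def update_alpha(alpha, measured_idle):
        # Spare capacity: admit more new connections for authentication.
        if measured_idle > TARGET_IDLE:
            return min(1.0, alpha * (1 + STEP))
        # Saturated: admit fewer, shedding load back toward point E.
        return max(0.01, alpha * (1 - STEP))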

  33. Attacks • Social Engineering • Attack: use people to circumvent authentication • Solution: Kill-Bots puzzles expire in 4 minutes • Polluting the Bloom Filter • Attack: spoof IP addresses to fill the filter • Solution: SYN cookies prevent IP spoofing • Copy Attacks • Attack: solve one puzzle, copy the cookie to zombies • Solution: 8 simultaneous connections per cookie

  34. Attacks (Cont.) • Replay Attacks • Attack: reuse authentication information • Solution: cookies are time-stamped and hashed • Solution: cookies are based on the puzzle answer • Database Attack • Attack: learn all the possible puzzles • Solution: use a rotating set of puzzles • Breaking CAPTCHA • Attack: decipher the puzzles automatically • Solution: create a different type of puzzle

  35. Attack Strategies • Attack rate a = 4,000 req/s • N = 25,000 attacking clients • Quick exhaust: a fresh IP every 2.5 s • Slow exhaust: a fresh IP every 5.0 s

  36. Experimental Environment • Web server • 2.0 GHz P4 with 1 GB RAM • Hosted two websites with mathopd: a Debian mirror and the CSAIL web server • 100 Mbps Ethernet connection • Attack • 100 PlanetLab nodes • 256 attackers per node (25,600 total attackers)

  37. Metrics • Goodput of legitimate clients • # of bytes delivered to all legitimate clients • Response times of legitimate clients • Elapsed time to complete a request (<60s) • Total number of legitimate requests dropped

  38. PlanetLab Results • CyberSlam attack • a = 4,000 req/s • Attack lasts 1,800 s • 60% of legitimate users solve the puzzle correctly

  39. PlanetLab Results (Cont.) • Flash crowd (non-attack) • f = 2,000 req/s (normal load = 300 req/s) • Kill-Bots improves performance • The base server wastes throughput on retries and incomplete transfers

  40. Discussion • Willingness to solve puzzles? 60% • A research group's web page is NOT a standard audience • Solving graphical puzzles is not possible for text browsers or visually impaired users • NAT/Proxy Solution • Requires zombies to be x-1 times as active as legitimate users • Arbitrary parameter values • Flash crowds: the base server has no connection limiting, which is not realistic
