
#16 Application Measurement

Presentation by Bobin John. 1st paper: Measurement, Modeling & Analysis of a Peer-to-Peer File-Sharing Workload (KaZaa paper). P2P file sharing is the most dominant. This paper deals with KaZaa. A 200-day trace is taken. A model is developed.


Presentation Transcript


  1. #16 Application Measurement Presentation by Bobin John

  2. 1st paper: Measurement, Modeling & Analysis of a Peer-to-Peer File-Sharing Workload (KaZaa paper)

  3. KaZaa paper • P2P file sharing dominates Internet traffic • This paper deals with KaZaa • A 200-day trace is taken • A model is developed • Locality-awareness can improve KaZaa performance

  4. KaZaa paper • Trace Methodology • KaZaa trace summary statistics • KaZaa “usernames” used • KaZaaLite … IPs used • Easy to distinguish KaZaa-specific HTTP headers • Auto-update transactions filtered out

  5. KaZaa paper • User Characteristics • KaZaa users are patient

  6. KaZaa paper • User Characteristics • Users slow down as they age • 2 reasons: attrition & slowing down over time

  7. KaZaa paper • Client Activity

  8. KaZaa paper • Object Characteristics • Diverse workload

  9. KaZaa paper • Object Characteristics • Object Dynamics • Clients fetch objects at most once • Popularity of objects is often short-lived • Most popular objects tend to be recently born objects • Most requests are for old objects
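
The "clients fetch objects at most once" behavior can be illustrated with a toy simulation. This is a minimal sketch, not the model from the paper: the function name `simulate_requests`, the Zipf exponent, and all parameter values are assumptions chosen for illustration. Each client draws objects from a Zipf distribution and, when `fetch_at_most_once` is set, redraws whenever it picks an object it already has.

```python
import random

def simulate_requests(n_clients, n_objects, requests_per_client,
                      alpha=1.0, fetch_at_most_once=True, seed=0):
    """Count requests per object when clients draw from a Zipf distribution.

    Hypothetical toy model: parameters and defaults are illustrative only.
    """
    if fetch_at_most_once and requests_per_client > n_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)
    # Unnormalized Zipf weights: rank 1 is the most popular object.
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_objects + 1)]
    objects = list(range(n_objects))
    counts = [0] * n_objects
    for _ in range(n_clients):
        fetched = set()
        made = 0
        while made < requests_per_client:
            obj = rng.choices(objects, weights=weights, k=1)[0]
            if fetch_at_most_once and obj in fetched:
                continue  # the client already has this object; draw again
            fetched.add(obj)
            counts[obj] += 1
            made += 1
    return counts
```

With fetch-at-most-once enabled, no object can receive more requests than there are clients, so the most popular objects are capped and the request counts at the head of the ranking flatten out relative to a pure Zipf draw.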

  10. KaZaa paper • Object Characteristics • NOT Zipf-like • Web access patterns follow the Zipf property
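
For reference, "Zipf-like" means that the object of popularity rank i is requested with frequency proportional to 1/i^α, which is a straight line of slope -α on a log-log rank-frequency plot. The sketch below is illustrative only and is not taken from the paper; the function names and the choice α = 1.0 are assumptions. It builds ideal Zipf frequencies and fits the log-log slope, which is one simple way to check whether a measured rank-frequency list is Zipf-like.

```python
import math

def zipf_frequencies(n_objects, alpha=1.0):
    """Return normalized Zipf request probabilities for ranks 1..n_objects."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_objects + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Frequencies must be positive and ordered from most to least popular.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var
```

A measured workload whose fitted slope differs strongly between the head and the tail of the ranking is not well described by a single Zipf exponent.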

  11. KaZaa paper • Model

  12. KaZaa paper • Model for P2P file-sharing workloads • Model Description

  13. KaZaa paper • Model for P2P • File-Sharing effectiveness diminishes with client age

  14. KaZaa paper • Model for P2P • New Object Arrivals improve performance

  15. KaZaa paper • Model for P2P • New clients cannot stabilize performance

  16. KaZaa paper • Model for P2P • Model validation

  17. KaZaa paper • New idea! • How to reduce bandwidth cost? • Use a proxy cache • Legal & political problems • Locality-aware request routing • Centralized request redirection • redirector • Decentralized request redirection • supernodes
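
The centralized redirection idea can be sketched as follows. This is a hypothetical illustration under stated assumptions, not the design evaluated in the paper: the `Redirector` class, its `route` method, and the rule "pick the first available internal holder" are all invented for this example. The redirector remembers which internal peers hold each object and serves a request from inside the organization whenever an available internal peer has a copy.

```python
from collections import defaultdict

class Redirector:
    """Hypothetical centralized redirector for one organization."""

    def __init__(self):
        # object id -> set of internal peers known to hold a copy
        self._holders = defaultdict(set)
        self.hits = 0
        self.misses = 0

    def route(self, requester, object_id, available_peers):
        """Return an internal peer to fetch from, or None for an external fetch."""
        candidates = [peer for peer in sorted(self._holders[object_id])
                      if peer != requester and peer in available_peers]
        if candidates:
            self.hits += 1
            source = candidates[0]
        else:
            self.misses += 1  # no available internal copy: go outside
            source = None
        # Assume the transfer completes, so the requester now holds the object.
        self._holders[object_id].add(requester)
        return source
```

Every hit is a transfer that stays inside the organization and therefore saves external bandwidth. Passing `available_peers` explicitly makes it visible that a copy only helps while the peer holding it is online.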

  18. KaZaa paper • Locality awareness • Methodology • Benefits

  19. KaZaa paper • Locality awareness • Accounting for Hits & Misses

  20. KaZaa paper • Locality awareness • Availability

  21. KaZaa paper • Conclusion • KaZaa workload is different • Does not follow Zipf • Can be improved with locality awareness • Drawbacks • A trace from a university ought not to be generalized to all KaZaa/P2P applications • Further implementation details of locality-awareness? • Scope of use for such a locality awareness tool? • I don’t think universities would like this

  22. 2nd paper: An analysis of Internet Chat systems

  23. Chat paper • Why is chat a worthwhile target for traffic characterization? • Chat offers computer mediated communication • Used by a large number of people … potential of being habit-forming

  24. Chat paper • Different types of chat systems: • Internet Relay Chat [IRC] • Web-based chat systems • ICQ & AIM • Gale

  25. Chat paper • Problem in analyzing chat traffic • Multitude & diversity of systems & protocols • Chat protocol realized on top of HTTP protocol … difficult to separate chat traffic • Resource limitations due to filtering demands

  26. Chat paper • IRC • Set of connected servers • Client connection requests on port 6667 • Unique nicknames • Discussion channels • Channel operators • Medium to share data • IRC operator

  27. Chat paper • Web-chat • Not tty-based … Web browser interface • A single server to connect to • 3 classes of chat systems: • HTML-Web-Chat • Applet-Web-Chat • Applet-IRC-Chat • Difference between IRC & Web-chat is only “social”

  28. Chat paper • Identifying IRC chat traffic • Packet monitor that captures all TCP traffic involving port 6667 • Can only capture text & control messages • Data/file transfers cannot be captured as they run on other TCP connections • IRC’s packet size distribution is mainly dominated by small packets • IRC session should last more than a few minutes • IRC sends keep-alive messages

  29. Chat paper • Identifying Web-chat traffic • HTML-Web-chat: • Appropriate cache-control headers • Adding state information • Cache-Control: Must-revalidate & Cache-Control: Private indicate non-chat traffic • Use of scripting languages, e.g., JavaScript • Use of applet windows, e.g., Java

  30. Chat paper • Identifying Web-chat traffic • Applet-Web-chat: • User would have accessed a Java file or a script or even a page like “xxxchatyyy” … “chat” could occur even in the path
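
Two of the Web-chat hints from the slides can be sketched as a request-level filter: the Cache-Control directives that indicate non-chat traffic, and the string “chat” appearing in the host name or path. This is an illustrative sketch only. The function names are hypothetical, and treating a URL without “chat” as a non-candidate is an assumption of this example rather than a rule stated in the paper.

```python
from urllib.parse import urlsplit

# Directives that the slides list as indicating non-chat traffic.
NON_CHAT_CACHE_DIRECTIVES = {"must-revalidate", "private"}

def cache_control_directives(header_value):
    """Split a Cache-Control header value into lowercase directive names."""
    names = set()
    for part in header_value.split(","):
        name = part.strip().split("=", 1)[0].lower()
        if name:
            names.add(name)
    return names

def may_be_web_chat(url, cache_control=""):
    """Return False if the request is ruled out, True if it stays a candidate."""
    if cache_control_directives(cache_control) & NON_CHAT_CACHE_DIRECTIVES:
        return False
    parts = urlsplit(url)
    # Assumption: "chat" anywhere in the host name or path is a chat hint.
    return "chat" in (parts.netloc + parts.path).lower()
```

For example, `may_be_web_chat("http://www.example.com/xxxchatyyy/room.html")` keeps the request as a candidate, while the same URL with `cache_control="private"` is ruled out.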

  31. Chat paper • Overall strategy for extracting chat traffic

  32. Chat paper • Overall strategy for extracting chat traffic • Repeat this process • Identify traffic that cannot be chat traffic • Remove it • Steps that filter out more non-chat traffic have to be implemented earlier • Other steps that need more processing or pre-processing should be implemented later

  33. Chat paper • Overall strategy for extracting chat traffic • Eliminate traces from ports < 1024 except port 80 • Also eliminate traces from well-known application ports (e.g., Gnutella - 6346) • Group packets into flows • Mark & filter them according to the previous table
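
The port-based elimination and flow-grouping steps can be sketched as below. This is a minimal sketch under stated assumptions: packets are represented as hypothetical `(src_ip, src_port, dst_ip, dst_port, size)` tuples, a packet is dropped if either of its ports is excluded, and the excluded application-port set contains only the Gnutella example from the slide.

```python
from collections import defaultdict

HTTP_PORT = 80
# Well-known non-chat application ports; 6346 (Gnutella) is the slide's example.
EXCLUDED_APP_PORTS = {6346}

def port_may_carry_chat(port):
    """Ports below 1024 are ruled out except 80; listed application ports too."""
    if port == HTTP_PORT:
        return True
    return port >= 1024 and port not in EXCLUDED_APP_PORTS

def packet_may_be_chat(src_port, dst_port):
    """Assumption: a packet is dropped if either endpoint port is ruled out."""
    return port_may_carry_chat(src_port) and port_may_carry_chat(dst_port)

def group_into_flows(packets):
    """Group surviving packets into bidirectional flows keyed by endpoint pair."""
    flows = defaultdict(list)
    for src_ip, src_port, dst_ip, dst_port, size in packets:
        if not packet_may_be_chat(src_port, dst_port):
            continue
        # Sort the two endpoints so both directions map to the same key.
        key = tuple(sorted([(src_ip, src_port), (dst_ip, dst_port)]))
        flows[key].append(size)
    return dict(flows)
```

Running the cheap port test before grouping follows the ordering principle from the previous slide: the step that discards the most traffic for the least work goes first.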

  34. Chat paper • Experiment • At University of Saarland • Resource partitioning • Traces were generated after filtering • 950GB > 1.2GB > 238MB (WEBCHAT1) • 192MB (IRC1) • 350MB (WEBCHAT2)

  35. Chat paper • Validation • 2 aspects: • Recall – ability of a system to present all relevant items • Precision – ability of a system to present only relevant items

  36. Chat paper • Validation • Lots of calculations • “we can expect to locate about 91.7% of all real chat connections and that we expect that at least 93.1% of all connections we identify are indeed chat connections.”
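
In the quoted result, the 91.7% figure is a recall and the 93.1% figure is a precision. The sketch below shows the standard definitions of both measures on labeled connection sets. It is not the estimation procedure used in the paper, and the function name and example identifiers are hypothetical.

```python
def precision_recall(identified, actual):
    """Compute (precision, recall) for a set of identified chat connections.

    identified: connections the filter labeled as chat.
    actual: connections that really are chat (ground truth).
    """
    identified, actual = set(identified), set(actual)
    true_positives = len(identified & actual)
    # Precision: share of identified connections that are really chat.
    precision = true_positives / len(identified) if identified else 0.0
    # Recall: share of real chat connections that were identified.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall
```

For instance, if the filter reports four connections of which three are real chat, and five real chat connections exist in total, precision is 3/4 and recall is 3/5.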

  37. Chat paper • Results • Session durations

  38. Chat paper • Results • Interarrival times of sessions

  39. Chat paper • Results • Packet sizes

  40. Chat paper • Results • Sent & Received bytes

  41. Chat paper • Conclusion • Chat-traffic was successfully filtered out • Accuracy was above 90% • Drawbacks • Use of this work?
