Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload

Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload. K.P. Gummadi, R. J. Dunn, et al., SOSP '03. Presented by Lu-chuan Kung.


Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload

K.P. Gummadi, R. J. Dunn, et al. (SOSP '03)


Presented by Lu-chuan Kung

Outline

  • Trace methodology and analysis

    • User characteristics

    • Client activities

    • Object dynamics

  • Analyze why the Kazaa workload is not Zipf

  • A model of P2P file-sharing workloads

  • A study of bandwidth-saving techniques

  • Conclusion

Trace Methodology

  • Passively collected Kazaa traffic at the border between the campus network and the Internet

  • Query traffic was not captured because it is encrypted; file transfers appear as HTTP transfers with Kazaa-specific headers

  • Summary statistics of the trace:

Kazaa Users Are Patient

  • Transfer time: the difference between the start time and the end time of a request

  • Small objects: <10MB (mostly audio files)

  • Large objects: >100MB (typically video files)

Users Slow Down As They Age

  • Do people become hungrier for content as they gain experience with Kazaa?

  • Older clients requested fewer bytes for two reasons:

    • Attrition: population declines as clients age

    • Slowing down: older clients ask for less

Client Activity

  • It is difficult to quantify the availability of clients in a P2P system

  • Client activity includes:

    • Activity fraction: time spent in transfers divided by client lifetime; a lower bound on availability

    • Average session length: the typical duration of a session

Object Characteristics

  • Kazaa is not one workload

    • Kazaa is a blend of workloads of different properties

    • 3 ranges of objects: small (<10MB), medium (10MB–100MB), and large (>100MB)

    • Majority of requests are for smaller objects

    • Most bytes transferred are due to large objects

Kazaa Object Dynamics

  • Multimedia objects are immutable, which shapes object dynamics

    • Kazaa clients fetch objects at most once

      • A client requests a given object at most once: 94% of the time

      • At most twice: 99% of the time

    • Most requests are for old (repeated) objects

      • An object is old if at least one month has passed since the first request of the object

      • 72% of requests for large objects are old

      • 52% of requests for small objects are old

Kazaa Object Dynamics (cont.)

  • The popularity of Kazaa objects is often short-lived

    • On the Web, the set of most popular pages remains stable

    • Popularity is fleeting in Kazaa

    • Popular audio files lose popularity faster than popular video files

  • The most popular Kazaa objects tend to be recently born objects

    • Newly born objects: those that received no requests during the first month of the trace

Kazaa Is Not Zipf

  • Zipf’s law:

    • The popularity of the ith-most popular object is proportional to i^(-α), where α is the Zipf coefficient

  • Kazaa is not Zipf

    • The most popular objects are less popular than a Zipf distribution would predict

Why Kazaa Is Not Zipf

  • Fetch-repeatedly vs. fetch-at-most-once

  • Simulate the two cases based on the same Zipf distribution

  • The fetch-at-most-once result resembles the observed Kazaa popularity curve

  • Non-Zipf workloads are also observed in web proxy caches and VoD servers
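The fetch-repeatedly vs. fetch-at-most-once contrast can be illustrated with a toy simulation (a sketch, not the paper's simulator; client, request, and object counts below are made-up parameters):

```python
import random
from itertools import accumulate
from collections import Counter

def simulate(num_clients, reqs_per_client, num_objects, at_most_once, seed=0):
    """Count requests per object rank under the chosen fetch model."""
    rng = random.Random(seed)
    # Zipf(1) popularity: weight of rank i is 1/i
    cum = list(accumulate(1.0 / i for i in range(1, num_objects + 1)))
    ranks = range(num_objects)
    counts = Counter()
    for _ in range(num_clients):
        fetched = set()
        for _ in range(reqs_per_client):
            choice = rng.choices(ranks, cum_weights=cum)[0]
            if at_most_once:
                # a client never re-fetches: redraw until an unseen object comes up
                while choice in fetched:
                    choice = rng.choices(ranks, cum_weights=cum)[0]
                fetched.add(choice)
            counts[choice] += 1
    return counts

repeat = simulate(200, 50, 1000, at_most_once=False)
once = simulate(200, 50, 1000, at_most_once=True)
# Fetch-at-most-once caps any one object at num_clients requests, which
# flattens the head of the popularity curve relative to pure Zipf.
print(repeat.most_common(1)[0][1], once.most_common(1)[0][1])
```

Under fetch-repeatedly the top-ranked object draws its full Zipf share of requests, while under fetch-at-most-once it can receive at most one request per client, reproducing the flattened head observed in the Kazaa trace.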

A Model of P2P File-Sharing Workloads

  • Hypothesis: underlying popularity of objects in a fetch-at-most-once system is driven by Zipf’s law

  • Each client requests 2 objects per day, choosing which object to fetch from Zipf(1)

  • New objects are born at rate λo; each object’s popularity rank is drawn from Zipf(1)

  • The total object population cannot be observed from the trace, so it is back-inferred: given that 18,000 distinct objects were requested in the trace, what total number of objects would produce this? Answer: 40,000
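The back-inference step can be sketched as follows: for each candidate population size, draw a synthetic Zipf(1) trace and count how many distinct objects appear; the candidate whose distinct count matches the observed value is the inferred population (a toy illustration with made-up, much smaller numbers than the real trace):

```python
import random
from itertools import accumulate

def distinct_objects_seen(total_requests, population, seed=0):
    """Draw Zipf(1) requests over `population` objects; count distinct ones."""
    rng = random.Random(seed)
    cum = list(accumulate(1.0 / i for i in range(1, population + 1)))
    seen = set()
    for _ in range(total_requests):
        seen.add(rng.choices(range(population), cum_weights=cum)[0])
    return len(seen)

# A larger hidden population leaves more distinct objects in a trace of
# fixed size, so the observed distinct count pins down the total.
for population in (500, 1000, 2000):
    print(population, distinct_objects_seen(5000, population))
```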

Model Structure and Notation

  • Parameter values are chosen to reflect the measured data from the trace

File-Sharing Effectiveness

  • How should an organization expect bandwidth demand to change over time, given a shared proxy cache?

  • Hit rate of the proxy cache decreases in the fetch-at-most-once case

  • Fetch-at-most-once clients consume the most popular objects early

New Object Arrivals Improve Hit Rate

  • Object updates on the Web lower the hit rate

  • New object arrivals are beneficial in a P2P system

    • Arrivals of popular objects increase hit rate

    • If no arrivals, clients are forced to choose from the remaining unpopular objects

New Clients Cannot Stabilize Performance

  • The infusion of new clients at a constant rate cannot compensate for the increasing number of old clients

  • Keeping the hit rate constant would require the client arrival rate to grow exponentially

Model Validation

  • Underlying Zipf assumption cannot be validated directly.

  • Use the proposed model to replicate the object popularity distribution in the trace

    • Estimate various parameters

    • Arrival rate of new objects is chosen to fit the measured data. λo = 5,475 objects per year

Exploring Locality-Aware Request Routing

  • A significant fraction of Internet bandwidth is consumed by Kazaa

  • How would exploitation of locality help to save bandwidth?

  • Different ways to exploit locality:

    • A centralized proxy cache placed at the organization’s border

    • Request redirection: favor organization-internal peers

      • Centralized request redirection

      • Decentralized request redirection

An Ideal Proxy Cache

  • Assume an ideal proxy: infinite capacity and bandwidth

  • 86% of external bandwidth would be saved

  • However, organizations may not want to store P2P file-sharing content on a proxy server due to legal issues
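The 86% figure comes from replaying the trace through an ideal cache; the accounting behind such a replay is simple (a minimal sketch with a hypothetical mini-trace, not the paper's trace):

```python
def external_savings(requests):
    """requests: iterable of (object_id, size_bytes) transfer records.

    With an ideal border proxy (infinite storage and bandwidth), only the
    first request for each object crosses the external link; every repeat
    is a cache hit. Returns the fraction of external bytes saved."""
    total = external = 0
    seen = set()
    for obj, size in requests:
        total += size
        if obj not in seen:
            seen.add(obj)
            external += size
    return 1 - external / total

# hypothetical mini-trace: object "a" fetched three times, "b" once
trace = [("a", 100), ("b", 50), ("a", 100), ("a", 100)]
print(round(external_savings(trace), 4))  # repeats of "a" are all saved
```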

Benefits of Locality-Awareness

  • Trace-based simulation

    • Infinite storage capacity

    • At most 12 concurrent downloads

    • Upload bandwidth 500 Kb/s

    • External bandwidth 100 Kb/s

    • Clients are available only when they’re transferring (a very conservative assumption)

  • Cold misses: the object cannot be found on any internal peer

  • Busy misses: the object is found, but every holding peer is unavailable due to concurrent transfers
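The miss taxonomy can be made concrete with a small classifier (a sketch only; the actual trace-based simulator also tracks bandwidth and download slots; all names below are hypothetical):

```python
def classify(obj, holders, busy):
    """Classify a request in a locality-aware redirection simulation.

    obj      -- requested object id
    holders  -- dict: object id -> set of internal peers storing it
    busy     -- set of peers at their concurrent-transfer limit
    """
    peers = holders.get(obj, set())
    if not peers:
        return "cold miss"      # no internal peer has the object
    if peers <= busy:
        return "busy miss"      # holders exist but all are saturated
    return "hit"                # can be served internally

holders = {"song.mp3": {"peer1", "peer2"}, "movie.avi": {"peer3"}}
busy = {"peer3"}
print(classify("song.mp3", holders, busy))   # hit
print(classify("movie.avi", holders, busy))  # busy miss
print(classify("new.avi", holders, busy))    # cold miss
```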

Benefits of Locality-Awareness (cont.)

  • Locality awareness achieves a 68% byte hit rate for large objects and 37% for small objects

  • A substantial fraction of missed bytes (62% for large objects, 43% for small) is due to unavailable clients

Benefits of Increased Availability

  • Most of the bytes served and consumed come from highly available peers

  • Adding availability to the most available hosts yields a higher hit rate than adding it to the least available hosts

Conclusion

  • P2P file-sharing workloads are different from Web workloads

    • Users are patient

    • Aged clients demand less

    • Fetch-at-most-once behavior

  • The proposed model suggests that client births and object births are the fundamental forces driving P2P workloads

  • There’s significant locality in the Kazaa workload

    • Locality-aware peers would save 63% of external transfer bytes even under conservative assumptions

Comments

  • Some of the observed characteristics may stem from the design of Kazaa and the measurement methodology, and thus may not generalize

  • The lack of portal sites in P2P systems may also explain why the most popular P2P objects are less popular than Zipf’s law would predict


Assessing the Quality of Voice Communications Over Internet Backbones

A.P. Markopoulou, F.A. Tobagi, M.J. Karam

IEEE/ACM Transactions on Networking, vol. 11, no. 5, Oct. 2003

Presented by Lu-chuan Kung

Outline

  • VoIP System

    • Playout schemes

  • Voice Impairment in Networks

  • Internet measurements

  • Numerical results

  • Discussion

VoIP System

VoIP System (cont.)

  • Speech signal

    • Talkspurts have a mean duration of ~352 ms

    • Silence periods have a mean duration of ~650 ms

  • Encoding schemes

  • Packetizer: adds protocol headers (e.g., RTP/UDP/IP)

  • Playout buffer: packets are held until a later playout time to smooth out delay jitter

  • Decoder: reconstructs the speech signal

Playout Schemes

  • Two types: fixed and adaptive

  • Fixed playout scheme:

    • End-to-end delay p is the same for all packets

    • A large p reduces packet loss due to late arrivals, but also reduces interactivity

  • Adaptive playout scheme:

    • Estimate p from the average delay d_av and the delay variation v

    • p = d_av + 4v

    • p can be re-estimated either:

      • Talkspurt by talkspurt

      • Packet by packet
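A per-packet version of the adaptive estimate is commonly implemented with exponential averages of the network delay and its variation (a sketch; the α value below is the commonly cited default for this scheme, not taken from the slides):

```python
def playout_delay(network_delays, alpha=0.998002):
    """Per-packet adaptive playout estimate: track an average delay d_av and
    a variation v with exponential smoothing, then set p = d_av + 4*v."""
    d_av = float(network_delays[0])
    v = 0.0
    for n in network_delays:
        d_av = alpha * d_av + (1 - alpha) * n
        v = alpha * v + (1 - alpha) * abs(d_av - n)
    return d_av + 4 * v

print(playout_delay([50.0] * 100))       # no jitter: p stays at the delay
print(playout_delay([50.0, 70.0] * 50))  # jitter inflates p via the 4*v term
```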

Voice Impairment in Networks

  • Quality of voice is affected by

    • Encoding

    • Packet loss

    • Network delay jitter

    • End-to-end delay

    • Echo

  • End-to-end delay consists of

    • Encoding delay

    • Packetization delay

    • Network delay

    • Playout buffering delay

    • Decoding delay

Assessment of Voice Communication in Packet Networks

  • Mean Opinion Score (MOS): a subjective rating given by listeners on a scale of 1 (bad) to 5 (excellent)

  • Intrinsic quality MOS_intr: the quality after compression alone, before any network impairment

Degradation Due to Loss

  • PLC: Packet Loss Concealment

  • Convert loss rate to MOS

Loss of Interactivity

  • Loss of interactivity due to large end-to-end delay

  • NTT study

    • 6 conversation modes (tasks): task 1 is the most interactive (hardest), task 6 the most relaxed

Echo Impairment

  • Echo can cause major quality degradation

  • The effect of echo is a function of the delay and the echo loss

E-model

  • Published by ITU-T; provides formulas to predict the MOS of voice quality

  • R = (R0 – Is) – Id – Ie + A

    • R0: basic signal-to-noise ratio

    • Is: impairments that occur with the signal, e.g., sidetone and PCM quantization

    • Id: impairment due to delay (echo + interactivity)

    • Ie : impairment due to distortion (loss)

    • A : advantage factor (lenient users)
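The R rating maps to an estimated MOS through the standard ITU-T G.107 conversion; a minimal sketch (the impairment values passed in would come from the loss, delay, and echo curves above):

```python
def r_to_mos(r):
    """Standard ITU-T G.107 conversion from an E-model R rating to MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

def e_model_mos(r0, i_s, i_d, i_e, a=0.0):
    """R = (R0 - Is) - Id - Ie + A, then convert to MOS."""
    return r_to_mos((r0 - i_s) - i_d - i_e + a)

# The default transmission rating R0 - Is of about 93.2 with no further
# impairments gives the usual narrowband ceiling of roughly MOS 4.4.
print(round(e_model_mos(93.2, 0, 0, 0), 2))  # → 4.41
```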

Internet Measurements

  • Probe measurement

    • 5 major U.S. cities

    • 43 paths in total

    • 7 providers: P1,P2,…,P7

    • 50-byte probes sent every 10 ms

Observations on the Traces

  • Duration of the trace: 3 days

  • Network loss

    • 6 out of 7 providers have outages

    • Outages happened at least once per day

  • Delay characteristics

    • Delay spikes

    • Alternation between high and low states

    • Periodic clustered delay spikes

One Example Call

  • Apply the E-model to the traces using different playout buffer schemes

  • Example of a 15-min call

One Example Call (cont.)

  • The fixed playout scheme incurs many losses in the last 5 minutes

How to Choose p for the Fixed Scheme

  • Tradeoff between loss and delay

  • There is an optimal value of p that achieves the maximum MOS
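The tradeoff can be seen in a toy sweep: as p grows, late-packet loss falls while the delay penalty rises, so a quality score peaks at an interior value of p (the delays and penalty weights below are made up for illustration, not the paper's E-model numbers):

```python
def late_loss(delays, p):
    """Fraction of packets arriving after the fixed playout deadline p."""
    return sum(d > p for d in delays) / len(delays)

def toy_score(delays, p, loss_penalty=10.0, delay_penalty=0.004):
    """Toy quality score: start from 4.5, penalize late loss and delay."""
    return 4.5 - loss_penalty * late_loss(delays, p) - delay_penalty * p

delays = [40, 45, 50, 55, 60, 80, 120, 200]  # hypothetical network delays (ms)
best_p = max(range(40, 301, 10), key=lambda p: toy_score(delays, p))
print(best_p, round(toy_score(delays, best_p), 2))
```

Sweeping p over a grid and keeping the argmax mirrors how an operator would pick the fixed playout delay from measured delay distributions.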

Example Path – Many Calls

  • Random calls uniformly spread over an hour

  • 150 short (3.5-min) and 50 long (10-min) calls

  • Plot the CDF of MOS across calls

Fixed Playout

Adaptive Playout

Discussion

  • Backbone networks have a wide range of performance

    • Some are already able to support high quality voice communications

    • Some are barely able to provide acceptable VoIP service (MOS >3.6)

    • Reliability problems (outages) are a more serious concern than QoS mechanisms

Comments

  • How representative are the chosen paths of typical paths on the Internet?

    • End-host-to-end-host paths would have larger delays