Performance Analysis of a Parallel Downloading Scheme from Mirror Sites Throughout the Internet


Allen Miu, Eugene Shih

6.892 Class Project

December 3, 1999

Overview
  • Problem Statement
  • Advantages/Disadvantages
  • Operation of Paraloading
  • Goals of Experiment
  • Setup of Experiment
  • Current Results
  • Summary
  • Questions
Problem Statement: Is “Paraloading” Good?

Paraloading is downloading a file from multiple mirror sites in parallel.

[Diagram: Paraloader with Mirror A, Mirror B, and Mirror C]

Advantages of Paraloading
  • Performance is proportional to the realized aggregate bandwidth of the parallel connections
  • Less prone to complete download failures compared to the single connection download
  • Facilitates dynamic load balancing among parallel connections
  • Facilitates reliable, out-of-order delivery (similar to Netscape)
Disadvantages of Paraloading
  • Can be overly aggressive
  • Consumes more server resources
  • Overhead costs for scheduling, maintaining buffers, and sending block request messages
  • Only effective when mirror servers are available
Step 1: Obtain Mirror List
  • Hard-coded
  • DNS?

[Diagram: Mirror List, Paraloader, Mirror A, Mirror B, and Mirror C]

Step 2: Obtain File Length

[Diagram: Paraloader with Mirror A, Mirror B, and Mirror C]
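
The slides do not say how the file length is obtained. One plausible approach over HTTP/1.1 is a HEAD request followed by reading the Content-Length response header. The sketch below only builds the request and parses a header mapping; the function names and the example URL are hypothetical, and nothing is sent over the network.

```python
from urllib.request import Request


def head_request(url):
    """Build (but do not send) an HTTP HEAD request for the target file.

    Assumption: the mirror answers HEAD with a Content-Length header.
    """
    return Request(url, method="HEAD")


def file_length(headers):
    """Extract the file length in bytes from a mapping of response headers."""
    value = headers.get("Content-Length")
    if value is None:
        raise ValueError("mirror did not report Content-Length")
    return int(value)


# Hypothetical mirror URL, for illustration only.
request = head_request("http://mirror-a.example/file.bin")
length = file_length({"Content-Length": "1048576"})
```

In a real client the header mapping would come from the response to the HEAD request on one of the open mirror connections.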

Step 3: Send Block Requests

[Diagram: Paraloader with Mirror A, Mirror B, and Mirror C]
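
A minimal sketch of how block requests might be formed, assuming each block maps to one HTTP/1.1 range request. The helper names are hypothetical, the 32 KB default mirrors the block size used in the experiment, and 1 MB is taken to mean 1024 * 1024 bytes, which is an assumption.

```python
def block_ranges(file_length, block_size=32 * 1024):
    """Split a file into inclusive (first_byte, last_byte) block ranges."""
    if file_length < 0 or block_size <= 0:
        raise ValueError("file_length must be >= 0 and block_size > 0")
    return [
        (start, min(start + block_size, file_length) - 1)
        for start in range(0, file_length, block_size)
    ]


def range_header(first_byte, last_byte):
    """Build the Range header for one block; HTTP byte ranges are inclusive."""
    return {"Range": f"bytes={first_byte}-{last_byte}"}


one_mb_blocks = block_ranges(1024 * 1024)   # the 1 MB test file
small_blocks = block_ranges(300 * 1024)     # the 300 KB test file
first_request = range_header(*one_mb_blocks[0])
```

Under these assumptions the 1 MB file splits into 32 full blocks, while the 300 KB file splits into 9 full blocks plus a shorter final block.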

Step 4: Re-order

[Diagram: Paraloader with Mirror A, Mirror B, and Mirror C]
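
Because blocks from different mirrors can arrive out of order, the client needs some form of re-ordering buffer. The sketch below is one simple way to do it, not necessarily how the authors' Java paraloader works; the class name `ReorderBuffer` is hypothetical.

```python
class ReorderBuffer:
    """Hold out-of-order blocks and release them in file order."""

    def __init__(self):
        self._pending = {}     # block index -> block data not yet released
        self._next_index = 0   # index of the next block the file needs

    def add(self, index, data):
        """Store one block and return the blocks that are now in order."""
        self._pending[index] = data
        ready = []
        while self._next_index in self._pending:
            ready.append(self._pending.pop(self._next_index))
            self._next_index += 1
        return ready


buf = ReorderBuffer()
early = buf.add(1, b"B")    # block 1 arrives first and must wait for block 0
in_order = buf.add(0, b"A") # block 0 arrives, so both can be released
```

The buffer grows only while an earlier block is still outstanding, which is part of the buffer-maintenance overhead listed among the disadvantages.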

Step 5: Send Next Request

[Diagram: Paraloader with Mirror A, Mirror B, and Mirror C]

Goals of Experiment
  • Main goal: To compare the performance of serial and parallel downloading
  • To verify the results of Rodriguez et al.
  • To examine whether varying the degree of parallelism (the number of mirror servers used) affects performance
  • To gain experience with paraloading and to find out what issues are involved in designing efficient paraloading systems
Experiment Setup
  • Implemented a paraloader application in Java, using HTTP/1.1 (range requests and persistent connections)
  • Files are downloaded at MIT from 3 different sets (kernel, mars, tucows) of 7 mirror servers
  • Degree of parallelism examined: M = 1, 3, 5, 7
  • Downloaded a 1 MB file and a 300 KB file (S = 1 MB, 300 KB) at 1-hour intervals for 7 days
  • Block size = 32 KB
Results
  • Paraloading decreases download time over the average single connection case
  • Speedup is far from the optimal case (aggregate bandwidth)
    • Block request gaps result in wasted bandwidth
      • Gaps are proportional to RTT
    • Congestion at client? Possible but unlikely.
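
To illustrate why block request gaps waste bandwidth, here is a rough model, assuming each connection sits idle for one round-trip time between finishing a block and receiving the first byte of the next (no request pipelining). The function name and the example bandwidth and RTT values are assumptions, not measurements from the experiment.

```python
def connection_utilization(block_bytes, bandwidth_bytes_per_s, rtt_s):
    """Fraction of time a connection spends transferring data.

    Assumed model: each block costs one idle RTT plus its transfer time.
    """
    transfer_s = block_bytes / bandwidth_bytes_per_s
    return transfer_s / (transfer_s + rtt_s)


# Illustrative numbers only: a 32 KB block on a 320 KB/s path with 100 ms RTT.
example = connection_utilization(32 * 1024, 320 * 1024, 0.1)
```

In this model the idle gap grows linearly with RTT, so a larger block size or a shorter RTT both push utilization closer to 1.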
Acknowledgements
  • Dave Anderson
  • Dorothy Curtis
  • Wendi Heinzelmann
  • WIND Group
Summary of Contributions
  • Implemented a paraloader
  • Verified that paraloading indeed provides performance gain… sometimes
    • Increasing degree of parallelism improves overall performance
  • Performance gains are not as good as those reported by Rodriguez et al.
Future Work
  • Examine how block size affects performance gain
  • Examine cost of paraloading
  • Implement and test various optimization techniques
  • Perform measurements at different client sites
Paraloading Will Not Be Effective In All Situations
  • Clients should have enough “slack” bandwidth capacity to open more than one connection
  • Parallel connections are bottleneck disjoint
  • Target data on mirror servers is consistent and static
  • Security and authentication services are installed where appropriate
  • Data transport is reliable
  • Mirror locations are quickly and easily obtained
Step-by-step Process of the Block Scheduling Paraloading Scheme

1. Obtain a list of mirror sites

2. Open a connection to a mirror server and obtain file length

3. Divide file length into blocks

4. Send a block request to each open connection

5. Wait for a response

6. Send a new block request to the first connection that finished downloading a block

7. Loop back to 5 until all blocks are retrieved
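
The scheduling loop in steps 4 through 7 can be sketched as a small event-driven simulation. This is an illustrative model rather than the authors' implementation: `simulate_paraload` is a hypothetical name, each server is described by an assumed (bandwidth, RTT) pair, and every block costs one RTT plus its transfer time.

```python
import heapq


def simulate_paraload(num_blocks, block_bytes, servers):
    """Simulate block scheduling over several mirrors.

    servers: list of (bandwidth_bytes_per_s, rtt_s) pairs (assumed values).
    Returns (finish_time_s, blocks_served_per_server).
    """
    if not servers:
        raise ValueError("at least one server is required")
    counts = [0] * len(servers)
    events = []        # min-heap of (completion_time, server_index)
    next_block = 0
    # Step 4: send one block request on each open connection.
    for i, (bandwidth, rtt) in enumerate(servers):
        if next_block < num_blocks:
            heapq.heappush(events, (rtt + block_bytes / bandwidth, i))
            next_block += 1
    finish = 0.0
    # Steps 5 to 7: the first connection to finish gets the next block.
    while events:
        now, i = heapq.heappop(events)
        counts[i] += 1
        finish = now
        if next_block < num_blocks:
            bandwidth, rtt = servers[i]
            heapq.heappush(events, (now + rtt + block_bytes / bandwidth, i))
            next_block += 1
    return finish, counts


# Toy example: server 0 is twice as fast as server 1, with zero RTT.
finish, counts = simulate_paraload(3, 2, [(2.0, 0.0), (1.0, 0.0)])
```

Faster mirrors finish blocks sooner and therefore receive more requests, which is the dynamic load balancing listed among the advantages.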

Paraloading is Not a Well-studied Concept
  • Byers et al. proposed using Tornado codes to facilitate paraloading.
  • Rodriguez et al. proposed the block scheduling paraloading scheme that is used in our project