CoDeeN, Large Files, & CoDeploy





KyoungSoo Park, Vivek Pai, Larry Peterson

Princeton University

What Is CoDeeN?
  • Content Distribution Networks
  • Web pages load faster if
    • You’re contacting a nearby server
    • That server isn’t overloaded
    • The page is already in memory
    • You use long-lived TCP connections right
CoDeeN By The Numbers
  • In operation ~10 months
  • 150 nodes (~120 live)
  • 6.5 million reqs/day
    • 5 million “good” reqs/day
    • about 300GB/day (estimate)
  • 7K-20K unique IPs per 24 hours
    • Over 600,000 unique IPs served
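
As a rough cross-check of these figures, the sketch below derives an average response size and the share of "good" requests. It assumes decimal units and that the 300GB/day estimate covers all 6.5 million requests, which the slide does not state; the variable names are illustrative.

```python
# Rough cross-check of the slide's traffic figures.
# Assumptions (not stated in the talk): decimal units, and the
# ~300 GB/day estimate is spread over all 6.5 million requests.
REQS_PER_DAY = 6.5e6
GOOD_REQS_PER_DAY = 5e6
BYTES_PER_DAY = 300e9

avg_kb_per_request = round(BYTES_PER_DAY / REQS_PER_DAY / 1_000)
good_share_percent = round(GOOD_REQS_PER_DAY / REQS_PER_DAY * 100)

print(avg_kb_per_request)   # 46
print(good_share_percent)   # 77
```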
Our “Strategy”
  • Stay operational
  • Build some credibility
  • Exploit that + activity to branch out
    • Involves doing sales pitches
  • Tap into new consumers
    • In particular, nonprofits, non-commercial
How Big?
  • 200 TeraBytes of data total
  • Interviews: about 3.5GB each
  • Files: average of 700MB each
Problem: “Nobody” Handles 700MB
  • CDNs designed for avg size 10KB
  • 1MB = 100 files
  • 700MB = 70,000 files
  • Commercial disks ~ 100GB
    • Our storage ~ 3GB
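
The arithmetic on this slide can be checked with a short sketch. It assumes decimal units (1 MB = 1,000 KB); the variable names are illustrative and not from the talk.

```python
# Back-of-the-envelope check of the slide's numbers.
# Assumption: decimal units (1 MB = 1,000 KB, 1 GB = 1,000 MB).
AVG_OBJECT_KB = 10      # average object size CDNs are designed for
LARGE_FILE_MB = 700     # average large-file size from the previous slide
NODE_STORAGE_GB = 3     # "our storage"

files_per_mb = 1_000 // AVG_OBJECT_KB
equivalent_small_files = LARGE_FILE_MB * files_per_mb
whole_files_per_node = (NODE_STORAGE_GB * 1_000) // LARGE_FILE_MB

print(files_per_mb)            # 100
print(equivalent_small_files)  # 70000
print(whole_files_per_node)    # 4
```

Under these assumptions a single 700MB file displaces as much cache as 70,000 average objects, and only about four such files fit in 3GB.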
New Problems
  • Why not replicate less?
    • You’re farther away
  • Why not merge requests?

[Figure labels: slow, client, client, readahead]

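Our Approach

[Figure: a Client talks to an Agent, which reaches the Server through several CDN nodes. The request for "file" appears split into pieces labeled file0-1, file1-2, file2-3, file3-4, and file4-5, spread across the CDN nodes.]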
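
A minimal sketch of the idea in this diagram: cut one large file into byte ranges and spread the pieces over several CDN nodes. The talk does not say how pieces are mapped to nodes, so the round-robin assignment here is only a stand-in, and the function and node names are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (start, end) byte offsets


def split_into_ranges(length: int, chunk: int) -> List[Range]:
    """Return inclusive byte ranges that together cover [0, length)."""
    if length < 0 or chunk <= 0:
        raise ValueError("length must be >= 0 and chunk must be > 0")
    return [
        (start, min(start + chunk, length) - 1)
        for start in range(0, length, chunk)
    ]


def assign_round_robin(ranges: List[Range], nodes: List[str]) -> List[Tuple[str, Range]]:
    """Pair each range with a node, cycling through the node list.

    Round-robin is an assumption made for illustration only.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    return [(nodes[i % len(nodes)], r) for i, r in enumerate(ranges)]


if __name__ == "__main__":
    pieces = split_into_ranges(length=10, chunk=4)
    for node, (start, end) in assign_round_robin(pieces, ["cdn-a", "cdn-b"]):
        print(node, start, end)
```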

Low-Level HTTP Stuff

  GET name/ranges
  Header: blah
  Header: blah

  HTTP/1.0 206 Partial
  Range: start-end/length
  Header: blah

  GET name
  Range: bytes ranges
  Header: blah

  HTTP/1.0 200 OK
  Content-length: piece length
  New-header: obj length

[Figure labels: egress, ingress]
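
One plausible reading of these four messages is that a request carrying the range in its name is turned into an ordinary byte-range request, and the 206 reply is turned back into a 200 reply for just that piece. The sketch below illustrates that reading only. The `name/start-end` URL layout and the `X-Object-Length` header name are hypothetical; standard HTTP reports the range in a 206 reply via `Content-Range`, which the slide abbreviates.

```python
import re
from typing import Dict, Tuple

# Hypothetical layout: the byte range is appended to the object name.
_RANGE_SUFFIX = re.compile(r"^(?P<name>.+)/(?P<start>\d+)-(?P<end>\d+)$")
# Standard Content-Range form in a 206 reply: "bytes start-end/length".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+)$")


def to_origin_request(path: str) -> Tuple[str, Dict[str, str]]:
    """Turn 'name/start-end' into ('name', {'Range': 'bytes=start-end'})."""
    m = _RANGE_SUFFIX.match(path)
    if m is None:
        raise ValueError(f"no range suffix in {path!r}")
    return m["name"], {"Range": f"bytes={m['start']}-{m['end']}"}


def to_piece_response(status: int, headers: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
    """Turn a 206 reply into a 200 reply describing only the piece."""
    if status != 206:
        raise ValueError("expected a 206 Partial Content reply")
    m = _CONTENT_RANGE.match(headers["Content-Range"])
    if m is None:
        raise ValueError("unparseable Content-Range header")
    start, end, total = (int(g) for g in m.groups())
    return 200, {
        "Content-Length": str(end - start + 1),  # piece length
        "X-Object-Length": str(total),           # hypothetical "New-header"
    }
```

Giving each piece its own name and a plain 200 reply would let ordinary caches store pieces as independent objects, which fits the later claim that only small CDN changes are needed.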

Benefits
  • Transparent to client (no software)
  • Server only needs byte-range support
    • Every real server has it
    • Will generate more log entries
  • Can use/augment HTTP infrastructure
    • Caching, redirection, etc.
    • Adding security controls
  • Low incremental overhead
    • Agent is about 300 semicolons
    • CDN mods about 20 semicolons
Dual-Use Technology
  • Other one-to-many problems
    • Node/experiment installs
    • Software updates
  • Push model instead of pull
  • Solution?
    • Build “master” script
    • Push to nodes
    • Nodes pull as needed
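
The "master script" step could look roughly like the sketch below, which generates a small shell script that each node runs to pull its files through a proxy. The talk does not describe the script's contents; the use of `curl`, the proxy address, and the function name are all assumptions made for illustration.

```python
import shlex
from typing import Iterable, Tuple


def build_master_script(files: Iterable[Tuple[str, str]], proxy: str) -> str:
    """Return shell script text that fetches each (url, dest) via `proxy`.

    Hypothetical helper: the real CoDeploy tooling is not shown in the talk.
    """
    lines = ["#!/bin/sh", "set -e"]
    for url, dest in files:
        # curl: -sS = quiet but show errors, -x = proxy, -o = output file
        lines.append(
            "curl -sS -x {} -o {} {}".format(
                shlex.quote(proxy), shlex.quote(dest), shlex.quote(url)
            )
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_master_script(
        [("http://origin.example/pkg.tar", "/tmp/pkg.tar")],
        proxy="http://proxy.example:3128",
    ))
```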
CoDeploy
  • Now in beta
  • Small set of tools at source
    • No (new) installation at target
    • Needed tools at CoDeeN-hosting nodes
  • Fun components
    • Peer-review system of CoDeeN nodes
    • Nearest CoDeeN finder
    • Parallel ssh, scp
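
To illustrate the "parallel ssh" component, here is a minimal sketch that runs one command on many hosts concurrently and collects exit codes. It is not the CoDeploy tool itself; the function names, the worker count, and the injectable `runner` parameter are assumptions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def ssh_run(host: str, command: str) -> int:
    """Run `command` on `host` with the system ssh client; return its exit code."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return subprocess.run(["ssh", "-o", "BatchMode=yes", host, command]).returncode


def run_everywhere(
    hosts: List[str],
    command: str,
    runner: Callable[[str, str], int] = ssh_run,
    workers: int = 16,
) -> Dict[str, int]:
    """Run `command` on every host concurrently; map host -> exit code."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = pool.map(lambda h: runner(h, command), hosts)
        return dict(zip(hosts, codes))
```

Passing a different `runner` makes it possible to try the fan-out logic without any real hosts, for example `run_everywhere(["a", "b"], "true", runner=lambda h, c: 0)`.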
What To Expect Next
  • Will redeploy auto-rewriting service
    • Akamai-like URL mangling
    • Was in testing before December upgrade
  • Tie rewriter into “hosting” service
    • Make it simpler for provider to use CoDeeN
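
URL mangling of this kind can be sketched as rewriting an origin URL so that it points at a CDN node while still identifying the origin. The talk does not show the actual rewriting format; the layout below (origin host name as the first path component) and the function name are assumptions.

```python
from urllib.parse import urlsplit


def mangle(url: str, cdn_host: str) -> str:
    """Rewrite an http URL to go through `cdn_host`.

    Hypothetical format: http://<cdn_host>/<origin host><path>[?query]
    """
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.netloc:
        raise ValueError(f"expected an absolute http URL, got {url!r}")
    rewritten = f"http://{cdn_host}/{parts.netloc}{parts.path or '/'}"
    if parts.query:
        rewritten += "?" + parts.query
    return rewritten


if __name__ == "__main__":
    print(mangle("http://www.example.org/a/b.iso?x=1", "cdn.example:3128"))
```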
More Info

http://codeen.cs.princeton.edu/codeploy

KyoungSoo Park

[email protected]

Vivek Pai

[email protected]
