


CoDeeN, Large Files, & CoDeploy

KyoungSoo Park, Vivek Pai, Larry Peterson

Princeton University


What Is CoDeeN?

  • Content Distribution Networks

  • Web pages load faster if

    • You’re contacting a nearby server

    • That server isn’t overloaded

    • The page is already in memory

    • You use long-lived TCP connections right


CoDeeN By The Numbers

  • In operation ~10 months

  • 150 nodes (~120 live)

  • 6.5 million reqs/day

    • 5 million “good” reqs/day

    • about 300GB/day (estimate)

  • 7K-20K unique IPs per 24 hours

    • Over 600,000 unique IPs served


Our “Strategy”

  • Stay operational

  • Build some credibility

  • Exploit that + activity to branch out

    • Involves doing sales pitches

  • Tap into new consumers

    • In particular, nonprofits, non-commercial




How Big?

  • 200 TeraBytes of data total

  • Interviews: about 3.5GB each

  • Files: average of 700MB each


Problem: “Nobody” Handles 700MB

  • CDNs designed for avg size 10KB

  • 1MB = 100 files

  • 700MB = 70,000 files

  • Commercial disks ~ 100GB

    • Our storage ~ 3GB
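
The arithmetic on this slide can be checked with a short sketch. It assumes decimal units (1 MB = 1,000 KB), which is what the slide's round numbers imply; the function name is illustrative only.

```python
# Sketch: how many average-sized CDN objects one large file is equivalent to.
# Assumes decimal units (1 MB = 1000 KB), matching the slide's round numbers.
AVG_OBJECT_KB = 10  # average object size CDNs are designed for (from the slide)


def equivalent_objects(size_mb: int, avg_kb: int = AVG_OBJECT_KB) -> int:
    """Return how many avg_kb-sized objects add up to size_mb megabytes."""
    return (size_mb * 1000) // avg_kb


print(equivalent_objects(1))    # 100
print(equivalent_objects(700))  # 70000

# With ~3 GB of storage per node, only four whole 700 MB files fit at once.
print((3 * 1000) // 700)        # 4
```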


New Problems

  • Why not replicate less?

    • You’re farther away

  • Why not merge requests?

[Figure: diagram with labels "slow", "client", "client", and "readahead"]


Our Approach

[Figure: diagram with nodes labeled Client, Agent, Server, and six CDN nodes; file pieces are labeled file0-1, file1-2, file2-3, file3-4, and file4-5]
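
One way to read the diagram is that the agent splits a large file into byte ranges and fetches each piece separately through the CDN. A minimal sketch of the splitting step follows. The piece size and the function names are assumptions for illustration, not details from the slides; the `bytes=start-end` form is the standard HTTP `Range` syntax.

```python
from typing import List, Tuple


def split_into_ranges(length: int, piece: int) -> List[Tuple[int, int]]:
    """Split [0, length) into inclusive (start, end) byte ranges of at most `piece` bytes."""
    if length < 0 or piece <= 0:
        raise ValueError("length must be >= 0 and piece must be > 0")
    return [(s, min(s + piece, length) - 1) for s in range(0, length, piece)]


def range_header(start: int, end: int) -> str:
    """Format one inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


# Hypothetical 5,000-byte file cut into 1,000-byte pieces.
pieces = split_into_ranges(5_000, 1_000)
print(len(pieces))               # 5
print(range_header(*pieces[0]))  # bytes=0-999
print(range_header(*pieces[-1])) # bytes=4000-4999
```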


Low-Level HTTP Stuff

[Figure: four HTTP message boxes; diagram labels "egress" and "ingress"]

  GET name/ranges
  Header: blah
  Header: blah

  HTTP/1.0 206 Partial
  Range: start-end/length
  Header: blah

  GET name
  Range: bytes ranges
  Header: blah

  HTTP/1.0 200 OK
  Content-length: piece length
  New-header: obj length
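
The four messages suggest two rewrites: a request whose URL carries the range becomes an ordinary request with a `Range` header, and a 206 partial reply becomes a 200 reply whose length is the piece length. The sketch below is one plausible reading, not the CoDeeN implementation. The `name/start-end` URL layout and the `X-Object-Length` header are hypothetical stand-ins for the slide's "name/ranges" and "New-header"; `Content-Range: bytes start-end/length` is the standard HTTP header that the slide abbreviates.

```python
import re
from typing import Dict, Tuple


def url_to_range_request(path: str) -> Tuple[str, Dict[str, str]]:
    """Turn a hypothetical 'name/start-end' path into (name, headers with Range)."""
    name, _, rng = path.rpartition("/")
    if not re.fullmatch(r"\d+-\d+", rng):
        raise ValueError(f"no byte range in path: {path!r}")
    return name, {"Range": f"bytes={rng}"}


def partial_to_full(headers: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
    """Rewrite a 206 reply's headers so the piece looks like a complete 200 object."""
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", headers["Content-Range"])
    if m is None:
        raise ValueError("unparseable Content-Range header")
    start, end, total = map(int, m.groups())
    return 200, {
        "Content-Length": str(end - start + 1),  # piece length
        "X-Object-Length": str(total),           # hypothetical header: full object length
    }


name, req = url_to_range_request("/videos/talk.mpg/0-999")
print(name, req)     # /videos/talk.mpg {'Range': 'bytes=0-999'}

status, resp = partial_to_full({"Content-Range": "bytes 0-999/5000"})
print(status, resp)  # 200 {'Content-Length': '1000', 'X-Object-Length': '5000'}
```

Giving each piece its own URL and a 200 status lets it be cached like any other whole object, which fits the slide's claim that the existing HTTP infrastructure can be reused.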


Benefits

  • Transparent to client (no software)

  • Server only needs byte-range support

    • Every real server has it

    • Will generate more log entries

  • Can use/augment HTTP infrastructure

    • Caching, redirection, etc

    • Adding security controls

  • Low incremental overhead

    • Agent is about 300 semicolons

    • CDN mods about 20 semicolons


Dual-Use Technology

  • Other one-to-many problems

    • Node/experiment installs

    • Software updates

  • Push model instead of pull

  • Solution?

    • Build “master” script

    • Push to nodes

    • Nodes pull as needed


CoDeploy

  • Now in beta

  • Small set of tools at source

    • No (new) installation at target

    • Needed tools at CoDeeN-hosting nodes

  • Fun components

    • Peer-review system of CoDeeN nodes

    • Nearest CoDeeN finder

    • Parallel ssh, scp
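
The "parallel ssh, scp" component amounts to running the same action on many nodes at once. A minimal sketch of that pattern follows; it is not CoDeploy's tool. The function names and host names are hypothetical, and `fake_ssh` stands in for a real call such as `subprocess.run(["ssh", node, "uptime"]).returncode`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_on_all(nodes: Iterable[str],
               action: Callable[[str], int],
               max_workers: int = 16) -> Dict[str, int]:
    """Run action(node) concurrently for every node and return {node: exit_code}."""
    node_list = list(nodes)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as node_list.
        return dict(zip(node_list, pool.map(action, node_list)))


def fake_ssh(node: str) -> int:
    """Stand-in for a remote command; pretends one node is unreachable."""
    return 255 if node == "node3.example.org" else 0


results = run_on_all(
    ["node1.example.org", "node2.example.org", "node3.example.org"], fake_ssh
)
failed = [node for node, code in results.items() if code != 0]
print(failed)  # ['node3.example.org']
```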


What To Expect Next

  • Will redeploy auto-rewriting service

    • Akamai-like URL mangling

    • Was in testing before December upgrade

  • Tie rewriter into “hosting” service

    • Make it simpler for provider to use CoDeeN


More Info

http://codeen.cs.princeton.edu/codeploy

KyoungSoo Park

kyoungso@cs.princeton.edu

Vivek Pai

vivek@cs.princeton.edu