CoDeeN, Large Files, & CoDeploy
KyoungSoo Park, Vivek Pai, Larry Peterson
Princeton University
What Is CoDeeN?
  - Content Distribution Networks
  - Web pages load faster if
    - You’re contacting a nearby server
    - That server isn’t overloaded
    - The page is already in memory
    - You use long-lived TCP connections right
CoDeeN By The Numbers
  - In operation ~10 months
  - 150 nodes (~120 live)
  - 6.5 million reqs/day
  - 5 million “good” reqs/day
  - About 300GB/day (estimate)
  - 7K-20K unique IPs per 24 hours
  - Over 600,000 unique IPs served
Our “Strategy”
  - Stay operational
  - Build some credibility
  - Exploit that + activity to branch out
    - Involves doing sales pitches
  - Tap into new consumers
    - In particular, nonprofits, non-commercial
What Most CDNs (want to) Serve
But What About Big Files?
How Big?
  - 200 terabytes of data total
  - Interviews: about 3.5GB each
  - Files: average of 700MB each
Problem: “Nobody” Handles 700MB
  - CDNs designed for avg size 10KB
    - 1MB = 100 files
    - 700MB = 70,000 files
  - Commercial disks ~ 100GB
  - Our storage ~ 3GB
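The arithmetic on this slide can be checked with a short sketch. It assumes decimal units (1MB = 1,000KB), which is what the slide's "1MB = 100 files" implies; the function names are illustrative only.

```python
# Sketch of the slide's arithmetic, assuming decimal units (1 MB = 1000 KB).
AVG_OBJECT_KB = 10  # average object size the slide says CDNs are designed for


def equivalent_objects(size_mb: int) -> int:
    """How many average-sized (10KB) objects occupy the same space as size_mb."""
    return size_mb * 1000 // AVG_OBJECT_KB


def whole_files_that_fit(storage_gb: int, file_mb: int) -> int:
    """How many complete files of file_mb fit in storage_gb of cache space."""
    return storage_gb * 1000 // file_mb


if __name__ == "__main__":
    print(equivalent_objects(1))           # objects displaced by a 1MB file
    print(equivalent_objects(700))         # objects displaced by a 700MB file
    print(whole_files_that_fit(3, 700))    # 700MB files in ~3GB of storage
    print(whole_files_that_fit(100, 700))  # 700MB files on a ~100GB disk
```

Under these assumptions, a node with about 3GB of storage holds only a handful of whole 700MB files, which is the motivation for not caching large files whole.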
New Problems
  - Why not replicate less?
    - You’re farther away
  - Why not merge requests?
    - Client readahead
    - Slow client
Our Approach
  (diagram labels: Agent, CDN, Server, Client)
  - file
  - Pieces: file0-1, file1-2, file2-3, file3-4, file4-5
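The idea in the diagram, where an agent fetches a large file as pieces and hands the client a single file, might be sketched as follows. This is a minimal illustration rather than the CoDeeN agent itself: the piece size, the function names, and the `fetch` callback (standing in for a byte-range request sent through the CDN) are all assumptions.

```python
from typing import Callable, List, Tuple


def split_into_pieces(total_length: int, piece_size: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges that cover the whole file."""
    if total_length < 0 or piece_size <= 0:
        raise ValueError("total_length must be >= 0 and piece_size > 0")
    return [
        (start, min(start + piece_size, total_length) - 1)
        for start in range(0, total_length, piece_size)
    ]


def reassemble(fetch: Callable[[int, int], bytes],
               total_length: int, piece_size: int) -> bytes:
    """Fetch every piece in order and concatenate them into the full file.

    `fetch(start, end)` is a hypothetical callback that returns the bytes
    for one inclusive range, e.g. by issuing a range request to a CDN node.
    """
    return b"".join(fetch(start, end)
                    for start, end in split_into_pieces(total_length, piece_size))


if __name__ == "__main__":
    data = bytes(range(10))  # stand-in for a large file
    print(split_into_pieces(len(data), 4))
    print(reassemble(lambda s, e: data[s:e + 1], len(data), 4) == data)
```

A real agent would likely issue the piece requests in parallel and spread them across nodes; the sequential loop here only shows the splitting and reassembly.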
Low-Level HTTP Stuff
  (diagram labels: egress, ingress)

  GET name/ranges
  Header: blah

  HTTP/ Partial
  Range: start-end/length
  Header: blah

  GET name
  Range: bytes ranges
  Header: blah

  HTTP/ OK
  Content-length: piece length
  New-header: obj length
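The rewriting shown on this slide could look roughly like the sketch below. It is a guess at the shape of the transformation, not the CoDeeN code. Assumptions: "egress" is the request leaving the CDN toward the origin server and "ingress" is the response coming back; the piece is encoded in the URL as `name/start-end`; the slide's "New-header" is given the hypothetical name `X-Object-Length`. The `Range: bytes=start-end` request header and `Content-Range: bytes start-end/length` response header are standard HTTP.

```python
import re
from typing import Dict, Tuple


def rewrite_request_at_egress(path: str) -> Tuple[str, Dict[str, str]]:
    """Turn a piece URL like '/big.iso/0-99' into a plain GET plus Range header."""
    name, _, piece = path.rpartition("/")
    match = re.fullmatch(r"(\d+)-(\d+)", piece)
    if not name or match is None:
        raise ValueError(f"not a piece URL: {path!r}")
    return name, {"Range": f"bytes={match.group(1)}-{match.group(2)}"}


def rewrite_response_at_ingress(status: int,
                                headers: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
    """Turn a 206 partial response into a 200 response that caches as one object."""
    if status != 206:
        return status, dict(headers)
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", headers["Content-Range"])
    if match is None:
        raise ValueError("unparseable Content-Range header")
    start, end, total = map(int, match.groups())
    rewritten = {k: v for k, v in headers.items() if k != "Content-Range"}
    rewritten["Content-Length"] = str(end - start + 1)  # piece length
    rewritten["X-Object-Length"] = str(total)           # hypothetical "New-header"
    return 200, rewritten


if __name__ == "__main__":
    print(rewrite_request_at_egress("/big.iso/0-99"))
    print(rewrite_response_at_ingress(206, {"Content-Range": "bytes 0-99/1000"}))
```

Presenting each piece as an ordinary 200 response under its own URL is what would let unmodified caches store and serve the pieces like any other small object.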
Benefits
  - Transparent to client (no software)
  - Server only needs byte-range support
    - Every real server has it
    - Will generate more log entries
  - Can use/augment HTTP infrastructure
    - Caching, redirection, etc.
    - Adding security controls
  - Low incremental overhead
    - Agent is about 300 semicolons
    - CDN mods about 20 semicolons
Dual-Use Technology
  - Other one-to-many problems
    - Node/experiment installs
    - Software updates
  - Push model instead of pull
  - Solution?
    - Build “master” script
    - Push to nodes
    - Nodes pull as needed
CoDeploy
  - Now in beta
  - Small set of tools at source
  - No (new) installation at target
  - Needed tools at CoDeeN-hosting nodes
  - Fun components
    - Peer-review system of CoDeeN nodes
    - Nearest CoDeeN finder
    - Parallel ssh, scp
What To Expect Next
  - Will redeploy auto-rewriting service
    - Akamai-like URL mangling
    - Was in testing before December upgrade
  - Tie rewriter into “hosting” service
    - Make it simpler for provider to use CoDeeN
More Info
  - KyoungSoo Park
  - Vivek Pai