EE616 Technical Project Video Hosting Architecture By Phillip Sutton
Problem Description Need to store and serve massive amounts of video data. Solution must be: –Scalable –Reliable –Relatively fast
Complications Oh yeah…. Have relatively little cash. SO, need minimal startup costs!
Options YouTube…Believe it or Not. Build it yourself. Managed or dedicated hosting. Content Delivery Network (CDN). Amazon Simple Storage Service.
YouTube Free to use. 100 million videos served daily. Hosted on Google’s reliable and scalable infrastructure.
Video Sharing Site Comparison
Website | YouTube | Yahoo Video | Veoh | Vimeo
Unique Visitors per year | 205,593,000 | 48,026,000 | 11,476,000 | 569,000
Max Video Bit Rate (kbps) | ~ ,5001,600
Max Upload File Size (MB) | /wk
Max Length (min) | 10; N/A
Max Screen Size(s) | 320x x x720 4
Host Format (streaming) | FLV
Processing Time | Up to several hours; Few hours; Minutes 5
Footnotes: 1 estimated; 2 increasing to 1 GB; 3 upcoming 700 kbps; 4 claims this capability
Drawbacks Limited file size –Need 4.7 GB. Limited bitrate – Implies relatively low quality. For higher bitrate sites –Still suffer from limited file size. No real options to manage library. No real options to monetize.
Build It Yourself Have almost complete and utter control. No messy CDN contracts to deal with. Scalable, depending on your budget.
Drawbacks Expensive to start. Expensive to grow. Requires space, power, and resources. Requires knowledgeable manpower to maintain and support.
Managed/Dedicated Hosting Let someone else deal with it –setup, maintenance, and support. Mostly reliable –Many claim 99.9% uptime. Affordable to start –500 GB of storage and 2,500 GB bandwidth. –cost about same as small efficiency apartment on Southside.
Drawbacks Can’t scale with you. Overage costs will get you!!! Can’t control hardware. Can’t make favorable networking agreements.
Content Delivery Networks Multiple data centers. Most have direct internet backbone access. Designed for performance. Replicate content.
Drawbacks Traditionally marketed to enterprises –Apple iTunes uses Akamai. Hard to figure costs w/o signing agreement. Prepay for chunks of storage and bandwidth. Exceeding allocation can be costly. Pay for idle storage and bandwidth.
Amazon Simple Storage Service New kid on the block. Same infrastructure as Amazon.com –Scalable, high availability, low latency. Unlimited storage. Unlimited bandwidth. Pay only for what you use. No contracts; zero cost to startup.
Drawbacks New kid on the block. Latency perhaps not as good as CDNs. Bandwidth costs may still be an issue. No server side processing.
Comparing Costs
Build library of GB DVDs. Deliver 100 TB per month.
Cost | Hosted | CDN | Amazon S3
Storage | $241,000 | $23,552 | $3,523
Bandwidth | $141,312 | $29,696 | $15,153
Total Per Month | $382,312 | $53,248 | $18,676
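The totals in the cost comparison can be rechecked with a few lines of arithmetic. The sketch below only re-adds the storage and bandwidth figures from the slide and expresses each option as a multiple of the S3 total; the names `costs`, `monthly_total`, and `cost_ratio` are illustrative and do not come from the presentation.

```python
# Monthly cost figures copied from the comparison slide (US dollars).
costs = {
    "Hosted": {"storage": 241_000, "bandwidth": 141_312},
    "CDN": {"storage": 23_552, "bandwidth": 29_696},
    "Amazon S3": {"storage": 3_523, "bandwidth": 15_153},
}


def monthly_total(option):
    """Storage plus bandwidth for one hosting option."""
    entry = costs[option]
    return entry["storage"] + entry["bandwidth"]


def cost_ratio(option, baseline="Amazon S3"):
    """How many times more expensive an option is than the baseline."""
    return monthly_total(option) / monthly_total(baseline)


if __name__ == "__main__":
    for name in costs:
        print(f"{name:10s} ${monthly_total(name):>9,} per month, "
              f"{cost_ratio(name):.1f}x the S3 total")
```

With the slide's figures, the hosted option comes out at roughly 20 times the S3 total and the CDN at just under 3 times.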
S3 Overview Store objects up to 5 GB in size with metadata. Objects stored in buckets. Unlimited number of objects per bucket. Each bucket is owned by an Amazon Web Services (AWS) account.
S3 Overview Each object is identified by a unique key. Use REST-style HTTP, SOAP, or HTTP GET/PUT interfaces. Supports BitTorrent protocol. Authorize requests with ACLs.
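The object model on these slides (buckets owned by an account, objects addressed by key, a per-object size limit, attached metadata) can be pictured with a small in-memory model. This is only a sketch for reasoning about the data layout and does not call the S3 service. `Bucket`, `StoredObject`, and `MAX_OBJECT_BYTES` are hypothetical names, and reading "5 GB" as 5 x 1024^3 bytes is an assumption.

```python
from dataclasses import dataclass, field

# Assumption: the slide's "5 GB" limit is read as 5 * 1024**3 bytes.
MAX_OBJECT_BYTES = 5 * 1024**3


@dataclass
class StoredObject:
    size_bytes: int
    metadata: dict = field(default_factory=dict)


@dataclass
class Bucket:
    name: str
    owner_account: str  # each bucket belongs to one AWS account
    objects: dict = field(default_factory=dict)  # key -> StoredObject

    def put(self, key, size_bytes, metadata=None):
        """Store (or overwrite) an object under a key unique within the bucket."""
        if size_bytes > MAX_OBJECT_BYTES:
            raise ValueError(f"{key}: object exceeds the 5 GB limit")
        self.objects[key] = StoredObject(size_bytes, dict(metadata or {}))

    def get(self, key):
        """Look up an object by key; raises KeyError if it is absent."""
        return self.objects[key]


if __name__ == "__main__":
    bucket = Bucket(name="video-library", owner_account="example-account")
    bucket.put("dvd/0001.iso", int(4.7 * 1024**3), {"title": "Sample DVD"})
    print(bucket.get("dvd/0001.iso").metadata)
```

Under this reading, a 4.7 GB DVD image fits in a single object.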
S3 Overview Authenticated URLs can be created with time-bounded validity.
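The idea behind a time-bounded authenticated URL can be sketched generically: the issuer signs the resource path together with an expiry timestamp, and the storage side rejects any request whose signature does not match or whose expiry has passed. The code below is a minimal illustration using an HMAC. It is not S3's actual signing format, and `SECRET_KEY`, `sign_url`, `verify_url`, and the `example.invalid` host are all hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical shared secret, for illustration only


def _signature(path, expires):
    """HMAC over the method, path, and expiry timestamp."""
    message = f"GET\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(base_url, path, ttl_seconds, now=None):
    """Return a URL that is valid for ttl_seconds from now."""
    issued = int(now if now is not None else time.time())
    expires = issued + ttl_seconds
    query = urlencode({"Expires": expires, "Signature": _signature(path, expires)})
    return f"{base_url}{path}?{query}"


def verify_url(url, now=None):
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires = int(query["Expires"][0])
        signature = query["Signature"][0]
    except (KeyError, ValueError):
        return False
    current = now if now is not None else time.time()
    if current > expires:
        return False
    expected = _signature(parsed.path, expires)
    return hmac.compare_digest(signature.encode(), expected.encode())


if __name__ == "__main__":
    url = sign_url("https://example.invalid", "/videos/intro.flv", ttl_seconds=60)
    print(url, verify_url(url))
```

Because the expiry is part of the signed message, a client cannot extend the lifetime of a URL by editing the `Expires` parameter.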
Over Simplified Architecture (diagram): Web Client, Web Server / CMS, S3
Over Simplified Architecture Use S3’s online storage service and economy hosting/bandwidth provider. Use a content management system to track all assets stored on S3. Web client communicates with CMS and S3.
Upload Content Web client requests authentication keys from CMS. Once keys are received, client can send files directly to S3. Or send files to CMS without access keys. Then CMS forwards to S3.
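The two upload paths described above can be walked through with an in-memory simulation. This is a sketch under stated assumptions, not the presentation's implementation: `ObjectStore` stands in for S3 and is not its API, the "authentication keys" are modeled as one-time tokens, and `request_upload_keys`, `confirm_upload`, and `upload_via_cms` are hypothetical method names.

```python
import secrets


class ObjectStore:
    """In-memory stand-in for S3 (hypothetical; not the real service API)."""

    def __init__(self):
        self.objects = {}       # key -> bytes
        self.valid_grants = {}  # one-time token -> key it authorizes

    def put(self, key, data, token):
        # A token is consumed on first use and only works for its own key.
        if self.valid_grants.pop(token, None) != key:
            raise PermissionError("upload not authorized for this key")
        self.objects[key] = data


class CMS:
    """Tracks every asset and hands out upload authorizations."""

    def __init__(self, store):
        self.store = store
        self.catalog = {}  # key -> "pending" or "stored"

    def _grant(self, key):
        token = secrets.token_hex(16)
        self.store.valid_grants[token] = key
        return token

    def request_upload_keys(self, key):
        """Path 1: give the client a grant so it can upload directly."""
        self.catalog[key] = "pending"
        return self._grant(key)

    def confirm_upload(self, key):
        """Mark an asset as stored once the direct upload has landed."""
        if key in self.store.objects:
            self.catalog[key] = "stored"

    def upload_via_cms(self, key, data):
        """Path 2: the client sends the file to the CMS, which forwards it."""
        self.store.put(key, data, self._grant(key))
        self.catalog[key] = "stored"


if __name__ == "__main__":
    store = ObjectStore()
    cms = CMS(store)
    token = cms.request_upload_keys("clips/a.flv")
    store.put("clips/a.flv", b"video bytes", token)  # client talks to the store
    cms.confirm_upload("clips/a.flv")
    cms.upload_via_cms("clips/b.flv", b"more bytes")  # CMS forwards the file
    print(cms.catalog)
```

The direct path keeps large files off the web server's bandwidth, while the forwarding path keeps all credentials on the server at the cost of moving each file twice.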
Get Content Web client requests content from CMS. CMS issues authenticated URL with limited time to live. Client then has preset amount of time to retrieve file directly from S3.
Issues In addition to drawbacks mentioned earlier: No server-side processing of scripts. Need to better handle read/write failures. Need to build your own software.
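One common way to handle transient read/write failures is to retry with exponential backoff. The slides do not say which approach the project would take, so the sketch below is only one possibility. The helper name `with_retries` and its parameters are hypothetical, and treating `OSError` (which covers connection and timeout errors in Python) as the retryable failure is an assumption.

```python
import random
import time


def with_retries(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on OSError, wait and retry with exponential backoff.

    The last failure is re-raised once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            if attempt == attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_read():
        # Simulated storage read that fails twice before succeeding.
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("simulated transient failure")
        return b"payload"

    print(with_retries(flaky_read, base_delay=0.01))
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real waiting.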
Next Lots of work left to do. Create more detailed architecture. Work out code details. Implement and test scalability and performance.
Future Integration with content management system. Integrate with Amazon’s EC2 service. Explore BitTorrent protocol for increased throughput.
QUESTIONS