Building Internet Services With TACC
Armando Fox, UC Berkeley

TACC Retrospective: Contributions, Non-Contributions, and What We Really Learned
Armando Fox
University of California, Berkeley

Vision: “The Content You Want”
What do above apps have in common?
- Adapt (collect, filter, transform) existing content…
  - according to client constraints
  - respecting network limitations
  - according to per-user preferences
- But: Lack of unified framework for designing apps that exploit this observation

Contributions
- TACC, a model for structuring services
  - Transformation, Aggregation, Caching, Customization of Internet content
- Scalable TACC server
  - Based on clusters of commodity PC’s
  - Easy to author “industrial strength” services
  - Scalable Network Service (SNS) platform maps app semantics onto cluster-based availability mechanisms
- Experience with real users
  - ~15,000 today at UCB

What’s TACC?
- Transformation (“local”, “one-to-one”)
  - TranSend, Anonymizer
- Aggregation (“nonlocal”, “many-to-one”)
  - Search engines, crawlers, newswatchers
- Caching
  - Both original and locally-generated content
- Customization
  - Per user: for content generation
  - Per device: data delivery, content “packaging”

TACC Example: TranSend
- Transparent HTTP proxy
- On-the-fly, lossy compression of specific MIME types (GIF, JPG...)
- Cache both original & transformed
- User specifies aggressiveness and “refinement” UI
  - Parameters to HTML & image transformers
[Diagram: transformation workers (T), caches ($), customization profile store (C)]

Top Gun Wingman
- PalmPilot web browser
- Intermediate-form page layout
- Image scaling & transcoding
  - Controlled by layout engine
- Device-specific ADU marshalling
  - Including client versioning
  - Originals and device-specific pages cached
[Diagram: HTML input, transformation (T) and aggregation (A) workers, caches ($), customization profile store (C), ADU output]

Application Partitioning
- Client competence
  - Styled text, images, widgets are fine
  - Bitmaps unnecessary
- Client responsiveness
  - Scrolling, etc. shouldn’t require roundtrip to server
- Client independence
  - Very late conversion to client-specific format

TACC Conceptual Data Flow
[Diagram: a user request enters the front end (FE), which draws on the customization profile store (C), the cache ($), and aggregation (A) and transformation (T) workers (W), with a link to the Internet]
- Front end accepts RPC-like user requests
- User’s customization profile retrieved
- Original data fetched from cache or Internet
- Aggregation/transformation workers operate on data according to customization profile
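
The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TACC server's actual interface: `FrontEnd`, `Worker`, `truncate`, `fake_fetch`, and the `max_bytes` profile key are all hypothetical names, and a plain dictionary stands in for the cache.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical worker signature: content plus the user's profile in, content out.
Worker = Callable[[bytes, Dict[str, str]], bytes]


@dataclass
class FrontEnd:
    fetch: Callable[[str], bytes]            # stand-in for fetching from the Internet
    profiles: Dict[str, Dict[str, str]]      # user id -> customization profile
    pipeline: List[Worker]                   # transformation/aggregation workers
    cache: Dict[Tuple, bytes] = field(default_factory=dict)

    def handle(self, user: str, url: str) -> bytes:
        # Step 2: retrieve the user's customization profile.
        profile = self.profiles.get(user, {})
        # Transformed output is cached under the URL plus the profile settings.
        key = (url, tuple(sorted(profile.items())))
        if key in self.cache:
            return self.cache[key]
        # Step 3: original data comes from the cache or, on a miss, the Internet.
        if (url,) not in self.cache:
            self.cache[(url,)] = self.fetch(url)
        data = self.cache[(url,)]
        # Step 4: workers operate on the data according to the profile.
        for worker in self.pipeline:
            data = worker(data, profile)
        self.cache[key] = data
        return data


def truncate(data: bytes, profile: Dict[str, str]) -> bytes:
    """Toy 'lossy' transformer: keep at most max_bytes bytes."""
    limit = int(profile.get("max_bytes", len(data)))
    return data[:limit]


calls: List[str] = []


def fake_fetch(url: str) -> bytes:
    calls.append(url)
    return b"0123456789"


fe = FrontEnd(fetch=fake_fetch,
              profiles={"alice": {"max_bytes": "4"}},
              pipeline=[truncate])
small = fe.handle("alice", "http://example.com/a.gif")
full = fe.handle("bob", "http://example.com/a.gif")
```

Both the original and each transformed variant end up in the cache, so the second request reuses the fetched original even though the two users' profiles differ.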

TACC Model Summary
- Mostly stateless, composable workers
- Unifies previously ad hoc applications under one framework
- Encourages re-use through modularization
  - Composition enables both new services and new clients
- TACC breakdown provides unified way to think about app structure

Services Should Be Easy To Write
- Rapid prototyping
  - Insulate workers from “mundane” details
- Easy to incorporate existing/legacy code
  - Few assumptions about code structure
  - Must support variety of languages
  - May be fragile
- Composition to leverage existing code

Building a TACC Server
- Challenge: Scalable Network Service (SNS) requirements
  - Scalability to 100K’s of users with high availability
  - Cost effective to deploy & administer
- But, services should remain easy to write
  - Server provides some bug robustness
  - Server provides availability
  - Server handles load balancing and scaling
  - Preserve modularity (& componentwise upgradability) when deploying

Layered Model of Internet Services
- TACC Layer
  - Programming model based on composable building blocks
- SNS Layer: “large virtual server”
  - Implements SNS requirements
  - Cluster computing for hardware F/T and incremental scaling
  - Exploit TACC model semantics for software F/T
- SNS layer is reusable and isolated from TACC
  - Application “content” orthogonal to SNS mechanisms
  - Key to making apps easy to write
[Diagram: layer stack labeled httpd, etc. / TACC / Scalable Network Svc]

Why Use a Cluster?
- Incremental scalability, low cost components
- High availability through hardware redundancy
Goals:
- Demonstrate that clusters and TACC fit well together
- Separate SNS from TACC

Cluster-Based TACC Server
- Component replication for scaling and availability
- High-bandwidth, low-latency interconnect
- Incremental scaling: commodity PC’s
[Diagram: cluster components joined by the interconnect: front ends (FE), caches ($), user profile database (C), workers (W) for transformation (T) and aggregation (A), load balancing & fault tolerance (LB/FT), and the administration interface (GUI)]

“Starfish” Availability: LB Death
- FE detects via broken pipe/timeout, restarts LB
- New LB announces itself (multicast), contacted by workers, gradually rebuilds load tables
- If partition heals, extra LB’s commit suicide
- FE’s operate using cached LB info during failure
[Diagram: cluster with front end (FE), caches ($), user profile database (C), workers (W), and LB/FT on the interconnect]
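
A minimal sketch of the recovery behavior described above, assuming a much simplified model: workers report queue lengths by beacon, the load table is pure soft state with a time-to-live, and the front end falls back on its last cached copy while the load balancer is dead or still rebuilding. `LoadTable`, `FrontEndStub`, and the TTL value are hypothetical; time is passed in explicitly so the example is deterministic.

```python
from typing import Dict, Optional, Tuple


class LoadTable:
    """Soft-state table: entries survive only while beacons keep refreshing them."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        # worker name -> (queue length, time of last beacon)
        self._entries: Dict[str, Tuple[int, float]] = {}

    def beacon(self, worker: str, queue_len: int, now: float) -> None:
        self._entries[worker] = (queue_len, now)

    def snapshot(self, now: float) -> Dict[str, int]:
        # Drop workers whose last beacon is older than the TTL.
        self._entries = {w: (q, t) for w, (q, t) in self._entries.items()
                         if now - t <= self.ttl}
        return {w: q for w, (q, _) in self._entries.items()}


class FrontEndStub:
    def __init__(self) -> None:
        self.cached: Dict[str, int] = {}

    def pick_worker(self, lb: Optional[LoadTable], now: float) -> Optional[str]:
        # lb=None models a dead load balancer; an empty table models a new one
        # that has not yet heard from any worker. Either way, keep the cache.
        if lb is not None:
            fresh = lb.snapshot(now)
            if fresh:
                self.cached = fresh
        if not self.cached:
            return None
        return min(self.cached, key=self.cached.get)  # least-loaded worker


lb = LoadTable(ttl=5.0)
lb.beacon("w1", 3, now=0.0)
lb.beacon("w2", 1, now=0.0)
fe = FrontEndStub()
before_failure = fe.pick_worker(lb, now=1.0)        # "w2"
during_failure = fe.pick_worker(None, now=2.0)      # "w2", from cached info
new_lb = LoadTable(ttl=5.0)                         # restarted LB, empty table
while_rebuilding = fe.pick_worker(new_lb, now=3.0)  # "w2", still cached
new_lb.beacon("w1", 0, now=3.5)                     # a worker contacts the new LB
after_rebuild = fe.pick_worker(new_lb, now=4.0)     # "w1"
```

Because nothing in the table needs to be recovered explicitly, a restarted load balancer simply starts empty and fills in as beacons arrive.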

Fault Recovery Latency
[Plot; axis label: task queue length]

Behavior in the Large
- TranSend: 160 image transformations/sec = 10 Ultra-1 servers
  - Peak seen during UCB traces on 700-modem bank: 15/sec
  - Amortized hardware cost <$0.35/user/month (one $5K PC serving ~15,000 subscribers)
- Wingman: factor of 6-8 worse
- Administration: one undergraduate part-time
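
The capacity figures above can be checked with simple arithmetic. The sketch below uses only the numbers on the slide; it assumes the $5K PC sustains roughly the per-server rate implied by the 10-server figure, and it computes hardware cost per subscriber without applying any amortization period, since the slide does not state one.

```python
import math

total_rate = 160      # image transformations/sec (from the slide)
servers = 10          # servers delivering that rate
peak_rate = 15        # peak transformations/sec seen on the 700-modem bank
pc_cost = 5000        # dollars for one PC
subscribers = 15000   # approximate subscriber count

per_server_rate = total_rate / servers                      # 16.0 per second
servers_for_peak = math.ceil(peak_rate / per_server_rate)   # 1
cost_per_subscriber = pc_cost / subscribers                 # about 0.33 dollars

print(f"{per_server_rate:.1f}/sec per server, "
      f"{servers_for_peak} server(s) for peak, "
      f"${cost_per_subscriber:.2f} per subscriber")
```

Under these assumptions the observed peak fits on a single machine, and the hardware cost per subscriber lands just under the $0.35 quoted.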

Building a Big System
- Restartable, atomic workers
  - Read-only data from other origin server(s)
- Orthogonal separation of scalability/availability from application “content”
  - Multiple lines of defense
  - App modules agree to obey semantics compatible with these mechanisms
  - Common-case failure behavior compatible with users’ Internet experience
  - Enables reuse of whole workers, however diverse

Availability & Scalability Summary
- Pervasive strategy: timeout, retry, restart
  - Transient failures usually invisible to user
  - Process peers watch each other
  - Mostly stateless workers, xact support possible
- Simplicity from exploiting soft state
  - Piggyback status info on multicast beacons
  - Use of stale LB info fine in practice
- “Starfish” availability works in practice
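
The “timeout, retry, restart” strategy can be illustrated with a small supervisor loop. This is a sketch under assumptions, not the SNS implementation: a hung worker is modeled by raising `TimeoutError`, a crash by the hypothetical `WorkerFailed`, and “restart” means asking a factory for a fresh worker, which is safe only because workers are assumed stateless.

```python
from typing import Callable, List


class WorkerFailed(Exception):
    """Hypothetical signal that a worker crashed while handling a request."""


def call_with_restart(make_worker: Callable[[], Callable[[str], str]],
                      request: str,
                      max_attempts: int = 3) -> str:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    worker = make_worker()
    for attempt in range(1, max_attempts + 1):
        try:
            return worker(request)
        except (TimeoutError, WorkerFailed):
            if attempt == max_attempts:
                raise                     # out of retries: surface the failure
            worker = make_worker()        # restart: replace the failed worker
    raise AssertionError("unreachable")


spawned: List[int] = []


def make_flaky_worker() -> Callable[[str], str]:
    """Factory whose first worker hangs and whose later workers succeed."""
    instance = len(spawned)
    spawned.append(instance)

    def worker(request: str) -> str:
        if instance == 0:
            raise TimeoutError("worker 0 hung")
        return request.upper()

    return worker


result = call_with_restart(make_flaky_worker, "hello")
```

The caller sees only the successful result; the failed first attempt is absorbed by the retry, which is the sense in which transient failures stay invisible to the user.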

Service Authoring
- Keyword hiliting: < 1 day
- Wingman: 2-3 weeks
- Various apps from graduate seminar projects
  - Safe worker upload
  - Annotate the Web
  - “Channel aggregators”

New Services By Composition
- Compose existing services to create a new one
  - ~2.5 hours to implement
  - Composes with TranSend or Wingman
[Diagram: TranSend, Metasearch, Internet]
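
Composition of this kind can be sketched as ordinary function composition: an aggregator merges results from several sources, and its output feeds a downstream transformer. All names here (`metasearch`, `compose`, the two toy engines, the plain-text renderer) are hypothetical stand-ins rather than the actual metasearch service or its interface to TranSend or Wingman.

```python
from typing import Any, Callable, List

SearchEngine = Callable[[str], List[str]]


def metasearch(engines: List[SearchEngine]) -> Callable[[str], List[str]]:
    """Aggregator (many-to-one): query every engine, merge, drop duplicates."""
    def run(query: str) -> List[str]:
        merged: List[str] = []
        for engine in engines:
            for hit in engine(query):
                if hit not in merged:
                    merged.append(hit)
        return merged
    return run


def compose(*stages: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Chain stages so that each one consumes the previous stage's output."""
    def run(value: Any) -> Any:
        for stage in stages:
            value = stage(value)
        return value
    return run


def engine_a(query: str) -> List[str]:
    return [f"{query}-1", f"{query}-2"]


def engine_b(query: str) -> List[str]:
    return [f"{query}-2", f"{query}-3"]


def as_plain_text(hits: List[str]) -> str:
    """Toy transformer standing in for a device-specific rendering step."""
    return "\n".join(hits)


service = compose(metasearch([engine_a, engine_b]), as_plain_text)
page = service("tacc")
```

The new service is only the `compose(...)` line; the aggregator and the transformer are reused unchanged.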

Experience With Real Users
- Transparent enhancements
- Minimal downtime
- Low administration cost
  - Multicast-based administration GUI
- Virtually no dedicated resources at UCB
  - “Overflow pool” of ~100 UltraSPARC servers
- Users don’t mind relying on middleware proxy

Why Now?
- Internet’s critical mass
- Commercial push for many device types (transistor curves)
- Cluster computing economically viable
- A good time for infrastructural services

Related Work
- Transformational proxy services: WBI, Strands
- Application partitioning: Wit, InfoPad, PARC Ubiquitous Computing
- Computing in the infrastructure: Active Networks
- Soft state for simplicity and robustness: Microsoft Tiger, multicast routing protocols

Summary of Contributions
- TACC, a composition-based Internet services programming model
  - captures rich variety of apps
  - one view of customization
- No-hassle deployment on a cluster
  - Automatic and robust partial-failure handling
  - Availability & scaling strategies work in practice
- New apps are easy to write, deploy, debug
  - SNS behaviors are free
  - Compose existing services to enable new clients

Non-Contributions (a/k/a Future Work)
Accidental contributions:
- Legacy code glue
- Cheap test rig for next project (prototyping path discovery; a bare bones “cluster OS”)
Non-contributions:
- Fair resource allocation over cluster
- Built-in security abstractions
- Rich state management abstractions

What We Really Learned
- Design for failure
  - It will fail anyway
  - End-to-end argument applied to availability
- Orthogonality is even better than layering
  - Narrow interface vs. no interface
  - A great way to manage system complexity
  - The price of orthogonality
  - Techniques: Refreshable soft state; watchdogs/timeouts; sandboxing

Future Work
- TACC as test rig for Ninja
- Taxonomy of app structure and platforms
  - What is the “big picture” of different types of Internet services, and where does TACC fit in?
  - Joint work with Dr. Murray Mazer at the Open Group Research Institute
- Apply TACC lessons to building reliable distributed systems
- Formalize programming model