Cluster Computing Overview
CS444I Internet Services, Winter 00
© 1999-2000 Armando Fox, fox@cs.stanford.edu
Today’s Outline
- Clustering: the Holy Grail
- The Case For NOW
- Clustering and Internet Services
- Meeting the Cluster Challenges
Clustering: Holy Grail
- Goal: take a cluster of commodity workstations and make them look like a supercomputer.
- Problems
  - Application structure
  - Partial failure management
  - Interconnect technology
  - System administration
Cluster Prehistory: Tandem NonStop
- Early (1974) foray into transparent fault tolerance through redundancy
  - Mirror everything (CPU, storage, power supplies…); can tolerate any single fault (later: processor duplexing)
  - “Hot standby” process-pair approach
  - What’s the difference between high availability and fault tolerance?
- Noteworthy
  - “Shared nothing”: why?
  - Performance and efficiency costs?
  - Later evolved into Tandem Himalaya, which used clustering for both higher performance and higher availability
Pre-NOW Clustering in the 90’s
- IBM Parallel Sysplex and DEC OpenVMS
  - Targeted at conservative (read: mainframe) customers
  - Shared disks allowed under both (why?)
  - All devices have cluster-wide names (shared everything?)
  - 1500 installations of Sysplex, 25,000 of OpenVMS Cluster
- Programming the clusters
  - All System/390 and/or VAX VMS subsystems were rewritten to be cluster-aware
  - OpenVMS: cluster support exists even in the single-node OS!
  - An advantage of locking into proprietary interfaces
- What about fault tolerance?
The Case For NOW: MPP’s a Near Miss
- Uniprocessor performance improves ~50% per year (about 4% per month); see the sketch below
  - 1-year lag: workstation = 1.50× MPP node performance
  - 2-year lag: workstation = 2.25× MPP node performance
- No economy of scale in the 100s => +$
- Software incompatibility (OS & apps) => +$$$$
- More efficient utilization of compute resources (statistical multiplexing)
- “Scale makes availability affordable” (Pfister)
- Which of these do commodity clusters actually solve?
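To check the lag arithmetic above, a minimal sketch (the 50%-per-year improvement rate is the slide’s own assumption; compounding it over the lag yields the 1.50× and 2.25× figures):

```python
# Relative performance of a commodity workstation vs. an MPP node whose
# hardware lags behind it by `lag_years`, assuming uniprocessor performance
# improves ~50% per year (the slide's assumption).

def workstation_vs_mpp(lag_years, annual_improvement=0.50):
    return (1 + annual_improvement) ** lag_years

for lag in (1, 2):
    print(f"{lag}-year lag: workstation = {workstation_vs_mpp(lag):.2f}x MPP node")
# 1-year lag: workstation = 1.50x MPP node
# 2-year lag: workstation = 2.25x MPP node
```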
Philosophy: “Systems of Systems”
- Higher-order systems research: aggressively use off-the-shelf hardware and OS software
- Advantages:
  - easier to track technological advances
  - less development time
  - easier to transfer technology (reduce lag)
- New challenges (“the case against NOW”):
  - maintaining performance goals
  - system is changing underneath you
  - underlying system has other people's bugs
  - underlying system is poorly documented
Clusters: “Enhanced Standard Litany”
- Hardware redundancy
- Aggregate capacity
- Incremental scalability
- Absolute scalability
- Price/performance sweet spot
- Software engineering
- Partial failure management
- Incremental scalability
- System administration
- Heterogeneity
Clustering and Internet Services
- Aggregate capacity
  - TB of disk storage, THz of compute power (if we can harness it in parallel!)
- Redundancy
  - Partial failure behavior: only a small fractional degradation from the loss of one node
  - Availability: industry average across “large” sites during the 1998 holiday season was 97.2% availability (source: CyberAtlas)
  - Compare: mission-critical systems have “four nines” (99.99%); see the downtime sketch below
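To make those availability figures concrete, a small sketch converting availability into downtime per year (the 97.2% and 99.99% values come from the slide; everything else is illustrative):

```python
# Convert an availability fraction into expected downtime per year.
HOURS_PER_YEAR = 24 * 365

def downtime_hours_per_year(availability):
    return (1 - availability) * HOURS_PER_YEAR

for name, a in [("industry average, holiday season 1998", 0.972),
                ("four nines", 0.9999)]:
    print(f"{name}: {downtime_hours_per_year(a):.1f} hours of downtime/year")
# industry average, holiday season 1998: 245.3 hours of downtime/year
# four nines: 0.9 hours of downtime/year
```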
Spike Absorption
- Internet traffic is self-similar
  - Bursty at all granularities less than about 24 hours
  - What’s bad about burstiness?
- Spike absorption
  - Diurnal variation: peak vs. average demand is typically a factor of 3 or more
  - Starr Report: CNN peaked at 20M hits/hour, vs. a usual peak of 12M hits/hour (about +67%); see the sketch below
- Really the holy grail: capacity on demand
  - Is this realistic?
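A rough sketch of what those figures imply for provisioning headroom (the factor of 3 and the 20M vs. 12M hits/hour numbers are from the slide; sizing against the worst observed spike is an assumption made here purely for illustration):

```python
# How much headroom do the slide's numbers imply over average demand?
average_demand = 1.0
usual_peak = 3.0 * average_demand      # diurnal peak ~3x average (slide's figure)
spike_peak = usual_peak * (20 / 12)    # Starr Report spike: 20M vs. usual 12M hits/hour

print(f"usual peak: {usual_peak / average_demand:.1f}x average")
print(f"spike peak: {spike_peak / average_demand:.1f}x average")
# A cluster provisioned only for average load would need roughly 5x that
# capacity, on hand or on demand, to absorb a spike like this one.
```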
Diurnal Cycle (UCB dialups, Jan. 1997)
- ~750 modems at UC Berkeley
- Instrumented early 1997
Clustering and Internet Workloads
- Internet vs. “traditional” workloads
  - e.g. database workloads (TPC benchmarks)
  - e.g. traditional scientific codes (matrix multiply, simulated annealing and related simulations, etc.)
- Some characteristic differences
  - Read mostly
  - Quality of service (best-effort vs. guarantees)
  - Task granularity
  - “Embarrassingly parallel”
  - …but are they balanced? (we’ll return to this later)
Meeting the Cluster Challenges
- Software & programming models
- Partial failure and application semantics
- System administration
Software Challenges
- Message-passing & Active Messages (minimal sketch below)
- Shared memory: Network RAM
  - CC-NUMA, software DSM: “Anyone who thinks cache misses can take milliseconds is an idiot.” (paraphrasing Larry McVoy at OSDI ’96)
- MP vs. SM is a long-standing religious debate
- Arbitrary object migration (“network transparency”)
  - What are the problems with this?
  - Hints: RPC, checkpointing, residual state
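For readers who haven’t seen Active Messages, here is a minimal in-process sketch of the dispatch idea: each message names a handler that the receiver runs immediately on arrival, rather than parking in a blocking receive. The Node class and handler names are invented for illustration; the real Berkeley AM layers ran over Myrinet with very different mechanics.

```python
# Active-message-style dispatch, simulated inside one process.
class Node:
    def __init__(self, name):
        self.name = name
        self.handlers = {}   # handler name -> function
        self.store = {}      # a little per-node key/value store

    def register(self, name, fn):
        self.handlers[name] = fn

    def deliver(self, handler_name, *args):
        # "Network" delivery: dispatch straight to the named handler.
        self.handlers[handler_name](self, *args)

def put_handler(node, key, value):
    node.store[key] = value

def get_handler(node, key, reply_to):
    # The reply is itself an active message sent back to the requester.
    reply_to.deliver("put", key, node.store.get(key))

a, b = Node("a"), Node("b")
for n in (a, b):
    n.register("put", put_handler)
    n.register("get", get_handler)

a.deliver("put", "x", 42)    # store a value on node a
a.deliver("get", "x", b)     # node a handles the get and replies to node b
print(b.store)               # {'x': 42}
```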
Partial Failure Management
- What does partial failure mean for…
  - a transactional database?
  - a read-only database striped across cluster nodes?
  - a compute-intensive shared service?
- What are appropriate “partial failure abstractions”? (see the sketch below)
  - Incomplete/imprecise results?
  - Longer latency?
- What current programming idioms make partial failure hard?
  - Hint: remember the original RPC papers?
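One way to make “incomplete results” concrete is a scatter/gather query that returns whatever answers arrive before a deadline, trading completeness for bounded latency. A minimal sketch; the node names and the per-node query function are stand-ins, not anything from the lecture:

```python
# Scatter a query to every node, gather what comes back before the deadline,
# and report explicitly which nodes are missing from the answer.
from concurrent.futures import ThreadPoolExecutor, wait

NODES = ["node0", "node1", "node2", "node3"]

def query_node(node, term):
    # Stand-in for a real per-node lookup; a failed node would hang or raise here.
    return f"{node}: results for {term!r}"

def partial_query(term, timeout_s=0.5):
    pool = ThreadPoolExecutor(max_workers=len(NODES))
    futures = {pool.submit(query_node, n, term): n for n in NODES}
    done, not_done = wait(futures, timeout=timeout_s)
    pool.shutdown(wait=False)
    answered = [f.result() for f in done if f.exception() is None]
    missing = [futures[f] for f in not_done]
    # The caller sees an explicitly incomplete answer instead of an error.
    return answered, missing

results, missing = partial_query("clusters")
print(len(results), "of", len(NODES), "nodes answered; missing:", missing)
```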
Software Challenges, Again?
- Real issue: we have to think differently about programming…
  - …to harness clusters?
  - …to get decent failure semantics?
  - …to really exploit software modularity?
- Traditional uniprocessor programming idioms/models don’t seem to scale up to clusters
- Question: is there a “natural to use” cluster model that scales down to uniprocessors?
  - If so, is it general or application-specific?
  - What would be the obstacles to adopting such a model?
System Administration on a Cluster
(Thanks to Eric Anderson (1998) for some of this material.)
- Total cost of ownership (TCO) is very high for clusters
  - Median sysadmin cost per machine per year (1996): ~$700
  - Cost of a headless workstation today: ~$1500
- Previous solutions
  - Pay someone to watch
  - Ignore it, or wait for someone to complain
  - “Shell Scripts From Hell” (not general; lots of repeated work)
- Need an extensible and scalable way to automate the gathering, analysis, and presentation of data
System Administration, cont’d.
Extensible, Scalable Monitoring for Clusters of Computers (Anderson & Patterson, UC Berkeley)
- Relational tables allow the properties & queries of interest to evolve as the cluster evolves (see the sketch below)
- Extensive visualization support allows humans to make sense of masses of data
- Multiple levels of caching decouple data collection from aggregation
- Data updates can be “pulled” on demand or triggered by push
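A minimal sketch of the relational-table idea, using SQLite purely as a stand-in; the schema, node names, and metrics are invented here and are not the Berkeley system’s actual design:

```python
# Store per-node measurements as rows in a relational table, so new
# properties and new ad hoc queries can be added without touching the
# collection code. Schema and sample data are made up for illustration.
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE samples (
                 node TEXT, property TEXT, value REAL, ts REAL)""")

def record(node, prop, value):
    db.execute("INSERT INTO samples VALUES (?, ?, ?, ?)",
               (node, prop, value, time.time()))

# A per-node collector would push (or a poller would pull) rows like these:
record("now23", "cpu_load", 0.91)
record("now23", "disk_free_mb", 120.0)
record("now47", "cpu_load", 0.12)

# Queries of interest can evolve independently of the collectors,
# e.g. "which nodes look overloaded right now?"
rows = db.execute("""SELECT node, value FROM samples
                     WHERE property = 'cpu_load' AND value > 0.8""").fetchall()
print(rows)   # [('now23', 0.91)]
```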
Visualizing Data: Example
- Display aggregates of various interesting machine properties on the NOWs
- Note the use of aggregation and color
Case Study: The Berkeley NOW
- History and pictures of an early research cluster
  - NOW-0: four HP-735s
  - NOW-1: 32 headless Sparc-10s and Sparc-20s
  - NOW-2: 100 UltraSparc 1s, Myrinet interconnect
  - inktomi.berkeley.edu: four Sparc-10s
    - www.hotbot.com: 160 Ultras, 200 CPUs total
  - NOW-3: eight 4-way SMPs
- Myrinet interconnection
  - In addition to commodity switched Ethernet
  - Originally Sparc SBus, now available on PCI bus
The Adventures of NOW: Applications
- AlphaSort: 8.41 GB in one minute, on 95 UltraSparcs
  - runner-up: Ordinal Systems nSort on an SGI Origin (5 GB)
  - pre-1997 record: 1.6 GB on an SGI Challenge
- 40-bit DES key crack in 3.5 hours
  - “NOW+”: headless plus some headed machines
- inktomi.berkeley.edu (now inktomi.com)
  - now the fastest search engine, with the largest aggregate capacity
- TranSend proxy & Top Gun Wingman Pilot browser
  - ~15,000 users, 3-10 machines
The Adventures of NOW: Tools
- GLUnix (coming up later today)
- xFS, a serverless network filesystem
  - Why not just a big RAID on a single server?
- Support for the Myrinet fast interconnect
  - Active Messages (AM-1 and AM-2) over Myrinet
  - Fast Sockets: one-copy TCP fast path over AM-1 on Myrinet
- Moral: cluster tools are hard?
Cluster Summary
- Clusters have potential advantages…but serious challenges to achieving them in practice
  - Kind of like Network Computers?
- Everyone and their brother is now selling a cluster
  - Who’s selling a system, and who’s selling a promise?
  - Can clustering be sold as a “secret sauce”?
- Next: non-clustering, and approaches to clustering