Cluster Computing Overview
CS444I Internet Services, Winter 00
© 1999-2000 Armando Fox, fox@cs.stanford.edu
Today’s Outline
- Clustering: the Holy Grail
- The Case For NOW
- Clustering and Internet Services
- Meeting the Cluster Challenges
Clustering: Holy Grail
- Goal: take a cluster of commodity workstations and make them look like a supercomputer.
- Problems
  - Application structure
  - Partial failure management
  - Interconnect technology
  - System administration
Cluster Prehistory: Tandem NonStop
- Early (1974) foray into transparent fault tolerance through redundancy
  - Mirror everything (CPU, storage, power supplies…); can tolerate any single fault (later: processor duplexing)
  - “Hot standby” process-pair approach
  - What’s the difference between high availability and fault tolerance?
- Noteworthy
  - “Shared nothing”: why?
  - Performance and efficiency costs?
  - Later evolved into Tandem Himalaya, which used clustering for both higher performance and higher availability
Pre-NOW Clustering in the 90’s
- IBM Parallel Sysplex and DEC OpenVMS
  - Targeted at conservative (read: mainframe) customers
  - Shared disks allowed under both (why?)
  - All devices have cluster-wide names (shared everything?)
  - 1500 installations of Sysplex, 25,000 of OpenVMS Cluster
- Programming the clusters
  - All System/390 and/or VAX VMS subsystems were rewritten to be cluster-aware
  - OpenVMS: cluster support exists even in the single-node OS!
  - An advantage of locking into proprietary interfaces
- What about fault tolerance?
The Case For NOW: MPP’s a Near Miss
- Uniprocessor performance improves ~50% per year (about 4% per month); see the sketch below
  - 1-year lag: workstation = 1.50× MPP node performance
  - 2-year lag: workstation = 2.25× MPP node performance
- No economy of scale in the 100s => +$
- Software incompatibility (OS & apps) => +$$$$
- More efficient utilization of compute resources (statistical multiplexing)
- “Scale makes availability affordable” (Pfister)
- Which of these do commodity clusters actually solve?
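To check the lag arithmetic above, a minimal sketch (the 50%-per-year improvement rate is the slide’s own assumption; compounding it over the lag yields the 1.50× and 2.25× figures):

```python
# Relative performance of a commodity workstation vs. an MPP node whose
# hardware lags behind it by `lag_years`, assuming uniprocessor performance
# improves ~50% per year (the slide's assumption).

def workstation_vs_mpp(lag_years, annual_improvement=0.50):
    return (1 + annual_improvement) ** lag_years

for lag in (1, 2):
    print(f"{lag}-year lag: workstation = {workstation_vs_mpp(lag):.2f}x MPP node")
# 1-year lag: workstation = 1.50x MPP node
# 2-year lag: workstation = 2.25x MPP node
```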
Philosophy: “Systems of Systems”
- Higher-order systems research: aggressively use off-the-shelf hardware and OS software
- Advantages:
  - easier to track technological advances
  - less development time
  - easier to transfer technology (reduce lag)
- New challenges (“the case against NOW”):
  - maintaining performance goals
  - system is changing underneath you
  - underlying system has other people's bugs
  - underlying system is poorly documented
Clusters: “Enhanced Standard Litany”
- Hardware redundancy
- Aggregate capacity
- Incremental scalability
- Absolute scalability
- Price/performance sweet spot
- Software engineering
- Partial failure management
- Incremental scalability
- System administration
- Heterogeneity
Clustering and Internet Services
- Aggregate capacity
  - TB of disk storage, THz of compute power (if we can harness it in parallel!)
- Redundancy
  - Partial failure behavior: only a small fractional degradation from the loss of one node
  - Availability: industry average across “large” sites during the 1998 holiday season was 97.2% availability (source: CyberAtlas)
  - Compare: mission-critical systems have “four nines” (99.99%); see the downtime sketch below
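To make those availability figures concrete, a small sketch converting availability into downtime per year (the 97.2% and 99.99% values come from the slide; everything else is illustrative):

```python
# Convert an availability fraction into expected downtime per year.
HOURS_PER_YEAR = 24 * 365

def downtime_hours_per_year(availability):
    return (1 - availability) * HOURS_PER_YEAR

for name, a in [("industry average, holiday season 1998", 0.972),
                ("four nines", 0.9999)]:
    print(f"{name}: {downtime_hours_per_year(a):.1f} hours of downtime/year")
# industry average, holiday season 1998: 245.3 hours of downtime/year
# four nines: 0.9 hours of downtime/year
```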
Spike Absorption
- Internet traffic is self-similar
  - Bursty at all granularities less than about 24 hours
  - What’s bad about burstiness?
- Spike absorption
  - Diurnal variation: peak vs. average demand is typically a factor of 3 or more
  - Starr Report: CNN peaked at 20M hits/hour, vs. a usual peak of 12M hits/hour (about +67%); see the sketch below
- Really the holy grail: capacity on demand
  - Is this realistic?
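A rough sketch of what those figures imply for provisioning headroom (the factor of 3 and the 20M vs. 12M hits/hour numbers are from the slide; sizing against the worst observed spike is an assumption made here purely for illustration):

```python
# How much headroom do the slide's numbers imply over average demand?
average_demand = 1.0
usual_peak = 3.0 * average_demand      # diurnal peak ~3x average (slide's figure)
spike_peak = usual_peak * (20 / 12)    # Starr Report spike: 20M vs. usual 12M hits/hour

print(f"usual peak: {usual_peak / average_demand:.1f}x average")
print(f"spike peak: {spike_peak / average_demand:.1f}x average")
# A cluster provisioned only for average load would need roughly 5x that
# capacity, on hand or on demand, to absorb a spike like this one.
```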
Diurnal Cycle (UCB dialups, Jan. 1997)
- ~750 modems at UC Berkeley
- Instrumented early 1997
Clustering and Internet Workloads
- Internet vs. “traditional” workloads
  - e.g. database workloads (TPC benchmarks)
  - e.g. traditional scientific codes (matrix multiply, simulated annealing and related simulations, etc.)
- Some characteristic differences
  - Read mostly
  - Quality of service (best-effort vs. guarantees)
  - Task granularity
  - “Embarrassingly parallel”
  - …but are they balanced? (we’ll return to this later)
Meeting the Cluster Challenges
- Software & programming models
- Partial failure and application semantics
- System administration
Software Challenges
- Message-passing & Active Messages (minimal sketch below)
- Shared memory: Network RAM
  - CC-NUMA, software DSM: “Anyone who thinks cache misses can take milliseconds is an idiot.” (paraphrasing Larry McVoy at OSDI ’96)
- MP vs. SM is a long-standing religious debate
- Arbitrary object migration (“network transparency”)
  - What are the problems with this?
  - Hints: RPC, checkpointing, residual state
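For readers who haven’t seen Active Messages, here is a minimal in-process sketch of the dispatch idea: each message names a handler that the receiver runs immediately on arrival, rather than parking in a blocking receive. The Node class and handler names are invented for illustration; the real Berkeley AM layers ran over Myrinet with very different mechanics.

```python
# Active-message-style dispatch, simulated inside one process.
class Node:
    def __init__(self, name):
        self.name = name
        self.handlers = {}   # handler name -> function
        self.store = {}      # a little per-node key/value store

    def register(self, name, fn):
        self.handlers[name] = fn

    def deliver(self, handler_name, *args):
        # "Network" delivery: dispatch straight to the named handler.
        self.handlers[handler_name](self, *args)

def put_handler(node, key, value):
    node.store[key] = value

def get_handler(node, key, reply_to):
    # The reply is itself an active message sent back to the requester.
    reply_to.deliver("put", key, node.store.get(key))

a, b = Node("a"), Node("b")
for n in (a, b):
    n.register("put", put_handler)
    n.register("get", get_handler)

a.deliver("put", "x", 42)    # store a value on node a
a.deliver("get", "x", b)     # node a handles the get and replies to node b
print(b.store)               # {'x': 42}
```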
Partial Failure Management
- What does partial failure mean for…
  - a transactional database?
  - a read-only database striped across cluster nodes?
  - a compute-intensive shared service?
- What are appropriate “partial failure abstractions”? (see the sketch below)
  - Incomplete/imprecise results?
  - Longer latency?
- What current programming idioms make partial failure hard?
  - Hint: remember the original RPC papers?
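One way to make “incomplete results” concrete is a scatter/gather query that returns whatever answers arrive before a deadline, trading completeness for bounded latency. A minimal sketch; the node names and the per-node query function are stand-ins, not anything from the lecture:

```python
# Scatter a query to every node, gather what comes back before the deadline,
# and report explicitly which nodes are missing from the answer.
from concurrent.futures import ThreadPoolExecutor, wait

NODES = ["node0", "node1", "node2", "node3"]

def query_node(node, term):
    # Stand-in for a real per-node lookup; a failed node would hang or raise here.
    return f"{node}: results for {term!r}"

def partial_query(term, timeout_s=0.5):
    pool = ThreadPoolExecutor(max_workers=len(NODES))
    futures = {pool.submit(query_node, n, term): n for n in NODES}
    done, not_done = wait(futures, timeout=timeout_s)
    pool.shutdown(wait=False)
    answered = [f.result() for f in done if f.exception() is None]
    missing = [futures[f] for f in not_done]
    # The caller sees an explicitly incomplete answer instead of an error.
    return answered, missing

results, missing = partial_query("clusters")
print(len(results), "of", len(NODES), "nodes answered; missing:", missing)
```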
Software Challenges, Again?
- Real issue: we have to think differently about programming…
  - …to harness clusters?
  - …to get decent failure semantics?
  - …to really exploit software modularity?
- Traditional uniprocessor programming idioms/models don’t seem to scale up to clusters
- Question: is there a “natural to use” cluster model that scales down to uniprocessors?
  - If so, is it general or application-specific?
  - What would be the obstacles to adopting such a model?
System Administration on a Cluster
(Thanks to Eric Anderson (1998) for some of this material.)
- Total cost of ownership (TCO) is very high for clusters
  - Median sysadmin cost per machine per year (1996): ~$700
  - Cost of a headless workstation today: ~$1500
- Previous solutions
  - Pay someone to watch
  - Ignore it, or wait for someone to complain
  - “Shell Scripts From Hell” (not general; lots of repeated work)
- Need an extensible and scalable way to automate the gathering, analysis, and presentation of data
System Administration, cont’d.
Extensible, Scalable Monitoring for Clusters of Computers (Anderson & Patterson, UC Berkeley)
- Relational tables allow the properties & queries of interest to evolve as the cluster evolves (see the sketch below)
- Extensive visualization support allows humans to make sense of masses of data
- Multiple levels of caching decouple data collection from aggregation
- Data updates can be “pulled” on demand or triggered by push
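A minimal sketch of the relational-table idea, using SQLite purely as a stand-in; the schema, node names, and metrics are invented here and are not the Berkeley system’s actual design:

```python
# Store per-node measurements as rows in a relational table, so new
# properties and new ad hoc queries can be added without touching the
# collection code. Schema and sample data are made up for illustration.
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE samples (
                 node TEXT, property TEXT, value REAL, ts REAL)""")

def record(node, prop, value):
    db.execute("INSERT INTO samples VALUES (?, ?, ?, ?)",
               (node, prop, value, time.time()))

# A per-node collector would push (or a poller would pull) rows like these:
record("now23", "cpu_load", 0.91)
record("now23", "disk_free_mb", 120.0)
record("now47", "cpu_load", 0.12)

# Queries of interest can evolve independently of the collectors,
# e.g. "which nodes look overloaded right now?"
rows = db.execute("""SELECT node, value FROM samples
                     WHERE property = 'cpu_load' AND value > 0.8""").fetchall()
print(rows)   # [('now23', 0.91)]
```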
Visualizing Data: Example
- Display aggregates of various interesting machine properties on the NOWs
- Note the use of aggregation and color
Case Study: The Berkeley NOW
- History and pictures of an early research cluster
  - NOW-0: four HP-735s
  - NOW-1: 32 headless Sparc-10s and Sparc-20s
  - NOW-2: 100 UltraSparc 1s, Myrinet interconnect
  - inktomi.berkeley.edu: four Sparc-10s
    - www.hotbot.com: 160 Ultras, 200 CPUs total
  - NOW-3: eight 4-way SMPs
- Myrinet interconnection
  - In addition to commodity switched Ethernet
  - Originally Sparc SBus, now available on PCI bus
The Adventures of NOW: Applications
- AlphaSort: 8.41 GB in one minute, on 95 UltraSparcs
  - runner-up: Ordinal Systems nSort on an SGI Origin (5 GB)
  - pre-1997 record: 1.6 GB on an SGI Challenge
- 40-bit DES key crack in 3.5 hours
  - “NOW+”: headless plus some headed machines
- inktomi.berkeley.edu (now inktomi.com)
  - now the fastest search engine, with the largest aggregate capacity
- TranSend proxy & Top Gun Wingman Pilot browser
  - ~15,000 users, 3-10 machines
The Adventures of NOW: Tools
- GLUnix (coming up later today)
- xFS, a serverless network filesystem
  - Why not just a big RAID on a single server?
- Support for the Myrinet fast interconnect
  - Active Messages (AM-1 and AM-2) over Myrinet
  - Fast Sockets: one-copy TCP fast path over AM-1 on Myrinet
- Moral: cluster tools are hard?
Cluster Summary
- Clusters have potential advantages…but serious challenges to achieving them in practice
  - Kind of like Network Computers?
- Everyone and their brother is now selling a cluster
  - Who’s selling a system, and who’s selling a promise?
  - Can clustering be sold as a “secret sauce”?
- Next: non-clustering, and approaches to clustering