1
UCB Millennium and the Vineyard Cluster Architecture
Phil Buonadonna
University of California, Berkeley
http://www.millennium.berkeley.edu
2
10/9/99 UC Berkeley Millennium
Millennium Project
Hierarchical “Cluster of Clusters”
[Diagram labels: PIII-X 64x4; PII 8x2; PIII 32x2; ½ TB DLIB; PII; PIII; Gigabit Ethernet (GbE); Ninja, Math, Bio, CE, Physics, Astro]
3
Millennium Agenda
Investigate recent PC technologies in Clusters
  – NT/Linux
  – VI Architecture / GbE / Distributed I/O
Harvest the lessons learned from NOW
  – Robust, flexible remote execution
  – Distributed resource management
Investigate clusters that span administrative units
  – Turn-key cluster deployment
  – Sense of ownership
Investigate the “Computational Economy” Approach
  – Resource management with a natural sense of ownership
  – Enough heterogeneous interests to be worthwhile
Form basis for Sci. Computing, Internet Services, etc.
4
Vineyard Cluster Architecture
Distributed resource utilization and management in a “Vineyard” of Clusters.
  - VIA / GM, GbE
  - Multicast
  - NT / Linux (2.2.x)
  - Stride Scheduler
[Diagram labels: Applications / Services; MPI, VEXEC, PBS, I/O, Mgmt / Monitoring; REXEC; TOOLS; Rootstock Distribution]
5
Outline
Millennium Project
Vineyard Cluster SW Architecture
Important Component Technologies
  – Rootstock cluster SW distribution facility
  – REXEC: Robust Linux Remote Execution
  – Economic-based Resource allocation
  – CAN communication over VIA
  – IO Rivers
Directions and Discussion
6
Rootstock
Disseminate easy-to-build PC cluster system software
Variety of cluster designs
  – well-engineered high-performance clusters
  – low-cost casual workgroup clusters
  – server farms
  – scalable internet servers
Root Cluster Server (CS)
  – Provides cluster software stock
Second-level customized distribution within each cluster from its own CS node
7
Rootstock Cluster
Collection of nodes with IP connectivity
  – can be dedicated subnet, w/ or w/o NAT, or any collection
  – run nfsd (within cluster), httpd, ssl
One node designated as Cluster Root
  – serves as the root of administrative operations and mgmt.
  – may be same or different from other nodes
  – may participate in normal cluster operation or not
  => is trusted by other nodes and has storage for dialtone
May have designated front-end nodes or not
May have dedicated cluster-area network (e.g., Myrinet) or not
8
Rootstock Mechanics
[Diagram labels: K; cluster stock (build, os, drvrs, mill SW, os mods); leased builds; CS; CAN; Cluster System Distribution Center; IP network; Cluster]
1. Cluster Stock
  - Rootstock build pages
  - Full Current Linux
  - all fixes and pckgs
  - SSL, SSH
  - Cluster Drivers
  - Cluster System Layers
  - rexec, mpe, pbs
  - Optional SW ($)
  - Cluster Kernel Mods
2. Make the CS “graft”
  - specify IP address
  - pckg removes
  - dhcp, dns, nis, ...
  sanity check and build
  - resolv.conf, /etc/hosts, ...
  constructs cluster build (lease)
  download CS build floppy
3. CS power-on build
  - xfer and localize DT
  - add local admin scripts
  - node build floppy
4. Node power-on build
  - local stock from CS
5. Cluster Update button (future)
  - 2nd dialtone, CF engine, rolling update
9
Computational Economy
Market-based approach to resource allocation
  – Optimizes for user value
[Diagram labels: Apps (Value); Economic F.E.; API (x4); Access Modules; Resource Managers (Time Share, Batch Queue); Resources]
10
REXEC Remote Execution
Secure, decentralized remote execution environment
Features
  – Decouples resource discovery and selection
  – Multiple Allocation Policies (VEXECs)
  – Decentralized control
      Each client rexec is the root for a distributed task.
  – Dynamic discovery and configuration
      Resource announcements on a cluster multicast channel
      All Soft State
  – Simple, well-defined failure and cleanup models
      “They all fall down”
  – Secure
Translates Pricing Mechanism to Resource Allocation
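The "all soft state" idea on this slide can be illustrated with a small sketch. It is not the REXEC implementation: the class name `SoftStateTable`, the `ttl` value, and the announcement fields are assumptions made up for illustration. The point it shows is that node state learned from announcements simply expires unless it is refreshed, so no explicit cleanup message is needed when a node disappears.

```python
from dataclasses import dataclass, field


@dataclass
class SoftStateTable:
    """Node announcements that expire unless they are refreshed (hypothetical helper)."""
    ttl: float  # seconds an announcement stays valid; assumed value, not from the slides
    entries: dict = field(default_factory=dict)  # node name -> (state, expiry time)

    def announce(self, node, state, now):
        # A new announcement overwrites the old one and restarts its timer.
        self.entries[node] = (state, now + self.ttl)

    def live(self, now):
        # Forget every node whose announcement was not refreshed in time.
        self.entries = {n: (s, exp) for n, (s, exp) in self.entries.items() if exp > now}
        return {n: s for n, (s, _) in self.entries.items()}


table = SoftStateTable(ttl=3.0)
table.announce("nodeA", {"price": 3}, now=0.0)
table.announce("nodeB", {"price": 5}, now=0.0)
table.announce("nodeA", {"price": 4}, now=2.0)  # nodeA refreshes, nodeB stays silent
alive_at_4 = table.live(now=4.0)                # nodeB's entry expired at t = 3
```

Timestamps are passed in explicitly so the behaviour is easy to follow; a real daemon would read a clock and receive announcements from the multicast channel.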
11
REXEC / VEXEC
Components
  – rexecd, rexec & vexecd
[Diagram: rexecd on Node A, Node B, Node C, Node D; vexecd (Policy A); vexecd (Policy B); rexec; all attached to the Cluster IP Multicast Channel]
[Diagram example: %rexec -n 2 -r 3 indexer; “minimum $”; “Node A”; run indexer on Nodes A, B at 3 credits/min]
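The diagram suggests that a vexecd applies a policy (here "minimum $") to the node state it has collected and hands a node list back to rexec. A minimal sketch of such a policy follows. The function name `min_cost_policy`, the shape of the announcement table, and the prices are all hypothetical; the slides do not describe the actual vexecd interface.

```python
def min_cost_policy(announcements, n):
    """Return the names of the n nodes with the lowest announced price.

    `announcements` maps node name -> current price in credits/min (assumed format).
    Ties are broken by node name so the result is deterministic.
    """
    if n > len(announcements):
        raise ValueError("fewer nodes announced than requested")
    ranked = sorted(announcements.items(), key=lambda item: (item[1], item[0]))
    return [name for name, _ in ranked[:n]]


# Made-up prices for the four nodes in the diagram.
announcements = {"A": 1.0, "B": 2.0, "C": 6.0, "D": 4.5}
chosen = min_cost_policy(announcements, n=2)  # analogous to asking for 2 nodes
```

Because selection lives in a separate function, a different policy can replace it without touching discovery, which is the decoupling the REXEC slide describes.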
12
Interactive Pricing Mechanism
Most work on “economic mechanisms” focuses on single item or batch case
  – hold auctions (e.g., second-price sealed bid) integrated into Vineyard PBS
  – interactive case needs to be very simple
Bidder i gets $b_i / \sum_k b_k$ of CPU at rate $b_i$
  – enforced by stride scheduler
Running cluster mirror usage experiment
  – two identical clusters for one user community with $ accounts
  – one free and uncontrolled
  – one for bid and controlled
  – which is more desirable to use
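The proportional-share rule and its enforcement by a stride scheduler can be sketched as follows. This is a textbook-style stride scheduler written for illustration, not the Millennium scheduler; the function names, the bidder names, and the integer bids are assumptions. Each bidder's stride is the inverse of its bid, and the bidder with the smallest pass value runs next.

```python
import heapq
from fractions import Fraction


def shares(bids):
    """CPU fraction per bidder: each bid divided by the sum of all bids."""
    total = sum(bids.values())
    return {name: Fraction(b, total) for name, b in bids.items()}


def stride_schedule(bids, quanta):
    """Count the quanta each bidder receives; bids must be positive integers."""
    # Heap entries are (pass value, name, stride); a larger bid gives a smaller stride.
    heap = [(Fraction(1, b), name, Fraction(1, b)) for name, b in bids.items()]
    heapq.heapify(heap)
    counts = {name: 0 for name in bids}
    for _ in range(quanta):
        pass_value, name, stride = heapq.heappop(heap)  # smallest pass runs next
        counts[name] += 1
        heapq.heappush(heap, (pass_value + stride, name, stride))
    return counts


bids = {"alice": 3, "bob": 2, "carol": 1}  # credits/min, made-up values
fractions_of_cpu = shares(bids)
quanta_received = stride_schedule(bids, quanta=600)
```

Exact fractions are used for the pass values so that rounding cannot skew the allocation over a long run.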
13
Communication / VIA
Multiple Physical Layers
  – Fast Ethernet
  – Gigabit Ethernet (Inter & Intra cluster net)
  – Myrinet w/ Lanai7 (Intra cluster net)
Transports
  – IP, IP Multicast
  – VI Architecture / GM
Explore integrated IPC and distributed I/O
14
AM Architecture
Components
  – Endpoints
  – Virtual Networks
  – Bundles
Operations
  – Request / Reply
      Short, Med, Long
  – Create, Map, Free
  – Poll, Wait
Credit-based flow control
[Diagram labels: Proc A, Proc B, Proc C]
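Credit-based flow control on a request/reply channel can be sketched like this. The class `CreditedSender` and its methods are hypothetical and are not part of any Active Messages API. The sketch only shows the mechanism: a request consumes a credit, a reply returns one, and requests that find no credit wait in a local backlog.

```python
from collections import deque


class CreditedSender:
    """Sender side of a request/reply channel with a fixed credit budget (illustrative)."""

    def __init__(self, credits):
        self.credits = credits   # requests that may still be outstanding
        self.backlog = deque()   # requests waiting for a credit
        self.sent = []           # requests handed to the network, in order

    def request(self, msg):
        if self.credits > 0:
            self.credits -= 1
            self.sent.append(msg)
        else:
            self.backlog.append(msg)

    def on_reply(self):
        # A reply returns one credit; spend it at once if a request is waiting.
        self.credits += 1
        if self.backlog:
            self.request(self.backlog.popleft())


sender = CreditedSender(credits=2)
for msg in ["r1", "r2", "r3"]:
    sender.request(msg)          # r3 has to wait: only two credits exist
sent_before_reply = list(sender.sent)
sender.on_reply()                # the first reply frees a credit for r3
```

The receiver never sees more outstanding requests than the credit budget, so it can size its receive buffers in advance.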
15
AM-VIA Architecture
VI Queue (VIQ)
  – Logical channel for AM message type
  – VI & independent Send/Receive Queues
  – Independent request credit scheme (counter n)
MAP Object
  – Container for 3 VIQs
      Short, Medium, Long
  – Single Registered Memory Region
[Diagram label: MAP Object]
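One way to picture a MAP object is as a container of three queues, each with its own credit counter. The sketch below is an assumption-laden model, not the AM-VIA code: the size cutoffs `SHORT_MAX` and `MEDIUM_MAX`, the initial credit count, and the method names are invented for illustration, and the registered memory region is left out.

```python
from dataclasses import dataclass, field

SHORT_MAX = 32      # bytes; assumed cutoff, not from the slides
MEDIUM_MAX = 8192   # bytes; assumed cutoff, not from the slides


@dataclass
class VIQ:
    kind: str
    credits: int    # the per-queue request counter n


@dataclass
class MapObject:
    """Container for three VIQs, one per AM message type."""
    short: VIQ = field(default_factory=lambda: VIQ("short", 4))
    medium: VIQ = field(default_factory=lambda: VIQ("medium", 4))
    long: VIQ = field(default_factory=lambda: VIQ("long", 4))

    def viq_for(self, nbytes):
        # Pick the logical channel from the payload size.
        if nbytes <= SHORT_MAX:
            return self.short
        if nbytes <= MEDIUM_MAX:
            return self.medium
        return self.long

    def try_send(self, nbytes):
        # Each queue spends its own credits, so one message type cannot starve another.
        viq = self.viq_for(nbytes)
        if viq.credits == 0:
            return False
        viq.credits -= 1
        return True


m = MapObject()
short_results = [m.try_send(16) for _ in range(5)]  # fifth short request finds no credit
medium_ok = m.try_send(1024)                        # the medium queue is unaffected
```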
16
AM-VIA Integration
Bundle: Pair of VI Completion Queues
  – Send/Receive
Endpoints: Collection of MAP objects
  – Virtual network emulated by point-to-point connections
[Diagram labels: Proc A, Proc B, Proc C]