Hardware
Hardware
So you want to build a cluster. What do you need to buy? Remember the definition of a Beowulf cluster:
- Commodity machines
- A private cluster network
- Open source software
Two of these are hardware-related.
Racks
You need some place to keep all the hardware organized, and racks are the usual solution. The things you put into racks are measured in "U"s: a U is a measure of vertical height, about 1.75", and the standard rack is 42 U tall. Compute nodes are often one or two U, i.e. they take up one or two slots in the rack.
Racks
It's easy to get a cable mess in the back of the rack. Aside from management headaches, a cable mess can affect air flow.
- Be consistent about routing power and network cables.
- Color-coded cables are an excellent idea; for example, make all private network cables one color and public network cables another.
- Use zip ties to group cables.
- Make sure you have enough space for air flow (typically front to back).
KVM
Keyboard-Video-Mouse. While you usually want to manage the cluster remotely, you'll often wind up attaching a monitor to a box to watch it boot or to troubleshoot. It's good to have a lightweight flat panel display, keyboard, and mouse you can attach as needed.
CPU: 64 vs. 32 Bit
Isn't 64 bits twice as good as 32 bits? Not really. The "64 bits" usually refers to the amount of memory that can be addressed.
- With 32 bits you can "count" up to about 4 billion, which usually means one process can't use more than 4 GB of memory. Because of the way virtual memory is laid out, 2 GB is a more typical maximum address space for a 32-bit process.
- 64-bit addressing raises that limit enormously (the AMD64 architecture allows a 52-bit physical address space, about 4.5 petabytes).
- This can be important if you have an application that breaks the 2 GB address space barrier, which is plausible now that many machines can be configured with more than 4 GB of real memory.
64 vs. 32 Bit
64-bit CPUs have bigger registers (high speed holding areas), but this often isn't that big of a deal. Almost every 64-bit CPU has a compatibility mode that lets it run 32-bit applications. To run in 64-bit mode, your application must be compiled for 64 bits and linked against 64-bit libraries. (The short sketch that follows shows one way to check which mode a binary was built in.)
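Not from the original slides: a minimal C sketch that reports whether it was built as a 32-bit or 64-bit binary. The __LP64__ macro and the -m32/-m64 flags are gcc/clang conventions; other compilers may spell these differently.

    /* wordsize.c: report whether this binary was built 32- or 64-bit.
       Build:  gcc -m32 wordsize.c -o wordsize32
               gcc -m64 wordsize.c -o wordsize64  */
    #include <stdio.h>

    int main(void)
    {
    #ifdef __LP64__
        printf("Built in 64-bit (LP64) mode\n");
    #else
        printf("Built in 32-bit mode\n");
    #endif
        printf("sizeof(void *) = %zu bytes\n", sizeof(void *));
        printf("sizeof(long)   = %zu bytes\n", sizeof(long));
        return 0;
    }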
64 bit: The Dark Side
A pointer is just a piece of data that holds the address of something in memory:

    int count = 100;     /* an integer stored somewhere in the 0-4 GB address space, e.g. at 0x2248 */
    int *aPtr = &count;  /* aPtr holds that address, i.e. it points at count */
64 bit: The Dark Side
Since a pointer refers to a place in memory, it needs to be big enough to reference any place in the memory range. For 32-bit applications that is 4 bytes; for 64-bit applications it is 8 bytes, twice as big. That means code that contains 64-bit pointers takes up more space than code with 32-bit pointers, so a cache of a fixed size holds less of it. For example, a 24-byte cache can hold six 32-bit pointers but only three 64-bit pointers. The small sketch below makes the arithmetic concrete.
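Also not from the slides: a tiny C sketch of the same arithmetic for a cache line. The 64-byte line size is an assumption (typical for x86 parts); the slide's 24-byte figure was just for illustration.

    /* ptrcache.c: how many pointers fit in one cache line?
       Assumes a 64-byte line; build with -m32 and -m64 to compare. */
    #include <stdio.h>

    #define CACHE_LINE_BYTES 64   /* assumed line size */

    int main(void)
    {
        size_t per_line = CACHE_LINE_BYTES / sizeof(void *);
        printf("pointer size: %zu bytes, %zu pointers per %d-byte line\n",
               sizeof(void *), per_line, CACHE_LINE_BYTES);
        return 0;
    }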
64 bit: The Dark Side
The extra space that 64-bit code takes up in caches can reduce performance by increasing cache misses. For this reason you probably shouldn't compile to 64 bits unless your application requires a large address space. (In fact, under OS X the GUI libraries are all 32-bit even though the underlying hardware may be 64-bit; compiling them to 64 bits would have reduced performance.)
Dual Core
CPU designers have been running up against a wall lately: they can't increase clock speed as much as in the past. As a result they are adding more silicon; in the case of dual-core CPUs, two or more CPU cores and their caches are replicated on one chip. This doesn't increase sequential speed, but it can speed up multi-process or multi-threaded programs; a minimal threaded sketch follows.
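As an illustration (not from the slides), here is a minimal POSIX-threads sketch in C: each thread runs an independent chunk of work, so on a dual-core node the threads can execute simultaneously. The worker function is just a stand-in for real computation.

    /* threads.c: trivial multi-threaded sketch.
       Build: gcc threads.c -o threads -lpthread */
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 2            /* e.g. one per core on a dual-core CPU */

    /* Stand-in for real work: sum over a large loop. */
    static void *worker(void *arg)
    {
        long id = (long)arg;
        double sum = 0.0;
        for (long i = 0; i < 100000000L; i++)
            sum += (double)(i % (id + 2));
        printf("thread %ld done, sum = %f\n", id, sum);
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NTHREADS];
        for (long i = 0; i < NTHREADS; i++)
            pthread_create(&threads[i], NULL, worker, (void *)i);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(threads[i], NULL);
        return 0;
    }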
CPUs
This is a major religious issue. The major contenders in the HPC Beowulf cluster space are:
- AMD Opteron (64/32 bit)
- Intel Itanium (64/32 bit)
- Intel Xeon MP (dual-core, EM64T)
- Intel Pentium M (32 bit, low power)
- PowerPC G5/970 (64 bit, usually under OS X)
CPUs
The Opteron is a strong choice: good price/performance, good power consumption, and good SMP capabilities. But some vendors (famously including Dell) don't sell Opterons.
SMP
Each node in the cluster can be an SMP machine, for example a dual-processor 1U box. If the CPUs are dual-core, this gives four cores per box. Four- and eight-CPU SMP boxes are also available, and more cores per CPU are likely in the future.
Blades
Lots of people want extremely dense installations, with as many CPUs per square foot as possible. Blades are essentially hosts on plug-in cards; they're inserted into a blade chassis.
Blades
The blades have an interconnect to the chassis and can share some resources such as the power supply, CD and floppy drives, network cabling, etc. They may implement features such as hot swap.
The downside is that they can be somewhat more expensive than plain 1U rackmounts, may be more proprietary, and are somewhat less flexible. Also, while they take up less floor space, they still generate a similar amount of heat.
IBM is a major blade vendor.
Memory
How much memory do you need on a compute node? That depends on your application and on current memory prices. Roughly 1/2 GB per GFLOP of speed seems to do OK. Opterons are, at this writing, roughly 3 GFLOPs per processor, so a dual-processor box works out to about 2 x 3 GFLOPs x 0.5 GB/GFLOP = 3 GB, i.e. 3-4 GB per box. This may change with dual-core CPUs.
You don't want your application to page-fault. If you're running 64-bit, that probably means you need an address space over 4 GB, so you may well need 4-8 GB or more of physical memory.
Virtual Memory
[Slide diagram: the virtual process address space (often 4 GB) mapped onto the process working set in physical memory, whose maximum size is the physical memory size.]
Page faults happen when a page isn't in the working set and has to be retrieved from disk. If you can fit the entire process into physical memory, you avoid page faults and speed up the process. One way to check for paging is sketched below.
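A hedged way to check whether a job is paging is to read its fault counters with getrusage(), a standard POSIX call; the sketch below is illustrative, and tools such as vmstat give a similar system-wide view.

    /* faults.c: report this process's page-fault counts.
       ru_majflt counts faults that had to go to disk (the expensive kind). */
    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        /* ... the real application's work would happen here ... */

        struct rusage ru;
        if (getrusage(RUSAGE_SELF, &ru) == 0) {
            printf("minor faults (no disk I/O): %ld\n", ru.ru_minflt);
            printf("major faults (disk I/O):    %ld\n", ru.ru_majflt);
        }
        return 0;
    }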
Disk
There are several schools of thought about disk on compute nodes:
- Leave them diskless: less heat, less to fail, a single image on the server.
- Put in a disk big enough to swap locally: you don't have to swap/page across the network.
- Put in a disk big enough to run the OS: you don't have to mess with net booting.
- Put in a disk big enough to run the OS and keep some data cached locally.
I favor the last method. Disk space is cheap; labor is expensive.
CD
If you do go the disk-per-compute-node route, you should have at least a CD-ROM drive so you can boot from CD and install. OS bloat makes it likely that a DVD drive will be required in a few years.
Video
You should have at least an el-cheapo VGA card in order to hook up a monitor. If you're doing visualization work, a high-end graphics card may be needed. Note that a fancy graphics card may consume more power and generate more heat.
Networking
The compute nodes have to communicate with each other and with the front end over the private network. What network should be used for this? We want:
- High speed
- Low latency
- Cheap
The major technologies are gigabit Ethernet, Myrinet, and Infiniband.
Networking
Myrinet is a 2 Gb/sec, roughly 2-3 microsecond latency networking standard that uses multimode fiber. NIC price (as of this writing) is about $500, and a 16-port switch about $5K.
Infiniband
Infiniband is the all-singing, all-dancing option: 10 Gb/sec (for the 4X rate), roughly 4 microseconds latency. Pricing seems to be similar to that of Myrinet; HP prices are on the order of $1K for a PCI adapter and $10K for a 24-port switch, and other places are probably cheaper.
Often a cluster will have both an Infiniband and an Ethernet network: the Infiniband for MPI and the Ethernet for conventional communications. (A minimal MPI program is sketched below.)
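For reference (not from the slides), here is a minimal MPI "hello world" in C, the kind of job that would ride the Infiniband fabric. The MPI calls shown are standard; mpicc and mpirun are the usual wrapper names, though they vary by MPI distribution.

    /* hello_mpi.c: each process reports its rank and host.
       Build:  mpicc hello_mpi.c -o hello_mpi
       Run:    mpirun -np 8 ./hello_mpi            */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size, namelen;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
        MPI_Get_processor_name(host, &namelen);

        printf("process %d of %d running on %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }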
Gigabit Ethernet
1 Gb/sec, but comparatively high latency (by default it goes through the TCP/IP protocol stack). Its advantage: dirt cheap and ubiquitous. Gigabit Ethernet is built into most server motherboards, and unmanaged L2 gigabit switches run approximately $10/port. You can also reuse existing expertise, cables, etc. It's pretty tough to argue against gigabit Ethernet on an economic basis.
Front End
The front end is the cluster's face to the rest of the world; it alone is connected to the public network. It typically runs a web server, scheduling software, NFS, DHCP on the private network, a firewall, and a few other utilities. More memory than a compute node is good, and more disk is good (disk is the subject of another talk).
Price
Very roughly:
- Compute nodes at $3K each (dual Opterons, 4 GB memory, 120 GB disk)
- Front end at $4K
- Rack at $2K
- Small L2 GigE switch at $150
Ten compute nodes at $3K is $30K; adding the front end, rack, and switch brings the total to roughly $36K. So for approximately $40K you can get a ~10 node cluster with ~20 CPUs and a peak performance of around 30-50 GFLOPS. (This pretty much ignores disk space.)