Building a High-performance Computing Cluster Using FreeBSD
BSDCon '03, September 10, 2003
Brooks Davis, Michael AuYeung, Gary Green, Craig Lee
The Aerospace Corporation, El Segundo, CA
{brooks,lee,mauyeung}@aero.org, Gary.B.Green@aero.org
HPC Clustering Basics
● HPC cluster features:
  – Commodity computers
  – Networked to enable distributed, parallel computations
  – Vastly lower cost than traditional supercomputers
● Many, but not all, HPC applications work well on clusters
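The "distributed, parallel computations" bullet is the core of the model: work is split across commodity nodes that coordinate by passing messages over the network. Below is a minimal sketch of that pattern using the mpi4py binding; mpi4py is an illustrative assumption here, not something the talk prescribes.

# Distributed sum: each MPI process computes a partial result and
# rank 0 combines them. mpi4py is assumed for illustration.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's index within the job
size = comm.Get_size()   # total number of processes in the job

# Each process sums its own interleaved slice of the input range...
local = sum(range(rank, 1_000_000, size))

# ...and the partial sums are combined on rank 0.
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print(f"sum across {size} processes: {total}")

Launched under an MPI runtime (for example, mpirun -np 4 python partial_sum.py), the same program runs on every node and only rank 0 prints the combined result.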
Cluster Overview
● Fellowship is the Aerospace corporate cluster
  – The name is short for "The Fellowship of the Ring"
● Running FreeBSD 4.8-STABLE
● Over 183 GFlops of floating-point performance on the LINPACK benchmark
Cluster Overview: Nodes and Servers
● 160 nodes (320 CPUs)
  – Dual-CPU 1U systems with Gigabit Ethernet
  – 86 Pentium III (7 at 1 GHz, 40 at 1.26 GHz, 39 at 1.4 GHz)
  – 74 Xeon 2.4 GHz
● 4 core systems
  – frodo – management server
  – fellowship – shell server
  – gamgee – backup, database, and monitoring server
  – legolas – scratch server (2.8 TB)
Cluster Overview: Network and Remote Access
● Gigabit Ethernet network
  – Cisco Catalyst 6513 switch
  – Populated with 11 16-port 10/100/1000T blades
● Serial console access
  – Cyclades TS2000 and TS3000 terminal servers
● Power control
  – BayTech RPC4 and RPC14 serial power controllers
Cluster Overview: Physical Layout
Design Issues
● Operating system
● Hardware architecture
● Network interconnects
● Addressing and naming
● Node configuration management
● Job scheduling
● System monitoring
Operating System
● Almost anything can work
● Considerations:
  – Local experience
  – Needed applications
  – Maintenance model
  – Need to modify the OS
● FreeBSD
  – Diskless support
  – Cluster architect is a committer
  – Ease of upgrades
  – Linux emulation
Hardware Architecture
● Many choices: i386, SPARC, Alpha
● Considerations:
  – Price
  – Performance
  – Power and heat
  – Software support (OS, applications, development tools)
● Intel PIII/Xeon
  – Price
  – OS support
  – Power
Network Interconnects
● Many choices:
  – 10/100 Ethernet
  – Gigabit Ethernet
  – Myrinet
● Issues:
  – Price
  – OS support
  – Application mix
● Gigabit Ethernet
  – Application mix: a middle ground between tightly and loosely coupled applications
  – Price
Addressing and Naming Schemes
● To subnet or not?
● Public or private IPs?
● Naming conventions
  – The usual rules apply to core servers
  – Large clusters probably want more mechanical names for nodes
● On Fellowship:
  – 10.5/16 private subnet
  – Core servers named after Lord of the Rings characters
  – Nodes named and numbered by location (see the sketch below): rack 1, node 1 is r01n01 at 10.5.1.1
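The node scheme is mechanical enough to compute directly. A small sketch of the mapping, assuming the third and fourth IP octets track the rack and node numbers as the r01n01 / 10.5.1.1 example suggests:

def node_identity(rack: int, node: int) -> tuple[str, str]:
    """Map a physical location to its Fellowship hostname and IP.

    Rack 1, node 1 becomes ("r01n01", "10.5.1.1") per the scheme above.
    """
    return f"r{rack:02d}n{node:02d}", f"10.5.{rack}.{node}"

print(node_identity(1, 1))    # ('r01n01', '10.5.1.1')
print(node_identity(12, 7))   # ('r12n07', '10.5.12.7')

Names sort cleanly and an operator can walk from an alert straight to the right rack, which is the point of mechanical naming.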
Node Configuration Management
● Major methods:
  – Individual installs
  – Automated installs
  – Network booting
● Automation is critical
● Nodes are network booted via PXE (a dhcpd sketch follows below)
● Automatic node disk configuration
  – Layout version recorded in the MBR
  – diskprep script
● Upgrades use a copy of the root file system
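PXE booting hinges on the DHCP server handing every node its identity plus FreeBSD's pxeboot loader. A hypothetical sketch of generating ISC dhcpd host entries from a node table; the MAC addresses and the next-server address (assumed to be frodo, the management server) are placeholders, not values from the talk.

# Generate ISC dhcpd host entries so each node PXE-boots with a fixed
# identity. The node table below is illustrative.
NODES = {
    "r01n01": ("10.5.1.1", "00:07:e9:aa:bb:01"),
    "r01n02": ("10.5.1.2", "00:07:e9:aa:bb:02"),
}

ENTRY = """host {name} {{
    hardware ethernet {mac};
    fixed-address {ip};
    next-server 10.5.0.1;   # management server (assumed address)
    filename "pxeboot";     # FreeBSD's PXE boot loader
}}"""

with open("dhcpd.conf.nodes", "w") as out:
    for name, (ip, mac) in sorted(NODES.items()):
        out.write(ENTRY.format(name=name, mac=mac, ip=ip) + "\n")

Pointing a generator like this at a database of node MACs is what makes the "database-driven PXE/DHCP server" wish near the end of the talk attractive.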
Job Scheduling
● Options:
  – Manual scheduling
  – Batch queuing systems (SGE, OpenPBS, etc.)
  – Custom schedulers
● Sun Grid Engine
  – Ported to FreeBSD starting with Ron Chen's patches
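With a batch system in place, submission itself becomes scriptable. A hedged sketch of driving SGE's qsub from Python; the "mpi" parallel environment name is a site-specific assumption, and linpack.sh is a hypothetical job script.

# Submit a job to Sun Grid Engine by shelling out to qsub.
import subprocess

def submit(script: str, name: str, slots: int) -> None:
    subprocess.run(
        ["qsub",
         "-N", name,                # job name shown in qstat
         "-pe", "mpi", str(slots),  # request slots from a parallel environment (name assumed)
         "-cwd",                    # run in the submission directory
         script],
        check=True,
    )

submit("linpack.sh", "hpl-run", 64)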
System Monitoring
● Standard monitoring tools:
  – Nagios (formerly NetSaint)
  – Big Sister
● Cluster-specific tools:
  – Ganglia
  – Most schedulers
● On Fellowship (see the sketch below):
  – Ganglia (port: sysutils/ganglia-monitor-core)
  – Sun Grid Engine
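Part of Ganglia's appeal for scripting is that gmond publishes cluster state as XML over TCP port 8649. The sketch below polls one daemon and prints each host's one-minute load; the host name (gamgee, the monitoring server) and the exact XML layout are assumptions based on stock Ganglia, not details from the talk.

# Poll a Ganglia gmond daemon and print per-host load averages.
import socket
import xml.etree.ElementTree as ET

def poll_gmond(host: str = "gamgee", port: int = 8649) -> None:
    # gmond dumps its XML state to any client that connects, then closes.
    with socket.create_connection((host, port)) as sock:
        chunks = []
        while data := sock.recv(65536):
            chunks.append(data)
    root = ET.fromstring(b"".join(chunks))
    for host_el in root.iter("HOST"):
        for metric in host_el.iter("METRIC"):
            if metric.get("NAME") == "load_one":
                print(host_el.get("NAME"), metric.get("VAL"))

poll_gmond()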
System Monitoring: Ganglia
Lessons Learned
● Hardware attrition can be significant
● Neatness counts in cabling
● System automation is very important
  – If you do it to a node, automate it (see the sketch below)
● Much of the HPC community thinks the world is a Linux box
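"If you do it to a node, automate it" in practice means never typing on one node what you can run on all of them. A minimal sketch that fans a command out over ssh; the five-racks-of-32 layout matches the 160-node count above but is a guess, and passwordless ssh is assumed.

# Run one command on every node in parallel and report the results.
import subprocess

nodes = [f"r{r:02d}n{n:02d}" for r in range(1, 6) for n in range(1, 33)]

procs = {
    node: subprocess.Popen(
        ["ssh", "-o", "BatchMode=yes", node, "uptime"],
        stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True)
    for node in nodes
}
for node, proc in procs.items():
    out, _ = proc.communicate()
    print(node, out.strip() or "NO RESPONSE")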
FY 2004 Plans
● Switch upgrades: Supervisor 720 and 48-port blades
● New racks: another row of racks adding 6 more node racks (192 nodes)
● More nodes: either more Xeons or Opterons
● Upgrade to FreeBSD 5.x
Future Directions
● Determining a node replacement policy
● Clustering on demand
● Scheduler improvements
● Grid integration (Globus Toolkit)
● Trusted clusters
Wish List
● Userland:
  – Database-driven PXE/DHCP server
● Kernel:
  – Distributed file system support (e.g. GFS)
  – Checkpoint and restart capability
  – BProc-style distributed process management
Acknowledgements
● Aerospace:
  – Michael AuYeung
  – Brooks Davis
  – Alan Foonberg
  – Gary Green
  – Craig Lee
● Vendors:
  – iXsystems
  – Off My Server
  – Iron Systems
  – ASA Computers
  – Raj Chahal
Resources
● Paper and presentation:
  – http://people.freebsd.org/~brooks/papers/bsdcon2003/