Loosely Coupled Parallelism: Clusters
Context
We have studied older architectures for loosely coupled parallelism, such as meshes and hypercubes, which require special systems to be built. The same is true of the switched, closely coupled architectures such as crossbar systems and Omega switched networks. Special builds imply cost, sometimes massive cost.
An obvious idea is to take standard computers (e.g. Pentium PCs) and standard connections between them (e.g. Ethernet links or the Internet), connect them up, and devise suitable software to deliver parallel computing at much lower cost and potentially on a very large scale.
Such systems were first set up in the 1990s (at NASA) and are now among the most important ways of achieving parallel processing power. They are called clusters, or Clusters of Workstations (COWs).
Cluster basics
A cluster can consist of anything from two to thousands of individual machines.
Clusters can be used for true parallel processing, i.e. to apply more computing power to a single hard problem. This is difficult, because it means devising distributed parallel algorithms that attack the problem at many nodes of the cluster simultaneously. Solutions will be largely problem-specific, although groups of similar problems will, of course, have similar solutions.
Clusters can also be used for ‘load balancing’: where a computer system has to service many largely independent jobs, the cluster is used to ‘contract out’ the jobs and divide them among its nodes. This is an obvious way to set up powerful servers for large numbers of clients. Load balancing of this kind is easier than the ‘single problem’ application described above, and it can be done with generic software tools (see later slides).
Clusters provide very good scaling and fault tolerance. Since there is no special hardware, and since the computers at all the nodes are exactly the same, nodes can be added or lost without problems; the only effect is on the aggregate power available.
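The load-balancing idea can be illustrated with a small sketch. It assumes each job has a known cost and uses a simple greedy rule (give the next job to the least-loaded node); the function names and job costs are hypothetical, and real cluster tools use their own, more sophisticated policies.

```python
import heapq

def distribute_jobs(job_costs, node_count):
    """Assign each independent job to the currently least-loaded node."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    # Heap of (accumulated load, node index); the least-loaded node is on top.
    heap = [(0, node) for node in range(node_count)]
    assignment = [[] for _ in range(node_count)]
    # Placing the largest jobs first tends to give a more even spread.
    for job_id, cost in sorted(enumerate(job_costs), key=lambda jc: -jc[1]):
        load, node = heapq.heappop(heap)
        assignment[node].append(job_id)
        heapq.heappush(heap, (load + cost, node))
    return assignment

def makespan(job_costs, assignment):
    """Time at which the busiest node finishes, i.e. when all jobs are done."""
    return max(sum(job_costs[j] for j in jobs) for jobs in assignment)

if __name__ == "__main__":
    costs = [5, 3, 8, 2, 7, 4, 6, 1]   # hypothetical job run times
    for nodes in (4, 3):               # full cluster, then one node lost
        plan = distribute_jobs(costs, nodes)
        print(nodes, "nodes ->", plan, "finishes at", makespan(costs, plan))
```

With these made-up costs, four nodes finish at time 9 and three nodes at time 13: losing a node changes only how long the work takes, not whether it can be done, which is the scaling and fault-tolerance point made above.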
Small Clusters
The most common way to set these up is to use Linux, along with open source software such as Mosix and OSCAR, which allows PC clusters for load balancing to be set up in quite a straightforward manner.
Big Clusters
Again, these are often based on Linux. The early NASA software was named ‘Beowulf’, and Beowulf systems are still being developed.
The Top 500 supercomputer web site, from the University of Mannheim, now lists Linux-based clusters among the world’s biggest beasts, alongside the specially built giants from Cray, Silicon Graphics, IBM and NEC (who built the Earth Simulator). In the latest list (Nov 2004), Linux clusters are at nos. 4 and 5. The top computer in this list is a special IBM 3D torus with 32,768 nodes, each of which is a complete computer plus memory on a chip. No. 4 is ‘MareNostrum’, a 2268-node Linux cluster in Barcelona, Spain. The next list is due in June.
The SETI (Search for Extra-Terrestrial Intelligence) project is in effect a cluster of up to 3.5 million personal computers in 226 countries. A recent estimate of the aggregate PC time devoted in this way is 800,000 years!
Some details: Mosix
Mosix: Multicomputer Operating System for Unix. It extends the Linux kernel so that any standard Linux process can be migrated to another node to take advantage of better resources or to execute faster.
The migrated process doesn’t know where it is, and as far as its home node is concerned the process is running locally. That’s transparency for you! No special programming is required.
Migration can be automatic. Each node is a master for locally created processes and a server for remote processes that have migrated from other nodes. Monitoring algorithms show what is going on where and how efficiently everything is running.
However, Mosix can’t run a single process on two or more nodes at the same time, so it falls into the ‘load balancing’ category.
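As a rough illustration of automatic migration, here is a toy model. It is not Mosix's actual algorithm: the classes, the "move one process when two nodes differ by two or more" rule and the node names are all assumptions made for the sketch. It does capture two points from the slide: a process keeps its home node after moving, and it only ever runs on one node at a time.

```python
from dataclasses import dataclass, field

@dataclass
class Process:
    pid: int
    home: str       # node that created the process; never changes
    location: str   # node currently executing it; exactly one at a time

@dataclass
class Node:
    name: str
    running: list = field(default_factory=list)

def spawn(node, pid):
    """Create a process locally; its home and location start out the same."""
    proc = Process(pid=pid, home=node.name, location=node.name)
    node.running.append(proc)
    return proc

def migrate_once(nodes):
    """Move one process from the busiest node to the idlest, if worthwhile."""
    busiest = max(nodes, key=lambda n: len(n.running))
    idlest = min(nodes, key=lambda n: len(n.running))
    if len(busiest.running) - len(idlest.running) < 2:
        return False                # already balanced; moving would not help
    proc = busiest.running.pop()
    proc.location = idlest.name     # the process moves; its home is unchanged
    idlest.running.append(proc)
    return True

def balance(nodes):
    """Keep migrating until no move helps; return the number of migrations."""
    moves = 0
    while migrate_once(nodes):
        moves += 1
    return moves

if __name__ == "__main__":
    cluster = [Node("a"), Node("b"), Node("c")]   # hypothetical node names
    for pid in range(6):
        spawn(cluster[0], pid)                    # all work starts on node a
    print(balance(cluster), "migrations")
    print({n.name: [p.pid for p in n.running] for n in cluster})
```

Note that the unit being moved is a whole process. Nothing in the model splits one process across nodes, which is why this style of system counts as load balancing and not as true parallel processing.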
Some details: OSCAR and Beowulf
OSCAR (Open Source Cluster Application Resources) clusters consist of one server node and lots of clients, all of which must have the same hardware setup. OSCAR provides an installation framework that installs the necessary files and software resources on the clients. It is mostly aimed at High Performance Computing (HPC).
Beowulf is the best-known type of Linux-based cluster. It basically provides libraries of clustering tools for multiple machines. The most interesting of these support PVM (Parallel Virtual Machine) and MPI (Message Passing Interface). Both allow a true parallel attack on difficult problems, but both need users who know how to write special software that hooks into PVM or MPI to take advantage of the cluster. MPI in particular is a vendor-independent standard for message passing, with a large user-based applications forum.
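To give a feel for the message-passing style that PVM and MPI programs are written in, here is a minimal sketch. It does not use MPI or PVM: threads stand in for cluster nodes and in-memory queues stand in for the network, so it runs on one machine with only the Python standard library. The function names and the sum-of-squares problem are hypothetical; a real cluster program would make the equivalent send, receive, scatter and gather calls through an MPI or PVM library.

```python
import queue
import threading

def worker(rank, inbox, outbox):
    """One 'node': receive a chunk of work, compute locally, send the result."""
    chunk = inbox.get()                             # blocking receive
    outbox.put((rank, sum(x * x for x in chunk)))   # send partial result back

def parallel_sum_of_squares(values, node_count):
    """Split one problem across node_count workers using messages only."""
    inboxes = [queue.Queue() for _ in range(node_count)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(node_count)
    ]
    for t in threads:
        t.start()
    # Scatter: node r receives every node_count-th value, starting at index r.
    for rank, inbox in enumerate(inboxes):
        inbox.put(values[rank::node_count])
    # Gather: collect one partial result per node and combine them.
    total = 0
    for _ in range(node_count):
        _rank, partial = results.get()
        total += partial
    for t in threads:
        t.join()
    return total

if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1, 101)), 4))
```

The point of the sketch is the shape of the program: the programmer decides explicitly how the data is divided and how partial results are combined, which is the ‘special software’ that a true parallel attack requires.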