
1 "Practical Considerations in Building Beowulf Clusters": Lessons from Experience and Future Directions. Arch Davis (GS*69), Davis Systems Engineering

7 Poor light-socket coordination

8 Parallel Computing Architectures
1. (Not parallel) Fastest possible serial
–a. Make it complex
–b. Limits
2. Old superscalar and vector machines (Crays, etc.)
3. Silicon Graphics shared memory (<64 CPUs)
4. Intel shared memory: 2-32 processor servers
5. Distributed memory: “Beowulf” clusters
6. Biggest distributed memory: NEC SX-6 “Earth Simulator”

9 Building a Beowulf cluster: + glue → Cluster?

10 Some Design Considerations
1. Processor type and speed
2. Single or dual processors
3. Type of memory
4. Disk topology
5. Interconnection technology
6. Physical packaging
7. Reliability

11 “Just a bunch of ordinary PCs”
But to be reliable, more must be watched:
–Power supplies
–Fans
–Motherboard components
–Packaging layout
–Heat dissipation
–Power quality
To be cost-effective, configure carefully:
–Easy to overspecify and cost >2x what is necessary
–Don’t overdo the connections; they cost a lot.
–The old woman swallowed a fly. Be careful your budget doesn’t die.

12 1. Processor type & speed
A. Pentium 4: inexpensive, if not leading-edge speed
B. Xeon: a dual-processor P4; shares a motherboard
C. AMD Opteron: 64-bit, needed for >2 GB memory
D. (Future) Intel 64-bit: will be AMD compatible!
E. IBM 970 (G5): the true 64-bit design Apple is using
F. Intel Itanium (“Ititanic”): 64-bit long-instruction-word

13 Disk Topology
1. Disk per board
2. Diskless + RAID

14 Interconnect Options
There is always a desire for far more speed than is possible; latency is ultimately an issue of light speed.
Existing options:
1. Ethernet, including Gigabit
–Switched
–Very robust (by Dave Boggs, EECS *72)
–Affordable, even at Gigabit
2. InfiniBand
–Switched
3. Proprietary: Myrinet, Quadrics, Dolphin
–Various topologies, including 2-D and 3-D meshes
–Remote DMA may be the transfer method
–Assumes a noise-free channel; may have CRC
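For comparing these options in practice, here is a minimal MPI ping-pong sketch in C (an illustration, not from the original slides): it estimates one-way latency with empty messages and sustained bandwidth with 1 MB messages between two nodes. The repetition count, the message size, and the mpicc/mpirun invocation in the header comment are assumptions about a typical MPI installation.

```c
/* pingpong.c - minimal latency/bandwidth probe between two cluster nodes.
 * Typical build/run (assumed MPICH/LAM-style tools):
 *   mpicc pingpong.c -o pingpong && mpirun -np 2 ./pingpong
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nprocs, i;
    const int reps = 1000;
    const int nbytes = 1 << 20;          /* 1 MB payload for the bandwidth test */
    char *buf;
    double t0, latency, bandwidth;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    if (nprocs < 2) {
        if (rank == 0) fprintf(stderr, "run with at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }
    buf = malloc(nbytes);

    /* Latency: bounce an empty message between ranks 0 and 1. */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, 0, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, 0, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, 0, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, 0, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    latency = (MPI_Wtime() - t0) / (2.0 * reps);      /* one-way, in seconds */

    /* Bandwidth: the same exchange with the 1 MB payload. */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    bandwidth = 2.0 * reps * (double)nbytes / (MPI_Wtime() - t0) / 1.0e6;  /* MB/s */

    if (rank == 0)
        printf("one-way latency ~ %.1f us, bandwidth ~ %.1f MB/s\n",
               latency * 1.0e6, bandwidth);
    free(buf);
    MPI_Finalize();
    return 0;
}
```

Running this over each candidate interconnect gives the two numbers that matter most when choosing: how long the first byte takes, and how fast the rest follow.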

16 Physical Packaging
It’s not “rocket science,” but it takes care. A few equations now and then never hurt when you are doing heat-transfer design.
How convenient is it to service? How compact is the cluster? What about the little things: lights and buttons?
“Take care of yourself, you never know how long you will live.”
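As one concrete instance of “a few equations now and then,” here is a back-of-envelope sketch (an illustration, not from the original slides) of the airflow needed to carry away a given heat load, from Q = P / (rho * cp * dT). The 2 kW load and 10 °C inlet-to-outlet temperature rise are made-up example numbers.

```c
/* airflow.c - rough cooling estimate: volumetric airflow needed to remove a
 * given heat load P with an allowable air temperature rise dT.
 */
#include <stdio.h>

int main(void)
{
    const double rho = 1.2;       /* air density, kg/m^3 (near sea level)   */
    const double cp  = 1005.0;    /* specific heat of air, J/(kg*K)         */
    double power_w   = 2000.0;    /* example: a small rack dissipating 2 kW */
    double delta_t   = 10.0;      /* allowed inlet-to-outlet rise, deg C    */

    double m3_per_s = power_w / (rho * cp * delta_t);  /* volumetric flow   */
    double cfm      = m3_per_s * 2118.88;              /* 1 m^3/s = 2118.88 CFM */

    printf("%.0f W with a %.0f C rise needs about %.2f m^3/s (%.0f CFM)\n",
           power_w, delta_t, m3_per_s, cfm);
    return 0;
}
```

For the example numbers this works out to roughly 350 CFM, which is why fan count, fan placement, and unobstructed airflow paths matter as much as the parts list.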

19 Reliability
Quality is designed in, not an accident. Many factors affect reliability.
Truism: “All PCs are the same. Buy the cheapest and save.”
Mil-spec spirit can be followed without gold plate.
Many components and procedures affect the result.
Early philosophy: triage of failing modules.
Later philosophy: uptime of the entire cluster.
Consequence of long uptime: user confidence and greatly accelerated research.

22 Benchmarks
● Not a synthetic benchmark
● 100 timesteps of the Terra code (John R. Baumgardner, LANL)
● Computational fluid dynamics application
● Navier-Stokes equations with ∞ Prandtl number
● 3D spherical-shell multigrid solver
● Global elliptic problem with 174,000 elements
● Inverting and solving at each timestep
Results are with the Portland Group pgf90 Fortran compiler (with the -fastsse option) and with Intel release 8 Fortran:

Machine       CPU           Intel     Portland
baseline      P4 2.0        319 s     362 s
lowpower      P4M 1.6       342 s     358 s
Router2       Xeon 2.4      264 s     305 s
epiphany      Xeon 2.2      264 s     312 s
pntium28      P4 2.8/800    172 s     209 s
opteron146    AMD 2.0       160 s     164 s
Cray design   NEC SX-6      ~50 s

24 Software
Usually Linux with MPI for communication. Could be Windows, but not many clusters are.
Optimizing compilers
Management and monitoring software
Scheduling software
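As a sketch of what “Linux with MPI” looks like at the code level (an example added here, not from the original slides), this is the canonical MPI hello-world in C; the mpicc/mpirun commands in the comment assume a typical MPICH- or LAM-style installation.

```c
/* hello_mpi.c - smoke test that the Linux + MPI stack spans the cluster.
 * Typical usage: mpicc hello_mpi.c -o hello && mpirun -np 8 ./hello
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I?  */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many in total?   */
    MPI_Get_processor_name(host, &len);     /* which node am I on?  */

    printf("Hello from rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}
```

Seeing one line per rank, spread across the expected hostnames, is the quickest check that the management, scheduling, and network layers are wired together correctly.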

25 PGI® Workstation – 1 to 4 CPU systems
Linux (32-bit/64-bit) and Windows
Pentium 4, Athlon, Xeon, Opteron

26 Workstation Clusters
PGI CDK™ = PGI compilers + open-source clustering software
A turnkey package for configuring an HPC cluster from a group of networked Linux workstations or dedicated blades

28 What about the future?
Always go Beowulf if you can.
Work on source code to minimize communication (see the sketch below).
Compilers may never be smart enough to automatically parallelize or to second-guess the programmer or the investigator.
Components will get faster, but interconnects will always lag processors.
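To make “minimize communication” concrete, a small sketch (an illustration added here, not from the original slides): on a latency-bound interconnect, sending N tiny messages costs roughly N times the per-message latency, while one aggregated message pays it once. The array size and the rank roles are arbitrary.

```c
/* aggregate.c - contrast of two communication patterns for the same data. */
#include <mpi.h>
#include <stdio.h>

#define N 10000

int main(int argc, char **argv)
{
    int rank, size, i;
    static double data[N];               /* stand-in for real simulation data */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) { MPI_Finalize(); return 0; }

    /* Costly pattern: N tiny messages, each paying the full network latency. */
    if (rank == 0) {
        for (i = 0; i < N; i++)
            MPI_Send(&data[i], 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        for (i = 0; i < N; i++)
            MPI_Recv(&data[i], 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }

    /* Better: one message carrying the whole array, paying latency once. */
    if (rank == 0)
        MPI_Send(data, N, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(data, N, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}
```

On commodity Ethernet, where per-message latency can dwarf the per-byte cost, the aggregated pattern can be faster by orders of magnitude, and no compiler will make that restructuring for you.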

31 Future Hardware
No existing boards are made for clustering; better management firmware is needed.
Blade designs may be proprietary and may require common components to operate at all.
Hard disks need more affordable reliability.
Large, affordable Ethernet switches are needed.

32 General advice?
Think of clusters as “personal supercomputers.” They are simplest if used as a departmental or small-group resource.
Clusters that are too large may cost too much:
–Overconfigured
–Massive interconnect switches
–Users can only exploit so many processors at once (see the Amdahl’s-law sketch below)
–Multiple runs may beat one massively parallel run
–Think “lean and mean.”
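A standard Amdahl’s-law estimate (an addition here, not from the original slides) puts numbers behind “only so many processors”: if a fraction p of a run parallelizes, the speedup saturates at 1/(1-p) no matter how many nodes are added. The p = 0.9 figure below is only an example value.

```latex
% Amdahl's law: speedup S on N processors when a fraction p of the work parallelizes.
S(N) = \frac{1}{(1-p) + p/N}
% e.g. with p = 0.9:  S(16) \approx 6.4,  S(64) \approx 8.8,  S(\infty) = \frac{1}{1-p} = 10
```

Going from 16 to 64 processors in this example buys less than a 40% improvement, which is why several modest runs often beat one massively parallel one.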

36 Opportunities
1. Test these machines with your code.
2. Get a consultation on configuration.

44 More are Coming
Peter Bunge sends his greetings, in anticipation of a Deutsche Geowulf: 256 processors…
And many more clusters here and there.
Happy Computing! But, NOT The End

