Astrophysical Applications on Superclusters Matthew Bailes Swinburne Centre for Astrophysics and Supercomputing
Outline No: –Linpack Mflops –latencies –bandwidths –evangelism Why a Supercluster? What is the Supercluster? How do we use the Supercluster? What does it do?
Why a Supercluster? Swinburne wants reputation. Hypothesis: –30 times the power –Six years of Moore’s law We can do problems 30x as complex as other groups.
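The "30 times the power = six years of Moore's law" equivalence can be checked with a little arithmetic. This is only a sketch: the doubling times are assumptions, not figures from the slides, and the function names are illustrative.

```python
import math

def moore_speedup(years, doubling_time_years):
    """Speed-up after `years` if performance doubles every `doubling_time_years`."""
    return 2.0 ** (years / doubling_time_years)

def implied_doubling_time(factor, years):
    """Doubling time (years) at which `factor` corresponds to `years` of growth."""
    return years / math.log2(factor)

# Doubling time implied by "30x = six years"
implied = implied_doubling_time(30.0, 6.0)

# With the often-quoted 18-month doubling (an assumption), six years gives less than 30x
speedup_18_months = moore_speedup(6.0, 1.5)

print(f"implied doubling time: {implied:.2f} yr")
print(f"six years at 18-month doubling: {speedup_18_months:.0f}x")
```

Under these assumptions, the slide's claim corresponds to performance doubling roughly every 1.2 years.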
Centre Goals: Fundamental Research. Public Outreach and Education. Commercial Supercomputing. –Astrophysical Special Effects –Cluster Monitoring Tools –Commercial Rendering
What is the Supercluster? Supercluster sounds better than Beowulf if you are an astronomer. Design Goals SSI I (1998): –No one component worth more than A$10K –Order of magnitude more power than a single workstation. –Dedicated resource. (dispel various myths) –10 GB scratch/node. –10 MB/s IO node-node. –Decent Fortran/C/C++ compiler.
Case Study: CSIRO Astronomy 1984: VAX 11/ : Convex C2 ( > 10 times speed up) 1995: Power Challenge ( 10 processors ) 1999: Linux Boxes Unless a package supports parallelism, users won’t use clusters, or even SMP/NUMA machines, until their science is obviously constrained.
Theorists: Possess and use clusters effectively. Know what MPI is. Can’t get money.
SSI I (Jan 1998) 16 DEC 500 MHz Alphas 2 MB cache 192 MB RAM 13 GB disk 24-port Cisco switch MPICH/f77/C++/FFTW/emacs/gcc Zeroth Law of Cluster Computing: Cluster Computing is inevitable if your budget is finite.
SSI II (Nov 1998). SSI I + 8 x 600 MHz DECs 4 MB cache. Corollary: Your first cluster is your happiest. First Law of Cluster Computing: Your cluster soon becomes heterogeneous.
SSI III (March 1999) SSI II + – MHz ev6 processors –512 MB RAM/node –18 GB disk/node Cisco 5500 switch –3.2 Gb/s backplane Virtual Reality Theatrette –Seats 37 Second Law of Cluster Computing: MTBF = MTBF_0 / N
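The Second Law (cluster MTBF equals the single-node MTBF_0 divided by the number of nodes N) can be illustrated with a short calculation. This is a sketch that assumes nodes fail independently at a constant rate. The per-node MTBF below is a hypothetical figure, not a measurement from the talk. The node count of 24 is SSI II's 16 + 8 machines.

```python
def cluster_mtbf(node_mtbf_hours, n_nodes):
    """Mean time between failures anywhere in the cluster: MTBF_0 / N.

    Assumes independent nodes with identical, constant failure rates.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return node_mtbf_hours / n_nodes

# Hypothetical per-node MTBF (assumption for illustration only)
NODE_MTBF_HOURS = 24000.0

for n in (1, 16, 24):
    hours = cluster_mtbf(NODE_MTBF_HOURS, n)
    print(f"{n:3d} nodes: a failure every {hours / 24.0:.0f} days on average")
```

With these assumed numbers, a node that fails once in nearly three years still means a failure somewhere in a 24-node cluster about every six weeks.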
How do we use the Supercluster? Linux Workstations. (despite free OS) No batch system (just 3 “power” users). Home-grown MPI programs. C++/Fortran/Java.
Problems: Distributed TB disk rarely has > 10% free. MPI hangs on FPE or “p4pg” errors. CPUs too powerful for fast ethernet and tape drive on some applications. Difficult to monitor.
Applications. Neutron Star Searches. –Looked at 10% of the Southern Sky –Recorded 1.4 TB in 21 days. –1 ev56 workstation would take 7 years. –SSI III took 25 days. Discovered 7 “millisecond” pulsars. –Could scale to 1000 nodes on TCP/IP. [Figure: processing pipeline with stages labelled 17 MB, 256 MB, FFT, Search, Fold, Save]
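The FFT, Search and Fold stages named on this slide can be sketched on a toy time series. This is not the Swinburne search code. It is a minimal illustration that uses a naive DFT in place of an FFT library, a noiseless synthetic pulse train, and hypothetical function names.

```python
import cmath
import math

def power_spectrum(samples):
    """Naive O(N^2) DFT power spectrum; a real search would use an FFT."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [x - mean for x in samples]  # remove the DC level
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(centred))
        powers.append(abs(acc) ** 2)
    return powers

def strongest_bin(powers):
    """Search step: index of the strongest non-DC spectral bin."""
    return max(range(1, len(powers)), key=lambda k: powers[k])

def fold(samples, period_samples, nbins):
    """Fold step: average the samples into `nbins` bins of pulse phase."""
    totals = [0.0] * nbins
    counts = [0] * nbins
    for t, x in enumerate(samples):
        phase = (t % period_samples) / period_samples
        b = min(int(phase * nbins), nbins - 1)
        totals[b] += x
        counts[b] += 1
    return [tot / c if c else 0.0 for tot, c in zip(totals, counts)]

# Synthetic data (assumption): 256 samples, a 4-sample pulse every 16 samples
series = [1.0 if 3 <= t % 16 < 7 else 0.0 for t in range(256)]

best_bin = strongest_bin(power_spectrum(series))
period = len(series) / best_bin        # candidate period in samples
profile = fold(series, period, 16)     # folded pulse profile
print(best_bin, period, profile)
```

Each beam or data segment can be searched independently in this way, which is consistent with the slide's remark that the search could scale to many nodes over TCP/IP.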
Discovery Implications: Discovered the most relativistic neutron star + white dwarf binary known. Emits gravitational waves –Will coalesce in 7 Gyr. Population of ultra-relativistic systems.
Problems. Most interesting systems are relativistic. Full sensitivity requires coherent addition. If observation time > 10 minutes, computational penalty becomes very large.
Coherent Dedispersion. Problem: –Cosmic signals are weak –Cosmic radio signals propagate at v ≠ c (the interstellar medium is dispersive) In 1971 a new method was proposed: –Record the electric field –Apply a numerical filter to it.
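To see why a filter is needed, the smearing that dispersion causes across a receiver band can be estimated from the standard cold-plasma dispersion delay. The slide does not quote this formula. The dispersion measure and band edges below are hypothetical values chosen only for illustration.

```python
# Dispersion constant in MHz^2 pc^-1 cm^3 s (standard cold-plasma value)
DISPERSION_CONSTANT = 4.148808e3

def dispersion_delay_s(dm, f_low_mhz, f_high_mhz):
    """Arrival-time delay (s) of f_low relative to f_high.

    dm is the dispersion measure in pc cm^-3; frequencies are in MHz.
    """
    return DISPERSION_CONSTANT * dm * (f_low_mhz ** -2 - f_high_mhz ** -2)

# Hypothetical example: DM = 10 pc cm^-3 across a 20 MHz band near 1.4 GHz
smear = dispersion_delay_s(10.0, 1400.0, 1420.0)
print(f"smearing across the band: {smear * 1e3:.2f} ms")
```

Under these assumed values the pulse is smeared by roughly 0.6 ms across the band, which is far wider than a microsecond-scale pulse. Coherent dedispersion removes this by applying the inverse of the dispersive phase rotation to the recorded electric field, rather than shifting detected power in coarse frequency channels.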
What does this mean? 20 MHz = 20 MB/second. 200 times real time to process (ev6) Gives 50 nanosecond time resolution Need 7*8 hour observations to do science –One node 1.5 yr –50 nodes 9 days –1985 VAX 11/780 (one century)
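The figures on this slide follow from simple arithmetic, sketched below. The calculation assumes perfect linear scaling across nodes and no I/O overhead. The function name is illustrative.

```python
def processing_days(observation_hours, slowdown, n_nodes=1):
    """Wall-clock days to process data at `slowdown` times real time,
    split evenly over `n_nodes` (assumes ideal linear scaling)."""
    return observation_hours * slowdown / n_nodes / 24.0

BANDWIDTH_HZ = 20e6          # 20 MHz recorded bandwidth (from the slide)
SLOWDOWN = 200               # 200 times real time on one ev6 (from the slide)
OBSERVATION_HOURS = 7 * 8    # seven 8-hour observations

time_resolution_s = 1.0 / BANDWIDTH_HZ
one_node_years = processing_days(OBSERVATION_HOURS, SLOWDOWN) / 365.25
fifty_node_days = processing_days(OBSERVATION_HOURS, SLOWDOWN, 50)

print(f"time resolution: {time_resolution_s * 1e9:.0f} ns")
print(f"one node: {one_node_years:.1f} yr, 50 nodes: {fifty_node_days:.1f} days")
```

The simple product gives about 1.3 years on one node, the same order as the 1.5 years quoted on the slide, and about 9 days on 50 nodes.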
Discovered? Millisecond pulsars emit short (1 µs wide) pulses across GHz bandwidths –Implies seed areas of 30 cm or less PSR in a 5.7 day orbit –1 Mkm in radius [Figure labels: a, b; a-b = mm]
Future: Search for µs-wide pulses in SN 1987A –25 day search HIPASS GB in < 12 hours. SSI III + ServerNet can mimic CSIRO’s correlator SSI IV: –ES40 + TB disk SSI V: –128 nodes + InfiniBand/ServerNet II?
Conclusions: Clusters are too hard to code for most astronomers. MPIwhat? Breakthroughs are possible with radical increases in computer power.