Published by Gregory Greenwell. Modified over 10 years ago.
1
Low-Frequency Pulsar Surveys and Supercomputing
Matthew Bailes
2
Outline:
Baseband Instrumentation
MultiBOB
MWA survey vs PKSMB survey
Data rates
CPU times
Low-Frequency Pulsar Monitoring
The Future Supercomputers
3
Pulsar “Dedispersion”: Incoherent
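Incoherent dedispersion, as named on this slide, can be sketched as a shift-and-sum over filterbank channels. The example below is a minimal illustration, not the survey code: it uses the standard cold-plasma delay (4.148808 ms per unit DM at 1 GHz, scaling as frequency^-2), while the 1.2-1.5 GHz band, the 16 channels and the DM of 50 pc/cc are assumptions chosen only to make a small toy. The 64 µs sampling matches the survey slides.

```python
K_DM_MS = 4.148808  # dispersion constant: ms of delay per (pc/cc) at 1 GHz

def dispersion_delay_ms(dm, freq_ghz, ref_ghz):
    """Arrival delay (ms) at freq_ghz relative to ref_ghz for dispersion measure dm."""
    return K_DM_MS * dm * (freq_ghz ** -2 - ref_ghz ** -2)

def dedisperse(filterbank, freqs_ghz, dm, tsamp_ms):
    """Shift each channel back by its delay (whole samples) and sum over frequency.

    filterbank is a list of channels, each a list of samples. Indexing wraps
    around the end of the block, which is acceptable only for a toy example.
    """
    nsamp = len(filterbank[0])
    ref = max(freqs_ghz)
    series = [0.0] * nsamp
    for channel, freq in zip(filterbank, freqs_ghz):
        shift = round(dispersion_delay_ms(dm, freq, ref) / tsamp_ms)
        for i in range(nsamp):
            series[i] += channel[(i + shift) % nsamp]
    return series

# Toy data (all values hypothetical): one pulse emitted at sample t0,
# arriving later in the lower-frequency channels.
nchan, nsamp, tsamp_ms, dm, t0 = 16, 2048, 0.064, 50.0, 100
freqs = [1.2 + 0.3 * c / (nchan - 1) for c in range(nchan)]
data = [[0.0] * nsamp for _ in range(nchan)]
for c, f in enumerate(freqs):
    lag = round(dispersion_delay_ms(dm, f, max(freqs)) / tsamp_ms)
    data[c][(t0 + lag) % nsamp] = 1.0

series = dedisperse(data, freqs, dm, tsamp_ms)
peak_sample = series.index(max(series))
```

After dedispersion at the correct DM, every channel contributes to the same sample, so the summed series peaks at `t0`. The residual smearing inside each channel is not removed by this method.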
4
Coherent Dedispersion
Unresolved on µs timescales
From young or millisecond pulsars
Power-law distribution of energies
PSR J0218+4232
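Coherent dedispersion works on the raw voltages rather than on detected power: the band is Fourier transformed, multiplied by the inverse of the interstellar "chirp", and transformed back. The sketch below uses the commonly quoted form of the chirp phase, 2π D DM f² / (f0² (f0 + f)), with f the offset from the band centre f0. The sign depends on FFT and sideband conventions, so this is an illustration under those assumptions, not the dspsr implementation. The 16 MHz band, 1400 MHz centre frequency and DM of 50 pc/cc are hypothetical.

```python
import numpy as np

D_CONST = 4.148808e15  # dispersion constant for frequencies in Hz and DM in pc/cc

def chirp(nsamp, bw_hz, f0_hz, dm):
    """Phase-only transfer function of the dispersive medium across the band."""
    f = np.fft.fftfreq(nsamp, d=1.0 / bw_hz)  # offset from band centre, in Hz
    phase = 2.0 * np.pi * D_CONST * dm * f ** 2 / (f0_hz ** 2 * (f0_hz + f))
    return np.exp(1j * phase)

# Hypothetical setup: complex baseband sampled at the bandwidth.
nsamp, bw_hz, f0_hz, dm = 2 ** 17, 16e6, 1400e6, 50.0
h = chirp(nsamp, bw_hz, f0_hz, dm)

impulse = np.zeros(nsamp, dtype=complex)
impulse[nsamp // 2] = 1.0  # an unresolved pulse

dispersed = np.fft.ifft(np.fft.fft(impulse) * h)        # smeared by the medium
recovered = np.fft.ifft(np.fft.fft(dispersed) * np.conj(h))  # chirp removed

smeared_peak = np.abs(dispersed).max()
```

Because the filter is a pure phase rotation, applying its complex conjugate restores the impulse exactly (to rounding error), which is why pulses that are unresolved on microsecond timescales remain visible after processing.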
5
1022+1001 Pulsar Timing (Kramer et al.)
6
CPSR2 Timing (Hotan, Bailes & Ord)
7
Swinburne Baseband Recorders etc
1998: Canadian S2 to computer (16 MHz x 2), 100K system + video tapes
2000: CPSR, 20 MHz x 2 + DLT7000 drives x 4
2002: CPSR2, 128 MHz x 2 + real-time supercomputer (60 cores)
2006: DiFX (Deller, Tingay, Bailes & West) software correlator (ATNF adopted)
2007: APSR, 1024 MHz x 2 + real-time supercomputer (160 cores)
2008: MultiBOB, 13 x 1024 ch x 64 µs + fibre + 1600-core supercomputer
8
dspsr software
Mature
Delivers < 100 ns timing on selected pulsars
Total power estimation every 8 µs with RFI excision
Write a “loader”
Can do:
  Giant pulse work
  Pulsar searching (coherent filterbanks)
  Pulsar timing/polarimetry
  Interferometry with pulsar gating
9
PSRDADA (van Straten)
psrdada.sourceforge.net
Generic UDP data capture system (APSR/MultiBOB)
Ring Buffer(s)
Can attach threads to fold/dedisperse etc
Hierarchical buffers
Shares available CPU resources/disk
Web-based control/monitoring
Free! + hooks to dspsr & psrchive.
10
APSR
Takes 8 Gb/s voltages
Forms:
  16 x 128 channels (with coherent dedispersion)
  4 Stokes, umpteen pulsars
Real-time fold to DM=250 pc/cc.
O(100) Ops/sample
Sustaining >>100 Gflops
~100K computers.
June 2008:
  192 MHz working @ 4 bits
  768 MHz working @ 2 bits
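The APSR figures can be checked with a few lines of arithmetic. The sketch below assumes real (Nyquist) sampling at twice the bandwidth and two polarisations, which is one reading of "1024 MHz x 2" on the recorder timeline; the function names are illustrative only.

```python
def voltage_rate_gbps(bw_mhz, nbits, npol=2):
    """Raw voltage data rate in Gb/s, assuming 2*BW real samples/s per polarisation."""
    return 2 * bw_mhz * 1e6 * npol * nbits / 1e9

def sustained_gflops(bw_mhz, ops_per_sample, npol=2):
    """Sustained operation rate in Gflops for a given cost per input sample."""
    return 2 * bw_mhz * 1e6 * npol * ops_per_sample / 1e9

full_rate = voltage_rate_gbps(1024, 2)     # full 1024 MHz band at 2 bits
june_4bit = voltage_rate_gbps(192, 4)      # "192 MHz working @ 4 bits"
june_2bit = voltage_rate_gbps(768, 2)      # "768 MHz working @ 2 bits"
full_ops = sustained_gflops(1024, 100)     # O(100) ops/sample
```

Under these assumptions the full band at 2 bits gives about 8.2 Gb/s, matching the quoted 8 Gb/s, and 100 operations per sample gives roughly 410 Gflops, consistent with ">>100 Gflops". The two June 2008 modes come out near 3.1 and 6.1 Gb/s.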
11
Coherent Dedispersion BW/time
[Plot: coherently dedispersed bandwidth (BW) against year, 1998-2008. Points marked at 16, 20, 128 and 1024; annotations (100K) and (300K).]
12
Coherent Dedispersion
Now “trivial”
FFT ease ~ B -2 / 3
13
MultiBOB
High Resolution Universe Survey (PALFA of the South)
Werthimer’s iBOB boards
1024 channels, down to 10 µs sampling
Two pols
FPGA coding hard…
Use software gain equalizer/summer
~5 MB/s per beam
1 Gb/s fibre to Swinburne (>1000 km fibre)
Real time searching!
14
New PKS MB Survey:
Bailes: 13 beams, 9 minutes/pointing, 1024 channels, 300 MHz BW, 64 µs sampling, +/- 15 deg
Kramer: 13 beams, 70 minutes/pointing, 1024 channels, 300 MHz BW, 64 µs sampling, +/- 3.5 deg
Johnston: 13 beams, 4.5 minutes/pointing, 1024 channels, 300 MHz BW, 32 µs sampling, the rest
15
MWA
Samples: takes (24 x 1.3 MHz = 32 MHz) x 2 x 512
“Just” 32 GB/s (64 Gsamples/s)
FFTs it (5 N log2 ops/pt = 2.2 Tflops)
X: multiplies & adds, (512)*256*B*4 = 16 TMACs
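The MWA numbers on this slide follow from simple counting. The sketch below reads "x 2 x 512" as two polarisations on 512 antenna tiles and assumes real Nyquist sampling. The 4-bit sample size and the 128-point FFT length are not stated on the slide; they are assumptions chosen because they reproduce the quoted 32 GB/s and roughly 2.2 Tflops.

```python
import math

def fx_budget(n_inputs=512, pols=2, bw_hz=32e6, fft_len=128, bits=4):
    """Rough sample, byte, FFT and cross-multiply rates for an FX correlator."""
    samples_per_s = 2 * bw_hz * pols * n_inputs        # real Nyquist sampling
    bytes_per_s = samples_per_s * bits / 8             # assumed bits per sample
    # An N-point FFT costs about 5 N log2(N) operations, i.e. 5 log2(N) per sample.
    f_flops = 5 * math.log2(fft_len) * samples_per_s
    # The slide's X-stage estimate: (512) * 256 * B * 4 multiply-accumulates.
    x_macs = n_inputs * 256 * bw_hz * 4
    return samples_per_s, bytes_per_s, f_flops, x_macs

samples_per_s, bytes_per_s, f_flops, x_macs = fx_budget()
```

With these inputs the sample rate is about 65.5 Gsamples/s, the data rate about 32.8 GB/s, the F stage about 2.3 Tflops and the X stage about 16.8 TMACs, in line with the rounded values on the slide.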
16
Sensitivity:
~3-5x PKS
32 vs 288 MHz
350 vs 25 K
700 vs 0.6 deg^2 (folded factor)
17
PKS vs MWA
G ~ 3-5 x better
Tsys ~ 14 x worse ?
B^1/2 ~ 3 x worse
Flux ~ 25 x better (1400 vs 200 MHz)
t^1/2 ~ 32 x better
~ Parity
Single Pulse work ~ Comparable
Coherent search ~ 32x improvement!
But: There is a limit to the time you can observe a pulsar!
4m vs 144m -> 5x deeper.
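The comparison can be reproduced from the radiometer scaling, in which sensitivity goes as G x sqrt(B t) / Tsys, multiplied by the flux gain at low frequency. The sketch below is one reading of the slide, not the author's calculation: it takes G = 4 as the midpoint of "3-5x", uses the temperatures and bandwidths from the sensitivity slide, and assumes the integration-time advantage comes from the ratio of fields of view (700 vs 0.6 deg^2).

```python
import math

def mwa_over_pks(gain=4.0, tsys_mwa=350.0, tsys_pks=25.0,
                 bw_mwa=32.0, bw_pks=288.0, flux_ratio=25.0,
                 fov_mwa=700.0, fov_pks=0.6):
    """Return (instantaneous sensitivity ratio, dwell-time factor) for MWA vs PKS."""
    tsys_factor = tsys_pks / tsys_mwa            # ~1/14
    bw_factor = math.sqrt(bw_mwa / bw_pks)       # ~1/3
    instantaneous = gain * tsys_factor * bw_factor * flux_ratio
    dwell = math.sqrt(fov_mwa / fov_pks)         # longer time per sky position
    return instantaneous, dwell

instantaneous, dwell = mwa_over_pks()
# Spectral index implied by a 25x flux gain between 1400 and 200 MHz.
implied_index = math.log(25.0) / math.log(1400.0 / 200.0)
```

The instantaneous factors multiply to about 2.4, which is order unity and consistent with "~ Parity", while the dwell-time factor is about 34, close to the quoted "~32 x". The 25x flux gain corresponds to a spectral index of roughly 1.65.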
18
Scattering, b=0: 1, 10, 100, 1000 ms
19
Scattering, b=5 deg: 1, 10, 50, 100 ms
20
b=30: 0.5, 1 ms
21
Search instrumentation?
[Block diagram. Stage labels: Volts (32 MHz), FX, Spectra, Visibilities, Grid, uv, 2D FFT^-1, FBanks, Dedisp, Spectra, FFT, Fold, Pulsars. Data-rate labels: 36 GB/s; 30 GB/s (5 bits x 512); 200 GB/s (32 bits x 512); 1024 GB/s (32 bits x 512 x 256); 600 GB/s (x 192^2); 36 GB/s; <1 bit/s. Sections marked “Correlator” and “Us ? ?”.]
22
Search Timings
36,000 “coherent beams”: (768m/4m = 192)^2
36 gigapixels/s
Dedisperse/CPU core: Gigapixel/120s
36 x 120 = 4320 cores = 500 machines = 250 kW
N FFT = 36,000 * 1024 (DMs)/8192 = 4608 FFTs/sec
Seek (3s / 8192 x 1024 pt FFT)
14,000 cores ~ 1800 machines = MW. (M$/yr)
23
Supercomputing @ Swinburne
The Green Machine
installed May/June 2007
185 Dell PowerEdge 1950 nodes
2 quad-core processors (Clovertown: Intel Xeon 64-bit 2.33 GHz)
16 GB RAM
1 TB disk -> 300 TB total
1640 cores/14 Tflops
dual channel gigabit ethernet
CentOS Linux OS
job queue submission
20 Gb infiniband (Q1 2008)
83 kW vs. 130 kW cooling
Machines: ~1.2M
Fuel: ~100K/yr
24
Search Times:
Depend only upon: Npixels x Nchans x Tsamp^-1
Requires: No acceleration trials
PSR J0437-4715: in 8192 s, small width from acceleration
25
Search Timings (32x32 tiles)
36000 -> 1024 “coherent beams”
36 -> 1 gigapixels/s
Dedisperse/core: Gigapixel/120s
1 x 120 = 120 cores = 15 machines = 7 kW
N FFT = 1024 * 1024 (DMs)/8192 (s/FFT) = 128 FFTs/sec
Seek (3s / (8192 x 1024) pt FFT)
378 cores ~ 50 machines = 25 kW.
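Both search-timing slides use the same arithmetic, which can be written as one small function. The sketch below takes the per-core costs from the slides (120 core-seconds per gigapixel to dedisperse, 3 s per 8192 x 1024-point FFT search). The 8 cores per machine is taken from the Green Machine's dual quad-core nodes, and the 0.5 kW per machine is an assumption inferred from "500 machines = 250 kW". The function and key names are illustrative.

```python
import math

def search_budget(n_beams, gigapixels_per_s, n_dms=1024, fft_seconds=8192,
                  dedisp_core_s_per_gpix=120, seek_s_per_fft=3,
                  cores_per_machine=8, kw_per_machine=0.5):
    """Cores, machines and power needed to keep up with a real-time search."""
    dedisp_cores = gigapixels_per_s * dedisp_core_s_per_gpix
    ffts_per_s = n_beams * n_dms / fft_seconds     # one FFT per beam per DM per block
    seek_cores = ffts_per_s * seek_s_per_fft
    dedisp_machines = math.ceil(dedisp_cores / cores_per_machine)
    seek_machines = math.ceil(seek_cores / cores_per_machine)
    return {
        "dedisp_cores": dedisp_cores,
        "dedisp_machines": dedisp_machines,
        "dedisp_kw": dedisp_machines * kw_per_machine,
        "ffts_per_s": ffts_per_s,
        "seek_cores": seek_cores,
        "seek_machines": seek_machines,
        "seek_kw": seek_machines * kw_per_machine,
    }

full_array = search_budget(n_beams=192 ** 2, gigapixels_per_s=36)
tiles_32x32 = search_budget(n_beams=1024, gigapixels_per_s=1)
```

For the full array this gives 4320 dedispersion cores, 4608 FFTs/s and 13,824 search cores (about 1700 machines, approaching a megawatt). For the 32x32-tile case it gives 120 dedispersion cores, 128 FFTs/s and 384 search cores on 48 machines. The slide quotes 378 cores and ~50 machines, so the small difference is presumably rounding on the slide.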
26
RRATs
Log N - Log S (helps with long pointings…)
1000 x integration time.
Maybe good RRAT finder.
27
Monitoring: Monitoring?
28
Monitoring:
29
Build Your Own Telescope?
May be cheaper to build a dedicated PSR telescope than attempt to process everything from existing telescopes!
32x32 tile: (2D FFT - 1D FFT - dedisperse - FFT)
~2M telescopes
~2M “beamformer/receivers”
~1M correlator
~1M supercomputer
~1M construction
~7-8M total
30
Next-Gen Supercomputers (IO or Tflops?)
Infiniband 20 Gb (40 Gb)
288 port switch, ~10 Tb/s IO capacity (1-2K/node)
Teraflop CPU capacities/node (140 Gflops now)
Teraflop server or Tflop GPU? 10 GB/s vs 76 GB/s
Power (0.1 W/$): 2M = 200 kW
31
Architecture (2011??):
[Diagram: two 288-port, 40 Gb/s switches; labels: FX, 144 Tflops, 300K, ~1M.]
32
Summary:
Strong motivation for multiple (~100) tied array beams
PSRs/deg^2
Surveys only possible with compact configurations at present
Future supercomputers may allow search even with MWA-like telescopes