DiFX Overview Adam Deller NRAO 3rd DiFX workshop, Curtin University, Perth
Outline
  SVN layout
  mpifxcorr: the heart of DiFX
    Data management and flow
    The “Core” of mpifxcorr
    Scaling and writing of visibilities
    I will focus on this segment of DiFX
  Briefly, the surrounding infrastructure:
    vex2difx: input/calc file generator
    calcif2: geometric model generator
    difx2fits: FITS builder
SVN layout
  Two ways to get what you need from SVN:
    Tagged versions: these live under “master_tags” at the top level. Current version is DiFX
    Active versions: branches/difx-1.5, trunk == difx2.0
  Tagged versions: frozen. Recommended.
    Flat structure, i.e. master_tags/mpifxcorr/*, master_tags/vex2difx/*, …
  Active versions: can change!
    Structure includes “branch” underneath: applications/vex2difx/branches/difx-1.5/*, …
Mpifxcorr architecture
  [Diagram: a Master Node, DataStream 1, DataStream 2, … DataStream N, and Core 1, Core 2, … Core M. Labelled flows: timerange and destination; baseband data; visibilities; source data. Labelled buffers: visibility buffer; processing buffer.]
  MPI is used for inter-process communications
  Each data transfer is double buffered
  Large, segmented ring buffer: up to 100s of MB, a few seconds of data or more
FxManager correlation flow
  Start at the requested time and step one block of FFTs at a time until the end of the correlation
  DiFX1.5 implementation: contiguous time. A time is skipped if it has no active “Configuration”
  DiFX2.0 implementation: correlate one scan at a time. The whole scan either matches a Configuration or it does not (contiguous time not required)
  As visibilities are completed, release the lock on the visibility buffer slot (a second thread writes them out)
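To make the DiFX2.0-style stepping concrete, here is a minimal sketch, not the mpifxcorr code itself. The names `Scan` and `subintegrations` are made up for illustration, and times are assumed to be integer milliseconds so the steps add up exactly.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Scan:
    start_ms: int            # scan start, milliseconds from the start of the job
    duration_ms: int         # scan length in milliseconds
    config: Optional[str]    # None means no Configuration matches this scan


def subintegrations(scans: List[Scan], block_ms: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (scan index, start, length) for every block of FFTs to correlate."""
    for index, scan in enumerate(scans):
        if scan.config is None:
            continue  # the whole scan is skipped, so time need not be contiguous
        offset = 0
        while offset < scan.duration_ms:
            # The last block of a scan may be shorter than the others.
            length = min(block_ms, scan.duration_ms - offset)
            yield index, scan.start_ms + offset, length
            offset += length
```

Each tuple it yields corresponds to one "timerange" that the manager would hand out to the Datastreams and a Core.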
Datastream correlation flow
  Two threads: main (receives instructions, sends data) and read (fills the buffer)
  Each maintains a lock on at least one segment of the data buffer at all times
  While data remains, the read thread keeps populating the data buffer until told to stop
  The main thread just dumbly fulfils requests until told to stop by the Manager
  It sends a short flag to the Core if there is no valid data
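The two-thread locking scheme can be sketched as follows. This is a toy model under stated assumptions, not the Datastream implementation: `run_datastream` and `_END` are hypothetical names, "sending" just appends to a list, and each thread takes the lock on the next segment before releasing the one it holds, so it always holds at least one.

```python
import threading

_END = object()  # hypothetical sentinel: marks the end of the data


def run_datastream(source, num_segments=4):
    """Pass items from `source` to a list through a locked, segmented ring buffer."""
    if num_segments < 3:
        raise ValueError("with fewer than 3 segments the two threads can deadlock")
    locks = [threading.Lock() for _ in range(num_segments)]
    data = [None] * num_segments
    sent = []

    # Starting positions: reader on segment 0, sender one segment behind it.
    # A threading.Lock may be released by a thread other than the one that took it.
    locks[0].acquire()
    locks[num_segments - 1].acquire()

    def read_thread():
        seg = 0
        for item in source:
            data[seg] = item                  # fill the segment we hold
            nxt = (seg + 1) % num_segments
            locks[nxt].acquire()              # take the next segment first...
            locks[seg].release()              # ...then let go of the filled one
            seg = nxt
        data[seg] = _END
        locks[seg].release()

    def send_thread():
        seg = num_segments - 1
        while True:
            nxt = (seg + 1) % num_segments
            locks[nxt].acquire()              # blocks until the reader has moved on
            locks[seg].release()
            seg = nxt
            if data[seg] is _END:
                break
            sent.append(data[seg])            # stand-in for the MPI send to a Core
        locks[seg].release()

    threads = [threading.Thread(target=read_thread), threading.Thread(target=send_thread)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sent
```

Because neither thread can step onto a segment the other holds, the sender never overtakes the reader and the reader never overwrites data that has not been sent.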
Datastream correlation flow
  [Diagram: the data buffer, with its units labelled “Segment” and “Send”. Labels: start time, valid samples, num sent, MPI_Send * handle, lock, read thread, send thread. Annotations: FFT = 2x num channels; requested time sent to Core.]
Core correlation flow
  N+1 threads: 1 for send/receive, the rest do the actual correlation (.threads file)
  One buffer slot is processed at a time: each processing thread gets 1/Nth of the FFTs
  More locking is required so the threads can aggregate their results, which are stored in one long array (for ease of sending back)
  Keeps looping until a terminate message is received from FxManager
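A rough sketch of how N processing threads might share one buffer slot. Assumptions: `process_slot` is a hypothetical name, each "FFT" is already reduced to a list of per-channel values, and each thread takes a contiguous 1/Nth of them (the slides do not say how the FFTs are divided).

```python
import threading


def process_slot(ffts, num_threads):
    """Accumulate all FFT results of one slot into a single shared array."""
    num_channels = len(ffts[0])
    slot_result = [0] * num_channels      # the "one long array" sent back later
    slot_lock = threading.Lock()

    def worker(thread_index):
        # Contiguous share of the FFTs for this thread (an assumption).
        start = thread_index * len(ffts) // num_threads
        stop = (thread_index + 1) * len(ffts) // num_threads
        partial = [0] * num_channels      # thread-private accumulator, no lock needed
        for fft in ffts[start:stop]:
            for channel, value in enumerate(fft):
                partial[channel] += value
        with slot_lock:                   # lock only while merging into the shared array
            for channel, value in enumerate(partial):
                slot_result[channel] += value

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return slot_result
```

Accumulating privately and merging once per slot keeps the time spent holding the lock short.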
Under the hood in Core
  Each thread is identical, and has an array of “Mode” objects, which handle the station-based processing for each Datastream
  Mode knows how to unpack the different formats, and then handles fringe rotation, FFT and fractional sample correction
  After telling each Mode to do its thing, the thread grabs the appropriate results and XMACs, as described in the input file
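For readers new to the FX scheme, here is a toy version of the station-based steps followed by the cross-multiply-and-accumulate (XMAC). Everything here is an illustrative assumption: the function names, the sign conventions, the constant fringe rate, and the slow direct DFT used in place of a real FFT library. Unpacking is left out.

```python
import cmath
import math


def station_spectrum(samples, fringe_rate=0.0, frac_delay=0.0):
    """Fringe rotate, transform, and apply a fractional-sample phase slope.

    fringe_rate is in turns per sample, frac_delay in samples (both assumptions).
    """
    n = len(samples)  # FFT length = 2 x number of channels for real-sampled data
    rotated = [s * cmath.exp(-2j * math.pi * fringe_rate * t)
               for t, s in enumerate(samples)]
    spectrum = []
    for k in range(n // 2):
        # Direct DFT for channel k (stand-in for an FFT).
        total = sum(r * cmath.exp(-2j * math.pi * k * t / n)
                    for t, r in enumerate(rotated))
        # A delay of d samples appears as a phase slope exp(-2*pi*i*k*d/n); undo it.
        spectrum.append(total * cmath.exp(2j * math.pi * k * frac_delay / n))
    return spectrum


def xmac(accumulator, spec_a, spec_b):
    """Cross-multiply two station spectra and add the result to the accumulator."""
    for k, (a, b) in enumerate(zip(spec_a, spec_b)):
        accumulator[k] += a * b.conjugate()
```

If the second station sees the same signal one sample late and the correction is set to one sample, the two spectra line up and the accumulated cross-spectrum has zero phase in every channel.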
Core in pictures
  [Diagram: a Core object with a read/send thread, subint slots (baseband data from each telescope, subint visibilities) and processing threads (thread visibilities, Mode objects for each datastream). Inside a Mode, repeated for each subband: baseband data pointer, unpacked data, intermediate data, final data for XMAC. The Mode outputs feed the XMAC.]
Scaling and writing visibilities
  Along with the sub-integrated visibilities, the Core maintains a count of the valid samples that were used, and also sends it to FxManager
  At the FxManager, this count is divided by the expected number of samples and used to adjust the visibility amplitude up (and the weight down) before writing to disk
  At the same time, visibilities can be scaled by mean autocorrelations and predicted Tsys, but this is not recommended with difx2fits
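The weight correction described above amounts to one division. A minimal sketch, assuming a hypothetical helper `scale_visibility` and that a slot with no valid samples is simply written as zero with zero weight:

```python
def scale_visibility(vis, valid_samples, expected_samples):
    """Return (scaled visibility, weight) for one accumulated visibility."""
    if expected_samples <= 0:
        raise ValueError("expected_samples must be positive")
    weight = valid_samples / expected_samples   # fraction of the data that was valid
    if weight == 0:
        return 0j, 0.0                          # nothing valid: flag with zero weight
    return vis / weight, weight                 # amplitude goes up, weight goes down
```

For example, a visibility of 3+4j built from half the expected samples comes out as 6+8j with a weight of 0.5.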
vex2difx: control file generator
  Walter will speak in much more detail on this
  For the running of DiFX, its two important outputs are the .input file (which describes the setup of the correlator) and the .calc file (which describes the antenna positions, scan durations, source coords etc.)
  The input .v2d file must name the vex file that describes the observation; it can also override defaults for numchannels, int time, …
.input file tables
  Common: other files, start/stop etc.
  Configuration: number of channels, anything that might want to change from one scan to the next
  Freq: IF frequencies, bandwidth, sideband
  Datastream: setup for each telescope
  Baseline: what bands to correlate
  Data/network: where to load the data from
.calc file setup
  Not explicitly tables like the .input file
  Start/stop time (not necessarily the same as the .input file)
  Antenna info (xyz, mount type, axis offset, …)
  Source info (RA, dec, [parallax/pm…])
  Scan info (start, duration, source)
  Earth Orientation Parameters (EOPs)
calcif2: geometric model gen.
  calcif2 takes the info given in the .calc file and produces predicted delays and uvws for each datastream
  This is stored in two forms:
    A densely packed series of samples (usually every second). This is the .delay and .uvw file used by DiFX1.5, which interpolates the samples
    A less densely sampled set of polynomials (usually every 2 minutes). This is the .im file used by DiFX2.0 - it’s a bit smaller/more efficient
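The difference between the two model forms can be shown in a few lines. This is only a sketch: the function names are hypothetical, linear interpolation is an assumption (the slides do not say which interpolation DiFX1.5 uses), and the polynomial is assumed to be stored as coefficients in increasing order about a reference time `t0`.

```python
def delay_from_samples(samples, interval, t):
    """Interpolate linearly between delays tabulated every `interval` seconds."""
    index = int(t // interval)
    index = max(0, min(index, len(samples) - 2))   # stay inside the table
    frac = (t - index * interval) / interval
    return samples[index] + frac * (samples[index + 1] - samples[index])


def delay_from_polynomial(coeffs, t0, t):
    """Evaluate c0 + c1*dt + c2*dt**2 + ... with dt = t - t0 (Horner's rule)."""
    dt = t - t0
    result = 0.0
    for coeff in reversed(coeffs):
        result = result * dt + coeff
    return result
```

A handful of coefficients per interval replaces a table entry for every second, which is where the size saving comes from.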
difx2fits: FITS-IDI builder
  The visibilities written out of the correlator are in a very simple format: ASCII header, followed by 32-bit float binary data
  difx2fits takes these visibilities and the metadata from the control files, and builds a FITS-IDI file that can be loaded into e.g. AIPS
  More than one correlation can be combined into a single FITS file if compatible
  Further scaling of amplitudes is done
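To show the general idea of "ASCII header, then 32-bit floats", here is a round-trip sketch. The layout is invented for illustration and is not the actual DiFX record format: the "KEY: value" header lines, the blank line ending the header, the `NUM CHANNELS` key, the little-endian byte order and the real/imaginary ordering are all assumptions.

```python
import io
import struct


def write_record(stream, header, visibilities):
    """Write a hypothetical record: ASCII header lines, blank line, float32 pairs."""
    for key, value in header.items():
        stream.write(f"{key}: {value}\n".encode("ascii"))
    stream.write(b"\n")                                   # assumed end-of-header marker
    for vis in visibilities:
        stream.write(struct.pack("<2f", vis.real, vis.imag))


def read_record(stream):
    """Read back one record written by write_record."""
    header = {}
    while True:
        line = stream.readline().decode("ascii").rstrip("\n")
        if not line:
            break                                         # blank line: header finished
        key, value = line.split(": ", 1)
        header[key] = value
    count = int(header["NUM CHANNELS"])                   # hypothetical header key
    floats = struct.unpack(f"<{2 * count}f", stream.read(8 * count))
    visibilities = [complex(floats[i], floats[i + 1]) for i in range(0, 2 * count, 2)]
    return header, visibilities


# Usage: write one record to memory and read it back.
buffer = io.BytesIO()
write_record(buffer, {"BASELINE": 258, "NUM CHANNELS": 2}, [1.5 - 0.25j, 2j])
buffer.seek(0)
header, visibilities = read_record(buffer)
```

A format this simple is easy to inspect by eye, but it carries little metadata, which is why difx2fits also needs the control files.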
The thorny scaling problem
  At least 5 different amplitude effects need to be corrected (van Vleck, unpack vals, …)
  FITLD in AIPS is hardcoded to do some of these
  Therefore the combination of what is done at FxManager + difx2fits must match what FITLD expects
  Further discussion of this in the Tuesday afternoon session, hopefully
Recap: correlation flow chart
  Step                             In                                     Out
  Generate .v2d file               nothing                                .v2d
  vex2difx                         .v2d                                   .input, .calc
  calcif2                          .calc                                  .delay/uvw/im
  Generate .threads/machine file   ?? local info, .input                  .threads/machines
  Correlate: mpifxcorr             .input/delay/uvw, machines, .threads   .difx (visibilities)
  difx2fits                        .difx, .input, .im                     FITS file