Observational procedures and data reduction Lecture 4: Data reduction process XVII Canary Islands Winter School of Astrophysics: ‘3D Spectroscopy’ Tenerife, Nov-Dec 2005 James E.H. Turner Gemini Observatory

Data reduction process: Overview

● The last lecture gave an overview of the different data reduction stages and what is involved at each step
● This lecture briefly discusses the reduction as a process:
  – Error and data quality propagation
  – File formats
  – A couple of example reduction sequences

Data reduction process: Error propagation

● At the end of the reduction, it's important to have a good estimate of the errors in data values
  – For faint sources, we can estimate the statistical significance of detection
  – Want a measure of the reliability of ages, metallicities etc. derived from line strength indices (Cardiel et al., 1997) or the intrinsic random errors in velocity measurement
  – etc.
● The raw data have quite well-defined errors due to photon statistics and read noise
● After numerous processing stages, it is difficult at best (impossible at worst) to estimate errors directly from the data values

Data reduction process: Error propagation

● Solution
  – Keep track of the errors in data values throughout the processing
  – For each detector pixel, store an error value in a separate error image, alongside the main science data array
  – During each processing step, process the error array in parallel with the science image, to reflect how the errors have changed
● For example, when adding two science images, add the corresponding error images in quadrature
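As a minimal sketch of this parallel bookkeeping (all pixel values below are made up for illustration), adding two images while carrying their variance arrays alongside looks like:

```python
import numpy as np

# Two hypothetical science exposures and their variance images (ADU^2).
sci_a = np.array([[100.0, 200.0], [150.0, 250.0]])
var_a = np.array([[110.0, 210.0], [160.0, 260.0]])
sci_b = np.array([[80.0, 120.0], [90.0, 140.0]])
var_b = np.array([[90.0, 130.0], [100.0, 150.0]])

sci_sum = sci_a + sci_b   # combine the science images
var_sum = var_a + var_b   # errors add in quadrature, so variances simply add
noise = np.sqrt(var_sum)  # 1-sigma error per pixel, when needed
```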

Data reduction process: Error propagation

● Poisson statistics
  – A process where discrete values vary statistically around a well-defined mean, e.g. counting photons, is described by a Poisson distribution: P(n) = μⁿ e⁻ᵘ / n!, with a mean (expected number of photons) of n̄ = μ
  – The standard deviation from the mean is simply σ = √μ
● So when counting photons (really electrons), the statistical error is the square root of the expected number of photons (electrons)
● In practice, estimate the error as the square root of the measured number of electrons, since that is what we know
  – For large μ, the Poisson distribution approaches a Gaussian distribution with σ = √μ
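The σ = √μ relation can be checked numerically from the Poisson probabilities themselves. This small sketch accumulates the first two moments of the distribution term by term (using the recurrence P(n+1) = P(n)·μ/(n+1) to avoid large factorials):

```python
import math

mu = 9.0                   # chosen mean, so sigma should come out as sqrt(9) = 3
p = math.exp(-mu)          # P(0) = exp(-mu)
mean = 0.0
second_moment = 0.0
for n in range(100):       # terms beyond n ~ 100 are negligible for mu = 9
    mean += n * p
    second_moment += n * n * p
    p *= mu / (n + 1)      # P(n+1) = P(n) * mu / (n + 1)
variance = second_moment - mean ** 2
# mean converges to mu and variance to mu, confirming sigma = sqrt(mu)
```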

Data reduction process: Error propagation

● Random sources of measurement error (noise) in the data (estimated from the pixel values):
  – Detector read noise
  – Poisson noise from the science target and sky
  – Poisson noise from detector dark current
● Also have systematic errors introduced during processing
  – E.g. due to inaccuracies in flat fielding
  – Usually present at the level of a few percent; difficult to reduce to zero
  – These effects can be more difficult to account for, but typically the statistical errors are dominant
● If we get enough signal-to-noise with an IFU to worry about errors of a few percent, we're usually going to be pretty happy!

Data reduction process: Error propagation

● Detectors don't usually report exactly 1 count per stored electron
  – Poisson statistics apply to electrons, rather than detector counts (ADUs)
  – The detector 'gain' equals the number of electrons per measured count
    ● Really an inverse gain, but that's what it's called!
    ● Controls how much light can be measured before saturating
    ● Typical gains are a few e-/ADU (CCDs), up to >10 e-/ADU (NIR)
  – To estimate Poisson noise in electrons, multiply the counts by the gain and take the square root
● When adding values, their errors add in quadrature (sum of squares)
  – Therefore when propagating errors, we use the error array to store variance (σ²) values, rather than the actual noise (σ)
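For example, with hypothetical values of 400 ADU measured at a gain of 2 e-/ADU, the shot-noise estimate works out as:

```python
import math

counts_adu = 400.0  # measured signal in a pixel (hypothetical)
gain = 2.0          # detector gain in e-/ADU (hypothetical)

electrons = counts_adu * gain    # Poisson statistics apply to electrons
sigma_e = math.sqrt(electrons)   # 1-sigma shot noise in electrons
sigma_adu = sigma_e / gain       # the same noise expressed in ADU
variance_adu = sigma_adu ** 2    # what would go into the variance array
```

Note that variance_adu works out to counts_adu / gain, which is the ADU form of the Poisson term used on the next slide.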

Data reduction process: Error propagation

● Error propagation procedure
  – Start by creating a variance array containing the square of the detector read noise, σ_read², which affects every pixel independently of the counts
    ● Read noise is counted in electrons, so if we are storing science data values as detector counts, the variance should be (σ_read/gain)²
    ● Alternatively, multiply the science array through by the gain to begin with
  – Estimate the statistical variance in the measured counts, n, for each pixel and add it to the array of read noise values
    ● If working in electrons, the statistical variance to add is just n × gain
    ● In ADUs, the number is n / gain
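Putting the two terms together, a sketch of building the initial variance array (the read noise, gain and counts below are invented for illustration):

```python
import numpy as np

read_noise_e = 4.0  # read noise in electrons (hypothetical)
gain = 2.0          # detector gain in e-/ADU (hypothetical)

counts = np.array([[400.0, 900.0],
                   [100.0, 2500.0]])  # raw frame in ADU

# Initial variance array in ADU^2: a constant read-noise term,
# (sigma_read / gain)^2, plus the Poisson term n / gain per pixel.
var = (read_noise_e / gain) ** 2 + counts / gain
```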

Data reduction process: Error propagation

– At each subsequent reduction step, manipulate the variance array according to the operation being performed on the science array:
  ● When adding or subtracting images, their errors add in quadrature
    – Simply add the variance arrays for each image
  ● When scaling an image (multiplying or dividing by a number), the error is scaled accordingly
    – Multiply the variance by the square of the scaling factor
  ● When multiplying/dividing images, their fractional errors add in quadrature
    – Divide each input variance array by the square of the corresponding image, add the results together and multiply by the square of the final science image to get the final variance image
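These three rules can be written out directly; the array values below are arbitrary illustrative numbers:

```python
import numpy as np

sci_a = np.array([10.0, 20.0]); var_a = np.array([1.0, 4.0])
sci_b = np.array([2.0, 4.0]);   var_b = np.array([0.04, 0.16])

# Add or subtract images: variances add
var_sum = var_a + var_b

# Scale by a constant k: variance scales by k**2
k = 3.0
sci_scaled = k * sci_a
var_scaled = k ** 2 * var_a

# Multiply images: fractional errors add in quadrature
sci_prod = sci_a * sci_b
var_prod = sci_prod ** 2 * (var_a / sci_a ** 2 + var_b / sci_b ** 2)
```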

Data reduction process: Error propagation

● For more complicated operations on the science data, i.e. some arbitrary function f(n):
  – Take the first derivative of f(n), to estimate how the output values vary with small changes in the input values
  – Multiply the variance by (df/dn)² at the appropriate value of n
  – At the end of the data reduction process, can take the square root of the variance array to get the final noise values
● Resampling
  – In the raw data, each pixel has an independent statistical error
  – If resampling causes smoothing, the errors in different pixels may become correlated
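A small worked example, taking f(n) = ln(n) as an arbitrary illustrative function:

```python
import math

n = 100.0      # measured pixel value
var_n = 100.0  # its variance (e.g. Poisson: var = n in electrons)

# For f(n) = ln(n), df/dn = 1/n, so the output variance is
# var_f = (df/dn)**2 * var_n
var_f = (1.0 / n) ** 2 * var_n
sigma_f = math.sqrt(var_f)  # final noise value, if needed
```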

Data reduction process: Error propagation

– One could attempt to propagate a separate covariance matrix
  ● Covariance = expected value of the product of deviations from the means
– Usually software doesn't track covariance, but it's important to be aware that the variance numbers may not be exactly correct after resampling
  ● E.g. linear interpolation at the midpoint between 2 samples is an average
    – The error on the result is therefore reduced by √2
    – The number of pixels hasn't changed, but each pixel has higher S/N!
    – However, summing 2 of the resampled pixels does not reduce the error by a further factor of √2, because the errors are no longer independent
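The midpoint-interpolation case can be worked through explicitly; the variance value below is arbitrary:

```python
# Three independent raw pixels a, b, c, each with variance sigma2.
sigma2 = 4.0

# Midpoint interpolation gives m1 = (a + b)/2 and m2 = (b + c)/2.
var_m = (sigma2 + sigma2) / 4.0  # = sigma2/2: noise reduced by sqrt(2)

# Naive variance of m1 + m2, pretending the two are independent:
var_naive = 2.0 * var_m          # = sigma2

# True variance of m1 + m2 = (a + 2b + c)/2, since b appears in both:
var_true = (sigma2 + 4.0 * sigma2 + sigma2) / 4.0  # = 1.5 * sigma2
```

The true variance exceeds the naive sum, which is exactly the covariance term that per-pixel variance arrays cannot capture.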

Data reduction process: Data quality

● As well as storing variance values alongside each science image, it is useful to store data quality information
  – Use an integer-valued array to flag which pixels are good, bad, noisy etc. in the main science array
    ● Each bit of the integer represents yes/no for a particular defect, allowing more than one problem to be recorded for a particular pixel
    ● Different pixel values indicate, for example:
      – Good pixel
      – Cosmic ray
      – Saturated pixel
      – Hot pixel (etc.)
    ● The convention for the values depends on the processing software
  – Useful for masking out values appropriately at each reduction stage
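A sketch of such a bit-flag scheme; the specific bit assignments here are invented for illustration, since each pipeline defines its own convention:

```python
# Hypothetical data-quality bit assignments (conventions vary by pipeline).
GOOD = 0
BAD_PIXEL = 1 << 0  # 1
COSMIC = 1 << 1     # 2
SATURATED = 1 << 2  # 4

# DQ values for four pixels; one pixel carries two flags at once.
dq = [GOOD, COSMIC, COSMIC | SATURATED, BAD_PIXEL]

# Test for a particular defect with a bitwise AND:
cosmic_hits = [(v & COSMIC) != 0 for v in dq]

# Mask of completely clean pixels:
good_mask = [v == GOOD for v in dq]
```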

Data reduction process: File storage format

● Data are typically stored in FITS files
  – Flexible Image Transport System, overseen by a NASA technical panel
    ● Standard definition document available at:
  – Each single FITS file can contain:
    ● One or more N-dimensional image arrays
    ● ASCII header information, using keyword = value pairs
      – Header keywords can have values of different data types
      – E.g. OBJECT = 'NGC1068' or EXPTIME = 120
    ● One or more binary tables
      – Using named columns (e.g. XCOORD, YCOORD) and mixed data types, rather than a simple array of numbers
    ● Other, less common formats of data
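As a rough illustration of the keyword = value layout, here is a deliberately simplified sketch of writing a header card; the real FITS standard has further rules (value indicator position, fixed-format numeric fields, comment fields, keyword character set) that this does not implement:

```python
# Simplified sketch of a FITS header card: an 80-character record with an
# 8-character keyword, "= ", then the value (strings quoted, numbers
# right-justified). NOT a complete implementation of the standard.
def make_card(keyword, value):
    if isinstance(value, str):
        text = "'%s'" % value        # strings are quoted
    else:
        text = str(value).rjust(20)  # numbers right-justified
    return ("%-8s= %s" % (keyword, text)).ljust(80)

card = make_card("EXPTIME", 120)
```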

Data reduction process: File storage format

– Within a FITS file, data can be divided into separate extensions
  ● The primary header contains keywords relevant to the whole file
    – E.g. object name, telescope pointing, airmass, filter, central wavelength
  ● Each image, binary table etc. has its own numbered/named extension
    – Contains both the data and any extra header keywords that are only relevant to that dataset
  ● Example FITS file structure during processing:

    EXT#  EXTTYPE   EXTNAME        EXTVER  DIMENS  BITPIX  INH  OBJECT
    0               trnS S0166_s           16                Galaxy
    1     BINTABLE  MDF                    32x
    2     IMAGE     SCI            1       32x             F    Galaxy
    3     IMAGE     VAR            1       32x             F    Variance
    4     IMAGE     DQ             1       32x             F    DQ
    5     IMAGE     SCI            2       32x             F    Galaxy
    6     IMAGE     VAR            2       32x             F    Variance
    7     IMAGE     DQ             2       32x             F    DQ
    [ … etc … ]

Data reduction process: Reduced data formats

● Row-stacked spectra (etc.)
  – One option is just to work with extracted spectra in 2D
  – Limited to spectral analysis (e.g. velocity measurement, not imaging)
  – Still have to create a 2D spatial map from the results afterwards
● Datacube
  – A 3D image array, with two spatial axes and one wavelength axis
  – Easy to read and manipulate in IRAF, IDL, Python etc.
  – Usually requires resampling the processed IFU data onto a 3D grid
    ● Except for IFUs that have a square lens grid to begin with
  – If we want to oversample after interpolating, to produce 'smoother' images (good for visualization etc.), the file sizes can become quite large
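Once the data are in a cube, extracting spectra and images is just array indexing. A toy example (the axis ordering here is one common convention, not universal):

```python
import numpy as np

# A tiny mock datacube with axes (wavelength, y, x); real cubes are larger
# and the axis order depends on the software that wrote them.
cube = np.zeros((5, 3, 4))  # 5 wavelength planes, 3x4 spatial pixels
cube[2, 1, 2] = 7.0         # put some flux at plane 2, spaxel (y=1, x=2)

spectrum = cube[:, 1, 2]    # 1D spectrum at spatial pixel (y=1, x=2)
image = cube[2, :, :]       # monochromatic 2D image at wavelength plane 2
```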

Data reduction process: Reduced data formats

● Euro3D format
  – Both image data (1D spectra) and information describing the spectra are stored in a binary table
  – Native format for the 'E3D' visualization tool (which can also read datacubes)
  – Closer to the raw data than a cube: attempts to avoid resampling until it is necessary, during visualization or analysis
  – Minimal file size, like row-stacked spectra, since there is no interpolation until it is needed
  – Requires having special software/libraries to work with the format

Data reduction process Reduced data formats

Data reduction process: Example reduction sequence (optical)

● GMOS IFU (optical fibre) data, using the Gemini IRAF package

Data reduction process: Example reduction sequence (optical)

Data reduction process: Example reduction sequence (optical)

Data reduction process: Example reduction sequence (infrared)

● GNIRS IFU (image slicer) data, using the Gemini IRAF package

Data reduction process: Example reduction sequence (infrared)

Data reduction process: Summary

● The data reduction process is typically based on FITS files, with one or more image extensions
● Propagating error and data quality arrays through the process is helpful for understanding how accurate the results are
● The final data format for analysis depends on the application, software, user preference etc.
  – Euro3D format, datacubes or in some cases just row-stacked spectra
● The example reduction sequences for optical fibre data and NIR image slicer data give an idea of how the steps are ordered for science data