WFIRST Photo-z Study
Peter L. Capak, Andreas Faisst, Shooby Hemmati, Dan Masters, Nathaniel Stickley
Caltech, Infrared Processing and Analysis Center
Part I – Calibration Overview P. Capak
Why Do Photometric Redshifts Work?
- Galaxies are very similar in many ways
  - Same physics
  - They cluster in color space
- Photo-z is a map from the color-space manifold to redshift
  - High dimensionality
  - Complex manifold
- There are many ways to do this mapping
- Error distributions and systematics are important
Rahman et al. 2015
Mapping the Color-Redshift Relation
- Can visualize the 7-dimensional color data
- Test whether the analytic model fits
- Test where the data-driven model is valid
- Target grey areas with spectroscopy
[Figure panels: Analytic (Template) Model; Data (Spectra) Driven Model]
- Now we can see where and how well our models work
- Need much more data to fill in the data-driven model
  - Will be applying for NASA Keck time to do that
- Essential for the success of Euclid and WFIRST
Mapping the Color-Redshift Relation
- The SOM (self-organizing map) also encodes P(color) and cosmic variance
- P(color) is important because it lets you use the whole sample to estimate the likelihood of degeneracy
- Cosmic variance is encoded in the density at the level of the photometry
- A typical cell has dz ~ 0.02
[Figure: mean occupation by cell]
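The idea of a SOM that carries both a redshift and an occupation for every cell can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration: the toy 7-color catalog, its invented color-redshift relation, the 10x10 grid, and the function names `train_som` and `best_cell`. It is not the pipeline used for the results on these slides.

```python
import numpy as np

def train_som(data, shape=(10, 10), n_iter=5000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rectangular self-organizing map with simple online updates."""
    rng = np.random.default_rng(seed)
    ny, nx = shape
    # Grid coordinates of every cell, used by the neighbourhood kernel.
    grid = np.stack(
        np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij"), axis=-1
    ).reshape(-1, 2)
    # Initialise the cell weight vectors from randomly chosen training objects.
    weights = data[rng.integers(0, len(data), ny * nx)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5      # shrinking neighbourhood
        x = data[rng.integers(0, len(data))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching cell
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def best_cell(weights, data):
    """Index of the closest SOM cell for every object."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Toy catalog: 7 colors that vary smoothly with redshift (invented relation).
rng = np.random.default_rng(1)
z = rng.uniform(0.0, 3.0, 2000)
colors = np.stack([np.sin(z + k) for k in range(7)], axis=1)
colors += rng.normal(0.0, 0.05, colors.shape)

weights = train_som(colors)
cells = best_cell(weights, colors)
n_cells = weights.shape[0]

occupation = np.bincount(cells, minlength=n_cells)       # sources per cell
p_color = occupation / occupation.sum()                   # empirical P(color)
z_sum = np.bincount(cells, weights=z, minlength=n_cells)
mean_z = np.where(occupation > 0, z_sum / np.maximum(occupation, 1), np.nan)
```

In this sketch `p_color` plays the role of P(color) and `mean_z` is the redshift attached to each cell; empty cells are left as NaN so that they stand out as regions without calibration data.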
Why Do Photometric Redshifts Work?
- Weight photometry by the likelihood of each data point
- Weight models by the likelihood that they are represented in the data
- Using only u, g, r, i, z, Y, J, H photometry
- Significant improvement over raw template fitting
  - Outlier fraction: 10.2% -> 1.5%
  - Sigma_NMAD: 0.03 -> 0.02
  - Bias = 0.004
[Figure: photometric vs. spectroscopic redshift, 30-band and 8-band panels]
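The three quality numbers quoted above can be computed as in the following sketch. The exact definitions used for the slide are not stated, so the conventions here are assumptions: residuals are normalized by (1 + z_spec), the bias is the median residual, and outliers are objects with a normalized residual above 0.15. The function name `photoz_metrics` is hypothetical.

```python
import numpy as np

def photoz_metrics(z_phot, z_spec, outlier_cut=0.15):
    """Bias, sigma_NMAD and outlier fraction of photometric redshifts."""
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    dz = (z_phot - z_spec) / (1.0 + z_spec)       # normalized residual
    bias = np.median(dz)
    # Normalized median absolute deviation, scaled to match a Gaussian sigma.
    sigma_nmad = 1.4826 * np.median(np.abs(dz - np.median(dz)))
    outlier_fraction = np.mean(np.abs(dz) > outlier_cut)
    return {"bias": bias, "sigma_nmad": sigma_nmad,
            "outlier_fraction": outlier_fraction}

# Toy example: three perfect estimates and one catastrophic failure.
metrics = photoz_metrics([0.5, 1.0, 1.5, 5.0], [0.5, 1.0, 1.5, 2.0])
```

Because sigma_NMAD is built from medians, the single catastrophic failure in the toy example changes the outlier fraction but not the scatter.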
Application to WFIRST/LSST
- Used CANDELS data to simulate a real WFIRST lensing sample
  - Interpolated to LSST/WFIRST bands based on existing CANDELS photometry
  - Matches actual CFHT-LS + VISTA data
- Collected ~50k high-quality redshifts across the CANDELS + COSMOS + SXDS + VVDS fields
- A single CANDELS field samples only a small fraction of the CFHT-LS + VISTA color space
[Figure: Z-H and G-Z colors; panels: COSMOS (measured), GOODS-N (interpolated)]
Application to WFIRST/LSST
- Euclid samples ~90% of the LSST/WFIRST color space
- WFIRST will require more z > 3 spectroscopy than Euclid
[Figure labels: G-H, H, G-z, I]
Part II – WFIRST Specific Tests D. Masters
WFIRST Specific Tests
- Need an analog to WFIRST
- CANDELS is the closest data set
  - It is small (0.2 sq deg) and heterogeneous
  - Need to match it to LSST + WFIRST
Color distribution to Euclid depth
- The galaxy distribution in multicolor space is (still) limited and measurable to Euclid depth!
[Figure panels, g-r vs. g-i: SDSS + COSMOS; SDSS]
11/14/2019
Moving CANDELS catalogs to the LSST+WFIRST filter set
- ~30,000 galaxies after applying the WFIRST lensing criteria
[Figure label: LSST+WFIRST]
SOM on WFIRST+LSST catalog
SOM on WFIRST+LSST catalog
- The SOM is trained well.
Redshifts on WFIRST SOM
EUCLID, C3R2, the rest…
[Figure label: riz > 25]
EUCLID, C3R2, the rest…
EUCLID, C3R2, the rest…
- WFIRST galaxies that will not be in the Euclid sample (riz > 25) fill ~50% of the WFIRST color space
- However, less than 5% of the WFIRST color space has no Euclid sources with similar colors; that part will need further spectroscopy
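A coverage fraction of this kind can be estimated by counting SOM cells, as in the sketch below. It assumes every galaxy already has a best-matching cell index and a flag saying whether it would enter the brighter (Euclid-like) sample; the function name `uncovered_fraction` and the toy arrays are hypothetical.

```python
import numpy as np

def uncovered_fraction(cells, in_bright_sample, n_cells):
    """Fraction of occupied cells that contain no bright-sample galaxy."""
    cells = np.asarray(cells)
    in_bright_sample = np.asarray(in_bright_sample, dtype=bool)
    occupied = np.bincount(cells, minlength=n_cells) > 0
    covered = np.bincount(cells[in_bright_sample], minlength=n_cells) > 0
    return (occupied & ~covered).sum() / occupied.sum()

# Toy example: six galaxies in five cells; cell 4 holds only a faint galaxy.
cells = np.array([0, 1, 2, 3, 3, 4])
in_bright_sample = np.array([True, True, True, False, True, False])
frac = uncovered_fraction(cells, in_bright_sample, n_cells=6)
```

Cell 3 contains a faint galaxy but also a bright one with similar colors, so it counts as covered; only cell 4 would need new spectroscopy in this toy case.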
Part III – Simulating Spectra P. Capak
Simulating Spectra
- An essential part of calibration is the spectroscopy
- This is hard to simulate correctly due to the wide variety of galaxies and conditions
- We developed a data-driven model to do this
  - Based on SPHEREx mission simulations
  - Designed to reproduce emission line strengths and evolution as well as colors
Stickley et al. 2016
Simulating Spectra
- Using SED fits from the CANDELS -> WFIRST conversion to estimate spectra
- Based on SPHEREx interpolative simulations
- Using Brown et al. templates
- Predict actual emission line properties to 0.2 dex
- Continuum SNR to 20% for Keck
Simulating Spectra
- Working for Keck instruments
- Adapted for the WFIRST IFU and grism
- Being used to determine how much and what spectroscopy is required
IFU Redshifts
- Weak lines at > 0.8 μm
- Can be done with the IFC
- 85% of WFIRST sample (17% of total)
- Also “hard” with the grism
- Need exposures of many tens of hours
Part IV – Beyond Redshifts P. Capak
Advanced Computational Techniques – Beyond Redshifts
- The SOM also encodes P(color)
  - Can be used to construct hierarchical priors
- If the redshift is well constrained, P(color) = cosmic variance
  - Can be used to correct cosmic variance in small-area surveys
[Figure panels: Redshifts; Density of Sources]
Masters, Capak et al. 2015
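One simple way to picture the cosmic variance correction is to reweight a small field so that its cell occupation matches that of a larger reference sample. This is a minimal sketch under that assumption, not the procedure of Masters, Capak et al. 2015; the function name `cell_weights` and the toy cell assignments are hypothetical.

```python
import numpy as np

def cell_weights(cells_field, cells_ref, n_cells):
    """Per-galaxy weights that map the field's P(color) onto the reference."""
    cells_field = np.asarray(cells_field)
    cells_ref = np.asarray(cells_ref)
    p_field = np.bincount(cells_field, minlength=n_cells) / len(cells_field)
    p_ref = np.bincount(cells_ref, minlength=n_cells) / len(cells_ref)
    # Ratio of reference to field occupation; cells empty in the field get 0.
    ratio = np.divide(p_ref, p_field, out=np.zeros(n_cells), where=p_field > 0)
    return ratio[cells_field]

# Toy example: the small field over-represents cell 0 relative to the reference.
cells_field = np.array([0, 0, 0, 1])
cells_ref = np.array([0, 1, 0, 1])
w = cell_weights(cells_field, cells_ref, n_cells=2)
weighted_frac_cell1 = w[cells_field == 1].sum() / w.sum()
```

After reweighting, cell 1 carries half of the total weight, as it does in the reference sample. Cells that are empty in the small field cannot be recovered this way and need separate treatment.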
Advanced Computational Techniques – Beyond Redshifts
- Generated a 15-dimensional SOM on the SPLASH data
- Verify sample selection and photo-z for the high-z sample
Davidzon (Capak) et al. 2017
Application Of Big Data Tools
Analytic Models
- Fundamental understanding of the underlying problem
Supervised Learning
- Create an empirical model of complex systems based on data
- Reproduce human vision/intuition on large data sets
- Complex interpolation
Unsupervised Learning
- Discover correlations you didn’t know existed
- Visually explore high-dimensional data sets
- Compare analytic models and complex data to look for discrepancies
- Understand data properties
Advanced Computational Techniques – Model Testing
- Best-fit analytic models with respect to the SOM grid
- There are clear regions where the fit is much worse
- Our analytic model is not good in these regions. Why?
[Figure: χ2 of best-fit model]
Capak et al. in prep.
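The test on this slide can be sketched as follows: take the colors of each SOM cell, find the closest analytic model, and record the χ2 of that match. The single color uncertainty `sigma`, the threshold of 10, and the toy color arrays are assumptions made for illustration.

```python
import numpy as np

def best_fit_chi2(cell_colors, model_colors, sigma=0.05):
    """Minimum chi-square over all models, and the winning model, per cell."""
    resid = cell_colors[:, None, :] - model_colors[None, :, :]
    chi2 = (resid ** 2).sum(axis=2) / sigma ** 2
    return chi2.min(axis=1), chi2.argmin(axis=1)

# Toy example: two cells sit on a model, the third is far from every model.
cell_colors = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]])
model_colors = np.array([[0.0, 0.0], [1.0, 1.0]])
chi2_min, best_model = best_fit_chi2(cell_colors, model_colors, sigma=0.5)
poorly_fit = chi2_min > 10.0   # cells where the analytic model fails
```

Displaying `chi2_min` on the SOM grid gives the kind of map described above, with poorly fit regions standing out as contiguous patches.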
Advanced Computational Techniques – Model Testing
- The fit is bad because we do not correctly model red galaxies
[Figure panels: Goodness of Fit; Model Dust Obscuration. Scale labels: Good, Bad, Red, Blue]
Capak et al. in prep.
Application Of Big Data Tools – Next Generation Surveys
- Need a “Standard Model” of galaxies
  - Needed to interpret large statistical data sets
- Represents what we know about galaxies
  - Largely data-driven statistics at this point
- Allows for interpretation of data from the telescope
  - Without modeling individual galaxies
- Allows for clean combination of information from multiple surveys
- Allows for optimized cosmological analysis
  - Choose what you marginalize over
Next Generation Surveys – Standard Model
- Using the SOM as an illustrative tool, but there are better methods for this
- Need to tailor the method to the problem
[Figure panels: Physical Models; Data Model. Labels: Red, Blue, Age, Mass]
Next Generation Surveys – Standard Model
- Physical model of galaxies
[Figure labels: Red; Blue; Mass Function; Log10(Space Density of Galaxies); Redshift; Model Sky Density; Log10(Mass of Galaxy)]
Part V – Model Fitting with SOM A. Faisst
SED fitting Challenges
- Empirical libraries suffer from incompleteness
- Theoretical libraries are not constrained
- Can we combine the two?
- Future missions such as LSST, Euclid, and WFIRST will provide photometric data for millions of galaxies, for which fast and accurate measurement of physical properties is helpful
- Can we measure physical properties faster using machine learning techniques?
SOM to visualize galaxy model libraries
1) Make a library of model SEDs using different physical parameters.
2) Convolve the model SEDs with the filter set dictated by the observations, e.g. COSMOS.
3) The colors of the model SEDs in that filter set define a multi-dimensional space.
4) The SOM reduces this multi-dimensional space to two dimensions for visualization.
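Steps 1 to 3 can be sketched with a toy library. The assumptions are loud ones: the "models" are simple power-law SEDs rather than stellar population models, the filters are hypothetical top-hats named g, z and H, and the band flux is a plain transmission-weighted mean of f_nu. The resulting `colors` array is what would be handed to a SOM in step 4.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def band_flux(wave, f_nu, band):
    """Mean flux density through a top-hat filter given as (lo, hi)."""
    lo, hi = band
    trans = ((wave >= lo) & (wave <= hi)).astype(float)
    return integrate(f_nu * trans, wave) / integrate(trans, wave)

# Step 1: toy model library, f_nu proportional to wavelength**beta.
wave = np.linspace(3000.0, 20000.0, 2000)          # Angstrom
betas = np.array([-1.0, 0.0, 1.0, 2.0])            # the "physical parameter"
seds = (wave[None, :] / 1.0e4) ** betas[:, None]

# Step 2: convolve every SED with the (hypothetical) filter set.
filters = {"g": (4000.0, 5500.0), "z": (8200.0, 9200.0), "H": (15000.0, 18000.0)}
mags = np.array([[-2.5 * np.log10(band_flux(wave, sed, band))
                  for band in filters.values()] for sed in seds])

# Step 3: colors between adjacent bands (g-z, z-H) define the color space.
colors = mags[:, :-1] - mags[:, 1:]
```

A flat f_nu spectrum (beta = 0) has zero color in every band pair, and the colors redden monotonically as beta increases, which is a quick sanity check on the photometry.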
SOM to visualize galaxy model libraries
SOM to visualize galaxy model libraries
- Each SOM cell now represents different physical properties of galaxies
SOM to Optimize galaxy model libraries
- Observations can be used to optimize the parameter space used to build the model library
- So let's map COSMOS galaxies to the trained SOM:
SOM to Measure galaxy physical properties
Uncertainty of parameters measured with SOM
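A plausible way to read off a property and its uncertainty from a SOM trained on models is sketched below. It assumes each cell collects several library models and that an observed galaxy inherits the median and the 16th to 84th percentile range of the models in its cell. This is an illustrative choice, not necessarily the estimator behind these slides, and `cell_property_stats` and the toy masses are hypothetical.

```python
import numpy as np

def cell_property_stats(model_cells, model_values, n_cells):
    """16th, 50th and 84th percentile of a model property in each cell."""
    lo = np.full(n_cells, np.nan)
    med = np.full(n_cells, np.nan)
    hi = np.full(n_cells, np.nan)
    for c in range(n_cells):
        vals = model_values[model_cells == c]
        if vals.size:
            lo[c], med[c], hi[c] = np.percentile(vals, [16, 50, 84])
    return lo, med, hi

# Toy library: five models with a log stellar mass, assigned to SOM cells.
model_cells = np.array([0, 0, 0, 1, 1])
model_logmass = np.array([9.0, 9.2, 9.4, 10.5, 10.7])
lo, med, hi = cell_property_stats(model_cells, model_logmass, n_cells=3)

# Observed galaxies, already mapped to their best-matching cells.
galaxy_cells = np.array([0, 1, 1])
logmass = med[galaxy_cells]
logmass_err = 0.5 * (hi - lo)[galaxy_cells]
```

Cells with no models return NaN, which flags galaxies whose colors fall outside the parameter space covered by the library.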
More applications: galaxy selections such as UVJ
More applications: resolved stellar populations
Part VI – Measuring Emission Lines With Photometry A. Faisst
Optical Emission Lines at z > 4 with SPLASH (COSMOS)
- Statistically estimate emission lines (Hα and [OIII]) at z > 4 from the Spitzer [3.6μm]-[4.5μm] color
Faisst et al. (2016a); also: Rasappu et al. (2016), Marmol-Queralto et al. (2015), Shim et al. (2011), Stark et al. (2013)
Optical Emission Lines at z > 4 with SPLASH (COSMOS)
- Success on COSMOS (Faisst et al. 2016a)
- SPLASH: deep (25.5 AB) Spitzer data over 2 deg2 of COSMOS
- ~500 spectroscopic redshifts (COSMOS/DEIMOS & VUDS; Scoville+07, LeFevre+14)
[Figure: redshift ranges labelled Hα in 3.6μm, [OIII] in 3.6μm, Hα in 4.5μm, and no emission lines]
Faisst et al. (2016a)
Optical Emission Lines at z > 4 with SPLASH (COSMOS)
- Observed color vs. redshift relation = intrinsic color (dust, age, metallicity, SFH) + redshift-dependent emission line strengths and ratios (Hα, Hβ, [OII], [OIII])
- The color is not sensitive to dust, age, metallicity, and SFH for young galaxies (< 1 Gyr)!
Faisst et al. (2016a)
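The emission line term of this relation can be illustrated for Hα alone. The sketch assumes top-hat bands with rough, illustrative edges for the two Spitzer channels, a continuum that is flat across each band, and the standard approximation that a line of observed equivalent width EW_obs brightens a band of width Δλ by 2.5 log10(1 + EW_obs/Δλ). The names `line_boost` and `color_shift` are hypothetical, and [OIII], Hβ and [OII] are ignored.

```python
import numpy as np

HALPHA_UM = 0.6563                      # rest wavelength of H-alpha in micron
# Rough top-hat band edges in micron (assumed, illustrative values).
BANDS_UM = {"ch1": (3.2, 3.9), "ch2": (4.0, 5.0)}

def line_boost(ew_rest_aa, z, band):
    """Magnitude change of a band caused by H-alpha (negative = brighter)."""
    lo, hi = BANDS_UM[band]
    lam_obs = HALPHA_UM * (1.0 + z)
    if not lo <= lam_obs <= hi:
        return 0.0                      # line falls outside this band
    ew_obs_um = ew_rest_aa * (1.0 + z) * 1.0e-4   # Angstrom -> micron
    return -2.5 * np.log10(1.0 + ew_obs_um / (hi - lo))

def color_shift(ew_rest_aa, z):
    """Change of the [3.6]-[4.5] color caused by H-alpha."""
    return line_boost(ew_rest_aa, z, "ch1") - line_boost(ew_rest_aa, z, "ch2")

shift_z45 = color_shift(500.0, 4.5)     # H-alpha in the 3.6 micron band
shift_z55 = color_shift(500.0, 5.5)     # H-alpha in the 4.5 micron band
shift_z20 = color_shift(500.0, 2.0)     # H-alpha in neither band
```

The sign flip between z = 4.5 and z = 5.5 is what makes the color a statistical probe of the line strength: the color turns bluer when the line is in the 3.6μm band and redder when it is in the 4.5μm band.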
Optical Emission Lines at z > 4 with SPLASH (COSMOS)
- Consistent results at z < 3, where spectroscopy is available
- Strong increase of Hα EW beyond z = 3
- Convert EW(Hα) into sSFR (e.g., Cowie et al. 2011)
- Hα-derived sSFR out to z = 6 from Spitzer colors
- 5 times faster mass growth at z = 6 compared to z = 2
[Figure panels: Hα Equivalent Width; Specific Star Formation Rate]
Faisst et al. (2016a)
Optical Emission Lines at z > 4 with SPLASH (COSMOS)
- Use Spitzer colors to estimate [OIII]/Hβ ratios
  - Sensitive to metallicity
  - Probe the metallicity of z > 4 galaxies statistically
[Figure label: [OIII] emitters!]
Faisst et al. 2016a; Faisst et al. 2016b; see also Masters, Faisst, Capak 2016
Part VII – Combining Heterogeneous Data P. Capak
Information from Heterogeneous Data
- WFIRST will cover a large area of the sky
  - Including deep multi-wavelength fields
- We can use this information to calibrate photo-z
- Looking at the galaxy population as a density field provides a way of doing this
- The SOM is a convenient tool for this
Next Generation Surveys – Standard Model
- Generate a model with one data set
- Expand it to higher dimensions with another
[Figure labels: LSST; WFIRST; Spitzer]
Next Generation Surveys – Standard Model
- Generate a model with one data set
- Expand it to higher dimensions with another
[Figure labels: Uniform Cell; Bifurcated population; Non-Uniform Cell]
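One way to find non-uniform cells when a second data set adds a dimension is to look at the scatter of the new quantity inside each existing cell. The sketch below assumes cells were defined from one data set and that `extra_color` is a color from the added data (for example one that includes a Spitzer band). The scatter threshold and the function name `flag_nonuniform_cells` are hypothetical choices.

```python
import numpy as np

def flag_nonuniform_cells(cells, extra_color, n_cells, max_scatter=0.2):
    """Flag cells whose members disagree in a color not used to build the map."""
    flags = np.zeros(n_cells, dtype=bool)
    for c in range(n_cells):
        vals = extra_color[cells == c]
        if vals.size > 1:
            flags[c] = np.std(vals) > max_scatter
    return flags

# Toy example: cell 0 is uniform, cell 1 hides two distinct populations.
cells = np.array([0, 0, 0, 0, 1, 1, 1, 1])
extra_color = np.array([0.10, 0.12, 0.09, 0.11, 0.00, 0.02, 1.00, 1.03])
nonuniform = flag_nonuniform_cells(cells, extra_color, n_cells=2)
```

Flagged cells are the ones that benefit from being split when the model is expanded to the higher-dimensional data set.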
How to do this in practice
- Deeper data contains more information
  - Better localization on the SOM/density field
- Target previous deep fields in the survey with deep data
  - CDFS, SXDS, VVDS-2h, COSMOS, EGS, GOODS-N, etc.
- These can then be mapped onto the wider survey
- Using a SOM including Spitzer for high-z studies
  - Reduces but doesn't completely remove caustics
- Need to explore even higher dimensional data
  - GALEX, HST UV