SDSS photo-z with model templates
Photo-z
Estimate redshift (+ physical parameters)
–Colors are a special "projection" of spectra, like PCA
[Diagram: galaxy (early, late) → light → spectrum (1M objects, 3000-dimensional point data) → broadband filters → magnitude space (270M objects, 5-dimensional point data). Other labels in the diagram: redshift; physical parameters (age, dust, SFH, etc.); PCA; "5-10? dimension"; "3-10 dimension".]
Photo-z techniques
Empirical
–Polyfit
–Neural net
–Nearest neighbor
Template fitting
–Empirical templates (repair)
–Model templates
All the same:
–Generate a reference set (from observed photometry, or from synthetic photometry of observed or model spectra)
–Linear (weighted sum) or nonlinear function of the neighbors' redshifts
The key issue: get a good reference set
–Easy to get good results with a good reference set
–Extrapolation: the only hope is better models
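The "function of the neighbors' redshifts" idea can be sketched in a few lines. This is a minimal illustration, not the implementation behind these slides: the function name `knn_photoz`, the brute-force search, the unweighted average, and the toy reference values are all assumptions made for the example.

```python
import math

def knn_photoz(query_colors, ref_colors, ref_z, k=3):
    """Estimate a redshift as the plain average over the k nearest reference objects."""
    # Brute-force Euclidean distance in color space to every reference object.
    dists = [(math.dist(query_colors, c), z) for c, z in zip(ref_colors, ref_z)]
    dists.sort(key=lambda pair: pair[0])
    nearest = dists[:k]
    return sum(z for _, z in nearest) / len(nearest)

# Hypothetical reference set: (u-g, g-r, r-i, i-z) colors with known redshifts.
ref_colors = [
    (1.8, 0.9, 0.4, 0.3),
    (1.7, 0.8, 0.4, 0.3),
    (1.2, 0.5, 0.3, 0.2),
    (1.1, 0.4, 0.2, 0.1),
]
ref_z = [0.10, 0.12, 0.30, 0.35]

# The two closest reference objects have z = 0.10 and z = 0.12.
z_est = knn_photoz((1.75, 0.85, 0.4, 0.3), ref_colors, ref_z, k=2)
```

Whether the reference set comes from observed photometry or from model spectra, the estimator itself is unchanged; only `ref_colors` and `ref_z` differ.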
Catalogs
Test set: DR6 spectro set, galaxies (few outliers removed)
Charlot et al.: 100k stochastic SFH model library
–u, g, r, i, z synthetic magnitudes, 200 redshift bins in z = 0-1
Using colors only for redshift estimation
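"Using colors only" means working with magnitude differences rather than the magnitudes themselves. A small sketch, assuming adjacent-band colors (u-g, g-r, r-i, i-z); the function name and the magnitude values are made up for illustration:

```python
def colors_from_magnitudes(u, g, r, i, z_band):
    """Return the four adjacent-band colors (u-g, g-r, r-i, i-z)."""
    return (u - g, g - r, r - i, i - z_band)

# Hypothetical galaxy magnitudes in u, g, r, i, z.
mags = (19.5, 18.0, 17.2, 16.8, 16.5)
colors = colors_from_magnitudes(*mags)

# The same object made one magnitude brighter in every band:
# a constant shift cancels in each difference, so the colors are unchanged.
brighter = colors_from_magnitudes(*(m - 1.0 for m in mags))
```

Because a constant shift in all bands cancels out, the estimate depends only on the shape of the spectral energy distribution, not on the overall brightness.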
Fast kd-tree based NN in SQL Server
Index the color space with a search tree
–Find k-nearest neighbors quickly
Implemented in SQL Server (SQL + CLR)
–Local polyfit, average, weighted (photo errors) sum
Time to calculate photo-z for DR5 (200M objects)
–Template fitting: 150 processor-days
–Kd-fit: 10 processor-days
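To illustrate the indexing idea, here is a small in-memory kd-tree with a k-nearest-neighbor search. It is a pure-Python sketch under stated assumptions, not the SQL + CLR code referred to above: the dictionary-based node layout, the function names, and the toy reference set are all hypothetical.

```python
import heapq
import math

def build_kdtree(points, depth=0):
    """Build a kd-tree from (coords, payload) pairs, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def knn_search(node, query, k, heap=None):
    """Return a heap of (-distance, payload) holding the k nearest points."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    coords, payload = node["point"]
    d = math.dist(query, coords)
    if len(heap) < k:
        heapq.heappush(heap, (-d, payload))
    elif d < -heap[0][0]:
        # Closer than the current k-th neighbor: replace the farthest one.
        heapq.heapreplace(heap, (-d, payload))
    diff = query[node["axis"]] - coords[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    knn_search(near, query, k, heap)
    # Descend into the far side only if the splitting plane is closer
    # than the current k-th neighbor (or fewer than k were found so far).
    if len(heap) < k or abs(diff) < -heap[0][0]:
        knn_search(far, query, k, heap)
    return heap

def kdtree_photoz(tree, query_colors, k):
    """Average the redshifts (payloads) of the k nearest reference objects."""
    neighbors = knn_search(tree, query_colors, k)
    return sum(z for _, z in neighbors) / len(neighbors)

# Hypothetical reference set: colors with redshifts as payloads.
ref = [
    ((1.8, 0.9, 0.4, 0.3), 0.10),
    ((1.7, 0.8, 0.4, 0.3), 0.12),
    ((1.2, 0.5, 0.3, 0.2), 0.30),
    ((1.1, 0.4, 0.2, 0.1), 0.35),
]
tree = build_kdtree(ref)
z_est = kdtree_photoz(tree, (1.75, 0.85, 0.4, 0.3), k=2)
```

The pruning test in `knn_search` is what makes the tree pay off: whole branches are skipped when they cannot contain a closer neighbor, so a query touches only a small part of a large reference set.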
Spectro training set
Local linear fit, 150 NN: Δz =
Average, 150 NN: Δz = 0.0306
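A local linear fit replaces the neighbor average with a first-order polynomial in the colors, fitted to the nearest neighbors only and evaluated at the query point. The sketch below assumes an unweighted least-squares fit solved through the normal equations; the function names and toy data are hypothetical, and the slides do not specify the fitting details.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def local_linear_photoz(query_colors, nbr_colors, nbr_z):
    """Fit z = a0 + sum_j a_j * color_j to the neighbors and evaluate at the query."""
    rows = [[1.0] + list(c) for c in nbr_colors]
    p = len(rows[0])
    # Normal equations (X^T X) a = X^T z; this fails if the neighbors are degenerate.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xtz = [sum(r[i] * z for r, z in zip(rows, nbr_z)) for i in range(p)]
    coeffs = solve_linear_system(xtx, xtz)
    return coeffs[0] + sum(a * c for a, c in zip(coeffs[1:], query_colors))

# Toy neighbors in a 2-color space lying exactly on the plane z = 0.05 + 0.1*c1 + 0.2*c2.
nbr_colors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.2)]
nbr_z = [0.05 + 0.1 * c1 + 0.2 * c2 for c1, c2 in nbr_colors]
# Since the toy data are exactly planar, the fit should recover about 0.22 here.
z_fit = local_linear_photoz((0.3, 0.7), nbr_colors, nbr_z)
```

Unlike the plain average, the fit can follow a local gradient of redshift across the neighborhood, but it also extrapolates more aggressively when the neighbors are poorly matched.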
100k stochastic library
Average, 150 NN: Δz =
Local linear fit, 150 NN: Δz = 0.2275
Best subset
Conclusion: the reference set is too rich, probably with unphysical templates
Subset: 100 of 100k
–Iteration
  Closest templates in color+redshift space
  Removing templates that cause systematic errors
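One possible reading of the two iteration steps is sketched below. The slide does not give the actual procedure, so everything here is an assumption: the function names, the joint distance with a hypothetical `z_weight` scaling, selection by how often a template is the closest match, and "systematic error" measured as the mean signed redshift error of a template's matches.

```python
import math
from collections import Counter

def select_closest_templates(gal_colors, gal_z, tpl_colors, tpl_z, n_keep, z_weight=1.0):
    """Keep the n_keep templates most often closest to a galaxy in color+redshift space."""
    counts = Counter()
    for gc, gz in zip(gal_colors, gal_z):
        best = min(
            range(len(tpl_z)),
            key=lambda t: math.dist(gc, tpl_colors[t]) ** 2 + (z_weight * (gz - tpl_z[t])) ** 2,
        )
        counts[best] += 1
    return [t for t, _ in counts.most_common(n_keep)]

def drop_most_biased(subset, gal_colors, gal_z, tpl_colors, tpl_z):
    """Remove the template whose color-only matches have the largest mean signed z error."""
    errors = {t: [] for t in subset}
    for gc, gz in zip(gal_colors, gal_z):
        # Match in color space only, as a photo-z estimate would.
        t = min(subset, key=lambda s: math.dist(gc, tpl_colors[s]))
        errors[t].append(tpl_z[t] - gz)

    def bias(t):
        e = errors[t]
        return abs(sum(e) / len(e)) if e else 0.0

    worst = max(subset, key=bias)
    return [t for t in subset if t != worst]

# Toy data with a single color dimension, for illustration only.
gal_colors = [(1.0,), (1.1,), (2.0,)]
gal_z = [0.1, 0.1, 0.3]
tpl_colors = [(1.0,), (1.05,), (2.0,), (3.0,)]
tpl_z = [0.1, 0.8, 0.3, 0.5]

kept = select_closest_templates(gal_colors, gal_z, tpl_colors, tpl_z, n_keep=2)
pruned = drop_most_biased([0, 1, 2], gal_colors, gal_z, tpl_colors, tpl_z)
```

In the toy data, template 1 sits close to the galaxies in color but far away in redshift, which is the kind of template the removal step is meant to catch.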
Best subset: 1st step
Average of 150 NN: Δz =
Best subset: final iteration
Average of 150 NN: Δz =
Local linear fit of 150 NN: Δz =
Open questions
Cannot reach as good an estimate as with the empirical reference set
–Are the templates in the best set "physical"?
Systematic calibration mismatch
–SDSS filter curves?
–Model: dust?
Different sampling
–How to compare the sets and find the "offset" in colors
–Correcting for the average difference vector Δ(ug, gr, ri, iz) = ( , 0.0036, , 0.0212) does not improve the results
Find optimal subset with matching models and observations in spectral space (Laszlo)
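The average-difference correction mentioned above can be sketched as follows. The sign convention (observed minus model, added to the model colors), the function names, and the toy values are assumptions. The sketch also compares plain set means, which is exactly where the "different sampling" concern applies.

```python
def mean_color_offset(obs_colors, model_colors):
    """Return the per-color difference of the set means (observed minus model)."""
    n_dim = len(obs_colors[0])
    obs_mean = [sum(c[j] for c in obs_colors) / len(obs_colors) for j in range(n_dim)]
    mod_mean = [sum(c[j] for c in model_colors) / len(model_colors) for j in range(n_dim)]
    return tuple(o - m for o, m in zip(obs_mean, mod_mean))

def apply_offset(colors, offset):
    """Shift every color vector by the same offset vector."""
    return [tuple(c + d for c, d in zip(row, offset)) for row in colors]

# Toy two-color example (hypothetical values).
obs = [(1.0, 0.5), (1.2, 0.7)]
model = [(0.9, 0.5), (1.1, 0.6)]

offset = mean_color_offset(obs, model)
corrected = apply_offset(model, offset)
residual = mean_color_offset(obs, corrected)
```

A single global shift removes only the mean difference between the two sets; it cannot fix a mismatch that varies with color or redshift, which is consistent with the correction not improving the results.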