Alex Szalay Department of Physics and Astronomy The Johns Hopkins University The Sloan Digital Sky Survey.

Presentation transcript:

Alex Szalay Department of Physics and Astronomy The Johns Hopkins University The Sloan Digital Sky Survey

The Sloan Digital Sky Survey (Alex Szalay, JHU)
A project run by the Astrophysical Research Consortium (ARC).
Goal: to create a detailed multicolor map of the Northern Sky over 5 years, with a budget of approximately $80M.
Data size: 40 TB raw, 1 TB processed.
  - The University of Chicago
  - Princeton University
  - The Johns Hopkins University
  - The University of Washington
  - Fermi National Accelerator Laboratory
  - US Naval Observatory
  - The Japanese Participation Group
  - The Institute for Advanced Study
SLOAN Foundation, NSF, DOE, NASA

Scientific Motivation
Create the ultimate map of the Universe:
  - The Cosmic Genome Project!
Study the distribution of galaxies:
  - What is the origin of fluctuations?
  - What is the topology of the distribution?
Measure the global properties of the Universe:
  - How much dark matter is there?
Local census of the galaxy population:
  - How did galaxies form?
Find the most distant objects in the Universe:
  - What are the highest quasar redshifts?

Cosmology Primer
The Universe is expanding: the galaxies move away from us, and spectral lines are redshifted.
  Hubble's law: v = H_0 r
The fate of the universe depends on the balance between gravity and the expansion velocity.
  Ω = density/critical; if Ω < 1, expand forever
Most of the mass in the Universe is dark matter, and it may be cold (CDM).
  Ω_d > Ω_*
The spatial distribution of galaxies is correlated, due to small ripples in the early Universe.
  P(k): power spectrum
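The two relations on this slide, Hubble's law and the density parameter, can be tried out numerically. The following is a minimal sketch, not part of the talk: the function names are hypothetical, the value H_0 = 65 km/s/Mpc is just an assumed mid-range pick from the 55-75 km/s/Mpc interval quoted in the deck, and the critical density uses the standard expression 3 H_0^2 / (8 pi G).

```python
import math

G = 6.674e-11          # gravitational constant in m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22   # one megaparsec in metres

def recession_velocity(distance_mpc, h0=65.0):
    """Hubble's law v = H0 * r; r in Mpc, H0 in km/s/Mpc, result in km/s."""
    return h0 * distance_mpc

def critical_density(h0=65.0):
    """Critical density 3 H0^2 / (8 pi G) in kg/m^3 (H0 given in km/s/Mpc)."""
    h0_si = h0 * 1000.0 / MPC_IN_M   # convert km/s/Mpc to 1/s
    return 3.0 * h0_si ** 2 / (8.0 * math.pi * G)

def omega(density, h0=65.0):
    """Density parameter: the density in units of the critical density."""
    return density / critical_density(h0)

if __name__ == "__main__":
    print(recession_velocity(100.0))   # velocity in km/s of a galaxy 100 Mpc away
    print(critical_density())          # kg/m^3 for the assumed H0
    print(omega(0.3 * critical_density()))
```

A universe whose mean density is below the critical value has omega below 1, which is the "expand forever" case named on the slide.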

The 'Naught' Problem
What are the global parameters of the Universe?
  H_0   the Hubble constant         55-75 km/s/Mpc
  Ω_0   the density parameter
  Λ_0   the cosmological constant
Their values are still quite uncertain today...
Goal: measure these parameters with an accuracy of a few percent.
High Precision Cosmology!

The Cosmic Genome Project
The SDSS will create the ultimate map of the Universe, with much more detail than any other measurement before.
  - Gregory and Thompson 1978
  - de Lapparent, Geller and Huchra 1986
  - da Costa et al. 1995
  - SDSS Collaboration 2002

Area and Size of Redshift Surveys

Clustering of Galaxies
We will measure the spectrum of the density fluctuations to high precision, even on very large scales.
[Table: the error in the amplitude of the fluctuation spectrum]

Relevant Scales
Distances measured in Mpc [megaparsec]:
  1 Mpc = 3 x 10^24 cm
  5 Mpc = distance between galaxies
  3000 Mpc = scale of the Universe
If >200 Mpc: fluctuations have a PRIMORDIAL shape.
If <100 Mpc: gravity creates sharp features, like walls, filaments and voids.
Biasing: conversion of mass into light is nonlinear; light is much more clumpy than the mass.

The Topology of Local Universe
Measure the topology of the Universe: does it consist of walls and voids, or is it randomly distributed?

Finding the Most Distant Objects
Intermediate and high redshift QSOs
  - Multicolor selection function.
  - Luminosity functions and spatial clustering.
  - High redshift QSOs (z>5).

Features of the SDSS
Special 2.5m telescope, located at Apache Point, NM
  - 3 degree field of view.
  - Zero distortion focal plane.
Two surveys in one
  - Photometric survey in 5 bands.
  - Spectroscopic redshift survey.
Huge CCD mosaic
  - 30 CCDs 2K x 2K (imaging)
  - 22 CCDs 2K x 400 (astrometry)
Two high resolution spectrographs
  - 2 x 320 fibers, with 3 arcsec diameter.
  - R=2000 resolution with 4096 pixels.
  - Spectral coverage from 3900Å to 9200Å.
Automated data reduction
  - Over 70 man-years of development effort (Fermilab + collaboration scientists).
Very high data volume
  - Expect over 20 TB of raw data.
  - About 1 TB processed catalogs.
  - Data made available to the public.

Apache Point Observatory
Located in New Mexico, near White Sands National Monument.

The Telescope
  - Special 2.5m telescope
  - 3 degree field of view
  - Zero distortion focal plane
  - Wind screen moved separately

The Photometric Survey
Northern Galactic Cap
  - 5 broad-band filters (u', g', r', i', z')
  - limiting magnitudes (22.3, 23.3, 23.1, 22.3, 20.8)
  - drift scan of 10,000 square degrees
  - 55 sec exposure time
  - 40 TB raw imaging data -> pipeline -> 100,000,000 galaxies, 50,000,000 stars
  - calibration to 2% at r'=19.8
  - only done in the best seeing (20 nights/yr)
  - pixel size is 0.4 arcsec, astrometric precision is 60 milliarcsec
Southern Galactic Cap
  - multiple scans (> 30 times) of the same stripe
Continuous data rate of 8 Mbytes/sec

The Footprint of the Survey

Survey Strategy
  - Overlapping 2.5 degree wide stripes
  - Avoiding the Galactic Plane (dust)
  - Multiple exposures on the three Southern stripes

The Spectroscopic Survey
Measure redshifts of objects => distance
SDSS Redshift Survey:
  - 1 million galaxies
  - 100,000 quasars
  - 100,000 stars
Two high throughput spectrographs
  - spectral range Å.
  - 640 spectra simultaneously.
  - R=2000 resolution.
Automated reduction of spectra
Very high sampling density and completeness
Objects in other catalogs also targeted

Optimal Tiling
  - Fields have 3 degree diameter
  - Centers determined by an optimization procedure
  - A total of 2200 pointings
  - 640 fibers assigned simultaneously

The Mosaic Camera

Photometric Calibrations
The SDSS will create a new photometric system: u' g' r' i' z'
  - Primary standards: observed with the USNO 40-inch telescope in Flagstaff
  - Secondary standards: observed with the SDSS 20-inch telescope at Apache Point, calibrating the SDSS imaging data

The Spectrographs
Two double spectrographs
  - very high throughput
  - two 2048x2048 CCD detectors
  - mounted on the telescope
  - light fed through slithead

The Fiber Feed System
  - Galaxy images are captured by optical fibers lined up on the spectrograph slit
  - Manually plugged during the day into Al plugboards
  - 640 fibers in each bundle
  - The largest fiber system today

Spectrograph Status
Spectrographs:
  - Laboratory observations of solar spectrum
  - First astronomical observations March 1999

JHU Contributions
Fiber spectrographs: P. Feldman, A. Uomoto, S. Friedman, S. Smee
Science Archive: A. Szalay, A. Thakar, P. Kunszt, I. Csabai, Gy. Szokoly, A. Connolly, A. Chaudhaury
Management: T. Heckman, T. Poehler, J. Crocker, A. Davidsen, A. Uomoto, A. Szalay, R. Wyse

First Light Images
Telescope:
  - First light May 9th 1998
  - Equatorial scans

The First Stripes
Camera:
  - 5 color imaging of >100 square degrees
  - Multiple scans across the same fields
  - Photometric limits as expected

NGC 2068

UGC 3214

NGC 6070

The First Quasars
Three of the four highest redshift quasars have been found in the first SDSS test data!

SDSS Data Flow

Data Processing Pipelines

Concept of the SDSS Archive
  - Operational Archive (raw + processed data)
  - Science Archive (products accessible to users)
  - Other Archives

SDSS Data Products
All raw data saved in a tape vault at Fermilab.
  Object catalog          400 GB   parameters of >10^8 objects
  Redshift catalog        1 GB     parameters of 10^6 objects
  Atlas images            1.5 TB   5 color cutouts of >10^8 objects
  Spectra                 60 GB    in a one-dimensional form
  Derived catalogs        20 GB    clusters, QSO absorption lines
  4x4 pixel all-sky map   60 GB    heavily compressed

Who will be using the archive?
Power Users
  - sophisticated, with lots of resources
  - research is centered around the archive data
  - moderate number of very intensive queries
  - mostly statistical, large output sizes
General Astronomy Public
  - frequent, but casual lookup of objects/regions
  - the archives help their research, but are not central to it
  - large number of small queries
  - a lot of cross-identification requests
Wide Public
  - browsing a 'Virtual Telescope'
  - can have large public appeal
  - need special packaging
  - could be a very large number of requests

How will the data be analyzed?
The data are inherently multidimensional
  => positions, colors, size, redshift
Improved classifications result in complex N-dimensional volumes
  => complex constraints, not ranges
Spatial relations will be investigated
  => nearest neighbors
  => other objects within a radius
Data Mining: finding the 'needle in the haystack'
  => separate typical from rare
  => recognize patterns in the data
Output size can be prohibitively large for intermediate files
  => import output directly into analysis tools

Geometric Approach
The Main Problem:
  - fast, indexed, complex searches of Terabytes in k-dim space
  - searches are not necessarily parallel to the axes
  => traditional indexing (b-tree) does not work
Geometric Approach:
  - Use the geometric nature of the k-dimensional data
  - Quantize data into containers of 'friends': objects of similar colors, close on the sky, stored together => efficient cache performance
  - Containers represent a coarse grained density map of the data
  - multidimensional index tree: k-d tree + r-tree

Organization of Searches
Queries are inherently geometric
  - the primitive constraint is a half-space formed by a linear combination => k-dimensional hyperplane
  - Boolean combinations are allowed
  - the constraints form k-dimensional polyhedra
Queries are run on the coarse grained map
  - determine intersections of index tree and query polyhedron
List of containers is prepared for query
  - projections of full query time and output volume created
The list of containers and query is sent to the Search Engine
  - actual searches quantized by containers
Searches can be optimized, executed in parallel
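The coarse-grained step described on this slide, intersecting a query polyhedron with the bounding boxes of the containers, can be illustrated with a small sketch. This is an assumption-laden illustration rather than the archive's actual code: the half-space convention (normal . x >= offset), the function names, and the three labels "accept", "reject" and "partial" are all hypothetical. A query is taken to be an AND of half-spaces, and a box is classified by testing its corners.

```python
from itertools import product

def corners(lo, hi):
    """All 2^k corner points of the axis-aligned box [lo, hi]."""
    return list(product(*zip(lo, hi)))

def satisfies(point, halfspace):
    """True if point lies in the half-space normal . x >= offset."""
    normal, offset = halfspace
    return sum(n * x for n, x in zip(normal, point)) >= offset

def classify_box(lo, hi, halfspaces):
    """Classify a container's bounding box against a convex query.

    'reject'  : some half-space excludes every corner, so the box is outside
    'accept'  : every corner satisfies every half-space, so the box is inside
    'partial' : anything else; the objects in the container must be tested
    """
    pts = corners(lo, hi)
    fully_inside = True
    for hs in halfspaces:
        hits = [satisfies(p, hs) for p in pts]
        if not any(hits):
            return "reject"
        if not all(hits):
            fully_inside = False
    return "accept" if fully_inside else "partial"

if __name__ == "__main__":
    query = [((1.0, 1.0), 1.0)]   # the single constraint x + y >= 1
    print(classify_box((2.0, 2.0), (3.0, 3.0), query))
    print(classify_box((0.0, 0.0), (0.4, 0.4), query))
    print(classify_box((0.0, 0.0), (1.0, 1.0), query))
```

Only the "partial" containers need an object-by-object search, which is what makes it possible to estimate query time and output volume from the coarse map alone. The "reject" test is conservative: a box outside the polyhedron but not excluded by any single half-space is reported as "partial".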

Geometric Indexing
"Divide and Conquer" partitioning: 3, N, M
  Attributes         Number
  Sky Position       3
  Multiband Fluxes   N = 5+
  Other              M = 100+
Hierarchical Triangular Mesh
Split as k-d tree, stored as r-tree of bounding boxes
Using regular indexing techniques

Sky coordinates
  - Stored as Cartesian coordinates: projected onto a unit sphere
  - Longitude and Latitude lines: intersections of planes and the sphere
  - Boolean combinations: query polyhedron
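Storing positions as unit vectors turns sky constraints into plane tests, as this slide states. The sketch below shows the idea under stated assumptions: the function names are hypothetical, angles are in degrees, and a half-space is written as (normal, offset) with the test normal . v >= offset. A circular region around a direction and a latitude cut both reduce to a single dot product.

```python
import math

def radec_to_unit(ra_deg, dec_deg):
    """Convert longitude/latitude (RA, Dec in degrees) to a Cartesian unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))

def cone_halfspace(ra_deg, dec_deg, radius_deg):
    """Half-space whose intersection with the sphere is a circle of given radius."""
    return radec_to_unit(ra_deg, dec_deg), math.cos(math.radians(radius_deg))

def dec_above(dec_deg):
    """Half-space selecting all points with latitude above dec_deg."""
    return (0.0, 0.0, 1.0), math.sin(math.radians(dec_deg))

def in_halfspace(v, halfspace):
    """True if unit vector v satisfies normal . v >= offset."""
    n, d = halfspace
    return n[0] * v[0] + n[1] * v[1] + n[2] * v[2] >= d

if __name__ == "__main__":
    region = cone_halfspace(10.0, 20.0, 1.0)
    print(in_halfspace(radec_to_unit(10.0, 20.5), region))
    print(in_halfspace(radec_to_unit(10.0, 22.0), region))
```

Combining several such half-spaces with AND/OR gives the query polyhedron mentioned on the slide, with no trigonometry needed per object at query time.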

Sky Partitioning
Hierarchical Triangular Mesh, based on octahedron

Hierarchical Subdivision
Hierarchical subdivision of spherical triangles, represented as a quadtree.
In SDSS the tree is 5 levels deep.
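The subdivision described here can be sketched in a few lines: start from the 8 faces of an octahedron and split each spherical triangle into 4 by joining the edge midpoints, pushed back onto the unit sphere. The code is an illustrative assumption, not the SDSS implementation; the function names and the vertex ordering are hypothetical. If "5 levels deep" is read as 5 rounds of subdivision of the octahedron faces, the leaf count is 8 * 4^5.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def midpoint(a, b):
    """Midpoint of the great-circle arc between unit vectors a and b."""
    return normalize(tuple(x + y for x, y in zip(a, b)))

def subdivide(tri):
    """Split one spherical triangle into four children (one quadtree step)."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]

def octahedron_faces():
    """The 8 root triangles: faces of an octahedron inscribed in the sphere."""
    px, nx = (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)
    py, ny = (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)
    pz, nz = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
    return [(px, py, pz), (py, nx, pz), (nx, ny, pz), (ny, px, pz),
            (py, px, nz), (nx, py, nz), (ny, nx, nz), (px, ny, nz)]

def mesh(depth):
    """All leaf triangles after `depth` rounds of subdivision."""
    tris = octahedron_faces()
    for _ in range(depth):
        tris = [child for t in tris for child in subdivide(t)]
    return tris

if __name__ == "__main__":
    for depth in range(6):
        print(depth, len(mesh(depth)))
```

Each leaf can be named by its root face plus one digit (0 to 3) per level, which is what makes the mesh usable as a quadtree index.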

Result of the Query

Magnitudes and Multicolor Searches
Galaxy fluxes
  - large dynamic range
  - errors divergent as x -> 0!
But: this is an artifact of the logarithm at zero flux; in flux space the object is well localized.
For multicolor magnitudes the error contours can be very anisotropic and skewed: extremely poor localization!

Novel Magnitude Scale (Lupton, Gunn and Szalay, AJ 99)
  b: softness
  c: set to match normal magnitudes
Advantages:
  - monotonic
  - degrades gracefully
  - objects have small error ellipse
  - unified handling of detections and upper limits!
Disadvantages:
  - unusual
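The slide names a softness parameter b and a constant c, but the formula itself did not survive in the transcript. The sketch below assumes the inverse hyperbolic sine form from the cited Lupton, Gunn and Szalay paper as I understand it, m = -a asinh(x / 2b) + c with a = 2.5 / ln 10 and c = -a ln b, where x is the flux in units of the zero-point flux. The function names and the example value of b are hypothetical.

```python
import math

A = 2.5 / math.log(10.0)   # Pogson ratio, about 1.0857

def asinh_mag(x, b):
    """Asinh magnitude of normalized flux x with softness b (assumed form)."""
    c = -A * math.log(b)   # constant chosen so bright objects match normal magnitudes
    return -A * math.asinh(x / (2.0 * b)) + c

def pogson_mag(x):
    """Conventional logarithmic magnitude; undefined for x <= 0."""
    return -2.5 * math.log10(x)

if __name__ == "__main__":
    b = 1e-3   # illustrative softness, roughly the noise level in flux units
    for x in (1.0, 1e-2, 1e-3, 0.0, -1e-3):
        conventional = pogson_mag(x) if x > 0 else float("nan")
        print(x, asinh_mag(x, b), conventional)
```

For fluxes well above b the two scales agree closely, while at zero or slightly negative flux the asinh magnitude stays finite. That is the "unified handling of detections and upper limits" listed among the advantages.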

Flux Indexing
  - Split along alternating flux directions
  - Create balanced partitions
  - Store bounding boxes at each step
  - Build a level tree in each triangle
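The first three steps on this slide (alternate the split direction, split at the median so the halves are balanced, keep a bounding box per node) describe a k-d tree with stored boxes. The following is a minimal sketch of that construction; the dictionary layout, the `leaf_size` parameter and the function names are hypothetical and not taken from the SDSS archive code.

```python
def bounding_box(points):
    """Axis-aligned bounding box (lo, hi) of a non-empty list of points."""
    dims = range(len(points[0]))
    lo = tuple(min(p[d] for p in points) for d in dims)
    hi = tuple(max(p[d] for p in points) for d in dims)
    return lo, hi

def build(points, depth=0, leaf_size=2):
    """Build a k-d tree node: median split along an axis that cycles with depth."""
    node = {"bbox": bounding_box(points)}
    if len(points) <= leaf_size:
        node["points"] = list(points)          # leaf container
        return node
    axis = depth % len(points[0])              # alternate the split direction
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2                    # median split keeps halves balanced
    node["axis"] = axis
    node["left"] = build(ordered[:mid], depth + 1, leaf_size)
    node["right"] = build(ordered[mid:], depth + 1, leaf_size)
    return node

def count_points(node):
    """Number of points stored beneath a node."""
    if "points" in node:
        return len(node["points"])
    return count_points(node["left"]) + count_points(node["right"])

if __name__ == "__main__":
    fluxes = [(1.0, 9.0), (2.0, 3.0), (3.0, 7.0), (4.0, 1.0),
              (5.0, 8.0), (6.0, 2.0), (7.0, 6.0), (8.0, 4.0)]
    tree = build(fluxes)
    print(tree["bbox"], count_points(tree["left"]), count_points(tree["right"]))
```

The stored boxes are what a query polyhedron is tested against, so whole subtrees can be skipped or accepted without touching individual objects.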

How to build compact cells?
The fluxes are strongly correlated
  => 2 + ε dimensional distribution of typical objects
  => widely scattered rare objects
  => large density contrasts
Therefore: first create a local density and split on its value (Csabai et al. 96)
  - typical (98%)
  - rare (2%)
The SDSS will measure fluxes in 5 bands => asinh magnitudes
Axis-parallel splits in median flux, in 8 separate zones in Galactic latitude
  => 5 dimensional bounding boxes
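The idea of splitting on local density before building cells can be illustrated with a brute-force sketch. The density estimator used on the slide is not given in the transcript, so this example assumes a simple stand-in: the distance to the k-th nearest neighbour, with the most isolated 2% of objects labelled rare. The function names and the parameters `k` and `rare_fraction` are hypothetical.

```python
import math

def kth_neighbor_distance(points, k):
    """Distance from each point to its k-th nearest neighbour (brute force)."""
    out = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(dists[k - 1])
    return out

def split_typical_rare(points, k=3, rare_fraction=0.02):
    """Label the lowest-density objects as rare, the rest as typical."""
    dist = kth_neighbor_distance(points, k)
    # A large k-th neighbour distance means a low local density.
    order = sorted(range(len(points)), key=lambda i: dist[i], reverse=True)
    n_rare = max(1, round(rare_fraction * len(points)))
    rare_idx = set(order[:n_rare])
    typical = [p for i, p in enumerate(points) if i not in rare_idx]
    rare = [p for i, p in enumerate(points) if i in rare_idx]
    return typical, rare

if __name__ == "__main__":
    # 49 tightly packed objects plus one far-away outlier in a 2-colour space.
    cloud = [(0.1 * i, 0.1 * j) for i in range(7) for j in range(7)]
    typical, rare = split_typical_rare(cloud + [(10.0, 10.0)])
    print(len(typical), rare)
```

Indexing the two populations separately keeps the bounding boxes of the typical cells compact, since a handful of scattered outliers would otherwise stretch them.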

Coarse Grained Design
[Diagram: User Interface, Analysis Engine, Query Support, Data Warehouse, Archive]

Distributed Implementation
[Diagram: User Interface; Analysis Engine; Master; four Slave nodes; Objectivity and RAID storage; SX Engine; Objectivity Federation]

Exploring new methods
New spectral classification techniques
  - galaxy spectra can be expressed as a superposition of a few ( objective classification of 1 million spectra!
Photometric redshifts
  - galaxy colors systematically change with redshift; the SDSS photometry works like a 5-pixel spectrograph
  => Δz = 0.05, but with 100 million objects!
Measuring cosmological parameters
  - before: data analysis was limited by small number statistics
  - after: dominant errors are systematic (extinction)
  => new analysis methods are required!

Photometric redshifts
Multicolor photometry maps physical parameters to observed fluxes:
  - luminosity L
  - redshift z
  - spectral type T
Inversion: u', g', r', i', z' => z, L, T
Redshifts are statistical, with large errors: Δz ≈ 0.05
The data set is huge, more than 100 million galaxies
Easy to subdivide into coarse z bins, and by type
  => study evolution
  => enormous volume: 1 Gpc^3

Spectra from Photometry
New development: low resolution spectra from multicolor photometry
  many galaxies => oversampling => spectra
(Csabai, Budavari, Connolly, Szalay 99)

Measuring P(k)
Karhunen-Loeve transform:
  - Signal-to-noise eigenmodes of the redshift survey
  - Optimal extraction of clustering signal
  - Maximal rejection of systematic errors
(Vogeley and Szalay 96; Matsubara, Szalay and Landy 99)
Pilot project using the Las Campanas Redshift Survey with 22,000 galaxies.
We simultaneously measure the values of the redshift-distortion parameter (β = Ω^0.6/b), the normalization (σ_8) and the CDM shape parameter (Γ = Ωh).
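Signal-to-noise eigenmodes can be computed by whitening the data with the noise covariance and then diagonalizing the signal covariance. The sketch below shows only that linear-algebra step, on toy matrices; it is not the pipeline of the cited papers. It assumes NumPy is available, that both covariance matrices are symmetric with a positive-definite noise matrix, and the function name is hypothetical.

```python
import numpy as np

def signal_to_noise_modes(signal_cov, noise_cov):
    """Solve S v = lambda N v; return eigenvalues (descending) and modes as columns."""
    chol = np.linalg.cholesky(noise_cov)        # N = L L^T
    chol_inv = np.linalg.inv(chol)
    whitened = chol_inv @ signal_cov @ chol_inv.T   # signal in noise-whitened basis
    eigvals, eigvecs = np.linalg.eigh(whitened)     # symmetric eigenproblem
    order = np.argsort(eigvals)[::-1]               # highest signal-to-noise first
    modes = chol_inv.T @ eigvecs[:, order]          # map back to the data basis
    return eigvals[order], modes

if __name__ == "__main__":
    S = np.array([[2.0, 1.0], [1.0, 3.0]])   # toy signal covariance
    N = np.array([[2.0, 0.0], [0.0, 0.5]])   # toy noise covariance
    snr, modes = signal_to_noise_modes(S, N)
    print(snr)
    print(modes)
```

Keeping only the leading modes compresses the survey into the combinations of cells that carry the most clustering signal per unit noise, which is the sense in which the extraction is described as optimal on the slide.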

The SDSS Science Archive
The SDSS Science Archive will support complex multidimensional queries of Terabyte catalogs.
Queries are primarily I/O limited
  - distributed parallel architecture
Our usage patterns required a fast multidimensional index
  - sky positions
  - object colors
Advanced query capabilities required
  - spatial relations
  - classifications
  - condensed representations
Soon other large astronomical archives will emerge
  - cross-referencing
  - seamless interoperability
New paradigm in Astronomy

Future of Archives
The next generation of astronomical archives with TB catalogs will dramatically change astronomy
  - top-down design
  - large sky coverage
  - built on sound statistical plans
  - uniform, homogeneous, well calibrated
  - well controlled and documented systematics
The technology to store and index the data is here.
Data mining in such vast archives will be a challenge, but possibilities are quite unimaginable.
Integrating these archives into a single entity is a project for the whole community.

The Age of Mega-Surveys
The next generation of astronomical archives with Terabyte catalogs will dramatically change astronomy
  - top-down design
  - large sky coverage
  - built on sound statistical plans
  - uniform, homogeneous, well calibrated
  - well controlled and documented systematics
The technology to store and index the data is here.
Data mining in such vast archives will be a challenge, but possibilities are quite unimaginable.
Integrating these archives into a single entity is a project for the whole community
  => Virtual National Observatory

Summary
  - The SDSS project combines astronomy, physics, and computer science
  - It promises to fundamentally change our view of the universe
  - It will determine how the largest structures in the universe were formed
  - Its 'virtual universe' can be explored by both scientists and the public
  - It will serve as the standard astronomy reference for several decades
  - Through its archive it will create a new paradigm in astronomy
