Alex Szalay Department of Physics and Astronomy The Johns Hopkins University The Sloan Digital Sky Survey.


1 Alex Szalay Department of Physics and Astronomy The Johns Hopkins University The Sloan Digital Sky Survey

2 Alex Szalay, JHU The Sloan Digital Sky Survey
A project run by the Astrophysical Research Consortium (ARC)
Goal: to create a detailed multicolor map of the Northern Sky over 5 years, with a budget of approximately $80M
Data size: 40 TB raw, 2 TB processed
The University of Chicago, Princeton University, The Johns Hopkins University, The University of Washington, Fermi National Accelerator Laboratory, US Naval Observatory, The Japanese Participation Group, The Institute for Advanced Study, Max Planck Inst, Heidelberg
SLOAN Foundation, NSF, DOE, NASA

3 Alex Szalay, JHU Scientific Motivation
Create the ultimate map of the Universe:
  The Cosmic Genome Project!
Study the distribution of galaxies:
  What is the origin of fluctuations?
  What is the topology of the distribution?
Measure the global properties of the Universe:
  How much dark matter is there?
Local census of the galaxy population:
  How did galaxies form?
Find the most distant objects in the Universe:
  What are the highest quasar redshifts?

4 Alex Szalay, JHU Cosmology Primer
The Universe is expanding: the galaxies move away from us, and their spectral lines are redshifted
Hubble's law: v = H_0 r
The fate of the universe depends on the balance between gravity and the expansion velocity
Ω = density/critical; if Ω < 1, expand forever
Most of the mass in the Universe is dark matter (Ω_d > Ω_*), and it may be cold (CDM)
The spatial distribution of galaxies is correlated, due to small ripples in the early Universe
P(k): power spectrum

5 Alex Szalay, JHU The ‘Naught’ Problem
What are the global parameters of the Universe?
  H_0   the Hubble constant          55-75 km/s/Mpc
  Ω_0   the density parameter        0.25-1
  Λ_0   the cosmological constant    0-0.7
Their values are still quite uncertain today...
Goal: measure these parameters with an accuracy of a few percent
High Precision Cosmology!
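
To see why the quoted range of H_0 matters, Hubble's law v = H_0 r from the primer can be inverted to turn a recession velocity into a distance. The following is a minimal sketch, assuming the low-redshift approximation v ≈ cz (not stated on the slide); the function name is hypothetical.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def hubble_distance_mpc(z, h0_km_s_mpc):
    """Distance in Mpc from r = v / H_0, with v ~ c z (assumed; valid only for z << 1)."""
    velocity_km_s = C_KM_S * z
    return velocity_km_s / h0_km_s_mpc


# Both ends of the slide's H_0 range, applied to a galaxy at z = 0.1
for h0 in (55.0, 75.0):
    print(h0, round(hubble_distance_mpc(0.1, h0)), "Mpc")
```

Across the quoted range the inferred distance changes by a factor of 75/55 ≈ 1.36, which is the kind of uncertainty the survey aims to reduce to a few percent.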

6 Alex Szalay, JHU The Cosmic Genome Project
The SDSS will create the ultimate map of the Universe, with much more detail than any other measurement before
Gregory and Thompson 1978; de Lapparent, Geller and Huchra 1986; da Costa et al. 1995; SDSS Collaboration 2002

7 Alex Szalay, JHU Area and Size of Redshift Surveys

8 Alex Szalay, JHU Clustering of Galaxies
We will measure the spectrum of the density fluctuations to high precision even on very large scales
The error in the amplitude of the fluctuation spectrum:
  1970   x100
  1990   x2
  1995   ±0.4
  1998   ±0.2
  1999   ±0.1
  2002   ±0.05

9 Alex Szalay, JHU Relevant Scales
Distances measured in Mpc [megaparsec]
  1 Mpc = 3 x 10^24 cm
  5 Mpc = distance between galaxies
  3000 Mpc = scale of the Universe
if >200 Mpc: fluctuations have a PRIMORDIAL shape
if <100 Mpc: gravity creates sharp features, like walls, filaments and voids
Biasing: conversion of mass into light is nonlinear; light is much more clumpy than the mass

10 Alex Szalay, JHU The Topology of Local Universe Measure the Topology of the Universe Does it consist of walls and voids or is it randomly distributed?

11 Alex Szalay, JHU Finding the Most Distant Objects
Intermediate and high redshift QSOs
Multicolor selection function.
Luminosity functions and spatial clustering.
High redshift QSOs (z>5).

12 Alex Szalay, JHU Features of the SDSS
Special 2.5m telescope, located at Apache Point, NM
  3 degree field of view.
  Zero distortion focal plane.
Two surveys in one
  Photometric survey in 5 bands.
  Spectroscopic redshift survey.
Huge CCD Mosaic
  30 CCDs 2K x 2K (imaging)
  22 CCDs 2K x 400 (astrometry)
Two high resolution spectrographs
  2 x 320 fibers, with 3 arcsec diameter.
  R=2000 resolution with 4096 pixels.
  Spectral coverage from 3900Å to 9200Å.
Automated data reduction
  Over 120 man-years of development effort.
  (Fermilab + collaboration scientists)
Very high data volume
  Expect over 40 TB of raw data.
  About 2 TB processed catalogs.
  Data made available to the public.

13 Alex Szalay, JHU Apache Point Observatory
Located in New Mexico, near White Sands National Monument

14 Alex Szalay, JHU The Telescope
Special 2.5m telescope
3 degree field of view
Zero distortion focal plane
Wind screen moved separately

15 Alex Szalay, JHU The Photometric Survey
Northern Galactic Cap
  5 broad-band filters (u', g', r', i', z')
  limiting magnitudes (22.3, 23.3, 23.1, 22.3, 20.8)
  drift scan of 10,000 square degrees
  55 sec exposure time
  40 TB raw imaging data -> pipeline -> 100,000,000 galaxies, 50,000,000 stars
  calibration to 2% at r'=19.8
  only done in the best seeing (20 nights/yr)
  pixel size is 0.4 arcsec, astrometric precision is 60 milliarcsec
Southern Galactic Cap
  multiple scans (> 30 times) of the same stripe
Continuous data rate of 8 Mbytes/sec
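
The quoted continuous data rate can be cross-checked from the camera layout given earlier (30 imaging plus 22 astrometric CCDs, all 2K wide) and the 55 sec drift-scan exposure. This is a rough sketch only: the 2 bytes per pixel and the idea that every CCD clocks rows at the same drift rate are assumptions, not statements from the slides.

```python
ROWS_CROSSED = 2048      # rows of a 2K x 2K imaging CCD, crossed in one exposure time
COLUMNS = 2048           # both CCD types are 2K wide
EXPOSURE_S = 55.0        # drift-scan exposure time from the slide
BYTES_PER_PIXEL = 2      # assumption: 16-bit pixels
IMAGING_CCDS = 30
ASTROMETRIC_CCDS = 22    # assumed to clock rows at the same drift rate


def drift_scan_rate_bytes_per_s():
    """Bytes per second produced when every CCD reads rows at the drift rate."""
    rows_per_s = ROWS_CROSSED / EXPOSURE_S
    bytes_per_ccd = rows_per_s * COLUMNS * BYTES_PER_PIXEL
    return bytes_per_ccd * (IMAGING_CCDS + ASTROMETRIC_CCDS)


rate = drift_scan_rate_bytes_per_s()
print(f"{rate / 1e6:.1f} MB/s")  # 7.9 MB/s
```

Under these assumptions the estimate lands close to the 8 Mbytes/sec quoted on the slide.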

16 Alex Szalay, JHU The Footprint of the Survey

17 Alex Szalay, JHU Survey Strategy
Overlapping 2.5 degree wide stripes
Avoiding the Galactic Plane (dust)
Multiple exposures on the three Southern stripes

18 Alex Szalay, JHU The Spectroscopic Survey
Measure redshifts of objects → distance
SDSS Redshift Survey:
  1 million galaxies
  100,000 quasars
  100,000 stars
Two high throughput spectrographs
  spectral range 3900-9200 Å.
  640 spectra simultaneously.
  R=2000 resolution.
Automated reduction of spectra
Very high sampling density and completeness
Objects in other catalogs also targeted

19 Alex Szalay, JHU Optimal Tiling
Fields have 3 degree diameter
Centers determined by an optimization procedure
A total of 2200 pointings
640 fibers assigned simultaneously
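
The tiling numbers can be put side by side with the spectroscopic target counts. The sketch below is back-of-the-envelope arithmetic only; it assumes the spectroscopic footprint matches the 10,000 square degrees quoted for the photometric survey, which the slides do not state explicitly.

```python
import math

POINTINGS = 2200
FIBERS_PER_POINTING = 640
FIELD_DIAMETER_DEG = 3.0
SURVEY_AREA_SQ_DEG = 10_000               # assumption: same area as the imaging survey
TARGETS = 1_000_000 + 100_000 + 100_000   # galaxies + quasars + stars

fiber_slots = POINTINGS * FIBERS_PER_POINTING
field_area_sq_deg = math.pi * (FIELD_DIAMETER_DEG / 2) ** 2
coverage_factor = POINTINGS * field_area_sq_deg / SURVEY_AREA_SQ_DEG

print(fiber_slots)                  # 1408000 fiber slots for 1.2 million targets
print(round(field_area_sq_deg, 2))  # 7.07 square degrees per field
print(round(coverage_factor, 2))    # summed field area relative to the survey area
```

The summed area of the circular fields exceeds the survey area, which is consistent with overlapping fields whose centers are chosen by optimization.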

20 Alex Szalay, JHU The Mosaic Camera

21 Alex Szalay, JHU Photometric Calibrations
The SDSS will create a new photometric system: u' g' r' i' z'
Primary standards: observed with the USNO 40-inch telescope in Flagstaff
Secondary standards: observed with the SDSS 20-inch telescope at Apache Point, calibrating the SDSS imaging data

22 Alex Szalay, JHU The Spectrographs
Two double spectrographs
very high throughput
two 2048x2048 CCD detectors
mounted on the telescope
light fed through slithead

23 Alex Szalay, JHU The Fiber Feed System
Galaxy images are captured by optical fibers lined up on the spectrograph slit
Manually plugged during the day into Al plugboards
640 fibers in each bundle
The largest fiber system today

24 Alex Szalay, JHU Spectrograph Status
Spectrographs:
  Laboratory observations of solar spectrum
  First astronomical observations March 1999

25 Alex Szalay, JHU JHU Contributions
Fiber spectrographs: P. Feldman, A. Uomoto, S. Friedman, S. Smee
Science Archive: A. Szalay, A. Thakar, P. Kunszt, I. Csabai, Gy. Szokoly, A. Connolly, A. Chaudhaury
Management: T. Heckman, T. Poehler, J. Crocker, A. Davidsen, A. Uomoto, A. Szalay, R. Wyse

26 Alex Szalay, JHU First Light Images
Telescope:
  First light May 9th 1998
  Equatorial scans

27 Alex Szalay, JHU The First Stripes
Camera:
  5 color imaging of >100 square degrees
  Multiple scans across the same fields
  Photometric limits as expected

28 Alex Szalay, JHU NGC 2068

29 Alex Szalay, JHU UGC 3214

30 Alex Szalay, JHU NGC 6070

31 Alex Szalay, JHU The First Quasars
The four highest redshift quasars have been found in the first SDSS test data!

32 Alex Szalay, JHU Methane/T Dwarf Discovery of several new objects by SDSS & 2MASS SDSS T-dwarf (June 1999)

33 Alex Szalay, JHU Detection of Gravitational Lensing
28,000 foreground galaxies and 2,045,000 background galaxies in test data (McKay et al. 1999)

34 Alex Szalay, JHU The first 35,000 redshifts

35 Alex Szalay, JHU SDSS Data Flow

36 Alex Szalay, JHU Data Processing Pipelines

37 Alex Szalay, JHU Concept of the SDSS Archive
Operational Archive (raw + processed data)
Science Archive (products accessible to users)
Other Archives

38 Alex Szalay, JHU SDSS Data Products
All raw data saved in a tape vault at Fermilab
  Object catalog          400 GB   parameters of >10^8 objects
  Redshift catalog        1 GB     parameters of 10^6 objects
  Atlas images            1.5 TB   5 color cutouts of >10^8 objects
  Spectra                 60 GB    in a one-dimensional form
  Derived catalogs        20 GB    clusters, QSO absorption lines
  4x4 pixel all-sky map   60 GB    heavily compressed
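
As a consistency check, the product sizes listed on this slide can be summed and compared with the "about 2 TB processed" figure quoted earlier in the talk. A small sketch, taking 1 TB = 1000 GB as the only assumption:

```python
# Sizes in GB, copied from the slide; 1.5 TB is taken as 1500 GB (assumption).
PRODUCTS_GB = {
    "Object catalog": 400,
    "Redshift catalog": 1,
    "Atlas images": 1500,
    "Spectra": 60,
    "Derived catalogs": 20,
    "4x4 pixel all-sky map": 60,
}

total_gb = sum(PRODUCTS_GB.values())
print(total_gb)  # 2041
```

The total of roughly 2 TB is dominated by the atlas images, with the object catalog a distant second.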

39 Alex Szalay, JHU Who will be using the archive?
Power Users
  sophisticated, with lots of resources
  research is centered around the archive data
  moderate number of very intensive queries
  mostly statistical, large output sizes
General Astronomy Public
  frequent, but casual lookup of objects/regions
  the archives help their research, but not central to it
  large number of small queries
  a lot of cross-identification requests
Wide Public
  browsing a ‘Virtual Telescope’
  can have large public appeal
  need special packaging
  could be a very large number of requests

40 Alex Szalay, JHU How will the data be analyzed?
The data are inherently multidimensional
  => positions, colors, size, redshift
Improved classifications result in complex N-dimensional volumes
  => complex constraints, not ranges
Spatial relations will be investigated
  => nearest neighbors
  => other objects within a radius
Data Mining: finding the ‘needle in the haystack’
  => separate typical from rare
  => recognize patterns in the data
Output size can be prohibitively large for intermediate files
  => import output directly into analysis tools

41 Alex Szalay, JHU Geometric Approach
The Main Problem:
  fast, indexed, complex searches of Terabytes in k-dim space
  searches are not necessarily parallel to the axes
  => traditional indexing (b-tree) does not work
Geometric Approach:
  Use the geometric nature of the k-dimensional data
  Quantize data into containers of ‘friends’: objects of similar colors, close on the sky, stored together
  => efficient cache performance
  Containers represent a coarse grained density map of the data
  multidimensional index tree: k-d tree + r-tree

42 Alex Szalay, JHU Organization of Searches
Queries are inherently geometric
  the primitive constraint is a half-space formed by a linear combination => k-dimensional hyperplane
Boolean combinations are allowed
  the constraints form k-dimensional polyhedra
Queries are run on the coarse grained map
  determine intersections of index tree and query polyhedron
List of containers is prepared for query
  projections of full query time and output volume created
The list of containers and query is sent to the Search Engine
  actual searches quantized by containers
Searches can be optimized, executed in parallel
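
The idea of running a query against the coarse grained map can be illustrated with half-space constraints and container bounding boxes. The sketch below is not the SDSS Search Engine; the class and function names are hypothetical, and it handles only a conjunction (AND) of half-spaces, i.e. a convex polyhedron.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class HalfSpace:
    """All points p with normal . p >= offset (a linear combination of attributes)."""
    normal: Tuple[float, ...]
    offset: float

    def contains(self, point: Sequence[float]) -> bool:
        return sum(n * p for n, p in zip(self.normal, point)) >= self.offset


def classify_box(half_spaces, lo, hi):
    """Classify an axis-aligned container [lo, hi] as 'inside', 'outside' or 'partial'."""
    fully_inside = True
    for hs in half_spaces:
        # Extreme values of normal . p over the box are reached at its corners.
        smallest = sum(n * (l if n >= 0 else h) for n, l, h in zip(hs.normal, lo, hi))
        largest = sum(n * (h if n >= 0 else l) for n, l, h in zip(hs.normal, lo, hi))
        if largest < hs.offset:
            return "outside"      # no point of the container can satisfy this constraint
        if smallest < hs.offset:
            fully_inside = False  # the hyperplane cuts through the container
    return "inside" if fully_inside else "partial"


# Hypothetical query in (g, r) magnitude space: g - r >= 0.5 AND g <= 20
query = [HalfSpace((1.0, -1.0), 0.5), HalfSpace((-1.0, 0.0), -20.0)]
print(classify_box(query, (18.0, 17.0), (19.0, 17.4)))  # inside
print(classify_box(query, (18.0, 18.0), (19.0, 19.0)))  # partial
print(classify_box(query, (21.0, 10.0), (22.0, 11.0)))  # outside
```

Containers classified as inside can be returned wholesale, outside ones skipped, and only partial ones need an object-by-object test. The per-constraint test is conservative: a container reported as partial may still turn out to hold no matching objects.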

43 Alex Szalay, JHU Geometric Indexing
“Divide and Conquer” Partitioning: 3 ⊕ N ⊕ M
  Attributes          Number
  Sky Position        3
  Multiband Fluxes    N = 5+
  Other               M = 100+
Hierarchical Triangular Mesh
Split as k-d tree, stored as r-tree of bounding boxes
Using regular indexing techniques

44 Alex Szalay, JHU Sky coordinates
Stored as Cartesian coordinates: projected onto a unit sphere
Longitude and Latitude lines: intersections of planes and the sphere
Boolean combinations: query polyhedron
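
Storing positions as unit vectors turns common sky constraints into single half-space tests. A minimal sketch, with hypothetical function names and a right ascension/declination convention assumed for the input angles:

```python
import math


def radec_to_unit_vector(ra_deg, dec_deg):
    """Project (ra, dec) in degrees onto the unit sphere as Cartesian (x, y, z)."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))


def in_cone(point, center, radius_deg):
    """Cone search as one half-space: center . point >= cos(radius)."""
    dot = sum(c * p for c, p in zip(center, point))
    return dot >= math.cos(math.radians(radius_deg))


def dec_above(point, dec_min_deg):
    """The latitude cut dec >= dec_min is the half-space z >= sin(dec_min)."""
    return point[2] >= math.sin(math.radians(dec_min_deg))


center = radec_to_unit_vector(180.0, 30.0)
obj = radec_to_unit_vector(180.5, 30.2)   # about 0.48 degrees from the center
print(in_cone(obj, center, 1.0))    # True
print(in_cone(obj, center, 0.25))   # False
print(dec_above(obj, 30.0))         # True
```

Because each test is a dot product against a plane, Boolean combinations of such tests describe the query polyhedron mentioned on the slide, with no special handling at the poles or at the 0/360 degree wrap.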

45 Alex Szalay, JHU Sky Partitioning Hierarchical Triangular Mesh - based on octahedron

46 Alex Szalay, JHU Hierarchical Subdivision
Hierarchical subdivision of spherical triangles, represented as a quadtree
In SDSS the tree is 5 levels deep - 8192 triangles
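
The triangle count follows from the construction: the octahedron contributes 8 root triangles and each level splits every triangle into 4, giving 8 x 4^5 = 8192 at depth 5. The sketch below illustrates the subdivision with edge midpoints pushed back onto the sphere; it does not reproduce the naming or vertex-ordering conventions of the actual Hierarchical Triangular Mesh, and all function names are hypothetical.

```python
import math


def htm_triangle_count(depth):
    """8 octahedron faces, each split into 4 children per level."""
    return 8 * 4 ** depth


def midpoint_on_sphere(a, b):
    """Midpoint of two unit vectors, renormalized to lie on the unit sphere."""
    summed = tuple(x + y for x, y in zip(a, b))
    norm = math.sqrt(sum(c * c for c in summed))
    return tuple(c / norm for c in summed)


def subdivide(triangle):
    """Split one spherical triangle into four using its edge midpoints."""
    v0, v1, v2 = triangle
    w0 = midpoint_on_sphere(v1, v2)
    w1 = midpoint_on_sphere(v0, v2)
    w2 = midpoint_on_sphere(v0, v1)
    return [(v0, w2, w1), (v1, w0, w2), (v2, w1, w0), (w0, w1, w2)]


def octahedron_faces():
    """The 8 root triangles: 4 around the north pole, 4 around the south pole."""
    north, south = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
    equator = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
    faces = []
    for i in range(4):
        a, b = equator[i], equator[(i + 1) % 4]
        faces.append((north, a, b))
        faces.append((south, b, a))
    return faces


def build_mesh(depth):
    """Return all triangles at the given depth of the quadtree."""
    triangles = octahedron_faces()
    for _ in range(depth):
        triangles = [child for t in triangles for child in subdivide(t)]
    return triangles


print(htm_triangle_count(5))   # 8192
print(len(build_mesh(5)))      # 8192
```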

47 Alex Szalay, JHU Result of the Query

48 Alex Szalay, JHU Magnitudes and Multicolor Searches
Galaxy fluxes: large dynamic range
errors divergent as x → 0!
But: this is an artifact of the logarithm at zero flux; in flux space the object is well localized
For multicolor magnitudes the error contours can be very anisotropic and skewed, extremely poor localization!

49 Alex Szalay, JHU Novel Magnitude Scale
b: softness
c: set to match normal magnitudes
Advantages:
  monotonic
  degrades gracefully
  objects have small error ellipse
  unified handling of detections and upper limits!
Disadvantages:
  unusual
(Lupton, Gunn and Szalay, AJ 99)
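
The defining formula did not survive in the transcript. The sketch below uses the form published by Lupton, Gunn and Szalay (1999), μ(x) = -a [asinh(x / 2b) + ln b] with a = 2.5 log10(e), where x is the flux in units of the zero-point flux and b is the softening parameter. How the slide's constant c maps onto this form cannot be recovered from the transcript, and the function names and the value of b used here are illustrative assumptions.

```python
import math

POGSON = 2.5 / math.log(10.0)  # a = 2.5 * log10(e), about 1.0857


def log_magnitude(x):
    """Conventional magnitude; undefined for x <= 0."""
    return -2.5 * math.log10(x)


def asinh_magnitude(x, b):
    """Asinh magnitude: finite and monotonic for zero and negative fluxes."""
    return -POGSON * (math.asinh(x / (2.0 * b)) + math.log(b))


b = 1e-3  # illustrative softening value, not taken from the slide
for x in (1.0, 1e-2, 0.0, -1e-3):
    conventional = log_magnitude(x) if x > 0 else None
    print(x, conventional, round(asinh_magnitude(x, b), 3))
```

For x much larger than b the two scales agree closely, while at zero or slightly negative measured flux the asinh magnitude stays finite, which is what allows detections and upper limits to be handled uniformly.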

50 Alex Szalay, JHU Flux Indexing
Split along alternating flux directions
Create balanced partitions
Store bounding boxes at each step
Build a 10-12 level tree in each triangle
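
The four steps on this slide can be sketched as a small k-d tree builder that splits at the median along alternating axes and records a bounding box at every node. This is a toy illustration with hypothetical names and made-up points, not the SDSS implementation; the `max_depth` and `leaf_size` parameters are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    lo: Point                              # bounding box minimum, stored at each step
    hi: Point                              # bounding box maximum
    points: Optional[List[Point]] = None   # filled only for leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def bounding_box(points):
    dims = range(len(points[0]))
    lo = tuple(min(p[d] for p in points) for d in dims)
    hi = tuple(max(p[d] for p in points) for d in dims)
    return lo, hi


def build(points, depth=0, max_depth=10, leaf_size=1):
    """Median split along alternating axes gives balanced partitions."""
    lo, hi = bounding_box(points)
    if depth == max_depth or len(points) <= leaf_size:
        return Node(lo, hi, points=list(points))
    axis = depth % len(points[0])          # alternate the split direction
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(lo, hi,
                left=build(ordered[:mid], depth + 1, max_depth, leaf_size),
                right=build(ordered[mid:], depth + 1, max_depth, leaf_size))


def leaves(node):
    if node.points is not None:
        return [node]
    return leaves(node.left) + leaves(node.right)


# Toy two-band "fluxes"; values are made up for illustration
sample = [(1.0, 5.0), (2.0, 1.0), (3.0, 7.0), (4.0, 3.0),
          (5.0, 8.0), (6.0, 2.0), (7.0, 6.0), (8.0, 4.0)]
tree = build(sample, leaf_size=2)
print(len(leaves(tree)), tree.lo, tree.hi)
```

A tree of depth 10 to 12, as quoted on the slide, would hold between 2^10 and 2^12 leaf cells per sky triangle.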

51 Alex Szalay, JHU How to build compact cells?
The SDSS will measure fluxes in 5 bands => asinh magnitudes
Axis-parallel splits in median flux, in 8 separate zones in Galactic latitude => 5 dimensional bounding boxes
The fluxes are strongly correlated
  => (2 + ε)-dimensional distribution of typical objects
  => widely scattered rare objects
  => large density contrasts
Therefore: first create a local density and split on its value (Csabai et al. 96)
  typical (98%), rare (2%)
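
Splitting on local density can be illustrated with a simple stand-in estimator. The sketch below uses the distance to the k-th nearest neighbor as an inverse density measure and flags the sparsest 2% as rare; this is not the method of the cited Csabai et al. reference, the function names are hypothetical, and the sample points are synthetic.

```python
import math


def kth_neighbor_distance(points, k):
    """Distance from each point to its k-th nearest neighbor (large = low density)."""
    result = []
    for i, p in enumerate(points):
        others = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(others[k - 1])
    return result


def split_typical_rare(points, k=5, rare_fraction=0.02):
    """Return (typical, rare), where rare holds the lowest-density fraction of points."""
    distances = kth_neighbor_distance(points, k)
    order = sorted(range(len(points)), key=lambda i: distances[i])
    n_rare = max(1, round(rare_fraction * len(points)))
    typical = [points[i] for i in order[:-n_rare]]
    rare = [points[i] for i in order[-n_rare:]]
    return typical, rare


# Synthetic data: 49 strongly correlated points plus one scattered outlier
points = [(i * 0.01, i * 0.01) for i in range(49)] + [(5.0, -3.0)]
typical, rare = split_typical_rare(points)
print(len(typical), rare)  # 49 [(5.0, -3.0)]
```

The brute-force neighbor search is quadratic in the number of points and only suitable for a demonstration; the point is that typical and rare objects end up in separate partitions before any flux splits are made.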

52 Alex Szalay, JHU Coarse Grained Design
Diagram labels: User Interface, Analysis Engine, Query Support, Data Warehouse, Archive

53 Alex Szalay, JHU Distributed Implementation
Diagram labels: User Interface, Analysis Engine, Master, SX Engine, Objectivity Federation, and four Slave nodes, each with Objectivity and RAID

54 Alex Szalay, JHU Exploring new methods
New spectral classification techniques
  galaxy spectra can be expressed as a superposition of a few (…) objective classification of 1 million spectra!
Photometric redshifts
  galaxy colors systematically change with redshift; the SDSS photometry works like a 5-pixel spectrograph
  => Δz = 0.05, but with 100 million objects!
Measuring cosmological parameters
  before: data analysis was limited by small number statistics
  after: dominant errors are systematic (extinction)
  => new analysis methods are required!

55 Alex Szalay, JHU Photometric redshifts
Multicolor photometry maps physical parameters (luminosity L, redshift z, spectral type T) onto observed fluxes
Inversion: u', g', r', i', z' => z, L, T
Redshifts are statistical, with large errors: Δz ≈ 0.05
The data set is huge, more than 100 million galaxies
Easy to subdivide into coarse z bins, and by type
  => study evolution
  => enormous volume - 1 Gpc^3

56 Alex Szalay, JHU Spectra from Photometry New development: low resolution spectra from multicolor photometry many galaxies => oversampling => spectra (Csabai, Budavari, Connolly, Szalay 99)

57 Alex Szalay, JHU Measuring P(k)
Karhunen-Loeve transform:
  Signal-to-noise eigenmodes of the redshift survey
  Optimal extraction of clustering signal
  Maximal rejection of systematic errors
  (Vogeley and Szalay 96, Matsubara, Szalay and Landy 99)
Pilot project using the Las Campanas Redshift Survey with 22,000 galaxies
We simultaneously measure the values of the redshift-distortion parameter (β = Ω^0.6/b), the normalization (σ_8) and the CDM shape parameter (Γ = Ωh).

58 Alex Szalay, JHU Trends
Future dominated by detector improvements
Moore’s Law growth in CCD capabilities
Gigapixel arrays on the horizon
Improvements in computing and storage will track growth in data volume
Investment in software is critical, and growing
Figure: total area of 3m+ telescopes in the world in m^2, and total number of CCD pixels in Megapix, as a function of time. Growth over 25 years is a factor of 30 in glass, 3000 in pixels.

59 Alex Szalay, JHU The Age of Mega-Surveys
The next generation of astronomical archives with Terabyte catalogs will dramatically change astronomy
  top-down design
  large sky coverage
  built on sound statistical plans
  uniform, homogeneous, well calibrated
  well controlled and documented systematics
The technology to acquire, store and index the data is here
  we are riding Moore’s Law
Data mining in such vast archives will be a challenge, but possibilities are quite unimaginable
Integrating these archives into a single entity is a project for the whole community => National Virtual Observatory

60 Alex Szalay, JHU New Astronomy – Different!
Systematic Data Exploration will have a central role in the New Astronomy
Digital Archives of the Sky will be the main access to data
Data “Avalanche”: the flood of Terabytes of data is already happening, whether we like it or not!
Transition to the new may be organized or chaotic

61 Alex Szalay, JHU NVO: The Challenges
Size of the archived data
  40,000 square degrees is 2 trillion pixels
  One band: 4 Terabytes
  Multi-wavelength: 10-100 Terabytes
  Time dimension: few Petabytes
The development of
  new archival methods
  new analysis tools
  new standards (metadata, interchange formats)
Hardware/networking requirements
Training the next generation!
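
The size figures on this slide are mutually consistent under simple assumptions. The sketch below works out the pixel scale implied by 2 trillion pixels over 40,000 square degrees and the single-band volume, assuming 2 bytes per pixel (an assumption, not stated on the slide).

```python
import math

AREA_SQ_DEG = 40_000
PIXELS = 2e12
BYTES_PER_PIXEL = 2                  # assumption: 16-bit pixels
ARCSEC2_PER_SQ_DEG = 3600 ** 2       # square arcseconds in one square degree

pixel_scale_arcsec = math.sqrt(AREA_SQ_DEG * ARCSEC2_PER_SQ_DEG / PIXELS)
one_band_tb = PIXELS * BYTES_PER_PIXEL / 1e12

print(round(pixel_scale_arcsec, 2))  # 0.51
print(one_band_tb)                   # 4.0
```

The implied pixel scale of about half an arcsecond is comparable to the 0.4 arcsec SDSS pixels, and the single-band volume reproduces the 4 Terabytes quoted.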

62 Alex Szalay, JHU Summary
The SDSS project combines astronomy, physics, and computer science
It promises to fundamentally change our view of the universe
It will determine how the largest structures in the universe were formed
Its ‘virtual universe’ can be explored by both scientists and the public
It will serve as the standard astronomy reference for several decades
Through its archive it will create a new paradigm in astronomy

63 Alex Szalay, JHU

