Sky Query: A distributed query engine for astronomy


1 Sky Query: A distributed query engine for astronomy
László Dobos¹, Tamás Budavári², Alex Szalay², István Csabai¹
¹ Eötvös Loránd University, Hungary
² Johns Hopkins University, Baltimore

2 The multiwavelength sky
Infrared (2MASS), visible (DSS), ultraviolet (GALEX)

3 Crossmatching
Astronomical catalogs in RDBMS
  O(100 million) objects
  O(1 TB – 10 TB) DB size
Done by coordinates
  RA, Dec
  Astrometric error
  Different sky coverage
  Different wavelength range
  Moving objects, etc.
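To make "done by coordinates" concrete, here is a minimal sketch of the basic test behind a coordinate crossmatch: the great-circle separation of two (RA, Dec) positions compared against a search radius. The function names and the arcsecond radius are illustrative assumptions, not part of Sky Query itself.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation of two (RA, Dec) positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # The haversine form stays numerically accurate for the very small
    # angles that are typical of catalog crossmatching.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

def is_candidate_match(pos_a, pos_b, radius_arcsec):
    """pos_a, pos_b: (ra, dec) tuples in degrees (hypothetical helper)."""
    return angular_separation_arcsec(*pos_a, *pos_b) <= radius_arcsec
```

A fixed radius like this ignores the per-catalog astrometric error listed above, which is one reason a probabilistic match is attractive.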

4 Crossmatching on demand
Crossmatch any number of catalogs
  All combinations cannot be precomputed
  Maybe catalog pairs?
User can specify
  List of catalogs to match
  Region of interest
  Priors for non-coordinate-based matching

5 Problem description
Astronomers "script" what they do
  multiple re-runs, tweaking parameters, etc.
  huge web forms: no-no
All data in RDBMS
  run computation inside the database
  use multiple servers and parallelize
  must be transparent for users
Problem description in SQL
  functions and language extensions to support astronomy
  syntax to formulate the coordinate-based probabilistic join
  spatial constraints: celestial regions

6 Sample SQL query
-- Standard SQL
SELECT s.objId, g.objID, t.objID,
       s.ra, s.dec, g.ra, g.dec, t.ra, t.dec, x.ra, x.dec
FROM SDSSDR7:Galaxies AS s
CROSS JOIN Galex:Galaxies AS g
CROSS JOIN TwoMASS:ExtendedSources AS t
-- Probabilistic crossmatch
XMATCH BAYESIAN AS x
    MUST s ON POINT(s.cx, s.cy, s.cz), 0.1
    MUST g ON POINT(g.ra, g.dec), 0.2
    MAY t ON POINT(t.ra, t.dec), 0.5
    HAVING LIMIT 1e3
-- Spatial constraint
REGION CIRCLE J , 0.3, 60
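The slide does not spell out what XMATCH BAYESIAN computes. As a rough sketch, assume that the number after each POINT(...) is that catalog's astrometric uncertainty in arcseconds, and that the match score is the two-catalog Bayes factor in the Gaussian, small-angle form B = 2 / (σ₁² + σ₂²) · exp(−ψ² / (2(σ₁² + σ₂²))), with all angles in radians. Both are assumptions, as is reading HAVING LIMIT 1e3 as a lower bound on B; the function name is hypothetical.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # one arcsecond in radians

def bayes_factor_two_catalogs(sep_arcsec, sigma1_arcsec, sigma2_arcsec):
    """Assumed two-catalog Bayes factor for 'same source' vs. 'unrelated sources'.

    sep_arcsec: angular separation of the two detections.
    sigma*_arcsec: per-catalog astrometric uncertainties (assumed meaning
    of the numbers in the XMATCH clause).
    """
    psi = sep_arcsec * ARCSEC
    var = (sigma1_arcsec * ARCSEC) ** 2 + (sigma2_arcsec * ARCSEC) ** 2
    return 2.0 / var * math.exp(-psi ** 2 / (2.0 * var))

def passes_limit(sep_arcsec, sigma1_arcsec, sigma2_arcsec, limit=1e3):
    # Assumed reading of "HAVING LIMIT 1e3": keep matches with B above the limit.
    return bayes_factor_two_catalogs(sep_arcsec, sigma1_arcsec, sigma2_arcsec) > limit
```

Under these assumptions the score falls off steeply once the separation exceeds a few times the combined uncertainty, so the threshold acts like an error-aware search radius.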

7 Zone algorithms Pure SQL: Can leverage from query optimizer of SQL Server Divide sphere into zones ZoneID: very simple hash on declination Indexes built on ZoneID and right ascension help very quick pre-filtering of match candidates very well parallelized on multi-core machines [Gray, Szalay & Nieto-Santisteban 2006, The Zones Algorithm for Finding Points-Near-a-Point or Cross-Matching Spatial Datasets]


