Published by Rolf Lawson. Modified over 9 years ago.
1
To the Problem of Organizing Heterogeneous Information
Olga Zhelenkova (1,2), Vladimir Vitkovskij (1,2)
(1) SAO RAS (Nizhnij Arkhyz), (2) ITMO University (Saint-Petersburg)
Big Data Across Disciplines: In Search of Symbiosis. 3-5 November, 2014. Groningen, Netherlands
2
The science use case: a multi-band study of a sample of radio sources (I)
A series of blind surveys of a 20' sky strip centered on δ_1981 = +04°57' ± 20' (SS433) was carried out with the RATAN-600 radio telescope in 1980-1999 at 3.9 GHz.
(1) The RC (RATAN COLD) catalogue was obtained from observations of the deep COLD survey in 1980 (a, b). The steep-spectrum RC sample has been studied since the early 1990s (c, d).
(2) The refined RC (RCR) catalogue was obtained from the blind survey observations of 1980-1999 (e). 562 RCR radio sources lie in the range α_2000 = [07h, 17h] (~100 sq. deg), which overlaps the SDSS and FIRST surveys. The catalogue is 90% complete at S_3.9GHz > 15 mJy (S_1.4GHz > 28 mJy) for a mean spectral index α_mean ~ 0.52 (S_ν ~ ν^(-α)). The sources are almost completely identified (96%), with 260 objects identified for the first time (f).
a - Parijskij et al., 1992A&AS...96..583P
b - Parijskij et al., 1993A&AS...98..391P
c - Goss et al., 1992AZh....69..673G
d - Parijskij et al., 2010ARep...54..675P
e - Soboleva et al., 2010AstBu..65...42S
f - Zhelenkova et al., 2013AstBu..68...26Z
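The power-law relation S_ν ~ ν^(-α) quoted on this slide is what connects a flux density limit at one frequency to a limit at another. The following is a minimal sketch of that scaling, not the authors' code; the function names are hypothetical, and a single spectral index is assumed for the whole sample, so it is not expected to reproduce the slide's survey limits exactly.

```python
import math


def extrapolate_flux(flux_mjy: float, freq_from_ghz: float,
                     freq_to_ghz: float, alpha: float) -> float:
    """Scale a flux density to another frequency, assuming S_nu ~ nu**(-alpha)."""
    return flux_mjy * (freq_to_ghz / freq_from_ghz) ** (-alpha)


def spectral_index(flux_1: float, freq_1: float,
                   flux_2: float, freq_2: float) -> float:
    """Two-point spectral index alpha for the same convention S_nu ~ nu**(-alpha)."""
    return -math.log(flux_1 / flux_2) / math.log(freq_1 / freq_2)


# Illustrative use: scale a 15 mJy limit at 3.9 GHz down to 1.4 GHz
# with the mean spectral index quoted on the slide.
limit_1p4 = extrapolate_flux(15.0, 3.9, 1.4, alpha=0.52)
print(f"S(1.4 GHz) = {limit_1p4:.1f} mJy")
```

With a positive α the flux density rises toward lower frequencies, which is why the equivalent 1.4 GHz limit is higher than the 3.9 GHz one.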
3
The science use case: a multi-band study of a sample of radio sources (II)
collect all freely available data for the optical identification and investigation of the RCR sample;
collect, visualize, and statistically analyze the data with VO tools: ALADIN (1), TOPCAT (2), VIZIER (3), NED (4), ds9 (5), casjobs (6), SkyView (7);
organize the collected data (PostgreSQL + web interface) for further study (8).
(1) Bonnarel et al., 2000A&AS..143...33B
(2) Taylor, 2005ASPC..347...29T
(3) Ochsenbein et al., 2000A&AS..143...23O
(4) Mazzarella et al., 2007ASPC..376..153M
(5) Joye & Mandel, 2003ASPC..295..489J
(6) O'Mullane et al., 2005cs........2072O
(7) McGlynn, 2007ASPC..382...43M
(8) http://www.sao.ru/fetch/cgi-bin/SkyObj/rcrn.cgi
4
The science use case: a multi-band study of a sample of radio sources (III)

Band | Catalogue / survey | Spectral range | Resolution, error | Limit
--- | --- | --- | --- | ---
radio | VLSS | 74 MHz | 80" | 500 mJy
radio | TXS | 365 MHz | ~10" | 150 mJy
radio | NVSS | 1.4 GHz | 45" | 2.5 mJy
radio | FIRST | 1.4 GHz | 5.4" | 1 mJy
radio | GB6 | 4.85 GHz | 3.5' | 28-37 mJy
mm, submm | Planck | 30, 44, 70, 100, 143, 217, 353, 545, 857 GHz | 33', 27', 13', 10', 7', 5', 4', 4', 4' | 0.5, 0.6, 0.5, 0.3, 0.2, 0.2, 0.2, 0.4, 0.7 Jy
IR | WISE | 3.4, 4.6, 12, 22 μm | 0.2', 0.1', 0.1', 0.1'; 0.2" | 16.6, 15.6, 11.3, 8.0 mag
IR | 2MASS | J, H, K | 0.2", 10% | 15.8, 15.1, 14.3 mag
IR | LAS UKIDSS | Y, J, H, K (H+K) | <0.1" | 20.5, 20.0, 18.8, 18.4 mag
optics | DSS-II | blue, red, IR | | ~21 mag
optics | SDSS | u, g, r, i, z (g+r+i) | ±0.1" | 22.0, 22.2, 22.2, 21.3, 20.5 mag (~23 mag)
optics | USNO-B1 | B1, R1, B2, R2, I | 0.2", 0.3 mag | V = 21 mag
optics | GSC 2.3.2 | J, F, N | 0.2"-0.28", 0.13-0.22 mag | R_F = 20 mag
5
The science use case: a multi-band study of a sample of radio sources (IV)
6
The science use case: a multi-band study of a sample of radio sources (V)
7
The science use case: problems of handling many parameters and images
1st stage: VLSS, NVSS, FIRST, GB6 and DSS (USNO-B1, GSC 2.3), SDSS DR1, 2MASS, and also NED;
2nd stage: added LAS UKIDSS, used a newer SDSS release;
3rd stage: added WISE, used newer releases of SDSS and LAS UKIDSS;
4th stage: added Planck, used SDSS DR10 and LAS UKIDSS DR9.
1) 9 catalogues (~110 parameters) and images from 7 digital surveys (12 maps, contour overlays);
2) 10 catalogues (~130 parameters) and images from 8 digital surveys (16 maps, contour overlays);
3) 11 catalogues (~150 parameters) and images from 9 digital surveys (18 maps, contour overlays). Results: the RCR sources are almost completely identified (96%), with ~45% of the objects identified for the first time;
4) 12 catalogues (>150 parameters) and images from 10 digital surveys (28 maps, contour overlays).
8
The science use case: what we need
Thanks to the efforts of the International Virtual Observatory Alliance, we now have excellent tools and web services for accessing and visualizing data, such as ALADIN, SAOImage DS9, TOPCAT, Vizier, NED and so on. Other problems still need further work.
i. Easy access to data (request and download): ++
ii. Visualization of different types of data: +
iii. Keeping the collected data up to date: ?
iv. Easy manipulation of the collected data: ?
v. Interchanging and publishing new knowledge about objects: ?
vi. Storing different data and knowledge about an object together: ?
9
Available projects: keep the collected data up to date
VO Data Keeping-up Agent (VOdka) is a web application for supporting users' data [O. Laurino & S. Smareglia, ASP 442, 571 (2011)]. It aims to:
notify users asynchronously when new data become available;
give users a quick look at which data relevant to their research interests can be found in the Virtual Observatory;
make the users' queries and results persistent.
10
Available projects: interchange and publish new knowledge about objects with annotations
AstroDAS (Bose et al., 2006IPAW..1445..154B) annotates astronomy catalogues so that astronomers can share their assertions about matching celestial objects.
AstroDAbis (Gray N. et al., arXiv:1111.6116, http://astrodabis.roe.ac.uk) provides a stand-off annotation service for astronomical catalogue entries. The service implicitly creates URI names for every object in the catalogues.
SKUA (Semantic Knowledge Underpinning Astronomy, N. Gray & T. Linde, ASP, 2009, https://code.google.com/p/skua/) is a web application providing a semantic infrastructure for astronomy, built around annotation services.
ADSASS (ADS All-Sky Survey, Pepe A. et al., arXiv:1111.3983) is an ongoing effort to turn the NASA Astrophysics Data System (ADS) into a data resource, based on ideas from geographic information systems.
11
Available formats: store together different data about an object. FITS
FITS is a simple, easily understood, self-describing format that holds its information in metadata and data blocks. Metadata are stored as key-value pairs in headers. A header may or may not be followed by a data block. The first header is called the "primary" header, and the subsequent headers are known as "extensions". The standard defines rules for developing new data structures as extensions (Pence et al., A&A 524, A42 (2010)).
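To make the key-value header layout concrete, here is a minimal sketch that writes and reads FITS header blocks using only the Python standard library. It relies on the fixed layout of the FITS standard (80-character cards, headers padded to 2880-byte blocks, an END card); it handles only logical, integer, and string values and ignores comments. The function names and the OBJECT and EXTNAME values are hypothetical, chosen for illustration.

```python
CARD_LEN = 80      # every header record ("card") is 80 ASCII characters
BLOCK_LEN = 2880   # headers and data are padded to 2880-byte blocks


def format_card(keyword: str, value) -> str:
    """Format one fixed-format card for a bool, int, or str value."""
    if len(keyword) > 8:
        raise ValueError("FITS keywords are at most 8 characters")
    if isinstance(value, bool):          # check bool first: bool is an int
        text = ("T" if value else "F").rjust(20)
    elif isinstance(value, int):
        text = str(value).rjust(20)
    elif isinstance(value, str):
        text = "'" + value.replace("'", "''").ljust(8) + "'"
    else:
        raise TypeError("this sketch supports bool, int and str only")
    card = f"{keyword.upper():<8}= {text}"
    if len(card) > CARD_LEN:
        raise ValueError("value too long for a single card")
    return card.ljust(CARD_LEN)


def build_header(pairs) -> bytes:
    """Join cards, append END, and pad with spaces to a whole block."""
    text = "".join(format_card(k, v) for k, v in pairs) + "END".ljust(CARD_LEN)
    text += " " * (-len(text) % BLOCK_LEN)
    return text.encode("ascii")


def parse_header(raw: bytes) -> dict:
    """Read the key-value pairs of one header back into a dict."""
    header = {}
    text = raw.decode("ascii")
    for start in range(0, len(text), CARD_LEN):
        card = text[start:start + CARD_LEN]
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":           # not a value card
            continue
        field = card[10:].strip()
        if field.startswith("'"):
            header[keyword] = field[1:-1].replace("''", "'").rstrip()
        elif field in ("T", "F"):
            header[keyword] = field == "T"
        else:
            header[keyword] = int(field)
    return header


# A primary header with no data, followed by one empty extension header.
primary = build_header([("SIMPLE", True), ("BITPIX", 8), ("NAXIS", 0),
                        ("EXTEND", True), ("OBJECT", "RCR sample")])
extension = build_header([("XTENSION", "IMAGE"), ("BITPIX", 8), ("NAXIS", 0),
                          ("PCOUNT", 0), ("GCOUNT", 1), ("EXTNAME", "NOTES")])
fits_bytes = primary + extension
print(parse_header(primary))
```

In practice an established FITS library would be used instead; the point here is only to show how little structure the primary header and an extension header need.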
12
Available formats: store together different data about an object. VOTable
VOTable is designed as a flexible storage and exchange format for tabular data. Its use of XML encourages interoperability. VOTable has built-in features for big data and Grid computing: it allows metadata and data to be stored separately, with the remote data linked (VOTable Format Definition V.1.093, http://cdsweb.u-strasbg.fr/doc/VOTable/1.092/votable.htx).
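As an illustration of how VOTable keeps metadata (FIELD elements) apart from the data rows (TABLEDATA), the sketch below parses a tiny inline document with Python's standard xml.etree.ElementTree. The sample table, its source names, and the read_votable helper are made up for this example; a real VOTable normally declares an XML namespace and may use binary or remotely linked data, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-row table; the names and values are not from any catalogue.
SAMPLE = """<VOTABLE version="1.3">
  <RESOURCE>
    <TABLE name="demo_sources">
      <FIELD name="name" datatype="char" arraysize="*"/>
      <FIELD name="ra" datatype="double" unit="deg"/>
      <FIELD name="flux" datatype="double" unit="mJy"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>SRC-A</TD><TD>120.5</TD><TD>31.0</TD></TR>
          <TR><TD>SRC-B</TD><TD>200.25</TD><TD>18.5</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""

# Map a few VOTable datatypes to Python converters; anything else stays text.
CONVERTERS = {"double": float, "float": float,
              "short": int, "int": int, "long": int}


def read_votable(xml_text: str):
    """Return (fields, rows) for the first TABLE of a namespace-free VOTable."""
    root = ET.fromstring(xml_text)
    table = root.find(".//TABLE")
    fields = [dict(f.attrib) for f in table.findall("FIELD")]
    rows = []
    for tr in table.iter("TR"):
        row = {}
        for meta, td in zip(fields, tr.findall("TD")):
            convert = CONVERTERS.get(meta.get("datatype"), str)
            row[meta["name"]] = None if td.text is None else convert(td.text)
        rows.append(row)
    return fields, rows


fields, rows = read_votable(SAMPLE)
for row in rows:
    print(row)
```

The units and datatypes travel with the FIELD descriptions rather than with each value, which is what lets tools exchange such tables without a separate schema.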
13
Summary
Astronomy is very good at freely sharing data, but poorer at sharing knowledge. The fundamental problem remains: data and knowledge are stored in different places. Archives contain only basic observational data, whereas all the astrophysical interpretation of that data is contained in journal papers. The next step, which may make discovery and research more effective, is to keep together all the data collected about the object or objects of a researcher's interest, and to add annotations and a textual representation of the queries (so that update requests can be repeated).
14
ALADIN stack as a new FITS extension (or a VOTable or HDF5 variant)
The internal format of ALADIN is called a stack. It is a flat, XML-like file that represents all the information collected about an object (images and tables) as planes, with appropriate descriptions and the results of requests. This data format has proved convenient for working with the heterogeneous information collected to study the objects of interest to a researcher. The structure of the ALADIN stack can be represented as a new FITS extension.
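The idea of keeping images, tables, annotations, and the text of the original queries together for one object can be sketched as a small data structure. This is a hypothetical illustration only: it is not the ALADIN stack format, and every class, field, and sample value below is an assumption made for the example.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Plane:
    """One collected item (image or table) plus the request that produced it."""
    name: str
    kind: str        # "image" or "table"
    source: str      # survey or catalogue the item came from
    query: str       # textual form of the request, kept so it can be repeated
    retrieved: str   # ISO date (YYYY-MM-DD) of the last retrieval
    location: str    # path or URL of the stored data


@dataclass
class Annotation:
    author: str
    text: str


@dataclass
class ObjectStack:
    """Everything collected about one object, kept in one place."""
    object_name: str
    planes: List[Plane] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)

    def queries_to_refresh(self, before: str) -> List[str]:
        """Queries whose results are older than the given ISO date."""
        return [p.query for p in self.planes if p.retrieved < before]

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "ObjectStack":
        data = json.loads(text)
        return cls(
            object_name=data["object_name"],
            planes=[Plane(**p) for p in data["planes"]],
            annotations=[Annotation(**a) for a in data["annotations"]],
        )


# Illustrative content; the object name, queries and paths are made up.
stack = ObjectStack("DEMO J0000+0457")
stack.planes.append(Plane("radio map", "image", "FIRST",
                          "cutout ra=0.0 dec=4.95 size=2arcmin",
                          "2012-03-01", "data/first_cutout.fits"))
stack.planes.append(Plane("optical catalogue", "table", "SDSS",
                          "cone ra=0.0 dec=4.95 radius=10arcsec",
                          "2014-09-15", "data/sdss_cone.xml"))
stack.annotations.append(Annotation("observer", "optical candidate uncertain"))

print(stack.queries_to_refresh(before="2014-01-01"))
assert ObjectStack.from_json(stack.to_json()) == stack
```

Storing the query text next to each plane is what makes a later update possible: the stale requests can simply be issued again and their results compared with what is already in the collection.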
15
Thank you for your attention!
Work supported by the Russian Foundation for Basic Research, grants 12-07-00503-a, 14-07-00361-a