Nicholas Cross, Rob Blake, Ross Collins, Mark Holliman, Mike Read, Eckhard Sutorius, Nigel Hambly, Andy Lawrence, Bob Mann, Keith Noddle Wide Field Astronomy.

First Public Data Releases from the VISTA Science Archive
Presentation transcript:

Nicholas Cross, Rob Blake, Ross Collins, Mark Holliman, Mike Read, Eckhard Sutorius, Nigel Hambly, Andy Lawrence, Bob Mann, Keith Noddle Wide Field Astronomy Unit, Institute for Astronomy, University of Edinburgh, UK

What is the VSA?
- Relational Database Management System
- Linked tables containing different types of data
- Design emulates data structure
- Metadata from images
- Catalogue data: detections, merged sources, variability statistics
- Self-describing: information about each programme and processing in curation tables
- Microsoft SQL Server: reliable product used by SDSS and WSA, two of the most successful astronomical archives
- Main DB and documented static releases
- Multiple interfaces for different scientific usage
18th January 2013, VISTA, A Celebration!

Purposes of Science Archive
- Interface for survey teams and community to explore survey products and do science.
- Interface for survey teams to check data for quality control.
- Repository of VISTA data, from reduced images to complex catalogue products.
- Requires both:
  - a dynamic main DB, which is updated with new data, better calibration, reprocessing, quality control and higher-order products;
  - static, well-documented release DBs that can be referred to in publications.

The VSA:

WFAU tasks
- Ingest nightly processed image and catalogue data from CASU (see MJI talk)
- Provenance: link related images
- Quality control: automated + input from teams
- Link tile and pawprint data
- Process data for semester, done per programme:
  - Produce and ingest deep stacks/tiles/mosaics + catalogues
  - Merge pass-band catalogues to create source tables
  - Create neighbour tables to link external catalogues
  - Link multi-epoch data and calculate variability statistics
- Release a documented, static data product to users
- Create useful interface tools for users to query specific data, view and analyse it

Automated Pipeline
- Post-QC tasks run in automated pipeline
- Uses DB to determine what needs to be done:
  - How many pointings, how many filters, how many epochs?
  - What has already been completed?
  - Have processes been done in correct order?
  - Consistency between expected products and actual
- Reduces workload on operations (when it is working)
- Essential for processing many PI programmes with same range of products as main surveys.
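
The bookkeeping described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage names, the `(pointing, filter)` product key and the `completed` mapping are hypothetical stand-ins for what the real pipeline reads from its curation tables.

```python
from itertools import product

# Hypothetical processing stages, listed in the order they must run.
STAGES = ["deep_stack", "source_merge", "neighbours", "variability"]


def expected_products(pointings, filters):
    """Every (pointing, filter) pair a programme should end up with."""
    return set(product(pointings, filters))


def next_tasks(pointings, filters, completed):
    """Return the earliest unfinished stage and the products it still needs.

    `completed` maps a stage name to the set of (pointing, filter)
    pairs already recorded as done for that stage.
    """
    wanted = expected_products(pointings, filters)
    for stage in STAGES:
        missing = wanted - completed.get(stage, set())
        if missing:
            # Later stages depend on this one, so stop here.
            return stage, sorted(missing)
    return None, []


def unexpected_products(pointings, filters, completed):
    """Products recorded as done that the programme never asked for."""
    wanted = expected_products(pointings, filters)
    return {s: done - wanted for s, done in completed.items() if done - wanted}


completed = {
    "deep_stack": {("p1", "J"), ("p1", "Ks"), ("p2", "J"), ("p2", "Ks")},
    "source_merge": {("p1", "J"), ("p1", "Ks")},
}
stage, todo = next_tasks(["p1", "p2"], ["J", "Ks"], completed)
print(stage, todo)
```

Walking the stages in order and stopping at the first incomplete one is one simple way to enforce "correct order"; comparing expected against recorded products covers the consistency check.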

VISTA complications
- Technical:
  - Pawprints + tiles: two layers of products, detections from both kept
  - 10x increase in catalogue data
  - VVV so time-consuming that a separate server is needed, but some tables and data are common: synchronisation
- Political:
  - WFCAM: WFAU deal with UKIDSS, CASU, UKIRT (very occasionally)
  - VISTA: WFAU deal with separate survey teams, ESO, CASU

VISTA Tiles
- Pawprint: 16 detectors, spaced 90% of a detector width apart in the X direction and 42% in Y
- 6 pawprints can make a tile: 2x depth on average, but ears have single depth.

Tile-pawprint detections linked
- Able to compare tile and pawprint photometry.
- Select data from specific regions of the tile or specific pawprints.

Data Rates: WFCAM and VISTA
- Images:
  - WFCAM: 1720 raw image frames a day (4 detectors, 2k x 2k each)
  - VISTA: 580 raw image frames a day (16 detectors, 2k x 2k each)
  - Only 30% increase in raw image volume.
- Catalogues:
  - More area (2.9x), increased sensitivity (~2x), tile + pawprints (~3x).
  - Expect 15-20x as many detections.
  - 14x: WSA 3.2M detections per day, VSA 44M
- Catalogues are important factor for relational DB
- VVV: VSA currently has >21 billion rows, expect ~ [...]
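
The figures on this slide can be checked with simple arithmetic. The sketch below assumes that raw volume scales with frames per day times detectors per frame (all detectors being 2k x 2k); that simple count gives an increase of roughly a third, in line with the "only 30%" quoted.

```python
# Figures quoted on the slide.
wfcam_frames, wfcam_detectors = 1720, 4
vista_frames, vista_detectors = 580, 16

# Assumption: detector-images per day is a fair proxy for raw image volume.
wfcam_images = wfcam_frames * wfcam_detectors
vista_images = vista_frames * vista_detectors
raw_increase = vista_images / wfcam_images - 1

# Expected growth in detections: area x sensitivity x (tile + pawprints).
expected_factor = 2.9 * 2 * 3

# Measured growth from the quoted daily detection counts.
measured_factor = 44e6 / 3.2e6

print(f"raw image volume: +{raw_increase:.0%}")
print(f"expected detections: {expected_factor:.1f}x")
print(f"measured detections: {measured_factor:.2f}x")
```

The product of the three catalogue factors falls inside the quoted 15-20x range, and the ratio of the daily detection counts rounds to the quoted 14x.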

VVV takes archive to new level
- Up to P87:
  - VMC: 500 million detections, 18 million objects
  - VHS: 1.9 billion detections, 270 million objects
  - VVV: 21 billion detections, 500 million objects
- Data rate of VVV is 10x other surveys.
- Combination of shallow and dense fields
- ~1 billion stars, 80-100 epochs.

Galactic Plane and Bulge from VISTA and WFCAM
- VVV + GPS mosaic, ~1 billion stars
- March 29th 2012: #1 most read article on BBC

VVV processing
- VVV takes months to process 1 year of data.
- Variability statistics is slowest stage.
- Many speed-ups already:
  - Separate servers for main and release DBs
  - Split detection tables into months
  - Parallelisation of CPU-intensive processes that do not call DB
- I/O is main bottleneck now. Solutions:
  - Improved optimisation of curation queries
  - Column-oriented databases
  - Solid-state disks for DB
- Reprocessing of data a major headache
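
Two of the speed-ups listed here, monthly partitioning and parallelising CPU-bound work that needs no database access, can be illustrated with a small Python sketch. The row layout `(source_id, obs_date, mag)` and the choice of magnitude RMS as the variability statistic are assumptions made for the example, not the archive's actual schema or statistics.

```python
from collections import defaultdict
from datetime import date
from statistics import pstdev


def split_by_month(detections):
    """Group (source_id, obs_date, mag) rows by (year, month)."""
    months = defaultdict(list)
    for row in detections:
        obs_date = row[1]
        months[(obs_date.year, obs_date.month)].append(row)
    return dict(months)


def magnitude_rms(light_curve):
    """Population standard deviation of one source's magnitudes."""
    return pstdev(light_curve) if len(light_curve) > 1 else 0.0


def variability(detections, mapper=map):
    """Per-source RMS; `mapper` can be swapped for a worker pool's map()."""
    curves = defaultdict(list)
    for source_id, _obs_date, mag in detections:
        curves[source_id].append(mag)
    ids = sorted(curves)
    return dict(zip(ids, mapper(magnitude_rms, (curves[i] for i in ids))))


detections = [
    (1, date(2011, 3, 5), 15.0),
    (1, date(2011, 4, 2), 15.0),
    (2, date(2011, 3, 9), 17.0),
    (2, date(2011, 4, 20), 19.0),
]
months = split_by_month(detections)
rms = variability(detections)
print(sorted(months), rms)
```

Because `magnitude_rms` touches no shared state, the built-in `map` used here could be replaced by the `map` method of a `concurrent.futures.ProcessPoolExecutor` to spread the work over several cores.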

Using the Archive
- Support
- Documentation / Publications
- Quality Control
- Different Interfaces

Expert support
- VSA: helpdesk queries per month
- Similar in WSA, a more mature archive.
- Types of queries posed:
  - Detailed knowledge of data (and data quality)
  - Detailed knowledge of SQL
- WFAU have a mixture of technical and scientific knowledge.

Provenance tracking is crucial
- Rare object search is major analysis mode
  - Selecting on wide range of attributes
  - Reproducibility
  - Is this an unusual object or a junk data value?
- Quality control at image and detection level is vital.
- Track back from each data value to parent data and processing chain
- Complex data structure, absent in flat-file catalogue
  - At file level, e.g. tiles to pawprints, deeps to OBs, stack to raw.
  - At detection level: multi-band to single band, tile detection matched to pawprint detections.
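
Tracking back from a product to its parents amounts to walking a graph of "built from" links. The following Python sketch shows the idea; the product names and the `PARENTS` mapping are invented for illustration, whereas in the archive such links live in relational tables.

```python
# Hypothetical provenance links: each product maps to the products it was built from.
PARENTS = {
    "deep_tile_J": ["tile_A", "tile_B"],
    "tile_A": ["paw_A1", "paw_A2"],
    "tile_B": ["paw_B1"],
    "paw_A1": ["raw_001"],
    "paw_A2": ["raw_002"],
    "paw_B1": ["raw_003"],
}


def trace_back(product, parents=PARENTS):
    """Return every ancestor of `product`, nearest generation first."""
    seen, order, frontier = set(), [], [product]
    while frontier:
        next_frontier = []
        for item in frontier:
            for parent in parents.get(item, []):
                if parent not in seen:
                    seen.add(parent)
                    order.append(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return order


print(trace_back("deep_tile_J"))
```

A breadth-first walk returns tiles before pawprints before raw frames, which mirrors the file-level chain described on the slide.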

Bit-wise flagging of detection data
- VMC deep tile.
- Under-exposed strip
- Edges
- Low confidence
- Deblends
- Saturated
- Bad pixels
- Detector 16
- Bright stars to come
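
Bit-wise flagging packs several quality warnings into one integer per detection, so a query can test any combination with a single mask. The Python sketch below uses the flag names from this slide, but the bit positions and the choice of which flags count as serious are illustrative assumptions, not the archive's real assignments.

```python
from enum import IntFlag


class QualityFlag(IntFlag):
    # Bit positions here are illustrative, not the archive's real assignments.
    DEBLENDED = 1 << 0
    BAD_PIXEL = 1 << 1
    LOW_CONFIDENCE = 1 << 2
    SATURATED = 1 << 3
    EDGE = 1 << 4
    UNDEREXPOSED_STRIP = 1 << 5
    DETECTOR_16 = 1 << 6


# Flags treated as serious enough to exclude a detection in this sketch.
SERIOUS = QualityFlag.SATURATED | QualityFlag.EDGE | QualityFlag.UNDEREXPOSED_STRIP


def is_clean(err_bits, mask=SERIOUS):
    """True when none of the masked bits are set in the stored integer."""
    return (err_bits & mask) == 0


err_bits = int(QualityFlag.DEBLENDED | QualityFlag.SATURATED)
print(err_bits, is_clean(err_bits), is_clean(int(QualityFlag.DEBLENDED)))
```

Keeping the mask as a parameter lets each science case decide which warnings matter: a deblended detection passes the default mask here, while a saturated one does not.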

Provenance: zero point (ZP) and seeing of input files to a VIDEO mosaic, as a function of time

Using the VSA
- Freeform SQL query (enhanced)
- Cross ID (now includes Synoptic Source)
- Archive Listing
- GetImage / MultiGetImage

VSA – schema browser 28/08/2012 SpS15 Data Intensive Astronomy

VSA catalogues (28/1/2010, VISTA PSPI Meeting)

Archive Listing

An Example: Selecting point source variables in VIDEO
- Select light curve data
- Select variable: old selection + zRange > 4 mag
- Type Ia SN2010gy

Enhanced Queries
- Take data from outside the VSA, e.g. own data, online catalogue.
- Use crossID to match with main survey Source table.
- Save matched table as FITS/VOTable
- Upload matched table as #userTable, use in standard queries
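
The matching step in this workflow is a positional cross-match: each external object is paired with the nearest archive source within some radius. Below is a minimal brute-force sketch in Python; the row layout `(id, ra_deg, dec_deg)`, the function names and the 1 arcsecond default radius are assumptions for illustration and say nothing about how the crossID service is implemented.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcseconds (haversine formula, inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600


def cross_match(user_rows, source_rows, radius_arcsec=1.0):
    """Pair each user row with its nearest archive source inside the radius.

    Rows are (id, ra_deg, dec_deg). This brute-force loop is O(N*M).
    """
    matches = []
    for uid, ura, udec in user_rows:
        best = None
        for sid, sra, sdec in source_rows:
            sep = separation_arcsec(ura, udec, sra, sdec)
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (sid, sep)
        if best is not None:
            matches.append((uid, best[0], round(best[1], 3)))
    return matches


user_rows = [("u1", 150.0, 2.0), ("u2", 150.5, 2.5)]
source_rows = [
    (101, 150.0, 2.0 + 0.5 / 3600),
    (102, 150.0, 2.0 + 0.9 / 3600),
    (103, 151.0, 3.0),
]
matches = cross_match(user_rows, source_rows)
print(matches)
```

The nested loop is fine for a short user list; for large catalogues a spatial index or a precomputed neighbour table would be the usual replacement.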

Surveys are not independent entities
- Importance of multi-wavelength astronomy
  - GAMA, COSMOS fields: deciphering galaxy evolution, with complex star formation histories with dust.
  - Rare objects (e.g. high-z QSOs, BDs, odd transients)
- Common survey fields
  - VST-ATLAS and VHS
  - KIDS and VIKING (GAMA extends this in some areas)
  - VVV, GPS, IPHAS, VPHAS+
- Need for data integration
  - Cross-neighbours tables, publishing to VO
  - Matched aperture photometry

VHS -- ATLAS
VIKING -- KIDS
VVV -- VPHAS+

WFAU Science Archives (diagram labels: earlier processing at CASU; WFAU)

Combining Data from Surveys
- Neighbour tables (main existing method)
  - Joins to main external surveys: 2MASS, SDSS, SSA, GALEX, FIRST, WISE, ...
- Matched aperture photometry
  - Pipeline almost ready. First use on P90 data.
- Using VO interfaces
  - Output in VOTables, launcher for TOPCAT
  - SIAP services, footprints for Aladin.
- New MyDB-style applications on the way.
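
A neighbour table stores precomputed pairs of survey and external sources, so combining catalogues becomes an ordinary join. The sketch below uses Python's built-in `sqlite3` purely as a stand-in for the archive's SQL Server; every table and column name (`source`, `ext_source`, `source_x_ext`, `masterID`, `slaveID`, `distArcsec`) is hypothetical and chosen only to show the pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (sourceID INTEGER PRIMARY KEY, ksMag REAL);
    CREATE TABLE ext_source (extID INTEGER PRIMARY KEY, w1Mag REAL);
    -- Neighbour table: one row per (survey source, external source) pair
    -- within the pairing radius, with the separation precomputed.
    CREATE TABLE source_x_ext (masterID INTEGER, slaveID INTEGER, distArcsec REAL);

    INSERT INTO source VALUES (1, 15.2), (2, 17.8);
    INSERT INTO ext_source VALUES (10, 14.9), (11, 16.0);
    INSERT INTO source_x_ext VALUES (1, 10, 0.3), (1, 11, 2.4), (2, 11, 0.8);
""")

# Keep only the nearest external neighbour of each survey source.
rows = conn.execute("""
    SELECT s.sourceID, e.extID, s.ksMag - e.w1Mag AS colour, x.distArcsec
    FROM source AS s
    JOIN source_x_ext AS x ON x.masterID = s.sourceID
    JOIN ext_source AS e ON e.extID = x.slaveID
    WHERE x.distArcsec = (SELECT MIN(distArcsec) FROM source_x_ext
                          WHERE masterID = s.sourceID)
    ORDER BY s.sourceID
""").fetchall()
print(rows)
```

Precomputing the pairs once means that the expensive positional match is not repeated for every user query; the join then only has to follow integer keys.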

WFAU VDFS Publications
- Hambly et al. 2008, MNRAS, 384, 637 (WSA)
- Cross et al. 2009, MNRAS, 399, 1730 (Multi-Epoch processing)
- Cross et al. 2012, A&A, 548, A119 (VSA)

Summary
- VISTA: an order of magnitude increase in catalogue data volume over WFCAM.
- VSA has met the challenge and is producing regular releases to the survey teams and wider community.
- VVV is very difficult, but the challenge helps to keep WFAU as one of the leading data centres in the world.
- VSA is the most efficient way to do many types of science:
  - finding rare objects (transients in the VVV, very cool brown dwarfs, z>7 QSOs);
  - working with data across wide areas and multiple wavelengths.
- Edinburgh Data Centre with VSA at the centre, linking with WSA, OSA, SSA and external data.
- VISTA will be very successful: many papers already published, many to come (see PI talks).