Astronomical Data Archiving and Curation Clive Page AstroGrid Project University of Leicester 2004 March 22.



Importance of Data Archiving in Astronomy
No observation can be repeated exactly, as the sky is always changing
  – After a violent event (e.g. supernova explosion) earlier observations are crucial
Observations over a long period can identify
  – Variability
  – Proper motions
In recent years all data come in digital form
Important earlier datasets on photographic plates have now mostly been digitised.

Principal Data Types in Archives
Raw data from telescopes
Observing logs
Calibration datasets
Calibrated/reduced data:
  – Images
  – Spectra
  – Time-series
Derived data products:
  – Source catalogues
  – Sky survey image collections

Data Formats
A variety, but FITS format predominates:
  – FITS can store arrays and tables, and encapsulates data and metadata, but…
Standards have evolved, older FITS files less compatible
Individual observatory conventions also exist
Metadata vital - sometimes to be found only:
  – In associated software packages or documentation
  – In the heads of those developing the software
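To make "encapsulates data and metadata" concrete: a FITS header is a sequence of 80-character ASCII records ("cards"), padded to a multiple of 2880 bytes and closed by an END card. The sketch below writes and re-reads a minimal header using only the Python standard library. It is an illustration under simplifying assumptions, not a full FITS implementation: it handles only logical, integer and short string values, and the function names and the TELESCOP value are chosen here for the example.

```python
CARD_LEN = 80      # every FITS header record ("card") is 80 ASCII characters
BLOCK_LEN = 2880   # header and data units are padded to multiples of 2880 bytes


def card(keyword, value, comment=""):
    """Format one fixed-format card: keyword in columns 1-8, '= ' in 9-10."""
    if isinstance(value, bool):
        field = ("T" if value else "F").rjust(20)   # logical, right-justified
    elif isinstance(value, int):
        field = str(value).rjust(20)                # integer, right-justified
    else:
        # Strings are quoted; embedded quotes are doubled (simplified handling).
        field = ("'" + str(value).replace("'", "''").ljust(8) + "'").ljust(20)
    text = f"{keyword:<8}= {field}"
    if comment:
        text += f" / {comment}"
    return text[:CARD_LEN].ljust(CARD_LEN)


def build_header(cards):
    """Join cards, append END, and pad with spaces to a 2880-byte boundary."""
    body = "".join(cards) + "END".ljust(CARD_LEN)
    padding = -len(body) % BLOCK_LEN
    return (body + " " * padding).encode("ascii")


def parse_header(raw):
    """Read keyword/value pairs back (assumes no ' / ' inside string values)."""
    meta = {}
    for start in range(0, len(raw), CARD_LEN):
        line = raw[start:start + CARD_LEN].decode("ascii")
        key = line[:8].strip()
        if key == "END":
            break
        if line[8:10] == "= ":
            value = line[10:].split(" / ")[0].strip()
            meta[key] = value.strip("'").rstrip()
    return meta


# Minimal primary header with no data array (NAXIS = 0).
header = build_header([
    card("SIMPLE", True, "conforms to FITS standard"),
    card("BITPIX", 8),
    card("NAXIS", 0),
    card("TELESCOP", "XMM-Newton", "illustrative metadata value"),
])
metadata = parse_header(header)
```

Anything a reader needs that is not recorded in cards like these (calibration conventions, observatory-specific keywords) is exactly the metadata that risks surviving only in software or in people's heads.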

Important UK data archive sites
Cambridge - Astronomical Survey Unit (CASU):
  – INT wide-field survey, APM catalogue, VizieR mirror, UKIRT archive. In future: WFCAM, VISTA.
Edinburgh - Wide-field Astronomy Unit (WFAU):
  – SuperCOSMOS images and catalogue, 6dF galaxy survey, SDSS (Sloan Digital Sky Survey) copy. In future: WFCAM, VISTA.
Leicester - Database and Archive Service (LEDAS):
  – EXOSAT, GINGA, ASCA, ROSAT, XMM; Chandra mirror, many optical datasets. In future: SWIFT, SuperWASP source archive.

Important UK data archive sites (continued)
Manchester - Jodrell Bank:
  – MERLIN, HI surveys, European VLBI datasets, pulsar catalogues. In future: e-MERLIN archive.
Rutherford Laboratory:
  – World Data Centre for STP, CLUSTER and ISO UK data centres, Starlink software collection and data archive. In future: SuperWASP image archive.
UCL - Mullard Space Science Laboratory:
  – YOHKOH, SOHO, TRACE, RESIK and other solar/STP archives.

Database management systems
DBMS currently used by UK archives include:
  – BROWSE – written at ESOC/ESTEC in 1980s.
  – DB2 (IBM)
  – Ingres
  – miniSQL – free simple DBMS
  – MySQL – open source, supports many web sites
  – PostgreSQL – open source, good spatial indexing
  – SQL Server (Microsoft)
  – Sybase ASE
  – WFCtools – written at Harvard/SAO for accessing large optical catalogues
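Spatial indexing matters because the commonest archive query is a cone search: "all sources within r degrees of this position". Without an index, every row must be tested. The sketch below shows one simple scheme, bucketing sources into declination zones so that only the zones overlapping the cone are scanned before an exact angular-separation test. It is an illustration only: the zone height, function names and sample sources are assumptions made here, and it is not a description of how PostgreSQL or any of the archives listed above index their data.

```python
import math
from collections import defaultdict

ZONE_HEIGHT_DEG = 1.0  # assumed zone height; a real system would tune this


def zone_of(dec_deg):
    """Integer zone number for a declination in degrees."""
    return int(math.floor((dec_deg + 90.0) / ZONE_HEIGHT_DEG))


def build_index(sources):
    """Group (name, ra_deg, dec_deg) tuples by declination zone."""
    index = defaultdict(list)
    for name, ra, dec in sources:
        index[zone_of(dec)].append((name, ra, dec))
    return index


def separation_deg(ra1, dec1, ra2, dec2):
    """Angular separation on the sphere (haversine formula), in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(index, ra, dec, radius_deg):
    """Names of sources within radius_deg of (ra, dec), scanning few zones."""
    low = zone_of(max(-90.0, dec - radius_deg))
    high = zone_of(min(90.0, dec + radius_deg))
    hits = []
    for zone in range(low, high + 1):
        for name, s_ra, s_dec in index.get(zone, ()):
            if separation_deg(ra, dec, s_ra, s_dec) <= radius_deg:
                hits.append(name)
    return sorted(hits)


# Made-up sources for illustration.
catalogue = [("A", 10.0, 20.0), ("B", 10.05, 20.02),
             ("C", 200.0, -45.0), ("D", 10.0, 21.5)]
nearby = cone_search(build_index(catalogue), 10.0, 20.0, 0.1)
```

The zone filter only narrows the search in declination; a production index would also restrict right ascension within each zone.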

User access methods
Residual telnet/ssh services
  – Allow registered users to perform DBMS operations, store their own subsets etc.
  – Mostly obsolescent
FTP access for large downloads
Web interfaces use CGI with Perl, PHP, or Python
  – Results mostly returned as HTML tables/GIFs, with some FITS and VOTable.
No use (pre-AstroGrid) of XML-based Web Services (XForms, SOAP, WSDL etc.)
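As an illustration of returning query results as VOTable rather than an HTML table, the sketch below serialises a list of rows into a minimal VOTable-style XML document with the Python standard library. It is a simplified example under stated assumptions: namespace and schema declarations are omitted, the version attribute, table name, function name and sample rows are chosen here for illustration, and it is not the code of any archive's web interface.

```python
import xml.etree.ElementTree as ET


def rows_to_votable(fields, rows):
    """Serialise rows as VOTABLE > RESOURCE > TABLE > (FIELD..., DATA/TABLEDATA).

    fields: list of attribute dicts, one per column (name, datatype, ...).
    rows:   iterable of row tuples, in the same column order as fields.
    """
    votable = ET.Element("VOTABLE", version="1.1")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE", name="results")
    for attributes in fields:
        # Each FIELD carries the column metadata alongside the data.
        ET.SubElement(table, "FIELD", attributes)
    data = ET.SubElement(table, "DATA")
    tabledata = ET.SubElement(data, "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")


# Made-up columns and rows for illustration.
columns = [
    {"name": "ra", "datatype": "double", "unit": "deg"},
    {"name": "dec", "datatype": "double", "unit": "deg"},
    {"name": "id", "datatype": "char", "arraysize": "*"},
]
document = rows_to_votable(columns, [(10.68, 41.27, "src1"),
                                     (83.82, -5.39, "src2")])
```

Unlike an HTML table, the FIELD elements keep units and data types attached to each column, so a client program can interpret the result without scraping.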

Problems – (1) technical
Data storage: thanks to Moore’s Law, new datasets are much bigger than old ones. May get adequate storage for existing data from:
  – new big projects like WFCAM, SWIFT, e-MERLIN, VISTA?
  – SRIF funding?
International Virtual Observatory Alliance (IVOA) is developing new standards e.g. for tabular data, registry, query language.
  – These have to be implemented before they are fully stable.
DBMS: freeware like MySQL, PostgreSQL improving rapidly, probably adequate.
  – If not, licence costs may be substantial.
Database middleware (OGSA-DAI, ELDAS)
  – still developing, not quite ready for large-scale use

Problems – (2) structural
Data preservation requires migration to new platforms, new DBMS every few years
Many DBMS in use are incapable of supporting functionality required e.g. no spatial indexing
  – Also implies migration to new DBMS
AstroGrid (and other VO projects) will supply the middleware, but have no remit (and no funding) to update the archives themselves.
Serious data mining research will require serious processing power near the data stores (e.g. an Astronomical Data Warehouse).
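Each migration to a new platform is a chance to lose or corrupt files silently. One common safeguard, sketched here as an assumption about good practice rather than anything the archives above are stated to do, is to record a checksum manifest before the move and compare it with one built afterwards. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large data files need little memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def compare_manifests(before, after):
    """Return (missing, changed): files lost, and files whose bytes differ."""
    missing = sorted(set(before) - set(after))
    changed = sorted(name for name in before
                     if name in after and before[name] != after[name])
    return missing, changed
```

A typical use would be to call `build_manifest` on the old store, copy the data, call it again on the new store, and require `compare_manifests` to return two empty lists before the old copy is retired.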

Problems – (3) managerial
VO software from AstroGrid includes MySpace: a temporary user space on remote systems.
  – Optional, but highly desirable because of need to “shift the results not the data”
  – will sites give space to users unknown to them?
  – how to administer many ad-hoc groups of users?
Creation of the VO Registry will require considerable input from managers of existing data archives – exact mechanism TBD.

Manpower
Additional manpower needed for:
Migration of existing data collections to new platforms, and often to new DBMS
Installation of AstroGrid and other VO software
Provision of metadata to the Registry
Implementation and operation of MySpace
Setting up astronomical data warehouse facilities at a few sites

Funding problems
SRIF funding is for hardware only, not manpower
AstroGrid2 bid failed to get support for elements of data centre support
PPARC grant applications to support data archiving and curation have an unhappy history: they tend to fall between research and projects funding lines.

Summary
Archives have a vital role in astronomy
  – They are basically in good shape in that no important bits have been lost (as far as I know)
  – But we have been muddling through
Technical problems look soluble
Data storage – we may be able to find enough
Much work needed on current archives for them to survive into the VO era.
Additional skilled manpower will be essential – sources of support for this are lacking
Continuity is vital for archives – this is a long-term problem with no obvious solution.