The Virtual Observatory and other data issues
(1) computers and astronomy today
(2) background technology
(3) the future : opportunities and problems
(4) the VO vision
(5) VO progress
(6) next steps
OECD workshop, München, Andy Lawrence, Dec 2003

computers and astronomy today (1)

key IT areas
(1) facility operations
(2) facility output processing
(3) shared supercomputers for theory
(4) science archives
(5) end-user tools
(1-3) : big bucks
(4-5) : smaller bucks, but set requirements for (1-2)

astronomical archives
major archives growing at TB/yr
the issue is not storage but management (curation)
improving quality of data access and presentation needs specialist data centres

end users
increasing fraction of archive re-use
increasing multi-archive use
most download small files and analyse at home
some users process whole databases
reduction standardised; analysis home-grown

needles in a haystack
Hambly et al. : faint moving object is a cool white dwarf
– may be a solution to the dark matter problem
– but hard to find : one in a million
– even harder across multiple archives

solar-terrestrial links
Coronal mass ejection imaged by space-based solar observatory
Effect detected hours later by satellites and ground radar

background technology (2)

dogs and fleas
there is a very large dog

hardware trends
ops, storage, bw : all 1000x/decade
– can get 1 TB IDE = $5K
– backbones and LANs are Gbps
but device bw 10x/decade
– real PC disks 10 MB/s; fibre channel SCSI possibly 100 MB/s
and last mile problem remains
– end-to-end b/w typically 10 Mbps

operations on a TB database
searching at 10 MB/s takes a day
– solved by parallelism
– but parallel code is hard ! ==> people
transfer at 10 Mbps takes a week
– leave it where it is ==> data centres provide search service
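These timings follow from simple arithmetic; a minimal back-of-envelope check in Python, using the figures quoted on the slide (1 TB database, 10 MB/s disk read, 10 Mbps end-to-end link):

```python
# Back-of-envelope check of the slide's timings for a 1 TB database.
TB = 1e12                 # bytes
db_size = 1 * TB

scan_rate = 10e6          # bytes/s  (10 MB/s single-disk sequential read)
link_rate = 10e6 / 8      # bytes/s  (10 Mbps end-to-end network = 1.25 MB/s)

scan_time_days = db_size / scan_rate / 86400
transfer_time_days = db_size / link_rate / 86400

print(f"local scan   : {scan_time_days:.1f} days")      # ~1.2 days
print(f"network pull : {transfer_time_days:.1f} days")  # ~9 days, i.e. about a week
```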

network development
higher level protocols ==> transparency
TCP/IP : message exchange
HTTP : doc sharing (web)
grid suite : CPU sharing
XML/SOAP : data exchange
==> service paradigm

on the internet horizon
workflow definition
dynamic semantics (ontology)
software agents

the future : opportunities and problems (3)

data rich future
heritage : Schmidt, IRAS, Hipparcos
current hits : VLT, SDSS, 2MASS, HST, Chandra, XMM, WMAP
coming up : UKIDSS, VISTA, ALMA, JWST, Planck, Herschel
cross fingers : LSST, ELT, LISA, Darwin, SKA, XEUS, etc.
plus lots more
danger is being lost in the trees

data growth
astronomical data is growing fast (doubling time T2 < 18 months)
but so is computing power
so what's the problem ?
(1) fast facilities
(2) end user delivery

data rates : collection and processing
reference : sky at 0.1", 1 colour, 16 bits = 100 TB
worst problems for FAST experiments
– SKA : peak PB/sec out of correlator
– ELT MCAO real time control : Pflops
– repeat wide field imaging, e.g. LSST
– N**2 processing, e.g. GAIA
achievable but challenging
– supercomputer today = 10 Tflops; Pflops ok in 20 years
– may need dedicated hardwired logic
– real problem is s/w development and ops logistics
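The 100 TB reference figure can be checked directly: the whole sky covers about 41,253 square degrees, so at 0.1 arcsec sampling with 16 bits per pixel a single-colour all-sky image comes to roughly 100 TB. A quick sketch of the arithmetic, with the pixel scale and bit depth taken from the slide above:

```python
# All-sky image size at 0.1" sampling, one colour, 16 bits per pixel.
whole_sky_deg2 = 41_253              # square degrees over the full sphere
arcsec2_per_deg2 = 3600 ** 2         # (3600")^2 per square degree
pixel_area_arcsec2 = 0.1 ** 2        # 0.1" x 0.1" pixels
bytes_per_pixel = 2                  # 16 bits

n_pixels = whole_sky_deg2 * arcsec2_per_deg2 / pixel_area_arcsec2
total_bytes = n_pixels * bytes_per_pixel

print(f"{total_bytes / 1e12:.0f} TB")  # ~107 TB, i.e. the ~100 TB quoted
```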

data rates : archive
VISTA : 100 TB/yr by 2007
SKA datacubes : 100 PB/yr by 2020
not a technical or financial problem
– LHC doing 100 PB/yr by 2007
issue is logistic : data management
need professional data centres

data rates : user delivery
disk I/O and bandwidth
– end-user bottlenecks will get WORSE
– but links between data centres can be good
move from download to service paradigm
– leave the data where it is
– operations on data (search, cluster analysis, etc.) as services
– shift the results, not the data
– networks of collaborating data centres (datagrid or VO)

user demands
the bar is constantly rising
– online ease
– multi-archive transparency
– easy data intensive science
new requirements
– automated resource discovery (intelligent Google)
– cheap I/O and CPU cycles
– new standards and software infrastructure

the VO vision (4)

the VO concept
web : all docs in the world inside your PC
VO : all databases in the world inside your PC

Generic science drivers
data growth
multi-archive science
large database science
can do all this now, but it needs to be fast and easy
empowerment

what it's not
not a monolith
not a warehouse

VO framework
framework + standards
inter-operable data
inter-operable software modules
no central VO-command

VO geometry
not a warehouse
not a hierarchy
not a peer-to-peer system
small set of service centres and large population of end users
– note : latest hot database lives with its creators / curators

yesterday
browser front end <--CGI request / html--> web page <--SQL / data--> DB engine

today
application <--SOAP/XML request / SOAP/XML data--> web service <--SQL / native data--> DB engine
(the application can be anything; the exchange uses standard formats)
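A minimal sketch of this "today" pattern in Python: a toy HTTP service that answers position queries by running SQL against a local database and returning XML, so the client never touches the native data files. The database file name, table and column names here are invented for illustration.

```python
# Toy data service: HTTP GET with RA/DEC/SR parameters -> rows returned as XML.
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from xml.sax.saxutils import quoteattr

DB = "catalogue.db"   # hypothetical local archive containing a 'sources' table

class QueryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        qs = parse_qs(urlparse(self.path).query)
        ra = float(qs.get("RA", ["0"])[0])
        dec = float(qs.get("DEC", ["0"])[0])
        sr = float(qs.get("SR", ["0.1"])[0])
        # Crude box query stands in for a real cone search on the sphere.
        rows = sqlite3.connect(DB).execute(
            "SELECT name, ra, dec FROM sources "
            "WHERE ra BETWEEN ? AND ? AND dec BETWEEN ? AND ?",
            (ra - sr, ra + sr, dec - sr, dec + sr)).fetchall()
        body = "<results>" + "".join(
            f"<source name={quoteattr(n)} ra='{r}' dec='{d}'/>" for n, r, d in rows
        ) + "</results>"
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.end_headers()
        self.wfile.write(body.encode())

if __name__ == "__main__":
    HTTPServer(("", 8080), QueryHandler).serve_forever()
```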

tomorrow
application --job / results--> a network of many web services
supporting infrastructure : Registry, Workflow, GLUE, Certification, VO Space
standard semantics; each service publishes WSDL

VO progress (5)

parts of the big picture
standards : IVOA
glue writing : main VO projects
implementations : national VO projects
data creation : facilities
data publication : data centres
tools and data mining services : lots of folk

International VO alliance (IVOA)

IVOA standards
formal process modelled on W3C
technical working groups and interop workshops
agreed functionality roadmap
key standards so far
– VOTable
– resource and service metadata definitions
– provisional semantic dictionary
– provisional protocols for images and spectra
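VOTable is the IVOA's XML format for tabular data. A minimal sketch of what one looks like and how it can be read with the Python standard library; the field names and values are invented for illustration, and real VOTables carry namespaces and richer metadata omitted here:

```python
# Minimal VOTable-style document parsed with the standard library.
import xml.etree.ElementTree as ET

votable = """<?xml version="1.0"?>
<VOTABLE version="1.1">
  <RESOURCE>
    <TABLE name="results">
      <FIELD name="ra"  datatype="double" unit="deg"/>
      <FIELD name="dec" datatype="double" unit="deg"/>
      <FIELD name="mag" datatype="float"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>182.301</TD><TD>-0.452</TD><TD>19.2</TD></TR>
          <TR><TD>182.317</TD><TD>-0.449</TD><TD>20.7</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

root = ET.fromstring(votable)
names = [f.get("name") for f in root.iter("FIELD")]
rows = [[td.text for td in tr] for tr in root.iter("TR")]
print(names)  # ['ra', 'dec', 'mag']
print(rows)   # [['182.301', '-0.452', '19.2'], ['182.317', '-0.449', '20.7']]
```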

glue accomplishments
two example registries (NVO and AstroGrid)
cone search service (NVO)
remote analysis (AVO)
virtual space management (AstroGrid)
next few months
– workflow, single sign-on, module interface standards
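Cone search is the simplest of these services: an HTTP GET carrying a position and search radius (RA, DEC, SR, in degrees), answered with a VOTable of matching sources. A minimal client sketch; the base URL below is a placeholder, not a real endpoint:

```python
# Minimal cone-search client: everything within SR degrees of (RA, DEC).
from urllib.parse import urlencode
from urllib.request import urlopen

def cone_search(base_url, ra, dec, sr):
    """Return the raw VOTable text for a cone of radius sr deg around (ra, dec)."""
    query = urlencode({"RA": ra, "DEC": dec, "SR": sr})
    with urlopen(f"{base_url}?{query}") as resp:
        return resp.read().decode()

# Example (placeholder URL):
# votable_text = cone_search("http://example.org/archive/conesearch", 182.3, -0.45, 0.1)
```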

implementations
several controlled demos of working s/w
slowly increasing functionality
can do real science now, but still clunky
beta testers using it now; real users by end 2004 ?

AVO demo Jan 2003

tools and services
smaller projects building compliant tools
– e.g. VOPlot from India
little work so far on ambitious analysis services

next steps (6)

next steps
intelligent glue
– ontology, agents
analysis services
– cluster analysis, multi-D visualisation, etc.
theory services
– simulated data, models on demand
embedding facilities
– VO ready facilities
– links to data creation

funding
VO projects
– funded off the back of e-science/grid so far
– should be absorbed into mainstream : agency backing
processing and data centres
– big bucks... but don't skimp !
– finishing the science needs this work
network infrastructure
– fat pipes between service centres crucial
– last mile investment will give a community an edge

castle on the hill or creative anarchy ?
no VO command centre
but does need co-ordination
– standards body
– exchange of best practice
– continuing waves of technology development

Euro-VO plan
three bodies
– VO Facility Centre (VOFC)
– Data Centre Alliance (DCA)
– VO Technology Centre (VOTC)
aim to be part national, part EU funding

lessons
drivers : bottlenecks, user demand, empowerment
need network of healthy data centres
need last mile investment
need facilities to be VO ready
need continuing technology development
need continuing co-ordinated programme of standards
need national backing and agency backing

FIN

publishing metaphor
facilities are authors
data centres are publishers
VO portals are shops
end-users are readers
VO infrastructure is the distribution system

collectivisation and democratisation
thirty year trend towards communal organisation
– facility class (common-user) instruments
– facility class data reduction s/w
– calibrated archives with simple tools
– information services (Vizier, ADS, NED)
– large consortium projects (MACHO, 2dF, SDSS, UKIDSS, VISTA...)
next steps
– inter-operable archives (joint queries)
– automated resource discovery (registry)
– facility-class exploration and analysis tools (data mining)

How much information is there ?
soon everything can be recorded and indexed
most data will never be seen by humans
precious resource : human attention
auto-summarization and auto-search are key technologies
[scale figure : Kilo through Mega, Giga, Tera, Peta, Exa, Zetta to Yotta bytes, with examples ranging from a photo, a book and a movie, through all Library of Congress books (words) and all books as multimedia, to everything ever recorded]

multi-wavelength views of a supernova remnant
shocks seen in the X-ray
heavy elements seen in the optical
dust seen in the IR
relativistic electrons seen in the radio