AstroGrid, Data Centres, & Edinburgh
Andy Lawrence : PPARC data curation panel meeting, 10-June-2004

1 10-June-2004, Andy Lawrence : PPARC data curation panel meeting
AstroGrid, Data Centres, & Edinburgh
- What is curation ?
- Data Centres in the VO era
- Data curation at WFAU
- Edinburgh e-Science

2 what is curation ?

3 not just preservation
- Data product creation
- Documentation
- Physical storage, organisation, migration
- Release control
- Revision & Annotation
- Services attached to holdings

4 Data Services
- Browsing
- Download
- Queries
- Analysis

5 not just a cupboard
- Plain archive = organised repository
- Science archive = system for doing science
  - repository + access + services

6 The Virtual Observatory

7 VObs : Generic Science Drivers
- data growth : volume and richness
- desire to work on-line
- multi-archive science
- large database science
- empowerment
--> professional data management

8 Data re-use : a market fact
- HST : more retrieval than ingest

9 VObs : Technical Drivers
- data rate, storage and flops : x 1000 /decade
- but device bw x 10/decade
- search engines next to the data
- backbone network bw ~Gbps
- but end-end bw ~10Mbps
- analyse in situ : shift results not data
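To put rough numbers on "shift results not data", the sketch below estimates how long one year of survey output would take to move at the two bandwidths quoted on this slide. The 20 TB figure is borrowed from the WFCAM estimate later in the talk; the function name, the decimal definition of a terabyte, and the assumption of a fully used link with no protocol overhead are illustrative choices, not part of the presentation.

```python
SECONDS_PER_DAY = 86_400
TB = 1e12  # assumption: decimal terabyte, in bytes


def transfer_days(volume_bytes: float, bandwidth_bps: float) -> float:
    """Days needed to move volume_bytes over a link of bandwidth_bps (bits/s).

    Idealised: assumes the link is fully used and ignores protocol overhead.
    """
    seconds = volume_bytes * 8 / bandwidth_bps
    return seconds / SECONDS_PER_DAY


yearly_volume = 20 * TB  # illustrative: one year of WFCAM output (20 TB/yr)

end_to_end_days = transfer_days(yearly_volume, 10e6)  # ~10 Mbps end-to-end
backbone_days = transfer_days(yearly_volume, 1e9)     # ~1 Gbps backbone

print(f"end-to-end (~10 Mbps): {end_to_end_days:.0f} days")
print(f"backbone   (~1 Gbps):  {backbone_days:.1f} days")
```

Under these assumptions the end-to-end transfer takes about half a year, against roughly two days on the backbone, which is the gap that motivates running the analysis next to the data.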

10 network development
higher level protocols ==> transparency
- TCP/IP : message exchange
- HTTP : doc sharing (web)
- grid suite : CPU sharing
- XML/SOAP : data exchange

11 data centre primacy
- these developments all point same way
- data centres take a central role
- they become active service centres
- they need to present a common front

12 the VObs concept
- web : all docs in the world inside your PC
- VObs : all databases in the world inside your PC

13 VObs geometry
- not a warehouse
- not a hierarchy
- not a peer-to-peer system
- small set of service centres and large population of end users
- consistent with market model or centrally planned model for DCs

14 the VObs
[Diagram. Labels: application; job; results; anything; web service (x5); Registry; Workflow; GLUE; Certification; MySpace; standard semantics; publish WSDL; grid connected]

15 work needed
[Diagram. Labels: application; job; results; anything; grid service (x5); Registry; Workflow; GLUE; AstroPass; MySpace; pooled resource; standard semantics; ontology]
[Work areas marked on the diagram: TOOLS; STANDARDS; INFRASTRUCTURE; TECHNOLOGY RESEARCH; DATA SERVICES (access and analysis); INF. UPTAKE; DATA PIPELINES; PHYSICAL GRID]

16 who does what ?
- pipelines : DCs
- physical grid : DCs but we help
- infrastructure uptake : DCs but we help
- data services : DCs but we help
- technology research : AstroGrid
- infrastructure : AstroGrid
- standards : IVOA including AstroGrid
- tools : various but we collaborate

17 publishing metaphor
- facilities are authors
- data centres are publishers
- VObs portals are shops
- end-users are readers
- VObs infrastructure is distribution system

18 Data Centre Alliance (DCA)
- AstroGrid-2 and Euro-VO proposals both propose forming a DCA
- AG2 requested money for baseline DC support, VO uptake
  - not funded
- VEGA proposed joint pipe/archive development for VISTA, Eddington, GAIA
  - partly funded

19 data curation at WFAU

20 WFAU approach
- science products not raw data
- survey or large project datasets
- live data sets not historical archive
- not a one-stop shop
- focus on improved service
  - query interface, data mining
- work with local computer scientists
  - eg junk detection algorithms, XML compression methods

21 WFAU now
- Legacy photographic atlas
  - 17,000 plates
- SuperCOSMOS Science Archive
  - all-sky pixel map and object catalogue
  - SQL interface
- 6DF redshift survey
  - spectra and image thumbnails
  - SQL interface
- SDSS mirror

22 WFAU next
- WFCAM Science Archive : 2005, 20TB/yr
  - deep IR sky survey : pixels and catalogues
  - collaboration with CASU and JAC
  - 2MASS, USNO-B, SDSS attached
- VISTA : 2007, 100TB/yr
  - IR and maybe optical
- considering : RAVE, JSA, GAIA

23 Edinburgh e-Science

24 Edinburgh activities
- School of Informatics
  - AI, databases, DM algorithms,..
- National e-Science Centre
- EPCC
- National Digital Curation Centre

25 Digital Curation Centre
- funded by JISC and EPSRC
- partners : Edinburgh, Glasgow, CLRC, UKOLN
- not a warehouse...
- research in curation technology
- development of standard protocols and policies
- advice, support, training, best practice

26 Digital Curation Structure
[Diagram. Labels: Producer; Consumer; Management; Ingest; Archival Storage; Data Management; Access; Administration; Preservation Planning]
From CCSDS, 2001; Lord, Macdonald

27 ... + curation
[Diagram. Labels: Scientist; Research Process; Primary data; Secondary (derived) data; Tertiary data for publication; Research based on data; Publication Process; Primary publication; Secondary publication; Tertiary Publication; Peer Review; Pre-print; Publication; Archives; Library; Peers; Public; Industry; Web Content; Patent data; Curator; Curation; Curation Process; Archived data; Data repositories; Metadata]

