1
AstroGrid, Data Centres, & Edinburgh
Andy Lawrence : PPARC data curation panel meeting, 10-June-2004
- What is curation ?
- Data Centres in the VO era
- Data curation at WFAU
- Edinburgh e-Science
2
what is curation ?
3
not just preservation
- Data product creation
- Documentation
- Physical storage, organisation, migration
- Release control
- Revision & Annotation
- Services attached to holdings
4
Data Services
- Browsing
- Download
- Queries
- Analysis
5
not just a cupboard
- Plain archive = organised repository
- Science archive = system for doing science
- repository + access + services
6
The Virtual Observatory
7
VObs : Generic Science Drivers
- data growth : volume and richness
- desire to work on-line
- multi-archive science
- large database science
- empowerment
--> professional data management
8
Data re-use : a market fact
- HST : more retrieval than ingest
9
VObs : Technical Drivers
- data rate, storage and flops : x 1000/decade
- but device bw x 10/decade
- search engines next to the data
- backbone network bw ~Gbps
- but end-end bw ~10Mbps
- analyse in situ : shift results not data
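
The per-decade factors on this slide can be turned into per-year figures with a little arithmetic. The sketch below is only an illustration: it assumes smooth exponential growth, and the function names are hypothetical, not part of the talk.

```python
# Per-decade growth factors quoted on the slide.
CAPACITY_GROWTH_PER_DECADE = 1000.0   # data rate, storage and flops
DEVICE_BW_GROWTH_PER_DECADE = 10.0    # device bandwidth


def annual_factor(per_decade: float) -> float:
    """Convert a per-decade growth factor into the equivalent per-year factor.

    Assumes smooth exponential growth over the ten years.
    """
    return per_decade ** (1.0 / 10.0)


def gap_after(years: float) -> float:
    """Factor by which capacity outgrows device bandwidth after `years`."""
    ratio_per_decade = CAPACITY_GROWTH_PER_DECADE / DEVICE_BW_GROWTH_PER_DECADE
    return ratio_per_decade ** (years / 10.0)


if __name__ == "__main__":
    print(f"capacity  : x{annual_factor(CAPACITY_GROWTH_PER_DECADE):.2f} per year")
    print(f"device bw : x{annual_factor(DEVICE_BW_GROWTH_PER_DECADE):.2f} per year")
    print(f"gap after 10 years : x{gap_after(10):.0f}")
```

Under that assumption, capacity roughly doubles every year while device bandwidth grows by about a quarter, so the gap widens a hundredfold per decade. That widening gap is the reason for putting search engines next to the data.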
10
network development
higher level protocols ==> transparency
- TCP/IP : message exchange
- HTTP : doc sharing (web)
- grid suite : CPU sharing
- XML/SOAP : data exchange
11
data centre primacy
- these developments all point same way
- data centres take a central role
- they become active service centres
- they need to present a common front
12
the VObs concept
- web : all docs in the world inside your PC
- VObs : all databases in the world inside your PC
13
VObs geometry
- not a warehouse
- not a hierarchy
- not a peer-to-peer system
- small set of service centres and large population of end users
- consistent with market model or centrally planned model for DCs
14
the VObs
[Diagram. Labels: application, job, results, anything, web service (x6), Registry, Workflow, GLUE, Certification, MySpace, standard semantics, publish WSDL, grid connected]
15
work needed
[Diagram. Labels: application, job, results, anything, grid service (x6), Registry, Workflow, GLUE, AstroPass, MySpace, pooled resource, standard semantics, ontology]
[Work areas marked on the diagram: TOOLS, STANDARDS, INFRASTRUCTURE, TECHNOLOGY RESEARCH, DATA SERVICES (access and analysis), INF. UPTAKE, DATA PIPELINES, PHYSICAL GRID]
16
who does what ?
- pipelines : DCs
- physical grid : DCs but we help
- infrastructure uptake : DCs but we help
- data services : DCs but we help
- technology research : AstroGrid
- infrastructure : AstroGrid
- standards : IVOA including AstroGrid
- tools : various but we collaborate
17
publishing metaphor
- facilities are authors
- data centres are publishers
- VObs portals are shops
- end-users are readers
- VObs infrastructure is distribution system
18
Data Centre Alliance (DCA)
- AstroGrid-2 and Euro-VO proposals both propose forming a DCA
- AG2 requested money for baseline DC support, VO uptake
  not funded
- VEGA proposed joint pipe/archive development for VISTA, Eddington, GAIA
  partly funded
19
data curation at WFAU
20
WFAU approach
- science products not raw data
- survey or large project datasets
- live data sets not historical archive
- not a one-stop shop
- focus on improved service : query interface, data mining
- work with local computer scientists, eg junk detection algorithms, XML compression methods
21
WFAU now
- Legacy photographic atlas : 17,000 plates
- SuperCOSMOS Science Archive : all-sky pixel map and object catalogue, SQL interface
- 6DF redshift survey : spectra and image thumbnails, SQL interface
- SDSS mirror
22
WFAU next
- WFCAM Science Archive : 2005, 20TB/yr
  deep IR sky survey : pixels and catalogues
  collaboration with CASU and JAC
  2MASS, USNO-B, SDSS attached
- VISTA : 2007, 100TB/yr
  IR and maybe optical
- considering : RAVE, JSA, GAIA
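
To put these yearly volumes next to the bandwidth figures quoted in the technical drivers slide (backbone ~Gbps, end-end ~10Mbps), here is a rough worked calculation. It is a sketch under stated assumptions: decimal terabytes, "~Gbps" taken as 1 Gbps, a link running flat out with no protocol overhead. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(terabytes: float, bits_per_second: float) -> float:
    """Days needed to move `terabytes` (decimal TB) over a fully used link."""
    bits = terabytes * 1e12 * 8
    return bits / bits_per_second / SECONDS_PER_DAY


# Yearly volumes from the slide; link speeds are the assumed readings
# of "~10Mbps" and "~Gbps" from the technical drivers slide.
YEARLY_VOLUMES_TB = {"WFCAM": 20, "VISTA": 100}
LINKS_BPS = {"end-end ~10 Mbps": 10e6, "backbone ~1 Gbps": 1e9}

if __name__ == "__main__":
    for survey, tb in YEARLY_VOLUMES_TB.items():
        for link, bps in LINKS_BPS.items():
            days = transfer_days(tb, bps)
            print(f"{survey} {tb} TB over {link}: {days:.1f} days")
```

On these assumptions, one year of WFCAM data takes about 185 days to move at 10 Mbps, and one year of VISTA data takes over 900 days, longer than it took to produce. This is the practical case for analysing in situ and shifting results rather than data.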
23
Edinburgh e-Science
24
Edinburgh activities
- School of Informatics : AI, databases, DM algorithms,..
- National e-Science Centre
- EPCC
- National Digital Curation Centre
25
Digital Curation Centre
- funded by JISC and EPSRC
- partners : Edinburgh, Glasgow, CLRC, UKOLN
- not a warehouse...
- research in curation technology
- development of standard protocols and policies
- advice, support, training, best practice
26
Digital Curation Structure
[Diagram. Labels: Archival Storage, Access, Management, Producer, Preservation Planning, Administration, Consumer, Ingest, Data Management]
From CCDSD, 2001
Lord, Macdonald
27
... + curation
[Diagram. Labels: Scientist, Research Process, Secondary (derived) data, Tertiary data for publication, Primary publication, Secondary publication, Tertiary Publication, Peer Review, Pre-print, Publication, Archives, Library - Peers - Public - Industry, Publication Process, Primary data, Web Content, Patent data, Research Process, Research based on data, Curation, Curator, Curation Process, Archived data, Data repositories, Metadata]