Presentation transcript:

a centre of expertise in data curation and preservation
Funded by:
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License, excluding content property of others. To view a copy of this license, visit ; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
Tomorrow, and tomorrow, and tomorrow: the players on the curation stage
Chris Rusbridge
Presentation at OCLC

a centre of expertise in data curation and preservation OCLC October 2006
"To-morrow, and to-morrow, and to-morrow,
Creeps in this petty pace from day to day,
To the last syllable of recorded time;
And all our yesterdays have lighted fools
The way to dusty death. Out, out, brief candle!
Life's but a walking shadow; a poor player,
That struts and frets his hour upon the stage,
And then is heard no more: it is a tale
Told by an idiot, full of sound and fury,
Signifying nothing."
Shakespeare: Macbeth

Dunsinane Hill
Photo by Fabrice

Contents
  Curation and the Digital Curation Centre
  Science and Data
  Citations
  The poor players of data curation
  Sustainability of curated data
  Macbeth again…

Curation
  Data increasingly important as evidence
  Experimental verifiability (the basis of science)
  Unrepeatable observations & experiments (particularly environmental in broadest sense)
  Legal, compliance & transactions
  Cultural resources
  Preservation view vs Publishing view

Lynch remarks
  Closing the Curation Conference
  3 views of digital curation
    Finite process, handover to preservation
    Whole life process, evolving object(s)
    Collection as a living thing

Digital curation?
  Digital preservation
    Static
    For later use

Digital curation?
  Digital preservation    Digital curation
  Static                  Dynamic
              Long-term
  For later use           In use now (and the future)

Digital curation
  Digital curation & preservation
  Static                  Dynamic
              Long-term
  For later use           In use now (and the future)
  maintaining and adding value to a trusted body of digital information for current and future use

Mission
  The over-riding purpose of the DCC is to support and promote continuing improvement in the quality of data curation, and of associated digital preservation

Organisation to Engage & Collaborate
  Industry
  research collaborators
  standards bodies
  testbeds & tools
  communities of practice: users
  community support & outreach
  research
  development co-ordination
  service definition & delivery
  management & admin support
  Associates Network
  curation organisations eg DPC

Organisation to Engage & Collaborate: Leads
  Industry
  research collaborators
  standards bodies
  testbeds & tools
  communities of practice: users
  Bath
  Edinburgh
  CCLRC
  Glasgow
  Edinburgh
  Associates Network
  curation organisations eg DPC

Associated work
  DCC LOCKSS Technical Support Service (Lots of Copies Keep Stuff Safe)
  DCC SCARP Project
    Disciplinary approaches to sharing, curation, re-use and preservation
  EU projects associated
    CASPAR
    Digital Preservation Europe
    PLANETS

Phase 2
  Externally-moderated, reflective self-evaluation completed
  Phase 2 proposal (2007/10) to JISC
    Accepted: focus on science data, reduced scale
  EPSRC-funded Research continues until 2007/8

2nd International Digital Curation Conference
  Research & invited presentations
  Glasgow, 21/22 November, 2006
  Please register at:

Data resource stages
  Curated data is created…
    Observations? Fixed!
  Or Acquired…
    Data brought/bought from outside
  Ingest
  Development
    Derived, refined, combined, processed data
    Potentially many stages

SDSS (Visual)
TWOMASS (Infrared)
Slide from Rajendra Bose

Slide from Rajendra Bose

New discovery… National Virtual Observatory
  Johns Hopkins press release:
  Scientists working to create the NVO, an online portal for astronomical research unifying dozens of large astronomical databases, confirmed discovery of [a] new brown dwarf recently. The star emerged from a computerized search of information on millions of astronomical objects in two separate astronomical databases. Thanks to an NVO prototype, that search, formerly an endeavor requiring weeks or months of human attention, took approximately two minutes.

Context
  Data meaningless without context
  Linkage
  Metadata of many kinds
  Workflow!
  Provenance
  Computational lineage
  Authenticity

NASA
University
research group1
research group3
local decision-making body
University
research group2
Slide from Rajendra Bose

Access and re-use
  Ethics and rights control access
    Weak in expressing this long-term
  Collaboration tools
    Annotation, discussion, review
  Re-use leading to change and development
  Publication
    Not just in print
    Underlying data should be published, too
  Citation…

CLADDIER citation investigation
  My last example was an MST data set held at the BADC, and I was suggesting something like this (for a citation):
  Natural Environment Research Council Mesosphere-Stratosphere-Troposphere Radar at Aberystwyth Internet British Atmospheric Data Centre (BADC) 1990 badc.nerc.ac.uk/data/mst/v3/upd Sep
  (Made up tags!)
  Bryan Lawrence Weblog

CLADDIER 2: Version of record
  Role of Publisher: add value
    provision of catalogue metadata
    some commitment to maintenance of the resource at the AvailableAt url
    some commitment to the resource being conformant to the description of the Feature
    some commitment to the maintenance of the mapping between the identifier [LocalID] and the resource.
  Bryan Lawrence Weblog

CLADDIER 3: persistence
  Wayback Machine
    Only snapshots (eg only 2004 version of Bryan's home page!)
  WebCite
    allows the creator of content to submit URLs for [archiving], thus ensuring when one writes an academic document, the material will be archived, and the citation will be persistent
    But no real help for data…
  … only allow [data citation] when we believe in the persistence of the organisation making the data available…
  Bryan Lawrence Weblog

Citation
  OWL Web Ontology Language Reference
  W3C Proposed Recommendation 15 December 2003
  This version:
  Latest version:
  Previous version:
  Needs a stable resource to cite… (FRBR works & expressions?)

Citation…
  The date alone (as in common web citation approaches) is not enough!
  Cited object likely to have changed…
  Citation should link to the cited object as it was!
  [6] The CIA World Factbook. Retrieved on 8 Jan 2006.
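
The point that a citation should link to the cited object "as it was" can be illustrated with a small sketch. This is only an illustration under stated assumptions, not anything presented in the talk: the `SnapshotStore` class, its method names, and the snapshot contents are all hypothetical. It assumes an archive holds dated snapshots of a resource and resolves a "Retrieved on" date to the latest snapshot taken on or before that date.

```python
from bisect import bisect_right
from datetime import date


class SnapshotStore:
    """Hypothetical store of dated snapshots of one citable resource."""

    def __init__(self):
        self._dates = []      # snapshot dates, kept in increasing order
        self._contents = []   # snapshot contents, parallel to _dates

    def add(self, when, content):
        # Snapshots must arrive in chronological order so bisect stays valid.
        if self._dates and when <= self._dates[-1]:
            raise ValueError("snapshots must be added in increasing date order")
        self._dates.append(when)
        self._contents.append(content)

    def as_cited(self, retrieved_on):
        """Return (snapshot_date, content) for the latest snapshot
        taken on or before the retrieval date given in a citation."""
        i = bisect_right(self._dates, retrieved_on)
        if i == 0:
            raise LookupError("no snapshot exists on or before the cited date")
        return self._dates[i - 1], self._contents[i - 1]


# Placeholder contents, invented purely for illustration.
store = SnapshotStore()
store.add(date(2005, 11, 1), "factbook text, autumn 2005")
store.add(date(2006, 3, 1), "factbook text, spring 2006")

# A citation "Retrieved on 8 Jan 2006" resolves to the autumn 2005 snapshot,
# not to whatever the live page says today.
snapshot_date, text = store.as_cited(date(2006, 1, 8))
```

The sketch also shows the limitation the slides raise about snapshot services: if no snapshot was taken near the retrieval date, the resolved state may still differ from what the author actually saw.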

Citation needs…
  An efficient way to reference and access archived past states of a changing dataset (work in progress, Buneman et al)
  Not important for original observations
    Don't mess with those data
  Less important for incremental datasets
    Later stuff should not invalidate earlier
  Very important for revisable datasets
    Eg Genomics… datasets that result from the combined work of curators, or contain opinions or facts likely to change
    Eg Mapping… OS maps represent a huge database that changes on a daily basis

XMLArch: System Architecture
  XML Archiver
  Relational Database
  XML Archive at time t - 1
  XML Archive at time t
  Pre-processor
  Version Merger
  Data Extractor
  XML Snapshot at time t
  Carwyn Edwards
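
The diagram labels suggest a flow in which a snapshot at time t is merged into the archive from time t - 1 to give the archive at time t. The following is a minimal sketch of that merging idea, and not XMLArch's actual implementation: it assumes flat key-to-value records instead of XML, and the `MergedArchive` class, its methods, and the sample keys are hypothetical. Each value is stored once, together with the range of versions over which it held, so any past state can be rebuilt.

```python
class MergedArchive:
    """Toy merged archive (assumed design, not XMLArch itself).

    history maps each key to a list of runs [value, first_version, last_version].
    """

    def __init__(self):
        self.version = 0
        self.history = {}

    def merge(self, snapshot):
        """Merge a snapshot (dict of key -> value) in as the next version."""
        self.version += 1
        t = self.version
        for key, value in snapshot.items():
            runs = self.history.setdefault(key, [])
            # Extend the latest run if the value is unchanged since version t - 1;
            # otherwise start a new run (changed, new, or re-added key).
            if runs and runs[-1][0] == value and runs[-1][2] == t - 1:
                runs[-1][2] = t
            else:
                runs.append([value, t, t])
        return t

    def state_at(self, t):
        """Rebuild the snapshot exactly as it was at version t."""
        if not 1 <= t <= self.version:
            raise LookupError("unknown version")
        state = {}
        for key, runs in self.history.items():
            for value, first, last in runs:
                if first <= t <= last:
                    state[key] = value
                    break
        return state


archive = MergedArchive()
archive.merge({"gene1": "A", "gene2": "B"})    # version 1
archive.merge({"gene1": "A", "gene2": "B2"})   # version 2: gene2 revised
archive.merge({"gene1": "A"})                  # version 3: gene2 removed

v1 = archive.state_at(1)
v3 = archive.state_at(3)
```

Because an unchanged value only has its version range extended, the archive grows with the amount of change rather than with the number of snapshots, which is what makes citing "the dataset at version t" affordable for revisable datasets.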

Who are the curation players?

Curation: Individual
  Small science 2-3 times more data than Big science, but much more at risk
  PhD student? RA? PI? Administrator? IT support?
  Data potentially on local hard drives, or at best shared network drives
  May be inadequately protected
  Liable for policy-led deletion on resignation
  Individual knows too much
  Documentation/metadata unlikely to be adequate
  Tomorrow: gone!

Department: eCrystals
  Specialist department archive (& national service)
  Workflow recording of lab parameters (R4L)
  Public & private elements
  Trying to build eCrystals federation (eBank 3)
  But… ReciprocalNet? French COD efforts? Fragmented discipline!
  Tomorrow: likely to continue

Institution: Cambridge
  Chemistry 175,000 small molecule structures in CML
  Alongside Archaeology, Manuscripts, Learning Materials, etc
  No library curation skills; dependent on research group enthusiast
  Collection isolated from other Chemistry
  Tomorrow: assured…

Community: CDL
  Shared effort from group of institutions
  Comparison OhioLink?
  Document tradition, not data
  Passive role re collections
  Rely on departmental & domain expertise
  Tomorrow: assured…

Community: SDSC?
  Data specialists
  Multiple disciplines
  Distinct from domains; curation dependent on external expertise
  Research ethos
  Tomorrow: dependent on grant/contract income & research priorities

Community: LOCKSS?
  Self-selected group of collectors: closest to genuine open activity (despite Alliance)?
  Traditionally libraries collecting eJournals
  Model respects IPR
  No domain expertise; rely on origins
  Data limitations…
  Tomorrow: potentially very persistent (low cost, high reliability, attack resistance, distributed)

Discipline: Archaeology
  Staffed by archaeologist curators
  Understand special legal issues
  Strong relationship with community & peers
  Internationally still fragmented?
  Tomorrow: dependent on research council grants + deposit funding

Discipline: Astronomy
  Part of major international effort
  Expensive shared facilities, global reach
  Well integrated into community
  Enable new science
  Tomorrow: assured by community (another large facility)

Discipline: Atmosphere
  Strong believer in need for domain scientists as curators
  Significant participant in community proxy agenda-setting activities
  Internationally fragmented resources
  Tomorrow: mostly dependent on grant funding (but strong commitment)

Discipline: Pharmacology
  International Scientific Union
  Attempting to build credit for data contributions
  DB ownership rotates
  Tomorrow: extremely limited funding

Discipline: Social Sciences
  Mature!
  Staffed by Social Science curators
  Alert to opportunities
  Able to appraise material offered
  Strong relationship to discipline
  Tomorrow: assured through broad mix of funding streams

Publisher: Crystallography
  Publisher and Scientific Union
  Created key domain crystallographic standard (CIF)
  Strong motivator for deposit of structure data
  Consistent quality checks
  DOIs used for structure data
  Tomorrow: publishing business model
  Slide from IUCr

National bodies: British Library
  Serious and robust approach
  Legal deposit powers & responsibilities as driver
  Oriented primarily towards cultural heritage (broadly interpreted)
  Little data, no science domain experience
  Tomorrow: strong future commitment

National bodies: TNA/NDAD
  Specialist archive for government datasets
  Understand government regulations, dynamics & requirements
  Subject generalists; disconnected from associated science
  Technology specialists (understand databases)
  Tomorrow: likely to pass eventually to The National Archives

National bodies: NOAA (etc)
  Government body making serious data available
  Domain scientists curate data
  Operates in current political context (!)
  Tomorrow: reasonably assured but some unfunded mandates?

3rd parties: OCLC?
  Should this be community?
  Demand driven
  No domain science expertise: rely on origins
  Tomorrow: business case

3rd parties: Portico
  Specific area: eJournals
  Depends on publisher agreements
  No data or domain science expertise
  Tomorrow: commitment from Mellon + publishers + subscriptions, good funding mix

3rd Parties: Iron Mountain
  Records management IS a curation problem
  Organisations like this very likely to branch out
  No domain science expertise
  Tomorrow: business case, viability, stock market…

Institutions & the network
  Institutions have some fundamental sustainability
  Disciplines live in the network; sustainability is an issue
  Can we get the best of both?

Intersections…
  Columns: Institution 1, Institution 2, Institution 3, etc
  Discipline 1: X X
  Discipline 2: X X
  Discipline 3: X X
  etc

Who are the curation players again?

Project StORe findings
  Discipline commonality from survey (Miller, UKDA, 2006):
    2-way links between data & publication useful
    Barriers to actual deposit of data/outputs
    Sharing data important, likely between colleagues
    Perceived inconsistency across repositories
    Most common searching: Google type
    Researchers favour self-reliance rather than library support
    Recognise need for common minimum metadata
  Aim for pilot linking middleware demonstrator
  "Creating small scale silos of information with institutional repositories is not … a compelling information management strategy in the Google age" (Heery & Anderson for JISC, 2005)

Sustainability: tomorrow is the emerging worry
  Sustainability work package in DCC (new grant!)
  JISC/NDIIPP meeting addressed it
  AHRC report draft soon
  Research Information Network report draft
  JISC study on sustainable IT systems for HE
  Recent ARL/NSF workshop, NSF strategy

Sustainability of what?
  Repository as an organisation
  Repository as a service
  Repository as a system
  Repositories as a network (federation?)
  Collections and objects supported by repositories
  Commit to collection: contract the manager!

Social factors
  Commitment essential… much more than anything else (cf persistent identifiers)
  Funder requirements express social determination
    Policy & grant application forms, selection criteria
    Monitoring essential
  Legal, ethical, IPR impacts all significant
  Public good questions
    Academic credit (citations?)
    Free-loaders (embargos?)
  Disciplines are different!
  Workforce skills: researcher, data librarian/scientist

Sustainability a function of...
  Commitment
  Goals
  Value and cost
  Business model
  Time
  Environment
  Domain knowledge and information
  Dimensions (how much stuff)
  Technical approaches
  Usage

So, tomorrow…
  Digital data repositories already sustained > 30 years
    How? Vision, leadership, commitment
  Libraries, archives, museums sustained 100s of years
    How? Aggregate value proposition
    Perception now under threat!
  Collectively we need to identify the next steps toward digital data sustainability, for tomorrow, and tomorrow, and tomorrow!

Macbeth again…
"To-morrow, and to-morrow, and to-morrow,
Creeps in this petty pace from day to day,
To the last syllable of recorded time;
…it is a tale
Told by an idiot, full of sound and fury,
Signifying nothing."

Mission (impossible?)
  To that last syllable of recorded time
  Keep our tales forever full of significance!
  Thank you