Published by Angel Douglas. Modified over 10 years ago.
1
From research data to new knowledge: a lifecycle approach
Dr Liz Lyon, Director, UKOLN, University of Bath, UK
JISC/SURF/CNI Conference, May 2005, Amsterdam
UKOLN: a centre of expertise in digital information management
www.ukoln.ac.uk
www.bath.ac.uk
2
Overview
1. Scholarly communications in flux
2. e-Research and the diversity of data
3. Repositories & meta-functionality
   – Realising the link to learning: eBank UK
   – Providing value-added services
   – Enabling knowledge extraction & post-processing
4. Look at (some of) the issues en route
3
1. Scholarly communications in flux
4
A medieval scriptorium…
5
Research & e-Science workflows
Diagram: The scholarly knowledge cycle. Liz Lyon, Ariadne, July 2003.
Diagram labels:
– Aggregator services: national, commercial
– Repositories: institutional, e-prints, subject, data, learning objects
– Harvesting metadata
– Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
– Deposit / self-archiving
– Peer-reviewed publications: journals, conference proceedings
– Publication
– Validation
– Data analysis, transformation, mining, modelling
– Searching, harvesting, embedding
– Presentation services: subject, media-specific, data, commercial portals
– Resource discovery, linking, embedding
6
Learning & Teaching workflows
Diagram labels:
– Aggregator services: national, commercial
– Repositories: institutional, e-prints, subject, data, learning objects
– Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
– Harvesting metadata
– Resource discovery, linking, embedding
– Peer-reviewed publications: journals, conference proceedings
– Validation
– Resource discovery, linking, embedding
– Deposit / self-archiving
– Learning object creation, re-use
– Searching, harvesting, embedding
– Quality assurance bodies
– Validation
– Presentation services: subject, media-specific, data, commercial portals
7
Learning & Teaching workflows
Research & e-Science workflows
Diagram labels:
– Aggregator services: national, commercial
– Repositories: institutional, e-prints, subject, data, learning objects
– Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
– Harvesting metadata
– Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
– Resource discovery, linking, embedding
– Deposit / self-archiving
– Peer-reviewed publications: journals, conference proceedings
– Publication
– Validation
– Data analysis, transformation, mining, modelling
– Resource discovery, linking, embedding
– Deposit / self-archiving
– Learning object creation, re-use
– Searching, harvesting, embedding
– Quality assurance bodies
– Validation
– Presentation services: subject, media-specific, data, commercial portals
– Resource discovery, linking, embedding
8
2. e-Research and the diversity of data
9
Assuring permanent open access to the records of science & the humanities?
Long-term access to primary data
Increasing data volumes from eScience and Grid-enabled / cyberinfrastructure applications
Changing research paradigm: data-driven science, big science
Observational data, simulations, large-scale experimentation, computations
Multi-media resources, statistical data, surveys, geo-spatial data…
10
Diversity of data collections
Very large, relatively homogeneous: Large Hadron Collider (LHC) outputs from CERN
Smaller, heterogeneous and richer collections: World Data Centre for Solar-terrestrial Physics (CCLRC)
Small-scale laboratory results: jumping robots project at the University of Bath
Population survey data: UK Biobank
Highly sensitive, personal data: patient care records
11
Taxonomy of data collections
Research collections: jumping robots
Community collections: Flybase at Indiana (with UC Berkeley)
Reference collections: Protein Data Bank
Source: NSF Long-Lived Digital Data Collections, draft report, March 2005
12
Taxonomy of data collections
Research collections: jumping robots
Community collections: Flybase at Indiana (with UC Berkeley)
Reference collections: Protein Data Bank
Source: NSF Long-Lived Digital Data Collections, draft report, March 2005
Evolution…
13
Repository evolution:
1971: Research collection, <12 files
2005: Reference collection, >2700 structures deposited in 6 months
14
1. Issues: research data as content
Sharing it!
Data diversity
   – Homo- or heterogeneous
   – Raw and derived / processed
   – Sensitivity
   – Fast or slow growth in volume
Repository evolution:
   – Likelihood to scale up (from bytes to petabytes)
   – Quality assurance (from the start)
   – Community-based standards development (folksonomies)
   – Build robust services
15
3. Repositories & meta-functionality
16
eBank UK: linking research data to learning
JISC-funded: September 2003; Phase 2: February 2005
UKOLN at the University of Bath (lead), University of Southampton, University of Manchester
Exemplar: e-Science testbed Comb-e-Chem
   – Grid-enabled combinatorial chemistry
   – Crystallography, laser and surface chemistry examples
   – Development of an e-Lab using pervasive computing technology
   – National Crystallography Service
Resource Discovery Network / PSIgate physical sciences portal
http://www.ukoln.ac.uk/projects/ebank-uk/
17
Learning & Teaching workflows
Research & e-Science workflows
Diagram labels:
– Aggregator services: eBank UK
– Repositories: institutional, e-prints, subject, data, learning objects
– Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
– Harvesting metadata
– Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
– Resource discovery, linking, embedding
– Deposit / self-archiving
– Peer-reviewed publications: journals, conference proceedings
– Publication
– Validation
– Data analysis, transformation, mining, modelling
– Resource discovery, linking, embedding
– Deposit / self-archiving
– Learning object creation, re-use
– Searching, harvesting, embedding
– Quality assurance bodies
– Validation
– Presentation services: subject, media-specific, data, commercial portals
– Resource discovery, linking, embedding
18
Data Flow in eBank UK
Diagram labels:
– Create
– Submit
– Store/link
– Data files
– Metadata
– Institutional repository
– OAI-PMH
– Harvest (XML)
– eBank aggregator
– Index and Search
– present HTML (institutional repository)
– present HTML (eBank aggregator)
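The harvest step in this data flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the eBank UK implementation: the endpoint URL, the sample response and the function names are all hypothetical, and only the OAI-PMH `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH 2.0 protocol itself. The sample response is embedded as a string so the sketch runs without network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint (assumption, not taken from the slides).
BASE_URL = "https://repository.example.org/oai"

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL an aggregator would request to harvest all records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

# Invented sample of what a repository might return (trimmed for brevity).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2005-05-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example crystal structure dataset</dc:title>
          <dc:creator>Example, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def parse_records(xml_text):
    """Extract identifier, title and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        dc_path = "oai:metadata/oai_dc:dc/"
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(dc_path + "dc:title", namespaces=NS),
            "creators": [c.text for c in rec.findall(dc_path + "dc:creator", NS)],
        })
    return records

def build_index(records):
    """Map each lower-cased title word to the identifiers that contain it."""
    index = {}
    for rec in records:
        for word in (rec["title"] or "").lower().split():
            index.setdefault(word, []).append(rec["identifier"])
    return index

records = parse_records(SAMPLE_RESPONSE)
index = build_index(records)
```

In a real harvester the URL from `list_records_url` would be fetched over HTTP and the response passed to `parse_records`; resumption tokens, error responses and richer metadata formats are left out of this sketch.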
19
Comb-e-Chem Project
Diagram labels:
– X-Ray e-Lab
– Analysis
– Properties
– Properties e-Lab
– Simulation
– Video
– Diffractometer
– Grid Middleware
– Structures Database
21
The digital repository
ecrystals.chem.soton.ac.uk
Acknowledgement: Simon Coles
22
Access to the underlying data
23
Harvesting: OAIster
24
Aggregating: search & discover
25
Linking to publications
26
eBank embedded in a science portal
27
eBank Phase 2: linking to learning
Embedding in e-Learning processes
Evaluating the pedagogical benefits
   – MChem course
   – Chemical informatics course
28
2. Issues: generic data models, metadata schema & terminology
Validation against other schemas
   – CCLRC Scientific Data Model Version 2
Complex digital objects and packaging options
   – METS
   – MPEG-21 DIDL
Terminologies
   – Domain: crystallography
   – Inter-disciplinary, e.g. biomaterials
   – Metadata enhancement: subject keyword additions to datasets based on knowledge of keywords in related publications
   – Meaningful resource discovery?
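The metadata enhancement idea above (adding subject keywords to a dataset from its related publications) might look like the following Python sketch. The record layout, field names and sample values are assumptions made for illustration; the slides do not describe how eBank UK stores or merges keywords.

```python
def enhance_keywords(dataset, publications):
    """Return a copy of the dataset record with keywords from related
    publications appended, skipping case-insensitive duplicates."""
    keywords = list(dataset.get("keywords", []))
    seen = {k.lower() for k in keywords}
    for pub_id in dataset.get("related_publications", []):
        # Publications missing from the lookup contribute nothing.
        for kw in publications.get(pub_id, {}).get("keywords", []):
            if kw.lower() not in seen:
                seen.add(kw.lower())
                keywords.append(kw)
    return {**dataset, "keywords": keywords}

# Invented sample records (hypothetical identifiers and keywords).
dataset = {
    "id": "ds-1",
    "keywords": ["crystallography"],
    "related_publications": ["pub-1", "pub-2"],
}
publications = {
    "pub-1": {"keywords": ["Crystallography", "biomaterials"]},
    "pub-2": {"keywords": ["X-ray diffraction"]},
}

enhanced = enhance_keywords(dataset, publications)
```

The function returns a new record rather than modifying the original, so the depositor's own keywords stay distinguishable from the derived ones if both versions are kept.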
29
3. Issues: linking and identifiers
Links to individual datasets within an experiment
Links to all datasets associated with an experiment or a data collection
Links to derived eprints and published literature
Context-sensitive linking: find me
   – Datasets by this author / creator
   – Datasets related to this subject
   – Learning objects by this author / creator
   – Learning objects related to this subject
Identifiers and persistence
   – generic
   – domain: International Chemical Identifier (InChI code)
Resource discovery: Google Scholar?
Provenance: authenticity, authority, integrity?
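The four "find me" links listed above can be sketched as a lookup over a small catalogue. This is a hedged Python illustration only: the `Resource` type, its fields, the identifiers and the sample catalogue are hypothetical and are not drawn from the eBank UK service.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Resource:
    identifier: str          # hypothetical local identifier
    kind: str                # "dataset" or "learning_object"
    creators: tuple = ()
    subjects: tuple = ()

def context_links(current, resources):
    """Build the 'find me' link sets for the resource being viewed."""
    others = [r for r in resources if r != current]
    links = {}
    for kind in ("dataset", "learning_object"):
        # Same kind, sharing at least one creator with the current resource.
        links[f"{kind}s by this creator"] = [
            r.identifier for r in others
            if r.kind == kind and set(r.creators) & set(current.creators)
        ]
        # Same kind, sharing at least one subject with the current resource.
        links[f"{kind}s on this subject"] = [
            r.identifier for r in others
            if r.kind == kind and set(r.subjects) & set(current.subjects)
        ]
    return links

# Invented sample catalogue.
CATALOGUE = [
    Resource("ds-1", "dataset", ("Example, A.",), ("crystallography",)),
    Resource("ds-2", "dataset", ("Sample, B.",), ("crystallography", "biomaterials")),
    Resource("lo-1", "learning_object", ("Example, A.",), ("crystallography",)),
]

links = context_links(CATALOGUE[0], CATALOGUE)
```

Matching on creator name strings is the weakest part of such a scheme, which is one reason the slide raises identifiers and persistence alongside linking.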
30
4. Issues: embedding and workflow
Into the crystallographic publishing community
   – International Union of Crystallography
Into the chemistry research workflow
   – SMART TEA Digital Lab Book, e-synthesis Lab
   – Other analytical techniques and instrumentation
Into the curriculum and e-Learning workflows
   – MChem course
   – Undergraduate Chemical Informatics courses
31
Repositories and digital curation
Data preservation: static; for later use?
Data curation: dynamic; in use now (and the future)?
Maintaining and adding value to a trusted body of digital information for current and future use
32
Provide value-added services
Annotation
   – e-Lab books (Smart Tea Project in chemistry)
   – Gene and protein sequences
33
Enable post-processing and knowledge extraction
The acquisition of newly-derived information and knowledge from repository content
Run complex algorithms over primary datasets
Mining (data, text, structures)
Modelling (economic, climate, mathematical, biological)
Analysis (statistical, lexical, pattern matching, gene)
Presentation (visualisation, rendering)
35
5. Issues: knowledge services
Layered over repositories
   – Annotation
   – Mining, modelling, analysis
   – Visualisation
Across multiple repositories
   – Grid-enabled applications
   – Highly distributed, dynamic and collaborative
Associated with curatorial responsibility
   – UK Digital Curation Centre http://www.dcc.ac.uk
36
Issues summary
1. Research data is diverse and is increasing rapidly in volume and complexity
2. Repository collections are dynamic and evolve
3. Technical challenges are associated with interoperability, persistence, provenance, resource discovery and infrastructure provision
4. Embedding in workflow is critical: scholarly communications, research practice, learning
5. Knowledge extraction tools will generate new discoveries based on repository content
6. Repository solutions must scale: machine-to-machine (M2M) processing will become the norm…