Published by Jade Farmer. Modified over 6 years ago.
1
Powerful access to qualitative data: What’s behind the UK QualiBank
Darren Bell
Data & Services Developer, UK Data Archive
IASSIST, Toronto, June 2014
2
QualiBank project: rationale & aims
Provide enhanced access to key qualitative data via online data browsing and exploration: UK QualiBank
Based on existing metadata schemas and known technologies
Offer a mechanism for reliably citing data located in the system
Project includes large-scale digitisation of precious and undigitized materials
Maximise the impact from existing research and resource investments: demonstrate re-use
3
UK Data Service and its own needs
We have one of the largest qualitative data holdings: over 350 data collections
A proportion of these have been digitised from older paper sources
Currently users find and download these from our website
  Not so easy to find, but study documentation is good
  No searching within collections
  No file manifest shown until download
  It can be a bit of guesswork!
We have DataCite DOIs, but cannot reliably cite parts of data
4
Finding & accessing qualitative data
Search for “health” in our data catalogue, Discover
Retrieve catalogue record, e.g. SN 6124: Being a Doctor: a Sociological Analysis
DDI 2.5 very limited for describing file content
View limited user guide
Web download as RTF bundle (46 transcripts)
5
Data listing
6
Download Zip of data and doc
7
Complex data collections
SN 5801: Concepts of Healthy Eating Food Research: Phases I and II
293 interview transcripts; 73 diaries; 6 observation field notes
Not represented well at all in a DDI 2.X catalogue
8
Metadata demands for UK QualiBank
Explore data through a data journey
  Find relevant extract, examine in context, cite
Link data to still and moving images, and other related research outputs
Some collections completely open
Demands highly structured and consistently marked-up data
Qualitative data requires object (file-level) descriptive metadata, e.g. interviews, audio-visual files, images
Use of common metadata elements enables federated catalogues across providers and borders
9
Description below the collection
DDI 2.5 for catalogue metadata
QuDEx schema for file-level description; allows detailed identification of data objects:
  Interview transcript or audio recording etc.
  Descriptive categories at the object level, e.g. mime type, interview characteristics, interview setting
  Relationship to another data object or part of data
  Capacity to capture rich annotation of parts of data (e.g. an extract)
Based on published QuDEx model in use (Schema at:
Object-level description = a lot of manual work!
Limited use of TEI schema for mark-up of textual data items
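As a rough illustration of what "object-level description" involves, the sketch below models one file-level record in Python. It is not the QuDEx schema: the class name, field names, identifiers and category values are all hypothetical, chosen only to mirror the kinds of information listed on this slide (object type, mime type, descriptive categories, relationships).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataObject:
    """One file-level record. Field names are illustrative, not QuDEx element names."""
    object_id: str                      # unique identifier for this object (hypothetical value)
    collection_id: str                  # the collection the object belongs to
    object_type: str                    # e.g. "interview transcript", "audio recording"
    mime_type: str                      # e.g. "text/xml", "audio/mpeg"
    categories: Dict[str, str] = field(default_factory=dict)   # descriptive categories
    related_to: List[str] = field(default_factory=list)        # ids of related objects


def objects_of_type(objects: List[DataObject], object_type: str) -> List[DataObject]:
    """Return the objects in a collection that have the given type."""
    return [obj for obj in objects if obj.object_type == object_type]


# Two made-up records: a transcript and the audio recording it was made from.
transcript = DataObject(
    object_id="obj-001",
    collection_id="coll-001",
    object_type="interview transcript",
    mime_type="text/xml",
    categories={"interview setting": "home"},
)
audio = DataObject(
    object_id="obj-002",
    collection_id="coll-001",
    object_type="audio recording",
    mime_type="audio/mpeg",
    related_to=["obj-001"],
)
```

Every record of this kind has to be filled in per file, which gives a sense of why object-level description is described here as a lot of manual work.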
10
User expectations
Search/browse for data
  Search/faceted browse of data: text, image/PDF, audio
Browse
  Faceted browse by categories:
    Collection level: title, date and openness
    Collection object: data type, interview characteristics, location
Search
  Display no. of hits and minimal item metadata
  Word in paragraph; thumbnail image/PDF; AV link
  Context: other related objects, within system or external
Access full object
  View data, key metadata and all related files and links
  Get citation for part of data
11
System assumptions
BaseX for metadata storage; Java loading; Solr search
Data must be fully prepared on loading/publishing to the system
  Data not ‘managed’ within the system
  Mark-up, metadata, relationships all pre-defined
  Pre-defined GUIDs to be used for citation (DOI + drilldown)
Cannot search audio-visual data content
Simple QuDEx metadata data entry tool created using SharePoint
Technologies for user interface use existing in-house systems, .NET
No download of data collection/subset: route to the UK Data Service
Citation of selected extract of text; user-annotation possible
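To make the "Solr search" assumption concrete, here is a minimal Python sketch of how a faceted, highlighted query could be expressed against Solr's standard select handler. The parameters (`q`, `rows`, `facet`, `facet.field`, `hl`, `hl.fl`, `wt`) are standard Solr query parameters; the core URL and the field names (`text`, `data_type`, `location`) are assumptions for illustration and are not taken from the QuDEx or QualiBank configuration. The sketch only builds the URL and sends nothing.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(base_url, term, facet_fields, rows=10):
    """Build a Solr select URL for a single-word term with facets and highlighting.

    base_url and all field names are hypothetical placeholders.
    """
    params = [
        ("q", f"text:{term}"),   # search the (assumed) full-text field
        ("rows", str(rows)),     # number of hits to return
        ("facet", "true"),       # turn on facet counts for browsing
        ("hl", "true"),          # turn on highlighting of the matched words
        ("hl.fl", "text"),       # field to take highlighted snippets from
        ("wt", "json"),          # response format
    ]
    # One facet.field parameter per category offered in the faceted browse.
    params += [("facet.field", name) for name in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


url = build_search_url(
    "http://localhost:8983/solr/qualibank",   # hypothetical core location
    "health",
    ["data_type", "location"],
)
```

Facet counts would drive the browse categories, while highlighting is one way to show the matched word in its paragraph on the hits page.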
12
UK QualiBank Dataflow
13
Digitisation of key data sources
Selectively digitize paper-based materials:
  Original survey questionnaires
  Open ended questions
  Transcribed interviews
  Handwritten field notes, essays
  Diagrams
  Photographs
Destination formats:
  All text files treated as XML
  Image files (photos and text) as PDF
  Audio as mp3
14
QuDEx collection level metadata
15
Objects in collection metadata
16
Object relationships
Rich set of verbs available to define relationships between all objects
Converse verbs generated automatically
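The idea of generating converse verbs automatically can be sketched in a few lines of Python. The verb names below are invented for illustration and are not the QuDEx relationship vocabulary; the point is only that recording a relationship in one direction is enough for the system to derive the other.

```python
# Hypothetical verb pairs; a symmetric verb is its own converse.
CONVERSE = {
    "isTranscriptOf": "hasTranscript",
    "isPartOf": "hasPart",
    "isRelatedTo": "isRelatedTo",
}
# Make the table work in both directions.
CONVERSE.update({converse: verb for verb, converse in list(CONVERSE.items())})


def relate(store, source, verb, target):
    """Record (source, verb, target) and its automatically derived converse."""
    store.add((source, verb, target))
    store.add((target, CONVERSE[verb], source))


relationships = set()
relate(relationships, "transcript-01", "isTranscriptOf", "audio-01")
relate(relationships, "photo-07", "isRelatedTo", "transcript-01")
```

After these two calls the store holds four statements, so a page for `audio-01` can list its transcript without anyone having entered that direction by hand.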
17
QuDEx Category Schemes
18
Use of Text Encoding Initiative (TEI)
Minimal use of TEI tags, drawn from TEI's massive tag set
To denote structural mark-up:
  Headers, turn takers, paragraphs
  Corrections, errors
Use of unique GUIDs to identify all QuDEx IDs: Collection, Files, Paragraphs
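A minimal sketch of paragraph-level identifiers in structural mark-up, using Python's standard library. The `sp`/`speaker`/`p` structure is one TEI way of encoding speaker turns and is used here as an assumption; the project's actual tag choices and identifier format are not shown in the slides. The `id-` prefix is also an assumption, added because an `xml:id` value may not begin with a digit.

```python
import uuid
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serialises as the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def new_id():
    """Return a fresh identifier that is safe to use as an xml:id value."""
    return "id-" + str(uuid.uuid4())


def build_transcript(turns):
    """Build a small TEI-like <text> element.

    turns is a list of (speaker name, [paragraph texts]) pairs.
    Every paragraph receives its own unique identifier.
    """
    text = ET.Element("text")
    body = ET.SubElement(text, "body")
    for speaker_name, paragraphs in turns:
        turn = ET.SubElement(body, "sp")
        ET.SubElement(turn, "speaker").text = speaker_name
        for paragraph in paragraphs:
            p = ET.SubElement(turn, "p", {XML_ID: new_id()})
            p.text = paragraph
    return text


root = build_transcript([
    ("Interviewer", ["What did you do after leaving school?"]),
    ("Respondent", ["I looked for work.", "There was not much about."]),
])
xml_bytes = ET.tostring(root)
```

Because each paragraph carries its own identifier, a search hit or a citation can point at a single paragraph rather than at the whole file.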
19
School Leavers on the Isle of Sheppey
20
TEI XML: School Leavers on the Isle of Sheppey
21
Search interface - hits
22
Target page for an interview
23
Target references
24
Audio file target page
25
Citation mechanism
System allows extract/quotation level citation; 1 or more consecutive paragraphs
Citation object and citation format created on the fly, using GUIDs and system URI
URI resolves directly to the data extract
Some more sensitive collections are closed, so cannot resolve to data without login
Is related to our collection-level DOIs e.g. /UKDA-SN
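The extract-level citation described above can be sketched as follows. This is an assumption-laden illustration, not the system's implementation: the base URI, the URI layout, the placeholder DOI and the paragraph identifiers are all hypothetical. What it does show is the stated constraint that a citation covers one or more consecutive paragraphs, identified by their pre-defined IDs and tied to a collection-level DOI.

```python
from dataclasses import dataclass
from typing import List, Tuple

BASE_URI = "https://example.org/qualibank"   # hypothetical system URI


@dataclass(frozen=True)
class ExtractCitation:
    collection_doi: str               # collection-level DOI the extract belongs to
    paragraph_ids: Tuple[str, ...]    # the consecutive paragraphs being cited
    uri: str                          # URI intended to resolve to the extract


def cite_extract(collection_doi: str, document_paragraph_ids: List[str],
                 first_id: str, last_id: str) -> ExtractCitation:
    """Create a citation for the consecutive paragraphs first_id..last_id.

    document_paragraph_ids lists every paragraph id of the document in order.
    """
    start = document_paragraph_ids.index(first_id)
    end = document_paragraph_ids.index(last_id)
    if end < start:
        raise ValueError("last_id must not come before first_id")
    selected = tuple(document_paragraph_ids[start:end + 1])
    uri = f"{BASE_URI}/extract/{selected[0]}/{selected[-1]}"
    return ExtractCitation(collection_doi, selected, uri)


citation = cite_extract(
    "10.0000/EXAMPLE-COLLECTION",     # placeholder, not a real DOI
    ["p1", "p2", "p3", "p4"],
    "p2",
    "p3",
)
```

Building the citation from a start and end paragraph, rather than from free text, keeps the reference stable as long as the paragraph identifiers never change.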
26
Contact details
Darren Bell dbell@essex.ac.uk
Louise Corti
Agustina Martinez