Smart Qualitative Data: Methods and Community Tools for Data Mark-Up SQUAD Louise Corti IASSIST, Edinburgh May 2005.


New qualitative data UK initiative
Demonstrator Scheme for Qualitative Data Sharing and Research Archiving (QUADS)
- the main aim of the scheme is to develop and promote innovative methodological approaches to the archiving, sharing, re-use and secondary analysis of qualitative research and data
  – models may be of temporary, local or thematic archiving
  – complement the ESDS Qualidata approach (traditional data archiving model)
  – exploit new or existing research collaborations locally, nationally or internationally
- explore a range of new models for increasing access to qualitative data resources, and for extending the reach and impact of qualitative studies
- draw primarily on existing qualitative research and data sets of a range of types, but encourage researchers to explore the use of stored and shared video, visual and audio data sets
- promote understanding of the benefits and challenges of emerging information and communication e-science technologies
- aim to disseminate good practice in qualitative data sharing and research archiving
- part of the ESRC's initiative to increase the UK resource of highly skilled researchers, and to fully exploit the distinctive potential offered by qualitative research
- over 10 months: 6 awards (5 demonstrators + 1 coordination)

SQUAD Aims
- collaboration between the UK Data Archive, University of Essex, and the Language Technology Group, Human Communication Research Centre, School of Informatics, University of Edinburgh
- Essex is the lead partner
- 18 months duration, 1 March 2005 – 31 August
- part-time staff split across sites = 1 FTE
- Aims: to explore methodological and technical solutions for ‘exposing’ digital qualitative data to make them fully shareable and exploitable, and to promote appropriate standards and tools
- Precursors of data sharing, collaborative research practice and data analysis are to be found in the methods and tools for documenting and representing data

Why do we need tools & standards?
- to archive and web-enable high quality qualitative data in a way that faithfully represents its origins and context
- to provide rich and full documentation that enables effective resource discovery (already do DDI first 3 levels)
- to enable creative and exciting ways of exploring and visualizing data
  – from simple publishing of anonymised digital qualitative data
  – through mark-up to the ability to link qualitative data to other distributed data sources (e.g. audio-visual or geo-coded data sources)
- the absence of appropriate tools and standards is inhibiting successful digitisation efforts
  – many popular qualitative collections are not yet even in digital format
  – "digitising" these collections is often merely providing an online catalogue of metadata
  – there is little community knowledge in this area about the use of standards (TEI not used in social science)

Prerequisites for making data shareable
- data are collected to a high standard
- research methods and practices (including consent process) are fully documented
- the context of the data collection and analysis is captured
- the richness of the structure and features of data are made available (use of mark-up)
- the interrelationships between data and analyses (intra-project) are made available (issues of representation)
- data are represented in intuitive, appealing and sensitive ways that satisfy the ethical and legal requirements to which they are bound

Main objectives
- specify, test and propose an XML schema for storing and marking up a broad range of qualitative data types
  – textual or audio-visual social science data
  – and for e-social science exploitations, i.e. grid-enabling data
  – (ESDS Qualidata had developed a draft DTD based on TEI)
- investigate requirements for contextualising data (e.g. interview setting and interviewer characteristics), and develop standards for data documentation and common vocabularies
- develop user-friendly (Java-based) tools for semi-automating processes (using NLP technologies) already used to prepare qualitative data for digital archiving and e-science type exploitation
- investigate non-proprietary tools for publishing and archiving XML marked-up data and study context: Qualitative Data Mark-up Tools (QDMT). Enable preservation of data structures and links to other objects
- increase awareness and provide training, with step-by-step guides and exemplars on the use of the tools and standards utilised

A uniform quali format
- a uniform format for richly encoding qualitative research is necessary as it:
  – ensures consistency across datasets
  – supports the development of common web-based publishing and search tools
  – and facilitates data interchange and comparison among datasets
- it could also enable data and linked products to be imported and exported directly into and out of CAQDAS packages, avoiding reliance on a single product, and offering the opportunity to share analytic workings outside the confines of the particular software
- a draft but limited formal definition of a common XML vocabulary and Document Type Definition (DTD), based on the Text Encoding Initiative (TEI), for describing these structures has been prepared by ESDS Qualidata
- but the important development of a common framework for marking up the content of qualitative datasets requires support and contribution from various sectors of the social science community:
  – data creators
  – qualitative data software developers
  – data archivists
  – end users
- fortunately, the expansion of e-science funding is accelerating the need for such standards: exposure of ‘structured’ qualitative data to the web

Marking up what?
- spoken interview texts provide the clearest, and most common, example of the kinds of encoding features needed
- three basic groups of features
  – structural features representing basic format: utterance, specific turn taker, other speech tags, e.g. defining idiosyncrasies
  – structural features representing links to other data types created in the course of the research process (e.g. audio or video referencing points, researcher annotations)
  – structural features representing identifying information such as real names, company names, place names, temporal information
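
The first group of features (utterances and turn takers) can be illustrated with a minimal Python sketch. It is only an illustration: the `Speaker: text` input layout and the `interview` root element are assumptions made here, and the `u` element with a `who` attribute is loosely modelled on TEI's utterance element rather than taken from the ESDS Qualidata DTD.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical plain-text transcript: one "Speaker: text" turn per line.
TRANSCRIPT = """\
Interviewer: Where did you grow up?
Respondent: In Colchester, until I was about ten.
Interviewer: And after that?
"""

# A turn is everything before the first colon (speaker) plus the rest (speech).
TURN = re.compile(r"^(?P<speaker>[^:]+):\s*(?P<text>.*)$")

def transcript_to_xml(text):
    """Wrap each speaker turn in a <u who="..."> element (illustrative names)."""
    root = ET.Element("interview")  # assumed root element, not from the DTD
    for line in text.splitlines():
        match = TURN.match(line.strip())
        if not match:
            continue  # skip blank or unparseable lines
        turn = ET.SubElement(root, "u", who=match.group("speaker").strip())
        turn.text = match.group("text")
    return root

root = transcript_to_xml(TRANSCRIPT)
xml_string = ET.tostring(root, encoding="unicode")
print(xml_string)
```

Links to audio or video reference points and identifying information would need further elements or attributes, which this sketch does not attempt.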

Solutions to qualitative data mark-up with XML: Qualitative Data Mark-up Tools (QDMT)
- systematic preparation of digital data: to create formatted text documents ready for XML output
- mark-up of data to capture basic structural features of textual data: e.g. turn-takers, speakers and selected demographic details
- advanced annotation or mark-up of data
  – automated information extraction of basic semantic information: inserting tags for real names and temporal references
  – automated anonymisation: replacing names with dummy forms, including co-references
  – geographic mark-up to enable data linking: identifying and applying geographic mark-up, and scoping researchers' needs for geo-linking
- basic classification or thematic coding of textual data: for efficient resource discovery rather than data analysis; will investigate linking into a domain ontology (e.g. social science thesaurus) and a keyword assignment tool
- contextual documentation to capture the richness of the research methods, data collection and analytic interpretation and representation: will dovetail with the Cardiff QUADS project to look at the interrelationships between complex intra-project data, annotations and context
- exposure of annotated and contextualised qualitative data to the web: investigating publishing of the above QDM XML outputs to ESDS Qualidata Online, opportunities for exchange within CAQDAS tools, etc.
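
The automated anonymisation step, replacing names with dummy forms including co-references, might be sketched as follows. This is a simplified illustration and not the SQUAD tooling: it assumes the real names and their variants are already known (a real pipeline would obtain them from named entity recognition), and the `anonymise` function and the `[PERSON_n]` dummy format are hypothetical.

```python
import re

def anonymise(text, names):
    """Replace each listed real name with a stable dummy form.

    `names` maps a canonical name to its known variants (co-references),
    e.g. a full name and the first name used later in the interview.
    """
    replacements = {}
    for index, (canonical, variants) in enumerate(names.items(), start=1):
        pseudonym = f"[PERSON_{index}]"  # hypothetical dummy form
        for variant in (canonical, *variants):
            replacements[variant] = pseudonym
    # Try longer variants first so "Mary Smith" wins over "Mary".
    ordered = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(v) for v in ordered) + r")\b")
    return pattern.sub(lambda m: replacements[m.group(1)], text)

sample = "Mary Smith moved in 1984. Later Mary met John Brown."
known = {"Mary Smith": ["Mary"], "John Brown": ["John"]}
result = anonymise(sample, known)
print(result)  # [PERSON_1] moved in 1984. Later [PERSON_1] met [PERSON_2].
```

Because every variant of a name maps to the same dummy form, later mentions of the same person stay linked after anonymisation.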

First output from automated mark-up

Existing tools
- making use of Unix-based community tools used in NLP fields
- applications are for mining and summarising e.g. legal and pharmaceutical reports, news stories, web sites etc.
- but not yet tested on social science corpora; training data is limited
- tools using named entity recognition and speech taggers will insert XML tags
- others use stand-off annotation (XLink, XPointer etc.)
- currently unfriendly tools: need GUIs!
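
The difference between inserting XML tags and stand-off annotation can be shown with a short Python sketch. It is an assumption-laden illustration: it stores annotations as character offsets in a separate list rather than using XLink or XPointer, and the `Annotation` class and the `person` and `date` labels are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    start: int   # character offset where the span begins
    end: int     # character offset just past the span
    label: str   # hypothetical tag name, e.g. "person" or "date"

def annotate(text, phrase, label):
    """Return a stand-off annotation for the first occurrence of `phrase`."""
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return Annotation(start, start + len(phrase), label)

def to_inline(text, annotations):
    """Render stand-off annotations as inline tags.

    Handles non-overlapping spans only and does no XML escaping.
    """
    out, cursor = [], 0
    for ann in sorted(annotations, key=lambda a: a.start):
        out.append(text[cursor:ann.start])
        out.append(f"<{ann.label}>{text[ann.start:ann.end]}</{ann.label}>")
        cursor = ann.end
    out.append(text[cursor:])
    return "".join(out)

source = "I met Mary in Colchester in 1984."
layer = [annotate(source, "Mary", "person"), annotate(source, "1984", "date")]
inline = to_inline(source, layer)
print(inline)  # I met <person>Mary</person> in Colchester in <date>1984</date>.
```

The source text is never modified in the stand-off form, so several annotation layers can refer to the same transcript.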

Relationship to ESDS Qualidata
- ESDS Qualidata, through the UKDA, currently provides the ESRC RRB strategy for archiving, accessing and supporting users of qualitative research data
- strong emphasis on
  – developing community standards for describing data/metadata
  – providing better study and data context to inform re-use
- the grant represents critical and useful R&D funding for ESDS Qualidata, which normally has no budget to do this
- SQUAD outputs and tools will be used for in-house processing of qualitative data and made available as shareable standards and tools for others archiving data

Summary of deliverables I
  – report on consultation with, and initial assessment by, LTG at Edinburgh, and a consolidated plan of work (Month 2)
  – report on applying levels of mark-up, setting out minimal and ideal requirements for different data types (interview data, field notes, naturally occurring speech, etc.) (Month 5)
  – report on first set of components of the Qualitative Data Mark-up suite of tools, including user testing results (Month 9)
  – report on second batch of components of the Qualitative Data Mark-up suite of tools, including user testing and user workshop (Month 15)
  – short promotional overview of QDM tools and applications (Month 15)

Summary of deliverables II
  – draft user guide and tutorials for each data preparation process and tool, with exemplars (Month 16)
  – tool and programming documentation (Month 16)
  – report on further needs and developments for components that may not be completed (Month 17)
  – report on fit of tools to ESDS Qualidata Online system (Month 17)
  – report of brief evaluation of user guide and tutorials (Month 17)
  – final report (Month 18)