Published by Georgia McCarthy. Modified over 9 years ago.
1
Using ESDS data in Linguistics and NLP
Dr. Kakia Chatsiou, ESDS/UK Data Archive
Language and Computation Group Day, 07 Oct 2011
http://lac.essex.ac.uk/lacday2011
2
What is ESDS?
Economic and Social Data Service
national data archiving and dissemination service (since January 2003)
access and specialist support for key economic and social data resources to UK Higher and Further Education users
brings together centres of expertise in data creation, dissemination, preservation and use in Manchester and Essex
managed by the UK Data Archive (established in 1967); jointly supported by the Economic and Social Research Council (ESRC) & the Joint Information Systems Committee (JISC)
3
http://www.esds.ac.uk
4
ESDS in numbers
6,000 datasets in the collection
230 new datasets added each year
over 22,000 registered users
approximately 60,000 downloads worldwide p.a.
3,000+ user support queries
5
Data collections we hold
Through our dedicated services we provide access to:
surveys
government data
aggregate statistics
censuses
international data
longitudinal data
qualitative data - multimedia data sources
historical data
6
ESDS Linguistics data offers
Year          Data offers
2004          2
2005          6
2006          11
2007          15
2008          13
2009          10
2010          7
2011 (Jan)    3
Total         67
From ESRC grants
19 accepted
rest unable to accept (due to confidentiality or size reasons) or referred to more suitable archives (e.g. Oxford Text Archive, CHILDES/Talkbank database)
increase in depositing after researcher self-archive (UKDA-Store) launch
7
ESDS data holdings on linguistics & related fields
40 main catalogue data collections with the language and linguistics subject category, accessible from the main ESDS Data Catalogue (14 qualitative, 18 quantitative, 8 historical)
all qualitative studies comprising in-depth interview transcripts or audio recordings can be used as corpus material or data sources for secondary analysis, e.g. Family Life And Work Experience Before 1918 (Edwardians) (SN 2000), Pioneers interview collections
13 UKDA-Store data collections with ‘linguistics’ as the primary discipline.
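As a rough illustration of what using interview transcripts as corpus material can mean in practice, the sketch below counts word frequencies in a plain-text transcript. The function name `word_frequencies` and the sample sentence are hypothetical and invented for this example; they are not taken from any ESDS collection, and real transcripts would need more careful tokenisation.

```python
import re
from collections import Counter


def word_frequencies(transcript: str) -> Counter:
    """Count lower-cased word tokens in a plain-text transcript."""
    # Very simple tokeniser: runs of letters and apostrophes only.
    tokens = re.findall(r"[a-z']+", transcript.lower())
    return Counter(tokens)


# Hypothetical excerpt, invented for illustration only.
sample = "We walked to school. We walked home again."
freqs = word_frequencies(sample)
print(freqs.most_common(2))
```

A frequency list like this is only a starting point; the same token list could feed concordances or collocation counts.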
8
Examples of ESDS data collections with subject term “Language and Linguistics”
6228 Discourse of the School Dinners Debate, 2004-2008
6402 Urban Classroom Culture and Interaction, 2005-2007
6790 Dynamic Variability in Speech: a Forensic Phonetic Study of British English, 2006-2007
6259 Identities in Neighbour Discourse: Community, Conflict and Exclusion, 2004-2006
5271 British Migrants in Spain: the Extent and Nature of Social Integration, 2003-2005
6127 Linguistic Innovators: the English of Adolescents in London, 2004-2005
5200 Devolution and Identity in Northern Ireland: a Longitudinal Discursive Study, 2003-2004
4457 Phonological Memory as a Predictor of Language Development in Down Syndrome, 1995 and 2001
4634 Transnational Seafarers, 1999-2001
4632 Dutch Map Task Corpus, 1999
3991 Profiling Elements of Prosodic Systems in Children (PEPS-C), 1997-1998
3556 Age of Acquisition, Frequency, Concreteness and Imageability Ratings for Welsh Words and Their English Equivalents, 1995-1996
5487 Literary Practices and the Mass-Observation Project, 1992-1993
3435 Welsh Social Survey, 1992; Including Welsh House Condition Survey, 1992
4896 English People, 1965-1990
4897 Language People, 1965-1986
2715 Northern Ireland Transcribed Corpus of Speech, 1973-1980
430 U.K. County Data, 1851-1966
5251 Study of the Abelam of Papua New Guinea and the Nso of Cameroon, 1939-1963
2947 Susanne Corpus, 1961
3821 Social History of the Welsh Language : Evidence of the 1891 Census; Project 2
9
Examples of linguistics data holdings in UKDA-Store
10
Linguist users of ESDS data
51 self-reported linguists (out of around 22,000 registered users)
about 30 of these downloaded ESDS data, mostly survey data, followed by qualitative interviews and a few historical data downloads
the rest might well have accessed documentation, study methods and instruments about studies (but since these do not require registration, we cannot report usage)
11
How linguists have used ESDS data
a researcher and their team based at the University of Sheffield used 2 audio collections for analysis of speech patterns (SN 2000 - Edwardians; SN 5407 - Health And Social Consequences Of The Foot And Mouth Disease Epidemic In North Cumbria)
an ESRC joint project between the UK Data Archive and the Language Processing team at the University of Edinburgh used three classic social science collections to test natural language processing tools. They looked at named entity recognition on typical social science interview data. Person-based identification enabled the testing of an anonymisation tool.
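To make the link between person identification and anonymisation concrete, here is a minimal sketch that swaps known person names for numbered placeholders. It is not the tool the project tested: a fixed name list stands in for the output of a named entity recogniser, and the function name `pseudonymise`, the placeholder format and the sample sentence are all assumptions made for illustration.

```python
import re


def pseudonymise(text: str, names: list[str]) -> str:
    """Replace each listed person name with a placeholder like [PERSON_1]."""
    # Placeholder numbers follow the order of the supplied name list.
    placeholders = {name: f"[PERSON_{i}]" for i, name in enumerate(names, start=1)}
    result = text
    # Replace longer names first so a full name is handled before any
    # shorter name it happens to contain.
    for name in sorted(names, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        result = re.sub(pattern, placeholders[name], result)
    return result


# Hypothetical sentence and names, invented for illustration only.
print(pseudonymise("Alice met Tom Brown in Carlisle.", ["Alice", "Tom Brown"]))
```

In a real pipeline the name list would come from a recogniser's output, and place names or other identifying details would need separate handling.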
12
ESDS data uses by Linguists
a JISC project between EDINA and the UK Data Archive using the HISTPOP collection at the UK Data Archive to augment resource search and discovery methods.
–data and metadata were fed to GeoDigRef and LTG GeoParser
–the enriched data were embedded in an experimental geographical service by EDINA
–allows users to search resource collections via a map-based interface, which provides links back to the reference of the place-name in the original resource
13
That sounds interesting! Where to look for relevant data?
ESDS data catalogue (homepage)
Some of these options can be used to find data:
–search the ESDS Catalogue (simple or advanced search)
–search variables
–browse Major Studies list
–browse the latest releases
14
Finding data: Searching the Data Catalogue
15
Finding data: Sample data catalogue record
16
Finding Data: Sample Documentation
17
Where to find more data
18
Finding Data: our researcher self-archiving UKDA-Store
19
Accessing data
Documentation is freely available to anyone
Users must be registered with ESDS to download or access data
You can use your university username & password to register
Access to some data is limited to users at UK Higher or Further Education Institutions
ESDS currently has approx. 22,000 registered users
20
How to access data
register with ESDS
agree to the terms & conditions of the End User Licence
select the dataset from the Data Catalogue and click ‘Download/Order’
specify a usage/project for which the data are to be used
then:
–download data selecting your preferred format (SPSS, Stata, TAB etc.) or
–place an online order for the data
for more see http://www.esds.ac.uk/support/e2.asp
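Once a dataset has been downloaded in TAB (tab-delimited) format, it can be read with standard tools. The sketch below uses Python's built-in `csv` module; the helper name `read_tab`, the column names and the sample rows are hypothetical and do not come from any ESDS dataset.

```python
import csv
import io


def read_tab(text: str) -> list[dict[str, str]]:
    """Parse tab-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


# Hypothetical file contents, invented for illustration only.
sample = "respondent\tregion\tage\n1\tEssex\t34\n2\tManchester\t51\n"
rows = read_tab(sample)
print(rows[0]["region"])
```

For a file on disk, the same `csv.DictReader` call works on an open file object instead of the `io.StringIO` wrapper. All values arrive as strings, so numeric variables need converting before analysis.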
21
How to access data
22
Teaching resources
ESDS can help provide support in many areas of teaching and research methods
–teaching datasets
–thematic guides, e.g. on health and crime
–guides on:
  data collection and use
  data sharing and data management
  confidentiality, consent and ethics issues
  survey and research design and analysis
  software for analysing data
–case studies of re-use
–training events and workshops
recently involved in creating formal assessments based on qualitative data collections (TALIF grant with Dept of Sociology, Essex)
23
Workshops and training
Thematic data resources events
Help with using data
–specific datasets
–data handling skills
–methodological issues
–analytical skills - introductory and advanced level
We are pro-active and re-active, so ask us if you want to have a workshop!
Forthcoming events: http://www.esds.ac.uk/news/esdsforthevents.asp
24
Other UK Data Archive services
25
Thank you! Questions?
26
References Corti, Louise. (2011, 11 Jan). Report on Linguists’ use of ESDS. ESDS/UK Data Archive.