Speech data in Swedish national archives and government agencies
Jens Edlund, KTH Royal Institute of Technology
Dept. of Speech, Music and Hearing


About me
Full-time researcher
Mainly human face-to-face interaction (humanities), so in everyday life, a CLARIN user
But also speech technology (technology/computer science)
And for the purposes of this talk, a CLARIN representative

About KTH Speech, Music and Hearing
A CLARIN K-centre: “a centre of expertise providing an information service offering technical advice on speech analysis”
A research institution
One of the oldest speech labs in the world (founded by Gunnar Fant in 1951)

Structure of this talk
Background
  Speech in general
  Speech in Swedish archives and agencies
CLARIN obstacles (with a speech focus)
  Lack of collaborations with “users”
  Lack of suitable analyses and methods

Speech vs writing
Speech is often, but not always, perceived as a special case of writing
Speech is consistently treated like a special case of writing
But
  Speech predates writing
  Speech is the most commonly occurring form of language
  Writing is a special case of speech?
In practice, there are many similarities, but the differences are huge
[Figure labels: Speech, Writing]

Some salient differences
Speech is transient
  It exists only in the present
  This is true, in a sense, even if recorded
Speech is largely interactive and emergent
  It is created, edited, and understood dynamically
  This is true for read speech as well

Speech in Swedish archives and agencies
Rough inventory performed in 2015
  Direct interviews with ~25 data holders, indirect with another ~25
Key findings
  There is a lot of material around
  Nobody uses these materials – at all
  Virtually none of it can be made fully public (easily)
Obstacles
  IP rights are just the beginning (other legislation)
  Lack of responsible people
  Lack of descriptions
  Sheer size

How to find users
To start with, don’t call them “users”?
Start with existing methodologies and the needs that come out of them
Autumn 2015 workshop in Stockholm
  Discussions in “triplets”: a researcher, a data holder, and a language tech representative
  Resulted in several pretty mature project ideas
  A large project application combining three research tracks: oral history, language change, human behaviour in interaction (under review)

How to find users (2)
Interdisciplinary themes
  Similar to the CLARIN + efforts
  Initial theme for SWE-CLARIN: food
  Enormous interest from a wide range of researchers
  We have only just started…

How to analyse archive speech
“Turn it into text”
  Automatic transcriptions
  Manual correction
  Annotation
  Text storage
  And bury the audio again…

But…
Current methods are not designed for this type of speech
Text is not speech
Different studies call for different analyses
There is a very real danger in standardizing too soon

Analysis as an iterative process
Automatic transcription
  Produces (erroneous) machine transcriptions
Manual correction
  Produces (correct) manual transcription
So we have the sound, a negative example, and a positive example
  This is very good for training
Take-home message: don’t throw away materials that can be used to improve the automatic methods
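
The idea of keeping the sound, the machine transcription, and the manual correction together can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout, the field names, and the use of word error rate as a quality measure are not from the talk, and the file name in the usage note is made up.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TranscriptionRecord:
    """One archive segment kept as a triple (hypothetical layout, not from the talk)."""
    audio_path: str  # pointer to the recording (the sound)
    machine: str     # uncorrected automatic transcription (negative example)
    corrected: str   # manually corrected transcription (positive example)


def word_edit_distance(reference: list[str], hypothesis: list[str]) -> int:
    """Levenshtein distance counted in whole words."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # word missing from hypothesis
                               current[j - 1] + 1,       # extra word in hypothesis
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


def word_error_rate(record: TranscriptionRecord) -> float:
    """Word error rate of the machine output against the manual correction."""
    reference = record.corrected.lower().split()
    hypothesis = record.machine.lower().split()
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return word_edit_distance(reference, hypothesis) / len(reference)
```

For instance, a record built as `TranscriptionRecord("segment_001.wav", "speech predate writing", "speech predates writing")` has one substituted word out of three. Keeping all three fields means the same records could later serve both for measuring a recogniser and for retraining it.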

Data for speech analysis development
For example
  Parametrized speech
  Statistics
    Word (and n-gram) frequencies
    Pronunciation variation
  …
Huge amounts are needed
Expensive to get, even with data access
Suggestion: build in data taps in various processes
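
As a small illustration of the word and n-gram frequency statistics mentioned above, the sketch below counts n-grams over a list of transcripts. It assumes whitespace tokenisation and lowercasing, which is a simplification chosen for the example and not a method described in the talk.

```python
from collections import Counter


def ngram_counts(transcripts, n=2):
    """Count word n-grams across transcripts (naive whitespace tokenisation)."""
    counts = Counter()
    for text in transcripts:
        words = text.lower().split()
        # zip over n shifted copies of the word list yields consecutive n-tuples
        counts.update(zip(*(words[i:] for i in range(n))))
    return counts
```

Calling `ngram_counts(transcripts, n=1)` gives plain word frequencies as 1-tuples, and `counts.most_common(10)` lists the ten most frequent n-grams.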

Tapping training data from processes
Example 0: Automatic transcription
Example 1: Transliteration
Example 2: Digitization of text
Example 3: Digitization of speech
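
One possible reading of a "data tap" is a thin wrapper around an existing processing step that records each input and output pair as a side effect. The sketch below is hypothetical: the function name `with_data_tap`, the JSON Lines format, and the string-in, string-out step signature are all assumptions made for illustration.

```python
import json
from typing import Callable, TextIO


def with_data_tap(step: Callable[[str], str], sink: TextIO,
                  step_name: str) -> Callable[[str], str]:
    """Wrap a processing step so every input/output pair is also written to sink."""
    def tapped(item: str) -> str:
        result = step(item)
        # One JSON object per line; ensure_ascii=False keeps characters such as å, ä, ö readable
        sink.write(json.dumps({"step": step_name, "input": item, "output": result},
                              ensure_ascii=False) + "\n")
        return result
    return tapped
```

The wrapped step returns exactly what the original step returns, so the surrounding process does not need to change. The sink could be an open file or any other text stream.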

Thank you!
Questions? (Or feel free to ask me in the breaks)