Hands-on tutorial: Using Praat for analysing a speech corpus Mietta Lennes 12.-13.8.2005 Palmse, Estonia Department of Speech Sciences University of Helsinki.

Slides:



Advertisements
Similar presentations
Smart Qualitative Data: Methods and Community Tools for Data Mark-Up SQUAD Libby Bishop Online Qualitative Data Resources: Best Practice in Metadata Creation.
Advertisements

Mitglied der Leibniz-Gemeinschaft Querying Spoken Language Corpora Thomas Schmidt IDS Mannheim.
Tools for Speech Analysis Julia Hirschberg CS4995/6998 Thanks to Jean-Philippe Goldman, Fadi Biadsy.
Analyses on IFA corpus Louis C.W. Pols Institute of Phonetic Sciences (IFA) Amsterdam Center for Language and Communication (ACLC) Project meeting INTAS.
Annotation, Alignment and Transcription: An extremely brief and basic introduction to Elan and Transcriber OLAC Tutorial at the Linguist Society of America.
AN INTRODUCTION TO PRAAT Tina John M.A. Institute of Phonetics and digital Speech Processing - University Kiel Institute of Phonetics and Speech Processing.
Conceptual Modelling Entity Relationship Model Overview Entities, Attributes and Relationship modelling Generating a Relational Database for an EAR model.
Presentation Outline  Project Aims  Introduction of Digital Video Library  Introduction of Our Work  Considerations and Approach  Design and Implementation.
Tools for Speech Analysis 2 How do we choose? What kind of data? Which task?
SCA Introduction to Multimedia
MUSCLE movie data base is a multimodal movie corpus collected to develop content- based multimedia processing like: - speaker clustering - speaker turn.
Praat Fadi Biadsy.
~ Multimodal Communication ~ HOW TO: From raw data to data annotation.
Statistical Natural Language Processing. What is NLP?  Natural Language Processing (NLP), or Computational Linguistics, is concerned with theoretical.
The Project AH Computing. Functional Requirements  What the product must do!  Examples attractive welcome screen all options available as clickable.
Project Proposal: Academic Job Market and Application Tracker Website Project designed by: Cengiz Gunay Client: Cengiz Gunay Audience: PhD candidates and.
ISOWare Presentation January 2009 ISOWARE is a management tool, that simple and efficient describes and communicates Business Processes. ISOWARE is also.
Lecturer: Ghadah Aldehim
Speech Recognition Final Project Resources
July 11, 2003E-MELD 2003 E-MELD “School” of Best Practice Helen Aristar-Dry & Gayathri Sriram The LINGUIST List Eastern Michigan University.
Copyright ©: SAMSUNG & Samsung Hope for Youth. All rights reserved Tutorials Software: Building apps Suitable for: Advanced.
STANDARDIZATION OF SPEECH CORPUS Li Ai-jun, Yin Zhi-gang Phonetics Laboratory, Institute of Linguistics, Chinese Academy of Social Sciences.
1 Shawlands Academy Higher Computing Software Development Unit.
1 CSBP430 – Database Systems Chapter 1: Databases and Database Users Mamoun Awad College of Information Technology United Arab Emirates University
WLAP: Improving acquisition Workshop on digital video archiving 22 June 2001, CERN Hector Sanchez San Martin Universitat Jaume I Ing. Informatica CERN.
Data Management David Nathan & Peter Austin & Robert Munro.
Temporal Compression Of Speech: An Evaluation IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 4, MAY 2008 Simon Tucker and Steve.
Introduction to ELAN Mary Chambers ELAP, Department of Linguistics, SOAS.
1 The Software Development Process  Systems analysis  Systems design  Implementation  Testing  Documentation  Evaluation  Maintenance.
CHAPTER TEN AUTHORING.
Smart Qualitative Data: Methods and Community Tools for Data Mark-Up SQUAD Libby Bishop Language and Computation Day University of Essex 4 October 2005.
Customizing the IMDI metadata schema for endangered languages Heidi Johnson (AILLA) Arienne Dwyer (DOBES)
Speech analysis with Praat Paul Trilsbeek DoBeS training course June 2007.
Collaborative Annotation of the AMI Meeting Corpus Jean Carletta University of Edinburgh.
1 AutoCAD Electrical 2008 What’s New Name Company AutoCAD Electrical 2008 What’s New AMS CAD Solutions
Rundkast at LREC 2008, Marrakech LREC 2008 Ingunn Amdal, Ole Morten Strand, Jørn Almberg, and Torbjørn Svendsen RUNDKAST: An Annotated.
Intermediate 2 Software Development Process. Software You should already know that any computer system is made up of hardware and software. The term hardware.
INFO1408 Database Design Concepts Week 15: Introduction to Database Management Systems.
Praat LING115 November 4, Getting started Basic phonetic analyses with Praat –Creating sound objects Recording, reading from a file, creating from.
A Fully Annotated Corpus of Russian Speech
The Software Development Process
Student Edition: Gale Info Trac Database Lesson Grades 9-12 High School Student Edition: Gale Info Trac Database Lesson Grades 9-12 High School Anita Cellucci.
Project Database Handler The Project Database Handler is a brokering application that mediates interactions between the project database and the external.
The Games Corpus Design, implementation and annotation Agustín Gravano Spoken Language Processing Group Columbia University.
CASE (Computer-Aided Software Engineering) Tools Software that is used to support software process activities. Provides software process support by:- –
ONZEminer Margaret Maclagan, ONZE director Robert Fromont, designer.
20081 Converting workspaces and using SALT & subversion to maintain them. V1.02.
Performance Comparison of Speaker and Emotion Recognition
Link Translation provides training and practical experience on industry- standard Computer Assisted Translation (CAT) tools for our team of linguists.
Basic structure of sphinx 4
TypeCraft Software Evaluation 21/02/ :45 Powered by None Complete: 10 On, Partial: 0 Off, Excluded: 0 Off Country: All, Region:
Lecture 1 Phonetics – the study of speech sounds
Annotation by category – ELAN and ISO DCR Han Slöetjes, Peter Wittenburg Max-Planck-Institute for Psycholinguistics LREC,
Oman College of Management and Technology Course – MM Topic 7 Production and Distribution of Multimedia Titles CS/MIS Department.
1 The Software Development Process ► Systems analysis ► Systems design ► Implementation ► Testing ► Documentation ► Evaluation ► Maintenance.
DocLing2016 Software Tools Peter K. Austin Department of Linguistics SOAS, University of London
Phone-Level Pronunciation Scoring and Assessment for Interactive Language Learning Speech Communication, 2000 Authors: S. M. Witt, S. J. Young Presenter:
Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.
Machine Learning in Practice Lecture 9 Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute.
ELAN as a tool for oral history CLARIN Oral History Workshop Oxford Sebastian Drude CLARIN ERIC 18 April 2016.
Praat: doing phonetics by computer Introductory tutorial Kyuchul Yoon Division of English Kyungnam University.
Visual Information Retrieval
An Introduction to : a closer look at analysing vowels
DATA INTEGRATION FOR LANGUAGE DOCUMENTATION
Multimedia Authoring Tools
Hands-on tutorial: Using Praat for analysing a speech corpus
Automatically managing your music metadata
Applied Linguistics Chapter Four: Corpus Linguistics
Tools for Speech Analysis
Database management systems
Presentation transcript:

Hands-on tutorial: Using Praat for analysing a speech corpus Mietta Lennes Palmse, Estonia Department of Speech Sciences University of Helsinki

Objectives Lecture: Understanding what speech annotation means efficient annotation theoretical pitfalls Exercises: Learning to use Praat for annotating speech basic techniques and analysis displays incremental annotation Exercises: Using simple Praat scripts to analyse a small annotated speech corpus understanding basic acoustic analyses running and editing scripts

Annotation Annotation generally means describing, classifying and organizing (speech) material by systematically adding symbolic labels to its parts. The analyses you will be able to perform are restricted by the accuracy and types of annotations you have for your corpus. Up to date, no automatic speech segmentation or recognition tool exists for any language that can perform as well as a human annotator.

Transcripts are not annotations as such. Annotations and transcripts are not data.

Multiple annotation layers kuva jossa esimerkkejä monenmoisista annotaatiokerroksista

Prerequisites for annotating and analysing a speech corpus Signal files in a format readable by the annotation tool (Praat: WAV, AIFF, AIFC, Next/Sun, NIST; 16- or 8-bit) Sufficiently high signal-to-noise ratio Different speakers should preferably be separated into different audio files (crosstalk is difficult to annotate). High acoustic quality is required for complex acoustic analyses (e.g., formant modeling). If studying speech and interaction, there should be a common timeline for all audio/video/other signal files.

Planning an annotation project Annotation is boring and time-comsuming -> you should make sure it is worth all the work! Annotation should help to run analyses automatically and to reduce the need for manually browsing through your corpus. Explore and practise with a small material, then complete your annotations. What are you aiming to study?

Remember... Speech communication is much more than an ”acoustic form of writing”. Writing things down in a specific notation and carefully classifying them does not make these things nor the categories any more real. All units that you plan to annotate tend to be ”fuzzy” when you try to find them in real speech: the temporal boundaries are unclear, the different categories are sometimes difficult to separate, etc.

Annotation and the Human Factor...

Defining your annotation structure List your units: what kind of labels are allowed? What kind of properties do your units have? Which values are allowed for the properties? How many layers (tiers) of annotation do you need? You should understand how the use of these units, labels and tiers can help you to automatically analyse your material in a consistent way. Do not waste time labeling things that can be automatically measured! (e.g. labeling pause durations into a TextGrid)

Multiple annotation layers : Word units in search focus

Multiple annotation layers: Phone units in search focus

Metadata It is important to gather sufficiently detailed metadata about the speech material (speakers and their background, recording conditions, etc.) Metadata can also be used when analysing the corpus! E.g., the speakers’ sex and age are factors that tend to affect their linguistic behaviour. (If a speech database system is not available, you can encode information about the speakers, e.g., into the filenames.)

Why choose Praat for analysing your corpus? Widely used, well known, well maintained Easily installed on multiple platforms Scriptable All Praat scripts and files can be made fully portable from one system to another. With Praat, you can use your corpus almost anywhere!

Why not to use Praat Video annotation must be done with another tool. Praat does not include a proper database system as such, so searching a speech corpus with Praat must be implemented through Praat scripts (which can become painfully slow). Recommended: If your corpus is large, use Praat (scripts) to dump your annotations and acoustic analysis results to a suitable format and do the searching and statistics somewhere else.

Links Praat: Praat scripts: Linguistic annotation (tools and formats): Annotation guide (in Finnish; a ”public draft” version): An RDF/XML Schema for formally defining your annotation structure, e.g., in your own applications: