Tools for Speech Analysis Julia Hirschberg CS4706 Thanks to Jean-Philippe Goldman, Fadi Biadsy.

Slides:



Advertisements
Similar presentations
Acoustic/Prosodic Features
Advertisements

Building an ASR using HTK CS4706
Royalty Free Music for Schools Do You Have the To Do a Podcast?
Free open source audio recording and editing software 1Using Audacity.
CS 551/651: Structure of Spoken Language Spectrogram Reading: Approximants John-Paul Hosom Fall 2010.
Synthesizing naturally produced tokens Melissa Baese-Berk SoundLab 12 April 2009.
Tools for Speech Analysis Julia Hirschberg CS4995/6998 Thanks to Jean-Philippe Goldman, Fadi Biadsy.
Tools for Speech Analysis Julia Hirschberg CS4995/6998 Thanks to Jean-Philippe Goldman, Fadi Biadsy.
Basic Spectrogram Lab 8. Spectrograms §Spectrograph: Produces visible patterns of acoustic energy called spectrograms §Spectrographic Analysis: l Acoustic.
Xkl: A Tool For Speech Analysis Eric Truslow Adviser: Helen Hanson.
Created by Amanda Shultz About Section 1 Section 2 Section 3 Links.
AN INTRODUCTION TO PRAAT Tina John M.A. Institute of Phonetics and digital Speech Processing - University Kiel Institute of Phonetics and Speech Processing.
Looking at Spectrogram in Praat cs4706, Jan 30 Fadi Biadsy.
Tools for Speech Analysis 2 How do we choose? What kind of data? Which task?
MUSCLE movie data base is a multimodal movie corpus collected to develop content- based multimedia processing like: - speaker clustering - speaker turn.
1 Computing for Todays Lecture 22 Yumei Huo Fall 2006.
Sound in PowerPoint Demonstration Sound File Inserted in PPT  Requires existing file (wav, mp3, wma, or mid)  Insert >Movies & Sounds >Sound from file.
Praat Fadi Biadsy.
Microsoft Office Illustrated Inserting Illustrations, Objects, and Media Clips.
Speech tools Jean-Philippe Goldman Two questions What kind of data ? Which task ?
Phonetics October 1-3, 2008 Phonetics 1.Experimental Phonetics a. Production b. Perception 2. Surveys/Interviews and Phonetics.
Create a Narrated Story with PowerPoint. Basics Enter Text. (Click in the text box and start typing. If a text box is not visible, go to Insert > Text.
Basic and advanced Praat Scripting
Text-To-Speech System for Marathi Miss. Deepa V. Kadam Indian Institute of Technology, Bombay.
Creating a PowerPoint With Sound PowerPoint 2003 Version.
Structure of Spoken Language
Free Sound Recorder By FreeAudioVideoSoft. Pricing & Installation Software is absolutely FREE With agreement to terms and conditions Installation Requirements:
Speech Signal Processing
Activity 2 Mix a WAV file and the sound from a youtube video In this activity, we are going to mix the WAV file created in Activity 1and the sound file.
Hands-on tutorial: Using Praat for analysing a speech corpus Mietta Lennes Palmse, Estonia Department of Speech Sciences University of Helsinki.
Speech analysis with Praat Paul Trilsbeek DoBeS training course June 2007.
Alice Workshop Working with Sound. Sound Working with sound is appealing to students Demo: Penguin Sound.
American Speechsounds How to Use the Program. AmericanSpeechsounds Why use American Speechsounds? Practice the problem sounds of American English Learn.
Experimentation Duration is the most significant feature with around 40% correlation. Experimentation Duration is the most significant feature with around.
What is Audacity? Audacity is a free audio editor and recording program which is classified as open source software. It is easily downloaded to one’s.
Praat LING115 November 4, Getting started Basic phonetic analyses with Praat –Creating sound objects Recording, reading from a file, creating from.
Statistical NLP Spring 2011
1 EndNote X2 Your Bibliographic Management Tool 29 September 2009 Humanities and Social Sciences Resource Teams.
Imposing native speakers’ prosody on non-native speakers’ utterances: Preliminary studies Kyuchul Yoon Spring 2006 NAELL The Division of English Kyungnam.
XP Tutorial 8 New Perspectives on Microsoft Windows XP 1 Microsoft Windows XP Object Linking and Embedding Tutorial 8.
Speech Analysis TA : 林賢進 HW /10/28 1. Goal This homework is aimed to analyze speech from spectrogram, and try to distinguish different initials/
Experimentation Duration is the most significant feature with around 40% correlation. Experimentation Duration is the most significant feature with around.
Administriva l James will run a hands on tutorial in WEB 130 today at 3:30 and again at 2:00 and 3:30 on Thursday. l Can everyone that wants to attend,
The role of prosody in dialect authentication Simulating Masan dialect with Seoul speech segments Kyuchul Yoon Division of English, Kyungnam University.
XP New Perspectives on Microsoft Office Access 2003, Second Edition- Tutorial 8 1 Microsoft Office Access 2003 Tutorial 8 – Integrating Access with the.
Acoustic Phonetics 3/14/00.
Dialect Simulation through Prosody Transfer: A preliminary study on simulating Masan dialect with Seoul dialect Kyuchul Yoon Division of English, Kyungnam.
HW2-2 Speech Analysis TA: 林賢進
Audio Books for Phonetics Research CatCod2008 Jiahong Yuan and Mark Liberman University of Pennsylvania Dec. 4, 2008.
JING SCREEN CAPTURE Anne Perorazio Information Resources Specialist UM Health Sciences Libraries
XP Creating Web Pages with Microsoft Office
영어교육에 있어서의 영어억양의 역할 (The role of prosody in English education) Korea Nazarene University Kyuchul Yoon English Division Kyungnam University.
Praat: doing phonetics by computer Introductory tutorial Kyuchul Yoon Division of English Kyungnam University.
Digital Audio Basics.
An Introduction to : a closer look at analysing vowels
© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
Speech Analysis TA:Chuan-Hsun Wu
Why Study Spoken Language?
Chapter 4 Application Software
N. Capp, E. Krome, I. Obeid and J. Picone
Why Study Spoken Language?
Pitch Detection from Waveform and Spectrogram
Spoken Language Processing:Summing Up
Hands-on tutorial: Using Praat for analysing a speech corpus
From Word Spotting to OOV Modeling
ICT Word Processing Lesson 5: Revising and Collaborating on Documents
Assist. Lecturer Safeen H. Rasool Collage of SCIENCE IT Dept.
Learning How to Create an Online Interactive Poster using
Tools for Speech Analysis
Looking at Spectrogram in Praat cs4706, Jan 30
Presentation transcript:

Tools for Speech Analysis Julia Hirschberg CS4706 Thanks to Jean-Philippe Goldman, Fadi Biadsy

2 Goals Create stimuli for an experiment (i.e. hybridization) Create databases for TTS or research Analyze a speech corpus from an experiment or natural speech Verify/correct an automatic segmentation or pitch track Fix your TTS recordings to sound better, different

3 Data Speech content (noise, multivoice,…) Data Files –Sound/Transcription/PitchContour –Sampling/Quantization 16k 12k 8k 4k 8bit –Size: how much data? –Format Sound: wav, wma, mp3, ogg, aiff, aifc, au, vox, raw, sd, CSL, Ogg/Vorbis, NIST/Sphere,… Translation: sox or Praatsox Transcription schemes: ToBI

4 What tasks do we want to perform ? Visualization and Editing: –Record, play, edit, mix, add effects Analysis: –Spectral information, pitch, intensity Speech manipulation: –Filtering, mixing, adding effects, prosodic manipulation Annotation: –Segmentation, labeling Scripting: –Batch, communication with outside programs

5 Software Options Goldwave(audio editor) Esps Xwaves(routines + visual.) Praat(speech analysis) Wavesurfer(speech editor) Transcriber(annotation tool) Matlab(general purpose soft) OGI speech tools(routines + app. dev.) …winpitch, pitchworks, phonedit, cooledit…..

6 Links (Matlab) (phonedit) (PitchWorks) (WinPitch) m.html (CoolEdit > Audition) m.html

7 How to Evaluate Visualization/Edition Analysis Speech manipulation Annotation support Scripting Plotting Supported formats Platform/installation Evolution/community Accessibility Price

8 Our Choice: Praat Developed by Paul Boersma and David Weenink at the Institute of Phonetic Sciences, University of Amsterdam General purpose speech tool : editing, segmentation and labeling, prosodic manipulation

9

10 Praat Pros: designed for speech analysis (not just sound editing or spectrogram visualization), nice GUI, scripting, active development and community, prosodic manipulation, many scripts available on line Cons: limited scripting language, native format of transcription and pitch files

11 File Management Recording files and saving them –New menu Opening files –Read menu Long and short sound files Other file types –Write menu –Exercise: Record a file with your own name, play it to check, call it ‘ ’, save it to list, write it to a.wav file on disk, remove it from the objects list, read it back in

12 Editing Options from Objects Window Select and edit your name file Spectrum: –Show a spectral slice –Show a spectrogram Pitch: –Show pitch –Check the settings, change the range –Get pitch information: get pitch, get min/max pitch Intensity: –Get intensity information: similar to pitch functions –Check the settings Formant: Display

13 Modifying the Data Changing the pitch contour of your name file: Go to To manipulation Edit the new object Pitch  Stylize pitch (2st) Modify pitch by dragging points up and down Modify duration: –Add points in duration tier –Drag points up and down To save: File -> Publish resynthesis

14 Annotation: Textgrids From objects window, w/ sound file selected –Annotate  To textgrid –Point vs. interval tiers Add a point tier and an interval tier and insert some labels NB: remember to select the interval or point first in the waveform or spectrogram before trying to insert a label

15 Scripting From history: –Praat  new Praatscript  Edit  Paste history –NB: you can run all or part of the script Writing scripts Modifying existing scripts: –Tutorials, scripts, resources, user groups, searchTutorials, scripts, resources, user groups

Sample Praat Script # This script will create a new text-grid for a wav file form Make a text-grid for a.wav file comment Source Directory? sentence Directory C:\Documents and Settings\julila\My Documents\ comment File name? sentence Filename comment Tier Name? sentence Tier endform Read from file... 'directory$‘ ‘filename$' stem$ = left$(filename$,length(filename$)-4) select Sound 'stem$' To TextGrid... 'tier$' 'tier$‘ # tier names, which tiers are point tiers Write to text file... 'directory$'\'stem$'.TextGrid Remove

Task1 Record a file with the following vowels: –/iy/ (heat), /ae/ (hat), /uw/ (you), /aa/ (not) Segment them in an annotation tier Find the height of the first 3 formants for each vowel How do the high vowels (/iy/, /uw) differ from the low vowels? The front (/iy/, /ae/) from the back?

American English vowel space FRONTBACK HIGH LOW ey ow aw oy ay iy ih eh ae aa ao uw uh ah ax ixux

Vowels and Formants Higher F1, lower vowel (/ae/, /aa/) – high vowels have low F1 (/iy/, /uw/) Higher F2, fronter vowel (/iy/, /ae/) – back vowels have low F2 (/uw/, /aa/)

Task2 Record files with different consonant classes as defined by manner of articular/voiced and unvoiced, all in the context of the same vowel /iy/ (e.g. /piyp/, /biyb/, /giyg/, /kiyk/, /miym/, /liyl/ Measure the formants of /iy/ in each context Are they all the same? Are they different from /iy/ said alone?

Task3 Record something in a very loud voice, to produce clipping, and see what the waveform looks like

Task4 Record a file using falling intonation; modify it to produce a rising intonational contour

Task5 Add a textgrid with one interval tier to Task4’s file; label the words in the file, aligning each interval with the word in the waveform

Task6 Record a sample of the same short sentence as angry speech, sad speech, happy speech, and see what things (pitch contour, pitch mean and max, intensity mean and max, spectral information) differ. choose something fully voiced if possible.

25 Help Online help, FAQ, manual Links from Additional tutorials, scripts, resources, user groupsAdditional tutorials, scripts, resources, user groups

26 Next Class Feb 3: More on speech tools and Lab visit