LE 460 L Acoustics and Experimental Phonetics L-13


Anu Khosla, DRDO, Delhi (khoslaanu@yahoo.co.in, 9313979365)

Introduction
- Most analysis methods are not designed for sounds whose characteristics change over time.
- The practical solution is to model the speech signal as a slowly varying function of time: over intervals of 5 to 25 ms the speech characteristics change little and can be treated as constant.
- Speech is therefore analysed in small segments, called analysis intervals.
- The optimal analysis interval length depends on the kind of information you want to extract from the speech signal.
- The analysis results always represent some kind of average over the analysis interval.

Parameters for Analysis
Three parameters must be decided for analysis.

Window length
- There is no single optimal window length that fits all circumstances; it depends on the type of analysis and the type of signal.
- For spectrograms one often chooses either 5 ms (wideband spectrogram) or 40 ms (narrowband spectrogram).
- For pitch analysis a window length of 40 ms is more appropriate.

Time step
- This parameter determines the amount of overlap between successive segments.
- If the time step is much smaller than the window length, the overlap is large; if the time step is larger than the window length, there is no overlap at all.
- In general we want at least 50% overlap between two succeeding frames, so we choose a time step smaller than half the window length.
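The window-length and time-step scheme above can be sketched in NumPy. This is an illustrative helper, not code from the lecture; the function name and default values (25 ms window, 10 ms step) are assumptions chosen to match the ranges mentioned in the slides.

```python
import numpy as np

def frame_signal(x, fs, window_length=0.025, time_step=0.010):
    """Split signal x (sampled at fs Hz) into overlapping analysis frames.

    window_length and time_step are in seconds; with these defaults
    (25 ms window, 10 ms step) successive frames overlap by 60%.
    """
    frame_len = int(round(window_length * fs))
    hop = int(round(time_step * fs))
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

fs = 16000
x = np.random.randn(fs)          # 1 second of noise as a stand-in for speech
frames = frame_signal(x, fs)
print(frames.shape)              # (98, 400): 98 frames of 400 samples each
```

Each row of `frames` is one analysis interval; a time step of 10 ms against a 25 ms window satisfies the "at least 50% overlap" guideline from the slide.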

Window shape
- In general we want the sound segment's amplitudes to start and end smoothly.
- Several window shapes are popular in speech analysis: the square (rectangular) window, the Hamming window, the Hanning window, and the Bartlett window.
- In Praat the default windowing function is the Gaussian window.
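The window shapes listed above can be generated directly in NumPy; the tapered windows all start and end near zero, which is what makes the segment fade in and out smoothly. The Gaussian below is a hand-rolled sketch (the width parameter 0.4 is my own choice for illustration, not Praat's exact definition).

```python
import numpy as np

N = 400  # a 25 ms frame at 16 kHz

n = np.arange(N)
windows = {
    "rectangular": np.ones(N),
    "hamming":     np.hamming(N),
    "hanning":     np.hanning(N),
    "bartlett":    np.bartlett(N),
    # Illustrative Gaussian window; the 0.4 width factor is an assumption
    "gaussian":    np.exp(-0.5 * ((n - (N - 1) / 2) / (0.4 * (N - 1) / 2)) ** 2),
}

for name, w in windows.items():
    # tapered windows have endpoints at or near zero
    print(f"{name:12s} endpoints: {w[0]:.3f} {w[-1]:.3f}")
```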

Speech Analysis: Short-Time Analysis
In the time domain:
- Short-time energy: used to segment speech into smaller units
- Short-time zero-crossing rate: used to help in making voicing decisions (a high ZCR indicates unvoiced speech)
- Short-time autocorrelation: pitch determination
In the frequency domain:
- Fourier analysis: spectrograms, formants
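The two time-domain measures above are simple to compute per frame. A minimal sketch (the voiced/unvoiced stand-in signals here are synthetic illustrations, not real speech):

```python
import numpy as np

def short_time_energy(frame):
    # sum of squared amplitudes within one analysis frame
    return np.sum(frame.astype(float) ** 2)

def zero_crossing_rate(frame):
    # fraction of successive sample pairs whose signs differ
    signs = np.sign(frame)
    return np.mean(signs[:-1] != signs[1:])

fs = 8000
t = np.arange(fs) / fs
voiced_like = np.sin(2 * np.pi * 120 * t)                     # low-frequency, periodic
unvoiced_like = np.random.default_rng(0).standard_normal(fs)  # noise-like

# noise-like (unvoiced) signals cross zero far more often
print(zero_crossing_rate(voiced_like) < zero_crossing_rate(unvoiced_like))  # True
```

This is the sense in which a high ZCR points to unvoiced speech: a 120 Hz voiced-like tone crosses zero about 240 times per second, while broadband noise crosses on roughly half of all sample pairs.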

Computerized Speech: Precautions
- Avoid making recordings in reverberant rooms (a church is very reverberant).
- Avoid making recordings in places where the environment is noisy and uncontrollable.
- To avoid large intensity variations in the recording, keep the distance from the speaker's mouth to the microphone as constant as possible.
- Avoid simultaneous speaking.

Computerized Speech
Speech (sound) is analog; computers are digital. We need to convert. The speech signal level varies with time.

Sampling
- Sampling is the reduction of a continuous signal to a discrete signal.
- The sampling frequency (or sampling rate) fs is defined as the number of samples obtained in one second (samples per second): fs = 1/T, where T is the sampling period.
- Shannon and Nyquist proved in the 1930s that for the digital signal to be a faithful representation of the analog signal, a relation between the sampling frequency and the bandwidth of the signal must be maintained.
- The Nyquist-Shannon sampling theorem: a sound s(t) that contains no frequencies higher than F hertz is completely determined by its sample values at a series of points spaced 1/(2F) seconds apart.
- Sample values at intervals of 1/(2F) seconds correspond to a sampling frequency of 2F hertz.
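A consequence of the theorem can be checked numerically: two sine waves whose frequencies differ by exactly fs produce identical sample values, so frequencies above fs/2 are indistinguishable from lower ones. A minimal NumPy sketch:

```python
import numpy as np

fs = 1000            # sampling frequency: 1000 samples per second
T = 1 / fs           # sampling period: fs = 1/T
n = np.arange(50)    # sample indices

f1 = 100             # below the Nyquist frequency fs/2 = 500 Hz
f2 = f1 + fs         # 1100 Hz: differs from f1 by exactly fs

s1 = np.sin(2 * np.pi * f1 * n * T)
s2 = np.sin(2 * np.pi * f2 * n * T)

# The two sines produce identical samples: the 1100 Hz tone
# cannot be distinguished from its 100 Hz alias.
print(np.allclose(s1, s2))  # True
```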

Poor Sampling: Sampling Frequency = 1/2 × Wave Frequency (one sample every two wave periods)

Even Worse: Sampling Frequency = 1/3 × Wave Frequency

Higher Sampling Frequency: Sampling Frequency = 2/3 × Wave Frequency

Getting Better: Sampling Frequency = Wave Frequency

Good Sampling: Sampling Frequency = 2 × Wave Frequency

Shannon-Nyquist Sampling Theorem
- A sampled time signal must not contain components at frequencies above half the sampling rate (the so-called Nyquist frequency).
- The highest frequency that can be accurately represented is one-half of the sampling rate.

Range of Human Hearing
- 20 – 20,000 Hz
- We lose high-frequency response with age; women generally have better high-frequency response than men.
- To reproduce 20 kHz requires a sampling rate of 40 kHz.
- Sampling below the Nyquist rate introduces aliasing.

Effect of Aliasing
- The Fourier theorem states that any waveform can be reproduced as a sum of sine waves.
- Improperly sampled signals will contain other (alias) sine-wave components.
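Where an alias lands can be computed directly: after sampling at fs, a tone at frequency f appears at the nearest frequency it folds to below fs/2. The helper below is an illustrative sketch of that folding rule, not code from the lecture:

```python
def aliased_frequency(f, fs):
    """Apparent frequency of a sine of frequency f Hz sampled at fs Hz."""
    return abs(f - fs * round(f / fs))

fs = 40000  # 40 kHz, adequate for 20 kHz audio
print(aliased_frequency(15000, fs))  # 15000: below fs/2, preserved
print(aliased_frequency(25000, fs))  # 15000: folded back below fs/2
print(aliased_frequency(39000, fs))  # 1000:  aliases to a 1 kHz tone
```

So a 39 kHz component that slips into a 40 kHz recording shows up as a spurious 1 kHz sine wave mixed into the signal.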

Half the Nyquist Frequency

Nyquist Frequency

Recovery of a sampled sine wave for different sampling rates

Sampling

Quantization and encoding of a sampled signal

Quantization Error
- When a signal is quantized, we introduce an error: the coded signal is an approximation of the actual amplitude value.
- The difference between the actual value and the coded value (the zone midpoint) is referred to as the quantization error.
- The more zones, the smaller the zone width Δ, which results in smaller errors.
- BUT the more zones, the more bits required to encode the samples, and hence a higher bit rate.
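The midpoint-coding rule bounds the quantization error by half a zone width, Δ/2, and the bound shrinks as bits are added. A minimal sketch of a uniform quantizer (the function and its range convention are illustrative assumptions):

```python
import numpy as np

def quantize(x, n_bits, x_max=1.0):
    """Uniform quantizer over [-x_max, x_max) with 2**n_bits zones."""
    delta = 2 * x_max / 2 ** n_bits     # zone width
    idx = np.floor(x / delta)           # which zone each sample falls in
    return (idx + 0.5) * delta          # code each sample by its zone midpoint

x = np.linspace(-0.999, 0.999, 10000)
for n_bits in (3, 8, 16):
    delta = 2.0 / 2 ** n_bits
    err = np.abs(quantize(x, n_bits) - x)
    # quantization error never exceeds half a zone width, delta/2
    print(n_bits, err.max() <= delta / 2 + 1e-12)
```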

Digitization of Analog Signal
- Sample the analog signal in time and amplitude, coding each sample value by the closest approximation (quantization-level midpoints at ±D/2, ±3D/2, ±5D/2, ±7D/2 for the 3-bits-per-sample example shown in the figure).
- Bit rate: Rs = (bits per sample) × (samples per second)
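The bit-rate formula works out as follows (the stereo CD figure uses the standard 44.1 kHz / 16-bit / 2-channel parameters; the `channels` factor is an extension of the slide's formula for multi-channel audio):

```python
# Bit rate Rs = bits per sample x samples per second (x channels, if stereo)
def bit_rate(bits_per_sample, fs, channels=1):
    return bits_per_sample * fs * channels

print(bit_rate(3, 8000))        # 24000 bps for a 3-bit, 8 kHz signal
print(bit_rate(16, 44100, 2))   # 1411200 bps: CD-quality stereo audio
```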

All ADCs have a fixed highest sampling frequency, and to guarantee that the input contains no frequencies higher than half this frequency, we have to filter them out. If we don't filter out these frequencies, they get aliased and also contribute to the digitized representation.

For most phonemes, almost all of the energy is contained in the 5 Hz – 4 kHz range, allowing a sampling rate of 8 kHz. This is the sampling rate used by nearly all telephony systems. CD-quality audio is sampled at 44.1 kHz with 16 bits per sample.