EE513 Audio Signals and Systems Introduction Kevin D. Donohue Electrical and Computer Engineering University of Kentucky.


Question! If a tree falls in the forest and nobody is there to hear it, will it make a sound?

Ambiguity! Merriam-Webster Dictionary: Sound
a: a particular auditory impression
b: the sensation perceived by the sense of hearing
c: mechanical radiant energy that is transmitted by longitudinal pressure waves in a material medium (as air) and is the objective cause of hearing

Electronic Audio Systems
- Sound sources: vibrations at 20 Hz to 20 kHz
- Amplification, signal conditioning
- Electroacoustic transducer
- Processing for the intended application
- Transmission media
- Storage
- Information extraction / measurement
- Playback

Natural Audio Systems
- Generation
- Propagation
- Amplification
- Transduction
- Information understanding

Synthetic Audio: Imitating Nature
- 1780: Wolfgang von Kempelen's Speaking Machine
- Mid 1800s: Charles Wheatstone
- Late 1800s: Alexander Graham Bell
- 1939: Homer Dudley's Voder
- 1898: Thaddeus Cahill's Telharmonium (first music synthesizer)
- 1919: Lev Theremin's Theremin

Speech Analysis and Synthesis
- Communication channels (acoustic and electric)
- 1874/1876: (Antonio Meucci's) Alexander Graham Bell's telephone
- 1940s: Homer Dudley's Channel Vocoder, the first analysis-synthesis system

Voice-Coding Models
The general speech model: speech sounds can be analyzed by determining the states of the vocal system components (vocal cords, tract, lips, tongue, ...) for each fundamental sound of speech (phoneme). In the block diagram, voiced speech corresponds to quasi-periodic pulsed air and unvoiced speech to an air burst or continuous flow; either source drives the vocal tract filter and then the vocal radiator.
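As a rough illustration of this source-filter idea (not from the slides; the pitch and filter values below are assumptions), a short MATLAB sketch drives a simple bandpass filter, standing in for the vocal tract, with either a pulse train or noise:

% Minimal source-filter sketch (assumed pitch and filter values)
fs = 8000;                                % sampling rate
f0 = 120;                                 % assumed vocal-cord (pitch) rate in Hz
voiced = zeros(fs,1);                     % one second of source signal
voiced(1:round(fs/f0):end) = 1;           % quasi-periodic pulse train (voiced source)
unvoiced = randn(fs,1);                   % noise-like source (unvoiced speech)
[b,a] = butter(2,[300 900]/(fs/2));       % crude bandpass standing in for the vocal tract filter
soundsc(filter(b,a,voiced),fs)            % buzzy, vowel-like
pause(1.5)
soundsc(filter(b,a,unvoiced),fs)          % hiss-like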

Spectral Analysis: Voiced Speech
Spectral envelope => vocal tract formants
Harmonic peaks => vocal cord pitch
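As a rough illustration (the file name and analysis settings below are assumptions), an FFT shows the harmonic peaks while an LPC fit traces the smooth envelope:

% Minimal spectral-analysis sketch for one voiced frame
[x, fs] = wavread('vowel.wav');            % hypothetical recording of a voiced vowel
L = round(0.04*fs);                        % 40 ms frame
frame = x(1:L).*hamming(L);
N = 1024;
X = 20*log10(abs(fft(frame,N)) + eps);     % harmonic peaks reveal the vocal cord pitch
a = lpc(frame,12);                         % 12th-order all-pole fit
[H, f] = freqz(1, a, N/2, fs);             % smooth envelope; peaks approximate the formants
plot(f, X(1:N/2), f, 20*log10(abs(H) + eps))
xlabel('Hz'); ylabel('dB')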

Time Analysis: Voiced Speech
Time envelope => volume dynamics
Oscillations => vocal cord motion
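One simple way to see both features is to overlay a smoothed magnitude envelope on the waveform; a minimal sketch with an assumed file name:

% Minimal time-envelope sketch
[x, fs] = wavread('vowel.wav');            % hypothetical recording
w = round(0.02*fs);                        % 20 ms smoothing window
env = filter(ones(w,1)/w, 1, abs(x));      % moving average of |x| -> volume dynamics
t = (0:length(x)-1)'/fs;
plot(t, x, t, env)                         % fast oscillations under the slow envelope
xlabel('Seconds'); ylabel('Amplitude')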

Spectrogram Analysis (spectrogram, frequency vs. time, of the spoken words "There", "shoe", "old", "do", "She lived")

Spectrogram of CD sound (frequency vs. time)
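Spectrograms like these can be produced with the Signal Processing Toolbox; a minimal sketch with assumed window and overlap values:

[x, fs] = wavread('speech.wav');                        % hypothetical recording
spectrogram(x, hamming(256), 192, 512, fs, 'yaxis')     % 256-sample window, 75% overlap
% Voiced regions show fine harmonic striations under darker formant bands.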

Speech Recognition
- 1920s: Radio Rex
- 1950s: (Bell Labs) digit recognition; spectral/formant analysis; filter banks
- 1960s: neural networks
- 1970s: ARPA project for speech understanding; applications of spectral analysis methods (FFT, cepstral/homomorphic, LPC)
- 1970s: application of pattern matching methods (DTW and HMM)

Speech Recognition
- 1980s: standardized training and testing with large corpora (TIMIT, RM, DARPA)
- New front ends (feature extractors), more perceptually based
- Dominance/development of HMMs
- Backpropagation and neural networks
- Rule-based AI systems

Specification of Speech Recognition
- Speaker dependent or independent
- Recognize isolated, continuous, or spot speech
- Vocabulary size, grammar perplexity, speaking style
- Recording conditions

Components of Speech Recognition
Input speech -> transduction -> acoustic/electronic front end -> local match -> global detector (with language model) -> detected speech string
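The front end typically slices the waveform into short frames and reduces each frame to a feature vector before any matching takes place. A minimal sketch (assumed frame sizes; log energy plus crude cepstral-like coefficients, where real systems would use MFCCs or similar):

% Minimal front-end framing/feature sketch
[x, fs] = wavread('speech.wav');                 % hypothetical input recording
flen = round(0.025*fs); hop = round(0.010*fs);   % 25 ms frames, 10 ms hop
nframes = floor((length(x)-flen)/hop) + 1;
feat = zeros(nframes,13);
for k = 1:nframes
    seg = x((k-1)*hop + (1:flen)).*hamming(flen);
    S = abs(fft(seg,512));
    c = dct(log(S(1:64) + eps));                     % cepstral-like coefficients
    feat(k,:) = [log(sum(seg.^2) + eps); c(1:12)]';  % log energy + 12 coefficients per frame
end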

Matlab Examples
% Create and play a 2 second 440 Hz tone in Matlab:
fs = 8000;                       % Set a sampling frequency
fq = 440;                        % Frequency to play
t = [0:round(2*fs)-1]/fs;        % Sampled time axis
sig = cos(2*pi*fq*t);            % Create sampled signal
soundsc(sig,fs)                  % Play it
plot(t,sig); xlabel('Seconds'); ylabel('Amplitude')
wavwrite(sig,fs,'t440.wav')
clear                            % Remove all variables from work space

% Reload tone and weight it with a decaying exponential of time constant 0.6 seconds
tc = 0.6;                        % Set time constant
[y, fs] = wavread('t440.wav');   % Read in wave file
t = [0:length(y)-1]'/fs;         % Create sampled time axis
dw = exp(-t/tc);                 % Compute sampled decaying exponential
dsig = y.*dw;                    % Multiply sinusoid with decaying exponential
soundsc(dsig,fs)
plot(t,dsig); xlabel('Seconds'); ylabel('Amplitude')
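Note that wavwrite and wavread were removed in newer MATLAB releases; the same steps can be written with audiowrite and audioread (a minimal equivalent sketch):

fs = 8000; t = (0:2*fs-1)/fs; sig = cos(2*pi*440*t);
audiowrite('t440.wav', sig, fs)            % replaces wavwrite (note the argument order)
[y, fs] = audioread('t440.wav');           % replaces wavread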

Matlab Examples
Explore demo and help files:
>> help script
 SCRIPT About MATLAB scripts and M-files.
    A SCRIPT file is an external file that contains a sequence of MATLAB statements.
    By typing the filename, subsequent MATLAB input is obtained from the file.
    SCRIPT files have a filename extension of ".m" and are often called "M-files".
    To make a SCRIPT file into a function, see FUNCTION.
    See also type, echo.
    Reference page in Help browser: doc script
In the help window (click on the question mark), go through the section on programming, then go to the demo tab and view a few of the demos.

Matlab Examples In class examples …

Matlab Exercise
- Use the sine/cosine function in Matlab to write a function that generates a Dorian scale (for testing the function use start tones between 100 and 440 Hz with a sampling rate of 8 kHz). Let the Matlab function input arguments be the starting frequency and the time interval for each scale tone in seconds. Let the output be a vector of samples that can be played with the Matlab command "soundsc(v,8000)" (where v is the vector output of your function).
- The frequency range of a scale covers one octave, which implies the last frequency is twice the starting frequency. On most fixed-pitch instruments, 12 semitones or half steps make up the notes within an octave. The Dorian scale sequentially increases by a whole, half, whole, whole, whole, half, and whole step (8 notes altogether, including the starting note).

Matlab Exercise - Scales

Interval          | Just                   | Pythagorean             | Equal Temperament
Interval 0  (1)   | 1/1 = 1                | 1                       | 2^(0) = 1
Interval 1        | 16/15                  | 256/243                 | 2^(1/12)
Interval 2  (2)   | 10/9 (or 9/8)          | 9/8                     | 2^(2/12)
Interval 3  (3)   | 6/5                    | 32/27                   | 2^(3/12)
Interval 4        | 5/4                    | 81/64                   | 2^(4/12)
Interval 5  (4)   | 4/3                    | 4/3                     | 2^(5/12)
Interval 6        | 45/32 (or 64/45)       | 1024/729 (or 729/512)   | 2^(6/12)
Interval 7  (5)   | 3/2                    | 3/2                     | 2^(7/12)
Interval 8  (6)   | 8/5                    | 128/81                  | 2^(8/12)
Interval 9        | 5/3                    | 27/16                   | 2^(9/12)
Interval 10 (7)   | 7/4 (or 16/9 or 9/5)   | 16/9                    | 2^(10/12)
Interval 11       | 15/8                   | 243/128                 | 2^(11/12)
Interval 12 (8)   | 2/1 = 2                | 2                       | 2^(12/12) = 2
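Using the equal-temperament column, each semitone multiplies the frequency by 2^(1/12), so the exercise reduces to stepping through the Dorian semitone offsets. One possible sketch (assuming the 8 kHz sampling rate specified above):

function v = dorian(f0, dur)
% DORIAN  Equal-tempered Dorian scale starting at f0 Hz, dur seconds per tone.
fs = 8000;
steps = cumsum([0 2 1 2 2 2 1 2]);          % whole/half-step pattern -> semitone offsets 0..12
t = (0:round(dur*fs)-1)/fs;                 % time axis for one tone
v = [];
for k = 1:length(steps)
    f = f0*2^(steps(k)/12);                 % equal-temperament frequency of this note
    v = [v, cos(2*pi*f*t)];                 % append this tone
end

For example, soundsc(dorian(220, 0.5), 8000) plays the scale starting at 220 Hz.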

Matlab Exercise - Famous Notes
Middle C = 261.63 Hz (standard tuning)
Concert A (A above middle C) = 440 Hz
Middle C = 256 Hz (scientific tuning)
Lowest note on piano: A = 27.5 Hz
Highest note on piano: C = 4186 Hz
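These values are related by octaves and equal-temperament semitones; a quick check in MATLAB (using f = 440*2^(n/12) for n semitones away from concert A):

440/2^4                 % lowest piano A, four octaves below concert A   -> 27.5 Hz
440*2^(-9/12)           % middle C, nine semitones below concert A       -> 261.63 Hz
440*2^(-9/12)*2^4       % highest piano C, four octaves above middle C   -> about 4186 Hz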