Intro. to Audio Signals Jyh-Shing Roger Jang ( 張智星 ) MIR Lab, CSIE Dept National Taiwan Univ., Taiwan.



What Are Audio Signals?
- Audio signals are...
  - Signals that are audible to humans, such as speech and music
  - The range of fundamental frequencies of audible signals is about 20 ~ 20,000 Hz.
    - The range is wider for young people, narrower for the elderly.
Quiz candidate!

Voice Generation & Reception
- Steps in voice generation & reception
  - Vibration of the voice source
  - Resonance by surrounding objects
  - Traveling through air (or other media)
  - Reception by membranes and neurons in the inner ear
  - Recognition by the brain
- Instances of voice generation
  - Singing
  - Whistling
  - Guitar
  - Flute

Categorization of Audio Signals
- Number of sources
  - Monophonic
  - Polyphonic
- Waveform
  - Quasi-periodic sound
    - Voiced sound of speech
  - Aperiodic sound
    - Unvoiced sound of speech
- Source types
  - Sounds from animals (bioacoustics)
    - Dog barking, cat meowing, frog croaking, duck quacking
  - Sounds from non-animals
    - Car engines, thunder, musical instruments

S/U/V in Speech
- Speech signals can be divided into S, U, V
  - S (silence): no speech activity
  - U (unvoiced): speech activity without vibration of the vocal cords
  - V (voiced): speech activity with vibration of the vocal cords
- How to detect S, U, V?
  - By putting your hand on your throat to feel the vibration
  - By waveform observation
Quiz candidate!
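"Waveform observation" can also be approximated numerically. The sketch below is not from the slides: it assumes one common heuristic, frame energy plus zero-crossing rate (ZCR), and runs it on a synthetic signal (zeros for silence, low-level noise as a stand-in for unvoiced speech, a 150 Hz sine as a stand-in for voiced speech). The thresholds 1e-4 and 0.2 are arbitrary values chosen for this toy signal only.

```matlab
rng(0);                                  % fixed seed so the demo is repeatable
fs = 16000;                              % sample rate in Hz
frameSize = 320;                         % 20 ms per frame at 16 kHz
n = 15*frameSize;                        % each segment spans exactly 15 frames
t = (0:n-1)/fs;

silence  = zeros(1, n);                  % S: no activity
unvoiced = 0.05*randn(1, n);             % U stand-in: weak, noise-like
voiced   = 0.5*sin(2*pi*150*t);          % V stand-in: strong, periodic
x = [silence, unvoiced, voiced];

nFrames = floor(numel(x)/frameSize);
labels = repmat('S', 1, nFrames);
for k = 1:nFrames
    frame  = x((k-1)*frameSize + (1:frameSize));
    energy = mean(frame.^2);                      % average power of the frame
    zcr    = mean(abs(diff(sign(frame))) > 0);    % fraction of sign changes
    if energy < 1e-4          % assumed threshold: almost no energy -> silence
        labels(k) = 'S';
    elseif zcr > 0.2          % assumed threshold: noise-like -> unvoiced
        labels(k) = 'U';
    else                      % energetic and slowly varying -> voiced
        labels(k) = 'V';
    end
end
disp(labels);
```

On real recordings the thresholds would need tuning, and frames that straddle a boundary between sounds can be mislabeled.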

Tools for General Audio Processing
- Tools for recording and waveform observation
  - Cool Edit
  - GoldWave
  - Audacity
  - MATLAB
- Quiz
  - What is the major difference between the waveforms of speech and whistle?

Speech Signal of “Sunday”
- Unvoiced vs. voiced frames

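Silence, Unvoiced and Voiced Sounds
- Examples of S, U, V
  - “Six”
  - “資訊系” (a Mandarin phrase used as a pronunciation example)
[Figure: waveforms with segment labels "suvusvuv" and "suvsus"]
Quiz candidate!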

Human Speech Production

Source-filter Model for Human Speech Production
- Speech is modeled as a rapidly varying excitation signal (the source) shaped by a slowly varying filter.
- The envelope of the power spectrum carries the vocal tract information.
- Two important characteristics of the model are the fundamental frequency (f0) and the formants (F1, F2, F3, ...).
[Figure labels: unvoiced, voiced]
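The model can be tried out with a few lines of MATLAB. This is a minimal sketch under stated assumptions, not the method used in the slides: the source is an impulse train at f0 (a crude stand-in for glottal pulses), and the filter is a cascade of two-pole resonators. The formant frequencies and bandwidths are illustrative values picked for this example.

```matlab
fs = 16000;                          % sample rate in Hz
f0 = 100;                            % fundamental frequency in Hz
duration = 0.5;                      % seconds
nSamples = round(duration*fs);

% Source: impulse train with one pulse per pitch period
period = round(fs/f0);               % pitch period in samples
source = zeros(1, nSamples);
source(1:period:end) = 1;

% Filter: one two-pole resonator per formant (assumed, illustrative values)
formants   = [700 1200 2600];        % centre frequencies in Hz
bandwidths = [80 90 120];            % bandwidths in Hz
speech = source;
for k = 1:numel(formants)
    r     = exp(-pi*bandwidths(k)/fs);       % pole radius (< 1, so stable)
    theta = 2*pi*formants(k)/fs;             % pole angle
    a     = [1, -2*r*cos(theta), r^2];       % denominator coefficients
    speech = filter(1, a, speech);           % apply the resonator
end
speech = 0.8*speech/max(abs(speech));        % normalize the peak to 0.8

% sound(speech, fs);   % uncomment to listen
```

Changing f0 alters the perceived pitch while the formants stay put; changing the formants alters the vowel-like colour while the pitch stays put.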

The Vocal Tract

Glottal Volume Velocity & Resulting Sound Pressure (Voiced)

Speech Production
[Figure: glottal pulses pass through the vocal tract to give the speech signal. Panels: (a) source spectrum + (b) filter function = (c) output energy spectrum]

Videos for Vocal Cords Movement
- Movement of vocal cords
  - http://
  - http://

Parameters for Audio Files
- Three major parameters for recording audio files
  - Sample rate: no. of samples per second
    - 8 kHz (phone quality)
    - 16 kHz (for common speech recognition)
    - 44.1 kHz (CD quality)
  - Bit resolution: no. of bits for representing a sample
    - 8-bit (uint8 with range 0 ~ 255)
    - 16-bit (int16 with range -32768 ~ 32767)
  - No. of channels
    - Mono
    - Stereo
Quiz candidate!
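To see how the three parameters show up in code, here is a small sketch. It assumes the usual convention that floating-point samples lie in [-1, 1]; the scale factors (32767 for 16-bit, 127 with an offset of 128 for 8-bit) are one common mapping, not something specified in the slides.

```matlab
fs = 8000;                          % sample rate: 8 kHz (phone quality)
t  = (0:fs-1)/fs;                   % one second of time stamps
x  = 0.8*sin(2*pi*440*t);           % floating-point samples in [-1, 1]

% Bit resolution: map the same samples to 16-bit and 8-bit integers
x16 = int16(round(x*32767));        % signed, within -32768 ~ 32767
x8  = uint8(round(x*127 + 128));    % unsigned, 0.0 (silence) maps to 128

% Channels: mono is one column, stereo is two columns
mono   = x(:);
stereo = [x(:), x(:)];              % same signal in left and right channels

fprintf('int16 range: %d ~ %d\n', intmin('int16'), intmax('int16'));
fprintf('uint8 range: %d ~ %d\n', intmin('uint8'), intmax('uint8'));
```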

Storage for Audio Files
- Examples of storage requirements
  - 1 min. of recording with fs = 16000, nbits = 16, 1 channel:
    60 (sec) * 16 (kHz) * 2 (bytes) * 1 (channel) = 1920 KB = 1.92 MB
  - 3 min. of CD music with fs = 44.1 kHz, nbits = 16, 2 channels:
    180 (sec) * 44.1 (kHz) * 2 (bytes) * 2 (channels) = 31752 KB ≈ 32 MB
Quiz candidate!
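The same arithmetic can be checked in MATLAB. In this sketch, pcmBytes is a hypothetical helper name introduced for illustration; it counts raw, uncompressed sample data only and ignores any file header.

```matlab
% Uncompressed size in bytes = duration * sample rate * bytes per sample * channels
% (pcmBytes is a hypothetical helper, not a built-in function)
pcmBytes = @(durationSec, fs, nbits, nChannels) ...
    durationSec * fs * (nbits/8) * nChannels;

speechBytes = pcmBytes(60, 16000, 16, 1);     % 1 min. of mono speech
cdBytes     = pcmBytes(180, 44100, 16, 2);    % 3 min. of stereo CD music

fprintf('Speech: %d bytes = %.2f MB\n', speechBytes, speechBytes/1e6);
fprintf('CD:     %d bytes = %.2f MB\n', cdBytes, cdBytes/1e6);
```

Here 1 KB is taken as 1000 bytes and 1 MB as 10^6 bytes, matching the figures on the slide.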

Other Interesting Phenomena
- Interesting phenomena about audio signals
  - Don’t trust what you have heard! (Vision rules)
  - Perceived speech is highly context dependent

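Hints for Exercises
- How to generate a sine wave signal:
  - Math formula: y(t) = 0.8 sin(2π f t), with f = 440 Hz
  - MATLAB code:
      duration = 3;                    % length in seconds
      f = 440;                         % frequency in Hz
      fs = 16000;                      % sample rate in Hz
      time = (0:duration*fs-1)/fs;     % time stamp of each sample
      y = 0.8*sin(2*pi*f*time);        % sine wave with amplitude 0.8
      plot(time, y);                   % show the waveform
      sound(y, fs);                    % play it back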