GCT731 Fall 2014 Topics in Music Technology - Music Information Retrieval: Overview of MIR Systems, Audio and Music Representations (Part 1)

Presentation transcript:

GCT731 Fall 2014 Topics in Music Technology - Music Information Retrieval
Overview of MIR Systems
Audio and Music Representations (Part 1)

Outline
- Overview of MIR system
  - Human Listening
  - Machine Listening
- Audio and Music representations
  - Time-domain representation: Waveform
  - Time-frequency domain representations: Sinusoids, DFT, STFT, Spectrogram

Human Listening L. Watts, “Visualizing Complexity in the Brain”, Ears Auditory Transduction watch?v=PeTriGTENoc

Machine Listening
- Emulate the human auditory system?
  - It may be better to understand its functionalities at a high level and implement them in ways that are efficient for machines.
- Basic functionalities
  - Capture sounds and convert the air vibration to a form the machine can access
  - Transform the input to obtain a better view of the sound
  - Extract only the necessary parts
  - Obtain the desired information from the extracted parts

(Content-based) MIR System
[Figure: block diagram of MIR system: Sound Capture → Representation Transform → Feature Extraction → Algorithms]

Sound Capture
- Microphone
  - Mechanical vibration to electrical signals
- A-D converter
  - Sampling and quantization
  - Produces digital waveforms
- Often, we store the waveforms as audio files
  - If necessary, the audio files are compressed (mp3, wma, ...)

Representation Transform
- Transform waveforms to obtain a better view of sounds
  - Mostly using sinusoidal basis functions
- Types
  - Short-Time Fourier Transform (STFT): Spectrogram
  - Constant-Q transform
  - Auditory filter banks
  - Remapped spectrogram (frequency or amplitude)
  - Auto-correlation

Feature Extraction and Algorithms
- Feature extraction
  - Extract only the necessary variations in the data representation
- Algorithms
  - Determine categories or specific values through training
- Two approaches in feature extraction and algorithms
  - Heuristic approach: make computational rules based on domain knowledge and trial and error
  - Learning-based approach: train the system using labeled (or unlabeled) data
    - The rest of this course is all about this approach

Sound Capture
- Microphone
  - Mechanical vibration to electrical signals
  - Followed by pre-amplifiers
  - Microphones and pre-amps have characteristic frequency responses that color the input sound
- A-D converter
  - Sampling: continuous-time to discrete-time signals
  - Quantization: a finite number of amplitude steps
  - Produces digital waveforms
- Often, multiple input channels are used
  - Stereo (2-ch) is standard in music recordings
  - Microphone arrays: good for sound localization and spatial filtering (e.g. beam-forming)

Sampling
- Convert continuous signals to a series of discrete numbers by picking signal values at uniform time intervals
- Sampling theorem
  - The sampling rate must be more than twice the highest frequency the continuous signal contains.
  - A lowpass filter is applied before sampling to avoid aliasing
- Humans can hear up to about 20 kHz
  - Sampling rate of 40 kHz or above
- Examples of sampling rates
  - Speech: 8 kHz, 16 kHz
  - Music: 22.05 kHz, 44.1 kHz, 48 kHz
  - Professional audio gear: 48 kHz, 96 kHz
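To make the aliasing point concrete, here is a minimal sketch using only the Python standard library. The function names `sample_cosine` and `alias_frequency` are illustrative and not from the slides. It samples a 30 kHz cosine at 40 kHz (no lowpass filter applied) and shows that the samples are indistinguishable from those of a 10 kHz cosine.

```python
import math

def sample_cosine(freq_hz, sample_rate_hz, num_samples):
    """Sample cos(2*pi*f*t) at t = n / fs for n = 0 .. num_samples-1."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency in [0, fs/2] that freq_hz folds onto after sampling."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

fs = 40_000                      # sampling rate in Hz
too_high = 30_000                # above the Nyquist frequency fs/2 = 20 kHz
alias = alias_frequency(too_high, fs)

samples_high = sample_cosine(too_high, fs, 8)
samples_alias = sample_cosine(alias, fs, 8)
max_difference = max(abs(a - b) for a, b in zip(samples_high, samples_alias))
print(alias, max_difference)
```

Once sampled, the two tones cannot be told apart, which is why the lowpass filter has to come before the A-D converter rather than after it.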

Quantization
- Convert a continuous range of values to a finite set of amplitude steps
- Creates "quantization error"
  - Can be regarded as additive noise
  - A sufficient number of quantization steps is necessary to keep the noise from being audible
- Examples of quantization steps (dynamic range)
  - 8 bit: 48 dB
  - 16 bit: 96 dB
  - 24 bit: 144 dB
  - Human ears: about 110 dB (depending on frequency)
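The dynamic-range figures follow from the ratio between the largest amplitude and one quantization step, 20·log10(2^bits), which is about 6.02 dB per bit. The sketch below reproduces them and shows the rounding error of a simple uniform quantizer. The helper names are illustrative, and the quantizer assumes inputs in [-1, 1) with no clipping or dithering.

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)

def quantize(value, bits):
    """Round value (assumed in [-1, 1)) to the nearest of 2**bits uniform steps."""
    step = 2.0 / (2 ** bits)
    return round(value / step) * step

for bits in (8, 16, 24):
    print(bits, "bit:", round(dynamic_range_db(bits), 2), "dB")

# The quantization error is bounded by half a step.
original = 0.3
error_8bit = abs(quantize(original, 8) - original)
half_step_8bit = (2.0 / 2 ** 8) / 2
print(error_8bit, half_step_8bit)
```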

(Digital) Waveform
- The most basic audio representation that computers can take
  - x(n) = [a1, a2, a3, ...]
- Good for viewing energy changes
  - Overall dynamic range when zoomed out
  - Fine-time note onsets when zoomed in
- But not very intuitive

Another View of Waveform
- A waveform can be seen as representing a signal with the following basis functions: [equation not preserved]
- For example, the signal x(n) is like: [equation not preserved]
- Can we find better basis functions?
  - New basis functions: [equation not preserved]

Sinusoids
- A periodic waveform drawn from a circle: $x(t) = A \sin(\omega t + \phi)$
  - $A$: Amplitude
  - $\omega$: Angular Frequency
  - $\phi$: Initial Phase
- Why sinusoids are important
  - Fundamental in physics
  - Eigenfunctions of linear systems
  - The human ear is a kind of spectrum analyzer

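Discrete Fourier Transform (DFT)
- Complex sinusoid
  - By Euler's identity, $e^{j\theta} = \cos\theta + j\sin\theta$, so $s_k(n) = e^{j 2\pi k n / N} = \cos(2\pi k n / N) + j\sin(2\pi k n / N)$
- Discrete Fourier Transform
  - Inner product with a complex sinusoid: $X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}$, for $k = 0, 1, \ldots, N-1$
- Inverse Discrete Fourier Transform
  - $x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k n / N}$, for $n = 0, 1, \ldots, N-1$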
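The inner-product view of the DFT can be written out directly. The sketch below assumes the common convention in which the forward transform uses a negative exponent and the inverse carries the 1/N factor; the function names `dft` and `idft` are illustrative. It is a direct O(N²) evaluation for study, not the FFT.

```python
import cmath

def dft(x):
    """X(k) = sum over n of x(n) * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """x(n) = (1/N) * sum over k of X(k) * exp(+j*2*pi*k*n/N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
X = dft(x)
reconstructed = idft(X)
print(X)
print([round(value.real, 6) for value in reconstructed])
```

The inverse transform recovers the original samples up to floating-point rounding, which is a quick check that the two sign conventions match.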

Basis Function View
[Equations: DFT and inverse DFT written in terms of basis functions]

Matrix Multiplication View of DFT
- In practice, we do not compute this matrix product directly. There is a more efficient way, called the Fast Fourier Transform (FFT).
- Complexity reduction by FFT: $O(N^2) \rightarrow O(N \log_2 N)$

Practical Form of DFT
- The DFT produces complex numbers!
- Magnitude
  - Corresponds to the amplitude at frequency bin k (its square is the energy)
- Phase
  - Corresponds to the phase at frequency bin k
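As a small worked example of reading magnitude and phase from a DFT bin, the sketch below builds a cosine that completes exactly 5 cycles in 64 samples and evaluates only that bin. The helper `dft_bin` and the chosen values are illustrative assumptions. For a real cosine that sits exactly on a bin, the bin magnitude is A·N/2 and the bin angle equals the initial phase.

```python
import cmath
import math

def dft_bin(x, k):
    """Evaluate a single DFT bin: inner product of x with exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))

N, k0 = 64, 5                       # frame length and the bin the cosine sits on
amplitude, phase = 0.8, math.pi / 4
x = [amplitude * math.cos(2 * math.pi * k0 * n / N + phase) for n in range(N)]

X_k0 = dft_bin(x, k0)
magnitude = abs(X_k0)               # equals amplitude * N / 2 for an on-bin cosine
angle = cmath.phase(X_k0)           # equals the initial phase of the cosine
estimated_amplitude = 2 * magnitude / N
print(estimated_amplitude, angle)
```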

Examples of DFT
[Figures: DFT of a sine waveform, a drum, and a flute]

Short-Time Fourier Transform (STFT)
- The DFT assumes that the signal is stationary
  - It is not a good idea to apply the DFT to long and dynamically changing signals like music
  - Instead, we segment the signal and apply the DFT to each segment separately
- Short-Time Fourier Transform
  1. Segment a frame using a window function
  2. Zero-pad the frame if necessary
  3. Apply the DFT to the zero-padded windowed waveform
  4. Advance by the "hop size"
  5. Repeat steps 1-4 until the end of the signal
- This produces a 2-D time-frequency representation
  - The "spectrogram" is obtained from the magnitude
- Parameters: window size, window type, FFT size, hop size
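The five steps above can be sketched in plain Python as follows. This is an illustrative sketch, not the course's implementation: it assumes a periodic Hann window, drops any trailing samples that do not fill a whole frame, and uses a direct DFT (a real system would use an FFT). The names `hann`, `dft_magnitudes` and `stft_magnitude` are hypothetical.

```python
import cmath
import math

def hann(size):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / size) for n in range(size)]

def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0 .. N/2 of a real-valued frame (direct O(N^2) DFT)."""
    N = len(frame)
    return [abs(sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)))
            for k in range(N // 2 + 1)]

def stft_magnitude(x, window_size, hop_size, fft_size):
    window = hann(window_size)
    spectrogram = []
    start = 0
    while start + window_size <= len(x):
        # Step 1: segment a frame using the window function
        frame = [x[start + n] * window[n] for n in range(window_size)]
        # Step 2: zero-pad up to the FFT size
        frame += [0.0] * (fft_size - window_size)
        # Step 3: DFT of the zero-padded windowed frame (magnitude only)
        spectrogram.append(dft_magnitudes(frame))
        # Steps 4-5: advance by the hop size and repeat
        start += hop_size
    return spectrogram

fs = 8000
signal = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(512)]
spec = stft_magnitude(signal, window_size=128, hop_size=64, fft_size=256)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_frequency_hz = peak_bin * fs / 256
print(len(spec), len(spec[0]), peak_frequency_hz)
```

Each row of `spec` is one time frame and each column one frequency bin, so the list of rows is the magnitude spectrogram of the test tone.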

Windowing
- Types of window functions
  - Rectangular, Triangular, Hann, Hamming, Blackman-Harris
  - Trade-off between the width of the main lobe and the level of the side lobes
[Figure: window spectrum annotated with main-lobe width and side-lobe level]

Zero-padding
- Adding zeros to a windowed frame in the time domain
- Corresponds to "ideal interpolation" in the frequency domain
- In practice, the FFT size is the window size plus the number of padded zeros
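One way to see the interpolation property is to pad a frame to twice its length. Every second bin of the padded DFT then coincides with a bin of the original DFT, and the odd bins are new samples of the same underlying spectrum in between. The sketch below checks this with a direct DFT; the frame values are arbitrary and the function name `dft` is illustrative.

```python
import cmath

def dft(x):
    """Direct DFT: X(k) = sum over n of x(n) * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

frame = [1.0, -2.0, 3.0, 0.5, -1.5, 2.5, 0.0, 4.0]   # arbitrary 8-sample frame
padded = frame + [0.0] * len(frame)                   # zero-pad to 16 samples

X = dft(frame)
X_padded = dft(padded)

# Even bins of the padded spectrum reproduce the original bins.
max_mismatch = max(abs(X_padded[2 * k] - X[k]) for k in range(len(frame)))
print(len(X), len(X_padded), max_mismatch)
```

Zero-padding adds no new information about the signal: it samples the same spectrum on a finer frequency grid.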

Example: Music

Example: Deep Note

Time-Frequency Resolutions in STFT
- Trade-off between time resolution and frequency resolution
  - Long window: high frequency resolution / low time resolution
  - Short window: low frequency resolution / high time resolution
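A rough numerical feel for this trade-off: a window of M samples at sampling rate fs spans M/fs seconds, while its frequency bins (without zero-padding) are fs/M Hz apart, so the product of the two is always 1. The sketch below tabulates a few cases at an assumed 44.1 kHz sampling rate. The helper name is illustrative, and the true resolving power also depends on the main-lobe width of the chosen window type.

```python
def stft_resolution(sample_rate_hz, window_size):
    """Return (window duration in ms, bin spacing in Hz) for a given window size."""
    window_duration_ms = 1000.0 * window_size / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / window_size
    return window_duration_ms, bin_spacing_hz

for window_size in (256, 1024, 4096):
    duration_ms, spacing_hz = stft_resolution(44100, window_size)
    print(window_size, round(duration_ms, 1), "ms", round(spacing_hz, 1), "Hz")
```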

References
- JOS DSP Books
  - Mathematics of DFT
  - Spectral Audio Signal Processing
- The Scientist and Engineer's Guide to Digital Signal Processing
  - (See chapters 8-12)