INTRODUCTION TO 18-491 FUNDAMENTALS OF SIGNAL PROCESSING Richard M. Stern 18-491 lecture, January 14, 2019 Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
Welcome to 18-491, Fundamentals of Signal Processing (DSP)! Today we will review the mechanics of the course, review the course content, and preview the material in 18-491 (DSP).
Important people (for this course at least) Instructor: Richard Stern PH B24, 8-2535, rms@cs.cmu.edu Course management assistant: Michelle Mahouski HH 1112, 8-4951, mmahousk@andrew.cmu.edu
More important people Teaching interns: Tyler Vuong, Supradeep Rangarajan
Some course details Meeting time and place: Lectures here and now Recitations Friday 10:30 – 12:20 and 12:30 – 2:20, SH 214 Prerequisites (you really need these!): Signals and Systems 18-290 Some MATLAB background (presumably from 18-290)
Does our work get graded? Yes! Grades are based on: Homework (including MATLAB problems) (33%) Three exams (67%): two midterms (March 6 and April 3) and a final exam Plan on attending the exams!
Textbooks Major text: Oppenheim, Schafer, Yoder, and Padgett: Discrete-Time Signal Processing Plan on purchasing a hard copy new or used Material to be supplemented by class notes at end of course Some other texts listed in syllabus
Other support sources Office hours: two hours per week for the instructor and each TA, times TBA; you can schedule additional times with me as needed Course home page: http://www.ece.cmu.edu/~ece491 Canvas to be used for grades (but probably not much else) Piazza to be used for class discussions
Academic stress and sources of help This is a hard course Take good care of yourself If you are having trouble, seek help Teaching staff CMU Counseling and Psychological Services (CaPS) We are here to help!
Academic integrity (i.e. cheating and plagiarism) CMU’s take on academic integrity: http://www.cmu.edu/policies/documents/Cheating.html ECE’s take on academic integrity: http://www.ece.cmu.edu/programs-admissions/masters/academic-integrity.html Most important rule: Don’t cheat! But what do we mean by that? Discussing general strategies on homework with other students is OK Solving homework together is NOT OK Accessing material from previous years is NOT OK “Collaborating” on exams is REALLY REALLY NOT OK!
18-491: major topic areas Signal processing in the time domain: convolution Frequency-domain processing: The DTFT and the Z-transform Complementary signal representations Sampling and change of sampling rate The DFT and the FFT Digital filter implementation Digital filter design Selected applications Orange headings refer to deterministic topics
Complementary signal representations Unit sample response Discrete-time Fourier transforms Z-transforms Difference equations Poles and zeros of an LSI system
Some application areas (we may not get to all of these) Linear prediction and lattice filters Adaptive filtering Optimal Wiener filtering Two-dimensional DSP (image processing) Short-time Fourier analysis Speech processing
Signal representation: why perform signal processing? A speech waveform in time: “Welcome to DSP I”
A time-frequency representation of “welcome” is much more informative
Downsampling the waveform Downsampling the waveform by a factor of 2:
Consequences of downsampling by 2 Original: Downsampled:
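The consequences of discarding samples can be seen in a short sketch (Python/NumPy; the 8 kHz rate and the 1 kHz and 3 kHz tones are illustrative assumptions, not the lecture's audio):

```python
import numpy as np

# Downsampling by M = 2 keeps every other sample. The surviving samples of a
# 1 kHz tone at 8 kHz are identical to those of a 1 kHz tone at 4 kHz: the
# digital frequency doubles from pi/4 to pi/2. A 3 kHz tone, however, exceeds
# the new Nyquist frequency (2 kHz) and aliases down to 1 kHz.
fs = 8000
n = np.arange(64)
x1 = np.cos(2 * np.pi * 1000 / fs * n)   # 1 kHz tone
x3 = np.cos(2 * np.pi * 3000 / fs * n)   # 3 kHz tone

y1 = x1[::2]                             # downsampled by 2 (new rate: 4 kHz)
y3 = x3[::2]

# After downsampling, the 3 kHz tone is indistinguishable from the 1 kHz tone:
print(np.allclose(y1, y3))               # True
```

This is why a downsampler is normally preceded by an anti-aliasing lowpass filter with cutoff π/M.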
Upsampling the waveform Upsampling by a factor of 2:
Consequences of upsampling by 2 Original: Upsampled:
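The expansion step of upsampling can be sketched in a few lines (Python/NumPy; the 4-sample signal is an illustrative assumption, and the interpolation filter is omitted for brevity):

```python
import numpy as np

# Upsampling by L = 2: insert a zero between successive samples (the
# "expander"), which compresses the spectrum and creates an image at high
# frequencies; a lowpass interpolation filter (not shown) then removes the
# image and fills in the zero-valued samples.
x = np.array([1.0, 2.0, 3.0, 4.0])
L = 2
v = np.zeros(L * len(x))
v[::L] = x                               # expanded signal
print(v)                                 # [1. 0. 2. 0. 3. 0. 4. 0.]
```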
Linear filtering the waveform x[n] → y[n]
Filter 1: y[n] = 3.6y[n–1] + 5.0y[n–2] – 3.2y[n–3] + 0.82y[n–4] + 0.013x[n] – 0.032x[n–1] + 0.044x[n–2] – 0.033x[n–3] + 0.013x[n–4]
Filter 2: y[n] = 2.7y[n–1] – 3.3y[n–2] + 2.0y[n–3] – 0.57y[n–4] + 0.35x[n] – 1.3x[n–1] + 2.0x[n–2] – 1.3x[n–3] + 0.35x[n–4]
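Both filters are linear constant-coefficient difference equations, so they can be run as a direct recursion. Below is a minimal sketch (Python/NumPy, equivalent in spirit to MATLAB's `filter` or SciPy's `lfilter`) using Filter 2's coefficients as transcribed above; note the sign flip when the y terms are moved to the conventional b/a form:

```python
import numpy as np

def difference_eq(b, a, x):
    """Direct evaluation of y[n] = sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k],
    with the usual normalization a[0] = 1 and zero initial conditions."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(b)):
            if n - k >= 0:
                acc += b[k] * x[n - k]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

# Filter 2 from the slide, rewritten in b/a form (signs of the y-side
# coefficients flip when moved to the left-hand side):
a = [1.0, -2.7, 3.3, -2.0, 0.57]
b = [0.35, -1.3, 2.0, -1.3, 0.35]

impulse = np.zeros(16)
impulse[0] = 1.0
h = difference_eq(b, a, impulse)   # first 16 samples of the unit sample response
print(h[:3])                       # approximately [0.35, -0.355, -0.1135]
```

Feeding a unit impulse through the recursion, as above, is exactly how the unit sample responses on the following slides are obtained.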
Filter 1 in the time domain
Output of Filter 1 in the frequency domain Original: Lowpass:
Filter 2 in the time domain
Output of Filter 2 in the frequency domain Original: Highpass:
Let’s look at the lowpass filter from different points of view … x[n] → y[n]
Difference equation for Lowpass Filter 1: y[n] = 3.6y[n–1] + 5.0y[n–2] – 3.2y[n–3] + 0.82y[n–4] + 0.013x[n] – 0.032x[n–1] + 0.044x[n–2] – 0.033x[n–3] + 0.013x[n–4]
Lowpass filtering in the time domain: the unit sample response
Lowpass filtering in the frequency domain: magnitude and phase of the DTFT
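For a rational filter, the DTFT can be evaluated by sampling the numerator and denominator polynomials on the unit circle, H(e^{jω}) = B(e^{jω})/A(e^{jω}). A minimal sketch (Python/NumPy, analogous to MATLAB's `freqz`; Filter 2's coefficients as transcribed earlier are used here as the example):

```python
import numpy as np

def dtft_response(b, a, w):
    """Evaluate H(e^{jw}) = B(e^{jw}) / A(e^{jw}) at frequencies w (rad/sample)."""
    num = sum(bk * np.exp(-1j * w * k) for k, bk in enumerate(b))
    den = sum(ak * np.exp(-1j * w * k) for k, ak in enumerate(a))
    return num / den

# Filter 2's coefficients in b/a form:
b = [0.35, -1.3, 2.0, -1.3, 0.35]
a = [1.0, -2.7, 3.3, -2.0, 0.57]

w = np.linspace(0, np.pi, 512)      # 0..pi suffices for a real filter
H = dtft_response(b, a, np.array(w))
magnitude = np.abs(H)               # plot these two against w to see the
phase = np.angle(H)                 # magnitude and phase responses
print(magnitude[0])                 # DC gain = sum(b) / sum(a)
```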
Another type of modeling: the source-filter model of speech A useful model for representing the generation of speech sounds: a pitch-controlled pulse train source and a noise source, scaled in amplitude, drive a vocal tract model to produce the speech signal p[n]
The poles and zeros of the lowpass filter
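Poles and zeros are simply the roots of the denominator and numerator polynomials; a quick sketch in NumPy (again using Filter 2's coefficients as transcribed earlier):

```python
import numpy as np

# Poles are the roots of the denominator A(z), zeros the roots of the
# numerator B(z), with coefficients listed in descending powers of z.
a = [1.0, -2.7, 3.3, -2.0, 0.57]    # y-side coefficients, signs flipped
b = [0.35, -1.3, 2.0, -1.3, 0.35]

poles = np.roots(a)
zeros = np.roots(b)

# A causal LSI system is stable if and only if every pole lies strictly
# inside the unit circle:
print(np.max(np.abs(poles)))
```

Plotting `poles` and `zeros` in the complex plane against the unit circle gives the pole-zero diagram shown on the slide.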
Signal modeling: let’s consider the “uh” in “welcome”:
The raw spectrum
All-pole modeling: the LPC spectrum
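The LPC envelope comes from fitting an all-pole model to the signal. A minimal sketch of the autocorrelation method via the Levinson-Durbin recursion (Python/NumPy; the decaying-exponential test signal and model order are illustrative assumptions, not the lecture's speech data):

```python
import numpy as np

def lpc(x, p):
    """All-pole (LPC) modeling by the autocorrelation method, solved with the
    Levinson-Durbin recursion. Returns coefficients a (with a[0] = 1) and the
    residual energy E; the model spectrum is E / |A(e^{jw})|^2."""
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(p + 1)])
    a = np.zeros(p + 1)
    a[0] = 1.0
    E = r[0]
    for i in range(1, p + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / E   # reflection coefficient
        a_new = a.copy()
        a_new[1:i] = a[1:i] + k * a[i - 1:0:-1]           # update inner coefficients
        a_new[i] = k
        a = a_new
        E *= 1.0 - k * k                                  # shrink residual energy
    return a, E

# Sanity check on the impulse response of a known one-pole filter,
# x[n] = 0.9^n, whose best order-1 predictor is 0.9 x[n-1]:
x = 0.9 ** np.arange(200)
a, E = lpc(x, 1)
print(a)        # approximately [1.0, -0.9]
```

For speech, the same routine would be applied to a windowed frame with an order around 10–16, and the smooth LPC spectrum is then E / |A(e^{jω})|².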
An application of LPC modeling: separating the vocal tract excitation and filter Original speech: Speech with 75-Hz excitation: Speech with 150-Hz excitation: Speech with noise excitation: Comment: this is a major technique used in speech coding
Classical signal enhancement: compensation of speech for noise and filtering Approach of Acero, Liu, Moreno, et al. (1990–1997)… Compensation achieved by estimating the parameters of the noise and filter and applying inverse operations Model: “clean” speech x[m] passes through a linear filter h[m] and picks up additive noise n[m], yielding the degraded speech z[m]
“Classical” combined compensation improves accuracy in stationary environments Threshold shifts by ~7 dB (from 13 dB to –7 dB) Accuracy still poor for low SNRs [Plot: recognition accuracy vs. SNR for CMN (baseline), “recovered” CDCN (1990), VTS (1997), complete retraining, and clean speech]
Another type of signal enhancement: adaptive noise cancellation Speech plus noise enters the primary channel; correlated noise enters the reference channel The adaptive filter attempts to transform the noise in the reference channel to best resemble the noise in the primary channel, and subtracts it Performance degrades when speech leaks into the reference channel and in reverberation Push-to-talk will make life MUCH easier!!
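The scheme above can be sketched with a least-mean-squares (LMS) canceller. Everything here (the sinusoidal stand-in for speech, the white reference noise, the room filter `h_room`, the filter length `L`, and the step size `mu`) is an illustrative assumption, not the system discussed in the lecture:

```python
import numpy as np

# Minimal LMS adaptive noise canceller sketch.
rng = np.random.default_rng(0)
N, L, mu = 5000, 8, 0.005

speech = np.sin(2 * np.pi * 0.01 * np.arange(N))   # stand-in for speech
noise_ref = rng.standard_normal(N)                 # reference-channel noise
h_room = np.array([0.5, -0.3, 0.2])                # path from reference to primary mic
noise_pri = np.convolve(noise_ref, h_room)[:N]     # noise reaching the primary mic
primary = speech + noise_pri                       # primary channel: speech + noise

w = np.zeros(L)                                    # adaptive FIR weights
out = np.zeros(N)
for n in range(L, N):
    xvec = noise_ref[n - L + 1:n + 1][::-1]        # most recent reference samples
    y = w @ xvec                                   # estimate of primary-channel noise
    e = primary[n] - y                             # error = enhanced (speech) output
    w += 2 * mu * e * xvec                         # LMS weight update
    out[n] = e

# After convergence the output should be close to the speech alone:
print(np.mean((out[-1000:] - speech[-1000:]) ** 2))
```

Because the speech is uncorrelated with the reference noise, the filter converges toward the room path `h_room` and the subtraction removes mostly noise; speech leaking into the reference channel would bias this solution, which is the degradation noted above.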
Simulation of noise cancellation for a PDA using two mics in “endfire” configuration Speech in cafeteria noise, no noise cancellation Speech with noise cancellation But … the simulation assumed no reverberation
Signal separation: speech is quite intelligible, even when presented only in fragments Procedure: determine which time-frequency components appear to be dominated by the desired signal; reconstruct the signal based on the “good” components A monaural example: mixed signals, then separated signals
Practical signal separation: audio samples using selective reconstruction based on ITD [Audio demos: no processing, delay-and-sum, binary ZCAE, and continuous ZCAE, each at RT60 = 0 and 300 ms]
Phase vocoding: changing time scale and pitch Changing the time scale: original speech; faster by 4:3; slower by 1:2 Transposing pitch: original music; after phase vocoding, transposed up by a major third and down by a major third Comment: this is one of several techniques used to perform autotuning
Summary Lots of interesting topics that teach us how to understand signals and design filters An emphasis on developing a solid understanding of fundamentals Will introduce selected applications to demonstrate utility of techniques I hope that you have as much fun in signal processing as I have had!