SOME SIMPLE MANIPULATIONS OF SOUND USING DIGITAL SIGNAL PROCESSING Richard M. Stern demo January 15, 2015 Department of Electrical and Computer Engineering and School of Computer Science Carnegie Mellon University Pittsburgh, Pennsylvania 15213
Carnegie Mellon Slide Digital Signal Processing I The original sound and its spectrogram
Downsampling the waveform by a factor of 2
Consequences of downsampling (examples shown: original, downsampled)
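One general consequence of downsampling can be sketched in a few lines. The sketch assumes that downsampling means keeping every second sample with no anti-aliasing filter (the slides do not say whether one was used). The 8000-Hz sampling rate and the tone frequencies are illustrative assumptions, not values from the demo. Under those assumptions, a 3000-Hz tone becomes indistinguishable from a 1000-Hz tone after downsampling by 2, because the new Nyquist frequency is only 2000 Hz.

```python
import math

FS = 8000  # assumed original sampling rate in Hz (not given on the slides)


def downsample(x, factor=2):
    """Keep every `factor`-th sample; no anti-aliasing filter is applied."""
    return x[::factor]


def tone(freq_hz, fs, n_samples):
    """Cosine of the given frequency sampled at rate fs."""
    return [math.cos(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]


# A 3000-Hz tone at 8000 Hz, then downsampled to an effective rate of 4000 Hz.
high_tone = tone(3000, FS, 64)
decimated = downsample(high_tone, 2)

# A 1000-Hz tone sampled directly at 4000 Hz, for comparison.
alias = tone(1000, FS // 2, 32)

# The two sequences agree sample for sample: the 3000-Hz tone has aliased.
max_diff = max(abs(a - b) for a, b in zip(decimated, alias))
print(len(decimated), max_diff)
```

Played back at the new rate, the downsampled tone would therefore sound like 1000 Hz rather than 3000 Hz.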
Upsampling the waveform by a factor of 2
Consequences of upsampling (examples shown: original, upsampled)
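A minimal sketch of one consequence of upsampling, assuming that upsampling means inserting a zero between successive samples with no interpolation filter afterward (the slides do not specify this). The test signal is arbitrary. Under that assumption, the DFT of the upsampled signal is the original DFT repeated twice, which is the source of spectral "images" at high frequencies.

```python
import cmath


def upsample(x, factor=2):
    """Insert factor-1 zeros after every sample (no interpolation filter)."""
    out = [0.0] * (len(x) * factor)
    out[::factor] = x
    return out


def dft(x):
    """Direct O(N^2) discrete Fourier transform, adequate for short signals."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


x = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5]  # arbitrary test signal
x_up = upsample(x, 2)

X = dft(x)
X_up = dft(x_up)

# Each bin of the upsampled spectrum equals the original spectrum, repeated.
image_error = max(abs(X_up[k] - X[k % len(x)]) for k in range(len(x_up)))
print(x_up[:4], image_error)
```

A lowpass interpolation filter after the zero insertion would normally remove the repeated copy.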
Linear filtering the waveform (input x[n], output y[n])
Filter 1: y[n] = 3.6y[n–1] – 5.0y[n–2] + 3.2y[n–3] – .82y[n–4] + .013x[n] – .032x[n–1] + .044x[n–2] – .033x[n–3] + .013x[n–4]
Filter 2: y[n] = 2.7y[n–1] – 3.3y[n–2] + 2.0y[n–3] – .57y[n–4] + .35x[n] – 1.3x[n–1] + 2.0x[n–2] – 1.3x[n–3] + .35x[n–4]
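Difference equations of this form can be evaluated directly, one output sample at a time. The sketch below is a generic evaluator, not the code used for the demo. The function name and the first-order example filter are hypothetical. The slide coefficients are printed to only two significant figures, so they may not reproduce the demo filters exactly and are not used here.

```python
def difference_equation(x, feedback, feedforward):
    """Evaluate y[n] = sum_k feedback[k-1]*y[n-k] + sum_k feedforward[k]*x[n-k].

    `feedback` holds the coefficients of y[n-1], y[n-2], ... exactly as they
    appear on the right-hand side; `feedforward` holds those of x[n], x[n-1], ...
    Samples before n = 0 are taken to be zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, b in enumerate(feedforward):
            if n - k >= 0:
                acc += b * x[n - k]
        for k, a in enumerate(feedback, start=1):
            if n - k >= 0:
                acc += a * y[n - k]
        y.append(acc)
    return y


# Hypothetical first-order example: y[n] = 0.5*y[n-1] + x[n].
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
h = difference_equation(impulse, feedback=[0.5], feedforward=[1.0])
print(h)  # [1.0, 0.5, 0.25, 0.125, 0.0625]
```

Feeding in a unit impulse, as here, gives the filter's impulse response, which is how a filter is usually shown "in the time domain".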
Filter 1 in the time domain
Output of Filter 1 in the frequency domain (examples shown: original, lowpass)
Filter 2 in the time domain
Output of Filter 2 in the frequency domain (examples shown: original, highpass)
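Whether a difference equation acts as a lowpass or a highpass filter can be checked by evaluating its frequency response on the unit circle. The sketch below does this for two hypothetical first-order filters chosen only for illustration. They are not the fourth-order filters from the demo, and the function name is an assumption.

```python
import cmath
import math


def magnitude_response(feedback, feedforward, omega):
    """|H(e^{j omega})| for y[n] = sum feedback[k-1]*y[n-k] + sum feedforward[k]*x[n-k]."""
    z_inv = cmath.exp(-1j * omega)
    numerator = sum(b * z_inv ** k for k, b in enumerate(feedforward))
    denominator = 1 - sum(a * z_inv ** k for k, a in enumerate(feedback, start=1))
    return abs(numerator / denominator)


# Hypothetical lowpass: y[n] = 0.9*y[n-1] + 0.1*x[n]
lowpass_dc = magnitude_response([0.9], [0.1], 0.0)
lowpass_nyq = magnitude_response([0.9], [0.1], math.pi)

# Hypothetical highpass: y[n] = -0.9*y[n-1] + 0.1*x[n]
highpass_dc = magnitude_response([-0.9], [0.1], 0.0)
highpass_nyq = magnitude_response([-0.9], [0.1], math.pi)

print(lowpass_dc, lowpass_nyq)    # gain near 1 at DC, small at omega = pi
print(highpass_dc, highpass_nyq)  # small at DC, gain near 1 at omega = pi
```

Flipping the sign of the single feedback coefficient moves the pole from near z = 1 to near z = –1, which turns the lowpass response into a highpass one.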
The source-filter model of speech
A useful model for representing the generation of speech sounds:
[Diagram labels: pitch, pulse train source, noise source, amplitude, vocal tract model, p[n]]
Separating the vocal-tract excitation from the filter
– Original speech
– Speech with 75-Hz excitation
– Speech with 150-Hz excitation
– Speech with noise excitation
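The idea of swapping the excitation while keeping the filter fixed can be sketched as follows. This is a toy illustration, not the analysis and resynthesis used for the demo. The 8000-Hz sampling rate, the single two-pole resonator standing in for the vocal tract model, and all function names are assumptions. A real vocal tract filter would be estimated from the speech itself, for example by linear prediction.

```python
import math
import random

FS = 8000  # assumed sampling rate in Hz (not given on the slides)


def pulse_train(f0_hz, n_samples, fs=FS):
    """Unit pulses spaced one pitch period apart (period rounded to whole samples)."""
    period = round(fs / f0_hz)
    x = [0.0] * n_samples
    for n in range(0, n_samples, period):
        x[n] = 1.0
    return x


def noise_source(n_samples, seed=0):
    """White Gaussian noise, used for unvoiced excitation."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def vocal_tract(excitation, amplitude=1.0, resonance_hz=500.0, radius=0.95, fs=FS):
    """Hypothetical stand-in filter: a single two-pole resonator."""
    theta = 2 * math.pi * resonance_hz / fs
    a1 = 2 * radius * math.cos(theta)
    a2 = -radius ** 2
    y = []
    for n, x_n in enumerate(excitation):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(amplitude * x_n + a1 * y1 + a2 * y2)
    return y


N = 800  # 0.1 s at the assumed rate
voiced_75 = vocal_tract(pulse_train(75, N))    # low-pitched excitation
voiced_150 = vocal_tract(pulse_train(150, N))  # excitation one octave higher
unvoiced = vocal_tract(noise_source(N))        # noise (whisper-like) excitation
```

All three outputs share the same resonance; only the source changes, which is the separation the slide title refers to.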
Some Research Foci in ECE
– Processing, analysis, and compression of static and moving video
– Optical signal processing techniques
– Signal processing for digital data storage
– Speech recognition and understanding
– Multimedia fusion of video and audio information
– Architecture and protocols of telecommunications and computer networks
Classical signal enhancement: compensation of speech for noise and filtering
– Approach of Acero, Liu, Moreno, et al. ( )…
– Compensation achieved by estimating parameters of noise and filter and applying inverse operations
[Diagram: “clean” speech x[m] passes through linear filtering h[m]; additive noise n[m] is then added to give degraded speech z[m]]
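The degradation model in the diagram, z[m] = x[m] * h[m] + n[m] (with * denoting convolution), and the notion of "applying inverse operations" can be illustrated with a toy sketch. It makes a strong simplifying assumption: the filter h[m] and the noise waveform n[m] are known exactly. The methods named on the slide instead estimate statistical parameters of the noise and filter, so this is not an implementation of them. All names and numbers are hypothetical.

```python
import random


def degrade(x, h, n):
    """z[m] = sum_k h[k]*x[m-k] + n[m]: linear filtering followed by additive noise."""
    return [
        sum(h[k] * x[m - k] for k in range(len(h)) if m - k >= 0) + n[m]
        for m in range(len(x))
    ]


def invert(z, h, n):
    """Idealized inverse: subtract the known noise, then undo the known FIR filter.

    Requires h[0] != 0 and a minimum-phase h so that the recursion is stable.
    """
    x_hat = []
    for m in range(len(z)):
        acc = z[m] - n[m]
        for k in range(1, len(h)):
            if m - k >= 0:
                acc -= h[k] * x_hat[m - k]
        x_hat.append(acc / h[0])
    return x_hat


rng = random.Random(1)
clean = [rng.uniform(-1.0, 1.0) for _ in range(50)]  # stand-in for clean speech
channel = [1.0, 0.5]                                 # hypothetical filter h[m]
noise = [0.1 * rng.gauss(0.0, 1.0) for _ in range(50)]

degraded = degrade(clean, channel, noise)
recovered = invert(degraded, channel, noise)
recovery_error = max(abs(a - b) for a, b in zip(clean, recovered))
print(recovery_error)
```

In practice neither h[m] nor n[m] is available, which is why the estimation step matters.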
“Classical” combined compensation improves accuracy in stationary environments
– Threshold shifts by ~7 dB
– Accuracy still poor for low SNRs
[Plot legend: CMN (baseline), CDCN (1990), VTS (1997), complete retraining]
[Other figure labels: –7 dB, 13 dB, Clean, Original, “Recovered”]
Another type of signal enhancement: adaptive noise cancellation
– Speech + noise enters the primary channel; correlated noise enters the reference channel
– The adaptive filter attempts to make the noise in the reference channel best resemble the noise in the primary channel, and the result is subtracted from the primary channel
– Performance degrades when speech leaks into the reference channel and in reverberation
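A minimal sketch of a two-channel noise canceller along these lines. It uses the LMS update as the adaptive filter, which is one common choice; the slides do not name the algorithm. The signals are synthetic. The sinusoidal "speech", the two-tap noise path, the filter length, and the step size are all illustrative assumptions, and the setup is idealized (no speech leakage into the reference channel, no reverberation).

```python
import math
import random


def lms_noise_canceller(primary, reference, n_taps=4, mu=0.01):
    """Adapt an FIR filter on the reference so its output tracks the noise in primary.

    Returns the cleaned signal (primary minus the noise estimate) and final weights.
    """
    w = [0.0] * n_taps
    cleaned = []
    for n in range(len(primary)):
        taps = [reference[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        noise_estimate = sum(wk * rk for wk, rk in zip(w, taps))
        e = primary[n] - noise_estimate  # error signal doubles as the output
        w = [wk + mu * e * rk for wk, rk in zip(w, taps)]  # LMS weight update
        cleaned.append(e)
    return cleaned, w


rng = random.Random(0)
N = 5000
speech = [0.5 * math.sin(2 * math.pi * 0.01 * n) for n in range(N)]  # stand-in signal
reference = [rng.gauss(0.0, 1.0) for _ in range(N)]                  # reference noise
# Hypothetical acoustic path from the noise source to the primary microphone.
noise_in_primary = [
    0.8 * reference[n] + (0.3 * reference[n - 1] if n >= 1 else 0.0) for n in range(N)
]
primary = [s + v for s, v in zip(speech, noise_in_primary)]

cleaned, weights = lms_noise_canceller(primary, reference)


def mean_square(seq):
    return sum(v * v for v in seq) / len(seq)


# Compare noise power before and after, over the last 1000 samples (after convergence).
noise_before = mean_square(noise_in_primary[-1000:])
noise_after = mean_square([c - s for c, s in zip(cleaned[-1000:], speech[-1000:])])
print(noise_before, noise_after, weights)
```

After convergence the first two weights should sit close to the assumed path coefficients 0.8 and 0.3, and the residual noise power should be a small fraction of the original.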
Simulation of noise cancellation for a PDA using two mics in “endfire” configuration
– Speech in cafeteria noise, no noise cancellation
– Speech with noise cancellation
– But ... the simulation assumed no reverb
Signal separation: speech is quite intelligible, even when presented only in fragments
Procedure:
– Determine which time-frequency components appear to be dominated by the desired signal
– Reconstruct signal based on “good” components
A monaural example:
– Mixed signals
– Separated signals
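The two-step procedure can be sketched with a binary time-frequency mask. The sketch assumes "oracle" knowledge of how much energy the desired signal and the interferer each contribute to every time-frequency cell, which a real system would have to estimate from cues in the mixture. The small energy grids are made-up numbers, and the function names are hypothetical.

```python
def binary_mask(target_energy, interferer_energy):
    """1 where the desired signal dominates a time-frequency cell, else 0."""
    return [
        [1 if t > i else 0 for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(target_energy, interferer_energy)
    ]


def apply_mask(mixture, mask):
    """Keep only the 'good' cells of the mixture; zero out the rest."""
    return [
        [m * keep for m, keep in zip(m_row, k_row)]
        for m_row, k_row in zip(mixture, mask)
    ]


# Toy grids: rows are time frames, columns are frequency bands (made-up values).
target = [[4.0, 0.1, 0.0], [3.0, 0.2, 0.1]]
interferer = [[0.5, 2.0, 1.0], [0.2, 0.1, 3.0]]
mixture = [[t + i for t, i in zip(t_row, i_row)] for t_row, i_row in zip(target, interferer)]

mask = binary_mask(target, interferer)
separated = apply_mask(mixture, mask)
print(mask)  # [[1, 0, 0], [1, 1, 0]]
```

Turning the masked grid back into a waveform would require an inverse short-time transform, which is omitted here.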
Practical signal separation: audio samples using selective reconstruction based on ITD
[Table of audio samples by RT60 (ms) and processing method: No Proc, Delay-sum, ZCAE-bin, ZCAE-cont]
Summary
– Lots of interesting topics that extend core material from DSP
– Greater emphasis on implementation and applications
– Greater emphasis on statistically optimal signal processing
– I hope that you have as much fun with this material as I have had!
Academic integrity (i.e., cheating and plagiarism)
CMU’s take on academic integrity:
Most important rule: Don’t cheat!
But what do we mean by that?
– Discussing general strategies on homework with other students is OK
– Solving homework together is NOT OK
– Accessing material from previous years is NOT OK
– “Collaborating” on exams is REALLY REALLY NOT OK!