Time Compression/Expansion Independent of Pitch
Listening: Dies Irae from Requiem, by Michel Chion (1973)
Classic “concrete” techniques
With classic tape techniques, the only way to change the duration of a recorded sound is to change the speed of the tape, which also changes the pitch. The reverse is also true: if all you want is a pitch change, the duration changes as well.
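The coupling between duration and pitch can be worked through numerically. The sketch below is only an illustration: the function name tape_speed_change and its arguments are hypothetical, and it assumes an idealised tape machine where speed is a simple ratio (2.0 = double speed).

```python
import math

def tape_speed_change(duration_s, pitch_hz, speed):
    """Idealised tape playback at a speed ratio (hypothetical helper).

    Returns (new duration in seconds, new pitch in Hz, shift in semitones).
    """
    new_duration = duration_s / speed      # faster tape -> shorter sound
    new_pitch = pitch_hz * speed           # faster tape -> higher pitch
    semitones = 12 * math.log2(speed)      # 12 semitones per doubling
    return new_duration, new_pitch, semitones

# A 10-second recording of A440 played at double speed:
print(tape_speed_change(10.0, 440.0, 2.0))   # (5.0, 880.0, 12.0)
```

Doubling the speed halves the duration and raises the pitch by an octave; there is no setting of the speed ratio that changes one without the other.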
Computer processing techniques
Software offers two different options for changing duration independently of pitch:
1. Granular synthesis, a process that slices (windows) time-domain audio into very small segments, measured in milliseconds (ms).
2. Phase vocoding, a process that converts time-domain audio into frequency-domain representations.
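The first option, granular time stretching, can be sketched in a few lines. This is a minimal illustration assuming NumPy is available, not the method of any particular program; the function name granular_stretch, the grain length, and the hop size are all assumptions chosen for the example.

```python
import numpy as np

def granular_stretch(signal, stretch, grain_len=1024, hop_out=256):
    """Time-stretch by overlap-adding Hann-windowed grains (illustrative sketch).

    Grains are written every hop_out samples but read every hop_out / stretch
    samples, so stretch > 1 lengthens the sound without resampling it.
    """
    window = np.hanning(grain_len)
    hop_in = hop_out / stretch
    n_grains = int((len(signal) - grain_len) / hop_in) + 1
    out = np.zeros((n_grains - 1) * hop_out + grain_len)
    norm = np.zeros_like(out)
    for g in range(n_grains):
        start_in = int(round(g * hop_in))
        grain = signal[start_in:start_in + grain_len] * window
        start_out = g * hop_out
        out[start_out:start_out + grain_len] += grain
        norm[start_out:start_out + grain_len] += window
    norm[norm < 1e-8] = 1.0          # avoid dividing by zero at the edges
    return out / norm                # compensate for overlapping windows

sr = 8000                                            # assumed sampling rate
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)  # one second of A440
stretched = granular_stretch(tone, stretch=2.0)
print(len(tone), len(stretched))                     # output is nearly twice as long
```

Because each grain is copied at its original speed, the pitch inside every grain is unchanged. The grains are not phase-aligned with each other, however, so a simple version like this one can sound rough.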
Converting Domains
Any arbitrary periodic signal can be represented as a sum of many simultaneous sine waves.
Fourier Transform: converts a time-domain representation into a frequency-domain representation.
Inverse Fourier Transform: converts a frequency-domain representation into a time-domain representation.
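The round trip between the two domains can be tried directly. The sketch below uses a plain (slow) discrete Fourier transform written out with the standard library, so that the sum of sinusoids is visible in the code; the function names dft and idft and the test signal are assumptions made for the example.

```python
import cmath
import math

def dft(x):
    """Time domain -> frequency domain (naive discrete Fourier transform)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Frequency domain -> time domain (inverse transform)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

n = 32
# Two simultaneous sine waves: 3 cycles and 5 cycles per window.
signal = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 5 * t / n)
          for t in range(n)]

spectrum = dft(signal)
magnitudes = [abs(c) for c in spectrum]
rebuilt = [c.real for c in idft(spectrum)]

# Energy shows up at bins 3 and 5; the inverse transform restores the signal.
print(round(magnitudes[3], 6), round(magnitudes[5], 6))
print(max(abs(a - b) for a, b in zip(signal, rebuilt)) < 1e-9)
```

The analysis finds exactly the two sine waves that were mixed together, and the inverse transform returns the original samples to within rounding error.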
Fast Fourier Transform
The FFT takes a slice of time (a window) that is n samples in length, where n is some power of 2.
The number of samples in an FFT window equals the number of frequency bands between 0 Hz and the sampling rate.
Only half the bands are usable. (Why?)
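One way to explore the question is to list the band frequencies and look at what the upper half contains. The sketch below assumes a 44100 Hz sampling rate and a 1024-sample window purely as example values; bin_frequencies and dft_magnitudes are hypothetical helper names.

```python
import cmath
import math

def bin_frequencies(fft_size, sample_rate):
    """Frequency in Hz of each band, from 0 Hz up to just below the sampling rate."""
    return [k * sample_rate / fft_size for k in range(fft_size)]

def dft_magnitudes(x):
    """Magnitude of every band of a naive DFT (fine for tiny windows)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n)]

sr, n = 44100, 1024                 # assumed example values
freqs = bin_frequencies(n, sr)
print(len(freqs))                   # 1024 bands for a 1024-sample window
print(freqs[1])                     # 43.06640625 Hz between neighbouring bands
print(freqs[n // 2])                # 22050.0 Hz, half the sampling rate

# For a real-valued signal, band k and band n - k carry the same magnitude.
small = [math.sin(2 * math.pi * 2 * t / 16) + 0.3 * math.cos(2 * math.pi * 5 * t / 16)
         for t in range(16)]
mags = dft_magnitudes(small)
mirrored = all(abs(mags[k] - mags[16 - k]) < 1e-9 for k in range(1, 8))
print(mirrored)                     # True
```

For real-valued audio, the bands above half the sampling rate mirror the bands below it, so they add no new information.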
How Phase Vocoding Works
Each FFT window represents a frame of analysis information (frequency-domain content).
Time compression or expansion involves changing the playback rate of the frames. The frames are converted from the frequency domain back to the time domain by an inverse Fast Fourier Transform (iFFT).
This is like changing the playback rate of film or video.
Pitch shifting is an independent process: multiply all frequency bands by a factor X (2 = octave up; 0.5 = octave down).
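The frame-rate idea can be sketched as a small time-stretching phase vocoder. This is a simplified illustration assuming NumPy, not SoundHack's implementation; the function names, the Hann window, and the FFT and hop sizes are assumptions, and only time scaling is covered (no pitch scaling).

```python
import numpy as np

def stft(x, n_fft, hop):
    """Analysis: one FFT frame per windowed slice of the signal."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    return np.array([np.fft.rfft(window * x[i * hop:i * hop + n_fft])
                     for i in range(n_frames)])

def phase_vocoder_stretch(x, stretch, n_fft=1024, hop=256):
    """Play the analysis frames back at 1/stretch of their original rate."""
    frames = stft(x, n_fft, hop)
    window = np.hanning(n_fft)
    # Phase advance per hop expected at the centre of each band.
    omega = 2 * np.pi * hop * np.arange(n_fft // 2 + 1) / n_fft
    # Fractional frame positions to read: smaller steps = slower playback.
    positions = np.arange(0, len(frames) - 1, 1.0 / stretch)
    phase = np.angle(frames[0])
    out = np.zeros((len(positions) - 1) * hop + n_fft)
    norm = np.zeros_like(out)
    for i, pos in enumerate(positions):
        k = int(pos)
        frac = pos - k
        # Interpolate magnitudes between the two neighbouring frames.
        mag = (1 - frac) * np.abs(frames[k]) + frac * np.abs(frames[k + 1])
        grain = np.fft.irfft(mag * np.exp(1j * phase), n_fft) * window  # iFFT
        out[i * hop:i * hop + n_fft] += grain
        norm[i * hop:i * hop + n_fft] += window ** 2
        # Measured phase change between frames, wrapped around the expected one.
        dphi = np.angle(frames[k + 1]) - np.angle(frames[k]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase += omega + dphi       # accumulate phase at the original hop size
    return out / np.maximum(norm, 1e-3)

sr = 8000                                            # assumed sampling rate
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)  # one second of A440
stretched = phase_vocoder_stretch(tone, stretch=2.0)
spectrum = np.abs(np.fft.rfft(stretched))
peak_hz = np.argmax(spectrum) * sr / len(stretched)
print(len(tone), len(stretched), peak_hz)   # longer output, peak still near 440 Hz
```

The output is close to twice as long while its strongest frequency stays near 440 Hz, which is the behaviour that tape speed changes cannot give.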
Phase Vocoding parameters
FFT size (window size)
    Number of frequency bands
    Length of time per analysis window (FFT_Size / SR = length in seconds)
Overlaps
    Determines onset of windows
    Helps with time resolution
Window type
    Can affect accuracy of measurements
    For now, you can stick to Hamming
Time Scale (constant or graph)
Pitch Scale (constant or graph)
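The arithmetic behind the first two parameters can be worked through as follows. The helper pv_parameters is hypothetical, and it assumes that "overlaps" means the number of windows overlapping at any moment, so that a new window starts every FFT_Size / overlaps samples.

```python
def pv_parameters(fft_size, sample_rate, overlaps):
    """Derive basic phase vocoder figures (illustrative helper).

    Assumes a new analysis window begins every fft_size / overlaps samples.
    """
    return {
        "bands": fft_size,                              # 0 Hz up to the sampling rate
        "usable_bands": fft_size // 2,                  # 0 Hz up to half the sampling rate
        "band_width_hz": sample_rate / fft_size,
        "window_seconds": fft_size / sample_rate,       # FFT_Size / SR
        "hop_samples": fft_size // overlaps,            # spacing of window onsets
        "hop_seconds": fft_size / overlaps / sample_rate,
    }

params = pv_parameters(fft_size=1024, sample_rate=44100, overlaps=4)
for name, value in params.items():
    print(name, value)
```

With these example values each window lasts about 23 ms, but a new window starts about every 6 ms, which is how overlapping helps with time resolution.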
Problems with the Phase Vocoder
Frequency/time trade-off: the more accurate you are with one parameter, the less accurate you are with the other. A larger FFT size provides more frequency bands but less information about the start time of events, and vice versa.
Frequency bands are linearly spaced (equal steps in Hz, not equal musical intervals).
Fourier Transform theory assumes a periodic signal. Periodic signals have no beginning or end (they extend infinitely in both directions).
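The first two problems can be put into numbers. The sketch below assumes a 44100 Hz sampling rate and counts bands per octave starting from 55 Hz; both values, and the helper names tradeoff and bands_per_octave, are assumptions chosen for illustration.

```python
def tradeoff(sample_rate, fft_sizes):
    """For each FFT size: (size, band width in Hz, window length in ms)."""
    return [(n, sample_rate / n, 1000.0 * n / sample_rate) for n in fft_sizes]

def bands_per_octave(fft_size, sample_rate, low_hz, octaves):
    """Count how many usable bands fall inside each successive octave."""
    spacing = sample_rate / fft_size
    counts = []
    for i in range(octaves):
        lo, hi = low_hz * 2 ** i, low_hz * 2 ** (i + 1)
        counts.append(sum(1 for k in range(fft_size // 2 + 1)
                          if lo <= k * spacing < hi))
    return counts

sr = 44100                                   # assumed sampling rate
rows = tradeoff(sr, [256, 1024, 4096])
for size, width_hz, length_ms in rows:
    # Narrower bands (better frequency detail) cost longer windows (worse timing).
    print(size, round(width_hz, 2), round(length_ms, 2))

counts = bands_per_octave(1024, sr, low_hz=55.0, octaves=8)
print(counts)                                # few bands in low octaves, many in high ones
```

Larger FFT sizes narrow the bands but lengthen the window, and because the bands are evenly spaced in Hz, the lowest octaves receive only a handful of bands while the highest receive most of them.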
Phase Vocoding in SoundHack demo
Assignment Roads: Phase Vocoder p