CS 591 S1 – Computational Audio – Spring 2017. Wayne Snyder, Computer Science Department, Boston University. Lecture 8 (Thursday): Physical Modeling Synthesis; A Glimpse of Speech Synthesis; Karplus-Strong String Synthesis. Lecture 8 (Tuesday): Conclusions on Karplus-Strong; Conclusions on Music Synthesis; Questions on Midterm; Demo of Software Synthesizer (Ebrahim). Midterm One on Thursday 3/2 in same room!
Physical Modeling Synthesis Perhaps the ultimate way to synthesize realistic instrument sounds (and to provide the tools to create virtual instruments of any description) would be to find mathematical equations that describe the physics of the instruments. Since there are a LOT of interactions among the various parts of a real instrument, this is very complex! One way to do this is to measure the properties of the instrument and come up with equations that fit the measurements.
Physical Modeling Synthesis This is extremely complex, but progress has been made in modeling the human voice, strings, wind instruments, and percussion instruments. We will look at the first two, focusing especially on a clever and flexible algorithm for string synthesis.
Physical Modeling Synthesis: Human Voice Modeling the Human Vocal Tract: People have tried to create artificial voices for hundreds of years! The most accurate modern methods model the human vocal tract as a series of connected cylindrical tube sections, each of which has its own resonance properties (remember standing waves in pipes?).
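As a reminder of the standing-wave idea, here is a minimal sketch of the resonances of a single uniform tube that is closed at one end and open at the other, whose modes fall at odd multiples of c/(4L). The 17 cm length and the 343 m/s speed of sound are illustrative assumptions (roughly an adult vocal tract in room-temperature air), and `tube_resonances` is a hypothetical helper, not part of the course code. A vocal tract model chains many such sections with different cross-sections, which this sketch does not attempt.

```python
def tube_resonances(length_m, count, speed_of_sound=343.0):
    """Resonant frequencies (Hz) of a tube closed at one end, open at the other."""
    if length_m <= 0:
        raise ValueError("length_m must be positive")
    # The lowest mode fits a quarter wavelength into the tube.
    quarter_wave_hz = speed_of_sound / (4.0 * length_m)
    # Only odd multiples of that frequency form standing waves.
    return [(2 * n - 1) * quarter_wave_hz for n in range(1, count + 1)]


if __name__ == "__main__":
    # Assumed values: a 17 cm tube, first three resonances.
    for freq in tube_resonances(0.17, 3):
        print(f"{freq:.0f} Hz")
```

Changing the length of the tube moves every resonance together; changing the shape (several sections of different width) moves them independently, which is what lets a tube model produce different vowels.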
Physical Modeling Synthesis: Human Voice The results have been quite stunning! Here are some interesting examples of such simulated voices: https://www.youtube.com/watch?v=WjMwGWdqHVQ (Early voice synthesizers doing the “Argument Room Sketch” by Monty Python.) https://www.youtube.com/watch?v=SNqNM6Ccck8 (Interview with Ingo Titze, creator of “Pavarobotti”.) https://deepmind.com/blog/wavenet-generative-model-raw-audio/ (Google WaveNet, with examples.)
Physical Modeling Synthesis: Karplus-Strong Karplus-Strong String Synthesis One of the simplest and most accurate such modeling techniques does not derive from mathematical analysis but from the simulation of a plucked string, such as on a guitar. When a string is first plucked, we can imagine it as having every possible frequency represented, as in random white noise. But very quickly (within 0.05 sec or so) the resonance modes of the string emphasize certain frequencies, which produces its characteristic sound, and over longer time scales the higher frequencies roll off.
Physical Modeling Synthesis: Karplus-Strong Karplus-Strong String Synthesis This process is modeled by a Ring Buffer Queue (remember those, CS 112 folks??). The algorithm is actually very simple: to create a signal of M samples, fill the queue with random values, then repeatedly remove the value at the front of the queue, output it, and insert at the back the average of that value and the next one, multiplied by a decay factor. (Actually this diagram is reversed left-to-right compared with the Python code.)
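The queue rotation described above can be sketched as follows. This is not the course's Python code: the name `karplus_strong`, the parameter names, the `seed` argument, and the choice of uniform noise in [-1, 1] are all assumptions made for illustration. Only the argument order (queue length, number of samples, decay) is chosen to mirror the `KarplusStrong(...)` calls that appear later in the lecture.

```python
import random
from collections import deque


def karplus_strong(buffer_len, num_samples, decay, seed=None):
    """Return num_samples of a plucked-string-like signal as a list of floats."""
    if buffer_len < 2:
        raise ValueError("buffer_len must be at least 2")
    rng = random.Random(seed)
    # Start with white noise: every frequency is represented.
    queue = deque(rng.uniform(-1.0, 1.0) for _ in range(buffer_len))
    signal = []
    for _ in range(num_samples):
        first = queue.popleft()   # value leaving the front of the queue
        signal.append(first)      # it becomes the next output sample
        # Re-insert a damped average of it and the new front value.
        queue.append(decay * 0.5 * (first + queue[0]))
    return signal


if __name__ == "__main__":
    x = karplus_strong(200, 44100 * 2, 0.998, seed=0)
    print(len(x), max(abs(v) for v in x))
```

Because each inserted value is a damped average of two earlier values, the signal can never grow beyond the range of the initial noise, so the output stays within [-1, 1] for any decay factor up to 1.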
Physical Modeling Synthesis: Karplus-Strong NOTE that what we are essentially doing is running through the queue and smoothing the values, which attenuates the high frequencies; but because the smoothing is repeated around a circle, you get numeric patterns that act like the resonance modes of a string, producing a periodic waveform. Here is a queue of length 30, during each round of smoothing:
Physical Modeling Synthesis: Karplus-Strong Here is the whole signal:
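Since the round-by-round plots are not reproduced here, the following sketch prints two numbers per round for a queue of length 30: the peak value and a simple "roughness" measure (the mean absolute difference between neighbouring values). The helper names `smoothing_rounds` and `roughness`, the decay of 0.99, and the uniform starting noise are assumptions for illustration, not taken from the course code.

```python
import random
from collections import deque


def smoothing_rounds(buffer_len, rounds, decay, seed=None):
    """Return the queue contents after each full pass (entry 0 is the noise)."""
    if buffer_len < 2:
        raise ValueError("buffer_len must be at least 2")
    rng = random.Random(seed)
    queue = deque(rng.uniform(-1.0, 1.0) for _ in range(buffer_len))
    snapshots = [list(queue)]
    for _ in range(rounds):
        # One round = every value in the queue replaced exactly once.
        for _ in range(buffer_len):
            first = queue.popleft()
            queue.append(decay * 0.5 * (first + queue[0]))
        snapshots.append(list(queue))
    return snapshots


def roughness(values):
    """Mean absolute difference between neighbours; large for noisy data."""
    steps = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(steps) / len(steps)


if __name__ == "__main__":
    for k, snap in enumerate(smoothing_rounds(30, 10, 0.99, seed=1)):
        peak = max(abs(v) for v in snap)
        print(f"round {k:2d}  peak {peak:.3f}  roughness {roughness(snap):.3f}")
```

The roughness falls much faster than the peak, which is the high frequencies rolling off while a smooth, slowly decaying shape survives from round to round.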
Physical Modeling Synthesis: Karplus-Strong To create a realistic string sound, we observe that the buffer acts something like a string, setting up resonance modes f, 2f, 3f, etc., where the period of the fundamental frequency f is the length of the queue (in samples). Hence, to simulate the SteelString.wav sound at 220 Hz, we create a buffer of length 200, which is as close as we can come given that the queue has an integer length: 44100/200 = 220.5 Hz. X = KarplusStrong(200, 44100*2, 0.998)
Physical Modeling Synthesis: Karplus-Strong SteelString.wav (the recorded 220 Hz string sound being simulated)
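The queue-length arithmetic can be sketched as a small helper. `buffer_length_for` is a hypothetical name, and the code uses the lecture's approximation that the fundamental is the sample rate divided by the queue length. In the original Karplus-Strong analysis the two-point average adds about half a sample of delay, so the sounding pitch is slightly lower than this estimate.

```python
def buffer_length_for(target_hz, sample_rate=44100):
    """Nearest integer queue length for a target pitch, and the pitch it yields."""
    if target_hz <= 0:
        raise ValueError("target_hz must be positive")
    # The queue needs at least two entries for the averaging step.
    length = max(2, round(sample_rate / target_hz))
    return length, sample_rate / length


if __name__ == "__main__":
    for target in (220, 44.1, 2205):
        length, actual = buffer_length_for(target)
        print(f"target {target} Hz -> queue length {length}, about {actual:.2f} Hz")
```

The integer constraint matters most for high notes: near 2 kHz, neighbouring queue lengths are about 100 Hz apart, so many pitches cannot be hit closely without a fractional-delay refinement.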
Physical Modeling Synthesis: Karplus-Strong Not only do the spectra “roll off” fairly realistically, but the basic spectrum is also close and shows the characteristic harmonic series. Compare SteelString.wav and KarplusStrong.wav.
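One way to check the harmonic series without plotting is to probe the synthesized signal at multiples of 220.5 Hz and at the midpoints between them. The sketch below uses a plain single-frequency DFT with a Hann window, written with the standard library only. The `karplus_strong` function repeats an assumed implementation (not the course's code), and `magnitude_at`, the 0.1 second analysis length, and the seed are likewise illustrative choices.

```python
import math
import random
from collections import deque


def karplus_strong(buffer_len, num_samples, decay, seed=None):
    """Assumed implementation: noise-filled queue, damped two-point average."""
    rng = random.Random(seed)
    queue = deque(rng.uniform(-1.0, 1.0) for _ in range(buffer_len))
    signal = []
    for _ in range(num_samples):
        first = queue.popleft()
        signal.append(first)
        queue.append(decay * 0.5 * (first + queue[0]))
    return signal


def magnitude_at(signal, freq_hz, sample_rate=44100):
    """Hann-windowed DFT magnitude of the signal at one frequency."""
    total = len(signal)
    re = im = 0.0
    for n, x in enumerate(signal):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (total - 1))
        angle = 2.0 * math.pi * freq_hz * n / sample_rate
        re += window * x * math.cos(angle)
        im -= window * x * math.sin(angle)
    return math.hypot(re, im)


if __name__ == "__main__":
    signal = karplus_strong(200, 4410, 0.998, seed=0)   # 0.1 s at 44.1 kHz
    f0 = 44100 / 200
    for k in range(1, 6):
        on = magnitude_at(signal, k * f0)            # at the k-th harmonic
        off = magnitude_at(signal, (k + 0.5) * f0)   # halfway to the next one
        print(f"harmonic {k}: {on:.2f}   between: {off:.4f}")
```

The individual harmonic strengths depend on the random starting noise, so they differ from run to run, but the energy sits at the harmonics rather than between them.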
Physical Modeling Synthesis: Karplus-Strong Now of course, you can try different variations, varying the frequency and the decay; here is low and slow: KarplusStrong(1000, 44100*5, 0.99), with fundamental 44100/1000 = 44.1 Hz.
Physical Modeling Synthesis: Karplus-Strong Now of course, you can try different variations, varying the frequency and the decay; here is high and short: KarplusStrong(20, 10000, 0.9), with fundamental 44100/20 = 2205 Hz.
Physical Modeling Synthesis: Karplus-Strong Or, you can start tweaking the various components of the algorithm: load it with non-random values to start, maybe a square wave or a sample; or use a different smoothing algorithm, e.g., a weighted average, an average of 3, or an average of two values separated by k samples, etc. Here I have taken the average of 4 samples: Here, the average of two samples, skipping 2: Here is a weighted average (0.75a + 0.25b): Here I have loaded the queue with a square wave at 220.5 Hz: Here I have loaded the queue with the first 200 samples of the SteelString.wav file: Original:
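These tweaks can be explored with one generalized routine that takes the starting buffer and the smoothing weights as arguments. This is a sketch under assumptions: `karplus_strong_variant` and its parameters are hypothetical names, and the convention that the first weight applies to the value leaving the queue is a choice made here, not something the slides specify.

```python
from collections import deque


def karplus_strong_variant(initial, num_samples, decay, weights=(0.5, 0.5)):
    """Karplus-Strong with a caller-supplied start buffer and smoothing weights.

    weights[0] applies to the value leaving the queue; the remaining
    weights apply, in order, to the values that follow it.
    """
    if not weights or len(initial) < len(weights):
        raise ValueError("initial buffer must be at least as long as weights")
    queue = deque(initial)
    signal = []
    for _ in range(num_samples):
        first = queue.popleft()
        signal.append(first)
        taps = [first] + [queue[i] for i in range(len(weights) - 1)]
        queue.append(decay * sum(w * t for w, t in zip(weights, taps)))
    return signal


if __name__ == "__main__":
    # One period of a square wave, 200 samples long (220.5 Hz at 44.1 kHz).
    square = [1.0] * 100 + [-1.0] * 100
    weighted = karplus_strong_variant(square, 44100, 0.998, (0.75, 0.25))
    four_point = karplus_strong_variant(square, 44100, 0.998, (0.25,) * 4)
    skipping = karplus_strong_variant(square, 44100, 0.998, (0.5, 0.0, 0.0, 0.5))
    print(len(weighted), len(four_point), len(skipping))
```

Wider averages remove high frequencies faster, so the tone dulls sooner; an uneven pair such as (0.75, 0.25) smooths less per round, so the tone stays bright longer.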
Physical Modeling Synthesis: Karplus-Strong Some other variations that have been explored include inverting the value inserted into the queue with some probability p (reminiscent of Ring Modulation, perhaps). For p = ½ this simulates a drum sound. For other values of p, we can mix the string and drum sounds.
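A minimal sketch of the probabilistic inversion follows. The name `karplus_strong_blend`, the parameter `invert_prob` (the probability of flipping the sign of each inserted value), and the use of a caller-supplied starting buffer are assumptions for illustration; the slides do not show the course's implementation.

```python
import random
from collections import deque


def karplus_strong_blend(initial, num_samples, decay, invert_prob, seed=None):
    """Karplus-Strong where each inserted value is negated with invert_prob."""
    if len(initial) < 2:
        raise ValueError("initial buffer needs at least 2 values")
    rng = random.Random(seed)
    queue = deque(initial)
    signal = []
    for _ in range(num_samples):
        first = queue.popleft()
        signal.append(first)
        value = decay * 0.5 * (first + queue[0])
        # Flip the sign at random: 0.0 gives the plain string, 0.5 the drum.
        if rng.random() < invert_prob:
            value = -value
        queue.append(value)
    return signal


if __name__ == "__main__":
    noise_rng = random.Random(0)
    noise = [noise_rng.uniform(-1.0, 1.0) for _ in range(200)]
    drum = karplus_strong_blend(noise, 44100, 0.998, 0.5, seed=1)
    mostly_string = karplus_strong_blend(noise, 44100, 0.998, 0.1, seed=1)
    print(len(drum), len(mostly_string))
```

With `invert_prob` at 0.5 the random sign flips destroy the period-to-period repetition, which is why the result sounds like a noisy drum hit instead of a pitched string.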
Physical Modeling Synthesis: Karplus-Strong Or we can load the queue with an appropriate non-random signal: here is a square wave at 220 Hz, and one at 44.1 Hz. Here is the clarinet sound at 220 Hz, and a bell sound at 1000 Hz. Or we can change the length of the queue to simulate glissando and other pitch-specific effects. The possibilities seem endless… Any ideas?
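As one possible answer, here is a rough sketch of an upward glissando made by shortening the queue while it is playing. Everything about it is an assumption for illustration: the name `karplus_strong_glissando`, the strategy of discarding one queued value at evenly spaced moments, and the parameter choices. Discarding values is crude and may add small clicks; it is only meant to show the pitch following the queue length.

```python
import random
from collections import deque


def karplus_strong_glissando(start_len, end_len, num_samples, decay, seed=None):
    """Shrink the queue from start_len to end_len while synthesizing.

    Returns the signal and the final queue length.
    """
    if not 2 <= end_len <= start_len:
        raise ValueError("need 2 <= end_len <= start_len")
    rng = random.Random(seed)
    queue = deque(rng.uniform(-1.0, 1.0) for _ in range(start_len))
    drops = start_len - end_len
    interval = num_samples // (drops + 1)   # samples between removals
    signal = []
    for n in range(num_samples):
        first = queue.popleft()
        signal.append(first)
        queue.append(decay * 0.5 * (first + queue[0]))
        # At evenly spaced moments, discard one value to raise the pitch.
        if drops and interval and (n + 1) % interval == 0 and len(queue) > end_len:
            queue.popleft()
    return signal, len(queue)


if __name__ == "__main__":
    # Slide from about 220.5 Hz (length 200) up to about 441 Hz (length 100).
    signal, final_len = karplus_strong_glissando(200, 100, 44100, 0.998, seed=0)
    print(len(signal), final_len)
```

Halving the queue length raises the pitch by about an octave over the course of the note.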