Chapter 13: Sounds and signals
  basics of computer sound
  perception and generation of sound
  synthesizing complex sounds
  sampling sound signals
  simple example of signal processing
Sources used: (a) Chapter 13 of the text (b) Daisy Fan, Cornell University
Role of audio in computers
  audio is an important sensory signal
  crucial component of multimedia data: audio, music
  tools for interacting with digital computers for visually impaired persons
  music analysis and synthesis
  speech processing and synthesis
Basics of computer sound in Matlab
Matlab can open a file in .wav format:
>> [x, fs, bits] = wavread('fh.wav');
>> sound(x, fs); % plays the sound clip
Computing with sound in Matlab requires that we first convert the wav format data into simple numeric data, which is the job of the function wavread. The variable fs above is the sampling rate (the number of samples per second), and bits is the number of bits used to represent each sample.
Basics of computer sound >> fs fs = >> bits bits = 16 >> x(100:105) ans = >> plot(1:length(x), x);
Basics of computer sound
Some basic questions:
  why does the sound waveform range in amplitude from -1 to 1?
  what role does the sampling frequency play in the quality of the sound?
  what happens if we play back the sound at a different sampling frequency?
Computing with sound requires digitization Sound is (analog) continuous; capture its essence by sampling Digitized sound is a vector of numbers
Sampling rate affects the quality
If the sampling is not frequent enough, the discretized sound will not capture the essence of the continuous sound and the quality will be poor.
Sampling Rate
Given human perception, 20,000 samples/second is pretty good (20000 Hz or 20 kHz).
  8,000 Hz is used for speech over the telephone
  44,100 Hz is used for audio CDs
  192,000 Hz is used for HD-DVD audio tracks
Resolution also affects the quality
Typically, each sampled value is encoded as an 8-bit integer in the .wav file.
  Possible values: -128, -127, …, -1, 0, 1, …, 127
  Loud: -120, 90, 122, etc.
  Quiet: 3, 10, etc.
16 bits are used when very high quality is required.
wavread converts the 8-bit values into real numbers in [-1,1].
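The mapping from integer samples to real numbers can be sketched as follows. The sample values are made up for illustration, and dividing by 2^(bits-1) is an assumed scaling convention; the slides do not say exactly how wavread scales the data.

```matlab
% Hypothetical 8-bit sample values (signed integers in -128..127).
raw = int8([-128 -120 -1 0 3 10 90 127]);

% Assumed convention: divide by 2^(bits-1) so every value lands in [-1, 1).
bits = 8;
x = double(raw) / 2^(bits - 1);

% Number of distinct amplitude levels available at this resolution.
levels = 2^bits;
```

With 16 bits the same calculation gives 65536 levels, which is why the higher resolution sounds cleaner for quiet passages.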
Amplitude, frequency and phase
P(t) = A sin(2*pi*f*t + phi)
  f = frequency
  phi = phase (shift)
  A = amplitude
A single sine wave models a pure musical tone.
Amplitude determines the loudness.
Human perception of sound is in the range 50 Hz to about 20,000 Hz.
Most sounds (and music) are a complicated mixture of various frequencies.
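As a rough illustration of the formula, the sketch below samples P(t) for one second. The sampling rate, amplitude, frequency and phase are arbitrary values chosen for the example, not values from the text.

```matlab
fs  = 8000;            % assumed sampling rate (samples per second)
A   = 0.5;             % amplitude: controls loudness
f   = 440;             % frequency in Hz: controls pitch
phi = pi/2;            % phase shift in radians

t = (0:fs-1)/fs;       % sample times covering one second
p = A*sin(2*pi*f*t + phi);

% sound(p, fs) would play the tone; plot(t(1:100), p(1:100)) shows its shape.
```

Changing phi shifts the waveform in time without changing how a single steady tone sounds, while changing A or f changes loudness or pitch.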
Calculating frequencies of notes The author of the text has recorded a whistle sound in the file named whistle.wav. >> [w, fs, bits] = wavread('whistle.wav'); >> length(w) ans = >> fs fs = Question: What is the duration of the whistle? What are the various frequencies he has used?
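whistle.wav – waveform
Zooming in on one region
Question: what is the frequency?
Note that the zoomed region is 100 samples long and the waveform repeats about 9 times, i.e. 9 cycles per 100 samples. Multiplying by the sampling rate fs gives the frequency in cycles per second. This works out to about 1 kHz.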
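The arithmetic behind that estimate might look like this. The value of fs is an assumption made for the example, since the sampling rate of whistle.wav is not shown above; substitute the value returned by wavread.

```matlab
fs       = 11025;   % ASSUMED sampling rate in samples/second
cycles   = 9;       % repetitions counted in the zoomed plot
nSamples = 100;     % width of the zoomed region in samples

samplesPerCycle = nSamples / cycles;      % about 11.1 samples per period
fEst = cycles / nSamples * fs;            % estimated frequency in Hz
```

With the assumed rate the estimate is 992.25 Hz, consistent with "about 1 kHz".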
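Synthesis of sound – combining pure notes
A pure tone can be synthesized as follows:
>> x = sin(0 : 2*pi*1000/8192 : 2*pi*1000); % 1000 Hz tone, one second long at 8192 samples/second
>> sound(x) % sound uses 8192 samples/second when no rate is given
The amplitude is kept at 1 because sound clips values outside [-1,1].
What will happen if we try?
>> x = [x x];
>> sound(x)
We can also generate a superposition of two frequencies.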
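A superposition of two pure tones could be built as in the sketch below. The two frequencies (440 Hz and 660 Hz) are arbitrary choices for illustration.

```matlab
fs = 8192;                   % rate that sound(x) assumes when none is given
t  = (0:fs-1)/fs;            % one second of sample times

x1 = sin(2*pi*440*t);        % first pure tone (illustrative frequency)
x2 = sin(2*pi*660*t);        % second pure tone (illustrative frequency)

% Average the two so the sum stays inside [-1, 1] and is not clipped.
x = (x1 + x2)/2;

% sound(x, fs) would play both tones at once.
```

Dividing by the number of tones is a simple way to avoid clipping; the relative weights could also be changed to make one tone dominate.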
Exercise: Write a program in Matlab that plays a sequence of music clips in succession.
Possible solution:
playList = {'whistle.wav', 'song.wav'};
for k = 1:length(playList)
    [y, rate] = wavread(playList{k});
    sound(y, rate)
end
Problem: sound returns immediately, so the song will start playing before the whistle finishes.
The correct way to solve the problem is to introduce an appropriate delay after each song. pause(x) introduces a delay of x seconds. Calculate the delay from the number of samples and the sampling rate.
>> [x, fs] = wavread(file1);
>> [y, fs1] = wavread(file2);
>> sound(x, fs); pause(length(x)/fs); sound(y, fs1);
A simple application of signal analysis
A phone dial pad has a frequency associated with each row and each column, so two frequencies are associated with each button. Each button has its own 2-frequency "fingerprint"!
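Signal for button 5
Fs = 32768;                 % samples per second
tFinal = 0.5;               % duration in seconds
t = 0:(1/Fs):tFinal;
yR = sin(2*pi*770*t);       % row frequency for button 5
yC = sin(2*pi*1336*t);      % column frequency for button 5
y = (yR + yC)/2;            % average keeps the signal within [-1,1]
sound(y, Fs)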
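The same idea extends to any button with a small lookup table. This sketch uses the standard touch-tone row and column frequencies; the variable names and the keypad layout matrix are introduced here for illustration only.

```matlab
rowFreqs = [697 770 852 941];           % Hz, one per keypad row
colFreqs = [1209 1336 1477];            % Hz, one per keypad column
keys = ['123'; '456'; '789'; '*0#'];    % keypad layout as a character matrix

button = '5';
[r, c] = find(keys == button);          % row and column of the pressed button

Fs = 32768;
t  = 0:(1/Fs):0.5;
y  = (sin(2*pi*rowFreqs(r)*t) + sin(2*pi*colFreqs(c)*t))/2;

% sound(y, Fs) would play the tone for the chosen button.
```

For button 5 the lookup returns 770 Hz and 1336 Hz, matching the signal on this slide.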
Received signal should be decoded to determine the digits
Decoding – using a filter
We can analyze the spectrum of the signal to determine the frequencies present. From this it is easy to decode the signal. We may pursue the details of this in the next HW.
Question: can you think of a way to decode the signal directly (without applying a Fourier transform or some such trick)?
The book discusses the sampling rate and the Nyquist theorem. Please read this.
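One possible spectrum-based decoder is sketched below. It is only an illustration of the idea, not the method from the text: it takes the FFT magnitude, reads it at the bin nearest each candidate frequency, and picks the strongest row and column. The test signal and all variable names are assumptions made for the example.

```matlab
% Test signal: the button-5 tone (770 Hz + 1336 Hz), half a second long.
Fs = 32768;
t  = 0:(1/Fs):0.5;
y  = (sin(2*pi*770*t) + sin(2*pi*1336*t))/2;

rowFreqs = [697 770 852 941];
colFreqs = [1209 1336 1477];
keys = ['123'; '456'; '789'; '*0#'];

N = numel(y);
Y = abs(fft(y));                 % magnitude spectrum
freqAxis = (0:N-1)*Fs/N;         % frequency in Hz of each FFT bin

% Read the spectrum at the bin closest to each candidate frequency.
rowMag = zeros(size(rowFreqs));
for k = 1:numel(rowFreqs)
    [~, idx] = min(abs(freqAxis - rowFreqs(k)));
    rowMag(k) = Y(idx);
end
colMag = zeros(size(colFreqs));
for k = 1:numel(colFreqs)
    [~, idx] = min(abs(freqAxis - colFreqs(k)));
    colMag(k) = Y(idx);
end

[~, r] = max(rowMag);            % strongest row frequency
[~, c] = max(colMag);            % strongest column frequency
decoded = keys(r, c);
```

A real received signal would contain noise and several digits in sequence, so it would first need to be split into one segment per digit.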
Exercise 13.1: The audio file whistle.wav waveform is an eight-note ascending scale. Use reversal and concatenation to generate an ascending and descending scale.
>> [x, fs] = wavread('whistle.wav');
>> y = x(length(x):-1:1); % reversed copy: the descending scale
>> sound(x, fs); pause(length(x)/fs); sound(y, fs);
Exercise 13.2 Find the lowest frequency signal you can hear.
Idea: Assume that everyone can hear 1000 Hz. (What if this is not true?) Then perform a binary search between 0 Hz and 1000 Hz for the boundary between the frequencies you can and cannot hear. Play the middle frequency and ask whether the user can hear it. If not, continue the search in the higher half of the range; otherwise continue in the lower half.
function res = lowsearch
% Binary search for the lowest frequency (in Hz) that the user can hear.
upper = 1000; lower = 0;                      % search range in Hz
fs = 22050; timelength = 1.0; amp = 1.0;
nsamps = timelength.*fs + 1;
t = linspace(0, timelength, nsamps);
% Sanity check: the user should be able to hear the upper frequency.
f = upper;
sig = amp.*sin(2.*pi*f.*t);
sound(sig, fs);
response = input('Did you hear that? (y or n) ', 's');
if response(1) ~= 'y'
    error('The equipment is not working.');
end
for k = 1:10
    middle = (lower + upper)./2;
    sig = amp.*sin(2.*pi*middle.*t);
    sig = sig.*(sin(pi.*t./timelength)).^2;   % fade in and out to avoid clicks
    sound(sig, fs);
    disp(middle);
    response = input('Did you hear that? (y or n) ', 's');
    if response(1) == 'n'
        lower = middle;                       % not heard: the threshold is higher
    else
        upper = middle;                       % heard: the threshold is at or below middle
    end
end
res = (upper + lower)./2;
Exercise 13.3 For a particular frequency, find the lowest amplitude that you can hear. Use the same binary search, this time on a range of amplitudes.
Exercise 13.4 Write a program that plays the seven-note musical scale starting at a given frequency. Note that the frequencies of successive semitones differ by a factor of 2^(1/12), and the notes of the scale lie 0, 2, 4, 5, 7, 9, 11 and 12 semitones above the starting note.
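The amplitude search for Exercise 13.3 might be organized as below. To keep the sketch runnable without a listener, the yes/no answer comes from a simulated listener, a hypothetical stand-in with a made-up threshold of 0.013; in a real program that line would play the tone at amplitude middle and ask the user.

```matlab
% Simulated listener (hypothetical): hears any amplitude of 0.013 or more.
canHear = @(amp) amp >= 0.013;

lower = 0;      % amplitude known to be inaudible
upper = 1;      % amplitude assumed to be audible

for k = 1:10
    middle = (lower + upper)/2;
    if canHear(middle)
        upper = middle;    % heard: the threshold is at or below middle
    else
        lower = middle;    % not heard: the threshold is above middle
    end
end

threshold = (lower + upper)/2;
```

Ten halvings shrink the interval from 1 to 1/1024, so the estimate is within about 0.0005 of the true threshold.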
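function exer134(basefreq, dur)
% Plays an ascending scale starting at basefreq (Hz); each note lasts dur seconds.
if nargin < 2
    dur = 0.5;
end
fs = 22050;
sig = [];
notes = [0,2,4,5,7,9,11,12];                       % semitone offsets from the base note
t = linspace(0, dur - (1./fs), round(fs.*dur));    % round so the sample count is an integer
for k = 1:length(notes)
    note = sin(2.*pi.*t.*basefreq.*(2.^(notes(k)/12)));
    sig = [sig, note];                             % append this note to the signal
end
sound(sig, fs);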
Exercise 13.5 (Homework problem)
Your program should be able to handle a long sequence of digits.
Idea:
  0. Create the files 'one.wav', 'two.wav', etc. (You can do it by recording all the sounds into a single file, then splitting it into separate files.)
  1. Convert the input number to an ASCII string.
  2. For each character in the string, look up the relevant sound file and read it in.
  3. Concatenate the signals corresponding to the digits.
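Steps 1 and 2 above could start from a lookup like the following sketch. The file names beyond 'one.wav' and 'two.wav' ('zero.wav', 'three.wav', and so on) are assumed by analogy, and the input number is an arbitrary example. Reading and concatenating the clips is left as a comment because the sound files are not available here.

```matlab
% Assumed file naming: one recording per digit, named after the digit.
names = {'zero','one','two','three','four','five','six','seven','eight','nine'};

n = 2024;                       % example input number
digitStr = sprintf('%d', n);    % step 1: number -> ASCII string

files = cell(1, numel(digitStr));
for k = 1:numel(digitStr)
    d = digitStr(k) - '0';              % character -> numeric digit 0..9
    files{k} = [names{d + 1} '.wav'];   % step 2: look up the sound file
end

% Step 3 (not run here): for each file, [y, fs] = wavread(files{k});
% then append y to a running signal and play the result with sound.
```

For a very long sequence of digits, the input is safer to accept as a string in the first place, since a double cannot represent more than about 15 digits exactly.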