HasSound: Generating Musical Instrument Sounds in Haskell
Paul Hudak
Yale University, Department of Computer Science
Joint work with: Matt Zamec (Yale '00), Sarah Eisenstat (Brown '08)
NEPLS, Brown University, October 27, 2005

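Haskore/HasSound Pictorial
[Diagram: Haskore produces a MIDI file (note level), which is played by a MIDI file player (uses the synthesizer on the sound card), or a csound score file (note level). HasSound produces a csound orchestra file (signal processing description). csound produces a wav/snd/mp3 file (signal sample level), which is played by a sound file player (D/A conversion). The files are marked on a scale from "small" to "large".]
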
Background
Haskore is a Haskell library for describing music at a high level of abstraction.
MIDI is a low-level representation of music at the note level.
csound is a low-level DSL both for describing music at the note level and for defining instruments that generate sounds in a signal-processing framework.
The Haskore implementation translates high-level descriptions into either MIDI or (the note-level version of) csound. (MIDI instruments are pre-defined.)
What's missing is a way to define instruments in Haskell. HasSound fills this gap.
The HasSound implementation translates high-level sound descriptions into (the sound-generating version of) csound.

Haskore
The primitive element in Haskore is a note (all higher-level concepts will be ignored).
A note consists of a pitch, a duration, and one or more instrument-dependent parameters, or "p-fields" (which encode things like volume, vibrato, tremolo, reverb, etc.).
The instrument must know how to handle these:
  – In MIDI, only volume is allowed.
  – But in csound, the instrument is user-defined, so any number of p-fields can be accommodated.

HasSound
A HasSound program describes an orchestra.
An orchestra consists of a header and a collection of instruments.
Each instrument consists of:
  1. An instrument number.
  2. A note extension.
  3. A signal expression describing the output of the instrument in terms of pitch, duration, and p-fields.
  4. A list of global signals to which values (expressed as signal expressions) are communicated.
The key concept is (3) above.

Example 1
Monophonic output of a 440 hertz tone using wave table 1 (a sine wave) at amplitude 10,000, sampled at the audio rate:
    mono (oscil ar 10000 440 1)
OK, so that's not very interesting…

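To make concrete what such an oscillator produces, here is a minimal sketch that computes the sample stream directly. It is not HasSound code: `sineSamples` is a hypothetical name, and it calls `sin` instead of reading a wave table, which is an assumption made to keep the example self-contained.

```haskell
-- Hypothetical model of a sine oscillator as an infinite list of samples.
-- Arguments: peak amplitude, frequency in Hz, sample rate in Hz.
sineSamples :: Double -> Double -> Double -> [Double]
sineSamples amp freq rate =
  [ amp * sin (2 * pi * freq * n / rate) | n <- [0 ..] ]

-- The tone of Example 1, assuming a 44100 Hz audio rate.
example1 :: [Double]
example1 = sineSamples 10000 440 44100
```

Because the list is lazy, `take 44100 example1` yields one second of samples without computing the rest.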
Score Files and Orchestra Files
Score file (.sco): one f statement, five i statements, and a closing e statement.
    f …
    i …
    e
Each i statement gives the instrument number (2), start time, duration, and p-fields 4 and 5 (p4 = volume, p5 = frequency).
Orchestra file (.orc):
    instr 1
    …
    endin
    …
    instr 2
    a1 oscil p4, p5, 1
    out a1
    endin
    …

A Bigger Example
This example chooses one of four waveforms (sine, sawtooth, square, or pulse), and adds chorusing (detuned signals), vibrato (frequency modulation), and various kinds of envelopes.
Let's set up the p-fields as follows:
  – p3 = duration      p6 = attack time     p9 = vibrato depth
  – p4 = amplitude     p7 = release time    p10 = vibrato delay (0-1)
  – p5 = pitch         p8 = vibrato rate    p11 = waveform selection
    let irel  = 0.01                              -- vibrato release
        idel1 = p3 * p10                          -- initial delay
        isus  = p3 - idel1 - irel                 -- sustain time
        iamp  = ampdb p4                          -- amplitude
        inote = cpspch p5                         -- frequency
        k3    = linseg 0 idel1 p9 isus p9 irel 0  -- envelope
        k2    = oscil k3 p8 1                     -- vibrato signal
        k1    = linen (iamp/3) p6 p3 p7           -- envelope
        a3    = oscil k1 (inote*0.999+k2) p11     -- chorusing signal
        a2    = oscil k1 (inote*1.001+k2) p11     -- chorusing signal
        a1    = oscil k1 (inote+k2) p11           -- main signal
    in mono (a1+a2+a3)                            -- result

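The vibrato envelope `linseg 0 idel1 p9 isus p9 irel 0` is a piecewise-linear ramp: rise from 0 to the vibrato depth, hold, then fall back to 0. A minimal sketch of that shape as a plain function of time follows. The names `linsegAt` and `vibratoEnv` are hypothetical, the release time 0.01 is taken from the csound version of this instrument, and holding the final value after the last segment is an assumption of this model.

```haskell
-- Piecewise-linear envelope evaluated at time t (seconds).
-- Takes a start value and a list of (segment duration, target value) pairs.
-- Assumption: after the last segment the final value is held.
linsegAt :: Double -> [(Double, Double)] -> Double -> Double
linsegAt v0 segs t0 = go v0 segs (max 0 t0)
  where
    go v [] _ = v
    go v ((d, v') : rest) t
      | t >= d    = go v' rest (t - d)        -- past this segment, move on
      | otherwise = v + (v' - v) * t / d      -- interpolate inside the segment

-- The vibrato envelope k3, given note duration p3, depth p9, delay fraction p10.
vibratoEnv :: Double -> Double -> Double -> Double -> Double
vibratoEnv p3 p9 p10 = linsegAt 0 [(idel1, p9), (isus, p9), (irel, 0)]
  where
    irel  = 0.01
    idel1 = p3 * p10
    isus  = p3 - idel1 - irel
```

For a 2-second note with depth 5 and delay fraction 0.25, the envelope is halfway up its ramp at t = 0.25 and at full depth by t = 0.5.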
Executable Specification
[Signal-flow diagram: boxes for cpspch, linseg, linen and three oscil units plus the vibrato oscil, with inputs p5, p8, p11 and 1, and wires labelled inote, k3, k2, k1, a3, a2 and a1.]
    let …
        inote = cpspch p5
        k2    = oscil k3 p8 1
        k1    = linen (iamp/3) p6 p3 p7
        a3    = oscil k1 (inote*0.999+k2) p11
        a2    = oscil k1 (inote*1.001+k2) p11
        a1    = oscil k1 (inote+k2) p11
    in mono (a1+a2+a3)
"A picture is worth a thousand lines of code…"

csound Orchestra Code
The previous example is equivalent to the following csound code:
    instr 6
    irel  = .01
    idel1 = p3 * p10
    isus  = p3 - (idel1 + irel)
    iamp  = ampdb(p4)
    inote = cpspch(p5)
    k3 linseg 0, idel1, p9, isus, p9, irel, 0
    k2 oscil k3, p8, 1
    k1 linen iamp/3, p6, p3, p7
    a3 oscil k1, inote*.999+k2, p11
    a2 oscil k1, inote*1.001+k2, p11
    a1 oscil k1, inote+k2, p11
    out a1+a2+a3
    endin
This is not bad! But there are some funny things going on:
  – Sometimes there is an "equal" sign (=), other times not.
  – "out" looks like a variable, not a function.
  – What happens when the order of the "statements" is changed?

Motivation and Goals
HasSound lets Haskore users define instruments without leaving the "convenience" of Haskell.
HasSound provides a simple framework for algorithmic instrument synthesis.
HasSound is:
  – as expressive as csound.
  – purely functional (thus the "value added" is the power of functional programming).
  – compilable into csound (for efficiency). (This is a big constraint!)

Problems
There are several problems that arise in meeting these goals:
  – Global variables in csound.
  – Delay lines in csound.
  – Imperative glue in csound.
  – Recursive signals (not allowed in csound).
  – Csound is enormous (the manual is 1200 pages long!)
I will talk about some of these…

Global Variables
Suppose we want reverb that lasts past the end of a note. One way to do this is to use global variables to "communicate" to another instrument that is "always on":
    garvb init 0                      ; initialize global (in header)
    instr 9
    …                                 ; instrument using reverb
    out a1                            ; normal output
    garvb = garvb + a1 * rvbgain      ; add reverb to global variable
    endin
    instr 99                          ; global reverb instrument
    asig reverb garvb, p4             ; compute reverb
    out asig
    garvb = 0                         ; then clear global!
    endin

Globals in HasSound
Global variables are replaced by global signals in HasSound. In addition to the normal signal output, an instrument can attach signals to global signal names:
    data InstrBlock a = InstrBlock
        Instr                         -- instrument number
        SigExp                        -- note extension
        a                             -- normal output
                                      -- (i.e. Mono se, Stereo se1 se2, Quad … )
        [(GlobalSigName, SigExp)]     -- global signals
    data GlobalSig = Global
        (SigExp -> SigExp -> SigExp)  -- combining function
        Int                           -- unique identifier

Example in HasSound
So the previous example can be written in HasSound like this:
    let gsig = Global (+) 1
    in [ InstrBlock 9  0 (mono a1) [ (gsig, a1*rvbgain) ],
         InstrBlock 99 0 (mono (reverb (read gsig) p4)) [ ] ]
Note: more than one "instance" of instrument 9 may be active at any given time.

Imperative Glue
Global variables in csound are not modular: if you combine different instrument definitions, global variable names may clash.
So in HasSound we introduce a monad of global signal names at the top level to ensure modularity. For example:
    let a1   = oscI AR (tableNumber 1)
        comp = do h <- mkSignal AR (+)
                  addInstr (InstrBlock 1 0 (Mono a1) [(h, a1)])
                  addInstr (InstrBlock 2 0 (Mono (readGlobal h)) [])
    in saveIA (mkOrc (44100, 4410) comp)

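One way such a monad can guarantee clash-free names is to thread a counter through the computation, so that every request for a global signal gets the next unused number. The sketch below illustrates that idea only. `Fresh`, `GlobalName`, `freshGlobal`, `reverbLib` and `chorusLib` are hypothetical names, and this is not claimed to be how HasSound's own monad is implemented.

```haskell
-- A hand-rolled state monad over an Int counter (hypothetical, for illustration).
newtype Fresh a = Fresh { runFresh :: Int -> (a, Int) }

instance Functor Fresh where
  fmap f (Fresh g) = Fresh $ \n -> let (a, n') = g n in (f a, n')

instance Applicative Fresh where
  pure a = Fresh $ \n -> (a, n)
  Fresh f <*> Fresh g = Fresh $ \n ->
    let (h, n')  = f n
        (a, n'') = g n'
    in (h a, n'')

instance Monad Fresh where
  Fresh g >>= k = Fresh $ \n ->
    let (a, n') = g n in runFresh (k a) n'

newtype GlobalName = GlobalName Int deriving (Eq, Show)

-- Hand out the next unused global name and advance the counter.
freshGlobal :: Fresh GlobalName
freshGlobal = Fresh $ \n -> (GlobalName n, n + 1)

-- Two independently written instrument libraries, each needing globals.
reverbLib, chorusLib :: Fresh [GlobalName]
reverbLib = do { g <- freshGlobal; pure [g] }
chorusLib = do { a <- freshGlobal; b <- freshGlobal; pure [a, b] }

-- Combining them cannot produce a clash: names are allocated in sequence.
combined :: [GlobalName]
combined = fst (runFresh ((++) <$> reverbLib <*> chorusLib) 1)
```

Neither library chooses a number itself, so combining them in either order still yields distinct names.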
Delay Lines
A delay line delays a signal by a given duration. It is useful for reverb, but also for certain percussive sounds. In csound, it is unnecessarily imperative:
    a1 delayr max          ; sets max delay
    …
    a2 deltapi atime 1     ; taps delay line
    a3 deltapi atime 2     ; another tap
    …
    delayw asource         ; input to delay line
The effect is non-local, there cannot be more than one delay line in the same sequential fragment, and it is prone to errors.

Delay Lines in HasSound
In HasSound we provide equivalent power, but purely functionally:
    data DelayLine = DelayLine
        SigExp                        -- max delay time
        SigExp                        -- signal to be delayed
    data SigExp = …
        | Tap DelayLine [SigExp]      -- create multiple taps
        | Result DelayLine            -- output of delay line
Thus delay lines are "first-class values".

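The functional view of a delay line can be illustrated on plain sample lists, where a delay is just a prefix of silence and several taps are several delayed copies of one source. This is a sketch under simplifying assumptions: delays are whole numbers of samples rather than times, and `delaySamples` and `taps` are hypothetical names, not HasSound functions.

```haskell
-- Delay a signal by n samples by prepending n samples of silence.
delaySamples :: Int -> [Double] -> [Double]
delaySamples n xs = replicate n 0 ++ xs

-- Several taps on the same source: one delayed copy per requested delay.
taps :: [Int] -> [Double] -> [[Double]]
taps ns xs = map (`delaySamples` xs) ns
```

Since the source is an ordinary value, any number of delay lines and taps can coexist in one expression, with no ordering between a read and a write.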
Recursive Signals
We would like to be able to write things such as:
    let x = delay (sig * x) 1.0 in …
This looks like a finite loop. But as a data structure, it's an infinite tree…
We could design our own DSL, or require the user to "flag" each recursive reference.
In HasSound, we introduce implicit looping via a fixed point operator:
    rec (\x -> delay (sig * x) 1.0)

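The fixed-point form has a direct analogue on lazy sample lists, using `fix` from `Data.Function`. The sketch below models a one-sample delay with an initial value and feeds the output back through a constant gain. `delay1` and `decay` are hypothetical names, and the constant gain 0.5 stands in for `sig`.

```haskell
import Data.Function (fix)

-- One-sample delay: emit an initial value first, then the input.
delay1 :: Double -> [Double] -> [Double]
delay1 initial xs = initial : xs

-- A recursive signal: each sample is half of the previous one.
-- Laziness makes this productive, since the first sample does not depend on x.
decay :: [Double]
decay = fix (\x -> delay1 1 (map (* 0.5) x))
```

This works on lists because each element is computed on demand. A `SigExp` tree, by contrast, has to be traversed in full by the compiler, which is why the recursion must be made explicit there.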
The SigExp Data Type (sort of…)
    data SigExp
        = Const Float
        | Pfield Int
        | Str String
        | Read GlobalSig
        | Tap Function DelayLine [SigExp]
        | Result DelayLine
        | Conditional Boolean SigExp SigExp
        | Infix Function SigExp SigExp
        | Paren Function SigExp
        | SigGen Function EvalRate OutCount [SigExp]
        | Rec (SigExp -> SigExp)
        | Var Integer                 -- not exported
        | Loop Integer SigExp         -- not exported
        | Index OutCount SigExp
        deriving (Show, Eq)

Translating Loops
Conceptually, this expression:
    x = delay (sig * x) 1.0
must be translated into this csound code:
    ax init 0                 -- initialize ax
    …
    ax1 delay (sig * ax)      -- create new sig
    ax = ax1                  -- update old sig
    …
This is done in two steps, starting from the "rec" form:
  – Generate a unique variable name, and inject it into the functional:
        rec (\x -> delay (sig * x) 1.0)
    becomes
        Loop n (delay (sig * Var n) 1.0)    -- n unique
  – For each Loop, generate init code and usage code as above.

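The first step can be sketched on a cut-down expression type. The code below applies the function inside each `Rec` to a fresh `Var` and wraps the result in a `Loop`, threading a counter to keep the numbers unique. The type `Sig` and the functions `unwind` and `render` are hypothetical simplifications for illustration, not the HasSound implementation.

```haskell
-- A cut-down signal expression type (hypothetical; far smaller than SigExp).
data Sig
  = Const Double
  | Mul Sig Sig
  | Delay Sig Double
  | Rec (Sig -> Sig)
  | Var Int
  | Loop Int Sig

-- Replace every Rec by a Loop over a fresh variable.
-- The Int is the next unused variable number; the updated counter is returned.
unwind :: Int -> Sig -> (Sig, Int)
unwind n (Rec f) =
  let (body, n') = unwind (n + 1) (f (Var n))
  in (Loop n body, n')
unwind n (Mul a b) =
  let (a', n1) = unwind n a
      (b', n2) = unwind n1 b
  in (Mul a' b', n2)
unwind n (Delay s d) =
  let (s', n') = unwind n s in (Delay s' d, n')
unwind n (Loop k s) =
  let (s', n') = unwind n s in (Loop k s', n')
unwind n s = (s, n)  -- Const and Var contain no Rec

-- Print an expression so the result can be inspected.
render :: Sig -> String
render (Const x)   = show x
render (Mul a b)   = "(" ++ render a ++ " * " ++ render b ++ ")"
render (Delay s d) = "delay " ++ render s ++ " " ++ show d
render (Var n)     = "x" ++ show n
render (Loop n s)  = "loop x" ++ show n ++ " = " ++ render s
render (Rec _)     = "<rec>"

-- The running example, with the constant 0.5 standing in for sig.
example :: Sig
example = Rec (\x -> Delay (Mul (Const 0.5) x) 1.0)
```

After `unwind`, the expression is a finite tree with no functions left in it, so a later pass can walk it and emit the init and update code for each `Loop`.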
Compiling Into csound
All of the "functional" translations are straightforward.
However, common subexpression elimination is critical.
Delay lines are tedious but straightforward.
Global variables require special care in order to ensure proper initialization and resetting.
Recursive signals also require careful sequencing of code.

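Common subexpression elimination matters because a shared Haskell value such as `inote+k2` appears as a repeated subtree once the expression is a data structure. A minimal sketch of the idea: remember each compound expression already compiled and reuse its temporary name. The type `Exp`, the functions `compile` and `binary`, and the `t1`, `t2` naming scheme are hypothetical, and the association-list lookup is chosen for brevity, not efficiency.

```haskell
-- A tiny arithmetic expression type (hypothetical).
data Exp = Lit Double | PField Int | Add Exp Exp | Mul Exp Exp
  deriving (Eq, Show)

-- Compiler state: expressions already bound to temporaries, and emitted lines.
type St = ([(Exp, String)], [String])

-- Compile an expression, returning the name or literal that denotes it.
compile :: Exp -> St -> (String, St)
compile (Lit x) st = (show x, st)
compile (PField n) st = ("p" ++ show n, st)
compile e st@(seen, _)
  | Just name <- lookup e seen = (name, st)   -- already compiled: reuse
compile e@(Add a b) st = binary "+" e a b st
compile e@(Mul a b) st = binary "*" e a b st

-- Compile both operands, then emit one line binding a new temporary.
binary :: String -> Exp -> Exp -> Exp -> St -> (String, St)
binary op e a b st0 =
  let (na, st1)          = compile a st0
      (nb, (seen, code)) = compile b st1
      name               = "t" ++ show (length seen + 1)
      line               = name ++ " = " ++ na ++ " " ++ op ++ " " ++ nb
  in (name, ((e, name) : seen, code ++ [line]))

-- (p5 + p8) * (p5 + p8): the sum should be computed only once.
shared :: Exp
shared = Mul (Add (PField 5) (PField 8)) (Add (PField 5) (PField 8))
```

Compiling `shared` from the empty state emits two lines, `t1 = p5 + p8` and `t2 = t1 * t1`, instead of computing the sum twice.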
Csound is Enormous
The csound manual is 1200 pages long.
Chapter 15, "Orchestra Opcodes and Operators", is 972 pages long!!
We cannot hope to import everything, but it's easy to add your favorite operation, as long as it's functional…

Conclusions
Certain kinds of imperative ideas can be redesigned, and others can be reengineered.
Embedding a DSL in Haskell works well, but has some limitations (for example, recursion is not transparent).
Future work:
  – Better (more type-safe, etc.) interface between instruments and the score.
  – Import more of csound.
  – Graphical interface.