EE 194/BIO 196: Modeling, simulating and optimizing biological systems
Spring 2018, Tufts University
Instructor: Joel Grodstein, joel.grodstein@tufts.edu

Functions

Functions – why do we care?

    import numpy as np

    A = np.array([1,2,3,4,5])
    t1 = A.sum()
    a1 = A.mean()
    d1 = A.std()
    print('Data for A =', t1, a1, d1)

    B = np.array([10,20,30])
    t2 = B.sum()
    a2 = B.mean()
    d2 = B.std()
    print('Data for B =', t2, a2, d2)

    C = np.array([100,200,300])
    t3 = C.sum()
    a3 = C.mean()
    d3 = C.std()
    print('Data for C =', t3, a3, d3)

    print('total =', t1+t2+t3)
    print('avg of avgs=', (a1+a2+a3)/3)
    print('avg of stds=', (d1+d2+d3)/3)

What does this long program do? It prints various information for three arrays, and then a summary.

Problems with it?
- There’s a lot of nearly-replicated code.
- It takes a while to read.
- You’re never quite sure if it’s completely replicated or not. If not, was the not-quite-replication on purpose or a typo? Did the dog step on the keyboard?

EE 194/Bio 196 Joel Grodstein

Functions – why do we care?

Software: the relentless quest to write less & less code that nonetheless manages to do more & more stuff.
Functions are the 500-pound gorilla of software.
Let's watch our gorilla in action.

Functions – why do we care?

    A = np.array([1,2,3,4,5])
    t1 = A.sum()
    a1 = A.mean()
    d1 = A.std()
    print('Data for A =', t1, a1, d1)

    B = np.array([10,20,30])
    t2 = B.sum()
    a2 = B.mean()
    d2 = B.std()
    print('Data for B =', t2, a2, d2)

    C = np.array([100,200,300])
    t3 = C.sum()
    a3 = C.mean()
    d3 = C.std()
    print('Data for C =', t3, a3, d3)

    print('total =', t1+t2+t3)
    print('avg of avgs=', (a1+a2+a3)/3)
    print('avg of stds=', (d1+d2+d3)/3)

Three repetitions of almost the same thing. But they’re not exactly the same, so what can our gorilla do?

We've already used functions

    b = (3 + 5 + 9 + 6)/4

    a = np.array([4, 3, 7]).mean()
    b = np.array([3, 5, 9, 6]).mean()

mean() can do the same operation (i.e., compute the average) but on whatever numbers you give it.
It’s the same, only different.
It would be cool if we could write our own version of mean(), but one that does exactly what we want it to.

Functions – why do we care?

    A = np.array([1,2,3,4,5])
    t1 = A.sum()
    a1 = A.mean()
    d1 = A.std()
    print('Data for A =', t1, a1, d1)

    B = np.array([10,20,30])
    t2 = B.sum()
    a2 = B.mean()
    d2 = B.std()
    print('Data for B =', t2, a2, d2)

    C = np.array([100,200,300])
    t3 = C.sum()
    a3 = C.mean()
    d3 = C.std()
    print('Data for C =', t3, a3, d3)

    print('total =', t1+t2+t3)
    print('avg of avgs=', (a1+a2+a3)/3)
    print('avg of stds=', (d1+d2+d3)/3)

Each group of three sum/mean/std lines can be replaced by a single call:

    t1,a1,d1 = sm_mn_std(A)
    t2,a2,d2 = sm_mn_std(B)
    t3,a3,d3 = sm_mn_std(C)

Progress

    A = np.array([1,2,3,4,5])
    t1,a1,d1 = sm_mn_std(A)
    print('Data for A =', t1, a1, d1)

    B = np.array([10,20,30])
    t2,a2,d2 = sm_mn_std(B)
    print('Data for B =', t2, a2, d2)

    C = np.array([100,200,300])
    t3,a3,d3 = sm_mn_std(C)
    print('Data for C =', t3, a3, d3)

    print('total =', t1+t2+t3)
    print('avg of avgs=', (a1+a2+a3)/3)
    print('avg of stds=', (d1+d2+d3)/3)

We’ve moved from 14pt font to 18pt – that’s progress!
It’s now obvious that we’re doing the same thing (i.e., doing sm_mn_std()) to A, B and C.
But we still have to write our function sm_mn_std().

Coding our function

    def sm_mn_std(arr):
        sm = arr.sum()
        avg = arr.mean()
        dev = arr.std()
        return sm, avg, dev

- def says that this is a user-defined function.
- sm_mn_std is the name of the function.
- arr is the input(s).
- The indented lines are the body of the function.
- return says what the outputs of the function are.

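A runnable sketch of the same function, for trying it out without NumPy installed. It swaps in the standard-library statistics module (fmean and pstdev) for the array methods; that substitution is an assumption of this sketch, not what the slides use. pstdev is the population standard deviation, which is also what NumPy's std() computes by default.

```python
import statistics

def sm_mn_std(arr):
    """Return the sum, mean and population standard deviation of arr."""
    sm = sum(arr)
    avg = statistics.fmean(arr)    # stand-in for arr.mean()
    dev = statistics.pstdev(arr)   # stand-in for arr.std()
    return sm, avg, dev

t1, a1, d1 = sm_mn_std([1, 2, 3, 4, 5])
print('Data for A =', t1, a1, round(d1, 2))   # Data for A = 15 3.0 1.41
```

The three returned values are unpacked into t1, a1 and d1 in the same order that the return statement lists them.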
Functions as a little house

Do the demo with a little house to compute a function.

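Let's watch the gorilla in action

    A = np.array([1,2,3,4,5])
    t1,a1,d1 = sm_mn_std(A)
    print('Data for A =', t1, a1, d1)

    def sm_mn_std(arr):
        sm = arr.sum()
        avg = arr.mean()
        dev = arr.std()
        return sm,avg,dev

Inside the function: arr = [1,2,3,4,5], sm = 15, avg = 3.0, dev = 1.41
Back in the caller: A = [1,2,3,4,5], t1 = 15, a1 = 3.0, d1 = 1.41

Output (std rounded):

    Data for A = 15 3.0 1.41

Note that np.std() computes the population standard deviation by default; the sample standard deviation (ddof=1) would be about 1.58.
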
And another step

    A = np.array([1,2,3,4,5])
    t1,a1,d1 = sm_mn_std(A)
    print('Data for A =', t1, a1, d1)

    B = np.array([10,20,30])
    t2,a2,d2 = sm_mn_std(B)
    print('Data for B =', t2, a2, d2)

    C = np.array([100,200,300])
    t3,a3,d3 = sm_mn_std(C)
    print('Data for C =', t3, a3, d3)

    print('total =', t1+t2+t3)
    print('avg of avgs=', (a1+a2+a3)/3)
    print('avg of stds=', (d1+d2+d3)/3)

Can we go a step further, and suck up the print() statements into our sm_mn_std() function?
But each one is a bit different:
- prints different variables
- prints out a different array name
Any ideas how to deal with those issues?

First try

    def sm_mn_std(arr):
        sm = arr.sum()
        avg = arr.mean()
        dev = arr.std()
        print('Data for A =', t1, a1, d1)
        return sm,avg,dev

Does this work?
No, it’s an error. Inside of sm_mn_std(), the variables t1, a1 and d1 are not even defined: they belong to the caller, and on the first call they have not been assigned yet. More on that shortly; it's about scope.

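A small sketch of what that error looks like in practice. The names sm_mn_std_broken and t_first are made up for this illustration, and a plain list stands in for the NumPy array.

```python
def sm_mn_std_broken(arr):
    sm = sum(arr)
    # t_first belongs to the caller and has not been assigned yet,
    # so looking it up here raises NameError.
    print('Data for A =', t_first)
    return sm

try:
    t_first = sm_mn_std_broken([1, 2, 3, 4, 5])
    error_seen = False
except NameError as err:
    error_seen = True
    print('NameError:', err)
```

The assignment to t_first only happens after the function returns, which is why the name does not exist yet while the function body is running.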
Try #2

    def sm_mn_std(arr):
        sm = arr.sum()
        avg = arr.mean()
        dev = arr.std()
        print('Data for A =', sm, avg, dev)
        return sm,avg,dev

    A = np.array([1,2,3,4,5])
    t1,a1,d1 = sm_mn_std(A)
    B = np.array([10,20,30])
    t2,a2,d2 = sm_mn_std(B)
    C = np.array([100,200,300])
    t3,a3,d3 = sm_mn_std(C)

    print('total =', t1+t2+t3)
    print('avg of avgs=', (a1+a2+a3)/3)
    print('avg of stds=', (d1+d2+d3)/3)

What does this print out? (std values rounded)

    Data for A = 15 3.0 1.41
    Data for A = 60 20.0 8.16
    Data for A = 600 200.0 81.65
    total = …

What went wrong, and how might we fix it?

Try #3

    def sm_mn_std(arr,name):
        sm = arr.sum()
        avg = arr.mean()
        dev = arr.std()
        print('Data for', name, '=', sm, avg, dev)
        return sm,avg,dev

    A = np.array([1,2,3,4,5])
    t1,a1,d1 = sm_mn_std(A,'A')
    B = np.array([10,20,30])
    t2,a2,d2 = sm_mn_std(B,'B')
    C = np.array([100,200,300])
    t3,a3,d3 = sm_mn_std(C,'C')

    print('total =', t1+t2+t3)
    print('avg of avgs=', (a1+a2+a3)/3)
    print('avg of stds=', (d1+d2+d3)/3)

Is it fixed now? (std values rounded)

    Data for A = 15 3.0 1.41
    Data for B = 60 20.0 8.16
    Data for C = 600 200.0 81.65
    total = …

Let's watch the gorilla in action

    A = np.array([1,2,3,4,5])
    t1,a1,d1 = sm_mn_std(A,'A')

    def sm_mn_std(arr,name):
        sm = arr.sum()
        avg = arr.mean()
        dev = arr.std()
        print('Data for', name, '=', sm, avg, dev)
        return sm,avg,dev

Inside the function: arr = [1,2,3,4,5], name = 'A', sm = 15, avg = 3.0, dev = 1.41
Back in the caller: A = [1,2,3,4,5], t1 = 15, a1 = 3.0, d1 = 1.41

Output (std rounded):

    Data for A = 15 3.0 1.41

Number of outputs

A function can return no outputs, one output, or multiple. The syntax is slightly different.

    def f1(x):
        stuff

    def f2(x):
        stuff
        return out1

    def f3(x):
        stuff
        return out1,out2

    f1(3)
    a = f2(4)
    b,c = f3(5)

Why might we want a function that doesn’t return any outputs, anyway?

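To make the three cases concrete, here is a minimal sketch with made-up functions (show, square and square_and_cube are illustrative names, not part of the course code). A function with no return statement is typically called for its side effect, such as printing; in Python it still hands back the special value None.

```python
def show(x):
    # No return statement: called only for its side effect (printing)
    print('x is', x)

def square(x):
    # One output
    return x * x

def square_and_cube(x):
    # Multiple outputs, packed into one tuple and unpacked by the caller
    return x * x, x ** 3

r = show(3)                 # r is None
a = square(4)
b, c = square_and_cube(5)
print(r, a, b, c)           # None 16 25 125
```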
Scope: the problem

(These slides are for the kinetic-proofreading HW.)

Often people want to work together (really!).
In general, when a large team of people work together, it's difficult to avoid re-using variable names.
This is a really big problem.

Sudden youth?

    def f_me(inp):
        age = 55
        incr = f_you()+1
        # do other stuff
        return (age+incr)

    def f_you():
        age = 19
        # do other stuff
        return age+1

Say I write the function f_me(), and you write f_you(). How do we avoid tromping on each other?
What if I use a variable called age and set it to 55, and you use a variable called age and set it to 19?
Do I suddenly get a lot younger (and perhaps less wise) when my function calls f_you()?
Would the reverse happen if f_you() called f_me()?

Sudden youth?

    def f_me(inp):
        age = 55
        incr = f_you()+1      # sets incr to 20+1=21
        # do other stuff
        return (age+incr)

    def f_you():
        age = 19
        # do other stuff
        return age+1          # returns 19+1=20

We want to return 55+21=76. Do we actually return 19+21=40?
When f_you() sets age=19, does it also set age=19 in f_me()?

Scope

Variables have scope, i.e., who can access them and when. For this class, we'll keep it simple.
- Each function is its own world.
  - Any variables in one function are completely separate from variables in other functions.
  - Even if two different functions both use a variable called age, the two do not interact.
  - Even if two functions both use an input called input or an output z, they do not interact.
- Each time any given function is called, its variables start out uninitialized.
  - They do not keep their values from the last time that function was called (not actually quite true).

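The f_me()/f_you() question can be checked directly. This is a minimal runnable sketch: the "do other stuff" placeholders are dropped, and f_me() takes no input here for simplicity.

```python
def f_you():
    age = 19              # local to f_you()
    return age + 1        # returns 20

def f_me():
    age = 55              # local to f_me(); unrelated to f_you()'s age
    incr = f_you() + 1    # 20 + 1 = 21
    return age + incr     # age is still 55 here

result = f_me()
print(result)             # 76
```

The result is 76 rather than 40, because the age inside f_you() is a separate variable that disappears when f_you() returns.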
Scope is really really good

- It allows people to work together without stepping on each other's toes too much.
- It allows you to call a function without having to worry if somebody else called the function before you.
- Without scope, large programs would be almost impossible to write.

Functions calling other functions

We’ve seen functions calling other functions. What are some places this would be useful?
- Any time you build big pieces of functionality out of littler ones. This is called hierarchical design.
- HW #4 (kinetic proofreading) has a bunch of functions. Some of them are written by me, others by you.
Let’s take a look.

HW #4

HW #4 is in three files:
- kinetic_transcr.py is where you write your code.
- sim_infrastructure.py is a file I wrote; it contains the infrastructure for the chemical-reaction simulator.
- sim_library.py is another file I wrote; it contains some simple reactions (and their differential equations) that we use in HW4.

Functions and scope are great because:
- You get to use all of these functions that do useful things.
- You don’t have to write them.
- You do have to know how to call them, and what they do.

So let’s say you call the function add_metab('mRNA',1). How does Python know where to find it?
There are a million random files on the computer; which one should it get add_metab() from?

Packages

kinetic_transcr.py has the line

    import sim_infrastructure as si

- Tells Python to specifically find the file sim_infrastructure.py
- Gives it the nickname ‘si’

Next comes

    si.add_metab('mRNA',1)

- Tells Python where to look for add_metab()

What’s going on here? sim_infrastructure is called a package (strictly speaking, a single .py file is a "module", and a package is a collection of modules).
The cool thing about Python: there are thousands of useful packages that other people have already written, and that you can just use (like numpy).
HW4 uses matplotlib, a package that has functions to draw graphs.

Packages and scope

Packages are kind of like function scope.
- Two different functions might both have a variable named age.
  - Function scope means that the two ages are completely unrelated.
- Two different files may both have a function add_metab().
  - By saying si.add_metab(), we no longer care about any other function add_metab() that some random person wrote. We only care about the one in sim_infrastructure.py.
- The goal, in both cases: allow people to work together without stepping on each other’s toes.

(Last of the kinetic-proofreading slides.)

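The same idea can be tried with two modules that ship with Python. Both math and cmath define a function called sqrt(); this sketch uses them as stand-ins for "two files that both have add_metab()", and the nicknames m and cm are arbitrary choices.

```python
import math as m      # nickname m
import cmath as cm    # nickname cm

real_root = m.sqrt(4)          # math's sqrt: real numbers only
complex_root = cm.sqrt(-4)     # cmath's sqrt: also handles negative inputs
print(real_root, complex_root)   # 2.0 2j
```

The prefix says which sqrt() is meant, so the two functions never collide even though they share a name.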
Functions with lots of parameters

(These slides are for the Manduca HW.)

Define a function

    def manducaRun (n_gen, pop_size, n_matings, n_mutations, seed):
        …stuff…

And, later on, call it:

    manducaRun (100, 50, 10, 10, 5)

Problem: it’s kind of hard to remember what all those numbers mean, and what order they go in!
Quick mini quiz: what do the parameters mean?

Call by keyword

Define a function

    def manducaRun (n_gen, pop_size, n_matings, n_mutations, seed):
        …stuff…

And, later on, call it:

    manducaRun (n_gen=100, pop_size=50, n_matings=10, n_mutations=10, seed=5)

Or:

    manducaRun (seed=5, pop_size=50, n_gen=100, n_matings=10, n_mutations=10)

You can supply the parameters in any order at all!

Pros and cons:
- Easy to use and quite clear
- But it is a bit verbose

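A quick way to convince yourself that the order does not matter. The body below is a placeholder invented for this sketch (the real manducaRun() does the simulation); it simply reports which value each parameter received.

```python
def manducaRun(n_gen, pop_size, n_matings, n_mutations, seed):
    # Placeholder body (an assumption): report what each parameter received
    return {'n_gen': n_gen, 'pop_size': pop_size, 'n_matings': n_matings,
            'n_mutations': n_mutations, 'seed': seed}

by_position = manducaRun(100, 50, 10, 10, 5)
by_keyword = manducaRun(seed=5, pop_size=50, n_gen=100,
                        n_matings=10, n_mutations=10)
print(by_position == by_keyword)   # True
```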
Closely-related functions

    def random_manduca ():
        n_time_seg=10
        …

In most of our Manduca work, we assume that the 100-second simulation period is broken into 10 segments, i.e., we can send 10 commands to our muscles.
What if we want to experiment with sending commands every 1 second, for 100 total commands?

Closely-related functions

Option #1. Make two copies of random_manduca().

    def random_manduca ():
        n_time_seg=10
        …

    def random_manduca2 ():
        n_time_seg=100
        …

Pros and cons?
- It’s easy.
- We have more code.
- What if we find a bug in random_manduca() and forget to change it in random_manduca2()?

Closely-related functions

Option #2. Add a parameter.

    def random_manduca (n_time_seg):
        …

Call it as

    pop[i] = random_manduca(10)     # usually
    pop[i] = random_manduca(100)    # in the extra credit

Pros and cons?
- Now there’s only one version of random_manduca(), so we don’t have to worry about keeping two versions in sync.
- The people not doing the extra credit have to realize that there’s another parameter and remember what it means, even though, in their world, it never changes.

Closely-related functions

Option #3. Use a default parameter.

    def random_manduca (n_time_seg=10):
        …

Call it as

    pop[i] = random_manduca()       # usually
    pop[i] = random_manduca(100)    # in the extra credit

Pros and cons?
- Only one version of random_manduca().
- The people not doing the extra credit can ignore the parameter completely.

(End of the Manduca HW slides.)

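A runnable sketch of the default-parameter option. The function body is a placeholder invented for illustration (one random number per time segment); the real random_manduca() in the homework does something different.

```python
import random

def random_manduca(n_time_seg=10):
    # Placeholder body (an assumption): one random command per time segment
    return [random.random() for _ in range(n_time_seg)]

usual = random_manduca()             # n_time_seg takes its default, 10
extra_credit = random_manduca(100)   # default overridden
print(len(usual), len(extra_credit))   # 10 100
```

Callers who never pass the argument get the usual behavior, and the default can also be overridden by keyword, as in random_manduca(n_time_seg=100).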
Follow-up activities

- Try the examples from this lecture yourself.
- Vary them, or even mis-type some to see what happens.
- More exercises: check out http://codingbat.com/python

BACKUP

Recursive functions

A function can call itself! That seems a bit bizarre: why would it be useful?
The classic example is computing a factorial. Reminder: 5! = 1*2*3*4*5 = 120.

Factorial implementation #1:

    import math
    fact = math.factorial(n)    # Yes, Python's math module has a factorial function!

Factorial implementation #2:

    fact = 1
    for i in range(1, n+1):
        fact = fact*i

Recursive factorial

    def factorial(n):
        if n <= 1:
            fact = 1
        else:
            fact = n * factorial(n-1)
        return fact

Seems obvious that it works, right?
But under the hood, there are multiple copies of this function all working at once!?!
Let’s walk through it.

Factorial(3) walkthrough

    def factorial(n):
        if n <= 1:
            fact = 1
        else:
            fact = n * factorial(n-1)
        return fact

Variables in this call: n = 3, fact not yet set.

n is 3, which is not <=1.
Now we’ve hit our recursion! We’re going to start a new call to factorial(), but we’re not done with this one yet!
We must remember where we are in this function, and start a new one.

Factorial (2)

    def factorial(n):
        if n <= 1:
            fact = 1
        else:
            fact = n * factorial(n-1)
        return fact

Variables in this call: n = 2, fact not yet set.

n is 2, which is not <=1.
Now we’ve hit our recursion!
Remember where we are in this function… and start again.

Factorial (1)

    def factorial(n):
        if n <= 1:
            fact = 1
        else:
            fact = n * factorial(n-1)
        return fact

Variables in this call: n = 1, fact = 1.

n is 1, which is <=1. Finally!
Cool. We actually finished the function.
But remember, we got called from factorial(2). Time to finish that unfinished business…

Factorial (2)

    def factorial(n):
        if n <= 1:
            fact = 1
        else:
            fact = n * factorial(n-1)
        return fact

Variables in this call: n = 2, fact = 2.

This is where we were. We called factorial(1) and it just returned 1. So n*1 is 2*1, or 2.
Now we finished this function. We’ve calculated that factorial(2)=2.
But we still have unfinished business (yet again)…

Factorial (3) walkthrough

    def factorial(n):
        if n <= 1:
            fact = 1
        else:
            fact = n * factorial(n-1)
        return fact

Variables in this call: n = 3, fact = 6.

This is where we were. We called factorial(2) and it just returned 2. So n*2 is 3*2, or 6.
No more unfinished business. We’ve calculated that 3!=6.
That was kind of a lot of work for a simple job. And we had two easier ways to do it. So what good is this recursion stuff?

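The walkthrough above can be watched as it runs. This sketch adds a depth parameter and print() calls purely for tracing; they are additions for illustration and are not needed to compute the factorial.

```python
def factorial(n, depth=0):
    indent = '  ' * depth              # deeper calls are indented further
    print(f'{indent}factorial({n}) called')
    if n <= 1:
        fact = 1
    else:
        # This call pauses here until the inner call returns
        fact = n * factorial(n - 1, depth + 1)
    print(f'{indent}factorial({n}) returns {fact}')
    return fact

result = factorial(3)
print(result)                          # 6
```

Each level of indentation in the output is one paused call, with its own separate n and fact, waiting for the call beneath it to finish.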
Combinations

How many ways can you arrange the numbers 1,2,3?

    1,2,3
    1,3,2
    2,1,3
    2,3,1
    3,1,2
    3,2,1

6 ways. Can we write a program to generate these?

Combinations of 1,2,3

    def combinations3():
        for i in range(1, 4):
            for j in range(1, 4):
                for k in range(1, 4):
                    # keep only the triples with no repeated number
                    if i != j and i != k and j != k:
                        print(i, j, k)

Group activity

Try to write

    def combinations(n):

that outputs all combinations (strictly speaking, these orderings are called 'permutations') of n numbers.
This one is not so easy!

One solution

    def combinations(n):
        recurse([], list(range(1, n+1)))

    def recurse(used, not_used):
        if len(not_used) == 0:
            print(used)
        for i in range(len(not_used)):
            this_one = not_used[i]
            used2 = used + [this_one]
            # copy of not_used with element i removed
            not_used2 = not_used[:i] + not_used[i+1:]
            recurse(used2, not_used2)

I've not actually tried this. Feel free to try it yourself & find my bugs for me.

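One way to hunt for bugs is to compare against a known-good answer. The sketch below is a variant of the recursive idea that collects its results in a list instead of printing them (the extra found parameter is an addition for this purpose), and then checks them against itertools.permutations from the standard library.

```python
import itertools

def combinations(n):
    found = []
    recurse([], list(range(1, n + 1)), found)
    return found

def recurse(used, not_used, found):
    if len(not_used) == 0:
        found.append(tuple(used))      # one complete ordering
        return
    for i in range(len(not_used)):
        # Move element i from not_used to used, then recurse on the rest
        recurse(used + [not_used[i]],
                not_used[:i] + not_used[i + 1:],
                found)

result = combinations(3)
print(result)
print(result == list(itertools.permutations([1, 2, 3])))   # True
```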