Introduction to Computational Biology Programming with Matlab
Why Program? Programming a computer provides: Accuracy Reduce human error Speed and efficiency Processing the human genome manually is impossible Adding all numbers from is a waste of our time Repetition Same program can be run many times using different input data Automation
Anatomy of a program InputInstructionsOutput DataSteps encoded in a programming language Results – readable by you or another program 3, 4Sum7 GAGCAGACACGTTTTTGGTGGTTTT GAGCAGACATTTTTTGATTCTGGTTTT Align GACAT-TTTTTGATTCTGGTTTT GACACGTTTTTGG---TGGTTTT |||| |||||| |||||||
Computing A computable task Does it make sense to create a program to solve my problem? Text Editor A program that allows you to save text files Emacs (unix), vi (unix), gedit (unix), notepad (Windows) Compiler/Interpreter gcc (C), javac (Java), perl (Perl)
Strings “This is a string” ‘This is also a string’ “ACGACTACGACTAGCATCAGCATCAG”
vectors >> v = [3 1] v = 3 1
vectors >> v = [3 1]; >>
Row vectors >> v = [ ] v =
column vectors >> v = [ ]' v =
A new vector >> v = [1:8] v =
A new vector with increments of 0.25 >> v = [2:.25:4] v = Columns 1 through Columns 8 through
Accessing vector elements >> v(1) ans = 2
Basic operations >> v v = >> v(1:3) ? >> v(2:4) ? >> v(1:3)-v(2:4) ans =
Basic operations >> u = [0:-1:-4] u = >> -2*u ans =
A matrix: n x n (row x column) >> A = [ 1 2 3; 3 4 5; 6 7 8] A =
Indexing a matrix >> A(1:2,3:4) ??? Index exceeds matrix dimensions. >> A(1:2,2:3) ans = >> A(1:2,2:3)' ans = A =
Vector operations >> v = [1 2 3]' v = >> b = [2 4 6]' b = >> v+b ans = >> v-b ans = -2 -3
Plotting >> v v = >> b b = >> plot(v,b)
Loops >> for j=1:4, j end j = 1 j = 2 j = 3 j = 4
Loops >> v = [1:3:10] v = >> for j=1:4, v(j) = j; end >> v v =
If statements a = 2; b = 3; if (a<b) j = -1; end
If and elseif statements a = 4; b = 3; if (a<b) j = -1; else if (a>b) j = 2; end
Character arrays >> c = ’atcg’
Example 1: find the complement function c = watson_crick(v) %This is a function m-file for finding the Watson-Crick complement of a %string of nucleotide symbols % %Author: Winfried Just % Department of Mathematics % Ohio University % Athens, OH % % %Date: March 31, 2005 %Input: v - string of letters from the alphabet {a, c, g, t} %Output: c - the string of Watson-Crick complements to the terms of v %
Example 1 cont. function c = watson_crick(v) for i = 1:length(v) if v(i) == 'a' c(i) = 't'; elseif v(i) == 'c' c(i) = 'g'; elseif v(i) == 'g' c(i) = 'c'; elseif v(i) == 't' c(i) = 'a'; else c(i) = '?'; disp('Non-nucleotide symbol encountered'); end
Run watson_crick >> watson_crick (’accgatgcttatggatc’)
Example 2: find the AUG function start_codon(v) %This is a function m-file for finding the position of the first start codon % %Author: Winfried Just % Department of Mathematics % Ohio University % Athens, OH % % %Date: March 31, 2005 %Input: v - string of letters from the alphabet {a, c, g, t} %Output: a message that shows the position of the first % start codon in v %
Example 2: find the AUG function start_codon(v) found = 0; %this variable will tell us when we have found a start codon i = 1; while ~found & i < length(v) - 1 if v(i) == 'a' & v(i+1) == 't' & v(i+2) == 'g' found = 1; disp(['start codon at position ', num2str(i)]) else i = i+1; end if ~found disp('no start codon found') end
Run start_codon >> start_codon(’accgatgcttatggatc’)
Examples relevant to biology Log-log plot b/math266a/266A-fitting-logplot.pdf b/math266a/266A-fitting-logplot.pdf