Published by Dwayne Daniels. Modified over 9 years ago.
1
Perl Part I: A Biology Primer
2
Conceptual Biology
  H. sapiens did not create the genetic code – but they did invent the transistor
  Biological life is not optimized – the modern synthesis
  Nature vs. Nurture
  What are the best ways to understand the important differences that make the difference?
3
A Molecular Primer
  Hierarchy of the eukaryote
    Organism > System > Organ > Tissue > Cell > Organelle > Protein > RNA > DNA
  Put simply: DNA → RNA → Protein
4
The Building Blocks
  DNA is composed of four building blocks
    Nucleic acids, nucleotides, bases
    Adenine, Cytosine, Guanine, Thymine
  RNA also has four building blocks
    Adenine, Cytosine, Guanine, Uracil
  Proteins are composed of 20 building blocks
    Amino acids, residues
    Fragments of proteins are called peptides
  DNA, RNA and proteins are polymers
5
Code  Nucleic Acid(s)  w/ Sugar    w/ Phosphate
A     Adenine          Adenosine   Adenylic Acid
C     Cytosine         Cytidine    Cytidylic Acid
G     Guanine          Guanosine   Guanylic Acid
T     Thymine          Thymidine   Thymidylic Acid
U     Uracil           Uridine     Uridylic Acid

Code  Nucleic Acid(s)
M     A or C (amino)
R     A or G (purine)
W     A or T (weak)
S     C or G (strong)
Y     C or T (pyrimidine)
K     G or T (keto)
V     A or C or G
H     A or C or T
D     A or G or T
B     C or G or T
N     A, G, C, T (any)
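The ambiguity codes listed above map naturally onto a Perl hash. The following is a minimal sketch, not something taken from the slides: the names `%iupac` and `iupac_to_regex` are hypothetical, and the idea is simply to turn a pattern written with ambiguity codes into a regular expression.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each nucleotide code from the slide, mapped to the bases it stands for.
my %iupac = (
    A => 'A',   C => 'C',   G => 'G',   T => 'T',   U => 'U',
    M => 'AC',  R => 'AG',  W => 'AT',  S => 'CG',  Y => 'CT',  K => 'GT',
    V => 'ACG', H => 'ACT', D => 'AGT', B => 'CGT', N => 'ACGT',
);

# Hypothetical helper: convert a string of codes into a regex string,
# using a character class wherever a code stands for more than one base.
sub iupac_to_regex {
    my ($pattern) = @_;
    my $regex = '';
    foreach my $code (split //, uc $pattern) {
        die "Unknown nucleotide code: $code\n" unless exists $iupac{$code};
        my $bases = $iupac{$code};
        $regex .= length($bases) > 1 ? "[$bases]" : $bases;
    }
    return $regex;
}

my $site = iupac_to_regex('GAYW');
print "Pattern GAYW as a regex: $site\n";
print "GACT matches GAYW\n" if 'GACT' =~ /^$site$/;
```

Building a regex string keeps the lookup table as the single place where the codes are defined.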
6
DNA → RNA
A=T→A  C=G→C  G=C→G  C=G→C  T=A→U  T=A→U  M=K→M  W=W→?  N=N→N  C=G→C
C=G→C  T=A→U  Y=R→?  B=V→?  N=N→N  K=M→?  S=S→S  T=A→U  T=A→U
7
One Dimensional
Two Dimensional
Three Dimensional
9
DNA → RNA
A=T→A  T=A→U  G=C→G  C=G→C  T=A→U  T=A→U  M=K→M  W=W→?  N=N→N  C=G→C
C=G→C  T=A→U  Y=R→?  B=V→?  N=N→N  K=M→?  S=S→S  T=A→U  T=A→U
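The base pairings shown above, including those for the ambiguity codes, can be expressed with Perl's `tr` operator. This is a sketch under stated assumptions: the sub names `complement_iupac` and `transcribe` are hypothetical. The transcription step also leaves ambiguity codes unchanged, whereas the slide marks codes that may contain T with `?`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Complement a DNA string, including ambiguity codes:
# M<->K, R<->Y, V<->B, H<->D, while W, S and N pair with themselves.
sub complement_iupac {
    my ($dna) = @_;
    (my $complement = uc $dna) =~ tr/ACGTMRWSYKVHDBN/TGCAKYWSRMBDHVN/;
    return $complement;
}

# Transcribe a coding-strand DNA string to RNA by replacing T with U.
# Assumption: ambiguity codes are passed through untouched.
sub transcribe {
    my ($dna) = @_;
    (my $rna = uc $dna) =~ tr/T/U/;
    return $rna;
}

my $dna = 'ATGCTTMWNCCTYBNKSTT';    # the sequence from the slide
print "DNA:        $dna\n";
print "Complement: ", complement_iupac($dna), "\n";
print "RNA:        ", transcribe($dna), "\n";
```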
11
One-Letter Code  Amino Acid                   Three-Letter Code
A                Alanine                      Ala
B                Aspartic acid or Asparagine  Asx
C                Cysteine                     Cys
D                Aspartic acid                Asp
E                Glutamic acid                Glu
F                Phenylalanine                Phe
G                Glycine                      Gly
H                Histidine                    His
I                Isoleucine                   Ile
K                Lysine                       Lys
L                Leucine                      Leu
M                Methionine                   Met
N                Asparagine                   Asn
P                Proline                      Pro
Q                Glutamine                    Gln
R                Arginine                     Arg
S                Serine                       Ser
T                Threonine                    Thr
V                Valine                       Val
W                Tryptophan                   Trp
X                Unknown                      Xxx
Y                Tyrosine                     Tyr
Z                Glutamic acid or Glutamine   Glx
12
RNA codons, read from the first base of the sequence above:
AUG -> Met (Start)
CUU -> Leu
MWN: AA?, AU?, CA?, CU? -> Asn, Lys, Ile, Met, His, Gln, Leu
CCU -> Pro
YBN: UU?, UG?, UC?, CU?, CG?, CC? -> Phe, Leu, Cys, Stop, Trp, Ser, Leu, Arg, Pro
KSU: UCU, UGU, GCU, GGU -> Ser, Cys, Ala, Gly
13
RNA codons, read from the second base of the sequence above:
UGC -> Cys
UUM -> Phe, Leu
WNC: A?C, U?C -> Ile, Thr, Asn, Ser, Phe, Ser, Tyr, Cys
CUY -> Leu
BNK: U?U, U?G, C?U, C?G -> Phe, Ser, Tyr, Cys, Leu, Stop, Trp, Leu, Pro, His, Arg, Gln
SUU: GUU, CUU -> Val, Leu
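Working out which amino acids an ambiguous codon can encode is easy to get wrong by hand, so it may help to let Perl enumerate the possibilities. The following is a sketch, assuming the standard genetic code; the names `%iupac_rna`, `%codon_table` and `amino_acids_for` are hypothetical and do not come from the slides. Amino acids are reported as one-letter codes, with `*` for Stop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# RNA bases that each nucleotide code can stand for.
my %iupac_rna = (
    A => ['A'],       C => ['C'],       G => ['G'],       U => ['U'],
    M => [qw(A C)],   R => [qw(A G)],   W => [qw(A U)],   S => [qw(C G)],
    Y => [qw(C U)],   K => [qw(G U)],   V => [qw(A C G)], H => [qw(A C U)],
    D => [qw(A G U)], B => [qw(C G U)], N => [qw(A C G U)],
);

# Standard genetic code as one-letter amino acids ('*' = Stop),
# with codons enumerated in U, C, A, G order at each position.
my @rna_bases = qw(U C A G);
my $amino     = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $index = 0;
for my $b1 (@rna_bases) {
    for my $b2 (@rna_bases) {
        for my $b3 (@rna_bases) {
            $codon_table{"$b1$b2$b3"} = substr($amino, $index++, 1);
        }
    }
}

# Return the sorted, de-duplicated one-letter amino acids that an
# ambiguous codon such as 'MWN' could encode.
sub amino_acids_for {
    my ($codon) = @_;
    my @positions;
    foreach my $code (split //, uc $codon) {
        die "Unknown nucleotide code: $code\n" unless exists $iupac_rna{$code};
        push @positions, $iupac_rna{$code};
    }
    die "A codon needs exactly three letters\n" unless @positions == 3;

    my %seen;
    for my $b1 (@{ $positions[0] }) {
        for my $b2 (@{ $positions[1] }) {
            for my $b3 (@{ $positions[2] }) {
                $seen{ $codon_table{"$b1$b2$b3"} } = 1;
            }
        }
    }
    return join '', sort keys %seen;
}

print "$_ -> ", amino_acids_for($_), "\n" for qw(AUG CUU MWN CCU YBN KSU);
```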
14
DNA → RNA → Protein
15
Lecture II Part II: One-Dimensional Strings
16
Hello World…
  A few perls of wisdom
  Concatenating Sequences
  Making a reverse complement
  Read sequences from data files
17
Every journey starts with a first 10bp
    #!/usr/bin/perl -w
    # Storing DNA in a variable, and printing it out

    # First, store the DNA in a variable called $DNA
    $DNA = 'CGGGCTATTC';

    # Next, print the DNA onto the screen
    print $DNA;

    # Finally, explicitly tell the program to end
    exit;
21
Concatenating DNA Fragments
    #!/usr/bin/perl -w
    # Store DNA in 2 variables
    $DNA1 = 'AGTGCGTCGCTAG';
    $DNA2 = 'ACCGCATGCATTG';

    # Concatenate using string interpolation
    $DNA3 = "$DNA1$DNA2";
    print "$DNA3\n\n";

    # Concatenate using the dot operator
    $DNA3 = $DNA1 . $DNA2;
    print "$DNA3\n\n";

    # Or print the two fragments one after the other as a list
    print $DNA1, $DNA2, "\n";

    exit;
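Two further ways to concatenate are worth knowing once there are more than two fragments: `join` and the `.=` append operator. This is a small sketch; the sub name `concat_fragments` is hypothetical and not part of the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Join any number of fragments into one sequence.
sub concat_fragments {
    my @fragments = @_;
    return join '', @fragments;
}

# Append to an existing string in place with .=
my $contig = 'AGTGCGTCGCTAG';
$contig .= 'ACCGCATGCATTG';
print "$contig\n";

# The same result from a list of fragments
print concat_fragments('AGTGCGTCGCTAG', 'ACCGCATGCATTG'), "\n";
```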
22
Transcription: DNA to RNA
    #!/usr/bin/perl -w
    $DNA = 'ACGACTGCACGATCGTACG';

    # Print the DNA onto the screen
    print "$DNA\n\n";

    # Transcribe the DNA->RNA by substituting all T's with U's
    $RNA = $DNA;
    $RNA =~ s/T/U/g;

    # Print the result to the screen
    print "Here is the result of DNA->RNA:\t$RNA\n\n";

    exit;
23
$RNA =~ s/T/U/g;
  $RNA   Variable
  =~     Binding operator
  s      Substitute operator
  /      Delimiters that separate the parts of the operator
  T      Pattern to be replaced
  U      Replacement text for the matched pattern
  g      Pattern modifier
Pattern modifiers
  g = globally
  i = case insensitive
  m = multiline
  s = single line
  x = permit comments
  o = compile only once for speed
  e = treat replacement as Perl code
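To see what the modifiers change in practice, here is a short sketch comparing a substitution with and without `g`, and with `i` added. The sub names `replace_first` and `transcribe_any_case` are hypothetical. Note that `s///` returns the number of substitutions it made, which is used here to count the replaced bases.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Without g, only the first match is replaced.
sub replace_first {
    my ($dna) = @_;
    (my $copy = $dna) =~ s/T/U/;
    return $copy;
}

# With g and i, every T or t is replaced (always by an uppercase U).
# Returns the new string and the number of substitutions made.
sub transcribe_any_case {
    my ($dna) = @_;
    my $rna   = $dna;
    my $count = ($rna =~ s/t/U/gi) || 0;
    return ($rna, $count);
}

print replace_first('TTT'), "\n";
my ($result, $replaced) = transcribe_any_case('acgTtA');
print "$result ($replaced substitutions)\n";
```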
24
Calculating the Reverse Complement
    #!/usr/bin/perl -w
    $DNA = 'ACGTCAGTCGAGCT';

    # Print the starting DNA onto the screen
    print "Here is the starting DNA:\t$DNA\n\n";

    # Calculate the reverse complement: first reverse the DNA into
    # a new variable called $revcom
    $revcom = reverse $DNA;

    # Substitute all bases by their complement
    # Note: this does not work as intended. Each substitution acts on the
    # result of the previous one, so every A becomes T and then every T
    # (including the new ones) becomes A again; likewise for C and G.
    $revcom =~ s/A/T/g;
    $revcom =~ s/T/A/g;
    $revcom =~ s/C/G/g;
    $revcom =~ s/G/C/g;

    print "$revcom\n";
25
Calculating the Reverse Complement
    #!/usr/bin/perl -w
    $DNA = 'ACGTCAGTCGAGCT';

    # Print the starting DNA onto the screen
    print "Here is the starting DNA:\t$DNA\n\n";

    # Calculate the reverse complement: first reverse the DNA into
    # a new variable called $revcom
    $revcom = reverse $DNA;

    # Translate all bases to their complement in a single pass
    $revcom =~ tr/ACGTacgt/TGCAtgca/;

    print "$revcom\n";
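The difference between the two approaches is easiest to see side by side. The sketch below wraps each in a sub; the names `reverse_complement` and `sequential_substitution` are hypothetical. It shows that chained `s///` calls collapse the sequence to two letters, while `tr` gives the expected result.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reverse complement using tr, which maps every base in a single pass.
sub reverse_complement {
    my ($dna) = @_;
    my $revcom = reverse $dna;    # scalar assignment, so the string is reversed
    $revcom =~ tr/ACGTacgt/TGCAtgca/;
    return $revcom;
}

# The chained-substitution approach, kept only to show why it fails:
# each s///g operates on the output of the previous one.
sub sequential_substitution {
    my ($dna) = @_;
    my $revcom = reverse $dna;
    $revcom =~ s/A/T/g;
    $revcom =~ s/T/A/g;
    $revcom =~ s/C/G/g;
    $revcom =~ s/G/C/g;
    return $revcom;
}

my $dna = 'ACGTCAGTCGAGCT';
print "Starting DNA:     $dna\n";
print "With tr:          ", reverse_complement($dna), "\n";
print "With chained s//: ", sequential_substitution($dna), "\n";
```

A quick sanity check for any reverse-complement routine is that applying it twice returns the original sequence.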
26
Reading Data from Files
#### Sample Data in FASTA Format ####
>NM_012345 | Sample Data | Muppet Stuffing Protein
MNIDDKLEFGDEMGOSSRTMV
FGDLVRSMPHOEILAADEVLISHEE
GLOYAKLEFGDEMGOGHDDEFGVY
27
Reading Files
    #!/usr/bin/perl -w
    # The filename of the file containing the sequence data
    $proteinFilename = 'NM_012345.pep';

    # Open the file, and associate a 'filehandle' with it
    open(PROTEINFILE, '<', $proteinFilename)
        or die "Cannot open $proteinFilename: $!";

    # Read from the file with the input operator <>
    # (in scalar context this reads only the first line)
    $muppetProtein = <PROTEINFILE>;

    # Print what was read
    print "Here is the protein:\t$muppetProtein\n\n";

    exit;
29
Let's try this again…
    #!/usr/bin/perl -w
    $proteinFilename = 'NM_012345.pep';
    open(PROTEINFILE, '<', $proteinFilename)
        or die "Cannot open $proteinFilename: $!";

    # Each use of <PROTEINFILE> in scalar context reads the next line
    $muppetProtein = <PROTEINFILE>;
    print "Here is the first line:\t$muppetProtein\n\n";

    $muppetProtein = <PROTEINFILE>;
    print "Here is the second line:\t$muppetProtein\n\n";

    $muppetProtein = <PROTEINFILE>;
    print "Here is the third line:\t$muppetProtein\n\n";

    close PROTEINFILE;
    exit;
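Repeating the read statement for every line does not scale, so a `while` loop is the usual next step. The sketch below is an illustration rather than part of the slides: `read_lines` is a hypothetical name, and to keep the example self-contained it reads from an in-memory string instead of the `NM_012345.pep` file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read every line from a filehandle, stripping the trailing newline.
sub read_lines {
    my ($fh) = @_;
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        push @lines, $line;
    }
    return @lines;
}

# Sample data from the slide, held in a string for this sketch.
my $sample = join "\n",
    '>NM_012345 | Sample Data | Muppet Stuffing Protein',
    'MNIDDKLEFGDEMGOSSRTMV',
    'FGDLVRSMPHOEILAADEVLISHEE',
    'GLOYAKLEFGDEMGOGHDDEFGVY',
    '';

# Opening a reference to a scalar gives an in-memory filehandle.
open(my $fh, '<', \$sample) or die "Cannot open in-memory data: $!";
my @lines = read_lines($fh);
close $fh;

printf "Line %d: %s\n", $_ + 1, $lines[$_] for 0 .. $#lines;
```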
30
Using Arrays to Read Files
    #!/usr/bin/perl -w
    $proteinFilename = 'NM_012345.pep';

    # Open the file
    open(PROTEINFILE, '<', $proteinFilename)
        or die "Cannot open $proteinFilename: $!";

    # Read the sequence data from the file, and store it in the array
    # variable @protein (in list context <> reads all lines at once)
    @protein = <PROTEINFILE>;

    # Print the protein onto the screen
    print @protein;

    close PROTEINFILE;
    exit;
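Once the lines are available, the header and the sequence usually need to be separated. The following is a minimal sketch, assuming a file with a single FASTA record whose header line starts with `>`; the name `parse_fasta` is hypothetical, and an in-memory string again stands in for the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one FASTA record from a filehandle.
# Returns the header (without '>') and the sequence joined into one string.
sub parse_fasta {
    my ($fh) = @_;
    my ($header, $sequence) = ('', '');
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;       # skip blank lines
        if ($line =~ /^>(.*)/) {        # header line
            $header = $1;
            next;
        }
        $line =~ s/\s+//g;              # drop any stray whitespace
        $sequence .= $line;
    }
    return ($header, $sequence);
}

my $fasta = ">NM_012345 | Sample Data | Muppet Stuffing Protein\n"
          . "MNIDDKLEFGDEMGOSSRTMV\n"
          . "FGDLVRSMPHOEILAADEVLISHEE\n"
          . "GLOYAKLEFGDEMGOGHDDEFGVY\n";

open(my $fh, '<', \$fasta) or die "Cannot open in-memory data: $!";
my ($header, $protein) = parse_fasta($fh);
close $fh;

print "Header:   $header\n";
print "Sequence: $protein\n";
print "Length:   ", length($protein), "\n";
```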
31
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Now print each element of the array
    print "\nFirst element: ", $bases[0];
    print "\nSecond element: ", $bases[1];
    print "\nThird element: ", $bases[2];
    print "\nFourth element: ", $bases[3];
32
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Now print each element of the array in a row
    print "\nHere are all of the bases: ", @bases;
    # This prints out: 'Here are all of the bases: ACGT'

    # But you can print them out with spaces in between
    # by interpolating the array inside double quotes
    print "\nHere they are with spaces: ", "@bases";
33
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how to take an element off of the end
    $base1 = pop @bases;
    print "Here's the last element: ", $base1, "\n\n";

    # The other elements still remain
    print "\nHere are the remaining elements: ", "@bases";
34
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how to take an element off of the front
    $base2 = shift @bases;
    print "Here's the first element: ", $base2, "\n\n";

    # The other elements still remain
    print "\nHere are the remaining elements: ", "@bases";
35
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how you put an element at the beginning of an array
    # Our example will put the last element at the beginning
    $base1 = pop @bases;
    unshift(@bases, $base1);
    print "Here's the last element put first: ", "@bases\n\n";
36
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how you put an element at the end of an array
    # Our example will put the first element at the end
    $base1 = shift @bases;
    push(@bases, $base1);
    print "Here's the first element put last: ", "@bases\n\n";
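Combining `shift` with `push`, or `pop` with `unshift`, rotates an array by one position, and repeating the step rotates it further. A small sketch, with the hypothetical sub names `rotate_left` and `rotate_right`:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Move the first element to the end, $n times.
sub rotate_left {
    my ($n, @items) = @_;
    return @items unless @items;          # nothing to rotate
    my $steps = $n % scalar(@items);      # full turns change nothing
    for (1 .. $steps) {
        push @items, shift @items;
    }
    return @items;
}

# Move the last element to the front, $n times.
sub rotate_right {
    my ($n, @items) = @_;
    return @items unless @items;
    my $steps = $n % scalar(@items);
    for (1 .. $steps) {
        unshift @items, pop @items;
    }
    return @items;
}

my @bases = ('A', 'C', 'G', 'T');
print "Rotated left once:  @{[ rotate_left(1, @bases) ]}\n";
print "Rotated right once: @{[ rotate_right(1, @bases) ]}\n";
```

Both subs work on a copy of the list, so the caller's array is left unchanged.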
37
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how to reverse an array
    @reverse = reverse @bases;

    # Here's how to get the length
    print scalar @bases, "\n\n";

    # Here's how to insert an element at an arbitrary place
    # (at index 2, removing 0 elements; @bases becomes A C X G T)
    splice(@bases, 2, 0, 'X');
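`splice` can remove and replace elements as well as insert them, depending on its length and replacement arguments. The sketch below illustrates the three uses; the sub names `insert_at`, `remove_at` and `replace_at` are hypothetical wrappers, not Perl built-ins.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Insert $item before position $index (length 0 means nothing is removed).
sub insert_at {
    my ($index, $item, @list) = @_;
    splice(@list, $index, 0, $item);
    return @list;
}

# Remove the single element at position $index.
sub remove_at {
    my ($index, @list) = @_;
    splice(@list, $index, 1);
    return @list;
}

# Replace the single element at position $index with $item.
sub replace_at {
    my ($index, $item, @list) = @_;
    splice(@list, $index, 1, $item);
    return @list;
}

my @bases = ('A', 'C', 'G', 'T');
print "Inserted: @{[ insert_at(2, 'X', @bases) ]}\n";
print "Removed:  @{[ remove_at(2, @bases) ]}\n";
print "Replaced: @{[ replace_at(3, 'U', @bases) ]}\n";
```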
38
Arrays
    # Arrays can be evaluated as lists and scalars
    @bases = ('A', 'C', 'G', 'T');

    # Here's how to print the array
    print "@bases\n";

    # Here's how to assign it to a scalar
    # (scalar context gives the number of elements: 4)
    $a = @bases;
    print $a;

    # Here's how to assign an array to a list
    # (list context gives the first element: A)
    ($a) = @bases;
    print $a;
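The parentheses on the left-hand side are what switch the assignment from scalar to list context. A short sketch of both cases, with the hypothetical sub names `count_elements` and `first_and_rest`:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Scalar context: assigning an array to a scalar gives its length.
sub count_elements {
    my @list  = @_;
    my $count = @list;
    return $count;
}

# List context: elements are assigned position by position, and a
# trailing array on the left-hand side collects whatever is left.
sub first_and_rest {
    my @list = @_;
    my ($first, @rest) = @list;
    return ($first, scalar @rest);
}

my @bases = ('A', 'C', 'G', 'T');
print "Number of bases: ", count_elements(@bases), "\n";

my ($first, $remaining) = first_and_rest(@bases);
print "First base: $first, with $remaining more after it\n";
```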