Bioinformatics: Theory and Practice (生物信息学理论和实践). Tang Jijun (唐继军), 13928761660.


More Conditions

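Use ==, <, <=, >, >=, != for numeric comparisons
Use eq, lt, le, gt, ge, ne for string comparisons
Combine conditions with ||, && (high precedence) or with or, and (low precedence); both forms work with numeric and string comparisons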
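The choice of operator, not the value, decides how Perl compares two operands. The short sketch below shows the same pair of values giving different answers under numeric and string comparison; the variable names are illustrative and do not come from the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The operator decides whether operands are treated as numbers or strings.
my $a_num = 10;
my $b_num = 9;

# Numeric comparison: 10 is greater than 9.
my $numeric_gt = ($a_num > $b_num) ? 1 : 0;

# String comparison: "10" sorts before "9" because '1' comes before '9'.
my $string_gt = ($a_num gt $b_num) ? 1 : 0;

# '1.0' and '1' are equal as numbers but differ as strings.
my $num_equal = ('1.0' == '1') ? 1 : 0;
my $str_equal = ('1.0' eq '1') ? 1 : 0;

print "10 >  9      : $numeric_gt\n";
print "10 gt 9      : $string_gt\n";
print "'1.0' == '1' : $num_equal\n";
print "'1.0' eq '1' : $str_equal\n";
```

For sequence data, which is text, the string operators (eq, ne) are usually the ones wanted, as in the base-by-base tests later in the slides.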

More Arithmetic
+, -, *, **, /, %
+=, -=, *=, **=, /=, %=
++, --

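    $x = 28;
    $x = $x/2;            # 14
    print $x/=2, "\n";    # prints 7
    print $x--, "\n";     # prints 7, then $x is 6
    print $x, "\n";       # prints 6
    print --$x, "\n";     # $x is 5, prints 5
    print $x, "\n";       # prints 5
    print $x % 3, "\n";   # prints 2
    print $x**2, "\n";    # prints 25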

    #!/usr/bin/perl -w
    print "Please type the filename of the DNA sequence data: ";
    $dna_filename = <STDIN>;
    chomp $dna_filename;
    open(DNAFILE, $dna_filename);
    $name = <DNAFILE>;        # FASTA header line
    @DNA = <DNAFILE>;         # remaining lines hold the sequence
    close DNAFILE;
    $DNA = join('', @DNA);
    $DNA =~ s/\s//g;
    $count_of_CG = 0;
    $position = 0;
    while ( $position < length $DNA) {
        $base = substr($DNA, $position, 1);
        if ( $base eq 'C' or $base eq 'G') {
            ++$count_of_CG;
        }
        $position++;
    }
    print "CG content is ", $count_of_CG/(length $DNA)*100, "%\n";

    #!/usr/bin/perl -w
    print "Please type the filename of the DNA sequence data: ";
    $dna_filename = <STDIN>;
    chomp $dna_filename;
    open(DNAFILE, $dna_filename);
    $name = <DNAFILE>;
    @DNA = <DNAFILE>;
    close DNAFILE;
    $DNA = join('', @DNA);
    $DNA =~ s/\s//g;
    $count_of_CG = 0;
    for ( $position = 0 ; $position < length $DNA ; ++$position ) {
        $base = substr($DNA, $position, 1);
        if ( $base eq 'C' or $base eq 'G') {
            ++$count_of_CG;
        }
    }
    print "CG content is ", $count_of_CG/(length $DNA)*100, "%\n";

    #!/usr/bin/perl -w
    print "Please type the filename of the DNA sequence data: ";
    $dna_filename = <STDIN>;
    chomp $dna_filename;
    open(DNAFILE, $dna_filename);
    $name = <DNAFILE>;
    @DNA = <DNAFILE>;
    close DNAFILE;
    $DNA = join('', @DNA);
    $DNA =~ s/\s//g;
    $count_of_CG = 0;
    # Each /g match in scalar context resumes where the previous one ended.
    while($DNA =~ /c/ig) {$count_of_CG++;}
    while($DNA =~ /g/ig) {$count_of_CG++;}
    print "CG content is ", $count_of_CG/(length $DNA)*100, "%\n";

    $DNA = "ACCTAAACCCGGGAGAATTCCCACCAATTCTACGTAAC";
    $s = "";
    for ($i = 0, $j = 5; $i < $j; $i+=2, $j++) {
        $s .= substr($DNA, $i, $j);
    }
    print $s, "\n";

    $DNA = "ACCTAAACCCGGGAGAATTCCCACCAATTCTACGTAAC";
    $s = "";
    for ($i = 0, $j = 5; $i < $j; $i+=2, $j++) {
        $s .= substr $DNA, $i, $j;
    }
    print ($s, "\n");

Call functions/subroutines
    Name p1, p2, p3;
    Name(p1, p2, p3);
    print $DNA1, $DNA2, "\n";
    print ($DNA1, $DNA2, "\n");
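Both call styles can be tried on a user-defined subroutine. In the sketch below, concat is a hypothetical subroutine invented for illustration. One point worth knowing: the form without parentheses only works for a subroutine that Perl has already seen defined (or declared) earlier in the file, while the form with parentheses works either way.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# concat is a hypothetical subroutine used only to illustrate call syntax.
sub concat {
    my @parts = @_;
    return join('', @parts);
}

# With parentheses: works whether or not the sub has been defined yet.
my $with_parens = concat('ACG', 'TTA', "\n");

# Without parentheses: works here because concat is defined above this line.
my $without_parens = concat 'ACG', 'TTA', "\n";

print $with_parens;
print $without_parens;
```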

Exercise 1
Ask for a protein file in fasta format
Ask for an amino acid
Count the frequency of that amino acid

TKFHSNAHFYDCWRMLQYQLDMRCMRAISTF
SPHCGMEHMPDQTHNQGEMCKPRMWQVS
MNQSCNHTPPFRKTYVEWDYMAKALIAPYTL
GWLASTCFIW
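One possible approach to the counting step of Exercise 1 is sketched below. It is not the course's reference solution: count_residue is a hypothetical helper name, and the prompting and file reading are left out so that the block runs on its own with the first line of the sample sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how often one amino acid (single-letter code) occurs in a protein.
# count_residue is a hypothetical helper name, not taken from the slides.
sub count_residue {
    my ($protein, $residue) = @_;
    my $count = 0;
    # Walk the sequence one character at a time, ignoring case.
    for my $position (0 .. length($protein) - 1) {
        $count++ if uc(substr($protein, $position, 1)) eq uc($residue);
    }
    return $count;
}

# First line of the sample sequence from the exercise.
my $protein = 'TKFHSNAHFYDCWRMLQYQLDMRCMRAISTF';
my $count   = count_residue($protein, 'M');
printf "M occurs %d times (%.1f%% of %d residues)\n",
    $count, 100 * $count / length($protein), length($protein);
```

In a full solution the sequence would come from the FASTA file (header line skipped, whitespace removed) and the residue from a prompt.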

Exercise 2
Ask for a DNA file in fasta format
Convert it to RNA
Ask for a codon
Count the frequency of that codon

TCGTACTTAGAAATGAGGGTCCGCTTTTGCCC
ACGCACCTGATCGCTCCTCGTTTGCTTTTAAG
AACCGGACGAACCACAGAGCATAAGGAGAA
CCTCTAGCTGCTTTACAAAGTACTGGTTCCCT
TTCCAGCGGGATGCTTTATCTAAACGCAATGA
GAGAGGTATTCCTCAGGCCACATCGCTTCCTA
GTTCCGCTGGGATCCATCGTTGGCGGCCGAA
GCCGCCATTCCATAGTGAGTTCTTCGTCTGTG
TCATTCTGTGCCAGATCGTCTGGCAAATAGCC
GATCCAGTTTATCTCTCGAAACTATAGTCGTA
CAGATCGAAATCTTAAGTCAAATCACGCGACT
AGACTCAGCTCTATTTTAGTGGTCATGGGTTT
TGGTCCCCCCGAGCGGTGCAACCGATTAGGA
CCATGTAGAACATTAGTTATAAGTCTTCTTTTA
AACACAATCTTCCTGCTCAGTGGTACATGGTT
ATCGTTATTGCTAGCCAGCCTGATAAGTAACA
CCACCACTGCGACCCTAATGCGCCCTTTCCAC
GAACACAGGGCTGTCCGATCCTATATTACGA
CTCCGGGAAGGGGTTCGCAAGTCGCACCCTA
AACGATGTTGAAGGCTCAGGATGTACACGCA
CTAGTACAATACATACGTGTTCCGGCTCTTAT
CCTGCATCGGAAGCTCAATCATGCATCGCACC
AGCGTGTTCGTGTCATCTAGGAGGGGCGCGT
AGGATAAATAATTCAATTAAGATATCGTTATG
CTAGTATACGCCTACCCGTCACCGGCCAACAG
TGTGCAGATGGCGCCACGAGTTACTGGCCCT
GATTTCTCCGCTTCTAATACCGCACACTGGGC
AATACGAGCTCAAGCCAGTCTCGCAGTAACG
CTCATCAGCTAACGAAAGAGTTAGAGGCTCG
CTAAATCGCACTGTCGGGGTCCCTTGGGTATT
TTACACTAGCGTCAGGTAGGCTAGCATGTGT
CTTTCCTTCCAGGGGTATG
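A minimal sketch of the two core steps of Exercise 2 follows. It makes two assumptions the exercise does not state: the input is the coding strand, so transcription is just T to U, and codons are counted in the first reading frame only (positions 0, 3, 6, and so on). The names dna2rna and count_codon are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Transcribe DNA (coding strand assumed) to RNA by replacing T with U.
sub dna2rna {
    my ($dna) = @_;
    (my $rna = uc $dna) =~ tr/T/U/;
    return $rna;
}

# Count a codon in reading frame 1 only, i.e. at positions 0, 3, 6, ...
sub count_codon {
    my ($rna, $codon) = @_;
    my $count = 0;
    for (my $i = 0; $i + 3 <= length($rna); $i += 3) {
        $count++ if substr($rna, $i, 3) eq uc($codon);
    }
    return $count;
}

# First 12 bases of the exercise sequence.
my $rna = dna2rna('TCGTACTTAGAA');
print "$rna\n";
print "UAC occurs ", count_codon($rna, 'UAC'), " time(s) in frame 1\n";
```

Counting every occurrence regardless of frame would instead use a global regex match, which gives a different answer whenever the codon straddles two frame-1 codons.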

Subroutine
Some code needs to be reused
A good way to organize code
Called “function” in some languages
Has a name, parameters, and a return value
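The three parts of a subroutine can be seen in a small, self-contained sketch. The subroutine gc_percent is a hypothetical example written for this purpose; it takes one parameter through @_ and hands back one value with return.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Name: gc_percent (hypothetical). Parameter: a DNA string.
# Return value: the percentage of bases that are G or C.
sub gc_percent {
    my ($dna) = @_;                 # parameters arrive in @_
    return 0 if length($dna) == 0;  # avoid dividing by zero
    my $gc = ($dna =~ tr/CGcg//);   # tr returns the number of matching characters
    return 100 * $gc / length($dna);
}

# The same code is reused for two different sequences.
printf "%.1f%%\n", gc_percent('ACGT');
printf "%.1f%%\n", gc_percent('GGGCCCAT');
```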

    #!/usr/bin/perl -w
    print "Please type the filename: ";
    $dna_filename = <STDIN>;
    chomp $dna_filename;
    open(DNAFILE, $dna_filename);
    $name = <DNAFILE>;
    @DNA = <DNAFILE>;
    close DNAFILE;
    $DNA = join('', @DNA);
    $DNA =~ s/\s//g;
    $count_of_G = countG($DNA);
    print $count_of_G;

    sub countG {
        my($dna) = @_;
        my($count) = 0;
        $count = ( $dna =~ tr/Gg//);
        return $count;
    }

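    #!/usr/bin/perl -w
    print "Please type the filename: ";
    $dna_filename = <STDIN>;
    chomp $dna_filename;
    open(DNAFILE, $dna_filename);
    $name = <DNAFILE>;
    @DNA = <DNAFILE>;
    close DNAFILE;
    $DNA = join('', @DNA);
    $DNA =~ s/\s//g;
    $count_of_G = count($DNA, 'Gg');
    print $count_of_G;

    sub count {
        my($dna, $pattern) = @_;
        my($count) = 0;
        # tr/// does not interpolate variables, so the statement is built as a
        # string; \$dna keeps the variable itself, $pattern is filled in.
        $count = ( eval("\$dna =~ tr/$pattern//") );
        return $count;
    }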

Codon

    sub codon2aa {
        my($codon) = @_;
        if ( $codon =~ /TCA/i ) { return 'S' } # Serine
        elsif ( $codon =~ /TCC/i ) { return 'S' } # Serine
        elsif ( $codon =~ /TCG/i ) { return 'S' } # Serine
        elsif ( $codon =~ /TCT/i ) { return 'S' } # Serine
        elsif ( $codon =~ /TTC/i ) { return 'F' } # Phenylalanine
        elsif ( $codon =~ /TTT/i ) { return 'F' } # Phenylalanine
        elsif ( $codon =~ /TTA/i ) { return 'L' } # Leucine
        elsif ( $codon =~ /TTG/i ) { return 'L' } # Leucine
        elsif ( $codon =~ /TAC/i ) { return 'Y' } # Tyrosine
        elsif ( $codon =~ /TAT/i ) { return 'Y' } # Tyrosine
        elsif ( $codon =~ /TAA/i ) { return '_' } # Stop
        elsif ( $codon =~ /TAG/i ) { return '_' } # Stop
        elsif ( $codon =~ /TGC/i ) { return 'C' } # Cysteine
        elsif ( $codon =~ /TGT/i ) { return 'C' } # Cysteine
        elsif ( $codon =~ /TGA/i ) { return '_' } # Stop
        elsif ( $codon =~ /TGG/i ) { return 'W' } # Tryptophan
        elsif ( $codon =~ /CTA/i ) { return 'L' } # Leucine
        elsif ( $codon =~ /CTC/i ) { return 'L' } # Leucine
        elsif ( $codon =~ /CTG/i ) { return 'L' } # Leucine
        elsif ( $codon =~ /CTT/i ) { return 'L' } # Leucine
        elsif ( $codon =~ /CCA/i ) { return 'P' } # Proline
        elsif ( $codon =~ /CCC/i ) { return 'P' } # Proline
        elsif ( $codon =~ /CCG/i ) { return 'P' } # Proline
        elsif ( $codon =~ /CCT/i ) { return 'P' } # Proline
        elsif ( $codon =~ /CAC/i ) { return 'H' } # Histidine
        elsif ( $codon =~ /CAT/i ) { return 'H' } # Histidine
        elsif ( $codon =~ /CAA/i ) { return 'Q' } # Glutamine
        elsif ( $codon =~ /CAG/i ) { return 'Q' } # Glutamine
        elsif ( $codon =~ /CGA/i ) { return 'R' } # Arginine
        elsif ( $codon =~ /CGC/i ) { return 'R' } # Arginine
        elsif ( $codon =~ /CGG/i ) { return 'R' } # Arginine
        elsif ( $codon =~ /CGT/i ) { return 'R' } # Arginine
        elsif ( $codon =~ /ATA/i ) { return 'I' } # Isoleucine
        elsif ( $codon =~ /ATC/i ) { return 'I' } # Isoleucine
        elsif ( $codon =~ /ATT/i ) { return 'I' } # Isoleucine
        elsif ( $codon =~ /ATG/i ) { return 'M' } # Methionine
        elsif ( $codon =~ /ACA/i ) { return 'T' } # Threonine
        elsif ( $codon =~ /ACC/i ) { return 'T' } # Threonine
        elsif ( $codon =~ /ACG/i ) { return 'T' } # Threonine
        elsif ( $codon =~ /ACT/i ) { return 'T' } # Threonine
        elsif ( $codon =~ /AAC/i ) { return 'N' } # Asparagine
        elsif ( $codon =~ /AAT/i ) { return 'N' } # Asparagine
        elsif ( $codon =~ /AAA/i ) { return 'K' } # Lysine
        elsif ( $codon =~ /AAG/i ) { return 'K' } # Lysine
        elsif ( $codon =~ /AGC/i ) { return 'S' } # Serine
        elsif ( $codon =~ /AGT/i ) { return 'S' } # Serine
        elsif ( $codon =~ /AGA/i ) { return 'R' } # Arginine
        elsif ( $codon =~ /AGG/i ) { return 'R' } # Arginine
        elsif ( $codon =~ /GTA/i ) { return 'V' } # Valine
        elsif ( $codon =~ /GTC/i ) { return 'V' } # Valine
        elsif ( $codon =~ /GTG/i ) { return 'V' } # Valine
        elsif ( $codon =~ /GTT/i ) { return 'V' } # Valine
        elsif ( $codon =~ /GCA/i ) { return 'A' } # Alanine
        elsif ( $codon =~ /GCC/i ) { return 'A' } # Alanine
        elsif ( $codon =~ /GCG/i ) { return 'A' } # Alanine
        elsif ( $codon =~ /GCT/i ) { return 'A' } # Alanine
        elsif ( $codon =~ /GAC/i ) { return 'D' } # Aspartic Acid
        elsif ( $codon =~ /GAT/i ) { return 'D' } # Aspartic Acid
        elsif ( $codon =~ /GAA/i ) { return 'E' } # Glutamic Acid
        elsif ( $codon =~ /GAG/i ) { return 'E' } # Glutamic Acid
        elsif ( $codon =~ /GGA/i ) { return 'G' } # Glycine
        elsif ( $codon =~ /GGC/i ) { return 'G' } # Glycine
        elsif ( $codon =~ /GGG/i ) { return 'G' } # Glycine
        elsif ( $codon =~ /GGT/i ) { return 'G' } # Glycine
        else { print STDERR "Bad codon \"$codon\"!!\n"; exit; }
    }

    sub codon2aa {
        my($codon) = @_;
        if ( $codon =~ /GC./i) { return 'A' } # Alanine
        elsif ( $codon =~ /TG[TC]/i) { return 'C' } # Cysteine
        elsif ( $codon =~ /GA[TC]/i) { return 'D' } # Aspartic Acid
        elsif ( $codon =~ /GA[AG]/i) { return 'E' } # Glutamic Acid
        elsif ( $codon =~ /TT[TC]/i) { return 'F' } # Phenylalanine
        elsif ( $codon =~ /GG./i) { return 'G' } # Glycine
        elsif ( $codon =~ /CA[TC]/i) { return 'H' } # Histidine
        elsif ( $codon =~ /AT[TCA]/i) { return 'I' } # Isoleucine
        elsif ( $codon =~ /AA[AG]/i) { return 'K' } # Lysine
        elsif ( $codon =~ /TT[AG]|CT./i) { return 'L' } # Leucine
        elsif ( $codon =~ /ATG/i) { return 'M' } # Methionine
        elsif ( $codon =~ /AA[TC]/i) { return 'N' } # Asparagine
        elsif ( $codon =~ /CC./i) { return 'P' } # Proline
        elsif ( $codon =~ /CA[AG]/i) { return 'Q' } # Glutamine
        elsif ( $codon =~ /CG.|AG[AG]/i) { return 'R' } # Arginine
        elsif ( $codon =~ /TC.|AG[TC]/i) { return 'S' } # Serine
        elsif ( $codon =~ /AC./i) { return 'T' } # Threonine
        elsif ( $codon =~ /GT./i) { return 'V' } # Valine
        elsif ( $codon =~ /TGG/i) { return 'W' } # Tryptophan
        elsif ( $codon =~ /TA[TC]/i) { return 'Y' } # Tyrosine
        elsif ( $codon =~ /TA[AG]|TGA/i) { return '_' } # Stop
        else { print STDERR "Bad codon \"$codon\"!!\n"; exit; }
    }
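The same table can also be stored as data rather than as a chain of tests. The sketch below is an alternative that the slides do not show: it keeps the 64 codons in a hash and looks the codon up directly. The names %genetic_code and codon2aa_hash are hypothetical, and this version raises an error with die instead of calling exit so that a caller can catch it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard genetic code, keyed by DNA codon; '_' marks a stop codon.
my %genetic_code = (
    TCA => 'S', TCC => 'S', TCG => 'S', TCT => 'S',
    TTC => 'F', TTT => 'F', TTA => 'L', TTG => 'L',
    TAC => 'Y', TAT => 'Y', TAA => '_', TAG => '_',
    TGC => 'C', TGT => 'C', TGA => '_', TGG => 'W',
    CTA => 'L', CTC => 'L', CTG => 'L', CTT => 'L',
    CCA => 'P', CCC => 'P', CCG => 'P', CCT => 'P',
    CAC => 'H', CAT => 'H', CAA => 'Q', CAG => 'Q',
    CGA => 'R', CGC => 'R', CGG => 'R', CGT => 'R',
    ATA => 'I', ATC => 'I', ATT => 'I', ATG => 'M',
    ACA => 'T', ACC => 'T', ACG => 'T', ACT => 'T',
    AAC => 'N', AAT => 'N', AAA => 'K', AAG => 'K',
    AGC => 'S', AGT => 'S', AGA => 'R', AGG => 'R',
    GTA => 'V', GTC => 'V', GTG => 'V', GTT => 'V',
    GCA => 'A', GCC => 'A', GCG => 'A', GCT => 'A',
    GAC => 'D', GAT => 'D', GAA => 'E', GAG => 'E',
    GGA => 'G', GGC => 'G', GGG => 'G', GGT => 'G',
);

# Look a codon up in the hash; upper-casing makes the lookup case-insensitive.
sub codon2aa_hash {
    my ($codon) = @_;
    $codon = uc $codon;
    return $genetic_code{$codon} if exists $genetic_code{$codon};
    die "Bad codon \"$codon\"!!\n";
}

print codon2aa_hash('atg'), codon2aa_hash('GCC'), codon2aa_hash('taa'), "\n";
```

A hash lookup requires an exact three-letter key, so input such as a four-letter string is rejected rather than matched on a substring, which the unanchored regular expressions would allow.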

Exercise
Write a subroutine that converts a codon to an amino acid
Read in a DNA fasta file and print out the amino acid sequence

    #!/usr/bin/perl -w
    $dna = 'CGACGTCTTCGTACGGGACTAGCTCGTGTCGGTCGC';
    $protein = '';
    for(my $i=0; $i < (length($dna) - 2) ; $i += 3) {
        $codon = substr($dna,$i,3);
        $protein .= codon2aa($codon);
    }
    print "I translated the DNA\n\n$dna\n\n into the protein\n\n$protein\n\n";

    sub codon2aa {
        #...
    }

Reading Frame

   5'                                                  3'
   atgcccaagctgaatagcgtagaggggttttcatcatttgaggacgatgtataa

1  atg ccc aag ctg aat agc gta gag ggg ttt tca tca ttt gag gac gat gta taa
   M   P   K   L   N   S   V   E   G   F   S   S   F   E   D   D   V   *
2   tgc cca agc tga ata gcg tag agg ggt ttt cat cat ttg agg acg atg tat
    C   P   S   *   I   A   *   R   G   F   H   H   L   R   T   M   Y
3    gcc caa gct gaa tag cgt aga ggg gtt ttc atc att tga gga cga tgt ata
     A   Q   A   E   *   R   R   G   V   F   I   I   *   G   R   C   I

Six reading frames: three in the forward reading, three in the reverse complement reading

Exercise 3
Write a subroutine that converts a codon to an amino acid
Read in a DNA fasta file and print out the amino acid sequence
There are 6 reading frames; can you print all 6 versions?

    #!/usr/bin/perl -w
    print "Please type the filename: ";
    $dna_filename = <STDIN>;
    chomp $dna_filename;
    open(DNAFILE, $dna_filename);
    $name = <DNAFILE>;
    @DNA = <DNAFILE>;
    close DNAFILE;
    $DNA = join('', @DNA);
    $DNA =~ s/\s//g;
    print "First ", dna2peptide($DNA), "\n";
    print "Second ", dna2peptide(substr($DNA, 1)), "\n";
    print "Third ", dna2peptide(substr($DNA, 2)), "\n";
    $DNA = reverse $DNA;
    $DNA =~ tr/ACGTacgt/TGCAtgca/;   # complement, so frames 4-6 read the opposite strand
    print "Fourth ", dna2peptide($DNA), "\n";
    print "Fifth ", dna2peptide(substr($DNA, 1)), "\n";
    print "Sixth ", dna2peptide(substr($DNA, 2)), "\n";

    sub dna2peptide {
        my ($dna) = @_;
        my $protein = "";
        for(my $i=0; $i < (length($dna) - 2) ; $i += 3) {
            $codon = substr($dna,$i,3);
            $protein .= codon2aa($codon);
        }
        return $protein;
    }

    sub codon2aa {
        #...
    }

Modules
A Perl module is a self-contained piece of Perl code that can be used by a Perl program later
Like a library
The file name ends with the extension .pm
Needs a 1 at the end (the file must return a true value when it is loaded)

    #Bio.pm
    sub codon2aa {
        #....
    }

    sub dna2peptide {
        #....
    }

    1;
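The mechanics of loading a module can be tried in a single runnable file. The sketch below writes a tiny module to a temporary directory, adds that directory to @INC (the list of places Perl searches for modules), and loads it. The module name SeqUtil and the subroutine revcom are hypothetical and exist only for this demonstration; a real project would simply keep the .pm file next to the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Write a tiny module (hypothetical name SeqUtil.pm) into a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'SeqUtil.pm');
open(my $fh, '>', $path) or die "Cannot write $path: $!";
print {$fh} <<'END_MODULE';
# SeqUtil.pm
sub revcom {
    my ($dna) = @_;
    my $rc = reverse $dna;          # reverse the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;   # then complement each base
    return $rc;
}
1;
END_MODULE
close $fh;

# Tell Perl where to look, then load the module at run time.
unshift @INC, $dir;
require SeqUtil;

print revcom('ATGC'), "\n";
```

With use Bio; as written on the slides, the loading happens at compile time instead. Since Perl 5.26 the current directory is no longer searched by default, so a script that keeps Bio.pm beside it may need use lib '.'; before use Bio;.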

    #!/usr/bin/perl -w
    use Bio;
    print "Please type the filename: ";
    $dna_filename = <STDIN>;
    chomp $dna_filename;
    open(DNAFILE, $dna_filename);
    $name = <DNAFILE>;
    @DNA = <DNAFILE>;
    close DNAFILE;
    $DNA = join('', @DNA);
    $DNA =~ s/\s//g;
    print "First ", dna2peptide($DNA), "\n";
    print "Second ", dna2peptide(substr($DNA, 1)), "\n";
    print "Third ", dna2peptide(substr($DNA, 2)), "\n";
    $DNA = reverse $DNA;
    $DNA =~ tr/ACGTacgt/TGCAtgca/;
    print "Fourth ", dna2peptide($DNA), "\n";
    print "Fifth ", dna2peptide(substr($DNA, 1)), "\n";
    print "Sixth ", dna2peptide(substr($DNA, 2)), "\n";

    #Bio.pm
    sub codon2aa {
        #....
    }

    sub dna2peptide {
        #....
    }

    sub fasta_read {
        print "Please type the filename: ";
        my $dna_filename = <STDIN>;
        chomp $dna_filename;
        unless (open(DNAFILE, $dna_filename)) {
            print "Cannot open file ", $dna_filename, "\n";
            exit;    # stop here instead of reading from an unopened filehandle
        }
        $name = <DNAFILE>;
        @DNA = <DNAFILE>;
        close DNAFILE;
        $DNA = join('', @DNA);
        $DNA =~ s/\s//g;
        return $DNA;
    }

    1;

    #!/usr/bin/perl -w
    use Bio;
    $DNA = fasta_read();
    print "First ", dna2peptide($DNA), "\n";
    print "Second ", dna2peptide(substr($DNA, 1)), "\n";
    print "Third ", dna2peptide(substr($DNA, 2)), "\n";
    $DNA = reverse $DNA;
    $DNA =~ tr/ACGTacgt/TGCAtgca/;
    print "Fourth ", dna2peptide($DNA), "\n";
    print "Fifth ", dna2peptide(substr($DNA, 1)), "\n";
    print "Sixth ", dna2peptide(substr($DNA, 2)), "\n";

Scope
my provides lexical scoping; a variable declared with my is visible only within the block in which it is declared.
Blocks of code are the statements enclosed in curly braces {}; a file also counts as a block.
To create package globals, write use vars qw([list of var names]) or our ([var_names]).
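These rules can be observed directly. In the sketch below (all names are illustrative), the same variable name is declared with my in three places, and each declaration is visible only inside its own block, while a variable declared with our is a package global that can also be reached by its full name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $label = 'package global';   # package variable, also reachable as $main::label
my  $dna   = 'ATGC';             # lexical, visible from here to the end of the file

sub inner_length {
    my $dna = 'AT';              # a new lexical that hides the outer $dna
    return length $dna;
}

{
    my $dna = 'ATGCATGC';        # lexical to this bare block only
    print "inside block: $dna\n";
}

print "outside block: $dna\n";
print "inside sub: ", inner_length(), "\n";
print "global: $main::label\n";
```

The outer $dna is unchanged after the block and the subroutine call, because each inner my created a separate variable.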

    #!/usr/bin/perl -w
    use Bio;
    use strict;
    use warnings;

    $DNA = fasta_read();

    print "First ", dna2peptide($DNA), "\n";
    print "Second ", dna2peptide(substr($DNA, 1)), "\n";
    print "Third ", dna2peptide(substr($DNA, 2)), "\n";

    $DNA = reverse $DNA;
    $DNA =~ tr/ACGTacgt/TGCAtgca/;
    print "Fourth ", dna2peptide($DNA), "\n";
    print "Fifth ", dna2peptide(substr($DNA, 1)), "\n";
    print "Sixth ", dna2peptide(substr($DNA, 2)), "\n";

    Variable "$DNA" is not imported at frame2.pl line 6.
    Variable "$DNA" is not imported at frame2.pl line 8.
    Variable "$DNA" is not imported at frame2.pl line 9.
    Variable "$DNA" is not imported at frame2.pl line 10.
    Variable "$DNA" is not imported at frame2.pl line 12.
    Variable "$DNA" is not imported at frame2.pl line 13.
    Variable "$DNA" is not imported at frame2.pl line 14.
    Variable "$DNA" is not imported at frame2.pl line 15.
    Global symbol "$DNA" requires explicit package name at frame2.pl line 6.
    Global symbol "$DNA" requires explicit package name at frame2.pl line 8.
    Global symbol "$DNA" requires explicit package name at frame2.pl line 9.
    Global symbol "$DNA" requires explicit package name at frame2.pl line 10.
    Global symbol "$DNA" requires explicit package name at frame2.pl line 12.
    Global symbol "$DNA" requires explicit package name at frame2.pl line 13.
    Global symbol "$DNA" requires explicit package name at frame2.pl line 14.
    Global symbol "$DNA" requires explicit package name at frame2.pl line 15.
    Execution of frame2.pl aborted due to compilation errors.

    #!/usr/bin/perl -w
    use Bio;
    use strict;
    use warnings;

    my $DNA = fasta_read();

    print "First ", dna2peptide($DNA), "\n";
    print "Second ", dna2peptide(substr($DNA, 1)), "\n";
    print "Third ", dna2peptide(substr($DNA, 2)), "\n";

    $DNA = reverse $DNA;
    $DNA =~ tr/ACGTacgt/TGCAtgca/;
    print "Fourth ", dna2peptide($DNA), "\n";
    print "Fifth ", dna2peptide(substr($DNA, 1)), "\n";
    print "Sixth ", dna2peptide(substr($DNA, 2)), "\n";