What is computational biology? Mona Singh

Presentation transcript:

1 Mona Singh What is computational biology?

2 Genome
The entire hereditary information content of an organism

3 DNA
String over a 4-letter alphabet: A, T, G, C
An organism's genome is distributed over chromosomes (e.g., 46 chromosomes in human: 22 pairs plus X and Y)
Genome size: number of base pairs in an organism's genome

4 Genome Sizes
Human           3 billion bp
Mouse           3 billion bp
Fruit fly       165 million bp
Nematode worm   97 million bp
Yeast           15 million bp
E. coli         5 million bp
~400 genomes sequenced

5 How are genomes sequenced?
Can only sequence a few hundred base pairs at a time
Make many copies of the DNA and cut them into smaller (overlapping) pieces
Assemble the pieces: certain substrings occur in multiple fragments
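The assembly idea on this slide (overlapping fragments share substrings) can be illustrated with a small greedy sketch. This is a toy under stated assumptions, not how production assemblers work: the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the three error-free fragments are all made up for illustration. The fragments are cut from the first 18 bases of the sequence shown on slide 6.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if no overlap of at least min_len characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap.

    Stops when no pair overlaps by at least min_len characters and
    returns whatever contigs remain.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return frags
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]


# Hypothetical, error-free fragments cut from ATGCCTTACGTACCCTGC
reads = ["GTACCCTGC", "ATGCCTTACG", "CTTACGTACC"]
print(greedy_assemble(reads))  # ['ATGCCTTACGTACCCTGC']
```

Greedy merging is easy to follow but can mis-assemble repetitive sequence, and real reads also contain errors and come from both strands.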

6 Genomes to Life
[Figure: Genome (ATGCCTTAC GTACCCTGC GGCAGCACT) followed by a question mark]

7 Portions of DNA code for genes, which carry the information for making proteins
Proteins play key roles in most biological processes (e.g., signaling, catalysis, immune response)

8 Gene Finding
gucgcuaccauuaccaguuggucuggugucaaaaauaauaau
aaccgggcaggccaugucugcccguauuucgcguaaggaaau
ccauuauguacuauuuaaaaaacacaaacuuuuggauguucg
guuuauucuuuuucuuuuacuuuuuuaucaugggagccuacu
ucccguuuuucccgauuuggcuacaugacaucaaccauauca
gcaaaagugauacggguauuauuuuugccgcuauuucucugu
ucucgcuauuauuccaaccgcuguuuggucugcuuucugaca
aacucgggcugcgcaaauaccugcuguggauuauuaccggca
uguuagugauguuugcgccguucuuuauuuuuaucuucgggc
cacuguuacaauacaacauuuuaguaggaucgauuguuggug
guauuuaucuaggcuuuuguuuuaacgccggugcgccagcag
uagaggcauuuauugagaaagucagccgucgcaguaauuucg
aauuuggucgcgcgcggauguuuggcuguguuggcugggcgc
ugugugccucgauugucggcaucauguucaccaucaauaauc
aguuuguuuucuggcugggcucuggcugugcacucauccucg
ccguuuuacucuuuuucgccaaaacggaugcgcccucuucug
ccacgguugccaaugcgguaggugccaaccauucggcauuua
gccuuaagcuggcacuggaacuguucagacagccaaaacugu
gguuuuugucacuguauguuauuggcguuuccugcaccuacg
auGuuuuugaccaacaguuugcuaauuucuuuacuucguucu
gucaggugaa...gcaaucaaugucggaugcggcgcgacgcu

9 Gene Finding
gucgcuaccauuaccaguuggucuggugucaaaaauaauaau
aaccgggcaggccaugucugcccguauuucgcguaaggaaau
ccauuauguacuauuuaaaaaacacaaacuuuuggauguucg
guuuauucuuuuucuuuuacuuuuuuaucaugggagccuacu
ucccguuuuucccgauuuggcuacaugacaucaaccauauca
gcaaaagugauacggguauuauuuuugccgcuauuucucugu
ucucgcuauuauuccaaccgcuguuuggucugcuuucugaca
aacucgggcugcgcaaauaccugcuguggauuauuaccggca
uguuagugauguuugcgccguucuuuauuuuuaucuucgggc
cacuguuacaauacaacauuuuaguaggaucgauuguuggug
guauuuaucuaggcuuuuguuuuaacgccggugcgccagcag
uagaggcauuuauugagaaagucagccgucgcaguaauuucg
aauuuggucgcgcgcggauguuuggcuguguuggcugggcgc
ugugugccucgauugucggcaucauguucaccaucaauaauc
aguuuguuuucuggcugggcucuggcugugcacucauccucg
ccguuuuacucuuuuucgccaaaacggaugcgcccucuucug
ccacgguugccaaugcgguaggugccaaccauucggcauuua
gccuuaagcuggcacuggaacuguucagacagccaaaacugu
gguuuuugucacuguauguuauuggcguuuccugcaccuacg
auGuuuuugaccaacaguuugcuaauuucuuuacuucguucu
gucaggugaa...gcaaucaaugucggaugcggcgcgacgcu
Protein:
MYYLKNTNFWMFGLFFFFYFFIMGAY
FPFFPIWLHDINHISKSDTGIIFAAI
SLFSLLFQPLFGLLSDKLGLRKYLLW
IITGMLVMFAPFFIFIFGPLLQYNIL
VGSIVGGIYLGFCFNAGAPAVEAFIE
KVSRRSNFEFGRARMFGCVGWALCAS
IVGIMFTINNQFVFWLGSGCALILAV
LLFFAKTDAPSSATVANAVGANHSAF
SLKLALELFRQPKLWFLSLYVIGVSC
TYDVFDQQFANFFTSFFATGEQGTRV
FGYVTTMGELLNASIMFFAPLIINRI
GGKNALLLAGTIMSVRIIGSSFATSA
LEVVILKTLHMFEVPFLLVGCFKYIT

10 The Genetic Code
AUG = Methionine/start
UUA = Leucine
UUG = Leucine
UAA = Stop
UAG = Stop
UGA = Stop
Source: Stryer, Biochemistry

11 Mona Singh Gene Finding gucgcuaccauuaccaguuggucuggugucaaaaauaauaauaaccgg gcaggccaugucugcccguauuucgcguaaggaaauccauuauguacu auuuaaaaaacacaaacuuuuggauguucgguuuauucuuuuucuuuu acuuuuuuaucaugggagccuacuucccguuuuucccgauuuggcuac augacaucaaccauaucagcaaaagugauacggguauuauuuuugccg cuauuucucuguucucgcuauuauuccaaccgcuguuuggucugcuuu cugacaaacucgggcugcgcaaauaccugcuguggauuauuaccggca uguuagugauguuugcgccguucuuuauuuuuaucuucgggccacugu uacaauacaacauuuuaguaggaucgauuguuggugguauuuaucuag gcuuuuguuuuaacgccggugcgccagcaguagaggcauuuauugaga aagucagccgucgcaguaauuucgaauuuggucgcgcgcggauguuug gcuguguuggcugggcgcugugugccucgauugucggcaucauguuca ccaucaauaaucaguuuguuuucuggcugggcucuggcugugcacuca uccucgccguuuuacucuuuuucgccaaaacggaugcgcccucuucug ccacgguugccaaugcgguaggugccaaccauucggcauuuagccuua agcuggcacuggaacuguucagacagccaaaacugugguuuuugucac uguauguuauuggcguuuccugcaccuacgauguuuuugaccaacagu uugcuaauuucuuuacuucguucugucaggugaa...gcaaucaaugu cggaugcggcgcgacgcu

12 Gene Finding
Reading off from 1st start triplet:
aug ucu gcc cgu auu ucg cgu aag gaa auc cau uau gua cua uuu aaa ...
Translating (3-letter amino acid code):
Met Ser Ala Arg Ile Ser Arg Lys Glu Ile His Tyr Val Leu Phe Lys ...
(1-letter code):
M   S   A   R   I   S   R   K   E   I   H   Y   V   L   F   K   ...

13 Gene Finding
Reading off from 1st start triplet:
aug ucu gcc cgu auu ucg cgu aag gaa auc cau uau gua cua uuu aaa ...
Translating (3-letter amino acid code):
Met Ser Ala Arg Ile Ser Arg Lys Glu Ile His Tyr Val Leu Phe Lys ...
(1-letter code):
M   S   A   R   I   S   R   K   E   I   H   Y   V   L   F   K   ...
Actual protein sequence:
M   Y   Y   L   K   N   T   N   F   W   M   F   G   L   F   F   ...
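The mismatch between the naive translation and the actual protein can be reproduced with a short sketch. It builds the standard genetic code table, finds every AUG in a fragment copied from the slide's mRNA (starting at its first AUG), and translates from a chosen start. The names `CODON_TABLE`, `start_positions`, and `translate` are illustrative choices, not something the slides define.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; "*" marks a stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def start_positions(rna: str) -> list[int]:
    """Indices of every AUG triplet, in any reading frame."""
    rna = rna.upper()
    return [i for i in range(len(rna) - 2) if rna[i:i + 3] == "AUG"]


def translate(rna: str, start: int) -> str:
    """Translate codon by codon from `start` until a stop codon or the end."""
    rna = rna.upper()
    protein = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Fragment of the slide's mRNA, beginning at its first AUG.
fragment = ("augucugcccguauuucgcguaaggaaauccauuauguacu"
            "auuuaaaaaacacaaacuuuuggauguucgguuuauuc")

print(start_positions(fragment))      # [0, 34, 64]
print(translate(fragment, 0)[:16])    # MSARISRKEIHYVLFK
print(translate(fragment, 34))        # MYYLKNTNFWMFGLF
```

The second AUG sits in a different reading frame from the first, and translating from it gives the start of the actual protein. A gene finder therefore has to consider more than the first start triplet.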

14 Computational Gene Finding Methods
Statistical bias: protein-coding regions “look different”
- compare coding vs. non-coding regions (Hidden Markov Models, Neural Nets)
Sequence similarity
- similar to known protein?
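The "statistical bias" idea can be sketched with something far simpler than the Hidden Markov Models or neural nets the slide names: a first-order Markov chain trained separately on coding and non-coding examples, with a log-odds score comparing the two. Everything below is an illustrative assumption, including the function names and the two made-up training strings, which are not real genomic data.

```python
import math
from collections import defaultdict

ALPHABET = "ACGU"


def train(seqs: list[str]) -> dict[str, dict[str, float]]:
    """Estimate P(next base | current base) with add-one smoothing."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for a, b in zip(s, s[1:]):
            counts[a][b] += 1
    probs = {}
    for a in ALPHABET:
        total = sum(counts[a].values()) + len(ALPHABET)
        probs[a] = {b: (counts[a][b] + 1) / total for b in ALPHABET}
    return probs


def log_odds(seq: str, coding: dict, noncoding: dict) -> float:
    """Sum of log(P_coding / P_noncoding) over adjacent base pairs.

    Positive means the sequence looks more like the coding examples.
    """
    return sum(math.log(coding[a][b] / noncoding[a][b])
               for a, b in zip(seq, seq[1:]))


# Made-up training data, chosen only so the two classes look different.
coding_model = train(["GCGCGCGCGC"])
noncoding_model = train(["AUAUAUAUAU"])

print(log_odds("GCGCGC", coding_model, noncoding_model) > 0)  # True
print(log_odds("AUAUAU", coding_model, noncoding_model) < 0)  # True
```

Real gene finders train on many annotated genes and usually model codon position explicitly. This sketch only shows the shape of the coding versus non-coding comparison.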

15 Gene finding is hard
In some genomes, only a small portion of the genome codes for protein (needle in a haystack)
Some genes contain introns and exons; exons are the parts that actually encode the protein, and they can be short
Have to get the precise boundaries to get the correct protein

16 Number of genes
Human           ~30,000
Mouse           ~30,000
Fruit fly       ~13,500
Nematode worm   ~19,000
Yeast           ~6,000
E. coli         ~4,000

17 Predicting Protein Function
MYYLKNTNFWMFGLFFFFYFFIMGAY
FPFFPIWLHDINHISKSDTGIIFAAI
SLFSLLFQPLFGLLSDKLGLRKYLLW
IITGMLVMFAPFFIFIFGPLLQYNIL
VGSIVGGIYLGFCFNAGAPAVEAFIE
KVSRRSNFEFGRARMFGCVGWALCAS
IVGIMFTINNQFVFWLGSGCALILAV
LLFFAKTDAPSSATVANAVGANHSAF
SLKLALELFRQPKLWFLSLYVIGVSC
TYDVFDQQFANFFTSFFATGEQGTRV
FGYVTTMGELLNASIMFFAPLIINRI
GGKNALLLAGTIMSVRIIGSSFATSA
LEVVILKTLHMFEVPFLLVGCFKYIT
DNA binding protein

18 Functions of Human Proteins
Source: Science, 2001

19 Sequence similarity
Ex: cystic fibrosis gene and bacterial nickel transport gene
CF: EGGNAILENISFSISPGQRVGLLGRTGSGKSTLLSAFLRLL-----
NT: QAAQPLVHGVSLTLQRGRVLALVGGSGSGKSLTCAATLGILPAGVR
CF: NTEGEIQIDGVSWDSITL QQWRKAFGVIPQKVFIFSG
NT: QTAGEILADGKPVSPCALRGIKIATIMQNPRSAFNPL
CF: TFRKNLDPYEQWSDQEIWKVADEVGLRSVIEQFP-GKLDFVLVDGG
NT: ---HTMHTHARETCLALGKPADDATLTAAIEAVGLENAARVLKLYP
CF: CVLSHGHKQLMCLARSVLSKAKILLLDEPSAHLDPV
NT: FEMSGGMLQRMMIAMAVLCESPFIIADEPTTDLDVV
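Alignments like the one on this slide are typically produced by dynamic programming. As a hedged illustration, here is the score-only form of textbook global alignment (Needleman-Wunsch) with a flat scoring scheme. The scoring values and the function name `nw_score` are assumptions for the sketch. Real protein comparisons normally use a substitution matrix and different gap penalties, and the slides do not say which method produced the alignment shown.

```python
def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score of a and b (Needleman-Wunsch).

    Keeps only one row of the dynamic-programming table at a time,
    so it returns the score but not the alignment itself.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(nw_score("GATTACA", "GATTACA"))  # 7
print(nw_score("ACGT", "AGT"))         # 2  (three matches, one gap)

# Last block of the slide's alignment, with the flat scoring above:
cf = "CVLSHGHKQLMCLARSVLSKAKILLLDEPSAHLDPV"
nt = "FEMSGGMLQRMMIAMAVLCESPFIIADEPTTDLDVV"
print(nw_score(cf, nt))
```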

20 Database Searches

21 Database Searches
Sequences producing significant alignments:  E-Value
gi| |gb|AAD |AF108138_1 (AF108138) DNA helicase  4e-84
gi| |pir||T37310 PIF1 protein - Caenorhabditis elegans helicase  1e-77
gi| |pir||T40739 rrm3-pif1 helicase homolog - fission...  3e-59
gi| |pir||T47241 RRM3/PIF1 helicase homolog - fission yeast  3e-59
gi| |ref|NP_ | DNA helicase; Rrm3p [Saccharomyces  4e-43
gi| |ref|NP_ | 5' to 3' DNA helicase; Pif1p [Saccharo  1e-41
gi|558414|emb|CAA | (Z38114) len: 750, CAI: 0.14, inc...  1e-41
gi| |emb|CAB | (AL354532) possible DNA helicase...  4e-41

22 Protein Structure
Sequence: KETAAAKFERQHMDSSTSAASSSN…
Structure: [figure]

23 Proteins
Primary structure: amino acids
Secondary structure: α helix
Tertiary structure: polypeptide chain
Quaternary structure: assembled subunits
Source: Lehninger, Principles of Biochemistry

24 Protein Structure Prediction
Physics-based methods
Statistics-based methods

25 Statistics & Protein Structure Prediction
Given a new sequence and a library of folds, figure out which (if any) is a good fit to the sequence.

26 Secondary structure prediction
Given a protein sequence, can you tell its secondary structure?
– E.g., LKVVAKRELVQNNQ
  aaaa bbbb aaaaaaa (a = alpha, b = beta)
~70% accuracy (neural nets or other learning techniques)

27 Genome annotation
Many other important features of DNA
– E.g., proteins bind DNA regulatory elements, which determines which genes are “on” and when
Statistical & comparative approaches for finding them
– Motif finding

28 Universal phylogenetic tree
[Figure labels: Prokaryotes, Eukaryotes]
Source: Woese et al.

29 Building phylogenetic trees
Use DNA (or protein) sequences from various organisms, e.g.,
human   ATCGAGGC
mouse   ATCCAGCC
yeast   ATTAAGTA

30 Building phylogenetic trees
E.g., Distance Matrix:
        Human   Mouse   Yeast
Human   0       2       4
Mouse   2       0       4
Yeast   4       4       0
Tree: [figure with leaves Human, Mouse, Yeast]
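The distance matrix on this slide matches the number of mismatched positions (Hamming distance) between the three example sequences from the previous slide. A minimal sketch follows. The function names are illustrative, and picking the closest pair is only the first step of a distance-based tree-building method, not a full algorithm.

```python
from itertools import combinations

# Example sequences from the slides.
seqs = {
    "human": "ATCGAGGC",
    "mouse": "ATCCAGCC",
    "yeast": "ATTAAGTA",
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs: dict[str, str]) -> dict[str, dict[str, int]]:
    """All-pairs Hamming distances, keyed by organism name."""
    names = list(seqs)
    return {a: {b: hamming(seqs[a], seqs[b]) for b in names} for a in names}


def closest_pair(dist: dict[str, dict[str, int]]) -> tuple[str, str]:
    """The two organisms with the smallest distance; they would be joined first."""
    return min(combinations(dist, 2), key=lambda p: dist[p[0]][p[1]])


dist = distance_matrix(seqs)
for name, row in dist.items():
    print(name, list(row.values()))
# human [0, 2, 4]
# mouse [2, 0, 4]
# yeast [4, 4, 0]
print(closest_pair(dist))  # ('human', 'mouse')
```

Human and mouse are the closest pair, so a distance-based method would group them before attaching yeast.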

31 Intracellular networks

32 Network of cells

33 fn

34 Lecture Notes notes.html