Wrong assumptions and misinterpretations in explanations of biological models, phenomena and processes, or: Is the biologist logical, and the computer scientist alive? Jacek Leluk, ICM UW

Presentation transcript:

Wrong assumptions and misinterpretations in explanations of biological models, phenomena and processes, or: Is the biologist logical, and the computer scientist alive?
Jacek Leluk, ICM UW

How is it that your genome is 98% identical to the genome of a chimpanzee, but only 50% identical to your own father's genome?
"On the Composition of Human Limbs": Why do birds not give milk? Because they would need teats, which would hinder them in flight. Andrzej z Kobylina (16th century)

Is biology "bilogical"?
Nomenclature chaos:
- Mitochondria or chondriosomes?
- Is papain a proteolytic enzyme?
- Definition of identity, similarity and homology
Misinterpretation:
- Amino acid sequence of a gene?
- Why are squash inhibitors inhibitors?
- Is wheat agglutinin there to agglutinate rabbit red cells?
Incomplete knowledge:
- Stochastic index matrices
- Statistical description of biological processes

The problem of terminology
BPTI
- Basic Pancreatic Trypsin Inhibitor
- Bovine Pancreatic Trypsin Inhibitor
- Basic Protein Trypsin Inhibitor
PAM
- Point Accepted Mutations
- Percent Accepted Mutations
Kunitz trypsin inhibitor
- BPTI - mammalian organs
- STI - soybean trypsin inhibitor

What may everybody do wrong?
Monte Carlo approach in structure analysis and prediction
- what state do we predict?
Mathematical modelling of life processes
- Markov chains and protein evolution and differentiation
- estimation of similarity significance

What may biologists do wrong?
Amino acids and proteins
- do proteins consist of amino acids as we describe them?
Definitions and theory
- definition of species and theory of evolution
- definitions and biology
Correlated mutations
- dispersed correlation

What may theoreticians do wrong?
Primitive or ancestral?
- (Cyanophyta, Archaebacteria, ape and human)
Global and local energy minima
- can we predict the exact conformation at an exact time?
Microscopic/mesoscopic/macroscopic processes
- water molecule and tsunami
Assumptions and conclusions
- incomplete assumptions and wrong conclusions
- deformations by simplifying
- is the protein sequence just a string of characters?

Sequence identity estimation in proteomics and genomics
Identity threshold: does it make sense?

WHAT IS IMPORTANT IN THE PROTEIN SIMILARITY SEARCH?
1) Contribution (%) of identical positions
2) Length of the compared strings (sequences)
3) Distribution of the identical positions along the analyzed sequence
4) Residues at the conservative positions
5) Structural/genetic similarity of the amino acids at non-conservative positions
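
The first three criteria can be made concrete with a small sketch. It assumes two sequences that are already aligned to equal length with "-" as the gap symbol; the function name `identity_profile` and the "longest identical run" measure of distribution are illustrative choices, not something taken from the slides.

```python
def identity_profile(seq_a: str, seq_b: str, gap: str = "-") -> dict:
    """Summarize identity for two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    # Columns in which both sequences carry a residue (no gap on either side).
    compared = [i for i, (a, b) in enumerate(zip(seq_a, seq_b))
                if a != gap and b != gap]
    # Columns in which the two residues are identical.
    identical = [i for i in compared if seq_a[i] == seq_b[i]]

    # Longest run of consecutive identical columns: a crude indicator of
    # whether identities are clustered or scattered along the sequence.
    longest = run = 0
    prev = None
    for i in identical:
        run = run + 1 if prev is not None and i == prev + 1 else 1
        longest = max(longest, run)
        prev = i

    percent = 100.0 * len(identical) / len(compared) if compared else 0.0
    return {
        "compared_length": len(compared),
        "identical_positions": identical,
        "percent_identity": percent,
        "longest_identical_run": longest,
    }


profile = identity_profile("ACDEFG", "ACDQFG")
print(profile)
```

The same percentage means something different for 6 compared columns than for 600, which is why the sketch reports the compared length and the positions alongside the percentage instead of the percentage alone.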

Sequence multiple alignment
Problem of gap manipulation
Any protein can be aligned with any other as homologous/similar
anybiologicalstring
anybilogicalstrip
anybiologicalstri-ng
anybi-logicalstrip
anyprotein
canbealigned
-an-yprote--i-n
canb-----ealigned
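
A minimal sketch of why unrestricted gaps make almost anything "alignable": if gaps cost nothing, the best achievable number of identical columns equals the length of the longest common subsequence of the two strings. The function name below is hypothetical, and the zero gap penalty is an assumption chosen to exaggerate the effect; real alignment programs penalize gaps.

```python
def max_matches_with_free_gaps(a: str, b: str) -> int:
    """Largest number of identical columns when gaps cost nothing.

    This is the classic longest-common-subsequence dynamic programme,
    kept to two rows of the table to save memory.
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


pairs = [
    ("anybiologicalstring", "anybilogicalstrip"),
    ("anyprotein", "canbealigned"),
]
for a, b in pairs:
    matches = max_matches_with_free_gaps(a, b)
    print(f"{a} / {b}: {matches} matching columns "
          f"out of {min(len(a), len(b))} residues in the shorter string")
```

For the slide's two examples this gives 16 matches for the related pair and 5 for the unrelated pair. Every mismatch can be converted into a pair of gaps, so the aligned (non-gap) columns of the unrelated pair are 100% identical, which is exactly the distortion the slide warns about.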

Statistical approaches vs. accuracy
How far may they be improved?
Protein secondary structure prediction: accuracy 70-72% (not much changed since 1978)
100% accuracy requires a complete database of all possible structures.
For 30 AA polypeptides – sequences/secondary structures
Searching the database for the appropriate sequence/structure at a rate of sequences/sec. would take 1.8 billion times longer than the age of the Universe.
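
The scale of the problem can be checked with a few lines of arithmetic. The sketch below counts all possible 30-residue sequences over the 20 standard amino acids. The three-state secondary structure alphabet, the search rate and the age of the Universe used here are assumptions for illustration only, so the ratio it prints need not reproduce the figure quoted on the slide.

```python
AMINO_ACIDS = 20          # standard proteinogenic amino acids
LENGTH = 30               # polypeptide length named on the slide
STATES = 3                # ASSUMPTION: helix / strand / coil per residue
RATE_PER_SECOND = 1e12    # ASSUMPTION: hypothetical database search speed
AGE_OF_UNIVERSE_S = 13.8e9 * 365.25 * 24 * 3600  # ASSUMPTION: ~13.8 billion years

n_sequences = AMINO_ACIDS ** LENGTH        # exact integer, about 1.07e39
n_structure_strings = STATES ** LENGTH     # three-state strings of length 30

search_seconds = n_sequences / RATE_PER_SECOND
ratio = search_seconds / AGE_OF_UNIVERSE_S

print(f"possible sequences:            {n_sequences:.3e}")
print(f"possible 3-state structures:   {n_structure_strings:.3e}")
print(f"search time / age of Universe: {ratio:.2e}")
```

Even with a generous assumed search rate, enumerating the sequence space alone takes billions of Universe lifetimes, which is the point of the slide: exhaustive lookup cannot replace prediction.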

Genetic conditioning of the amino acid replacement probabilities and spectrum in molecular evolution

The Markov model assumes that the probability of substituting amino acid AA1 by AA2 is the same regardless of which residue (AAx or AAy) AA1 itself arose from.
The currently used statistical algorithms are based on the Markovian model of amino acid replacement (they directly use stochastic matrices of replacement frequency indices).
AAx -> AA1 -> AA2: the step AA1 -> AA2 has probability Pa
AAy -> AA1 -> AA2: the step AA1 -> AA2 has probability Pb
Pa = Pb
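
Written as a formula, the assumption is the first-order Markov property. The notation X_n (the residue occupying a given position after n replacement steps) is introduced here for illustration and does not appear on the slides.

```latex
% X_n: residue at a fixed position after n replacement steps (assumed notation)
P\bigl(X_{n+1} = \mathrm{AA}_2 \mid X_n = \mathrm{AA}_1,\ X_{n-1} = \mathrm{AA}_x\bigr)
  = P\bigl(X_{n+1} = \mathrm{AA}_2 \mid X_n = \mathrm{AA}_1\bigr)
  = P\bigl(X_{n+1} = \mathrm{AA}_2 \mid X_n = \mathrm{AA}_1,\ X_{n-1} = \mathrm{AA}_y\bigr)
```

The left-hand side corresponds to Pa and the right-hand side to Pb, so the Markov assumption is exactly the statement Pa = Pb.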

BLOSUM62 matrix of amino acid replacements
Why is tryptophan the most conservative residue here?

Replacement Arg -> Lys according to the statistical interpretation using stochastic matrix indices
Matrix       Arg/Lys index
PAM250       3
BLOSUM62     2
BLOSUM35     2
BLOSUM45     3
BLOSUM100    3

Arginine-to-lysine mutational replacements
[Diagram: codon-level replacement pathways. Labels as extracted: Gln CAR, Arg AGR, Arg AGR, Ser AGY, Arg CGR, His CAY, Lys AAR, Leu CTR, Lys AAR, Met ATG, Lys AAG]
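
The codon dependence of the Arg -> Lys replacement can be sketched by counting nucleotide differences. The code below uses the six arginine codons and the two lysine codons (AAR) of the standard genetic code; the helper names are hypothetical, and the count is only the minimum number of point substitutions, without saying which intermediate residues a real pathway would pass through.

```python
ARG_CODONS = ["AGA", "AGG", "CGA", "CGG", "CGC", "CGT"]
LYS_CODONS = ["AAA", "AAG"]


def substitutions(codon_a: str, codon_b: str) -> int:
    """Number of nucleotide positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


def min_steps_to_lysine(codon: str) -> int:
    """Fewest point substitutions turning this codon into any lysine codon."""
    return min(substitutions(codon, lys) for lys in LYS_CODONS)


steps = {codon: min_steps_to_lysine(codon) for codon in ARG_CODONS}
for codon, n in steps.items():
    print(f"Arg {codon} -> Lys: at least {n} point substitution(s)")
```

Only the AGR codons reach lysine in a single step; the CGN codons need two or three, yet a single matrix index treats all arginines alike.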

Possible one-point-mutational processing of serine with respect to its origin
[Diagram: labels as extracted: Thr; Ser UCG; Ser AGU; Ile, Asn; Arg, Cys; Gly; Trp UGG; Ala, Thr, Pro; Trp, Ser, Leu; (UAG); Asn AAU]

Possible codons for arginine: AGA, AGG, CGA, CGG, CGC, CGT
Is arginine the same as arginine?
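
One way to see that "arginine" is not a single entity at the codon level is to list, for each arginine codon, the residues reachable by one point mutation. The sketch below builds the standard genetic code from its usual compact string (bases in T, C, A, G order, DNA alphabet, "*" for stop); the function name is an illustrative choice.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated as TTT, TTC, TTA, TTG, TCT, ...
AA_STRING = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
             "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AA_STRING)}


def one_point_neighbours(codon: str) -> set:
    """Residues (or '*' for stop) encoded by all single-nucleotide mutants."""
    products = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                products.add(CODON_TABLE[mutant])
    return products


for codon in ["AGA", "AGG", "CGA", "CGG", "CGC", "CGT"]:
    others = "".join(sorted(one_point_neighbours(codon) - {"R"}))
    print(f"Arg {codon}: one-point replacements -> {others}")
```

Lysine and methionine appear only among the neighbours of AGR codons, while glutamine and histidine appear only among the neighbours of CGN codons, so the replacement spectrum of an arginine depends on which codon encodes it.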

Diagram of amino acid genetic relationships
Diagram of codon genetic relationships

Genetic relationships between Arg and Met/Gln
[Diagram: grid of one-letter amino acid symbols with the Arg (R), Met (M) and Gln (Q) entries marked]

What part of the codon contains the information about the previous amino acid that occurred at a given position of the protein sequence?
At most 2/3 of the entire codon.
Ala GCG, Val GUG

How long is the information about the codons of preceding amino acids stored?
Theoretically the longest period is infinite.
The shortest storage period is 3 transitions/transversions.
Codon series shown on the slide: Ala GCG, Val GUG, Met AUG, Ile AUA, Ser UCC, Ser UCU, Thr ACU, Ser AGU, Lys AAA, Asn AAC, Asp GAC, His CAC, Gln CAG, Glu GAG, Asp GAU, His CAU, Asn AAU, Lys AAG, Gln CAG, His CAC, Tyr UAU...
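
The "2/3 of the codon" and "3 transitions/transversions" statements can be traced with a short sketch that follows a chain of codons and counts how many nucleotide positions have never been altered since the first codon. Treating the codons listed on the slide as consecutive chains, and splitting them into the two chains used below, is an assumption made here for illustration.

```python
def surviving_positions(chain: list) -> list:
    """For each codon in the chain, count the positions never altered so far."""
    untouched = [True, True, True]
    counts = []
    previous = chain[0]
    for codon in chain:
        for pos in range(3):
            if codon[pos] != previous[pos]:
                untouched[pos] = False  # this position lost its original base
        counts.append(sum(untouched))
        previous = codon
    return counts


# ASSUMED chain: Ala -> Val -> Met -> Ile, one substitution per step.
short_chain = ["GCG", "GUG", "AUG", "AUA"]
# ASSUMED chain: Lys -> Asn -> Asp -> His -> Gln -> Glu, middle base unchanged.
long_chain = ["AAA", "AAC", "GAC", "CAC", "CAG", "GAG"]

print(surviving_positions(short_chain))
print(surviving_positions(long_chain))
```

In the first chain one substitution leaves two of three original positions (the "2/3"), and three substitutions hitting three different positions erase the original codon completely. In the second chain the middle adenine survives every step, so part of the ancestral information can persist indefinitely.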

Correlated mutations
The phenomenon of several mutations occurring simultaneously and depending on each other.
According to the current hypothesis of molecular positive Darwinian selection, correlated mutations are related to the changes occurring in their neighborhood; they reflect protein-to-protein interactions and preserve the biological activity and structural properties of the molecule.
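
As a rough illustration of what "dependent on each other" means in an alignment, the sketch below scores a pair of alignment columns by their mutual information. This is a generic textbook measure swapped in for illustration, not the detection method used in the presentation, and the four toy sequences are invented.

```python
from collections import Counter
from math import log2


def column(alignment: list, index: int) -> list:
    """Residues found at one position across all aligned sequences."""
    return [seq[index] for seq in alignment]


def mutual_information(col_i: list, col_j: list) -> float:
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    return sum(
        (c / n) * log2((c / n) / ((count_i[a] / n) * (count_j[b] / n)))
        for (a, b), c in count_ij.items()
    )


# Hypothetical toy alignment: position 0 and position 1 always change
# together (A with K, S with R); position 2 varies independently.
toy_alignment = ["AKE", "AKD", "SRE", "SRD"]

linked = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
unlinked = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
print(f"positions 0/1: {linked:.2f} bits, positions 0/2: {unlinked:.2f} bits")
```

A high score says only that two positions co-vary; whether the cause is spatial contact, interaction with a partner protein, or shared ancestry is a separate question, and the dispersed correlations described in the following slides show that spatial neighborhood is not the whole story.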

The current explanation of the occurrence of correlated mutations (example)

The three types of distribution of correlated positions present in myoglobins
The residue location and relative distribution is shown on tertiary structure of human myoglobin (P0244, pdb1bzp)
The spot correlation cluster
Position no. and occurring residues / Correlation versus position
[AMSTV] A (58) S (7)
27 [ADEFLNT] ADEFNTE
31 [GKRS] GKRSR
78 [AKLQ] KALQ
109 [DEGNT] DEGTE
116 [AEHKQST] AEHKQSA
117 [AEKNQS] AEKQSE
122 [BDEN] BDEND

The three types of distribution of correlated positions present in Bowman-Birk inhibitor family
The residue location and relative distribution is shown on tertiary structure of Bowman-Birk inhibitor from soybean (P01055)
The narrow correlation cluster
Position no. and occurring residues / Correlation versus position
[–ADFIKLMPRSTV] L (11) M (10) A (8)
4 [–RSTVY] V–SS
5 [–KPST] K–SS
7 [AEGKP] APP
11 [EFHIKLQRST] TEHQS
21 [EFIKMQT] TQEQ

The three types of distribution of correlated positions present in eglin-like proteins
The residue location and relative distribution is shown on tertiary structure of eglin C (P01051)
The dispersed correlation
Position no. and occurring residues / Correlation versus position
[–DGNT] D (8) G (9)
10 [–ELNQRST] ETLNQRS

The three types of distribution of correlated positions present in lysozymes
The residue location and relative distribution is shown on tertiary structure of lysozyme from rat (P00697, pdb5lyz)
The dispersed correlation
Position no. and occurring residues / Correlation versus position
[GHKNR] G (7) H (31) N (16)
30 [ILMV] MVILMVV
40 [DFKNR] DNNFKNR

The observed number and contribution of three correlation types in four different protein families
The correlation sets consist of 2 to over 20 residues
Table columns: The protein family (number of correlated positions/set); Total number of correlation sets observed; Number of dispersed sets; Number of narrow clusters; Number of undirected clusters; Number of sets related to active center
Eglin-like proteins (2-13)
Bowman-Birk proteinase inhibitors (2-28)
Myoglobins (2-29): n.a.
Lysozymes (2-15)
All families: 125 (100%); 59 (47.2%); 38 (30.4%); 28 (22.4%); -

The communication problem
A mathematician-biologist dialogue:
"Bowls are concave."
"Bowls are convex."

In the entire splendour of natural phenomena...
...the first conclusion is not always correct, nor the first impression consistent with reality.

Thank you for your attention !!!