Finding Mathematics in Genes and Diseases
Ming-Ying Leung
Department of Mathematical Sciences, University of Texas at El Paso (UTEP)


“1, 2, 3, … and Beyond”
A slideshow for HKU Open Day in 1980, for which I did the narration and background music. The experience has had a great impact on my journey. Mathematics is beyond numbers: we find it in buildings, banks, and supermarkets, and also in atoms, molecules, and genes.

Outline
DNA and RNA; genome, genes, and diseases; palindromes and replication origins in viral genomes; mathematics for prediction of replication origins.
[Figure: cytomegalovirus (CMV) particle]

DNA and RNA
DNA is deoxyribonucleic acid, made up of the 4 nucleotide bases adenine (A), cytosine (C), guanine (G), and thymine (T). RNA is ribonucleic acid, made up of the 4 nucleotide bases adenine, cytosine, guanine, and uracil (U). For uniformity of notation, all DNA and RNA sequences deposited in GenBank are represented as sequences of A, C, G, and T. The bases A and T form a complementary pair, as do C and G.
[Figure: example DNA and RNA base sequences]

Genes and Genome

Genes and Diseases

Virus and Eye Diseases
CMV retinitis: inflammation of the retina triggered by CMV particles, which may lead to blindness. CMV genome size: ~230 kbp.
[Figure: CMV particle]

Replication Origins and Palindromes
A high concentration of palindromes exists around the replication origins of other herpesviruses. Locating clusters of palindromes (above a minimal length) on the CMV genome sequence might reveal likely locations of its replication origins.

Palindromes in Letter Sequences
Remove spaces and capitalize:
“A nut for a jar of tuna” becomes ANUTFORAJAROFTUNA, an odd palindrome (17 letters, centered on J).
“Step on no pets” becomes STEPONNOPETS, an even palindrome (12 letters).
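The normalization step on this slide can be sketched in a few lines of Python. This is only an illustration; the function names `normalize`, `is_palindrome`, and `palindrome_kind` are made up for this example and do not come from the talk.

```python
def normalize(text: str) -> str:
    """Keep letters only and capitalize them (spaces and punctuation are dropped)."""
    return "".join(ch for ch in text if ch.isalpha()).upper()


def is_palindrome(letters: str) -> bool:
    """A letter sequence is a palindrome if it reads the same in both directions."""
    return letters == letters[::-1]


def palindrome_kind(text: str) -> str:
    """Classify a phrase as an 'even' or 'odd' palindrome by its letter count."""
    letters = normalize(text)
    if not is_palindrome(letters):
        return "not a palindrome"
    return "even" if len(letters) % 2 == 0 else "odd"


if __name__ == "__main__":
    for phrase in ("A nut for a jar of tuna", "Step on no pets"):
        print(normalize(phrase), palindrome_kind(phrase))
```

An odd palindrome has a single middle letter, while an even palindrome is split exactly in half.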

DNA Palindromes
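The slide body was an image, so the following is a hedged sketch rather than the talk's own definition. It assumes the usual convention that a DNA palindrome is a stretch of sequence equal to its own reverse complement, using the complementary pairs A-T and C-G from the DNA and RNA slide. Under that convention every DNA palindrome has even length, because no base is its own complement. The function names and the default `min_length=6` are illustrative assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Read the sequence backwards and swap each base for its complement."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_dna_palindrome(seq: str) -> bool:
    """Assumed definition: the sequence equals its reverse complement."""
    return seq == reverse_complement(seq)


def find_dna_palindromes(seq: str, min_length: int = 6):
    """Return (start, length) for each maximal palindrome of length >= min_length.

    Each gap between two neighboring bases is tried as a center, and the
    palindrome is grown outwards while the outer bases stay complementary.
    """
    hits = []
    n = len(seq)
    for center in range(1, n):  # center lies between seq[center-1] and seq[center]
        left, right = center - 1, center
        while left >= 0 and right < n and COMPLEMENT.get(seq[left]) == seq[right]:
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_length:
            hits.append((left + 1, length))
    return hits


if __name__ == "__main__":
    print(is_dna_palindrome("GAATTC"))
    print(find_dna_palindromes("CCGAATTCAC"))
```

The list of start positions returned by `find_dna_palindromes` is the kind of input a cluster search along the genome would work with.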

Association of Palindrome Clusters with Replication Origins

Computational Prediction of Replication Origins
Palindrome distribution in a random sequence model. Criterion for identifying statistically significant palindrome clusters. Evaluate prediction accuracy. Try to improve…

Random Sequence Model
A mathematical model can be used to generate a DNA sequence. A DNA molecule is made up of 4 types of bases (adenine, cytosine, guanine, and thymine), so it can be represented by a letter sequence with alphabet size = 4. Each type of base has its chance (or probability) of being used, depending on the base composition of the DNA molecule.
[Figure: Wheel of Bases (WOB) with sectors G, A, C, T]
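Spinning a single Wheel of Bases once per position can be sketched as independent draws with fixed probabilities. The function names, the example probabilities, and the `seed` parameter below are assumptions made for illustration; they are not values from the talk.

```python
import random


def spin_wheel_sequence(length, base_probs, seed=None):
    """Generate a sequence by spinning one wheel independently for each position.

    base_probs maps each base to its probability (the size of its wheel sector).
    """
    rng = random.Random(seed)
    bases = list(base_probs)
    weights = [base_probs[b] for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))


def base_composition(seq):
    """Observed fraction of each base, for comparison with the wheel sectors."""
    return {b: seq.count(b) / len(seq) for b in "ACGT"}


if __name__ == "__main__":
    # Hypothetical base composition, chosen only for the demonstration.
    probs = {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2}
    seq = spin_wheel_sequence(10_000, probs, seed=1)
    print(seq[:60])
    print(base_composition(seq))
```

For a long sequence the observed composition should come out close to the wheel sectors, which is what "depending on the base composition" amounts to in this model.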

Poisson Process Approximation of Palindrome Distribution

Use of the Scan Statistic to Identify Clusters of Palindromes
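The details of this slide were in a figure, so what follows is a minimal sketch of the statistic only, under the common definition that the scan statistic is the largest number of events falling in any window of a fixed length. Here the events are palindrome positions along the genome. The function name, the half-open window convention, and the example positions are assumptions; the threshold that makes a cluster statistically significant depends on the distribution of the statistic and is not reproduced here.

```python
def scan_statistic(positions, window):
    """Return (max_count, window_start) over all windows [x, x + window).

    positions: locations of palindromes on the sequence (any order).
    window:    window length in the same units as the positions.
    """
    pts = sorted(positions)
    best, best_start, lo = 0, None, 0
    for hi, p in enumerate(pts):
        # Move the left end forward until pts[lo] and p fit in one window.
        while p - pts[lo] >= window:
            lo += 1
        count = hi - lo + 1
        if count > best:
            best, best_start = count, pts[lo]
    return best, best_start


if __name__ == "__main__":
    # Hypothetical palindrome positions, for illustration only.
    print(scan_statistic([5, 100, 110, 120, 400, 900], window=50))
```

A window whose count exceeds the chosen significance threshold would be flagged as a palindrome cluster.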

Measures of Prediction Accuracy
Attempts to improve prediction accuracy by: adopting the best possible approximation to the scan statistic distribution; taking the lengths of palindromes into consideration when counting palindromes; using a better random sequence model.

Markov Chain Sequence Models
A more realistic random sequence model for DNA and RNA. It allows neighbor dependence of bases (i.e., the present base affects the selection of the next base). A Markov chain of nucleotide bases can be generated using four WOBs in a “Sequence Generator (SG)”.

[Figure: animation of the Sequence Generator (SG), which uses the Wheels of Bases (WOB) for the bases G, A, C, T to build up a sequence one base at a time]
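The Sequence Generator can be sketched as a first-order Markov chain in which the base just produced selects which of the four wheels is spun next. This is an illustration under that reading of the slides; the function name, the argument layout, and the deterministic example wheels are assumptions and not the talk's parameters.

```python
import random

BASES = "ACGT"


def markov_sequence(length, initial_probs, transition_probs, seed=None):
    """Generate a base sequence from a first-order Markov chain.

    initial_probs:    probabilities for the first base, e.g. {"A": 0.25, ...}.
    transition_probs: one wheel per current base; transition_probs[x][y] is
                      the chance that base y follows base x.
    """
    if length <= 0:
        return ""
    rng = random.Random(seed)
    current = rng.choices(BASES, weights=[initial_probs[b] for b in BASES])[0]
    out = [current]
    for _ in range(length - 1):
        wheel = transition_probs[current]  # the wheel chosen by the present base
        current = rng.choices(BASES, weights=[wheel[b] for b in BASES])[0]
        out.append(current)
    return "".join(out)


if __name__ == "__main__":
    # Extreme, made-up wheels in which each base always forces the next one.
    initial = {"A": 1, "C": 0, "G": 0, "T": 0}
    wheels = {
        "A": {"A": 0, "C": 1, "G": 0, "T": 0},
        "C": {"A": 0, "C": 0, "G": 1, "T": 0},
        "G": {"A": 0, "C": 0, "G": 0, "T": 1},
        "T": {"A": 1, "C": 0, "G": 0, "T": 0},
    }
    print(markov_sequence(8, initial, wheels))
```

If all four wheels are identical, the chain reduces to the single-wheel random sequence model, so the Markov model is a strict generalization of it.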

Results Obtained for Markov Sequence Models
Probabilities of occurrences of single palindromes; probabilities of occurrences of overlapping palindromes; mean and variance of palindrome counts.

Related Work in Progress
Finding the palindrome distribution on Markov random sequences. Investigating other sequence patterns, such as close repeats and inversions, in relation to replication origins.

Other Mathematical Topics in Genes and Diseases
Optimization techniques: prediction of molecular structures. Differential equations: molecular dynamics. Matrix theory: analyzing gene expression data. Fourier analysis: proteomics data.

Acknowledgements
Collaborators: Louis H. Y. Chen (National University of Singapore); David Chew (National University of Singapore); Kwok Pui Choi (National University of Singapore); Aihua Xia (University of Melbourne, Australia).
Funding support: NIH Grants S06GM , S06GM , and 2G12RR ; NSF DUE ; W.M. Keck Center of Computational & Struct. Biol. at Rice University; National Univ. of Singapore ARF Research Grant (R ); Singapore BMRC Grants 01/21/19/140 and 01/1/21/19/217.

St. Stephen’s Girls’ College

University of Hong Kong Department of Mathematics: A Beach Picnic

Continuing to Find Mathematics in Genes and Diseases Ming-Ying Leung Department of Mathematical Sciences University of Texas at El Paso (UTEP)