November 18, 2000, ICTCM 2000: Introductory Biological Sequence Analysis Through Spreadsheets. Stephen J. Merrill, Sandra E. Merrill, Marquette University, Milwaukee

Presentation transcript:

November 18, 2000, ICTCM 2000
Introductory Biological Sequence Analysis Through Spreadsheets
Stephen J. Merrill, Sandra E. Merrill
Marquette University, Milwaukee, WI

Teaching Mathematics to Students of Biology
• Need to make the math in the courses correlate with the math that is needed in that discipline
• The most important “math” needed is statistics
• The molecular biology revolution presents data in a form on which calculus has little impact (sequences of letters)

The Nature of Biological Sequence Data
• The primary structures of DNA, RNA, and proteins are sequences of letters: 4 letters in the case of DNA (ATGC) and RNA (AUGC), and 20 letters representing the sequence of amino acids that makes up a protein
• Secondary and tertiary structure (bending, folding, and twisting) determines function; hints of it can be seen in the primary structure

Use of Spreadsheets in This Setting
• Commonly found and used in biological labs for data acquisition, storage and organization, and data analysis
• Commonly present on student computers and in computer labs
• Unlike calculators, able to handle data sets typical of “real world” applications
• R.F. Murphy at CMU has developed a set of worksheets for sequence analysis

Meaningful Questions & Problems
1. Measuring the similarity between two strings (“alignment” or “homology”)
2. Finding instances of a pattern in a string
3. Describing the composition and properties of a string
4. Graphing the evolutionary process and constructing phylogenetic trees

Measuring the Similarity between Strings
• Given a gene, suggest the function of the protein it codes for by finding a similar sequence (possibly in another species)
• Simple homology assigns a “1” for agreement and a “0” for disagreement at each site, then sums over all sites
• Homology is that sum expressed as a percentage of the highest possible score
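The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the 1/0 site-by-site count, not the spreadsheet from the talk; the function name `simple_homology` and the example sequences are made up for the sketch, and the two sequences are assumed to be already aligned and of equal length.

```python
def simple_homology(seq_a: str, seq_b: str) -> float:
    """Percent of sites at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # Score 1 for agreement and 0 for disagreement at each site, then sum.
    score = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    # The highest possible score equals the number of sites.
    return 100.0 * score / len(seq_a)


# Hypothetical example: the sequences differ at 2 of 8 sites.
print(simple_homology("ATGCATGC", "ATGAATGG"))  # 75.0
```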

Spreadsheet #1: Simple Homology

Spreadsheet #1 (cont.): Comparing random sequences
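A rough Python analogue of the random-sequence comparison is sketched below. It assumes the four DNA letters are equally likely at every site, in which case two unrelated sequences should agree at about 25% of sites, giving a baseline for judging whether an observed homology is meaningful. The function names, sequence length, and number of trials are illustrative choices, not values from the talk.

```python
import random


def random_dna(length: int, rng: random.Random) -> str:
    """Random DNA string with each of A, T, G, C equally likely (an assumption)."""
    return "".join(rng.choice("ATGC") for _ in range(length))


def mean_random_homology(length: int, trials: int, seed: int = 0) -> float:
    """Average percent agreement between pairs of unrelated random sequences."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    total = 0.0
    for _ in range(trials):
        a = random_dna(length, rng)
        b = random_dna(length, rng)
        matches = sum(1 for x, y in zip(a, b) if x == y)
        total += 100.0 * matches / length
    return total / trials


# The result should come out close to 25 (one chance in four of a match per site).
print(mean_random_homology(length=100, trials=1000))
```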

Finding Instances of a Particular Pattern in a String
• Locating genes involves finding regions of a DNA sequence that contain patterns resembling those of known genes
• Identifying sites where a restriction enzyme can cleave the DNA; the sizes of the resulting fragments are also of interest
• Identifying regions of RNA that correspond to particular features (e.g. loops) and may be splice sites
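As one concrete case, the restriction-site question can be sketched in Python. The example uses the EcoRI recognition sequence GAATTC, which is cut after the first G; the DNA string, the function names, and the simplification to a single strand of a linear molecule are assumptions made for the sketch.

```python
def find_sites(dna: str, pattern: str) -> list[int]:
    """0-based start positions of every occurrence of pattern in dna."""
    positions = []
    start = dna.find(pattern)
    while start != -1:
        positions.append(start)
        # Resume one letter later so overlapping occurrences are not missed.
        start = dna.find(pattern, start + 1)
    return positions


def fragment_lengths(dna: str, pattern: str, cut_offset: int) -> list[int]:
    """Fragment sizes after cutting cut_offset letters into each site (linear DNA)."""
    cuts = [p + cut_offset for p in find_sites(dna, pattern)]
    bounds = [0] + cuts + [len(dna)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


dna = "TTGAATTCAAGGGAATTCTT"  # hypothetical 20-letter sequence
print(find_sites(dna, "GAATTC"))           # [2, 12]
print(fragment_lengths(dna, "GAATTC", 1))  # [3, 10, 7]
```

The fragment sizes always sum to the length of the original sequence, which is a quick check students can make on their own results.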

Describing the Composition and Properties of a String
• Counts and frequencies of particular letters, chosen for their properties (e.g. regions rich in G&C or A&T in DNA)
• Properties of proteins (e.g. charge or hydrophobicity) that depend on the nature and frequencies of the particular amino acids
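A minimal Python sketch of the letter-counting idea follows: overall G+C content, plus the same figure in a sliding window to locate GC-rich regions. The function names, the example sequence, and the window size are illustrative assumptions.

```python
from collections import Counter


def gc_percent(dna: str) -> float:
    """Percent of letters in dna that are G or C."""
    if not dna:
        raise ValueError("sequence must be non-empty")
    counts = Counter(dna)  # frequency of each letter
    return 100.0 * (counts["G"] + counts["C"]) / len(dna)


def windowed_gc(dna: str, window: int) -> list[float]:
    """G+C percent in every window of the given width, moving one letter at a time."""
    return [gc_percent(dna[i:i + window]) for i in range(len(dna) - window + 1)]


dna = "ATATGCGCGCAT"  # hypothetical sequence with a GC-rich middle
print(Counter(dna))
print(gc_percent(dna))      # 50.0
print(windowed_gc(dna, 4))  # rises from 0.0 at the AT-rich start to 100.0 in the middle
```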

Spreadsheet #2: Hydropathy Plot

Spreadsheet #2 (cont.)
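The slides do not say which hydropathy scale or window width the spreadsheet uses. The sketch below assumes the widely used Kyte-Doolittle scale and a 7-residue window, and computes the moving average that a hydropathy plot displays; the protein string is made up for illustration.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the talk does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list[float]:
    """Average hydropathy over each window of residues, one value per window.

    Raises KeyError if the sequence contains a letter outside the 20 amino acids.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


protein = "MKKLLVVAILLAGIWDEEKR"  # hypothetical sequence
for position, value in enumerate(hydropathy_profile(protein), start=1):
    print(position, round(value, 2))
```

Plotting the second column against the first gives the hydropathy plot: positive stretches suggest hydrophobic regions, negative stretches suggest charged or polar ones.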

Graphing Evolution and Phylogenetic Trees
• The evolutionary distance between two DNA sequences is used to determine the process of change in the sequences over time (e.g. the evolution of HIV or the flu viruses)
• Trees are constructed to express the relationship between related sequences; distance in the tree is a monotone function of homology
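The slides do not specify which monotone function of homology is used. One standard choice, assumed here purely for illustration, is the Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3), where p is the fraction of sites that differ (so p = 1 - homology/100). The sketch computes pairwise distances of the kind a tree-building method would take as input; the sequence names and strings are hypothetical.

```python
import math
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def jukes_cantor_distance(a: str, b: str) -> float:
    """Jukes-Cantor distance (an assumed model): grows as homology falls."""
    p = p_distance(a, b)
    if p >= 0.75:
        raise ValueError("sequences too different for this formula")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


sequences = {  # hypothetical aligned sequences
    "A": "ATGCATGCAT",
    "B": "ATGCATGCAA",
    "C": "ATGGATCCAA",
}
for name_1, name_2 in combinations(sequences, 2):
    d = jukes_cantor_distance(sequences[name_1], sequences[name_2])
    print(name_1, name_2, round(d, 3))
```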

Spreadsheet #3: Mutation & Evolution

Spreadsheet #3 (cont.)
To study the evolution of a sequence, we randomly pick a site for mutation, then change its letter.
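The mutation step can be sketched in Python as follows. The sketch assumes every site is equally likely to be picked and that the new letter is drawn uniformly from the three other DNA letters; the slides do not state either detail, and the function names and starting sequence are illustrative.

```python
import random


def mutate_once(seq: str, rng: random.Random, alphabet: str = "ATGC") -> str:
    """Pick one site at random and replace its letter with a different one."""
    site = rng.randrange(len(seq))
    # Assumption: the replacement is always a different letter from the current one.
    choices = [letter for letter in alphabet if letter != seq[site]]
    return seq[:site] + rng.choice(choices) + seq[site + 1:]


def evolve(seq: str, steps: int, seed: int = 0) -> list[str]:
    """Sequence history: the ancestor followed by one entry per mutation."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    history = [seq]
    for _ in range(steps):
        history.append(mutate_once(history[-1], rng))
    return history


ancestor = "ATGCATGCAT"  # hypothetical starting sequence
for generation, descendant in enumerate(evolve(ancestor, steps=5)):
    differences = sum(1 for x, y in zip(ancestor, descendant) if x != y)
    print(generation, descendant, differences)
```

Because a later mutation can hit a site that has already changed, the number of differences from the ancestor can be smaller than the number of mutations applied.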

Conclusion
• Use of a spreadsheet makes possible an experimental approach to introducing the mathematics of sequence analysis
• The use of spreadsheets makes possible the use of real-world data and presents the computational tool in a meaningful context
• The importance of the topics to all educated individuals suggests that the topics be included in many liberal arts math courses