Bioinformatics and Phylogenetic Analysis


Bioinformatics and Phylogenetic Analysis
Edgar Scott, Multicampus Bioinformatics Education Specialist

What is Bioinformatics?
- An interdisciplinary field that applies principles and techniques from computer science, probability and statistics, and linguistics to the study of genomic and proteomic sequences.
- Biological databases for storing and organizing DNA and protein sequences
- Computational tools for analyzing sequences

Phylogenetic Analysis and Bioinformatics
- Phylogenetics – the study of evolutionary relationships
- Phylogenetic trees are used to represent evolutionary relationships
- Relationships are detected from protein or DNA sequences rather than from morphological characters
- Bioinformatics provides both sequence repositories and sequence analysis software.

Overview
- Acquiring the data set
  - Text searching at the National Center for Biotechnology Information (NCBI)
  - Sequence similarity and homology
  - Sequence similarity searching with the Basic Local Alignment Search Tool (BLAST)
- Analyzing the data set
  - Phylogenetic analysis with Molecular Evolutionary Genetics Analysis (MEGA) 3.1 software
  - Build multiple sequence alignments using ClustalW
  - Build phylogenetic trees

Text Searching at NCBI
- NCBI provides molecular information and bioinformatics tools to the scientific community
- GenBank – an archival DNA and protein sequence database
- RefSeq – a curated DNA and protein sequence database
- Entrez Gene – a gene-centered database
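Text searches of this kind can also be scripted. The sketch below only builds a query URL for NCBI's E-utilities `esearch` endpoint and does not send any request. The function name `build_esearch_url`, the choice of the `protein` database, and the example search term are illustrative assumptions, not part of the original slides.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities text-search service (esearch).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(database, term, max_records=20):
    """Build (but do not send) an esearch URL for a text query.

    database    -- Entrez database name, e.g. "protein" or "gene"
    term        -- query text, optionally with field tags such as [Organism]
    max_records -- upper limit on the number of record IDs to return
    """
    params = {
        "db": database,
        "term": term,
        "retmax": max_records,
        "retmode": "json",
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Hypothetical example query: human hemoglobin protein records.
url = build_esearch_url("protein", "hemoglobin AND Homo sapiens[Organism]", 5)
print(url)
```

Fetching the URL returns a list of record identifiers that match the text query. Any real use should follow NCBI's usage guidelines for automated access.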

Sequence Similarity and Homology
- Homologs – sequences that share a common ancestral sequence
  - Paralogs – arise via gene duplication
  - Orthologs – arise via a speciation event
  - Xenologs – arise via horizontal gene transfer
- Evolutionarily related sequences tend to be similar.
- Sequence differences correspond to the amount of change that has occurred since the sequences last shared a common ancestral sequence.

Sequence Alignments
- Sequence alignment – a process that identifies a series of characters or character patterns that are in the same order in both sequences.
  - Pairwise global alignment
  - Pairwise local alignment
- Optimal alignment – an alignment between sequences in which the number of matching characters is maximized and the number of mismatching characters is minimized.
- Quantifying alignments
  - Alignment score of the optimal alignment
  - Percent identity scores
  - Percent similarity scores
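To make the ideas of an optimal global alignment, its score, and percent identity concrete, here is a minimal sketch of the Needleman-Wunsch dynamic programming algorithm. The scoring values (match +1, mismatch -1, linear gap penalty -2) are arbitrary assumptions chosen for illustration. Real tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same non-gap character."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


score, top, bottom = needleman_wunsch("ACGT", "AGT")
print(score, top, bottom, percent_identity(top, bottom))
```

Percent identity here is computed over all alignment columns, including gapped ones. Other conventions divide by the length of the shorter sequence, so reported values can differ between programs.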

Sequence Similarity Searching
- Basic Local Alignment Search Tool (BLAST)
  - blastp, blastn, blastx, tblastn, and tblastx
  - Local alignments are reported
- Expectation value (E-value) – the number of alignments with a score as good as or better than the score under consideration that an investigator can expect to find by chance in a database of the same size.
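The expectation value can be illustrated with the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where S is the raw alignment score, m the query length, and n the database length. The sketch below is an illustration of that formula only. The `lam` and `k` values in the example are placeholder assumptions, since the real parameters depend on the scoring matrix and gap penalties in use, and BLAST additionally applies length corrections that are omitted here.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score into a normalized bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance alignments scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Placeholder statistical parameters (assumed for illustration only).
LAM, K = 0.267, 0.041

e_small_db = expect_value(100, query_len=250, db_len=1_000_000, lam=LAM, k=K)
e_large_db = expect_value(100, query_len=250, db_len=2_000_000, lam=LAM, k=K)
print(e_small_db, e_large_db)
```

Two properties follow directly from the formula: a higher score gives a smaller E-value, and the same score is less significant in a larger database because E grows in proportion to the search space.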

Steps to Build a Tree
1. Build a multiple sequence alignment of the data set.
2. Analyze the multiple sequence alignment using either distance-based methods or character-based methods.
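For the distance-based route, the second step starts by turning the alignment into pairwise distances. A minimal sketch follows, assuming the simple p-distance (the proportion of differing sites) and pairwise deletion of gapped columns. The function names and the toy alignment are made up for illustration. Programs such as MEGA also offer model-corrected distances.

```python
def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    pairs = [(x, y) for x, y in zip(seq1, seq2) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns shared by the two sequences")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Return (names, matrix) for a dict mapping names to aligned sequences."""
    names = list(alignment)
    matrix = [[p_distance(alignment[r], alignment[c]) for c in names] for r in names]
    return names, matrix


# Toy alignment, invented for illustration.
toy = {"seqA": "ACGT", "seqB": "ACGA", "seqC": "A-TT"}
names, matrix = distance_matrix(toy)
for name, row in zip(names, matrix):
    print(name, [round(d, 3) for d in row])
```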

Molecular Evolutionary Genetics Analysis (MEGA) 3.1
- Phylogenetic analysis program
- Constructs multiple sequence alignments using ClustalW
- Provides tree-building methods
  - Distance-based methods
    - UPGMA
    - Neighbor-joining
    - Minimum evolution
  - Character-based method
    - Maximum parsimony
- Provides a great help document!

Multiple Sequence Alignment
- Multiple sequence alignment – an alignment between three or more sequences.
- Computing an optimal multiple sequence alignment is classified as NP-hard
- Programs
  - ClustalW – fast, applies a progressive method
  - T-Coffee – slower, applies an advanced progressive method
  - Dialign – slow, applies an iterative method
  - Combine – combines multiple sequence alignments

Tree Building Methods
- UPGMA, Neighbor-Joining, Minimum Evolution
  - Distance-based methods
  - Analyze the multiple sequence alignment to calculate a distance matrix.
  - A clustering algorithm analyzes the distance matrix to determine which sequences should be clustered.
- Maximum Parsimony
  - Character-based method
  - Analyzes the multiple sequence alignment to create a tree whose tree length (the total number of character changes) has been minimized.
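The clustering step of a distance-based method can be sketched with UPGMA, the simplest of the three. The code below is a minimal illustration, not MEGA's implementation. The nested-tuple tree format `(left, right, height)` and the example distance matrix are assumptions made for this sketch.

```python
def upgma(names, dist):
    """Cluster taxa with UPGMA.

    names -- list of taxon labels
    dist  -- symmetric matrix (list of lists) of pairwise distances
    Returns a nested tuple (left_subtree, right_subtree, node_height).
    """
    # Each cluster id maps to (subtree, number of taxa in the cluster).
    clusters = {i: (names[i], 1) for i in range(len(names))}
    # Distances are keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)

    while len(clusters) > 1:
        i, j = min(d, key=d.get)          # closest pair of clusters
        height = d[(i, j)] / 2            # node sits halfway between them
        (tree_i, size_i), (tree_j, size_j) = clusters.pop(i), clusters.pop(j)

        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        merged = {}
        for k in clusters:
            d_ik = d[(min(i, k), max(i, k))]
            d_jk = d[(min(j, k), max(j, k))]
            merged[(k, next_id)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)

        # Drop every distance that involves the two clusters just merged.
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(merged)
        clusters[next_id] = ((tree_i, tree_j, height), size_i + size_j)
        next_id += 1

    return next(iter(clusters.values()))[0]


# Invented example distances for four taxa.
taxa = ["A", "B", "C", "D"]
distances = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
tree = upgma(taxa, distances)
print(tree)
```

UPGMA assumes a constant rate of change along all lineages. Neighbor-joining relaxes that assumption, which is one reason it is often preferred when rates vary.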

Tree Reliability
- Bootstrapping – a method for assessing the reliability of trees.
- Steps
  1. The original data set is resampled many times (e.g. 1000).
  2. For each resampling, a tree is built.
  3. The trees created from the resampling iterations are compared to the original tree.
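The three steps can be sketched as follows, assuming the usual phylogenetic bootstrap in which alignment columns are resampled with replacement. The tree builder is passed in as a function (`build_clades`, a hypothetical name) that returns the set of clades in a tree, each clade being a frozenset of taxon names. The toy builder in the example is a stand-in, not a real tree method.

```python
import random
from collections import Counter


def resample_columns(alignment, rng):
    """Draw alignment columns with replacement, keeping the original length."""
    names = list(alignment)
    length = len(alignment[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


def bootstrap_support(alignment, build_clades, replicates=1000, seed=0):
    """Fraction of bootstrap trees that contain each clade of the original tree."""
    rng = random.Random(seed)
    reference = build_clades(alignment)            # clades of the original tree
    counts = Counter()
    for _ in range(replicates):
        replicate_clades = build_clades(resample_columns(alignment, rng))
        counts.update(replicate_clades & reference)  # clades recovered again
    return {clade: counts[clade] / replicates for clade in reference}


def toy_builder(alignment):
    """Stand-in tree builder that always groups A with B (illustration only)."""
    return {frozenset({"A", "B"})}


aln = {"A": "ACGT", "B": "ACGA", "C": "TCGA"}
support = bootstrap_support(aln, toy_builder, replicates=100)
print(support)
```

In practice `build_clades` would wrap a real method such as neighbor-joining, and the support values are written on the corresponding branches of the original tree.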

Review
- Acquiring the data set
  - Text searching at the National Center for Biotechnology Information (NCBI)
  - Sequence similarity and homology
  - Sequence similarity searching with the Basic Local Alignment Search Tool (BLAST)
- Analyzing the data set
  - Phylogenetic analysis with Molecular Evolutionary Genetics Analysis (MEGA) 3.1 software
  - Build multiple sequence alignments using ClustalW
  - Build phylogenetic trees