Lecture 4 – Characters: Molecular First used by Luca Cavalli-Sforza and Anthony Edwards.

Pairwise distance matrix
[Matrix of pairwise distances among samples cwk1056, eaa292, cwk1025, eaa448, dsr5032, eaa028, fac1117, cwk1007]
The units for these distances vary, but the matrix can then be subjected to a number of potential phylogenetic analyses. Information regarding comparative genomics may be presented as inherently distance data.

An example of a simple genomic distance (Edwards et al. 2002. Syst. Biol. 51:599). The data are large amounts of sequence, assumed to be a random sample from each respective genome. Begin by calculating the frequency of each of the $4^n$ possible words in each taxon, where n is the length of the word in bp. For n = 1, there are 4 words: G, A, T, C (the data are the base frequencies). For n = 2, there are 16 possible dinucleotide words, giving 16 frequencies.

Edwards et al. (2002) use 5 bp words, so there are $4^5 = 1024$ possible words, and the frequency of each word is calculated from the genome sample for each OTU. So, for each taxon, we have a vector of pentanucleotide frequencies. The Euclidean distance between each pair of genomes is calculated to generate a distance matrix:

$d_{ij} = \sqrt{\sum_{x} (f_{xi} - f_{xj})^2}$

where $f_{xi}$ is the frequency of word x in taxon i and $f_{xj}$ is the frequency of word x in taxon j.
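The word-frequency distance can be sketched in a few lines of Python. This is only a minimal illustration of the idea, not the code used by Edwards et al. (2002). The function names (`word_frequencies`, `euclidean_distance`, `distance_matrix`) are made up for this example. Skipping words that contain ambiguous bases is an assumption of the sketch.

```python
from itertools import product
from math import sqrt


def word_frequencies(seq, n=5):
    """Return the frequency of every possible n-bp word (4**n values) in seq."""
    words = ["".join(p) for p in product("ACGT", repeat=n)]
    counts = dict.fromkeys(words, 0)
    total = 0
    for i in range(len(seq) - n + 1):
        w = seq[i:i + n]
        if w in counts:  # assumption: skip words with ambiguous bases such as N
            counts[w] += 1
            total += 1
    # Frequencies are reported in a fixed (alphabetical) word order.
    return [counts[w] / total if total else 0.0 for w in words]


def euclidean_distance(f_i, f_j):
    """Euclidean distance between two word-frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(f_i, f_j)))


def distance_matrix(samples, n=5):
    """samples maps taxon name -> sequence; returns (names, square matrix)."""
    names = list(samples)
    freqs = {name: word_frequencies(samples[name].upper(), n) for name in names}
    matrix = [[euclidean_distance(freqs[a], freqs[b]) for b in names] for a in names]
    return names, matrix
```

With n = 5 each taxon is reduced to a vector of 1024 frequencies, so the cost of comparing two genomes does not depend on how much sequence was sampled.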

This matrix is then subjected to any of a number of tree-estimation methods. The deep split in bird phylogeny (the paleognathous birds) is reflected in the genomic signature.
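The slide does not say which tree-estimation method is applied to the distance matrix. As one illustration, the sketch below uses UPGMA (average-linkage clustering), chosen here only because it is short; neighbor-joining or other distance methods could be used instead. The function name `upgma` and the nested-tuple tree format are assumptions of this example, and branch lengths are omitted.

```python
def upgma(names, dist):
    """Cluster a symmetric distance matrix with UPGMA; return a nested-tuple tree."""
    # Each active cluster id maps to (subtree, number_of_leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by (smaller_id, larger_id).
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        # Join the closest pair of clusters.
        a, b = min(d, key=d.get)
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # leaf-count-weighted mean of the two old distances.
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop every distance that involved a or b, then add the new ones.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree
```

UPGMA assumes roughly clock-like distances; when rates differ strongly among lineages, a method such as neighbor-joining is usually the safer choice.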

Potential Molecular Characters
1. Allozymes – Allelic forms of proteins (usually enzymes) that differ by a charge-changing amino-acid substitution. Distance-based or character-based analyses were conducted.
2. Chromosomal Inversions have a long history of use because Diptera have polytene chromosomes. One can puzzle out the order of inversions and use the events as characters.

Chromosomal Inversions (Kamali et al. PLoS Pathogens)

3. Fragment Data
DNA sequence variation can be assayed indirectly with restriction enzymes. EcoRI will cleave DNA anywhere the following sequence occurs:
..G – A – A – T – T – C..
  |   |   |   |   |   |
..C – T – T – A – A – G..
4. Sequence Data
a. Gene sequences – 4 possible character states.
b. Protein sequences – 20 possible character states.
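For the fragment data in item 3, the sketch below shows how a restriction digest turns sequence variation into fragment-length variation. It is an illustration only: the function name `ecori_fragments` is hypothetical, the molecule is assumed to be linear, and only one strand is scanned (the EcoRI site is palindromic, so this is sufficient). EcoRI cuts between the G and the first A of GAATTC, which is what `cut_offset=1` encodes.

```python
def ecori_fragments(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        # The cut falls cut_offset bases into the recognition site (G^AATTC).
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# One site present: two fragments.
print(ecori_fragments("AAGAATTCTT"))  # [3, 7]
# A single substitution destroys the site: one uncut fragment.
print(ecori_fragments("AAGAGTTCTT"))  # [10]
```

A point mutation inside the recognition site changes the fragment pattern, which is why site presence or absence can be scored as a character without sequencing.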

5. Higher-order molecular characters (Rare Genomic Changes)
Rokas and Holland (2000. TREE, 15:454).

a. Insertions/Deletions in/of introns. These are often applied to already existing phylogenetic hypotheses. Murphy et al. (2007. Genome Res., 17: 413)

microRNA (miRNA) Profile Tarver et al. (2013. Mol. Biol. Evol. 30:2369)

microRNA (miRNA) Profile
Losses are more frequent than reported, rates of gain and loss are highly heterogeneous, and there is ascertainment bias. Model-based analyses that account for these factors can refute the conclusions of simple analyses.

Gene-order data
Webster & Littlewood. Int. J. Parasit. 42.

Genomic Distances
Increasingly, gene content data have been applied to the growing database of prokaryotic genomes.
High-scoring segment pairs (HSPs) – "genes" that have high scores in BLAST searches. HSP-based distances measure the number of base pairs shared by a pair of genomes in these putatively homologous genes.
Snel et al. Nature Genetics 21.
Korbel et al. 2002. Trends Genet. 18.
Bernard et al. J. Comp. Syst. Sci. 65.
Henz et al. Bioinformatics 21.
Auch et al. Standard operating procedure for calculating genome-to-genome distances based on high-scoring segment pairs. Standards in Genomic Sciences 2.
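A gene-content distance can be sketched as follows, assuming the homology step (for example, BLAST searches) has already assigned each gene to a family identifier. The function name `gene_content_distance` and the family labels are hypothetical. Normalizing by the smaller genome is one possible choice, made here as an assumption of the example rather than taken from the slides.

```python
def gene_content_distance(genes_a, genes_b):
    """Distance = 1 - (shared gene families / families in the smaller genome)."""
    set_a, set_b = set(genes_a), set(genes_b)
    shared = len(set_a & set_b)
    smaller = min(len(set_a), len(set_b))
    # Assumption: an empty genome is treated as maximally distant.
    return 1.0 - shared / smaller if smaller else 1.0


# Hypothetical gene-family identifiers for two genomes.
genome_1 = {"fam1", "fam2", "fam3"}
genome_2 = {"fam2", "fam3", "fam4", "fam5"}
d = gene_content_distance(genome_1, genome_2)  # 2 shared of 3 in the smaller genome
```

Dividing by the smaller genome keeps a small genome that is fully contained in a larger one at distance 0; dividing by the union instead would penalize differences in genome size.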