Lecture 6 : More trees 9/21/09.



Biology retreat 2009

Homework - nucleotide blast ACTGCGTTAC ACTGCCCTACT

Tblastn T A L ACTGCCCTACC T A L L P Y C P T

Tblastn TGACGGGATGG T A L ACTGCCCTACC T A L L P Y C P T

Tblastn = 3 + 3 = 6 comparisons CCATCCCGTCA T A L GGTAGGGCAGT G R A V G Q . G S
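The six comparisons (three forward reading frames plus three on the reverse complement) can be sketched in a few lines. This is only an illustration of the frame arithmetic using the slide's example sequence, not how BLAST itself is implemented; the function names are made up for this example, and a stop codon is printed as `*` where the slide uses a dot.

```python
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement every base, then read the strand in the opposite direction."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna, offset):
    """Translate complete codons starting at the given offset (0, 1 or 2)."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def six_frames(dna):
    """Return (forward frames, reverse frames), three translations each."""
    forward = [translate(dna, offset) for offset in range(3)]
    rev = reverse_complement(dna)
    reverse = [translate(rev, offset) for offset in range(3)]
    return forward, reverse


forward, reverse = six_frames("ACTGCCCTACC")
print(forward)  # ['TAL', 'LPY', 'CPT']
print(reverse)  # ['GRA', 'VGQ', '*GS']
```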

Nature paper on retinal gene therapy. Fig. 1a shows the construct they made in the virus to inject into the eye. It includes the LCR = locus control region, PP = proximal promoter, RHLOPS = recombinant human long wavelength opsin. Fig. 1b shows the light used to illuminate the eye to test for a response to red wavelengths. Fig. 1c shows the response in the retina. The small inset shows the response 16 weeks after injection. The larger image is 40 weeks after injection, so it takes a long time for the vector to turn on.

Dichromats

Trichromats

Squirrel monkey tested to see if it can discriminate colors from gray. Color is presented amongst a gray background. The monkey gets a juice reward. When the threshold gets very high, the monkey cannot discriminate colors in those regions. They appear gray and so cannot be distinguished from the gray dot background. The color appearance and the white point shift depending on whether the monkey is missing the long wavelength gene (top) or the medium wavelength gene (bottom left). If the monkey has all three visual pigments, it can discriminate colors across the full spectrum. Panels: color appearance for dichromats; color appearance for trichromat.

Squirrel monkey before and after treatment with virus containing the human LWS gene. It takes about 20 weeks for the virus with the new gene to build up such that the monkey gets cones sensitive to the longer wavelengths and can distinguish the green colors (circles) from the gray background. The enclosed triangle and square are an untreated dichromat, which does not improve. Each monkey is tested before treatment (circles) and again after treatment (blue dots). The color confusion region around 490 nm goes away and the monkey can distinguish colors across the full spectrum. Note: pink points are from trichromatic females.

Big ideas How do genomes evolve? How does that impact an organism? What forces drive this evolution? Small details How do we get gene sequences? How do we compare them?

Questions for today Making trees What can you learn from a tree? How do we root trees? Programs for making trees

Last time. Parsimony: count # changes in sequence for a given tree; the tree with the smallest # of changes is best = most parsimonious. Distance: calculate % difference = distance between sequences; sequences that are most similar are closest on the tree.
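The distance idea can be illustrated with a small sketch. The three aligned sequences below are invented for this example (they are not from the lecture), and the code only computes the raw % difference between each pair; a real distance method would go on to build a tree from these numbers.

```python
from itertools import combinations


def percent_difference(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


# Hypothetical aligned sequences, chosen only to illustrate the calculation.
seqs = {
    "human": "ACTGCCCTAC",
    "mouse": "ACTGCGCTAC",
    "bird": "ACAGCGTTAC",
}

distances = {
    (a, b): percent_difference(seqs[a], seqs[b])
    for a, b in combinations(seqs, 2)
}
closest_pair = min(distances, key=distances.get)

for pair, dist in distances.items():
    print(pair, dist)
print("closest on tree:", closest_pair)
```

With these made-up sequences the human and mouse sequences differ least, so a distance method would join them first.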

Maximum likelihood methods Assume explicit model for how DNA evolves = rates of nucleotide change Many models of different complexity for how DNA can change Fit model to ALL data Shortest tree is best Branch lengths = time Rates of DNA change

DNA models. Bases: A, T, G, C. Transitions: A <-> G, C <-> T. Transversions: A <-> C, A <-> T, G <-> C, G <-> T.

DNA bases. Purines: adenine, guanine. Pyrimidines: cytosine, thymine. Transitions: A-G, C-T. Mutations are easy between purines or between pyrimidines. Easy to lose CH3 group from T and get a C, so T->C mutation rates are pretty high. This causes an A->G mutation on the other strand.

DNA models. Bases: A, T, G, C. Transitions: A <-> G, C <-> T. Transversions: A <-> C, A <-> T, G <-> C, G <-> T. Different models of different complexity. Many models will include a transition/transversion ratio.
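A transition/transversion ratio can be counted directly from two aligned sequences. The sketch below is a naive count on invented sequences; it does not correct for multiple changes at the same site, which the models on the following slides are designed to handle.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_change(x, y):
    """Label an aligned pair of bases as same, transition or transversion."""
    if x == y:
        return "same"
    if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
        return "transition"  # purine <-> purine or pyrimidine <-> pyrimidine
    return "transversion"    # purine <-> pyrimidine


def count_changes(seq_a, seq_b):
    counts = {"same": 0, "transition": 0, "transversion": 0}
    for x, y in zip(seq_a, seq_b):
        counts[classify_change(x, y)] += 1
    return counts


# Hypothetical aligned sequences, for illustration only.
counts = count_changes("ACTGCCCTAC", "ACAGCGTTAC")
print(counts)
print("ts/tv ratio:", counts["transition"] / counts["transversion"])
```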

Matrix: gives the probability of change from one base to another = symmetric. This is the change matrix (rows = starting base, columns = ending base):

Starting base   A        G        C        T
A               Stay A   AG       AC       AT
G               GA       Stay G   GC       GT
C               CA       CG       Stay C   CT
T               TA       TG       TC       Stay T

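Jukes Cantor, 1969: Transitions = transversions, no difference. Equal base frequencies. Change matrix (rows = starting base, columns = ending base):

     A       G       C       T
A    1-3α    α       α       α
G    α       1-3α    α       α
C    α       α       1-3α    α
T    α       α       α       1-3α

If α (alpha) is the rate or probability that you change to one particular other base, then 1 - 3α is the probability that you stay the same.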
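To see what the Jukes Cantor matrix implies over many steps, the sketch below builds the one-step matrix and multiplies it by itself repeatedly. The value alpha = 0.01 is an arbitrary choice for illustration. The closed form used for comparison, 1/4 + 3/4 (1 - 4α)^n, follows from the symmetry of the matrix.

```python
def jukes_cantor_matrix(alpha):
    """One-step change matrix: stay with 1 - 3*alpha, move to each other base with alpha."""
    return [[1 - 3 * alpha if i == j else alpha for j in range(4)] for i in range(4)]


def multiply(p, q):
    """Plain 4x4 matrix product."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def after_steps(alpha, n):
    """Change probabilities after n successive steps."""
    step = jukes_cantor_matrix(alpha)
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for _ in range(n):
        result = multiply(result, step)
    return result


alpha = 0.01  # arbitrary illustrative value
for n in (1, 10, 100, 1000):
    stay = after_steps(alpha, n)[0][0]
    closed_form = 0.25 + 0.75 * (1 - 4 * alpha) ** n
    print(n, round(stay, 6), round(closed_form, 6))
```

In the long run the probability of finding the starting base approaches 1/4: the sequence forgets where it started.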

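Kimura 2 parameter, 1980: Transitions and transversions differ. Equal frequencies. Change matrix (rows = starting base, columns = ending base; α = transition, β = transversion):

     A         G         C         T
A    1-α-2β    α         β         β
G    α         1-α-2β    β         β
C    β         β         1-α-2β    α
T    β         β         α         1-α-2β

Here we account for the fact that transitions and transversions occur at different rates.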
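A minimal sketch of the two-parameter matrix, assuming alpha is the transition probability and beta the transversion probability per step (the numeric values are arbitrary). Each base has one transition partner and two transversion partners, which is why the diagonal is 1 - alpha - 2*beta.

```python
BASES = "AGCT"
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def kimura_matrix(alpha, beta):
    """One-step change matrix with separate transition and transversion probabilities."""
    matrix = []
    for x in BASES:
        row = []
        for y in BASES:
            if x == y:
                row.append(1 - alpha - 2 * beta)  # stay the same
            elif frozenset((x, y)) in TRANSITIONS:
                row.append(alpha)                 # A <-> G or C <-> T
            else:
                row.append(beta)                  # purine <-> pyrimidine
        matrix.append(row)
    return matrix


K = kimura_matrix(alpha=0.02, beta=0.005)  # arbitrary illustrative values
for base, row in zip(BASES, K):
    print(base, row)
```

Setting alpha equal to beta gives back the Jukes Cantor matrix.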

Felsenstein 1981: Transitions = transversions. Unequal base frequencies (fA, fG, fC, fT). [Change matrix over A, G, C, T with entries weighted by the base frequencies fA, fG, fC, fT.]

Hasegawa Kishino Yano (HKY) 1985: Transitions and transversions differ. Unequal base frequencies (fA, fG, fC, fT). [Change matrix over A, G, C, T with entries weighted by the base frequencies fA, fG, fC, fT and by the transition or transversion rate.]
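The models so far can be seen as special cases of one recipe. The sketch below uses a common textbook parameterization, which is an assumption here rather than something taken from the slide: the chance of changing to base j is a rate (alpha for transitions, beta for transversions) times the frequency of base j. The frequencies and rates are invented for illustration.

```python
BASES = "AGCT"
PURINES = {"A", "G"}


def is_transition(x, y):
    """True when both bases are purines or both are pyrimidines."""
    return (x in PURINES) == (y in PURINES)


def hky_matrix(freqs, alpha, beta):
    """One-step change matrix; off-diagonal entry = rate * frequency of the target base."""
    matrix = []
    for x in BASES:
        row = []
        for y in BASES:
            if x == y:
                row.append(0.0)  # placeholder, filled in below
            else:
                rate = alpha if is_transition(x, y) else beta
                row.append(rate * freqs[y])
        row[BASES.index(x)] = 1.0 - sum(row)  # probability of staying the same
        matrix.append(row)
    return matrix


freqs = {"A": 0.3, "G": 0.2, "C": 0.2, "T": 0.3}  # hypothetical base frequencies
P = hky_matrix(freqs, alpha=0.02, beta=0.01)
for base, row in zip(BASES, P):
    print(base, [round(v, 4) for v in row])
```

With alpha equal to beta this reduces to the Felsenstein 1981 form, and with equal base frequencies as well it reduces to Jukes Cantor. The base frequencies are left unchanged by one step of the matrix, which is what makes them the long-run frequencies.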

Tamura Nei 1993 Transitions and transversions differ Unequal frequencies Different sites can vary at different rates Gamma distribution to describe rate differences

Maximum likelihood Find most likely explanation for the data given the model The tree topology is one of the variables to be fit Rate matrix is another Best approach Fits all the data Requires A LOT of computer time
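A very small instance of the idea: for just two sequences under the Jukes Cantor model, the only thing to fit is the distance between them. The sketch below tries many candidate distances and keeps the one that makes the observed data most likely, then compares it with the standard closed-form Jukes Cantor correction. The counts (100 sites, 10 differences) are invented, and a real ML program would also search over tree topologies and model parameters, which this example does not attempt.

```python
import math


def jc_log_likelihood(n_sites, n_diff, d):
    """Log-likelihood of two aligned sequences at distance d (substitutions per site)."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # site ends on the same base
    p_diff = 0.25 - 0.25 * decay   # site ends on one particular different base
    # Each starting base has frequency 1/4 under Jukes Cantor.
    return ((n_sites - n_diff) * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))


def ml_distance_grid(n_sites, n_diff, step=0.001, max_d=3.0):
    """Brute-force search: evaluate the likelihood on a grid of candidate distances."""
    candidates = [step * i for i in range(1, int(max_d / step) + 1)]
    return max(candidates, key=lambda d: jc_log_likelihood(n_sites, n_diff, d))


def jc_distance(n_sites, n_diff):
    """Closed-form Jukes Cantor distance from the observed fraction of differences."""
    p = n_diff / n_sites
    return -0.75 * math.log(1 - 4 * p / 3)


best = ml_distance_grid(100, 10)
print("grid search:", best)
print("closed form:", jc_distance(100, 10))
```

Even this one-parameter search evaluates the likelihood thousands of times, which hints at why fitting a full tree takes a lot of computer time.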

Can also build ML trees for proteins Use matrix which describes rate of change from one AA to another

Bayesian methods: Also a maximum likelihood approach. Takes what people know about parameter values and uses it as a prior probability. Does ML fit. Gives confidence levels of fit. Unfortunately, people often don’t know what values to use. All the rage, but may overestimate how well it does.

Q2. What can you learn from a tree?

Phylogenetic time travel

Phylogenetics Compare sequences and determine the relatedness of things Calculate % similarity of DNA or AA sequences Draw relatedness as a tree Human Mouse Bird Human Mouse Bird

Vertebrates Mammals Amphibians Marsupials Bird Reptiles Cartilaginous fish Bony fish Jawless fish Primates

Vertebrate divergence times: Mammals, 100 MY; Fish, 450 MY. The time scales show the times to most recent common ancestor. Cartilaginous fish = sharks, rays. Agnatha = jawless fish = lamprey, hagfish. Vertebrates are 500-600 MY old. Mammals are 100 MY. Tetrapods are 400 MY. Kumar and Hedges 1998.

Trees can tell you about genes Which organisms have the gene? Where did the gene come from? What happens to the gene once it’s there? Duplicate - tandem - mRNA can be inserted Lost

Default expectation - if gene arose early in vertebrates, all species will have a copy and gene will be related in same way as organisms Dog Gene A Opossum Gene A Chicken Gene A Frog Gene A Zebrafish Gene A

Examine whether a gene exists in all organisms Dog Gene A Opossum Gene A Chicken Gene A Frog Gene A Zebrafish No A Gained In sequencing gene A, we find out that zebrafish does not have any copy of gene A. However, all the other species have one copy. This suggests the gene arose after fish separated from the rest of the vertebrates. Amphibians, reptiles/birds, marsupials and mammals are called tetrapods as they have 4 limbs. This means this gene arose after fish and tetrapods diverged.
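The reasoning on this slide can be written as a small search: find the smallest group on the species tree that contains every organism carrying the gene. The sketch assumes the simplest story (a single gain and no later losses), and the nested-tuple tree is just a convenient way to write the species relationships shown on the slide.

```python
# Species tree from the slide, written as nested pairs.
SPECIES_TREE = (((("Dog", "Opossum"), "Chicken"), "Frog"), "Zebrafish")


def leaves(node):
    """Set of species names below a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def smallest_clade_with(node, species):
    """Descend while a single child still contains every listed species."""
    if isinstance(node, str):
        return node
    for child in node:
        if species <= leaves(child):
            return smallest_clade_with(child, species)
    return node


has_gene_a = {"Dog", "Opossum", "Chicken", "Frog"}
clade = smallest_clade_with(SPECIES_TREE, has_gene_a)
print("gene A gained on the branch leading to:", sorted(leaves(clade)))
```

Here the answer is the tetrapod group, matching the slide: the gene arose after fish and tetrapods diverged.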

Examine whether a gene exists in all organisms Mouse Platypus Finch Frog Pufferfish Gene loss

What is happening? Tree labels: Dog Gene A, Human Gene A, Chicken Gene A, Frog Gene A, Zebrafish Gene A1, Zebrafish Gene A2 (gene duplication). In this tree, zebrafish has two copies of the gene, but all the other organisms have only one copy. This gene duplication is specific to fish: only they had the gene duplicate.

Gene duplication. Tree labels: Dog, Human, Chicken, Frog, Zebrafish (Gene A1 and Gene A2); Lamprey (Gene A). In this tree, lamprey has only one copy of gene A. However, all the other vertebrates have two copies. These copies are very similar, but the two copies are distinct from each other. One example of this could be the red opsin gene and the blue opsin gene. This gene duplication occurred early in the history of vertebrates. Lamprey are vertebrates that do not have jaws. The rest of the vertebrates shown have jaws. Therefore, this figure shows that the gene duplicated between the time of jawless and jawed vertebrates.

Dog Human Frog Chicken Frog Zebrafish Lamprey

Gene duplication and then losses Human Chicken Frog Zebrafish Dog Lamprey

What does this tree tell us? LWS ZebrafishR RH2 RH1 SWS2 SWS1 Gaustralis = lamprey

LWS RH2 RH1 SWS2 SWS1 Gaustralis = lamprey ZebrafishR Opsin classes evolved early in history of vertebrates Rods evolved from cones Gene loss has occurred Gene duplications are everywhere SWS2 SWS1

Conclusions from opsin tree #1: 5 opsin classes arose very early in vertebrates. Cones: SWS1 - very short wavelength sensitive; SWS2 - short wavelength sensitive; RH2 - like rhodopsin but in cones; LWS - long wavelength sensitive. Rods: RH1 - rhodopsin.

Range of visual pigment λmax. Labels: SWS2, LWS, SWS1, RH2. This slide shows that each of the cone opsin classes occurs in a particular part of the spectrum. The line above each pigment curve is meant to show that the lambda max can be tuned over the range corresponding to the line length. Any one opsin sequence will correspond to just one lambda max. However, when you consider all of the SWS1 genes that occur throughout vertebrates, they produce pigments that range from about 360 to 400 nm. The SWS1 gene for any given animal will have a particular set of amino acids that are directed into the retinal binding pocket. A species with a short SWS1 (say 360 nm) will have a few key changes from a longer SWS1 gene (whose lambda max is around 400 nm). The same is true of the other cone opsin classes and also the rod opsin class.

Conclusion #2: Rod opsins evolved from cone opsins. Tree labels: RH1, RH2, SWS2, SWS1, LWS. This tree summarizes the relationships of the major opsin groups and shows that the cone opsins occur earliest in the tree and the rod opsin, RH1, evolved from cone opsins. Rhodopsin is Greek for rose + vision, which refers to the color of the pigment when you look at a dissected retina.


Conclusion #3 Mammals lost two of the opsin classes Mammals have LWS, SWS1 and RH1 Only 2 cone opsins (dichromat) Dogs, cats, mice, rats, horses, goats, pigs … Mammals went through “nocturnal period” during reign of dinosaurs

Q3. How do you root a tree?

Rooting the tree Phylogenetics tells you how things are related Doesn’t tell you anything about what came first Human Chimp Gorilla Orangutan

Define outgroup Human Gorilla Orangutan Chimp Human Chimp Gives you an ordered tree of relationships Gorilla Orangutan

What if an in-group is selected as the outgroup? Human Orangutan Chimp Gorilla Gorilla Human Orangutan Chimp Everything is inverted

How to pick the root: Use a more primitive organism. For vertebrates: invertebrate. For jawed vertebrates: lamprey. For mammals: marsupial. For primates: mammal. If studying gene families: for group A, use a gene from group B.
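Rooting with an outgroup can be sketched on the four-primate example. The unrooted tree is stored as a list of neighbours for each node; the internal node names `x` and `y` are invented labels, and the output is a Newick-style string. This is a toy illustration of the idea, not the algorithm any particular tree program uses.

```python
# Unrooted tree: Human and Chimp hang off internal node x, Gorilla and Orangutan off y.
UNROOTED = {
    "Human": ["x"],
    "Chimp": ["x"],
    "Gorilla": ["y"],
    "Orangutan": ["y"],
    "x": ["Human", "Chimp", "y"],
    "y": ["x", "Gorilla", "Orangutan"],
}


def root_on_outgroup(adjacency, outgroup):
    """Place the root on the branch leading to the outgroup and write the tree out."""
    def subtree(node, parent):
        children = [n for n in adjacency[node] if n != parent]
        if not children:  # a leaf: its only neighbour is its parent
            return node
        return "(" + ",".join(subtree(child, node) for child in children) + ")"

    (attach,) = adjacency[outgroup]  # a leaf has exactly one neighbour
    return "(" + outgroup + "," + subtree(attach, outgroup) + ");"


print(root_on_outgroup(UNROOTED, "Orangutan"))  # (Orangutan,((Human,Chimp),Gorilla));
print(root_on_outgroup(UNROOTED, "Human"))      # (Human,(Chimp,(Gorilla,Orangutan)));
```

The second call shows what happens when an in-group member is used as the outgroup: the same unrooted tree is read in the wrong order.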

Cladogram vs phylogram Cladogram only shows relationships No information on sequence difference Phylogram branch lengths are proportional to difference Human Human Chimp Chimp Mouse Mouse Chick Chick Cladogram Phylogram

Line lengths are proportional to how different sequences are. Labels: Human, Chimp, Dog. Humans and chimps shared a common ancestor about 5 MY ago, so their genes will be very similar. Dogs and other mammals diverged about 100 MY ago, so dog genes will be about 20x more different from human genes than chimp genes are.
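The 20x figure is just the ratio of the two divergence times, assuming changes accumulate at a constant rate. The sketch below makes that arithmetic explicit; the rate is a made-up number and cancels out of the ratio, and the calculation ignores repeated changes at the same site.

```python
def expected_difference(rate_per_my, divergence_my):
    """Both lineages accumulate changes since the common ancestor, hence the factor 2."""
    return 2 * rate_per_my * divergence_my


RATE = 0.001  # hypothetical changes per site per million years

human_chimp = expected_difference(RATE, 5)    # about 5 MY since common ancestor
human_dog = expected_difference(RATE, 100)    # about 100 MY since common ancestor
ratio = human_dog / human_chimp

print(human_chimp, human_dog, ratio)
```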

Gene can evolve quickly Phylogram Rat Human Chicken Salamander Shark

Phylogram. Labels: Human, Chimp, Gorilla, Orangutan; Human - chimp distance. Horizontal distances matter. Vertical distances don’t count!! When we draw the phylogram, we are allowed to add whatever vertical lines we need to space things out and make them easy to look at. Only the horizontal line lengths matter. These are proportional to the distance (sequence difference) between two genes.
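Reading a distance off a phylogram means adding up the horizontal branch lengths along the path between two tips. The sketch below stores each node's parent and the length of the branch leading to it; the node names and branch lengths are invented for illustration.

```python
# node -> (parent, length of the horizontal branch leading to the node)
BRANCHES = {
    "Human": ("hc", 5),
    "Chimp": ("hc", 5),
    "hc": ("hcg", 3),
    "Gorilla": ("hcg", 8),
    "hcg": ("root", 7),
    "Orangutan": ("root", 15),
}


def distances_to_ancestors(node):
    """Map each ancestor (and the node itself) to the branch length travelled to reach it."""
    travelled = {node: 0}
    total = 0
    while node in BRANCHES:
        parent, length = BRANCHES[node]
        total += length
        travelled[parent] = total
        node = parent
    return travelled


def tree_distance(a, b):
    """Sum of branch lengths on the path from a to b through their common ancestor."""
    up_a = distances_to_ancestors(a)
    up_b = distances_to_ancestors(b)
    # The most recent common ancestor gives the shortest combined path.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


print("Human - Chimp:", tree_distance("Human", "Chimp"))
print("Human - Gorilla:", tree_distance("Human", "Gorilla"))
print("Human - Orangutan:", tree_distance("Human", "Orangutan"))
```

Nothing about vertical position enters the calculation, which is the point of the slide.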