Welcome to Part 3 of Bio 219 Lecturer – David Ray Contact info: Office hours – 1:00-2:00 pm MTW Office location – LSB 5102 Office phone – ext – Lectures are available online at go to ‘Teaching’ link
How Genes and Genomes Evolve
Variation There is obviously variation among and within taxa. How does the variation arise in genomes? Are there patterns to the variation? How is the variation propagated? What questions can be addressed using the variation? What patterns exist in humans with regard to genomic variability?
Generating Genetic Variation Somatic vs. germ line cells –Somatic cells – “body” cells, no long term descendants, live only to help germ cells perform their function. –Germ cells – reproductive cells, give rise to descendants in the next generation of organisms.
Generating Genetic Variation Somatic vs. germ line mutations –Somatic mutations – occur in somatic cells and will only affect those cells and their progeny; they cannot be passed on to subsequent generations of organisms. –Germ line mutations – can be passed on to subsequent generations.
Generating Genetic Variation Five types of change contribute to evolution. –Mutation within a gene –Gene duplication –Gene deletion –Exon shuffling –Horizontal transfer – rare in Eukaryotes
Generating Genetic Variation Most changes to a genome are caused by mistakes in the normal process of copying and maintaining genomic DNA.
Mutations within genes –Point mutations – errors in replication at individual nucleotide sites occur at a rate of about in the human genome. –Most point mutations have no effect on the function of the genome – are selectively neutral. Generating Genetic Variation
DNA duplications –Slipped strand mispairing –Unequal crossover during recombination Generating Genetic Variation
Gene duplication allows for the acquisition of new functional genes in the genome Generating Genetic Variation
Gene Duplication: the globin family –A classic example of gene duplication and evolution –Globin molecules are involved in carrying oxygen in multicellular organisms –Ancestral globin gene (present in primitive animals) was duplicated ~500 mya. –Mutations accumulated in both genes to differentiate them - α and β present in all higher vertebrates –Further gene duplications produced alternative forms in mammals and in primates Generating Genetic Variation
Mammals Primates
Gene Duplication –Almost every gene in the vertebrate genome exists in multiple copies –Gene duplication allows for new functions to arise without having to start from scratch –Studies suggest that early in vertebrate evolution the entire genome was duplicated at least twice Generating Genetic Variation
Exon Duplication –Duplications are not limited to entire genes –Proteins are often collections of distinct amino acid domains that are encoded by individual exons in a gene –The separation of exons by introns facilitates the duplication of exons and individual gene evolution Generating Genetic Variation
Exon Shuffling –The exons of genes can sometimes be thought of as individual useful units that can be mixed and matched through exon shuffling to generate new, useful combinations Generating Genetic Variation
Review from last week Overall theme – There are lots of ways to create genetic variation. Genetic variation is the basis of evolutionary change, but the variation must be introduced into the germ line to contribute to evolutionary change. Two cell lines in multicellular organisms –Somatic – short term genetic repository –Germ line – long term genetic repository Only variation that occurs in the germ line can contribute to evolutionary change Genetic variation can be accumulated through various events –Mutations in genes – point mutations –DNA duplications – microsatellites (small), unequal crossover (large) –Gene and exon duplications are the major method for generating new gene functions –Exon shuffling can produce new gene functions by creating new combinations of functional exons/protein domains
Mobile elements contribute to genome evolution in several ways –Exon shuffling –Insertion mutagenesis –Homologous and non-homologous recombination Generating Genetic Variation
What are mobile elements and how do they work? –Fragments of DNA that can copy themselves and insert those copies back into the genome –Found in most eukaryotic genomes –Humans – Alu (SINE); Ta, PreTa (LINEs); SVA; plus several families that are no longer active Generating Genetic Variation
Pol III transcription Reverse transcription and insertion 1. Usually a single ‘master’ copy 2. Pol III transcription to an RNA intermediate 3. Target primed reverse transcription (TPRT) – enzymatic machinery provided by LINEs Generating Genetic Variation: Normal SINE mobilization
Generating Genetic Variation Mobile elements contribute to genome evolution in several ways –Exon shuffling
Generating Genetic Variation: Exon shuffling via SINE mobilization exon 1 – SINE – intron – exon 2 SINE transcription can extend past the normal stop signal Reverse transcription creates DNA copies of both the SINE and exon 2 DNA copy of transcript Reinsertion occurs elsewhere in the genome SINE – exon 2
Generating Genetic Variation Mobile elements contribute to genome evolution in several ways –Exon shuffling –Insertion mutagenesis The insertion of mobile elements can disrupt gene structure and function
Generating Genetic Variation
Gene expression alteration via P-element mobilization in Drosophila
Generating Genetic Variation Mobile elements contribute to genome evolution in several ways –Exon shuffling –Insertion mutagenesis The insertion of mobile elements can disrupt gene structure and function –Homologous and non-homologous recombination 10,000 – 1,000,000+ nearly identical DNA fragments scattered throughout the genome
Generating Genetic Variation Unequal crossover due to non-homologous recombination
Generating Genetic Variation Gene transfer can move genes between entire genomes –Horizontal gene transfer –A main contributor to the development of drug resistant strains of bacteria
Generating Genetic Variation Bacterial conjugation
Reconstructing Life’s Tree Evolutionary theory predicts that organisms that are derived from a common ancestor will share genetic signatures Organisms that shared an ancestor more recently will be more similar than those that shared a more distant common ancestor Similarity can include sequence composition, genome organization, presence/absence of mobile elements, presence/absence of gene families, etc.
09_15_Phylogen.trees.jpg
09_16_Ancestral.gene.jpg
09_22_genetic.info.jpg
09_17_Human_chimp.jpg Chromosome 1
Review from last time Overall themes: Genetic variation can be introduced due to the activities and presence of mobile elements (MEs); Genetic information can be introduced into organisms through horizontal transfer. MEs are fragments of DNA that can make copies of themselves and insert those copies back into the genome –MEs can lead to variation through exon shuffling, insertion mutagenesis, and recombination –Many human diseases are the result of MEs Horizontal transfer can introduce genetic variation into bacteria via the process of conjugation Introduction of concepts for discussion of “Reconstructing life’s tree” –All sorts of variation provide information on the relationships among organisms –Homology – derived from the same ancestral source –Phylogeny – a reconstruction of relationships based on observations
Basic terms –Homologous – derived from a common ancestral source –Phylogeny – a reconstruction of relationships based on observed patterns Reconstructing Life’s Tree
Homologous genes can be recognized over large amounts of evolutionary time Reconstructing Life’s Tree
Homologous genes can be recognized over large amounts of evolutionary time Why? –Selectively advantageous genes and sequences tend to be conserved (preserved) –Selectively disadvantageous genes and sequences tend not to be passed on to offspring Reconstructing Life’s Tree
Most DNA of most genomes is non-coding –Changes to much of this DNA are selectively neutral – cause no harm or good to the genome –Different portions of the genome will therefore diverge at different rates depending on their function The neutral regions tend to change in a clock-like fashion –We can estimate divergence times for certain groups
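The clock-like behaviour of neutral regions can be turned into a rough divergence-time estimate. The sketch below assumes a constant neutral substitution rate, no repeated changes at the same site, and that both lineages have been accumulating changes since they split, so the time is the observed divergence divided by twice the rate. The function name and all numbers are illustrative assumptions, not values from the lecture.

```python
def divergence_time(differences, sites_compared, rate_per_site_per_year):
    """Estimate time since two lineages split under a simple molecular clock.

    Assumes neutral sites, a constant rate, and no multiple hits at a site.
    """
    if sites_compared <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("sites_compared and rate must be positive")
    # Proportion of aligned sites that differ between the two sequences.
    d = differences / sites_compared
    # Changes accumulate along BOTH lineages, hence the factor of 2.
    return d / (2 * rate_per_site_per_year)


# Hypothetical numbers chosen only to make the arithmetic easy to follow.
years = divergence_time(differences=12, sites_compared=1000,
                        rate_per_site_per_year=1e-9)
print(f"{years:,.0f} years")
```

With these made-up inputs, 1.2% divergence at a rate of one change per site per billion years corresponds to roughly six million years. Real estimates depend heavily on how well the rate is calibrated.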
09_19_human_mouse1.jpg
The accumulation of changes can be quantified by several logical methods –Parsimony – the best hypothesis is the one requiring the fewest steps (i.e. Occam’s razor) –Distance – count the number of differences between things, the ones with the fewest numbers of differences are most closely related –Sequence based models – take into account what we know about the ways sequences change over time Reconstructing Life’s Tree
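The distance idea can be sketched in a few lines: count the differences between every pair of aligned sequences and treat the pair with the fewest differences as the closest relatives. The taxon names and sequences below are hypothetical and invented for illustration; they are not the six-taxon data set used in the lecture.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def pairwise_distances(aligned):
    """Map each unordered pair of taxa to its difference count."""
    return {
        (x, y): count_differences(aligned[x], aligned[y])
        for x, y in combinations(aligned, 2)
    }


# Hypothetical aligned sequences, invented for illustration only.
toy_alignment = {
    "taxon_1": "ATGGCTAAGACG",
    "taxon_2": "ATGGCTAAGACA",
    "taxon_3": "ATGACTTAGACA",
    "taxon_4": "TTGACTTAGGCA",
}

distances = pairwise_distances(toy_alignment)
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])
```

Here taxon_1 and taxon_2 differ at a single site, so a distance method would join them first. Raw counts like these ignore multiple changes at the same site, which is one reason sequence-based models exist.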
These slides and the sequence files used to produce them are available as a supplement on the class website: DNA sequence from six taxa Reconstructing Life’s Tree: An example using distance
human Sumatran orang Bornean orang bonobo chimp common chimp gorilla
ATGGCTAAGACGAAGACTCAGGCT T-A A-C Reconstructing Life’s Tree: An example using parsimony ATGGCTAAGACGAAGACTCAGGCT ATGGCTAAGACGAAGACTCAGGCT ATGGCTAAGACGAAGACTCAGGCT T-G G-A 6 steps
ATGGCTAAGACGAAGACTCAGGCT T-A G-A T-G A-C G-A 5 steps Reconstructing Life’s Tree: An example using parsimony
ATGGCTAAGACGAAGACTCAGGCT T-A G-A T-G A-C 4 steps Reconstructing Life’s Tree: An example using parsimony
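Counting steps on competing trees, as in the parsimony slides above, can be automated. The sketch below uses Fitch's small-parsimony algorithm, one standard way to count the minimum number of changes a tree requires; the lecture does not say which procedure produced its step counts, so treat this as an assumption. Trees are written as nested pairs of leaf names, and the four sequences are hypothetical.

```python
def fitch_steps(tree, states):
    """Minimum number of changes one site needs on a rooted binary tree.

    tree   -- a leaf name (str) or a pair (left_subtree, right_subtree)
    states -- dict mapping leaf name to the nucleotide observed at the site
    Returns (possible_states_at_this_node, steps_so_far).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)
    shared = left_set & right_set
    if shared:
        # The children can agree on a state: no extra change needed here.
        return shared, left_steps + right_steps
    # The children disagree: one change is required below this node.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, alignment):
    """Total steps over all sites of an alignment for one tree."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_steps(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Hypothetical four-taxon alignment, invented for illustration only.
toy = {"A": "AAGT", "B": "AAGC", "C": "TTGC", "D": "TTAC"}

tree_ab_cd = (("A", "B"), ("C", "D"))
tree_ac_bd = (("A", "C"), ("B", "D"))

print(tree_length(tree_ab_cd, toy), tree_length(tree_ac_bd, toy))
```

For this toy alignment the first tree needs 4 steps and the second needs 6, so parsimony prefers the tree that groups A with B and C with D.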
The accumulation of changes can be quantified by several logical methods The accumulation of mobile elements provides a nearly perfect record of evolutionary relationships Reconstructing Life’s Tree
Phylogenetic Inference Using SINEs
Species A Species D Species C Species B Phylogenetic Inference Using SINEs
Resolution of the Human:Chimp:Gorilla Trichotomy (H,C)G (H,G)C (C,G)H (H,C,G)
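One way to picture how SINE insertions bear on the trichotomy is to treat each locus as a vote: an insertion present in exactly two of the three taxa supports grouping those two. The sketch below assumes that absence of the element is the ancestral state and that independent insertions at the same site are negligible. The presence/absence calls are hypothetical and are not the data from the study cited in the lecture.

```python
from collections import Counter


def supported_grouping(presence):
    """Return the pair of taxa united by a shared insertion, or None.

    presence -- dict mapping each of three taxa to True/False for one locus.
    An insertion is informative only when exactly two of the three taxa
    carry it; one shared by all, or found in a single taxon, says nothing
    about which pair is closest.
    """
    carriers = sorted(taxon for taxon, has_it in presence.items() if has_it)
    return tuple(carriers) if len(carriers) == 2 else None


def tally_support(loci):
    """Count how many loci support each two-taxon grouping."""
    votes = Counter()
    for presence in loci:
        pair = supported_grouping(presence)
        if pair is not None:
            votes[pair] += 1
    return votes


# Hypothetical presence/absence calls, invented for illustration only.
toy_loci = [
    {"H": True, "C": True, "G": False},
    {"H": True, "C": True, "G": False},
    {"H": True, "C": True, "G": True},    # shared by all: uninformative
    {"H": True, "C": False, "G": False},  # one taxon only: uninformative
    {"H": True, "C": True, "G": False},
]

print(tally_support(toy_loci).most_common())
```

In this toy data set three loci support grouping C with H and none support either alternative; an unresolved (H,C,G) outcome would show up as no informative loci or as conflicting votes.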
Phylogenetic Analysis PCR of 133 Alu loci: 117 Ye5, 13 Yc1, 1 Yi6, 1 Yd3, 1 undefined subfamily PNAS (2003) 22:
Alu Elements and Hominid Phylogeny PNAS (2003) 22:
Review from last time The variation that is present in genomes allows us to make determinations about the relationships among living things Different parts of the genome accumulate variation at different rates depending on their function (or lack thereof) The presence of different rates allows for different questions to be addressed depending on the level of divergence Several methods are available to analyze variation for phylogenetic signal –Parsimony, distance, sequence based models Patterns of mobile element insertion can be used to infer relationships among taxa
Much of the “junk” DNA is dispensable –The Fugu (Takifugu rubripes) genome is almost completely devoid of unnecessary sequences –Exon number and organization are similar to mammals –Compared to other vertebrates: intron size (not number) is reduced; intergenic regions are reduced in size; very few mobile elements Reconstructing Life’s Tree
09_21_Fugu.introns.jpg
Using all of the available information, we can reconstruct relationships between organisms back to the earliest forms of life Reconstructing Life’s Tree
The human genome is large and complex –23 pairs of chromosomes –~3.2 x 10^9 (3.2 billion) nucleotide pairs –Human genome composition Our Own Genome
09_26_noncoding.jpg
09_25_Chromosome22.jpg
Nuclear genome –3300 Mb –23 (XX) or 24 (XY) linear chromosomes –30,000-35,000 genes –1 gene/40 kb –Introns –3% coding –Repetitive DNA sequences (45%) Our Own Genome
The human genome is large and complex –23 pairs of chromosomes –~3.2 x 10^9 (3.2 billion) nucleotide pairs –Human genome composition –The human genome project was one of the largest undertakings in human history Our Own Genome
Progress in human genome sequencing – Hierarchical vs. whole genome shotgun (WGS) sequencing – Repetitive DNA represents a significant problem for WGS sequencing in particular Our Own Genome
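A toy model shows why repeats are a problem for shotgun assembly. The sketch assumes idealized reads: error-free, all the same length, and one starting at every position. It builds two hypothetical genomes that differ only in the order of two unique segments lying between copies of the same repeat, then compares the full sets of reads each genome would produce. All sequences and names are invented for illustration.

```python
from collections import Counter


def read_spectrum(genome, read_length):
    """Multiset of every error-free read of a fixed length from a genome."""
    return Counter(
        genome[i:i + read_length]
        for i in range(len(genome) - read_length + 1)
    )


# Hypothetical sequences: one repeat and four unique segments.
REPEAT = "GGCCGGCC"
a, b, c, d = "TATA", "AAAA", "TTTT", "TGTG"

genome_1 = a + REPEAT + b + REPEAT + c + REPEAT + d
genome_2 = a + REPEAT + c + REPEAT + b + REPEAT + d  # b and c swapped

short = len(REPEAT) + 1   # a read cannot span a repeat plus both flanks
longer = len(REPEAT) + 2  # a read can just bridge a whole repeat

print(read_spectrum(genome_1, short) == read_spectrum(genome_2, short))
print(read_spectrum(genome_1, longer) == read_spectrum(genome_2, longer))
```

The first comparison prints True: with reads too short to bridge the repeat, the two genomes yield identical reads, so no assembler could tell them apart from the reads alone. The second prints False because slightly longer reads span a repeat and both of its flanks. Hierarchical sequencing reduces the problem by first restricting reads to a known, smaller region.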
10_09_Shotgun.sequenc.jpg
08_03.jpg
10_10_Repetit.sequence.jpg
Progress in human genome sequencing – Hierarchical vs whole genome shotgun sequencing – Repetitive DNA represents a significant problem for WGS sequencing in particular –Landmark papers in Nature and Science (2001) Venter et al Science 16 February 2001; 291: Lander et al Nature 409 (6822): Our Own Genome
A typical high-throughput genomics facility Our Own Genome
Exploring and exploiting the genome sequences BLAST/BLAT and other tools – BLAST - Basic local alignment search tool Input a sequence and find matches to human or other organisms – publication information – DNA and protein sequence (if applicable) Our Own Genome
Exploring and exploiting the genome sequences BLAST/BLAT and other tools – BLAT – BLAST-like alignment tool A “genome browser” Genomes available : – human, chimp, rhesus monkey, dog, cow, mouse, opossum, rat, chicken, Xenopus, Zebrafish, Tetraodon, Fugu, nematode (x3), Drosophila (x10), Apis (x3), Saccharomyces (yeast), SARS example: chr6:121,387, ,720,836 Our Own Genome
Query sequence - Callithrix; Human ortholog Our Own Genome BLAT can be used to make direct comparisons between our genome and others.
Comparisons with other genomes inform us about our own –Important genes and regulatory sequences can easily be identified if they are conserved between genomes Our Own Genome
Human variation –~0.1% difference in nucleotide sequence between any two individual humans –Translates to about 3 million differences in the genome –Most of these differences are Single Nucleotide Polymorphisms (SNPs) –We can use these differences to investigate human variation, population structure and evolution Our Own Genome
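The "about 3 million differences" figure follows directly from the two numbers given in the lecture, as the short calculation below shows. Integer arithmetic is used so the result is exact; the variable names are illustrative.

```python
GENOME_SIZE_BP = 3_200_000_000  # ~3.2 x 10^9 nucleotide pairs
DIFFERENCES_PER_1000_BP = 1     # ~0.1% written as 1 site per 1,000

expected_differences = GENOME_SIZE_BP * DIFFERENCES_PER_1000_BP // 1000
print(f"{expected_differences:,}")  # 3,200,000
```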
Human evolution –Coalescence analyses (mtDNA and Y chromosome) –Multiregional vs. Out of Africa Predictions of the Multiregional Hypothesis – Equal diversity in human subpopulations – No obvious root to the human tree Predictions of the Out of Africa Hypothesis – Higher diversity in African subpopulations – Root of the human tree in Africa Our Own Genome
Population Relationships Based on 100 Autosomal Alu Elements Africa Asia Europe S. India
Human evolution –Higher diversity in African subpopulations –Insulin minisatellite (Table 12.6 in text): 22 divergent lineages exist in the human population. All are found in Africa; only 3 are found outside of Africa. Our Own Genome
Interpreting the information generated by the human genome project –The complexity of genome function makes interpretation difficult –Ex. What are the regulatory sequences? –Ex. Exons can be spliced together in different ways in different tissues Our Own Genome
09_30_alt.splice.RNA.jpg