Genomes and Their Evolution 18 Genomes and Their Evolution
Overview: Reading the Leaves from the Tree of Life Complete genome sequences exist for a human, chimpanzee, E. coli and numerous other prokaryotes, corn, fruit fly, house mouse, orangutan, and others Comparisons of genomes among organisms provide information about the evolutionary history of genes and taxonomic groups 2
Genomics is the study of whole sets of genes and their interactions Bioinformatics is the application of computational methods to the storage and analysis of biological data 3
Figure 18.1 Figure 18.1 What genomic information distinguishes a human from a chimpanzee? 4
Concept 18.1: The Human Genome Project fostered development of faster, less expensive sequencing techniques The Human Genome Project officially began in 1990, and the sequencing was largely completed by 2003 Even with automation, the sequencing of all 3 billion base pairs in a haploid set presented a formidable challenge A major thrust of the Human Genome Project was the development of technology for faster sequencing 5
Cut the DNA into overlapping fragments short enough for sequencing. Figure 18.2-1 1 Cut the DNA into overlapping fragments short enough for sequencing. 2 Clone the fragments in plasmid or other vectors. Figure 18.2-1 Whole-genome shotgun approach to sequencing (step 1) 6
Cut the DNA into overlapping fragments short enough for sequencing. Figure 18.2-2 1 Cut the DNA into overlapping fragments short enough for sequencing. 2 Clone the fragments in plasmid or other vectors. 3 Sequence each fragment. CGCCATCAGT AGTCCGCTATACGA ACGATACTGGT Figure 18.2-2 Whole-genome shotgun approach to sequencing (step 2) 7
…CGCCATCAGTCCGCTATACGATACTGGT… Figure 18.2-3 1 Cut the DNA into overlapping fragments short enough for sequencing. 2 Clone the fragments in plasmid or other vectors. 3 Sequence each fragment. CGCCATCAGT AGTCCGCTATACGA ACGATACTGGT Figure 18.2-3 Whole-genome shotgun approach to sequencing (step 3) CGCCATCAGT ACGATACTGGT 4 Order the sequences into one overall sequence with computer software. AGTCCGCTATACGA …CGCCATCAGTCCGCTATACGATACTGGT… 8
This approach starts with cloning and sequencing random DNA fragments The whole-genome shotgun approach was developed by J. Craig Venter and colleagues This approach starts with cloning and sequencing random DNA fragments Powerful computer programs are used to assemble the resulting short overlapping sequences into a single continuous sequence 9
Concept 18.2: Scientists use bioinformatics to analyze genomes and their functions The Human Genome Project established databases and refined analytical software to make data available on the Internet National Library of Medicine and the National Institutes of Health (NIH) created the National Center for Biotechnology Information (NCBI) European Molecular Biology Laboratory DNA Data Bank of Japan BGI in Shenzhen, China This has accelerated progress in DNA sequence analysis 10
Identifying the Functions of Protein-Coding Genes DNA sequence may vary more than the protein sequence does Scientists interested in proteins often compare the predicted amino acid sequence of a protein with that of other proteins Protein function can be deduced from sequence similarity or a combination of biochemical and functional studies 11
Understanding Genes and Gene Expression at the Systems Level Genomics is a rich source of insights into questions about gene organization, regulation of expression, growth and development, and evolution A project called ENCODE (Encyclopedia of DNA Elements) has yielded a wealth of information about protein-coding genes, genes for noncoding RNA, and sequences that regulate DNA replication, gene expression, and chromatin modification 12
Systems Biology Proteomics is the systematic study of the full protein sets (proteomes) encoded by genomes We must study when and where proteins are produced in an organism in order to understand the function of cells and organisms Systems biology aims to model the dynamic behavior of whole biological systems based on the study of interactions among the system’s parts 13
Application of Systems Biology to Medicine A systems biology approach has several medical applications The Cancer Genome Atlas project (completed in 2010) attempted to identify all the common mutations in three types of cancer by comparing gene sequences and expression in cancer versus normal cells This was so fruitful that it will be extended to ten other common cancers Silicon and glass “chips” have been produced that hold a microarray of most known human genes 14
Figure 18.4 Figure 18.4 A human gene microarray chip 15
Concept 18.3: Genomes vary in size, number of genes, and gene density By August of 2012, about 3,700 genomes had been completely sequenced, including 3,300 bacterial,160 archaeal, and 183 eukaryotic genomes Sequencing of over 7,500 genomes and about 340 metagenomes was in progress 16
Table 18.1 Table 18.1 Genome sizes and estimated numbers of genes 17
Concept 18.4: Multicellular eukaryotes have much noncoding DNA and many multigene families The bulk of most eukaryotic genomes encodes neither proteins nor functional RNAs Sequencing of the human genome reveals that 98.5% does not code for proteins, rRNAs, or tRNAs About a quarter of the human genome codes for introns and gene-related regulatory sequences 18
Intergenic DNA is noncoding DNA found between genes Pseudogenes are former genes that have accumulated mutations and are now nonfunctional Repetitive DNA is present in multiple copies in the genome About three-fourths of repetitive DNA is made up of transposable elements and sequences related to them 19
Simple sequence DNA (3%) Figure 18.5 Exons (1.5%) Regulatory sequences (5%) Introns (20%) Repetitive DNA that includes transposable elements and related sequences (44%) Unique noncoding DNA (15%) L1 sequences (17%) Figure 18.5 Types of DNA sequences in the human genome Repetitive DNA unrelated to transposable elements (14%) Alu elements (10%) Large-segments duplications (5–6%) Simple sequence DNA (3%) 20
Transposable Elements and Related Sequences The first evidence for mobile DNA segments came from geneticist Barbara McClintock’s breeding experiments with Indian corn McClintock identified changes in the color of corn kernels that made sense only if some genetic elements move from other genome locations into the genes for kernel color These transposable elements move from one site to another in a cell’s DNA; they are present in both prokaryotes and eukaryotes 21
Figure 18.6 Figure 18.6 The effect of transposable elements on corn kernel color 22
Movement of Transposons and Retrotransposons Eukaryotic transposable elements are of two types Transposons, which move by a “cut and paste” method that sometimes leaves a copy behind Retrotransposons, which move by means of an RNA intermediate and always leave a copy behind 23
New copy of transposon Transposon is copied Figure 18.7 New copy of transposon Transposon DNA of genome Transposon is copied Insertion Figure 18.7 Transposon movement Mobile transposon 24
New copy of retrotransposon Reverse transcriptase Figure 18.8 New copy of retrotransposon Retrotransposon Formation of a single-stranded RNA intermediate RNA Insertion Reverse transcriptase Figure 18.8 Retrotransposon movement 25
Sequences Related to Transposable Elements Multiple copies of transposable elements and related sequences are scattered throughout eukaryotic genomes In primates, a large portion of transposable element–related DNA consists of a family of similar sequences called Alu elements Many Alu elements are transcribed into RNA molecules; however, their function, if any, is unknown 26
A series of repeating units of 2 to 5 nucleotides is called a short tandem repeat (STR) The repeat number for STRs can vary among sites (within a genome) or individuals STR diversity can be used to identify a unique set of genetic markers for each individual, his or her genetic profile Forensic scientists can use STR analysis on DNA samples to identify victims of crime or natural disasters 27
Genes and Multigene Families Many eukaryotic genes are present in one copy per haploid set of chromosomes The rest occur in multigene families, collections of identical or very similar genes Some multigene families consist of identical DNA sequences, usually clustered tandemly, such as those that code for rRNA products 28
Alterations of Chromosome Structure Humans have 23 pairs of chromosomes, while chimpanzees have 24 pairs Following the divergence of humans and chimpanzees from a common ancestor, two ancestral chromosomes fused in the human line 29
12 13 Human chromosome 2 Chimpanzee chromosomes Telomere sequences Figure 18.10 Human chromosome 2 Chimpanzee chromosomes Telomere sequences Centromere sequences Telomere-like sequences 12 Centromere-like sequences Figure 18.10 Related human and chimpanzee chromosomes 13 30
Figure 18.11 Large blocks of genes from human chromosome 16 can be found on four mouse chromosomes Human chromosome 16 Mouse chromosomes Figure 18.11 A comparison of human and mouse chromosomes 7 8 16 17 Blocks of genes have stayed together during evolution of mouse and human lineages 31
Concept 18.6: Comparing genome sequences provides clues to evolution and development Genome sequencing and data collection have advanced rapidly in the last 25 years Comparative studies of genomes Reveal much about the evolutionary history of life Help clarify mechanisms that generated the great diversity of present-day life-forms 32
Bacteria Most recent common ancestor of all living things Eukarya Figure 18.15 Bacteria Most recent common ancestor of all living things Eukarya Archaea 4 3 2 1 0 Billions of years ago Chimpanzee Figure 18.15 Evolutionary relationships of the three domains of life Human Mouse 70 60 50 40 30 20 10 Millions of years ago 33
Comparing Distantly Related Species Highly conserved genes have remained similar over time These help clarify relationships among species that diverged from each other long ago Bacteria, archaea, and eukaryotes diverged from each other between 2 and 4 billion years ago Comparative genomic studies confirm the relevance of research on model organisms to our understanding of biology in general and human biology in particular 34
Comparing Genomes Within a Species As a species, humans have only been around about 200,000 years and have low within-species genetic variation Most of the variation within humans is due to single nucleotide polymorphisms (SNPs) There are also inversions, deletions, and duplications and a large number of copy-number variants (CNVs) These variations are useful for studying human evolution and human health 35
Widespread Conservation of Developmental Genes Among Animals Molecular analysis of the homeotic genes in Drosophila has shown that they all include a sequence called a homeobox An identical or very similar nucleotide sequence has been discovered in the homeotic genes of both vertebrates and invertebrates The vertebrate genes homologous to homeotic genes of flies have kept the same chromosomal arrangement 36
Adult fruit fly Fruit fly embryo (10 hours) Fly chromosome Mouse Figure 18.17 Adult fruit fly Fruit fly embryo (10 hours) Fly chromosome Mouse chromosomes Figure 18.17 Conservation of homeotic genes in a fruit fly and a mouse. Related homeobox sequences have been found in regulatory genes of yeasts and plants The homeodomain is the part of the protein that binds to DNA when the protein functions as a transcriptional regulator The more variable domains in the protein recognize particular DNA sequences and specify which genes are regulated by the protein Mouse embryo (12 days) Adult mouse 37
Genomes and Their Evolution 22 Genomes and Their Evolution
Speciation is the process by which one species splits into two or more species Speciation explains the features shared between organisms due to inheritance from their recent common ancestor 39
Speciation forms a conceptual bridge between microevolution and macroevolution Microevolution consists of changes in allele frequency in a population over time Macroevolution refers to broad patterns of evolutionary change above the species level 40
Concept 22.1: The biological species concept emphasizes reproductive isolation Species is a Latin word meaning “kind” or “appearance” Biologists compare morphology, physiology, biochemistry, and DNA sequences when grouping organisms 41
The Biological Species Concept The biological species concept states that a species is a group of populations whose members have the potential to interbreed in nature and produce viable, fertile offspring; they do not breed successfully with other populations Gene flow between populations holds the populations together genetically 42
(a) Similarity between different species Figure 22.2 Figure 22.2 The biological species concept is based on the potential to interbreed rather than on physical similarity (a) Similarity between different species (b) Diversity within a species 43
Reproductive Isolation Reproductive isolation is the existence of biological barriers that impede two species from producing viable, fertile offspring Hybrids are the offspring of crosses between different species Reproductive isolation can be classified by whether barriers act before or after fertilization 44
Figure 22.3 Prezygotic barriers Postzygotic barriers Habitat isolation Temporal isolation Behavioral isolation Mechanical isolation Gametic isolation Reduced hybrid viability Reduced hybrid fertility Hybrid breakdown VIABLE, FERTILE OFF- SPRING MATING ATTEMPT FERTILI- ZATION (a) (c) (e) (f) (g) (h) (i) (l) (d) (j) (b) Figure 22.3 Exploring reproductive barriers (k) 45
Prezygotic barriers block fertilization from occurring by Impeding different species from attempting to mate Preventing the successful completion of mating Hindering fertilization if mating is successful 46
Habitat isolation Temporal isolation Behavioral isolation Figure 22.3a Prezygotic barriers Habitat isolation Temporal isolation Behavioral isolation MATING ATTEMPT (a) (c) (e) Figure 22.3a Exploring reproductive barriers (part 1: prezygotic barriers) (d) (b) 47
Mechanical isolation Gametic isolation Figure 22.3b Prezygotic barriers Mechanical isolation Gametic isolation MATING ATTEMPT FERTILIZATION (f) (g) Figure 22.3b Exploring reproductive barriers (part 2: prezygotic barriers) 48
Reduced hybrid viability Reduced hybrid fertility Hybrid breakdown Figure 22.3c Postzygotic barriers Reduced hybrid viability Reduced hybrid fertility Hybrid breakdown VIABLE, FERTILE OFFSPRING FERTILIZATION (h) (i) (l) (j) Figure 22.3c Exploring reproductive barriers (part 3: postzygotic barriers) (k) 49
Prezygotic barriers Habitat isolation Temporal isolation Behavioral Figure 22.3d Prezygotic barriers Habitat isolation Temporal isolation Behavioral isolation MATING ATTEMPT Mechanical isolation Gametic isolation MATING ATTEMPT FERTILIZATION Figure 22.3d Exploring reproductive barriers (part 4: art summary) Postzygotic barriers Reduced hybrid viability Reduced hybrid fertility Hybrid breakdown VIABLE, FERTILE OFFSPRING FERTILIZATION 50
Limitations of the Biological Species Concept The biological species concept cannot be applied to fossils or asexual organisms (including all prokaryotes) The biological species concept emphasizes absence of gene flow However, gene flow can occur between distinct species For example, grizzly bears and polar bears can mate to produce “grolar bears” 51
Grizzly bear (U. arctos) Polar bear (U. maritimus) Figure 22.4 Grizzly bear (U. arctos) Figure 22.4 Hybridization between two species of bears in the genus Ursus Hybrid “grolar bear” Polar bear (U. maritimus) 52
Other Definitions of Species Other species concepts emphasize the unity within a species rather than the separateness of different species The morphological species concept defines a species by structural features It applies to sexual and asexual species but relies on subjective criteria 53
The ecological species concept views a species in terms of its ecological niche It applies to sexual and asexual species and emphasizes the role of disruptive selection The phylogenetic species concept defines a species as the smallest group of individuals on a phylogenetic tree It applies to sexual and asexual species, but it can be difficult to determine the degree of difference required for separate species 54
Concept 22.2: Speciation can take place with or without geographic separation Speciation can occur in two ways Allopatric speciation Sympatric speciation 55
Allopatric speciation: forms a new species while Figure 22.5 Figure 22.5 Two main modes of speciation Allopatric speciation: forms a new species while geographically isolated. (a) (b) Sympatric speciation: a subset forms a new species without geographic separation. 56
Evidence of Allopatric Speciation Fifteen pairs of sister species of snapping shrimp (Alpheus) are separated by the Isthmus of Panama These species originated from 9 million to 3 million years ago, when the Isthmus of Panama formed and separated the Atlantic and Pacific waters 57
A. formosus A. nuttingi ATLANTIC OCEAN Isthmus of Panama PACIFIC OCEAN Figure 22.7 A. formosus A. nuttingi ATLANTIC OCEAN Isthmus of Panama PACIFIC OCEAN Figure 22.7 Allopatric speciation in snapping shrimp (Alpheus) A. panamensis A. millsae 58
Initial population of fruit flies (Drosophila pseudoobscura) Figure 22.8 Experiment Initial population of fruit flies (Drosophila pseudoobscura) Some flies raised on starch medium Some flies raised on maltose medium Mating experiments after 40 generations Results Female Female Starch population 1 Starch population 2 Starch Maltose Figure 22.8 Inquiry: Can divergence of allopatric populations lead to reproductive isolation? Starch population 1 Starch 22 9 18 15 Male Male population 2 Starch Maltose 8 20 12 15 Number of matings in experimental group Number of matings in control group 59
Sympatric (“Same Country”) Speciation In sympatric speciation, speciation takes place in populations that live in the same geographic area Sympatric speciation occurs when gene flow is reduced between groups that remain in contact through factors including Polyploidy Habitat differentiation Sexual selection 60
Polyploidy Polyploidy is the presence of extra sets of chromosomes due to accidents during cell division Polyploidy is much more common in plants than in animals. Many important crops (oats, cotton, potatoes, tobacco, and wheat) are polyploids An autopolyploid is an individual with more than two chromosome sets, derived from one species The offspring of matings between autopolyploids and diploids have reduced fertility 61
Concept 22.3: Hybrid zones reveal factors that cause reproductive isolation A hybrid zone is a region in which members of different species mate and produce hybrids Hybrids are the result of mating between species with incomplete reproductive barriers 62
Concept 22.4: Speciation can occur rapidly or slowly and can result from changes in few or many genes Many questions remain concerning how long it takes for new species to form, or how many genes need to differ between species Speciation can be studied using the fossil record, morphological data, or molecular data 63
Patterns in the Fossil Record The fossil record includes examples of species that appear suddenly, persist essentially unchanged for some time, and then apparently disappear These periods of apparent stasis punctuated by sudden change are called punctuated equilibria The punctuated equilibrium model contrasts with a model of gradual change in a species’ existence 64
(a) Punctuated model Time (b) Gradual model Figure 22.14 Figure 22.14 Two models for the tempo of speciation, based on patterns observed in the fossil record (b) Gradual model 65