Genomes
Figure 17.7 Synthetic Cells. Cells of Mycoplasma mycoides JCVI-syn1.0, the first synthetic organism, are shown in this false-colored micrograph.
17.1 How Are Genomes Sequenced? 17.2 What Have We Learned from Sequencing Prokaryotic Genomes? 17.3 What Have We Learned from Sequencing Eukaryotic Genomes? 17.4 What Are the Characteristics of the Human Genome? 17.5 What Do the New Disciplines of Proteomics and Metabolomics Reveal?
Opening Question: What does dog genome sequencing reveal about other animals? No other mammal shows as much phenotypic variation as dogs. The Dog Genome Project sequences entire genomes of different breeds and identifies genes that control specific traits, such as size.
17.1 How Are Genomes Sequenced? Genome sequencing determines the nucleotide base sequence of an entire genome. The information is used to compare genomes of different species to trace evolutionary relationships, and to compare individuals of the same species to identify mutations that affect phenotypes.
17.1 How Are Genomes Sequenced? Sequence information is also used to identify genes for particular traits, such as genes associated with diseases. The Human Genome Project was proposed in 1986 to determine the normal sequence of all human DNA. Methods used were first developed to sequence prokaryotes and simple eukaryotes.
17.1 How Are Genomes Sequenced? To sequence an entire genome, the DNA is first cut into millions of small, overlapping fragments. Then many sequencing reactions are performed simultaneously.
17.1 How Are Genomes Sequenced? High-throughput sequencing uses miniaturization techniques, principles of DNA replication, and polymerase chain reaction (PCR). It is fully automated, rapid, and inexpensive.
Figure 17.1 DNA Sequencing. High-throughput sequencing involves (A) the chemical amplification of DNA fragments and (B) the synthesis of complementary strands using fluorescently labeled nucleotides.
17.1 How Are Genomes Sequenced? DNA is cut into small fragments physically or using enzymes. The fragments are denatured using heat, separating the strands. Short, synthetic oligonucleotides are attached to each end of each fragment, and these are attached to a solid support.
17.1 How Are Genomes Sequenced? Fragments are amplified by PCR. Sequencing: Universal primers, DNA polymerase, and the 4 nucleotides (dNTPs, tagged with fluorescent dyes) are added. One nucleotide is added to the new DNA strand in each cycle, and the unincorporated dNTPs are removed.
17.1 How Are Genomes Sequenced? Fluorescence color of the new nucleotide at each location is detected with a camera. Fluorescent tag is removed and the cycle repeats.
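The add, detect, and remove cycle can be mimicked with a toy Python sketch. It is only an illustration under stated assumptions: the dye colors in DYE_COLOR, the function name sequence_by_synthesis, and the four-base template are invented here, and strand orientation is ignored.

```python
# Toy model of sequencing by synthesis. Colors, names, and the template
# are illustrative assumptions, not the chemistry of any real instrument.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DYE_COLOR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}  # hypothetical dye assignment
COLOR_TO_BASE = {color: base for base, color in DYE_COLOR.items()}


def sequence_by_synthesis(template):
    """Return the colors seen by the 'camera' and the resulting base calls."""
    colors = []
    for template_base in template:            # one cycle per template position
        new_base = COMPLEMENT[template_base]  # polymerase adds the complementary dNTP
        colors.append(DYE_COLOR[new_base])    # camera records the fluorescence color
        # In the real method the fluorescent tag is removed here before the next cycle.
    read = "".join(COLOR_TO_BASE[c] for c in colors)
    return colors, read


colors, read = sequence_by_synthesis("TACG")
print(colors)  # ['green', 'red', 'yellow', 'blue']
print(read)    # ATGC
```

The read is the complement of the template, one base per cycle, which is why each cycle needs only one color measurement per fragment.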
17.1 How Are Genomes Sequenced? Then the sequences must be put together. The DNA sequence fragments, called “reads,” are overlapping, so they can be aligned.
17.1 How Are Genomes Sequenced? Example: A 10 bp fragment is cut three different ways: (1) TG, ATG, and CCTAC; (2) AT, GCC, and TACTG; (3) CTG, CTA, and ATGC. The correct order is ATGCCTACTG.
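The 10 bp example can be checked with a brute-force Python sketch. It tries every ordering of the pieces from each cut and keeps only the sequences that all three cuts can produce. The function names orderings and reconstruct are hypothetical, and this approach only works for tiny examples. It is not how real assemblers work, since they align overlapping reads rather than enumerating orderings.

```python
from itertools import permutations


def orderings(fragments):
    """All sequences obtainable by joining the fragments of one cut in some order."""
    return {"".join(p) for p in permutations(fragments)}


def reconstruct(cuts):
    """Sequences consistent with every cut (intersection of the orderings)."""
    candidates = orderings(cuts[0])
    for fragments in cuts[1:]:
        candidates &= orderings(fragments)
    return candidates


cuts = [
    ["TG", "ATG", "CCTAC"],
    ["AT", "GCC", "TACTG"],
    ["CTG", "CTA", "ATGC"],
]
print(reconstruct(cuts))  # {'ATGCCTACTG'}
```

Each cut alone allows six orderings; only one ordering is shared by all three cuts, which is why multiple overlapping cuts are needed.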
Figure 17.2 Arranging DNA Fragments. A series of different cuts is used to generate overlapping DNA fragments. Their sequences are arranged in order by computers. Millions of short segments are arranged in this way to generate the complete sequence of a genome.
17.1 How Are Genomes Sequenced? The field of bioinformatics was developed to analyze DNA sequences using complex mathematics and computer programs.
Figure 17.3 The Genomic Book of Life. Genome sequences contain many features, some of which are summarized in this overview. Sifting through all the information contained in a genome sequence can help us understand how an organism functions and what its evolutionary history might be.
17.1 How Are Genomes Sequenced? Genome sequence information is used in two research fields. In functional genomics, sequence information is used to identify the functions of various parts of genomes, such as open reading frames (the coding regions of genes).
17.1 How Are Genomes Sequenced? Functional genomics also identifies: amino acid sequences, deduced from the sequences of open reading frames; regulatory sequences, such as promoters and terminators; RNA genes; and other noncoding sequences.
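As a rough illustration of how open reading frames and their deduced amino acid sequences can be pulled out of raw sequence, here is a minimal Python sketch. It makes several simplifying assumptions: only the three forward reading frames are scanned, an ORF is taken to be ATG through the first in-frame stop codon, and the names find_orfs, translate, and min_codons are hypothetical. Real gene-finding software also handles the reverse strand, introns, and regulatory signals.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start_index, orf) pairs for ATG...stop stretches in the forward frames."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, dna[start:i + 3]))
                start = None                   # stop codon closes the ORF
    return orfs


def translate(orf):
    """Deduce the amino acid sequence (one-letter codes) from an ORF."""
    protein = "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf) - 2, 3))
    return protein.rstrip("*")                 # drop the stop-codon marker


orfs = find_orfs("CCATGGCTAAATGACC")
print(orfs)                   # [(2, 'ATGGCTAAATGA')]
print(translate(orfs[0][1]))  # MAK
```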
17.1 How Are Genomes Sequenced? Comparative genomics: comparison of a newly sequenced genome with sequences from other organisms. This provides more information about functions of sequences and can be used to trace evolutionary relationships.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? The first life forms to be sequenced were the simplest viruses with small genomes. The first complete genome sequence of a free-living cellular organism was for the bacterium Haemophilus influenzae in 1995.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Bacterial and archaeal genomes are small and usually organized into a single chromosome. They are compact (about 85% is coding sequence) and usually do not have introns. The cells often also carry plasmids, which may be transferred between cells.
Table 17.1
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Functional genomics: The H. influenzae chromosome has 1,727 open reading frames. When it was first sequenced, only 58% of them coded for proteins with known functions. Since then, the roles of many other proteins have been identified.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Highly infective strains of H. influenzae have genes for surface proteins that attach the bacterium to the human respiratory tract. These surface proteins are now a focus of research on treatments for H. influenzae infections.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Comparative genomics: M. genitalium lacks enzymes to synthesize amino acids, so it must obtain them from the environment. E. coli has 55 genes that encode transcriptional activators, whereas M. genitalium has only 7—a relative lack of control over gene expression.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Genome sequencing provides insights into microorganisms that are important in agriculture and medicine. Surprising relationships between organisms suggest that genes may be transferred between different species.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Rhizobium bacteria form symbiotic relationships with plants. The bacteria fix nitrogen (N2) into forms usable by plants. Sequencing has identified genes involved in successful symbiosis, and this knowledge may broaden the range of plants that can form these relationships.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? E. coli strain O157:H7 causes illness in humans. 1,387 genes are different from those in the harmless strains of this bacterium, but are also present in other pathogenic bacteria, such as Salmonella. This suggests genetic exchange among species.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Severe acute respiratory syndrome (SARS) was first detected in southern China in 2002 and rapidly spread in 2003. Isolation and sequencing of the virus revealed novel proteins that are possible targets for antiviral drugs or vaccines.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Genome sequencing of organisms involved in global ecological cycles: Some bacteria produce methane, a greenhouse gas, in cow stomachs. Others remove methane from the air. Understanding the genes involved in methane production and consumption may help us slow the progress of global warming.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Traditionally, microorganisms have been identified by culturing them in the laboratory. Now, PCR and DNA analysis allow microbes to be studied without culturing.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? DNA can also be analyzed directly from environmental samples. Metagenomics—genetic diversity is explored without isolating intact microorganisms. Sequencing is used to detect presence of known microbes and previously unidentified organisms.
Figure 17.4 Metagenomics. Microbial DNA extracted from the environment can be sequenced and analyzed. This has led to the description of many new genes and species.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? It is estimated that 90% of the microbial world has been “invisible” to biologists and is only now being revealed by metagenomics. The increased knowledge of the microbial world will improve our understanding of ecological processes and may suggest better ways to manage environmental problems.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Transposable elements (transposons) are DNA segments that can move from place to place in the genome or to a plasmid. If a transposable element is inserted into the middle of a gene, it will be transcribed along with that gene and can result in an abnormal protein.
Figure 17.5 DNA Sequences That Move (A). Transposable elements are DNA sequences that move from one location to another. (A) In one method of transposition (“copy and paste”), the DNA sequence is replicated and the copy inserts elsewhere in the genome. (B) Composite transposons contain additional genes flanked by two transposable elements.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Composite transposons: transposable elements located near one another will transpose together and carry the intervening DNA sequence with them. Genes for antibiotic resistance can be multiplied and transferred between bacteria in this way, via plasmids.
Figure 17.5 DNA Sequences That Move (B). Composite transposons contain additional genes flanked by two transposable elements.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Certain genes are present in all organisms (universal genes), and some universal gene segments are present in many organisms. This suggests that a minimal set of DNA sequences is common to all cells.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? Efforts to define a minimal genome involve computer analysis of genomes, the study of the smallest known genome (M. genitalium), and the use of transposons as mutagens. Transposons can insert into genes at random; the mutated bacteria are tested for growth and survival, and their DNA is sequenced.
Figure 17.6 Using Transposon Mutagenesis to Determine the Minimal Genome. Mycoplasma genitalium has one of the smallest known genomes of any prokaryote. But are all of its genes essential to life? By inactivating the genes one by one, scientists determined which of them are essential for the cell’s survival. This research may lead to the construction of artificial cells with customized genomes, designed to perform functions such as degrading oil and making plastics.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? M. genitalium can survive in the laboratory with only 382 functional genes. One goal of the research is to design new life forms for specific purposes, such as cleaning up oil spills.
17.2 What Have We Learned from Sequencing Prokaryotic Genomes? An artificial genome has been created and inserted into bacterial cells. The entire genome of Mycoplasma mycoides was synthesized, then transplanted into empty cells of Mycoplasma capricolum. The new cell’s genome had extra sequences, so it was a new organism: Mycoplasma mycoides JCVI-syn1.0.
Figure 17.7 Synthetic Cells. Cells of Mycoplasma mycoides JCVI-syn1.0, the first synthetic organism, are shown in this false-colored micrograph.
Working with Data 17.1: Using Transposon Mutagenesis to Determine the Minimal Genome In the experiment to create a synthetic genome and determine the minimum set of genes necessary for survival, transposon mutagenesis was used with Mycoplasma genitalium, which had the smallest known genome.
Working with Data 17.1: Using Transposon Mutagenesis to Determine the Minimal Genome Growth of M. genitalium strains with gene insertions (intragenic) was compared with strains with insertions in noncoding regions (intergenic).
Working with Data 17.1: Using Transposon Mutagenesis to Determine the Minimal Genome Question 1: Explain these data in terms of genes essential for growth and survival. Are all of the genes in M. genitalium essential for growth? If not, how many are essential? Why did some of the insertions in intergenic regions prevent growth?
Working with Data 17.1: Using Transposon Mutagenesis to Determine the Minimal Genome Question 2: If a transposon inserts into the following regions of a gene, there might be no effect on the phenotype. Explain in each case: a. near the 3′ end of a coding region; b. within a gene coding for rRNA. How does this affect your answer to Question 1?
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? There are major differences between eukaryotic and prokaryotic genomes: Eukaryotic genomes are larger and have more protein-coding genes. Eukaryotic genomes have more regulatory sequences. Greater complexity requires more regulation.
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Much of eukaryotic DNA is noncoding, including introns, gene control sequences, and repeated sequences. Eukaryotes have multiple chromosomes; each must have an origin of replication, a centromere, and a telomeric sequence at each end.
Table 17.2
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Several model organisms have been studied extensively. Model organisms are easy to grow and study in a laboratory, their genetics are well studied, and they have characteristics that represent a larger group of organisms.
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? The yeast, Saccharomyces cerevisiae: Yeasts are single-celled eukaryotes. Yeasts and E. coli appear to use about the same number of genes to perform basic functions. Compartmentalization of the eukaryotic yeast cell requires it to have many more genes.
Table 17.3
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? The nematode, Caenorhabditis elegans: A millimeter-long soil roundworm. The transparent body is made up of about 1,000 cells, yet has complex organ systems. It has about 3.3 times as many protein-coding genes as do yeasts.
Table 17.4
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? The fruit fly, Drosophila melanogaster: Studies of fruit flies led to formulation of many basic principles of genetics. More than 2,500 mutations have been described. It has 10 times more cells and a larger genome than C. elegans, but fewer coding genes.
Figure 17.8 Functions of the Eukaryotic Genome. The distribution of gene functions in Drosophila melanogaster shows a pattern that is typical of many complex organisms.
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? The thale cress, Arabidopsis thaliana: A small plant with a small genome. Many of the genes found in animals have homologs in plants, suggesting a common ancestor. But many genes are also unique to plants.
Table 17.5
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Rice (Oryza sativa) and a poplar tree (Populus trichocarpa) have also been sequenced. Comparison of the genomes shows many genes in common, comprising the basic minimal plant genome.
Figure 17.9 Plant Genomes. Three plant genomes share a common set of approximately 21,000 genes that appear to comprise the “minimal” plant genome.
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Eukaryotes have closely related genes called gene families. These arose over evolutionary time when different copies of genes underwent separate mutations. Example: Genes encoding the globin proteins all arose from a single common ancestral gene.
Figure 17.10 The Globin Gene Family. The α-globin and β-globin clusters of the human globin gene family are located on different chromosomes. The genes of each cluster are separated by noncoding “spacer” DNA. The nonfunctional pseudogenes are indicated by the Greek letter psi (Ψ). The γ gene has two variants, Aγ and Gγ.
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? During development, different members of the globin gene family are expressed at different times in different tissues. Example: Hemoglobin of the human fetus contains γ-globin, which binds O2 more tightly than adult hemoglobin.
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Many gene families include nonfunctional pseudogenes (Ψ), resulting from mutations that cause a loss of function. A pseudogene may simply lack a promoter, and thus fail to be transcribed, or it may lack a recognition site needed for the removal of an intron.
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Eukaryotic genomes have repetitive DNA sequences: Highly repetitive sequences—short sequences (< 100 bp) repeated thousands of times in tandem; not transcribed. Short tandem repeats (STRs) of 1–5 bp can be used in DNA fingerprinting.
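To make the idea of short tandem repeats concrete, the following Python sketch scans a sequence for repeat units of 1–5 bp repeated back to back. The function name find_strs, the threshold of three repeats, and the example sequence are assumptions made for illustration. DNA fingerprinting compares repeat counts at known loci between individuals, which this sketch does not attempt.

```python
import re


def find_strs(dna, min_unit=1, max_unit=5, min_repeats=3):
    """Return (start, unit, repeat_count) for tandem repeats of short units."""
    # Lazy unit match so the shortest repeating unit is reported
    # (for example "CA" rather than "CACA"); \1 is a backreference to that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(dna)
    ]


# Hypothetical sequence: (CA) x 4 and (GATA) x 4 embedded in other bases.
sequence = "GG" + "CA" * 4 + "TT" + "GATA" * 4 + "CC"
print(find_strs(sequence))  # [(2, 'CA', 4), (12, 'GATA', 4)]
```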
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Moderately repetitive sequences are repeated 10–1,000 times. They include the genes for tRNAs and rRNAs. Single copies of the tRNA and rRNA genes would be inadequate to supply the large amounts of these molecules needed by cells.
Figure 17.11 A Moderately Repetitive Sequence Codes for rRNA (Part 1). (A) This rRNA gene, along with its nontranscribed spacer region, is repeated 280 times in the human genome, with clusters on five chromosomes. (B) This electron micrograph shows transcription of multiple rRNA genes.
Figure 17.11 A Moderately Repetitive Sequence Codes for rRNA (Part 2). (B) This electron micrograph shows transcription of multiple rRNA genes.
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Transposons (transposable elements) are moderately repetitive sequences. Three types are retrotransposons: SINEs (short interspersed elements), LINEs (long interspersed elements), and LTRs (long terminal repeats).
Table 17.6
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Retrotransposons are transcribed into RNA, which is a template for new DNA. The new DNA becomes inserted at a new location, resulting in two copies of the transposon. DNA transposons are excised from the original location and become inserted at a new location without being replicated.
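The difference between the two mechanisms is easy to see in a small string-based Python sketch. Everything here is illustrative: the genome is a made-up string, the transposon is written in lowercase ("tnp") so it stands out, and the function names copy_and_paste and cut_and_paste are hypothetical. The RNA intermediate of retrotransposons is not modeled; only the outcome (two copies versus one) is shown.

```python
def copy_and_paste(genome, element, target):
    """Retrotransposon-style move: the original stays and a copy is inserted at `target`."""
    return genome[:target] + element + genome[target:]


def cut_and_paste(genome, element, target):
    """DNA-transposon-style move: the element is excised, then reinserted.

    `target` is an index into the genome after the element has been removed.
    """
    start = genome.index(element)
    excised = genome[:start] + genome[start + len(element):]
    return excised[:target] + element + excised[target:]


genome = "AAAAtnpCCCCGGGG"

after_copy = copy_and_paste(genome, "tnp", 11)
print(after_copy, after_copy.count("tnp"))  # AAAAtnpCCCCtnpGGGG 2

after_cut = cut_and_paste(genome, "tnp", 8)
print(after_cut, after_cut.count("tnp"))    # AAAACCCCtnpGGGG 1
```

Because copy and paste adds a copy with every move, this mechanism can increase the number of copies of an element in a genome over time, whereas cut and paste only relocates it.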
17.3 What Have We Learned from Sequencing Eukaryotic Genomes? Insertion of a transposon at a new location can have important consequences, such as mutations and gene duplications. Transposition can also shuffle the genetic material and create new genes. Transposons may have played a role in endosymbiosis.
17.4 What Are the Characteristics of the Human Genome? Sequencing of the human genome revealed many interesting facts: Protein-coding regions make up only about 1.2% of the genome and comprise about 21,000 genes. With only about 21,000 genes, the average gene must code for several different proteins; posttranscriptional mechanisms make it possible to produce different proteins from a single gene.
17.4 What Are the Characteristics of the Human Genome? An average gene has 27,000 base pairs. Virtually all human genes have many introns. About half of the genome is transposons and other repetitive sequences.
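A back-of-the-envelope calculation shows how these figures fit together. The Python sketch below uses the numbers from the slides (about 1.2% coding, about 21,000 genes, 27,000 bp per average gene) plus one value that is not stated here and is an assumption: a haploid genome size of roughly 3.2 billion base pairs.

```python
GENOME_BP = 3.2e9        # assumed haploid genome size; not stated in the text above
CODING_FRACTION = 0.012  # "about 1.2%" of the genome is protein-coding
GENE_COUNT = 21_000
AVG_GENE_BP = 27_000

coding_bp = GENOME_BP * CODING_FRACTION                   # total protein-coding DNA
coding_bp_per_gene = coding_bp / GENE_COUNT               # coding DNA per average gene
fraction_in_genes = GENE_COUNT * AVG_GENE_BP / GENOME_BP  # genes, introns included
coding_share_of_gene = coding_bp_per_gene / AVG_GENE_BP

print(f"coding bp per gene:     {coding_bp_per_gene:.0f}")    # 1829
print(f"genome inside genes:    {fraction_in_genes:.1%}")     # 17.7%
print(f"coding share of a gene: {coding_share_of_gene:.1%}")  # 6.8%
```

Under these assumptions only about 7% of an average gene is coding sequence, which fits with the statement that human genes have many introns.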
17.4 What Are the Characteristics of the Human Genome? 99.5% of the genome is the same in all people. Variation among individuals is due to single nucleotide polymorphisms (SNPs), and differences in sequence copy number from chromosomal deletions, duplications, or translocations.
17.4 What Are the Characteristics of the Human Genome? Genes are not evenly distributed over the genome. The Y chromosome has the fewest genes (231); chromosome 1 has the most (2,968).
17.4 What Are the Characteristics of the Human Genome? Comparisons of prokaryote and eukaryote genomes have revealed evolutionary relationships between genes.
Figure 17.12 Evolution of the Genome. A comparison of the human and other genomes has revealed how genes with new functions have been added over the course of evolution. Each percentage number refers to genes in the human genome. Thus 21 percent of human genes have homologs in prokaryotes and other eukaryotes, 32 percent of human genes occur only in other eukaryotes, and so on.
17.4 What Are the Characteristics of the Human Genome? The genomes of many primates have been sequenced, and biologists are interested in which genes make humans unique. Chimpanzees are our closest living relatives: they share almost 99% of our DNA sequences.
17.4 What Are the Characteristics of the Human Genome? DNA from the bones of Neanderthals, who lived in Europe up to 50,000 years ago, has also been sequenced. It is 99% identical to human DNA, justifying classification of Neanderthals as part of the same genus, Homo.
17.4 What Are the Characteristics of the Human Genome? Comparisons of human and Neanderthal genes: A mutation in MC1R in Neanderthals causes lower activity of MC1R, known to result in fair skin and red hair. FOXP2, involved in vocalization, is identical in humans and Neanderthals, suggesting that Neanderthals were capable of speech.
Figure 17.13 A Neanderthal Child. Genome sequencing and analyses have led to this reconstruction of a Neanderthal child who lived about 60,000 years ago.
17.4 What Are the Characteristics of the Human Genome? There are some distinctive “human” DNA sequences and also distinctive “Neanderthal” sequences. There is some mixture of the two, indicating that humans and Neanderthals interbred.
17.4 What Are the Characteristics of the Human Genome? Rapid genotyping technologies are being used to understand the genetic basis of diseases such as diabetes, heart disease, and Alzheimer’s disease. “Haplotype maps” are used to identify SNPs that are linked to genes involved in disease.
17.4 What Are the Characteristics of the Human Genome? A haplotype is a piece of chromosome with a set of SNPs that are usually inherited as a unit. By comparing the haplotypes of individuals with and without a particular genetic disease, the loci associated with the disease can be identified.
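The comparison of haplotypes between affected and unaffected individuals can be sketched in Python as a simple allele-frequency comparison. The data below are invented for illustration, each haplotype is reduced to a string of three SNP alleles, and the names allele_frequency and rank_snps are hypothetical. Actual association studies test far more SNPs and use statistical tests, which this sketch omits.

```python
def allele_frequency(haplotypes, snp, allele):
    """Fraction of haplotypes carrying `allele` at SNP position `snp`."""
    return sum(h[snp] == allele for h in haplotypes) / len(haplotypes)


def rank_snps(cases, controls):
    """Rank (difference, snp, allele) by how much more common the allele is in cases."""
    results = []
    for snp in range(len(cases[0])):
        for allele in sorted({h[snp] for h in cases + controls}):
            diff = (allele_frequency(cases, snp, allele)
                    - allele_frequency(controls, snp, allele))
            results.append((diff, snp, allele))
    return sorted(results, reverse=True)


# Hypothetical haplotypes: one character per SNP position.
cases = ["ATG", "ATG", "ACG", "ATC"]      # individuals with the disease
controls = ["ACG", "ACC", "ACG", "ACC"]   # individuals without the disease

best = rank_snps(cases, controls)[0]
print(best)  # (0.75, 1, 'T')
```

In this made-up data set, allele T at the second SNP is found in three of four cases and in no controls, so that locus would be the first candidate for a closer look.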
Figure 17.14 SNP Genotyping and Disease. Scanning the genomes of people with and without particular diseases reveals correlations between SNPs and complex diseases.
17.4 What Are the Characteristics of the Human Genome? New technologies analyze thousands or millions of SNPs to determine which ones are associated with specific diseases. As the cost of sequencing entire genomes decreases, SNP testing may be superseded.
Table 17.7
17.4 What Are the Characteristics of the Human Genome? Pharmacogenomics is the study of how an individual’s genome affects response to drugs or other outside agents. SNPs that are associated with specific drug responses can be identified to personalize drug treatments and determine if a patient will respond to a drug.
Figure 17.15 Pharmacogenomics. Correlations between genotypes and responses to drugs will help physicians develop personalized medical care. The different colors indicate individuals with different SNPs.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Many genes encode more than one protein. Alternative splicing and posttranslational modifications increase the number of proteins that can be derived from one gene. But many proteins are produced only by certain cells under specific conditions.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Proteome: sum total of proteins produced by an organism; it is more complex than the genome. Proteomics seeks to identify and characterize all the expressed proteins in an organism.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Two techniques are used to analyze the proteome: Two-dimensional gel electrophoresis separates proteins based on size and electric charges. Mass spectrometry identifies proteins by their atomic masses.
Figure 17.16 Proteomics. (A) A single gene can code for multiple proteins. (B) A cell’s proteins can be separated on the basis of charge and size by two-dimensional gel electrophoresis. The two separations can distinguish most proteins from one another.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Comparisons of eukaryotic proteomes have revealed a common set of about 1,300 proteins that provide the basic metabolic functions.
Figure 17.17 Proteins of the Eukaryotic Proteome. About 1,300 proteins are common to all eukaryotes and fall into these categories. Although their amino acid sequences may differ to a limited extent, they perform the same essential functions in all eukaryotes.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Proteins have different functional regions or domains. Proteins that are unique to a particular organism are often just unique combinations of domains that exist in other organisms. This reshuffling of the genetic deck is a key to evolution.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Gene and protein function are both affected by the internal and external environments of the cell. Enzyme activities affect concentrations of their substrates and products, called metabolites. As the proteome changes, so will the abundances of metabolites.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Metabolome: quantitative description of all of the small molecules in a cell or organism. Primary metabolites are involved in normal processes, such as the glycolysis pathway; they also include hormones and other signaling molecules.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Secondary metabolites—often unique to particular organisms or groups. Examples include antibiotics made by microbes and chemicals made by plants for defense.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Measuring metabolites involves gas chromatography and high-performance liquid chromatography, which separate molecules. Mass spectrometry and nuclear magnetic resonance spectroscopy are used to identify them.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? A human metabolome database has been established and contains 6,500 metabolites. The challenge now is to relate levels of these substances to physiology.
17.5 What Do the New Disciplines Proteomics and Metabolomics Reveal? Plant metabolomics has been studied for many years. Tens of thousands of secondary metabolites have been identified. The metabolome of the model organism Arabidopsis thaliana is now being described.
17 Answer to Opening Question: Myostatin is a protein that inhibits muscle growth. In dog breeds with highly developed leg muscles, the gene for myostatin has a mutation that makes the protein inactive. In humans it may be possible to manipulate myostatin to treat muscle-wasting diseases such as muscular dystrophy.