Genetics: From Genes to Genomes

Presentation transcript:

Genetics: From Genes to Genomes PowerPoint to accompany Genetics: From Genes to Genomes Third Edition Hartwell ● Hood ● Goldberg ● Reynolds ● Silver ● Veres Chapter 10 Prepared by Malcolm Schug University of North Carolina Greensboro Copyright © The McGraw-Hill Companies, Inc. Permission required to reproduce or display

Reconstructing the Genome Through Genetic and Molecular Analysis

Outline of Chapter 10. Challenges and strategies of genome analysis: genome size; features to be analyzed; problems with DNA polymorphisms. Development of whole-genome maps. Insights emerging from complete genome sequencing: number and type of genes; extent of repeated sequences; genome organization and structure; evolution by lateral gene transfer. High-throughput tools for analyzing genomes and their protein products: DNA sequencers; DNA arrays; mass spectrometers. Two paradigm changes propelled by whole-genome sequences and new tools of genome analysis: systems biology; predictive and preventive medicine.

The genomes of living organisms vary enormously in size.

Genomicists look at two basic features of genomes: sequence and polymorphism. Major challenges in determining the sequence of each chromosome in a genome and identifying many polymorphisms: How does one sequence a 500 Mb chromosome 600 bp at a time? How accurate should a genome sequence be? The DNA sequencing error rate is about 1% per 600 bp. How does one distinguish sequence errors from polymorphisms? The rate of polymorphism in the diploid human genome is about 1 in 500 bp. Repeat sequences may be hard to place. Unclonable DNA cannot be sequenced; up to 30% of the genome is heterochromatic DNA that cannot be cloned.

Divide-and-conquer strategy meets most challenges. Chromosomes are broken into small overlapping pieces and cloned. The ends of the clones are sequenced and reassembled into the original chromosome strings. Each piece is sequenced multiple times to reduce the error rate; 10-fold sequence coverage achieves an error rate of less than 1/10,000.
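
The idea that repeated sequencing of the same piece reduces the error rate can be illustrated with a simple majority vote. This is only a minimal sketch: it assumes the reads are already aligned and of equal length, and the function name `consensus` and the example reads are hypothetical, not taken from the slides.

```python
from collections import Counter

def consensus(aligned_reads):
    """Call the most common base in each column of pre-aligned, equal-length reads."""
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to the same length")
    called = []
    for column in zip(*aligned_reads):
        # most_common(1) returns [(base, count)] for the majority base
        base, _count = Counter(column).most_common(1)[0]
        called.append(base)
    return "".join(called)

# Five reads of the same region; two carry a single sequencing error each.
reads = [
    "ACGTTGCA",
    "ACGTTGCA",
    "ACCTTGCA",  # error in the third base
    "ACGTTGCA",
    "ACGTAGCA",  # error in the fifth base
]
print(consensus(reads))
```

Because the two errors fall in different columns, each is outvoted by the other four reads. Real assemblers also weigh base-quality scores, which this sketch ignores.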

Figure 10.2

Techniques for mapping and cloning. Library of DNA fragments: fragments of 500 to 1,000,000 bp are inserted into one of a variety of vectors. Hybridization: locates a particular DNA sequence within the library of fragments. PCR amplification: direct amplification of a particular region of DNA ranging from 1 bp to > 20 kb. DNA sequencing: an automated DNA sequencer using the Sanger method determines sequences 600 bp at a time. Computational tools: programs for identifying matches between a particular sequence and a large population of previously sequenced fragments; programs for identifying overlaps of DNA fragments; programs for estimating error rates; programs for identifying genes in chromosomal sequences.
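
A program for identifying overlaps of DNA fragments can be sketched, in its simplest form, as a search for the longest suffix of one fragment that matches a prefix of another. The function names `overlap_length` and `merge` and the `min_overlap` threshold are illustrative assumptions; production assemblers tolerate mismatches and use indexed data structures rather than this brute-force comparison.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that exactly equals a prefix of `right`."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest_possible = min(len(left), len(right))
    # Try the longest candidate first so the first hit is the best one.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0

def merge(left, right, min_overlap=3):
    """Join two fragments across their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]

print(overlap_length("ACGTTGCA", "TGCAGGTC"))
print(merge("ACGTTGCA", "TGCAGGTC"))
```

Here the two fragments share the four bases TGCA, so they merge into a single 12-base string.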

Making a large-scale linkage map. Types of DNA polymorphisms used for large-scale mapping: single nucleotide polymorphisms (SNPs), occurring once every 500 to 1,000 bp across the genome; simple sequence repeats (SSRs), occurring once every 20 to 40 kb across the genome, in which a unit of 2-5 nucleotides is repeated 4-50 or more times. Most SNPs and SSRs have little or no effect on the organism. They serve as DNA markers across the chromosomes. It must be possible to identify and assay them rapidly in populations of 100s to 1000s of individuals. Figure 10.3
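
The SSR definition above (a unit of 2-5 nucleotides repeated at least 4 times) maps naturally onto a regular expression with a backreference. The following is a rough sketch, not a validated marker-discovery tool; `find_ssrs` and the test sequence are hypothetical, and single-base runs are excluded by choice.

```python
import re

# A unit of 2-5 bases (shortest unit preferred), followed by at least 3 more copies.
SSR_PATTERN = re.compile(r"([ACGT]{2,5}?)\1{3,}")

def find_ssrs(sequence):
    """Return (start, unit, repeat_count) for each perfect SSR found."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip single-base runs such as AAAAAAAA
        repeat_count = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, repeat_count))
    return hits

# Hypothetical sequence with an (AC)5 repeat and a (CAG)5 repeat.
seq = "GGT" + "AC" * 5 + "ATT" + "CAG" * 5 + "TT"
print(find_ssrs(seq))
```

In the laboratory, the length differences between alleles of such repeats are what the PCR-based assays measure; the code only locates perfect repeats in a known sequence.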

Genome-wide identification of genetic markers. Initial genetic maps used SSRs, which are highly polymorphic. SSRs are identified by screening DNA libraries with SSR probes, then amplified by PCR and assayed for length differences. SNPs: millions have been identified more recently by comparison of orthologous regions of cDNA clones from different individuals.

Homologous: genes with enough sequence similarity to be related somewhere in evolutionary history. Orthologous: genes in two different species that arose from the same gene in the two species' common ancestor. Paralogous: genes that arise by duplication within the same species. Orthologous genes are always homologous, but homologous genes are not always orthologous.

SNPs and SSRs for genome coverage. Until recently, maps were constructed from about 500 SSRs evenly spaced across the genome (1 SSR every 6 Mb). SNPs provide more than 500,000 DNA markers across the genome.

Genome-wide typing of genetic markers. Two-stage assay for simple sequence repeats: PCR amplification, then size separation. Figure 10.4

Long-range physical maps: karyotypes and genomic libraries position markers on chromosomes. Overlapping DNA fragments that span each of the chromosomes are ordered and oriented. Physical maps are based on direct analysis of DNA rather than on recombination, on which linkage maps are based. They chart the actual number of bp, kb, or Mb that separate a locus from its neighbors. Linkage vs. physical maps: 1 cM is about 1 Mb in humans; 1 cM is about 2 Mb in mice.
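
The conversion between linkage and physical distance quoted above can be written as a one-line calculation. This sketch treats the slide's ratios as genome-wide averages; the names `MB_PER_CM` and `approx_physical_distance_mb` are hypothetical, and because recombination rates vary along a chromosome the result is only a rough estimate for any particular region.

```python
# Average physical distance per centimorgan, as quoted on the slide.
MB_PER_CM = {"human": 1.0, "mouse": 2.0}

def approx_physical_distance_mb(genetic_distance_cm, species):
    """Rough physical distance in Mb for a given genetic distance in cM."""
    return genetic_distance_cm * MB_PER_CM[species]

print(approx_physical_distance_mb(5, "human"))
print(approx_physical_distance_mb(5, "mouse"))
```

Two markers 5 cM apart would thus be expected to lie roughly 5 Mb apart in humans and roughly 10 Mb apart in mice.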

Vectors used to clone large inserts for physical mapping. YACs (yeast artificial chromosomes): insert size 100 kb to 1 Mb. BACs (bacterial artificial chromosomes): insert size 50-300 kb; more stable and easier to purify from host DNA than YACs.

How to determine the order of clones across the genome. Overlapping inserts help align cloned fragments. Bottom-up approach: overlaps among tens of thousands of clones are determined by restriction site analysis or sequence-tagged sites (STSs). Top-down approach: the insert is hybridized against the karyotype of the entire genome.

Identifying and isolating a set of overlapping fragments from a library. Two approaches. (1) Linkage maps used to derive a physical map: start from a set of markers less than 1 cM apart; use the markers to retrieve fragments from the library by hybridization; construct contigs (two or more partially overlapping cloned fragments); chromosome walk by using the ends of unconnected contigs to probe the library for fragments in unmapped regions. (2) Physical mapping techniques: direct analysis of DNA; overlapping clones aligned by restriction mapping; sequence-tagged sites (STSs).

Physical mapping by analysis of STSs: the bottom-up approach. Each STS represents a unique segment of the genome amplified by PCR. Figure 10.5
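
Because each STS marks a unique segment of the genome, two clones that both test positive for the same STS must overlap. A minimal sketch of how that rule groups clones into contigs follows; the clone names, STS names, and the function `build_contigs` are hypothetical, and the sketch only groups clones without working out their order within a contig.

```python
def build_contigs(clone_sts):
    """Group clones into contigs; clones sharing at least one STS are assumed to overlap."""
    parent = {clone: clone for clone in clone_sts}

    def find(clone):
        # Follow parent links to the representative clone, compressing the path.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    first_clone_with = {}
    for clone, markers in clone_sts.items():
        for sts in markers:
            if sts in first_clone_with:
                # Shared STS: join this clone's group with the earlier clone's group.
                parent[find(clone)] = find(first_clone_with[sts])
            else:
                first_clone_with[sts] = clone

    contigs = {}
    for clone in clone_sts:
        contigs.setdefault(find(clone), []).append(clone)
    return sorted(sorted(members) for members in contigs.values())

# Hypothetical PCR results: which STSs were amplified from which clone.
clones = {
    "BAC1": {"STS_A", "STS_B"},
    "BAC2": {"STS_B", "STS_C"},
    "BAC3": {"STS_C", "STS_D"},
    "BAC4": {"STS_X"},
    "BAC5": {"STS_X", "STS_Y"},
}
print(build_contigs(clones))
```

BAC1 through BAC3 form one contig through the chain of shared markers, while BAC4 and BAC5 form a second, unconnected contig; a chromosome walk would then try to bridge the gap between the two.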

Human karyotype. (a) The complete set of human chromosomes stained with Giemsa dye shows bands. (b) Ideograms show the idealized banding pattern. Figure 10.6 a, b

Chromosome 7 at three levels of resolution. Figure 10.6 c

FISH protocol for the top-down approach. Figure 10.8

Sequence maps show the order of nucleotides in a cloned piece of DNA. Two strategies for sequencing the human genome: the hierarchical shotgun approach and the whole-genome shotgun approach. Shotgun: randomly generated overlapping insert fragments, either fragments from BACs or fragments from shearing the whole genome. Fragments are generated by shearing DNA with sonication or by partial digestion with restriction enzymes.

Hierarchical shotgun strategy. Used in the publicly funded effort to sequence the human genome. Shear a 200 kb BAC clone into ~2 kb fragments. Sequence the ends 10 times (10-fold coverage). About 1700 plasmid inserts are needed per BAC, and about 20,000 BACs to cover the genome. Data from linkage and physical maps are used to assemble sequence maps of chromosomes. Significant work is required to create libraries of each BAC and to physically map the BAC clones. Figure 10.9
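
The figure of about 1700 plasmid inserts per BAC can be reproduced with a short calculation. This sketch assumes 600 bp reads (the read length quoted on earlier slides), two end reads per insert, and a genome of roughly 3,000 Mb; the function names are hypothetical.

```python
import math

def inserts_per_bac(bac_size_bp=200_000, coverage=10,
                    read_length_bp=600, reads_per_insert=2):
    """Plasmid inserts needed to cover one BAC at the requested depth."""
    reads_needed = bac_size_bp * coverage / read_length_bp
    return math.ceil(reads_needed / reads_per_insert)

def minimum_bacs(genome_size_bp=3_000_000_000, bac_size_bp=200_000):
    """BACs needed if they tiled the genome end to end with no overlap."""
    return math.ceil(genome_size_bp / bac_size_bp)

print(inserts_per_bac())  # close to the slide's figure of about 1700
print(minimum_bacs())     # lower bound; overlap pushes this toward about 20,000
```

The calculation gives 1667 inserts per BAC and a no-overlap minimum of 15,000 BACs, consistent with the rounded figures on the slide once overlap between neighboring BACs is allowed for.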

Whole-genome shotgun sequencing. Used by the private company Celera to sequence the whole human genome. The whole genome is randomly sheared three times: a plasmid library is constructed with ~2 kb inserts, a plasmid library with ~10 kb inserts, and a BAC library with ~200 kb inserts. A computer program assembles the sequences into chromosomes. No physical map construction. Only one BAC library. Overcomes problems of repeat sequences. Figure 10.10

Limitations of whole-genome sequencing. Some DNA cannot be cloned, e.g., heterochromatin. Some sequences rearrange or sustain deletions when cloned. Future large-genome sequencing will use both shotgun approaches.

Sequencing of the human genome. Most of the draft sequencing took place during the last year of the project. Instrument improvements: 500,000 bp/day per instrument. An automated, factory-like production line generated sufficient DNA to supply sequencers on a daily basis. Large sequencing centers with 100-300 instruments: 150,000,000 bp/day.

Integration of linkage, physical, and sequence maps. Provides a check on the correct order of each map against the other two. SSR and SNP DNA linkage markers are readily integrated into the physical map by PCR analysis across insert clones in the physical map. SSR and SNP markers (linkage maps) and STS markers (physical maps) have unique sequences of 20 bp or more, allowing placement on the sequence map.

Changes in biology, genetics, and genomics from the human genome sequence. Genetics parts list. Speeds gene finding and gene-function analysis: sequence identification in a second organism through homology; gene function in one organism helps in understanding function in another for orthologous and paralogous genes. Genes often encode one or more protein domains, which allows a guess at the function of a new protein by comparison of its sequence with databases of all known domains. Ready access to identification of known human polymorphisms. Speeds mapping of new organisms by comparison; e.g., mouse and human have high similarity in gene content and order.

Major insights from human and model organism sequences. Approximately 25,000 human genes. Genes encode noncoding RNAs or proteins. Repeat sequences are > 50% of the genome. Distinct types of gene organization: gene families; gene-rich regions. Combinatorial strategies amplify genetic information and increase diversity. Evolution by lateral transfer of genes from one organism to another. Males have a twofold higher mutation rate than females. Human races have very few unique distinguishing genes. All living organisms evolved from a common ancestor.

Conserved segments of syntenic blocks in human and mouse genomes. Figure 10.12

Noncoding RNA genes. Transfer RNAs (tRNAs): adaptors that translate the triplet code of RNA into the amino acid sequence of proteins. Ribosomal RNAs (rRNAs): components of the ribosome. Small nucleolar RNAs (snoRNAs): RNA processing and base modification in the nucleolus. Small nuclear RNAs (snRNAs): components of spliceosomes.

Protein-coding genes generate the proteome. Proteome: the collective translation of 30,000 protein-coding genes into proteins. Complexity of the proteome increases from yeast to humans: more genes; shuffling, increase, or decrease of functional modules; more paralogs; alternative RNA splicing (humans exhibit significantly more); chemical modification of proteins is higher in humans.

Protein-coding genes generate the proteome. How transcription factor protein domains have expanded in specific lineages. Figure 10.11

Examples of domain accretions in chromatin proteins. Figure 10.13

Number of distinct domain architectures in four eukaryotic genomes. Figure 10.14

Repeat sequences fall into five classes: transposon-derived repeats; processed pseudogenes; SSRs; segmental duplications of 10-300 kb; blocks of repeated sequences at centromeres, telomeres, and other chromosomal features.

Repeat sequences constitute more than 50% of the genome. Figure 10.15

Gene organization of the genome. Gene families: closely related genes, clustered or dispersed. Gene-rich regions: functional or chance events? Gene deserts: span 144 Mb or 3% of the genome. Do they contain regions that are difficult to identify, e.g., big genes, whose nuclear transcript spans 500 kb or more with very large introns (exons < 1% of DNA)?

Genome has a distinct organization. Gene family: the olfactory receptor gene family.

Class II region of the human major histocompatibility complex contains 60 genes in 700 kb. Figure 10.17

Combinatorial strategies. At the DNA level: T-cell receptor genes are encoded by a multiplicity of gene segments (Figure 10.18). At the RNA level: splicing of exons in different orders (Figure 10.19a).

Lateral transfer of genes. More than 200 human genes may have arisen by transfer from organisms such as bacteria. Lateral transfer is the direct transfer of genes from one species into the germ line of another.

Twofold higher mutation rate in males. Based on comparison of X and Y chromosomes. The same may be true for autosomes, but it is difficult to measure. The majority of human mutations arise in males. Males give rise to more defects, but also to more diversity.

Human races have similar genes. Genome sequence centers have sequenced significant portions of at least three races. Range of polymorphisms within a race can be much greater than the range of differences between any two individuals of different races. Very few genes are race specific. Genetically, humans are a single race.

All living organisms are a single race. All living organisms have remarkably similar genetic components. Life evolved once and we are descendants of that event. Analysis of appropriate biological systems in model organisms provides fundamental insight into corresponding human systems.

In the future, other features of chromosomes will become increasingly important. Chemical modification of bases: DNA methylation is understood now; others may be discovered. Interaction of various proteins with the chromosome. Three-dimensional structure of proteins in the nucleus, which may determine interactions of chromosomal regions with regions of the nuclear envelope. More effective tools need to be developed to examine chromosome features.

High-throughput instruments: DNA sequencer. Figure 10.20

High-throughput instruments, e.g., microarrays. Figure 10.21

Two-color DNA microarray. Figure 10.22

Analysis of genomic and RNA sequences. Quantitative analysis of mRNA levels. Serial analysis of gene expression (SAGE): small cDNA tags of 15 bp from the 3' ends of mRNAs are linked and sequenced. Massively parallel signature sequencing (MPSS). Transcriptome: the population of mRNAs expressed in a single cell or cell type. MPSS allows identification of most of a cell's rarely expressed mRNAs.
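
The counting step behind SAGE can be sketched as follows: take one short tag from near the 3' end of each transcript, then count how often each tag appears. The slide gives only the tag length, so the anchoring site used here (the 3'-most CATG, as in common SAGE protocols that cut with NlaIII) is an assumption, and the transcript sequences and function names are hypothetical.

```python
from collections import Counter

ANCHOR = "CATG"   # assumed anchoring-enzyme site; not stated on the slide
TAG_LENGTH = 15   # tag length quoted on the slide

def sage_tag(transcript):
    """Return the tag that follows the 3'-most anchor site, or None if unavailable."""
    site = transcript.rfind(ANCHOR)
    if site == -1:
        return None
    start = site + len(ANCHOR)
    tag = transcript[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None

def tag_counts(transcripts):
    """Count tags across a population of transcripts; counts approximate mRNA abundance."""
    tags = (sage_tag(t) for t in transcripts)
    return Counter(tag for tag in tags if tag is not None)

# Hypothetical transcripts: gene_a is sampled three times, gene_b once.
gene_a = "GG" + "CATG" + "AAACCCGGGTTTAAA" + "CCAAAA"
gene_b = "TTCATGAC" + "CATG" + "TTTTGGGGCCCCAAA" + "AAAA"
counts = tag_counts([gene_a, gene_a, gene_a, gene_b])
print(counts.most_common())
```

The tag from gene_a is counted three times and the tag from gene_b once, which is the sense in which tag frequency serves as a quantitative readout of mRNA levels.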

Lynx Therapeutics sequencing strategy of MPSS. Figure 10.24

Systems biology: the global study of multiple components of biological systems and their interactions. This new approach to studying biological systems has been made possible by: sequencing genomes; high-throughput platform development; development of powerful computational tools; the use of model organisms; comparative genomics.

The Human Genome Project has changed the potential for predictive/preventive medicine. Provided access to DNA polymorphisms underlying human variability. Makes possible the identification of genes predisposing to disease. Understanding of defective genes in the context of biological systems. Circumventing the limitations of defective genes: novel drugs; environmental controls; approaches such as stem-cell transplants or gene therapy.

Social, ethical, and legal issues. Privacy of genetic information. Limitations on genetic testing. Patenting of DNA sequences. Society's view of older people. Training of physicians. Human genetic engineering: somatic gene therapy (inserting replacement genes); germ-line therapy (modifications of the human germ line).