Chapter 24 topics: Genomics, Proteomics, Bioinformatics


1 Chapter 24 topics: Genomics, Proteomics, Bioinformatics
Student learning outcomes: Describe tools to obtain DNA sequences of genomes. Explain how microarrays analyze the transcriptome. Describe how proteomics studies proteins of cells. Define how bioinformatics manages vast stores of DNA data. Figures: 1, 3-13, 16, 17, 19, 20, 23, 24, 27, 28, 30; Tables 1, 2, 3. Problems: 1, 2*, 3-7, 9, 12*, 15, 17, 18, 20*, 22, 23*, 24, AQ3*, 4

2 24.1 Positional Cloning
Positional cloning: discover genes for genetic traits. Mapping studies roughly locate the gene of interest to a relatively small region of DNA on a chromosome. Physical landmarks relate to gene position. Restriction Fragment Length Polymorphisms (RFLPs): lengths of restriction fragments from a specific enzyme vary among individuals. CpG islands: DNA with unmethylated CpG is often actively expressed; find with methylation-sensitive restriction enzymes (HpaII vs. MspI for CCGG: HpaII cuts only unmethylated sites, MspI cuts either way)

3 Southern blots detect RFLPs
Fig. 1 People differ in presence of particular HindIII site
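The banding difference in Fig. 1 follows from where the enzyme cuts. The sketch below is a toy illustration, not the textbook's data: the sequences and the helper name `fragment_lengths` are made up, and HindIII is modeled simply as cutting A^AGCTT on a linear molecule.

```python
def fragment_lengths(dna, site="AAGCTT", cut_offset=1):
    """Return fragment lengths after cutting linear dna at every site.

    HindIII recognizes AAGCTT and cuts after the first A (offset 1).
    """
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    edges = [0] + cuts + [len(dna)]
    return [b - a for a, b in zip(edges, edges[1:])]

# Hypothetical alleles: allele_b has lost the middle site (AAGCTT -> AAGCTA).
allele_a = "GG" + "AAGCTT" + "C" * 10 + "AAGCTT" + "T" * 20 + "AAGCTT" + "GG"
allele_b = allele_a.replace("AAGCTT" + "T" * 20, "AAGCTA" + "T" * 20)

print(fragment_lengths(allele_a))  # [3, 16, 26, 7]
print(fragment_lengths(allele_b))  # [3, 42, 7]
```

The two middle fragments of allele_a merge into one longer fragment in allele_b, which is the kind of length difference a Southern blot probe would reveal.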

4 Classic example: Identifying Gene Mutated in Human Huntington’s Disease (HD)
Dominant disease, late onset, degenerative. Wexler and Gusella used RFLPs with huge family groups having the disease to map the HD gene near the end of chromosome 4. Mutation causing disease is expansion of CAG repeat from normal range of copies to abnormal range of > 38 copies (triplet expansion). Extra repeats -> extra Gln inserted into huntingtin, product of HD gene. Huntingtin has normal role in brain: interferes with transcription factor SP1 binding TAF130. Mouse knockout: heterozygotes have neuro problems; null are dead
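Counting the triplet expansion is easy to sketch in code. The function names and the toy sequences below are hypothetical; the only number taken from the slide is the "> 38 copies" cutoff, and this is an illustration of the counting, not a diagnostic procedure.

```python
import re

def longest_cag_run(dna):
    """Number of copies in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)

def exceeds_threshold(dna, threshold=38):
    """True if the longest CAG run has more than `threshold` copies."""
    return longest_cag_run(dna) > threshold

# Made-up example alleles (flanking sequence is arbitrary).
short_allele = "ATG" + "CAG" * 20 + "CCGCCA"
long_allele = "ATG" + "CAG" * 45 + "CCGCCA"

print(longest_cag_run(short_allele), exceeds_threshold(short_allele))  # 20 False
print(longest_cag_run(long_allele), exceeds_threshold(long_allele))    # 45 True
```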

5 RFLPs helped locate Huntington’s disease gene
Fig. 3 Combinations of RFLP distinguish 4 possible haplotypes Fig. 4 Southern blot defines haplotype genotypes of members

6 HD gene identified from studies of large families
Pedigree studies, molecular studies of haplotypes, and correlation with disease led to cloning of the gene and prediction of disease (variable age of onset). Fig: Haplotype C is associated with disease - predictive

7 24.2 Sequencing Genomes Information from genome sequences:
Location of exact coding regions for all genes. Spatial relationships among genes, exact distances between them in bp. Sanger dideoxy sequencing, 1977 (φX174 phage). How is a coding region recognized? It contains an ORF long enough to code for a protein. An ORF (open reading frame) must start with an ATG triplet and end with a stop codon. A phage or bacterial ORF is the same as the coding region. Defining a eukaryotic ORF is more difficult: introns
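The ORF rule on this slide (start at ATG, read in frame to the first stop codon, require a minimum length) can be sketched as follows. This is a simplified scan under stated assumptions: forward strand only, no introns, and the name `find_orfs` and the `min_codons` default are illustrative choices, not from the chapter.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for ORFs on the forward strand.

    An ORF starts at ATG and runs to the first in-frame stop codon.
    `end` is exclusive and includes the stop codon; `min_codons` counts
    the codons before the stop (ATG included).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, frame))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs

print(find_orfs("CCATGAAATTTGGGTAACC"))  # [(2, 17, 2)]
```

For a bacterial or phage sequence such a scan approximates the coding regions directly; for eukaryotic DNA the introns noted on the slide break this simple approach.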

8 Genome Results (Table 1 examples)
Numerous RNA or DNA sequences of genomes of viruses and organisms have been obtained: Phages, viruses Bacteria Animals Plants Human, Neanderthal Comparison of related genomes (close or distant) sheds light on evolution of species: phylogeny from combination of traditional and molecular data

10 Human Genome Project (3 x 10^9 bp haploid)
A. Original plan (1990) was systematic and conservative: funded by NIH and Dept. of Energy. Prepare genetic and physical maps with markers, then piece DNA sequences together in proper order. Plan was to do most sequencing after mapping was complete. [Also many model organisms sequenced to compare.] Celera, a private, for-profit company (J.C. Venter), vowed to complete a rough draft of the genome by 2000. B. Celera method was shotgun sequencing: whole genome chopped up and cloned; clones sequenced randomly; sequences pieced together by computer programs
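"Sequences pieced together by computer programs" can be illustrated with a toy greedy overlap merge. This is only a sketch of the idea and not Celera's assembler: it assumes error-free reads from one strand, no repeats, and no read contained inside another. The function names and reads are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left; remaining reads stay separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads

print(greedy_assemble(["ATGCGTAC", "GTACGTTA", "GTTAGC"]))  # ['ATGCGTACGTTAGC']
```

Real genomes are full of repeats, which is one reason the mapping landmarks discussed on the following slides remain useful even for shotgun data.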

11 Vectors for Large-Scale Genome Projects
Figs. 7, 8: Two high-capacity vectors for the Human Genome Project. Mapping mostly used yeast artificial chromosomes (YACs), which accept on the order of a million base pairs. Sequencing used bacterial artificial chromosomes (BACs), which accept about 300,000 bp. BACs are more stable and easier to work with than YACs

12 A. Clone-by-Clone Strategy
Mapping requires set of physical landmarks to relate positions of cloned genes, then sequence Some markers are genes; many are nameless stretches of DNA (must organize it all) RFLPs – want polymorphic regions Ideally different pattern for people with disease vs. normal people locates disease genes (like HD) VNTRs, variable number tandem repeats of small seq. Mini-satellite, Highly polymorphic, useful for forensics STSs, sequence-tagged unique sites, expressed-sequence tags and microsatellites

13 Sequence-Tagged Sites- physical maps
STSs unique sequences bp long Detectable by PCR Need sequence information for primers; Need not be in a gene Design short primers Hybridize few hundred bp apart Amplify predictable length of DNA – see on gel Fig. 9
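The "amplify predictable length" step can be mimicked with an in silico PCR. The sketch below assumes exact primer matches and uses unrealistically short, made-up primers and template; `amplicon_length` and `revcomp` are hypothetical helper names.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon_length(template, fwd_primer, rev_primer):
    """Predicted PCR product length, or None if a primer does not match.

    fwd_primer matches the top strand; rev_primer is written 5'->3' and
    anneals to the top strand, so its reverse complement is searched for.
    """
    template = template.upper()
    start = template.find(fwd_primer.upper())
    if start == -1:
        return None
    rc = revcomp(rev_primer)
    end = template.find(rc, start + len(fwd_primer))
    if end == -1:
        return None
    return end + len(rc) - start

# Toy template: forward site ACGTAC, 50-bp spacer, reverse site GATTACA.
template = "TTTT" + "ACGTAC" + "C" * 50 + "GATTACA" + "CCCC"
print(amplicon_length(template, "ACGTAC", "TGTAATC"))  # 63
```

A clone that yields the expected band length on a gel is scored as containing that STS.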

14 Sequence-Tagged Sites - Physical Maps
Align cloned sequences to form contigs (contiguous overlapping DNA sequences) Fig. 10
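One way to picture contig building from STS content: clones that share at least one STS overlap, and chains of overlapping clones form a contig. The sketch below only groups clones (it does not order them), and the clone and marker names are invented for illustration.

```python
def build_contigs(clone_sts):
    """Group clones into contigs: clones sharing any STS are linked.

    clone_sts maps a clone name to the set of STS markers detected in it.
    Returns a sorted list of sorted clone-name lists, one per contig.
    """
    remaining = dict(clone_sts)
    contigs = []
    while remaining:
        name, markers = remaining.popitem()
        group, markers = {name}, set(markers)
        grew = True
        while grew:  # keep absorbing clones that share a marker with the group
            grew = False
            for other, other_markers in list(remaining.items()):
                if markers & other_markers:
                    group.add(other)
                    markers |= other_markers
                    del remaining[other]
                    grew = True
        contigs.append(sorted(group))
    return sorted(contigs)

clones = {
    "BAC1": {"STS_A", "STS_B"},
    "BAC2": {"STS_B", "STS_C"},
    "BAC3": {"STS_C", "STS_D"},
    "BAC4": {"STS_X"},
}
print(build_contigs(clones))  # [['BAC1', 'BAC2', 'BAC3'], ['BAC4']]
```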

15 Shotgun-Sequencing Method used by Celera
Fig. 11: Connect overlapping BAC clones by identification of STCs, sequence-tagged connectors

16 Human Genome Project
Working draft (2001) reported by Venter (Celera) and NIH/DOE consortium. Estimated genome contained fewer genes than anticipated: 25,000 to 30,000. 2007 completed version. About half of genome from action of transposons. Bacteria also donated dozens of genes. Provides information about human evolution: chimpanzee, Neanderthal, many other genomes

17 Findings from Chromosome 22 – 1st one
679 annotated genes: 247 known genes, previously identified; 150 related genes, homologous to known genes; 148 predicted genes; 134 pseudogenes, sequences homologous to known genes, but defects preclude proper expression. Coding regions of genes are only a tiny fraction: annotated genes 39% of total length; exons only 3%. Repeat sequences (Alu, LINEs, etc.) are 41%. Large chunks of human chromosome 22q conserved in several different mouse chromosomes

18 Homologs
Orthologs: homologous genes in different species that evolved from a common ancestor. Paralogs: homologous genes that evolved by gene duplication within a species. Homologs: any kind of homologous genes, both orthologs and paralogs. Fig. 13: Large chunks of human chromosome 22q conserved in several different mouse chromosomes (8 regions map to 7 mouse chromosomes; 113 genes)

19 Chromosome 21: Relatively few genes
59 pseudogenes. All 24 genes shared with mouse chromosome 10 are in the same order in both chromosomes. Disease genes associated with chromosome 21: Down syndrome results from an extra copy of the chromosome; Alzheimer’s and ALS (Lou Gehrig’s disease) genes

20 The X Chromosome Sequence of 151 Mb of human X chromosome:
protein-encoding genes 168 genes governing X-linked phenotype Genes for 173 noncoding RNAs Lot of genes identified for human disease (sex-linked) Chromosome rich in LINE1 repetitive elements Involved in X inactivation mechanism in female cells XIST RNA (X-inactivation specific) 32-kb RNA responsible for X-inactivation, heterochromatin X (and partner Y) evolved from ancestral autosomes

21 Other Vertebrate Genomes
Comparing human genome with other vertebrates: helped identify many human genes help identify defective genes for human genetic diseases Closely related species (mouse) identify when and where genes are expressed; predict when and where human genes likely expressed Fig. 14 Mouse, human

22 The Minimal Genome – J. Craig Venter
Define essential gene set of a simple organism. Mutate one gene at a time; see which are required for life. In theory, could define minimal genome: set of genes required for life. Minimal genome likely larger than essential gene set. Sequence a small genome, then delete genes. Mycoplasma genitalium, 580 kb (480 protein-coding genes): no cell wall, intracellular parasite, only glycolysis. 2010: placed synthetic minimal genome (1 x 10^6 bp) into Mycoplasma cell lacking genes: new life form that can live and reproduce under lab conditions; controversial approaches

23 The Barcode of Life
CBOL (Consortium for the Barcode of Life): plan to create a barcode to identify any species of life on earth. First such barcode: sequence of a 648-bp piece of the mitochondrial COI gene (cytochrome c oxidase subunit I) from each organism. Isolate mitochondrial DNA, sequence. Sequence can uniquely identify most organisms. Other sequences needed for plants and bacteria, since less variation among their COI genes
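Barcode identification amounts to finding the closest reference sequence. The sketch below assumes pre-aligned sequences of equal length and uses invented 12-bp "barcodes" and species labels; real COI barcodes are 648 bp and are matched against curated reference libraries.

```python
def percent_identity(a, b):
    """Percent of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def identify(query, references):
    """Return (name, identity) for the best-matching reference barcode."""
    return max(
        ((name, percent_identity(query, seq)) for name, seq in references.items()),
        key=lambda pair: pair[1],
    )

# Hypothetical reference barcodes (toy length).
references = {
    "species_A": "ATGGCATTCCGA",
    "species_B": "ATGGCTTTTCGA",
}
name, identity = identify("ATGGCATTCCGT", references)
print(name, round(identity, 1))  # species_A 91.7
```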

24 24.3 Applications : Functional Genomics
Functional genomics deals with function or expression of genomes Transcriptome: all transcripts an organism makes at any given time Genomic functional profiling: use of genomic information to block expression systematically Proteomics: study structures and functions of protein products of genomes

25 Transcriptomics Study all transcripts organism makes
Create DNA microarrays (microchips) that hold 1000s of cDNAs or oligos Hybridize labeled RNAs (cDNAs) from cells to chips Intensity of hybridization at each spot reveals the extent of expression of corresponding gene Arrays measure expression of many genes at once Clustered expression of genes in time and space suggests products of these genes collaborate in some process -> function Affymetrix makes chips, 25-mer unique sequences

26 DNA chips: Oligo-nucleotides on a Glass Substrate
Fig. 16 Serum-starved human cells cDNA (labeled green); serum-fed cells cDNA (red) Equal expression of mRNA = yellow Fig. 17
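The red/green/yellow reading of a two-color array corresponds to a log ratio of the two channel intensities. In this sketch the two-fold cutoff, the pseudocount, and the intensity values are assumptions chosen for illustration; the color assignment (serum-fed red, serum-starved green) follows the slide.

```python
import math

def log2_ratio(red, green, pseudocount=1.0):
    """log2(red/green), with a pseudocount to avoid division by zero."""
    return math.log2((red + pseudocount) / (green + pseudocount))

def call_spot(red, green, cutoff=1.0):
    """Classify a spot as 'red', 'green', or 'yellow' by its log2 ratio."""
    ratio = log2_ratio(red, green)
    if ratio >= cutoff:
        return "red"     # higher in serum-fed cells
    if ratio <= -cutoff:
        return "green"   # higher in serum-starved cells
    return "yellow"      # roughly equal expression

print(call_spot(4000, 500))   # red
print(call_spot(500, 4000))   # green
print(call_spot(1000, 1100))  # yellow
```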

27 Genomic Functional Profiling
Deletion analysis - mutants created by replacing genes with antibiotic resistance gene flanked by oligomers serving as barcode for that mutant Functional profile can be obtained by growing whole group of mutants together under various conditions to see which mutants disappear most rapidly Fig. 21 Growth of yeast mutants on galactose C source
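"See which mutants disappear most rapidly" can be expressed as a ranking of barcode signals before and after pooled growth. The mutant names and counts below are invented, and the sketch assumes every mutant has a nonzero starting signal.

```python
def depletion_ranking(before, after):
    """Rank mutants by change in relative barcode abundance, most depleted first.

    before/after map a barcode (mutant) name to its signal or count.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    scores = {}
    for mutant, count in before.items():
        frac_before = count / total_before
        frac_after = after.get(mutant, 0) / total_after
        scores[mutant] = frac_after / frac_before  # < 1 means depleted
    return sorted(scores, key=scores.get)

# Hypothetical pooled-growth data.
before = {"mutant_A": 100, "mutant_B": 100, "mutant_C": 100}
after = {"mutant_A": 5, "mutant_B": 90, "mutant_C": 105}
print(depletion_ranking(before, after))  # ['mutant_A', 'mutant_B', 'mutant_C']
```

In this toy data set mutant_A would be the candidate for a gene needed under the tested condition.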

28 RNAi Analysis Genomic functional analysis: RNAi inactivates genes
Ex. genes involved in early embryogenesis in C. elegans: 661 important genes (early embryo defect) 326 involved in embryogenesis Fig. 22: initial screen showed which genes were mutated with RNAi; Then see which stage of embryogenesis affected

29 * Locating Target Sites for Transcription Factors (ChIP-chip)
Chromatin immunoprecipitation (ChIP) followed by DNA microarray analysis can identify DNA-binding sites for activators and other proteins Small genome organisms - all intergenic regions can be included in microarray If genome is large, not practical To narrow areas of interest, can use CpG islands Non-methylated CpG associated with gene control region If timing/conditions of activator’s activity are known, control regions of genes known to be activated at those times, or under those conditions, can be used

30 ChIP-chip assays locate target sites for specific transcription factors
ChIP with specific antibody -> PCR, adding generic primer to all -> fluorescent label -> microarray. See Fig. 25: Yeast Gal4 protein binding sites. Fig. 24

31 In Situ Expression Analysis ‘Mouse blots’
Fig. 26 Mouse as human surrogate in large-scale expression studies (ethically impossible in humans) Studied expression of almost all mouse orthologs of genes on human chromosome 21 Followed stages of embryonic development (E) Catalogued embryonic tissues in which genes expressed

32 Single-Nucleotide Polymorphisms; pharmacogenomics
Single-nucleotide polymorphisms (SNPs) are single bp differences between people; account for many genetic conditions caused by single genes, even multiple genes Might be able to predict response to a drug New focus for therapeutics Haplotype map with > 1 million SNPs: sort out important SNPs from those with no effect

33 24.4 Proteomics Proteome: all proteins produced by an organism
Proteomics: Study of all proteins, or subsets More accurate picture of gene expression than transcriptomics studies: Sometimes mRNA is degraded, not translated First separate proteins, often on massive scale 2-D gel electrophoresis is good tool After separation, identify proteins Digest proteins with proteases Identify peptides by mass spectrometry
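The digest-then-weigh step can be sketched in silico. The slide says only "proteases"; trypsin (cutting after K or R, but not before P) is assumed here as a common choice. Residue masses are standard monoisotopic values, the protein string is made up, and splitting on a zero-width pattern requires Python 3.7 or later.

```python
import re

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini

def trypsin_digest(protein):
    """Split after K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein.upper()) if p]

def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

for pep in trypsin_digest("MKWVTFRPAEKGG"):
    print(pep, round(peptide_mass(pep), 4))
```

Matching a set of observed peptide masses against masses predicted this way from a genome's ORFs is the basic idea behind identifying the protein in a gel spot.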

34 MALDI-TOF Mass Spectrometry
Matrix-assisted laser desorption ionization – time of flight Peptides ionized; time to reach detector accurately reflects mass Fig. 27
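Why flight time reflects mass: an ion of charge z accelerated through voltage V gains kinetic energy zeV = mv^2/2, so over a drift length L the time is t = L * sqrt(m / (2zeV)). The sketch below evaluates that idealized relation; the voltage and tube length are assumed values, and real instruments need calibration and corrections that are ignored here.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, coulombs
DALTON = 1.66053906660e-27  # one dalton in kilograms

def flight_time(mass_da, charge=1, voltage=20000.0, tube_length=1.0):
    """Ideal flight time in seconds: t = L * sqrt(m / (2 * z * e * V))."""
    mass_kg = mass_da * DALTON
    return tube_length * math.sqrt(mass_kg / (2 * charge * E_CHARGE * voltage))

light = flight_time(1000.0)
heavy = flight_time(4000.0)
print(heavy / light)  # a four-fold heavier ion takes about twice as long
```

Because time scales with the square root of mass/charge, heavier peptides arrive later and the arrival time can be converted back into a mass.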

35 Detecting Protein-Protein Interactions
Epitope tag on one protein (added at the gene level) permits isolation of a complex containing that protein using affinity resins. Common epitopes: His6-tag, HA-tag, Flag-tag, TAP-tag. ** In future, microchips with antibodies may allow analysis of proteins in complex mixtures without separation. Fig. 28

36 Identifying Protein Interactions, networks
Most proteins function with other proteins. Methods: yeast two-hybrid analysis; protein microarrays; immunoaffinity chromatography with mass spectrometry. Fig. 29: Identifying proteins binding kinases using Flag-tagged Kss1 or Cdc28

37 24.5 Bioinformatics
Bioinformatics: building and using biological databases (DNA sequences of genomes); mining massive amounts of biological data for meaningful knowledge about gene structure and expression. National Center for Biotechnology Information (NCBI) website: vast store of biological information (genomic and proteomic). Start with a DNA sequence, discover a gene, then compare that sequence with that of similar genes or organisms. View 3D protein structures on computer

38 Review questions
2. What kind of mutation gave rise to Huntington disease?
12. Compare and contrast the clone-by-clone sequencing strategy with the shotgun sequencing strategy for large genomes.
15. The pufferfish genome is nine times smaller than the human genome, but contains as many genes. How can that be?
20. Describe a hypothetical experiment using a DNA microarray to measure transcription from SV40 viral genes at two stages of infection of cells by the virus. Show example results.

