Published by Godwin Smith. Modified over 9 years ago.
1
The Human Genome (part 1 of 2) Wednesday, November 5, 2003 Introduction to Bioinformatics ME:440.714 J. Pevsner pevsner@jhmi.edu
2
Copyright notice
Many of the images in this PowerPoint presentation are from Bioinformatics and Functional Genomics by J Pevsner (ISBN 0-471-21004-8). Copyright © 2003 by Wiley. These images and materials may not be used without permission from the publisher. Visit http://www.bioinfbook.org
3
Announcements
Today: Human genome
Friday Nov. 7: Computer lab
Monday Nov. 10: Human disease (West Lecture Hall)
Wednesday Nov. 12: Final exam (in class); find-a-gene project due
4
Final exam on November 12
Format:
-- closed book
-- one hour, in-class (ok to take longer)
-- to practice, do the self-test quizzes at the ends of chapters 15-18.
Some of the questions will be based on the recent article on human chromosome 6: Mungall AJ et al., The DNA sequence and analysis of human chromosome 6. Nature 425, 805-811, 23 October 2003.
See also the accompanying News & Views: Grimwood J and Schmutz J, Six is seventh, Nature 425, 775-776, 23 October 2003.
5
Outline of today’s lecture
1. Summary of major findings of the Human Genome Project
2. Web resources for the human genome
3. We will follow the outline of the February 2001 Nature paper describing the human genome.
Page 607
6
Main conclusions of the human genome project Page 608
7
Main web sites for the human genome Genome Hub National Human Genome Research Institute (NHGRI) http://www.genome.gov/ NCBI Genome Central www.ncbi.nlm.nih.gov/genome/central Ensembl http://www.ensembl.org/genome/central/ Page 608
8
Main conclusions of human genome project
1. There are about 30,000 to 40,000 human genes. This number is far smaller than earlier estimates.
Page 608
9
Main conclusions of human genome project
1. There are about 30,000 to 40,000 human genes. This number is far smaller than earlier estimates. The public consortium estimated 31,000, while Celera estimated 38,500.
But note:
-- Many predicted genes are unique to each group
-- There are many transcripts of unknown function
Current estimates (2003) are ~30,000 genes.
Page 608
10
Main conclusions of human genome project
1. We have about the same number of genes as fish and plants, and not that many more genes than worms and flies.
Page 608
11
Main conclusions of human genome project
1. We have about the same number of genes as fish and plants, and not that many more genes than worms and flies.
Fugu rubripes (pufferfish): 31,000 to 38,000
Arabidopsis thaliana (thale cress): 26,000
Caenorhabditis elegans (worm): 19,000
Drosophila melanogaster (fly): 13,000
Page 608
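To make the comparison concrete, here is a minimal Python sketch that turns the gene counts quoted on this slide into ratios relative to human. The human range (30,000 to 40,000) comes from the earlier slide; collapsing each range to its midpoint is an illustrative assumption, not something the lecture specifies.

```python
# Gene-count estimates as quoted in the lecture (2003-era figures).
# Each entry is a (low, high) range; single estimates repeat the same value.
gene_counts = {
    "Homo sapiens": (30_000, 40_000),
    "Fugu rubripes": (31_000, 38_000),
    "Arabidopsis thaliana": (26_000, 26_000),
    "Caenorhabditis elegans": (19_000, 19_000),
    "Drosophila melanogaster": (13_000, 13_000),
}


def midpoint(bounds):
    """Collapse a (low, high) range to its midpoint (an illustrative choice)."""
    low, high = bounds
    return (low + high) / 2


human = midpoint(gene_counts["Homo sapiens"])

# Ratio of the human midpoint to each species' midpoint.
ratios = {name: human / midpoint(bounds) for name, bounds in gene_counts.items()}

for species, ratio in ratios.items():
    print(f"{species}: human has {ratio:.2f}x as many genes")
```

Under these assumptions, the human count comes out roughly equal to pufferfish and less than three times that of the fly, which is the point the slide makes.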
12
Main conclusions of human genome project
2. The human proteome is far more complex than the set of proteins encoded by invertebrate genomes.
Page 608
13
Main conclusions of human genome project
2. The human proteome is far more complex than the set of proteins encoded by invertebrate genomes. Vertebrates have a more complex mixture of protein domain architectures. Additionally, the human genome displays greater complexity in its processing of mRNA transcripts by alternative splicing.
Page 608
14
Main conclusions of human genome project
3. Hundreds of human genes were acquired from bacteria by lateral gene transfer, according to the initial report.
Page 608
15
Main conclusions of human genome project
3. Hundreds of human genes were acquired from bacteria by lateral gene transfer (LGT), according to the initial report.
Evidence: compare the proteomes of human, fly, worm, yeast, Arabidopsis, eukaryotic parasites, and all completed prokaryotic genomes. Some genes are shared exclusively by humans and bacteria, but according to TIGR only about 40 of these genes (or fewer?) were acquired by LGT (see Salzberg et al., Science 292:1903, 2001).
Reasons for artifactually high estimates include:
-- gene loss
-- small sample size of species
Page 608
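The proteome comparison described above can be sketched with set operations. This is a toy illustration only: the family identifiers (`famA` and so on) and the function name `human_bacteria_only` are hypothetical, and this is not the method used by the consortium or by Salzberg et al. It simply shows why sampling few eukaryotic genomes inflates the candidate list.

```python
def human_bacteria_only(human, bacteria, other_eukaryotes):
    """Return families present in human and bacteria but in no other sampled eukaryote.

    `other_eukaryotes` maps a species name to its set of family identifiers.
    """
    shared = human & bacteria
    seen_elsewhere = set().union(*other_eukaryotes.values())
    return shared - seen_elsewhere


# Hypothetical gene-family identifiers, for illustration only.
human = {"famA", "famB", "famC", "famD"}
bacteria = {"famB", "famC", "famD", "famE"}
others = {"fly": {"famA"}, "worm": {"famA", "famB"}}

candidates = human_bacteria_only(human, bacteria, others)
print(sorted(candidates))

# Sampling one more eukaryote removes a false candidate from the list.
others_more = dict(others, parasite={"famC"})
fewer = human_bacteria_only(human, bacteria, others_more)
print(sorted(fewer))
```

In this toy data, adding a single extra eukaryotic genome shrinks the candidate set, mirroring the "small sample size of species" artifact. Gene loss works the same way: a family lost from the sampled eukaryotes looks exclusive to human and bacteria even without any transfer.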
16
Main conclusions of human genome project
4. 98% of the genome does not code for proteins.
Page 608
17
Main conclusions of human genome project
4. 98% of the genome does not code for proteins.
>50% of the genome consists of repetitive DNA derived from transposable elements (also called interspersed repeats):
LINEs (20%)
SINEs (13%)
LTR retrotransposons (8%)
DNA transposons (3%)
Page 608
18
Main conclusions of human genome project
4. 98% of the genome does not code for proteins.
>50% of the genome consists of repetitive DNA derived from transposable elements:
LINEs (20%)
SINEs (13%)
LTR retrotransposons (8%)
DNA transposons (3%)
There has been a decline in activity of some of these elements in the human lineage.
Page 608
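As a quick arithmetic check, the sketch below tallies the four repeat classes listed on this slide. The percentages are the slide's own rounded values. Note that the four classes add up to 44%, below the ">50%" headline figure; the gap presumably reflects repeat classes not itemized here, which is an assumption rather than something the slide states.

```python
# Approximate share of the genome for each interspersed-repeat class,
# in percent, as listed on the slide.
repeat_percent = {
    "LINEs": 20,
    "SINEs": 13,
    "LTR retrotransposons": 8,
    "DNA transposons": 3,
}

listed_total = sum(repeat_percent.values())

# The slide states that 98% of the genome is non-coding.
noncoding_percent = 98
coding_percent = 100 - noncoding_percent

print(f"Listed interspersed repeats: {listed_total}% of the genome")
print(f"Coding sequence: about {coding_percent}% of the genome")
print(f"Repeats listed vs. coding: {listed_total // coding_percent}x")
```

Even taking only the itemized classes, transposon-derived sequence outweighs coding sequence by more than twentyfold.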
19
Main conclusions of human genome project
5. Segmental duplication is a frequent occurrence in the human genome.
-- tandem duplications (rare)
-- retrotransposition (intronless paralogs)
-- segmental duplications (common)
Page 608
20
Main conclusions of human genome project
6. There are more than one million Alu repeats in the human genome. These are about 300 base pairs long and contain an AluI restriction enzyme site. They occupy about 10% of the genome. We saw an example of an Alu repeat in Chapter 16. Their distribution is non-random: they are retained in GC-rich regions and may confer some benefit.
Page 608
21
Main conclusions of human genome project
7. The mutation rate is about twice as high in male meiosis as in female meiosis. Most mutation probably occurs in males.
Page 609
22
Main conclusions of human genome project
8. More than 1.4 million single nucleotide polymorphisms (SNPs; single base pair changes) were identified.
-- Celera initially identified 2.1 million SNPs.
-- Currently, dbSNP at NCBI (build 118) has about 5.8 million human SNPs (2.4 million validated).
-- A SNP occurs every 100 to 300 base pairs.
-- A random pair of haploid genomes differs at a rate of 1 base pair every 1250, on average (Celera).
-- Fewer than 1% of SNPs alter protein sequence.
Page 609
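The rates on this slide can be turned into genome-wide counts with a few lines of arithmetic. The sketch below assumes a haploid genome size of 3 billion base pairs; that round figure is an assumption for illustration and does not appear on the slide. All other numbers are taken from the slide.

```python
# Assumed haploid genome size in base pairs (round figure, not from the slide).
GENOME_SIZE_BP = 3_000_000_000

# "A random pair of haploid genomes differs at 1 base pair every 1250."
BP_PER_DIFFERENCE = 1250
expected_differences = GENOME_SIZE_BP // BP_PER_DIFFERENCE

# "A SNP occurs every 100 to 300 base pairs."
snps_low = GENOME_SIZE_BP // 300   # sparser spacing gives fewer SNPs
snps_high = GENOME_SIZE_BP // 100  # denser spacing gives more SNPs

# dbSNP build 118: about 5.8 million human SNPs, 2.4 million validated.
dbsnp_total = 5_800_000
dbsnp_validated = 2_400_000
validated_fraction = dbsnp_validated / dbsnp_total

print(f"Expected differences between two haploid genomes: {expected_differences:,}")
print(f"Implied SNP count range: {snps_low:,} to {snps_high:,}")
print(f"Validated share of dbSNP build 118: {validated_fraction:.1%}")
```

Under the assumed genome size, two haploid genomes would differ at a few million positions, and the SNP spacing quoted on the slide implies a total well above the 5.8 million entries then in dbSNP.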
23
Three gateways to access the human genome Page 608
24
Three gateways to access the human genome NCBI map viewer www.ncbi.nlm.nih.gov Ensembl Project (EBI/Sanger Institute) www.ensembl.org UCSC (Golden Path) www.genome.ucsc.edu Page 609
25
Three gateways to access the human genome NCBI map viewer www.ncbi.nlm.nih.gov Ensembl Project (EBI/Sanger Institute) www.ensembl.org UCSC (Golden Path) www.genome.ucsc.edu Each of these three sites provides essential resources to study the human genome (and other genomes)
26
Fig. 17.1 Page 610 NCBI offers a human map viewer
27
Fig. 17.2 Page 611 Map viewer: RBP4 on chromosome 10 Click to customize the tracks on this map
29
[Figure labels: LocusLink; DNA (contig); OMIM; sequence viewer; protein; evidence viewer; Model Maker; HomoloGene; confirmed gene model; orientation]
30
Fig. 17.3 Page 613 NCBI’s evidence viewer provides data on gene models (e.g. mapping ESTs to genomic DNA)
31
Fig. 17.3 Page 613 NCBI evidence viewer: gene structures
32
Fig. 17.3 Page 613 NCBI evidence viewer: gene structures Evidence for a discrepancy (e.g. sequencing error or polymorphism)
33
Ensembl
The Ensembl project currently includes genome browsers for nine organisms: human, mouse, zebrafish, Fugu, mosquito, fruit fly, C. elegans, C. briggsae, and rat.
Visit http://www.ensembl.org
Page 610
34
Fig. 17.4 Page 614 Ensembl human genome browser
35
Fig. 17.5 Page 615 Ensembl: GeneView for RBP4
36
Fig. 17.6 Page 616 Ensembl: GeneView for RBP4
37
Fig. 17.7 Page 617 Ensembl human genome browser: ContigView
38
Fig. 17.7 Page 617 Ensembl human genome browser: ContigView
39
Fig. 17.8 Page 618 Ensembl human genome browser: TransView
40
Fig. 17.9 Page 619 Ensembl: ProteinView for RBP4
41
Fig. 17.10 Page 620 Ensembl: MapView for chromosome 10
42
Fig. 17.11 Page 621 Ensembl: SyntenyView for chromosome 10
43
The UCSC human genome browser
The University of California at Santa Cruz (UCSC) offers a genome browser with the “golden path” annotation of the human genome. The browser supports searches by keyword, gene name, or other text. UCSC also offers BLAT, a lightning-fast BLAST-like alignment tool (see Chapter 5). A key feature of this browser is its customizable annotation tracks; about half of these tracks are contributed by users of the site throughout the world.
Visit http://genome.ucsc.edu
Page 614
44
Fig. 17.12 Page 622
45
Fig. 17.13 Page 623
46
This lecture continues with part 2 of 2…