Published by Josephine Ray. Modified over 9 years ago.
1
Genome organization
Eukaryotic genomes are complex, and DNA amounts and organization vary widely between species.
2
Genome Organization
3
C-value paradox: the amount of DNA in the haploid genome of an organism does not correlate with its evolutionary complexity or its number of genes.
4
Highly Repeated Sequences
7
There are different classes of eukaryotic DNA based on sequence complexity.
8
Amount of DNA in a Genome Does Not Correlate with Complexity
[Figure: genome sizes, in base pairs]
9
How many genes do humans have?
Original estimate was between 50,000 and 100,000 genes
We now think humans have ~20,000 genes
How does this compare to other organisms?
• Mice have ~30,000 genes
• Pufferfish have ~35,000
• Nematodes (C. elegans) have ~19,000
• Yeast (S. cerevisiae) has ~6,000
• The microbe responsible for tuberculosis has ~4,000
11
Single Copy Sequences Exome
12
Even the Amount of DNA a Gene Spans Differs Among Species
13
Problems?
• Some gene products are RNA (tRNA, rRNA, others) instead of protein
• Some nucleic acid sequences that do not encode gene products (noncoding regions) are necessary for production of the gene product (protein or RNA)
Eukaryotic genes are complex!
14
Gene Identification
• Open reading frames
• Sequence conservation: database searches, synteny
• Sequence features: CpG islands
• Evidence for transcription: ESTs, microarrays
• Gene inactivation: transformation, RNAi
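As a rough illustration of the "sequence features" item, the sketch below scores a DNA window for CpG-island-like composition. The thresholds (length of at least 200 bp, GC fraction above 0.5, observed/expected CpG ratio above 0.6) are commonly quoted criteria and are an assumption here, not something stated on the slide; the function names are hypothetical.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap itself, so str.count gives the exact CpG count.
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    expected = (c * g) / n  # CpG count expected if C and G were independent
    ratio = cpg / expected if expected else 0.0
    return gc_fraction, ratio


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the assumed length, GC, and observed/expected thresholds."""
    gc_fraction, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and ratio > min_ratio
```

A real gene finder would slide this test along the genome and combine it with the other evidence types in the list; this only shows the composition check itself.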
15
Unique genes
16
Noncoding regions
• Regulatory regions: RNA polymerase binding site, transcription factor binding sites
• Introns
• Polyadenylation [poly(A)] sites
17
Splice Sites
• Eukaryotes only
• Removal of internal parts of the newly transcribed RNA
• Takes place in the cell nucleus
• Splice sites are difficult to predict
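One way to see why splice sites are hard to predict: most introns begin with GT and end with AG, but those dinucleotides also occur by chance all over a sequence. The sketch below assumes only this GT...AG rule (which the slide does not spell out) and a hypothetical minimum intron length, and lists every span that would qualify.

```python
import random


def candidate_introns(seq: str, min_len: int = 20) -> list[tuple[int, int]]:
    """List (start, end) spans where seq[start:end] begins with GT and ends with AG.

    Only the dinucleotide rule is checked, so most hits are false positives.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [(d, a + 2) for d in donors for a in acceptors if a + 2 - d >= min_len]


if __name__ == "__main__":
    random.seed(0)
    dna = "".join(random.choice("ACGT") for _ in range(300))
    # Even a short random sequence yields many GT...AG candidates.
    print(len(candidate_introns(dna)), "GT...AG candidates in 300 random bases")
```

Real predictors add longer consensus motifs, branch-point signals, and transcript evidence to narrow this candidate set.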
18
One gene, many proteins via alternative splicing, 3’ cleavage, and polyadenylation
20
Exon Shuffling
21
Trans-Splicing in Higher Eukaryotes
Gingeras, Nature (2009) 461
22
Non-contiguous Transcription Generates An Enormous Number of Possible Transcripts
• Trans-splicing exists in higher eukaryotes as well as in lower ones like trypanosomes
[Figure: six 2-exon co-linear combinations from four exons. Blue: only co-linear. Red: all combinations. 325 combinations of 3-exons, non-colinear.]
• Reassortment of exons coding for ncRNA or protein domains could dramatically increase the number of functional products beyond the number of ‘genes’
Gingeras, Nature (2009) 461
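The "six 2-exon co-linear combinations from four exons" figure can be checked with a short enumeration. The sketch below uses hypothetical exon labels E1 to E4 and assumes that "co-linear" means genomic order is preserved, while the non-co-linear case allows either order. It does not try to reproduce the 325 figure, because the counting rule behind that number is not recoverable from the slide text.

```python
from itertools import combinations, permutations


def two_exon_transcripts(exons: list[str], colinear_only: bool) -> list[tuple[str, str]]:
    """Enumerate 2-exon products from exons listed in genomic order."""
    if colinear_only:
        # Order preserved: choose 2 of n exons, C(n, 2) products.
        return list(combinations(exons, 2))
    # Any order allowed (no exon joined to itself): n * (n - 1) products.
    return list(permutations(exons, 2))


exons = ["E1", "E2", "E3", "E4"]
colinear = two_exon_transcripts(exons, colinear_only=True)
any_order = two_exon_transcripts(exons, colinear_only=False)
print(len(colinear), len(any_order))
```

With four exons this gives 6 co-linear pairs and 12 pairs when order is free, which illustrates how quickly the space of possible transcripts grows once co-linearity is relaxed.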
23
Why genome size isn’t the only concern (size doesn’t matter?)
• More sophisticated regulation of expression?
• Proteome vastly larger than genome? (alternative splicing, RNA editing)
• Post-translational modifications?
• Cellular location?
• Moonlighting
24
Gene families
• E.g. globins, actin, myosin
• Clustered or dispersed
• Pseudogenes
26
Pseudogenes
• Nonfunctional copies of genes
• Formed by duplication of an ancestral gene, or by reverse transcription (and integration)
• Not expressed due to mutations that produce a premature stop codon (nonsense or frameshift) or prevent mRNA processing, or due to lack of regulatory sequences
27
Duplicated genes
• Encode closely related (homologous) proteins
• Formed by duplication of an ancestral gene followed by mutation
• Five functional genes and two pseudogenes
28
Paralogs vs Orthologs
• Different members of the globin gene family are paralogs, having evolved one from another through gene duplication. Paralogs are separated by a gene duplication event.
• Each specific gene family member (e.g. a specific gene in human) is an ortholog of the same family member in another species (e.g. mouse). Both evolved from an ancestral globin gene. Orthologs are separated by a speciation event.
• It is not always easy to distinguish true orthologs from paralogs, especially in polyploid organisms!
29
Protein-coding sequences make up less than 1.5% of the human genome!
30
Noncoding RNAs (ncRNA)
• Do not have translated ORFs
• Small
• Not polyadenylated
34
Functions of Known lncRNAs
• Transcriptional interference - lncRNA transcription turns off transcription of nearby gene
• Initiation of chromatin remodeling - lncRNA transcription turns on transcription of nearby gene
• Promoter inactivation - lncRNA binds to TFIIB and to promoter DNA
• Activation of an accessory protein - lncRNA binds to allosteric effector protein TLS and inhibits histone acetyltransferase, decreasing transcription
Ponting et al., Cell (2009) 136
35
Functions of Known lncRNAs
• Activation of transcription factors - binding of lncRNA to Dlx2 activates Dlx5/6 activity
• Oligomerization of an accessory protein - lncRNA induces heat shock factor trimerization
• Transport of transcription factors - lncRNA NRON keeps NFAT out of nucleus
• Epigenetic silencing of gene clusters - Xist RNA inactivates X chromosome
• Epigenetic repression of genes in trans - HOTAIR binds PRC2, leading to methylation and silencing of several genes in HOXD locus
Ponting et al., Cell (2009) 136
36
ncRNA
• ~97-98% of the transcriptional output of the human genome is ncRNA
• Introns
• Transfer RNAs (tRNA): ~500 tRNA genes in human genome
• Ribosomal RNAs: tandem arrays on several chromosomes; copies of 28S – 5.8S – 18S cluster; copies of 5S cluster
38
Genome Organization - ncRNA
• The level of transcription from human chromosomes 21 and 22 is an order of magnitude higher than can be accounted for by known or predicted exons
• Almost half of all transcripts from well-constructed mouse cDNA libraries are ncRNAs (identified because they do not code for an open reading frame of larger than 100 codons)
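The 100-codon criterion mentioned above can be sketched as a simple filter. This is a minimal illustration, not the pipeline used in the cited work: it assumes a DNA-alphabet cDNA string, scans only the three forward reading frames, counts codons from an ATG up to (not including) the first in-frame stop, and ignores ORFs that run off the end of the sequence. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Length in codons of the longest ATG-to-stop ORF on the forward strand."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        in_orf = False
        run = 0  # codons in the currently open ORF, start codon included
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if not in_orf:
                if codon == "ATG":
                    in_orf, run = True, 1
            elif codon in STOP_CODONS:
                best = max(best, run)
                in_orf = False
            else:
                run += 1
    return best


def looks_noncoding(seq: str, max_codons: int = 100) -> bool:
    """Flag a transcript whose longest ORF is not larger than max_codons."""
    return longest_orf_codons(seq) <= max_codons
```

A length cutoff like this is crude: it would miss genuine short peptides and can be fooled by chance ORFs in long transcripts, which is one reason ncRNA counts vary between studies.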
39
Repeat sequences – 50% or more of the genome
40
Repetitive DNA
Moderately repeated DNA
• Tandemly repeated rRNA, tRNA and histone genes (gene products needed in high amounts)
• Large duplicated gene families
• Mobile DNA
Simple-sequence DNA
• Tandemly repeated short sequences
• Found in centromeres and telomeres (and others)
• Used in DNA fingerprinting to identify individuals
41
Segmental duplications
• Found especially around centromeres and telomeres
• Often come from nonhomologous chromosomes
• Many can come from the same source
• Tend to be large (10 to 50 kb)
• Unique to humans?
42
Repetitive DNA - Segmental duplications
43
Mobile DNA
• Moves within genomes
• Most of the moderately repeated DNA sequences found throughout higher eukaryotic genomes
• L1 LINE is ~17% of human DNA (~500,000 copies)
• Alu is ~10% of human DNA (>1,000,000 copies)
• Some encode enzymes that catalyze movement
45
Repetitive DNA – Highly repetitive satellite DNA