Download presentation
1
Genome Organization & Protein Synthesis and Processing in Plants
2
Viral genomes Viral genomes: ssRNA, dsRNA, ssDNA, dsDNA, linear or ciruclar Viruses with RNA genomes: Almost all plant viruses and some bacterial and animal viruses Genomes are rather small (a few thousand nucleotides) Viruses with DNA genomes (e.g. lambda = 48,502 bp): Often a circular genome. Replicative form of viral genomes all ssRNA viruses produce dsRNA molecules many linear DNA molecules become circular Molecular weight and contour length: duplex length per nucleotide = 3.4 Å Mol. Weight per base pair = ~ 660
3
Procaryotic genomes Generally 1 circular chromosome (dsDNA)
Usually without introns Relatively high gene density (~2500 genes per mm of E. coli DNA) Contour length of E.coli genome: 1.7 mm Often indigenous plasmids are present
4
Plasmids Extra chromosomal circular DNAs -lactamase ori
Found in bacteria, yeast and other fungi Size varies form ~ 3,000 bp to 100,000 bp. Replicate autonomously (origin of replication) May contain resistance genes May be transferred from one bacterium to another May be transferred across kingdoms Multicopy plasmids (~ up to 400 plasmids/per cell) Low copy plasmids (1 –2 copies per cell) Plasmids may be incompatible with each other Are used as vectors that could carry a foreign gene of interest (e.g. insulin) foreign gene
5
Eukaryotic genome Moderately repetitive
Functional (protein coding, tRNA coding) Unknown function SINEs (short interspersed elements) bp 100,000 copies LINEs (long interspersed elements) 1-5 kb 10-10,000 copies
6
Eukaryotic genome Highly repetitive Minisatellites Microsatellites
Repeats of bp 1-5 kb long Scattered throughout genome Microsatellites Repeats up to 13 bp 100s of kb long, 106 copies Around centromere Telomeres Short repeats (6 bp) 250-1,000 at ends of chromosomes
7
Eucaryotic genomes Located on several chromosomes
Relatively low gene density (50 genes per mm of DNA in humans) Contour length of DNA from a single human cell = 2 meters Approximately 1011 cells = total length 2 x 1011 km Distance between sun and earth (1.5 x 108 km) Human chromosomes vary in length over a 25 fold range Carry organelles genome as well
8
Mitochondrial genome (mtDNA)
Multiple identical circular chromosomes Size ~15 Kb in animals Size ~ 200 kb to 2,500 kb in plants Over 95% of mitochondrial proteins are encoded in the nuclear genome. Often A+T rich genomes. Mt DNA is replicated before or during mitosis
9
Chloroplast genome (cpDNA)
Multiple circular molecules Size ranges from 120 kb to 160 kb Similar to mtDNA Many chloroplast proteins are encoded in the nucleus (separate signal sequence)
10
“Cellular” Genomes Viruses Procaryotes Eucaryotes
Nucleus Capsid Plasmids Viral genome Bacterial chromosome Chromosomes (Nuclear genome) Mitochondrial genome Chloroplast genome Genome: all of an organism’s genes plus intergenic DNA Intergenic DNA = DNA between genes
11
Estimated genome sizes
mammals plants fungi bacteria (>100) mitochondria (~ 100) viruses (1024) 1e1 1e2 1e e4 1e5 1e6 1e7 1e8 1e9 1e10 1e11 1e12 Size in nucleotides. Number in ( ) = completely sequenced genomes
12
Size of genomes Epstein-Barr virus 0.172 x 106 E. coli 4.6 x 106
S. cerevisiae 12.1 x 106 C. elegans 95.5 x 106 A. thaliana 117 x 106 D. melanogaster 180 x 106 H. sapiens 3200 x 106
13
Chromosome organization
Eucaryotic chromosome Telomere Centromere Telomere p-arm q-arm Centromere: DNA sequence that serve as an attachment for protein during mitosis. In yeast these sequences (~ 130 nts) are very A+T rich. In higher eucaryotes centromers are much longer and contain “satellite DNA” Telomeres: At the end of chromosomes; help stabilize the chromosome In yeast telomeres are ~ 100 bp long (imperfect repeats) Repeats are added by a specific telomerase 5’ – (TxGy)n 3’ – (AxCy)n x and y = 1 - 4 n = 20 to 100; (1500 in mammals)
14
Gene classification intergenic region non-coding genes coding genes
Chromosome (simplified) Messenger RNA Structural RNA Proteins transfer RNA ribosomal RNA other RNA Structural proteins Enzymes
15
What is a gene ? Definitions
Classical definition: Portion of a DNA that determines a single character (phenotype) One gene – one enzyme (Beadle & Tatum 1940): “Every gene encodes the information for one enzyme” One gene – one protein: “One gene contains information for one protein (structural proteins included) one gene – one polypeptide Current definition: A piece of DNA (or in some cases RNA) that contains the primary sequence to produce a functional biological gene product (RNA, protein).
16
Coding region Nucleotides (open reading frame) encoding the amino acid sequence of a protein The molecular definition of gene includes more than just the coding region
17
Noncoding regions Regulatory regions Introns
RNA polymerase binding site Transcription factor binding sites Introns Polyadenylation [poly(A)] sites
18
Gene Molecular definition:
Entire nucleic acid sequence necessary for the synthesis of a functional polypeptide (protein chain) or functional RNA
19
Anatomy of a gene ORF. From start (ATG) to stop (TGA, TAA, TAG)
Upstream region with binding site. (e.g. TATA box). Poly-a ‘tail’ Splices. Bounded by AG and GT splice signals.
20
Bacterial genes Most do not have introns
Many are organized in operons: contiguous genes, transcribed as a single polycistronic mRNA, that encode proteins with related functions Polycistronic mRNA encodes several proteins
21
Bacterial operon What would be the effect of a mutation in the control region (a) compared to a mutation in a structural gene (b)?
22
Eucaryotic genes Hemoglobin beta subunit gene Splicing
Exon 1 90 bp Intron A 131 bp Exon 2 222 bp Intron B 851 bp Exon 3 126 bp Splicing Introns: intervening sequences within a gene that are not translated into a protein sequence. Collagen has 50 introns. Exons: sequences within a gene that encode protein sequences Splicing: Removal of introns from the mRNA molecule.
23
Regulatory mechanisms
‘organize expression of genes’ (function calls) Promoter region (binding site), usually near coding region Binding can block (inhibit) expression Computational challenges Identify binding sites Correlate sequence to expression
24
Eukaryotic genes Most have introns
Produce monocistronic mRNA: only one encoded protein Large
25
Alternative splicing Splicing is the removal of introns
mRNA from some genes can be spliced into two or more different mRNAs
26
“Nonfunctional” DNA Higher eukaryotes have a lot of noncoding DNA
80 kb Higher eukaryotes have a lot of noncoding DNA Some has no known structural or regulatory function (no genes)
27
Types of eukaryotic DNA
28
Duplicated genes Encode closely related (homologous) proteins
Clustered together in genome Formed by duplication of an ancestral gene followed by mutation Five functional genes and two pseudogenes
29
Pseudogenes Nonfunctional copies of genes
Formed by duplication of ancestral gene, or reverse transcription (and integration) Not expressed due to mutations that produce a stop codon (nonsense or frameshift) or prevent mRNA processing, or due to lack of regulatory sequences
30
Repetitive DNA Moderately repeated DNA Simple-sequence DNA
Tandemly repeated rRNA, tRNA and histone genes (gene products needed in high amounts) Large duplicated gene families Mobile DNA Simple-sequence DNA Tandemly repeated short sequences Found in centromeres and telomeres (and others) Used in DNA fingerprinting to identify individuals
31
Types of DNA repeats Perfect repeats vs degenerate repeats
Tandem repeats (e.g. satellite DNA) 5’-CATGTGCTGAAGGCTATGTGCTGCGACG- 3’ 3’-GTACACGACTTCCGATACACGACGCTGC- 5’ Inverted repeats (e.g. in transposons) Loop 5’-CATGTGCTGAAGGCTCAGCACATCGACG- 3’ 3’-GTACACGACTTCCGAGTCGTGTAGCTGC- 5’ Stem Form stem-loop structures Palindroms = adjacent inverted repeats (e.g. restriction sites) Form hairpin structures Hairpin
32
Moderately repetitive
Repetitive sequences Satellite DNA Chromosomal DNA Caesium chloride density gradient Repeats in the mouse genome Type No. of Repeats Size Percent of genome Highly repetitive > 1 Mill < 10 bp 10 % Moderately repetitive > 1000 ~ ~300 bp 20 %
33
DNA repeats and forensics
AluSTXa Gender determination Standard technique: PCR amplification of the amelogenin locus (Males = XY => bp) AluSTXa Alu insertion on X AluSTYa Alu insertion on Y M F Suspect 878 bp 556 bp AluSTYa X-Y homologous regions AluSTYa M F Suspect X 528 bp 199 bp Y Alu sequence
34
Mobile DNA Move within genomes
Most of moderately repeated DNA sequences found throughout higher eukaryotic genomes L1 LINE is ~5% of human DNA (~50,000 copies) Alu is ~5% of human DNA (>500,000 copies) Some encode enzymes that catalyze movement
35
Transposition Movement of mobile DNA
Involves copying of mobile DNA element and insertion into new site in genome
36
Why? Molecular parasite: “selfish DNA”
Probably have significant effect on evolution by facilitating gene duplication, which provides the fuel for evolution, and exon shuffling
37
RNA or DNA intermediate
Transposon moves using DNA intermediate Retrotransposon moves using RNA intermediate
38
Types of mobile DNA elements
39
LTR (long terminal repeat)
Flank viral retrotransposons and retroviruses Contain regulatory sequences Transcription start site and poly (A) site
41
LINES and SINES Non-viral retro-transposons
RNA intermediate Lack LTR LINES (long interspersed elements) ~6000 to 7000 base pairs L1 LINE (~5% of human DNA) Encode enzymes that catalyze movement SINES (short interspersed elements) ~300 base pairs Alu (~5% of human DNA)
42
Proteins Most protein sequences (today) are inferred
What’s wrong with this? Proteins (and nucleic acids) are modified ‘mature’ Rna Computational challenges Identify (possible) aspects of molecular life cycle Identify protein-protein and protein-nucleic acid interactions
43
Genetic variation Variable number tandem repeats (minisatellites) bp. Forensic applications. Short tandem repeat polymorphisms (microsatellites). 2-5 bp, consecutive copies. Single nucleotide polymorphisms
44
Single nucleotide polymorphisms
1/2000 bp. Types Silent Truncating Shifting Significance: much of individual variation. Challenge: correlation to disease
45
Yeast genome 4.6 x 106 bp. One chromosome. Published 1997.
4,285 protein-coding genes 122 structural RNA genes Repeats. Regulatory elements. Transposons. Lateral transfers.
46
Yeast protein functions
Regulatory 45 1.05% Cell structure 182 4.24 Transposons,etc 87 2.03 Transport & binding 281 6.55 Putative transport 146 3.40 Replication, repair 115 2.68 Transcription 55 1.28 Translation Enzymes 251 5.85 Unknown 1632 38.06
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.