Published by Molly Joseph. Modified over 8 years ago.
2
Genome Organization
3
Genome
- Complete set of instructions for making an organism
  - Master blueprints for all enzymes, cellular structures and activities
- An organism's complete set of DNA
- The total genetic information carried by a single set of chromosomes in a haploid nucleus
- Located in every nucleus of trillions of cells
- Consists of tightly coiled threads of DNA organized into chromosomes
4
Typical viral genome
- DNA or RNA
- 4-200 genes
5
Viral genomes
- Genome types: ssRNA, dsRNA, ssDNA, dsDNA; linear or circular
- Viruses with RNA genomes:
  - Almost all plant viruses and some bacterial and animal viruses
  - Genomes are rather small (a few thousand nucleotides)
- Viruses with DNA genomes (e.g. lambda = 48,502 bp):
  - Often a circular genome
- Replicative form of viral genomes:
  - All ssRNA viruses produce dsRNA molecules
  - Many linear DNA molecules become circular
- Molecular weight and contour length:
  - Duplex length per base pair = 3.4 Å
  - Molecular weight per base pair ≈ 660 Da
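The two conversion factors on this slide can be applied directly to the lambda genome. The following is a minimal sketch in Python, assuming an idealized duplex with exactly 3.4 Å of rise and 660 Da per base pair; the function names are illustrative and not part of the original deck.

```python
# Rise per base pair of duplex DNA, from the slide: 3.4 Å = 0.34 nm.
RISE_NM_PER_BP = 0.34
# Approximate mass per base pair, from the slide: ~660 Da.
DALTONS_PER_BP = 660


def contour_length_um(n_bp: int) -> float:
    """Contour length of a duplex of n_bp base pairs, in micrometres."""
    return n_bp * RISE_NM_PER_BP / 1000.0


def molecular_weight_da(n_bp: int) -> int:
    """Approximate molecular weight of a duplex of n_bp base pairs, in daltons."""
    return n_bp * DALTONS_PER_BP


LAMBDA_BP = 48_502  # lambda genome size quoted on the slide

print(f"lambda contour length: {contour_length_um(LAMBDA_BP):.1f} um")
print(f"lambda molecular weight: {molecular_weight_da(LAMBDA_BP):.2e} Da")
```

Under these assumptions the lambda genome comes out at roughly 16.5 µm long and about 3.2 × 10^7 Da.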
6
Procaryotic genomes
- Generally 1 circular chromosome (dsDNA)
- Usually without introns
- Relatively high gene density (~2500 genes per mm of E. coli DNA)
- Contour length of E. coli genome: 1.7 mm
- Often indigenous plasmids are present
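The gene density figure can be checked against numbers that appear elsewhere in this deck. This is a rough sketch, assuming the 4288 protein-coding genes quoted on the E. coli slide and the 1.7 mm contour length above; the value of 50 genes per mm for humans is the one given on the eucaryotic genomes slide, and the function name is illustrative.

```python
def genes_per_mm(n_genes: int, contour_length_mm: float) -> float:
    """Gene density expressed as genes per millimetre of DNA."""
    return n_genes / contour_length_mm


# Values taken from the slides: 4288 protein-coding genes, 1.7 mm of DNA.
ecoli_density = genes_per_mm(4288, 1.7)

# Human gene density as quoted on the eucaryotic genomes slide.
HUMAN_GENES_PER_MM = 50

print(f"E. coli: about {ecoli_density:.0f} genes per mm")
print(f"E. coli / human density ratio: about {ecoli_density / HUMAN_GENES_PER_MM:.0f}x")
```

With these inputs the E. coli density is close to the ~2500 genes per mm stated on the slide, about fifty times the human figure.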
7
Typical Procaryotic genome
- One circular double-stranded DNA chromosome
- Often plasmid(s)
- 500-12,000 genes
8
Bacterial genomes: E. coli
- 4288 protein-coding genes:
  - Average ORF 317 amino acids
  - Very compact: average distance between genes 118 bp
- Numerous paralogous gene families: 38-45% of genes have arisen through duplication
- Homologues:
  - H. influenzae (1130 of 1703)
  - Synechocystis (675 of 3168)
  - M. jannaschii (231 of 1738)
  - S. cerevisiae (254 of 5885)
9
Bacterial gene-finding: an easy problem
- Dense genomes
- Short intergenic regions
- Uninterrupted ORFs
- Conserved signals
- Abundant comparative information
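Because bacterial ORFs are uninterrupted, a first pass at gene finding can be as simple as scanning for a start codon followed in frame by a stop codon. The sketch below is only an illustration of that idea, not a real gene finder: it assumes ATG as the only start codon, scans the forward strand only, and ignores the conserved signals and comparative information mentioned on the slide. The function name and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int, int]]:
    """Return (start, end, frame) for ATG..stop ORFs on the forward strand.

    start is inclusive; end is exclusive and includes the stop codon.
    min_codons is the minimum number of codons before the stop, counting ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first in-frame ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next ATG
    return orfs


# Toy sequence: ATG AAA TTT TAG sits in frame 2.
print(find_orfs("CCATGAAATTTTAGGG"))
```

In a eukaryotic genome the same scan would fail on most genes, because introns interrupt the reading frame.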
10
Plasmids
- Extrachromosomal circular DNAs
- Found in bacteria, yeast and other fungi
- Size varies from ~3,000 bp to 100,000 bp
- Replicate autonomously (origin of replication)
- May contain resistance genes
- May be transferred from one bacterium to another
- May be transferred across kingdoms
- Multicopy plasmids (up to ~400 plasmids per cell)
- Low-copy plasmids (1-2 copies per cell)
- Plasmids may be incompatible with each other
- Are used as vectors that can carry a foreign gene of interest (e.g. insulin)
[Figure: plasmid vector map labeled β-lactamase, ori, foreign gene]
11
Agrobacterium tumefaciens
- Characteristics
  - Plant parasite that causes crown gall disease
  - Encodes a large (~250 kbp) plasmid called the tumor-inducing (Ti) plasmid
- A portion of the Ti plasmid, the T-DNA (transferred DNA), is transferred from bacterial cells to plant cells
12
Agrobacterium tumefaciens
- T-DNA integrates stably into the plant genome
- The single-stranded T-DNA fragment is converted to a dsDNA fragment by the plant cell
- It is then integrated into the plant genome
- 2 x 23 bp direct repeats play an important role in the excision and integration process
13
Agrobacterium tumefaciens
- Tumor formation = hyperplasia
- Hormone imbalance
- Caused by A. tumefaciens
  - Lives in intercellular spaces of the plant
  - Plasmid contains genes responsible for the disease
- Part of the plasmid is inserted into plant DNA
- Wound = entry point; 10-14 days later, a tumor forms
14
Agrobacterium tumefaciens
- What is naturally encoded in T-DNA?
  - Enzymes for auxin and cytokinin synthesis
    - Cause a hormone imbalance, leading to tumor formation/undifferentiated callus
    - Mutants in these enzymes have been characterized
  - Opine synthesis genes (e.g. octopine or nopaline)
    - Carbon and nitrogen source for A. tumefaciens growth
- Insertion genes
  - Virulence (vir) genes
  - Allow excision and integration into the plant genome
15
Ti plasmid of A. tumefaciens
17
1. Auxin, cytokinin and opine synthesis genes are transferred to the plant
2. The plant makes all 3 compounds
3. Auxins and cytokinins cause gall formation
4. Opines provide a unique carbon/nitrogen source that only A. tumefaciens can use!
18
Eucaryotic genomes
- Located on several chromosomes
- Relatively low gene density (50 genes per mm of DNA in humans)
- Contour length of DNA
- Carry organellar genome as well
19
Typical eukaryotic genome
- 4-224 linear chromosomes
- 5,000-125,000 genes
20
Fungal genomes: S. cerevisiae
- First completely sequenced eukaryote genome
- Very compact genome:
  - Short intergenic regions
  - Scarcity of introns
  - Lack of repetitive sequences
- Strong evidence of duplication:
  - Chromosome segments
  - Single genes
- Redundancy: non-essential genes provide selective advantage
21
Human Genomes
- Human: 50,000 genes x 2 kbp = 100 Mbp
- Introns = 300 Mbp?
- Regulatory regions = 300 Mbp?
- Only 5-10% of the human genome codes for genes; the function of the other DNA (mostly repetitive sequences) is unknown, but it might serve structural or regulatory roles
22
Plant genomes
- A plant cell contains three genomes (nuclear, plastid and mitochondrial)
- The size of genomes is given in base pairs (bp)
- The size of genomes is species dependent
- The difference in genome size is mainly due to different numbers of identical sequences of various sizes arranged in sequence
- The genes for ribosomal RNAs occur as repetitive sequences and, together with the genes for some transfer RNAs, are present in several thousand copies
- Structural genes are present in only a few copies, sometimes just a single copy. Structural genes encoding structurally and functionally related proteins often form a gene family
- Genetic information is divided among the chromosomes
- The DNA in the genome is replicated during interphase, before mitosis
23
Size of the genome in plants and in human (millions of base pairs)

Genome          Arabidopsis thaliana   Zea mays   Vicia faba   Human
Nucleus         70                     3900       14500        2800
Plastid         0.156                  0.136      0.120        -
Mitochondrion   0.370                  0.570      0.290        0.017
24
Plant genomes: Arabidopsis thaliana
- A dicotyledonous plant
- A weed growing at the roadsides of central Europe
- It has only 2 x 5 chromosomes
- Its genome is just 70 Mbp
- It has a life cycle of only 6 weeks
- A model plant for the investigation of plant function
- Contains 25,498 structural genes from 11,000 families
- The structural genes are present in only a few copies, sometimes just one
- Structural genes encoding structurally and functionally related proteins often form a gene family
25
Plant genomes: Arabidopsis thaliana
- Cross-phylum matches:
  - Vertebrates 12%
  - Bacteria / Archaea 10%
  - Fungi 8%
- 60% have no match in non-plant databases
- Evolution involved whole-genome duplication followed by gene loss and extensive local gene duplications
26
Global Increase in Genome Size
- Polyploidization (whole genome duplication):
  - Allopolyploidy: combination of genetically distinct chromosome sets (wheat…)
  - Autopolyploidy: multiplication of one basic set of chromosomes (goldfish, rose…)
- Regional duplication
27
Repetitive Structure of Eukaryotic Genome
- Eukaryotic genomes contain various degrees of repetitive structure: satellites, micro/minisatellites, retrotransposons, retroviruses, etc.
- Repetitive sequence size correlates with genome size
[Table: genome size (x 10^9 bp) and heterochromatin (x 10^9 bp) for Hylobates muelleri, Homo sapiens, Pan troglodytes, Symphalangus syndactylus and Gorilla gorilla; values not preserved]
28
Mechanisms for Regional Increase in Genome Size
- Duplicative transposition
- Unequal crossing-over
- Replication slippage
- Gene amplification (rolling circle replication)
29
Gene Duplication
- Duplication of a part of the gene:
  - Domain/internal sequence duplication
  - Enhanced function, or novel function by new combination
- Duplication of a complete gene (gene family):
  - Invariant duplication: dose repetitions
  - Variant duplication: new functions
- Duplication of a cluster of genes
30
Internal Gene Duplication
[Figure: derivation of the antifreeze glycoprotein gene from the ancestral trypsinogen gene. Labeled steps: deletion; 4-fold duplication + addition of spacer sequence; internal duplications + addition of intron sequence. Other labels: Thr Ala Ala Gly; spacer: Gly. Exon numbering not preserved]
31
Complete Gene Duplication
- Invariant duplication:
  - RNA-specifying genes: the number of tRNA and rRNA genes correlates with genome size
- Variant duplication:
32
Gene Loss
- Duplicated genes become unprocessed pseudogenes
- Single-copy genes devoid of selection pressure become unitary pseudogenes
- Example: loss of L-gulono-γ-lactone oxidase in humans, guinea pigs, etc., compared with other vertebrates; this is the enzyme at the terminal step of L-ascorbic acid (vitamin C) synthesis
33
Genome organization
34
Protein Coding Gene
- A segment of DNA that directs the synthesis of a protein
- DNA sequence encoding protein
35
Gene classification
- Coding genes: messenger RNA, then proteins (structural proteins, enzymes)
- Non-coding genes: structural RNA (transfer RNA, ribosomal RNA, other RNA)
[Figure: simplified chromosome with genes separated by intergenic regions]
36
Coding region
- Nucleotides (open reading frame) encoding the amino acid sequence of a protein
- The molecular definition of a gene includes more than just the coding region
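How an open reading frame encodes an amino acid sequence can be illustrated by translating one codon at a time. This is a minimal sketch, assuming the standard genetic code and a DNA-alphabet input already in frame; the compact table construction and the `translate` name are illustrative choices, not something taken from the deck.

```python
from itertools import product

# Standard genetic code, listed with bases in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Raises KeyError if a codon contains characters other than A, C, G, T.
    """
    orf = orf.upper()
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTGGTAA"))  # Met-Ala-Trp, then a stop codon
```

Only the codons between the start and stop contribute to the protein, which is why the coding region is a subset of the gene as defined on this slide.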
37
Noncoding regions
- Regulatory regions
  - RNA polymerase binding site
  - Transcription factor binding sites
- Introns
- Polyadenylation [poly(A)] sites
38
Eukaryotic genes
- Most have introns
- Produce monocistronic mRNA: only one encoded protein
- Large
39
Appearance of genomes
- One to many chromosomes
- Repeat sequences are common in some genomes, e.g. 35% of the human genome is transposable elements (10% Alu, 14.6% LINE1 sequences)
- Gene structure varies: number and length of introns
- What does 50 kb of sequence look like?
  - Human: very few genes, repeats
  - Yeast: many genes (~25), few repeats
  - Maize: mostly repeats
[Figure legend: repeat, pseudogene, intron-exon components of a gene]
40
What do the genes encode?
- Genes for basic cellular functions such as translation, transcription, replication and repair share similarity among all organisms
- Gene families expand to meet biological needs
[Figure: basic functions plus organism-specific gene sets. Yeast: simplest eukaryote; worm: programmed development; fly: complex development; Arabidopsis: plant life cycle; microbes: highly specialized]
41
Repetitive DNA
- Moderately repeated DNA
  - Tandemly repeated rRNA, tRNA and histone genes (gene products needed in high amounts)
  - Large duplicated gene families
  - Mobile DNA
- Simple-sequence DNA
  - Tandemly repeated short sequences
  - Found in centromeres and telomeres (and others)
  - Used in DNA fingerprinting to identify individuals
42
Types of DNA repeats
- Tandem repeats (e.g. satellite DNA)
    5'-CATGTGCTGAAGGCTATGTGCTGCGACG-3'
    3'-GTACACGACTTCCGATACACGACGCTGC-5'
- Inverted repeats (e.g. in transposons); form stem-loop structures
    5'-CATGTGCTGAAGGCTCAGCACATCGACG-3'
    3'-GTACACGACTTCCGAGTCGTGTAGCTGC-5'
- Palindromes = adjacent inverted repeats (e.g. restriction sites); form hairpin structures
[Figure labels: stem, loop, hairpin]
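The difference between an inverted repeat and a palindrome comes down to the reverse complement: an inverted repeat is a segment followed later on the same strand by its reverse complement, and a palindrome is a sequence that equals its own reverse complement. The sketch below assumes plain A/C/G/T input; the helper names and the `arm` parameter are hypothetical, and the EcoRI site GAATTC is used as a familiar restriction-site example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)


def find_inverted_repeat(seq: str, arm: int):
    """Return (i, j) for the first pair of non-overlapping arms of length `arm`
    where seq[j:j+arm] is the reverse complement of seq[i:i+arm], else None."""
    seq = seq.upper()
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        j = seq.find(reverse_complement(left), i + arm)
        if j != -1:
            return i, j
    return None


# Top strand of the inverted-repeat example on the slide.
slide_seq = "CATGTGCTGAAGGCTCAGCACATCGACG"
print(find_inverted_repeat(slide_seq, 8))
print(is_palindrome("GAATTC"))
```

On the slide's sequence the two 8-nt arms are ATGTGCTG and CAGCACAT; when such a strand folds back on itself the arms pair to form the stem and the bases between them form the loop.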
43
Repetitive sequences
[Figure: caesium chloride density gradient separating satellite DNA from chromosomal DNA]

Repeats in the mouse genome

Type                    No. of repeats   Size              Percent of genome
Highly repetitive       > 1 million      < 10 bp           10%
Moderately repetitive   > 1000           ~150 to ~300 bp   20%
44
Mobile DNA
- Move within genomes
- Make up most of the moderately repeated DNA sequences found throughout higher eukaryotic genomes
  - L1 LINE is ~5% of human DNA (~50,000 copies)
  - Alu is ~5% of human DNA (>500,000 copies)
- Some encode enzymes that catalyze movement
45
Transposition
- Movement of mobile DNA
- Involves copying of the mobile DNA element and insertion into a new site in the genome
46
Why?
- Molecular parasite: "selfish DNA"
- Probably have a significant effect on evolution by facilitating gene duplication, which provides the fuel for evolution, and exon shuffling
47
RNA or DNA intermediate
- Transposon moves using a DNA intermediate
- Retrotransposon moves using an RNA intermediate
48
Types of mobile DNA elements
49
LTR (long terminal repeat)
- Flank viral retrotransposons and retroviruses
- Contain regulatory sequences
  - Transcription start site and poly(A) site
50
LINES and SINES
- Non-viral retrotransposons
  - RNA intermediate
  - Lack LTR
- LINES (long interspersed elements)
  - ~6000 to 7000 base pairs
  - L1 LINE (~5% of human DNA)
  - Encode enzymes that catalyze movement
- SINES (short interspersed elements)
  - ~300 base pairs
  - Alu (~5% of human DNA)
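The percentages quoted for Alu and L1 can be related to element length and copy number with simple arithmetic. The sketch below uses only figures from this deck: ~300 bp and >500,000 copies for Alu, ~5% and ~50,000 copies for L1, and the 2800 million bp human nuclear genome from the genome size table. The function names are illustrative, and the results are only as good as those rounded slide values.

```python
HUMAN_GENOME_BP = 2_800_000_000  # human nuclear genome size from the deck's table


def genome_fraction(element_bp: int, copies: int, genome_bp: int) -> float:
    """Fraction of the genome occupied by `copies` elements of `element_bp` each."""
    return element_bp * copies / genome_bp


def implied_mean_length_bp(fraction: float, copies: int, genome_bp: int) -> float:
    """Mean element length implied by a genome fraction and a copy number."""
    return fraction * genome_bp / copies


# Alu: ~300 bp, at least 500,000 copies (slide values).
alu_fraction = genome_fraction(300, 500_000, HUMAN_GENOME_BP)
print(f"Alu: {alu_fraction:.1%} of the genome")

# L1: ~5% of the genome in ~50,000 copies (slide values).
implied_l1_bp = implied_mean_length_bp(0.05, 50_000, HUMAN_GENOME_BP)
print(f"L1: implied mean copy length of about {implied_l1_bp:.0f} bp")
```

The Alu estimate of about 5.4% agrees with the ~5% on the slide. Taken at face value, the L1 figures imply an average copy of roughly 2.8 kb, shorter than the 6 to 7 kb quoted for a full-length element.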
51
Mitochondrial genome (mtDNA)
- The number of mitochondria in plant cells can be between 50 and 2000
- One mitochondrion contains 1-100 genomes (multiple identical circular chromosomes), present as one large and several smaller molecules
- Size ~15 kb in animals
- Size ~200 kb to 2,500 kb in plants
- mtDNA is replicated before or during mitosis
- Transcription of mtDNA yields an mRNA that does not contain the correct information for the protein to be synthesized; RNA editing exists in plant mitochondria
- Over 95% of mitochondrial proteins are encoded in the nuclear genome
- Often A+T-rich genomes
52
Chloroplast genome (ctDNA)
- Multiple circular molecules, similar to those of procaryotic cyanobacteria, although much smaller (0.001-0.1% of the size of nuclear genomes)
- Cells contain many plastids, and each plastid contains many genome copies
- Size ranges from 120 kb to 160 kb
- The plastid genome has changed very little during evolution: even when two plants are very distantly related, their plastid genomes are rather similar in gene composition and arrangement
- Some plastid genomes contain introns
- Many chloroplast proteins are encoded in the nucleus (separate signal sequence)
53
The family of plastids (Buchanan et al., Fig. 1.44)
54
Endosymbiosis
- Well accepted that chloroplasts and mitochondria were once free-living bacteria
- Their metabolism is bacterial (e.g. photosynthesis)
- Retain some DNA (circular chromosome)
  - Protein synthesis sensitive to chloramphenicol
  - Cytosolic protein synthesis sensitive to cycloheximide
- Most genes transferred from symbiont to nucleus
  - Requires protein targeting
55
DNA for chloroplast proteins can be in the nucleus or chloroplast genome (Buchanan et al., Fig. 4.4)
56
Import of proteins into chloroplasts (Buchanan et al., Fig. 4.6)
57
Biochemistry inside plastids
- Photosynthesis: reduction of C, N, and S
- Amino acids: essential amino acid synthesis restricted to plastids
  - Phenylpropanoid amino acids and secondary compounds start in the plastids (shikimic acid pathway)
  - Site of action of several herbicides, including glyphosate
  - Branched-chain amino acids
  - Sulfur amino acids
- Fatty acids: all fatty acids in plants are made in plastids
58
“Cellular” Genomes
- Viruses: viral genome (inside the capsid)
- Procaryotes: bacterial chromosome, plasmids
- Eucaryotes: chromosomes (nuclear genome, inside the nucleus), mitochondrial genome, chloroplast genome
- Genome: all of an organism’s genes plus intergenic DNA
- Intergenic DNA = DNA between genes
59
Methods of regulation
- Gene expression
  - Normally slow relative to the metabolic control that will be discussed most of the time in this course
  - Allows metabolism to be changed in response to environmental factors
  - Transcriptional control most common
- Sometimes variation in transcription rate is not reflected in enzyme amount
  - Translational control also found
- No change in mRNA levels but changes in protein amounts
60
Gene structure relevant to metabolic regulation
61
Promoters
62
Exploring metabolism by genetic methods
- Antisense: what happens when the amount of an enzyme is reduced
  - Not clear how antisense works
- Knockouts
  - Often more clear-cut since all of the enzyme is gone
  - Use of T-DNA, Salk lines
- Overexpression
  - Use an unregulated version of the protein or express it from a strong promoter
  - Sometimes leads to cosuppression
- RNA interference
  - 21- to 26-mers seem very effective in regulating translation