The gene: structure, function and location
Evolution of knowledge about the gene
- G. Mendel: hereditary factors
- W. Johannsen, 1909: the gene, a hereditary unit located in chromosomes
- G. W. Beadle, E. L. Tatum, 1945: the “One gene – one enzyme” hypothesis
- Ingram, 1957: the “One gene – one polypeptide” hypothesis
- Current concept: the gene is a DNA sequence responsible for the synthesis of macromolecules
DNA
- Double-stranded molecule;
- Polynucleotide chains;
- Contains information about RNAs and proteins.
Genetic code - DNA
- Letters: A, G, C, T
- Words (one codon, one amino acid):
  AAG - lys
  AGC - ser
  GCA - ala
  TTC - phe
  TAG - stop
- Phrases:
  5' AAGAGCGCATTCTAG 3'
  lys - ser - ala - phe - stop
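
The letters, words and phrases analogy can be sketched in a few lines of Python. This is only an illustration: the table holds just the five codons listed on the slide, and the names `CODON_TABLE` and `read_phrase` are made up for this example.

```python
# Codon table limited to the five DNA codons listed on the slide.
CODON_TABLE = {
    "AAG": "lys",
    "AGC": "ser",
    "GCA": "ala",
    "TTC": "phe",
    "TAG": "stop",
}


def read_phrase(dna):
    """Read a coding-strand sequence three letters at a time (5' to 3')."""
    words = []
    usable_length = len(dna) - len(dna) % 3  # ignore an incomplete last codon
    for i in range(0, usable_length, 3):
        codon = dna[i:i + 3]
        meaning = CODON_TABLE[codon]  # raises KeyError for codons not on the slide
        words.append(meaning)
        if meaning == "stop":
            break  # reading ends at the stop codon
    return words


print(" - ".join(read_phrase("AAGAGCGCATTCTAG")))
```

Run on the phrase from the slide, the sketch prints the same reading as shown there: lys - ser - ala - phe - stop.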
Gene expression: DNA → mRNA → Protein
5’-ATTGCAAGATTACCATGT-3’   Coding strand (untranscribed)
3’-TAACGTTCTAATGGTACA-5’   Template strand (transcribed)
   ↓ Transcription (RNA polymerase)
5’-AUUGCAAGAUUACCAUGU-3’   mRNA
   ↓ Translation (tRNA, ribosomes)
Ile – Ala – Arg – Leu – Pro – Cys   polypeptide
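
The two steps on this slide can be traced with a small Python sketch. It assumes the standard genetic code and includes only the six codons that occur in this mRNA; the function names are hypothetical. In the standard code the first codon, AUU, is read as isoleucine (Ile).

```python
# Base pairing used to derive the template strand from the coding strand.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Base pairing used by RNA polymerase when copying the template into RNA.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Only the codons needed here, taken from the standard genetic code.
CODONS = {
    "AUU": "Ile", "GCA": "Ala", "AGA": "Arg",
    "UUA": "Leu", "CCA": "Pro", "UGU": "Cys",
}


def template_strand(coding_5to3):
    """Template strand written 3'->5', aligned under the coding strand."""
    return "".join(DNA_PAIR[base] for base in coding_5to3)


def transcribe(template_3to5):
    """mRNA written 5'->3', built by pairing with the template strand."""
    return "".join(RNA_PAIR[base] for base in template_3to5)


def translate(mrna_5to3):
    """Read the mRNA codon by codon and return the amino acids."""
    return [CODONS[mrna_5to3[i:i + 3]] for i in range(0, len(mrna_5to3) - 2, 3)]


coding = "ATTGCAAGATTACCATGT"
template = template_strand(coding)
mrna = transcribe(template)
print(template)
print(mrna)
print(" - ".join(translate(mrna)))
```

The mRNA has the same sequence as the coding strand with U in place of T, which is why the coding strand is the one usually quoted.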
Definition: a gene is a fragment of a DNA polynucleotide chain that contains information for the synthesis of:
- one polypeptide, or
- several polypeptides, or
- a functional RNA (rRNA, tRNA, snRNA)
Gene expression: DNA → rRNA, tRNA, mRNA; mRNA → Protein
Classification:
- 1st class genes encode 5.8S, 18S and 28S rRNA;
- 2nd class genes (structural genes) encode mRNA → proteins;
- 3rd class genes encode tRNA and 5S rRNA.
Gene localization:
- Genes are located in DNA molecules;
- Genes consist of unique or repeated sequences;
- Genes on the same DNA molecule are separated by non-coding sequences (spacers);
- Genes have no morphological borders, only functional boundaries;
- Genes differ in length.
- Size of the human genome: 3.164 × 10^9 bp
- 2% of the human genome encodes proteins
- Number of genes: 30,000-40,000
- Chromosome 1 contains 3380 genes
- Chromosome Y contains 397 genes
- Known function: 50% of the human genes studied
- Average gene length: 3000 bp
  - β-globin gene: 1.5 kb
  - insulin gene: 1.7 kb
  - catalase gene: 34 kb
  - dystrophin gene: 2.4 Mb
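
To get a feel for these orders of magnitude, the figures can be combined with simple arithmetic. The sketch below uses only the numbers given on the slide (treated here as the slide's estimates, not as current reference values); the variable names are illustrative.

```python
GENOME_BP = 3_164_000_000      # 3.164 x 10^9 bp, as given on the slide
CODING_PERCENT = 2             # share of the genome that encodes proteins
AVERAGE_GENE_BP = 3000         # average gene length from the slide
GENE_BP = {
    "beta-globin": 1_500,
    "insulin": 1_700,
    "catalase": 34_000,
    "dystrophin": 2_400_000,
}

# Amount of DNA that encodes proteins.
coding_bp = GENOME_BP * CODING_PERCENT // 100
print(f"Protein-coding DNA: {coding_bp / 1e6:.2f} Mb")

# Total DNA occupied by genes if every gene had the average length.
for n_genes in (30_000, 40_000):
    total_bp = n_genes * AVERAGE_GENE_BP
    share = 100 * total_bp / GENOME_BP
    print(f"{n_genes} genes x {AVERAGE_GENE_BP} bp = {total_bp / 1e6:.0f} Mb "
          f"({share:.1f}% of the genome)")

# How each example gene compares with the average gene.
for name, size in GENE_BP.items():
    print(f"{name}: {size / AVERAGE_GENE_BP:.1f} x the average gene length")
```

With these inputs, 2% of the genome is 63.28 Mb, and the dystrophin gene is 800 times longer than the average gene.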
Distribution of human genes by length
Length, kb    %
Up to 10      23.3
10-25         35.6
25-50         20.2
51-100        13.0
101-500       6.7
Over 500      1.2
General structure of the transcription unit
- Central region: coding region;
- Regulatory regions:
  - proximal: PROMOTER
  - distal: TERMINATOR
- ± Modulation sequences
Functions:
- At molecular level: control of polypeptide synthesis → functional protein
- At cellular level: production of a normal cellular structure, metabolic chain, signaling chain, etc.
- At tissue level: realization of a specific function (respiration, digestion, contraction, etc.)
- At organism level: a specific trait (character)
Each cell contains a complete set of genes (30,000-40,000 pairs of genes in all 46 molecules of DNA)
Expression: only 10% of all genes
- Permanent expression: rRNA genes, tRNA genes, housekeeping genes
- Temporary expression, depending on: tissue; ontogenetic period; cell cycle period; environmental factors
- Absence of expression: pseudogenes
The 2nd class genes = structural genes (25% of nuclear DNA)
- Encode one or several polypeptides;
- Form monocistronic transcription units;
- Have a mosaic structure (exon/intron);
- May be transcribed:
  - in all cells (housekeeping genes);
  - specifically, depending on cell type, age and other factors;
- Are transcribed by RNA polymerase II into a primary transcript, pre-mRNA (pro-mRNA);
- Are numerous, usually unique and heterogeneous;
- May form repetitive or non-repetitive gene families;
- Show individual polymorphisms.
Types of structural genes
- Housekeeping: genes that encode indispensable cell proteins and are active in all cells at all periods of life;
- Tissue-specific: genes that encode proteins required for tissue specialization;
- Regulators of ontogenesis;
- Dependent on environmental factors.
Gene families
- Repetitive gene family: a family of identical genes
- Non-repetitive gene family: a family of genes of related structure and usually related function
Structural features of the 2nd class genes
Initiation of transcription of the 2nd class genes
[Figure: assembly of the initiation complex on the promoter; axis from -40 to +30 relative to site +1; components: TFIID, TAFs, TFIIA, TFIIB, TFIIE, TFIIF, RNA polymerase II]
Regions of structural genes
- Promoter:
  - TATA box (-20, -30)
  - CAAT box (-70, -100)
  - Tissue-specific boxes ( - )
- Coding region: site +1, leader sequence, exon 1 / intron / exon 2 / intron / ... / exon n
- Terminator
- Polyadenylation site
- Enhancers and silencers
Promoter of the 2nd class genes
- Controls the initiation of transcription:
  - Activation of the gene;
  - Binding of transcription factors (TFs) and RNA polymerase II;
  - Identification of the start site (+1) and of the transcribed strand;
  - Directing RNA polymerase II.
- Is not transcribed;
- Promoters of different genes contain different specific boxes;
- Mutations in the promoter may inactivate the gene.
Conserved boxes in the structure of eukaryotic promoters
Box        Sequence     Position   Length of bound DNA   Transcription factors
TATA-box   TATAAAA      -30        10 bp                 TBP
CAAT-box   GGCCAATCT    -75        22 bp                 CTF/NF1
GC-box     GGGCGG       -90        20 bp                 SP1
Octamer    ATTTGCAT                                      Oct1, Oct2
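
A minimal sketch of how such consensus boxes could be located in a sequence is given below. It rests on several assumptions: the promoter is a made-up toy sequence (not a real gene), only exact matches to the consensus are found (real boxes often deviate from it), and `find_boxes` and `toy_promoter` are hypothetical names.

```python
BOXES = {
    "TATA-box": "TATAAAA",
    "CAAT-box": "GGCCAATCT",
    "GC-box": "GGGCGG",
    "Octamer": "ATTTGCAT",
}


def find_boxes(sequence, tss_index):
    """Find exact consensus matches; positions are relative to site +1.

    tss_index is the 0-based index of the +1 nucleotide. There is no
    position 0: the base just upstream of +1 is -1.
    """
    hits = []
    for name, motif in BOXES.items():
        start = sequence.find(motif)
        while start != -1:
            offset = start - tss_index
            position = offset if offset < 0 else offset + 1
            hits.append((name, position))
            start = sequence.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


def toy_promoter(tss_index=100):
    """Build an artificial sequence with three boxes at the table's positions."""
    seq = list("ACGT" * 30)  # neutral filler, 120 bases
    for name, position in (("GC-box", -90), ("CAAT-box", -75), ("TATA-box", -30)):
        motif = BOXES[name]
        start = tss_index + position
        seq[start:start + len(motif)] = motif
    return "".join(seq)


promoter = toy_promoter()
for name, position in find_boxes(promoter, tss_index=100):
    print(f"{name} starts at {position}")
```

On the toy sequence the sketch reports the GC-box at -90, the CAAT-box at -75 and the TATA-box at -30; the octamer is absent, so it is not reported.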
Structure of the promoter of structural genes in eukaryotes (2nd class genes)
Structure of the promoter of structural genes in prokaryotes
Promoter-enhancer interaction
Exons
- Sequences of structural genes that encode polypeptide sequences;
- Are found in both pre-mRNA and mRNA;
- Are transcribed and translated;
- Each exon encodes a region of the protein;
- During alternative splicing some exons may be removed.
Introns
- Non-coding sequences of structural genes that separate exons;
- Are found in pre-mRNA but not in mRNA;
- Are transcribed but not translated;
- During splicing all introns are removed.
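
The difference between exons and introns during splicing can be illustrated with a toy transcript. Everything in the sketch is an assumption made for illustration: the sequences are invented, the segment names are hypothetical, and real splicing depends on splice-site signals that are not modeled here.

```python
# Toy primary transcript: an ordered list of (segment name, RNA sequence).
PRIMARY_TRANSCRIPT = [
    ("exon1", "AUGGCA"),
    ("intron1", "GUAAGUCCAG"),
    ("exon2", "AGAUUA"),
    ("intron2", "GUGAGUUUAG"),
    ("exon3", "CCAUGUUAA"),
]


def splice(segments, skipped_exons=()):
    """Join exons in order; all introns are removed.

    Exons named in skipped_exons are left out as well, which mimics
    alternative splicing.
    """
    return "".join(
        seq
        for name, seq in segments
        if name.startswith("exon") and name not in skipped_exons
    )


full_mrna = splice(PRIMARY_TRANSCRIPT)
alternative_mrna = splice(PRIMARY_TRANSCRIPT, skipped_exons={"exon2"})
print(full_mrna)
print(alternative_mrna)
```

Both products lack every intron; the alternative product also lacks exon 2, so the same gene yields two different mRNAs.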
Structure of terminator
Structure of the transcription unit that contains rRNA genes in eukaryotes (1st class genes)
[ Promoter (-45 ... +20) | Gene 18S | Gene 5.8S | Gene 28S | Terminator ]n
Organization of the 3rd class genes
Structure of 5S rRNA genes
[ Promoter: A box (+55) and B box (+80) | Genes for tRNA / 5S rRNA | Terminator ]n
Structure of operon in prokaryotes
Human mitochondrial genome
Mobile genetic elements = TRANSPOSONS
- DNA transposons
  Main gene: transposase
  Type of transposition: through excision or replication
  Examples: Tn (bacterial), P (Drosophila)
- Retrovirus-like transposons
  Main gene: reverse transcriptase (revertase)
  Type of transposition: through RNA transcribed from promoters located in the LTRs
  Examples: THE-1 (human), Ty (yeast)
- Retrotransposons
  Type of transposition: through RNA transcribed from neighboring promoters
  Examples: L1 (LINEs)
Biological role of transposons
- Site-specific recombination
- Individual polymorphism of DNA
- Insertional mutagenesis
- Genome instability (fragile sites in DNA)
- Evolution of genomes
Medical importance of transposons
- Changes in the structure / function of structural genes → genetic diseases (hemophilia B, epilepsy, retinitis pigmentosa, etc.)
- Variability of pathogens → resistance to antibiotics and to the immune system