Protein coding genes … & what is a gene


1 Protein coding genes … & what is a gene
BS222 – Genome Science, Lecture 3
Dr. Vladimir B. Teif

2 Module structure
Genomes, sequencing projects and genomic databases (VT) (Oct 9, 2018)
Sequencing technologies (VT) (Oct 11, 2018)
Genome architecture I: protein coding genes (VT) (Oct 16, 2018)
Genome architecture II: transcription regulation (VT) (Oct 18, 2018)
Genome architecture III: 3D chromatin organisation (VT) (Oct 23, 2018)
Epigenetics overview (PVW) (Oct 25, 2018)
DNA methylation and other DNA modifications (VT) (Oct 30, 2018)
NGS applications I: Experiments and basic analysis (VT) (Nov 1, 2018)
NGS applications II: Data integration (VT) (Nov 8, 2018)
Comparative genomics (JP, guest lecture) (Nov 13, 2018)
SNPs, CNVs, population genomics (LS, guest lecture) (Nov 15, 2018)
Histone modifications (PVW) (Nov 20, 2018)
Non-coding RNAs (PVW) (Nov 22, 2018)
Genome Stability (PVW) (Nov 27, 2018)
Transcriptomics (PVW) (Nov 29, 2018)
Year's best paper (PVW) (Dec 6, 2018)
Revision lecture (all lecturers; spring term)

3 WHAT IS A GENE?
A UNIT OF HEREDITY TRANSFERRED TO OFFSPRING...
A DNA SEQUENCE THAT ENCODES FUNCTION...
THE BASIC PHYSICAL/FUNCTIONAL UNIT OF HEREDITY…
Watch this *oversimplified* video:
While watching, try to formulate the definition of the gene. Answer here:

4 “PRE-HISTORIC” GENE DEFINITIONS
Gerstein et al., Genome Res. 2007, 17

5 The central dogma
Genome (DNA) {A,C,G,T} → RNA {A,C,G,U}: 1 to 1 mapping
RNA {A,C,G,U} → protein {20 letters}: see next page
Adapted from Gill Bejerano
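
The "1 to 1 mapping" from DNA to RNA can be sketched in a few lines of Python. This is only an illustration, assuming the input is the coding (sense) strand written 5'→3'; the function name `transcribe` is hypothetical and not part of the lecture.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA sequence for a DNA coding strand.

    The mRNA has the same sequence as the coding strand, with every
    T replaced by U: a one-to-one mapping between the two alphabets.
    """
    return coding_strand.upper().replace("T", "U")


# Toy sequence, invented for illustration.
print(transcribe("ATGGCCTGGTAA"))
```

Each DNA letter maps to exactly one RNA letter, so the sequence length is unchanged. The next step (RNA to protein) is different, because three letters map to one.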

6 The genetic code T = Adapted from Gill Bejerano,
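
As a rough sketch of how the genetic code is applied, the snippet below builds the standard codon table and translates an mRNA codon by codon until the first stop codon. The compact table layout (bases ordered U, C, A, G) follows the way NCBI lists the standard code; the names `CODON_TABLE` and `translate` and the example sequence are assumptions made for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed for codons in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to a one-letter amino acid code ('*' marks a stop codon).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("AUGGCCUGGUAA"))  # Met-Ala-Trp, then a stop codon
```

The table has 64 codons for 20 amino acids plus stop, so several codons encode the same amino acid; the mapping is many to one, unlike transcription.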

7 Genes can be on both strands
Watson strand
Crick strand
Adapted from Gill Bejerano
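
To read a gene located on the opposite strand, one takes the reverse complement of the reference sequence. A minimal sketch, assuming an unambiguous A/C/G/T alphabet; the function name and the toy sequence are hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the sequence of the opposite strand, read 5'->3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


# On the given strand this toy sequence shows no start codon, but the
# opposite strand reads ATG TAA (a start codon followed by a stop codon).
print(reverse_complement("TTACAT"))
```

This is why gene finders scan both strands: an open reading frame that is invisible on one strand can be obvious on the other.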

8 THE CENTRAL DOGMA CORRECTED
Adapted from

9 Gene structure UTR = Untranslated Region CDS = Coding Sequence
Adapted from Gill Bejerano
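
The relationship between exons, UTRs and the CDS can be made concrete with a small calculation. The sketch below assumes a plus-strand transcript, half-open genomic coordinates, and invented exon positions; `split_transcript` is a hypothetical helper, not a function from any annotation library.

```python
def split_transcript(exons, cds_start, cds_end):
    """Return (5' UTR, CDS, 3' UTR) lengths in bases for a plus-strand transcript.

    exons: list of (start, end) half-open genomic intervals.
    cds_start, cds_end: half-open genomic interval of the coding sequence.
    Introns are excluded automatically because only exonic bases are counted.
    """
    utr5 = cds = utr3 = 0
    for start, end in exons:
        utr5 += max(0, min(end, cds_start) - start)             # exonic bases before the CDS
        cds += max(0, min(end, cds_end) - max(start, cds_start))  # exonic bases inside the CDS
        utr3 += max(0, end - max(start, cds_end))               # exonic bases after the CDS
    return utr5, cds, utr3


# Invented three-exon gene: the CDS starts inside exon 1 and ends inside exon 3.
exons = [(100, 200), (300, 400), (500, 650)]
print(split_transcript(exons, cds_start=150, cds_end=560))
```

The three lengths add up to the total exonic length, and the CDS length is a multiple of three, as expected for a complete coding sequence.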

10 EXAMPLE: β-globin gene

11 ALTERNATIVE SPLICING

12

13 EXAMPLES OF ALTERNATIVE SPLICING
(A) Alternative splicing producing variant proteins. Alternative splicing results in the variable presence of a 17 amino acid (17aa) peptide near the middle of the WT1 Wilms tumor protein and of a Lys-Thr-Ser tripeptide (KTS) between the third and fourth zinc finger (ZF) domains. Four different isoforms exist for the human ERBB4 protein. Just before the transmembrane (TM) domain there is the alternative presence of a 23-amino-acid peptide or a 13-amino-acid peptide (JM-a and JM-b isoforms, respectively). And within the tyrosine kinase (TK) domain is the variable presence of a 16-amino-acid peptide that has a binding site for phosphatidylinositol-3-kinase (CYT-1 isoforms have the peptide; CYT-2 isoforms lack it).
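
The ERBB4 example can be checked by enumerating the combinations of the two independent splicing choices. This is only a sketch: the peptide lengths are taken from the text above, while the dictionary layout and variable names are assumptions.

```python
from itertools import product

# Variable peptide length (in amino acids) for each alternative region.
JUXTAMEMBRANE = {"JM-a": 23, "JM-b": 13}
CYTOPLASMIC = {"CYT-1": 16, "CYT-2": 0}  # CYT-2 isoforms lack the 16-aa peptide

# Every combination of the two independent choices is one isoform.
isoforms = {
    f"{jm} {cyt}": JUXTAMEMBRANE[jm] + CYTOPLASMIC[cyt]
    for jm, cyt in product(JUXTAMEMBRANE, CYTOPLASMIC)
}

for name, variable_length in isoforms.items():
    print(name, variable_length)
```

Two binary choices give 2 × 2 = 4 isoforms, matching the four ERBB4 isoforms described above. In general, the number of possible isoforms grows multiplicatively with the number of independent alternative regions.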

14 Some diagrams of alternative splicing mechanisms

15 OVERLAPPING READING FRAMES
(B) Alternative splicing of the CDKN2A gene produces two entirely different tumour suppressor proteins, p16-INK4A and p14-ARF, which work in cell cycle control. Exon 2, the one exon with coding sequence for both proteins, is translated in different reading frames.
This gene generates several transcript variants which differ in their first exons. At least three alternatively spliced variants encoding distinct proteins have been reported, two of which encode structurally related isoforms known to function as inhibitors of CDK4 kinase. The remaining transcript includes an alternate first exon located 20 kb upstream of the remainder of the gene; this transcript contains an alternate open reading frame (ARF) that specifies a protein which is structurally unrelated to the products of the other variants. This ARF product functions as a stabilizer of the tumor suppressor protein p53 as it can interact with, and sequester, the E3 ubiquitin-protein ligase MDM2, a protein responsible for the degradation of p53. In spite of the structural and functional differences, the CDK inhibitor isoforms and the ARF product encoded by this gene, through the regulatory roles of CDK4 and p53 in cell cycle G1 progression, share a common functionality in cell cycle G1 control. This gene is frequently mutated or deleted in a wide variety of tumors, and is known to be an important tumor suppressor gene. (provided by RefSeq, Sep 2012)
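
To see how one stretch of DNA can encode unrelated peptides, the sketch below translates the same toy sequence in two reading frames. The sequence is invented and is not CDKN2A exon 2; the helper name `translate_frame` is hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code for DNA codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(dna: str, frame: int) -> str:
    """Translate all complete codons starting at offset `frame` (0, 1 or 2).

    Stop codons are reported as '*' rather than ending translation.
    """
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


shared_exon = "ATGGCGTGCAAA"  # toy sequence, for illustration only
print(translate_frame(shared_exon, 0))  # codons ATG GCG TGC AAA
print(translate_frame(shared_exon, 1))  # codons TGG CGT GCA
```

Shifting the frame by a single base changes every codon, so the two peptides share no sequence similarity even though they are read from the same bases.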

16 RNA EDITING

17 GENE COPY NUMBER CAN VARY
Exome sequencing of osteosarcoma Nature Communications 6, 8940 (2015)

18 Retrogenes
Retrogenes count as genes in annotations

19 Pseudogenes
Although not fully functional, pseudogenes may sometimes still have a function.
Pseudogenes may be included in annotations, but are marked as “pseudogenes”.
Compared with their functional gene relatives, pseudogenes have lost at least some of their ability to be expressed in the cell or to code for protein.
Pseudogenes often result from the accumulation of mutations.

20 MOBILE GENETIC ELEMENTS
Transposons (retrotransposons, DNA transposons)
Plasmids
Bacteriophage elements
Group II introns
Group I introns
~17% of human genome is formed by mobile elements
Mobile elements do not count as genes
Lodish et al., Molecular Cell Biology. 4th edition. New York: W. H. Freeman; 2000.

21 Protein splicing Liu, 2000, Annu. Rev. Genet. 34, 61-76

22 Protein splicing
Known since 1990
Less common than RNA splicing
Found in all kingdoms of life
Liu, 2000, Annu. Rev. Genet. 34, 61-76

23 REFINING THE CONCEPT OF A GENE
RNA splicing
Overlapping reading frames
Regulatory elements
Copy-number variants
Intronic genes
RNA editing
Mobile elements NOT pseudogenes
Retrogenes
Protein splicing
Gerstein et al., Genome Res. 2007, 17

24 REFINING THE CONCEPT OF A GENE
“The gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products”
Gerstein et al., Genome Res. 2007, 17

25 HOW MANY GENES ARE HERE?
Suggest your answer/explanation on Padlet: https://padlet.com/Essex/BS222

26 HOW MANY GENES ARE THERE IN THE HUMAN GENOME?

27 “For the purposes of our study, genes will include any interval along the chromosomal DNA that is transcribed and then translated into a functional protein, or that is transcribed into a functional RNA molecule. By “functional” we mean to include any gene that appears to perform a biological function, even one that might not be essential. Our definition intentionally excludes pseudogenes… When multiple proteins or RNAs are produced from the same region through alternative splicing or alternative transcription initiation, we will count these variants as part of a single gene. Our total gene count, therefore, corresponds to the total number of distinct chromosomal intervals, or loci, that encode either proteins or noncoding RNAs”
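
The counting rule in this definition (all variants from one region count as a single gene) can be approximated by merging overlapping transcripts into loci. The sketch below is a simplification under stated assumptions: transcripts are grouped only by chromosome, strand and coordinate overlap, coordinates are half-open, and the example data are invented. It is not the procedure used in the quoted study.

```python
from collections import defaultdict


def count_loci(transcripts):
    """Count distinct loci among (chrom, strand, start, end) transcripts.

    Transcripts on the same chromosome and strand whose intervals overlap
    are treated as variants of one locus.
    """
    by_strand = defaultdict(list)
    for chrom, strand, start, end in transcripts:
        by_strand[(chrom, strand)].append((start, end))

    loci = 0
    for intervals in by_strand.values():
        intervals.sort()
        current_end = None
        for start, end in intervals:
            if current_end is None or start >= current_end:
                loci += 1                      # no overlap: a new locus begins
                current_end = end
            else:
                current_end = max(current_end, end)  # overlap: extend the locus
    return loci


# Invented example: two overlapping isoforms, one antisense transcript,
# one downstream gene, and one gene on another chromosome.
transcripts = [
    ("chr1", "+", 100, 500),
    ("chr1", "+", 300, 800),
    ("chr1", "-", 400, 600),
    ("chr1", "+", 900, 1200),
    ("chr2", "+", 100, 200),
]
print(count_loci(transcripts))
```

Five transcripts collapse into four loci here, which illustrates why the number of transcripts can greatly exceed the number of genes.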

28 “we observed over 30 million distinct transcripts in approximately 700,000 distinct genomic locations, of which only about 40,000 (5%) appear to represent functional gene loci”

29 = miscellaneous RNA Pertea et al., bioRxiv 332825

30 GENES IN THE GENOME BROWSER
https://genome. ucsc
Gene direction is shown by >>>> or <<<<<

31 GENES IN THE GENOME BROWSER
https://genome. ucsc
Gene direction is shown by >>>> or <<<<<
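
The direction arrows correspond to the strand column of an annotation record. As a hedged illustration, the snippet below reads a six-column BED-style line (chrom, start, end, name, score, strand) and prints arrows in the matching direction. The record is invented and `strand_arrows` is a hypothetical helper, not part of any browser software.

```python
def strand_arrows(bed_line: str, width: int = 5) -> str:
    """Render a six-column BED record as text with direction arrows."""
    chrom, start, end, name, _score, strand = bed_line.split()[:6]
    arrow = ">" if strand == "+" else "<"  # '+' genes point right, '-' genes point left
    return f"{name} {chrom}:{start}-{end} {arrow * width}"


# Invented records, for illustration only.
print(strand_arrows("chrN\t1000\t5000\tgeneA\t0\t+"))
print(strand_arrows("chrN\t7000\t9500\tgeneB\t0\t-"))
```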

32 Prokaryotes vs eukaryotes
Adapted from

33 Prokaryotes vs eukaryotes

34 In prokaryotes the number of genes is ~proportional to the genome size
Is it also true for eukaryotes?

35

36 Prokaryotes vs eukaryotes
Prokaryotes:
85-88% of the genome in coding regions
Usually no introns
Organised in polycistronic transcriptional units (operons)
In total genes
Well-defined promoters
Eukaryotes:
Just 2-4% of the genome in protein coding regions
Extensive intron use and much longer genes
Some genes are organised in clusters, but this is not very typical
~ protein genes
Complex regulatory regions (next lecture)
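
Gene density makes the contrast quantitative. The sketch below uses rounded, commonly cited textbook figures for genome size and protein-coding gene count; these numbers are assumptions supplied for illustration and are not taken from the slides.

```python
# Approximate values: (genome size in Mb, number of protein-coding genes).
GENOMES = {
    "Escherichia coli": (4.6, 4300),
    "Saccharomyces cerevisiae": (12.1, 6000),
    "Homo sapiens": (3100.0, 20000),
}


def genes_per_mb(size_mb: float, n_genes: int) -> float:
    """Gene density: number of genes per megabase of genome."""
    return n_genes / size_mb


density = {name: genes_per_mb(*values) for name, values in GENOMES.items()}
for name, value in density.items():
    print(f"{name}: {value:.1f} genes per Mb")
```

With these figures the human genome is several hundred times larger than the E. coli genome but has only about five times as many protein-coding genes, so gene number does not scale with genome size in eukaryotes the way it does in prokaryotes.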

37

38 Prokaryotes vs eukaryotes
Homo sapiens (human)
Saccharomyces cerevisiae (yeast)
Drosophila melanogaster (fruit fly)
Maize (plant)
Escherichia coli (bacteria)
Legend: gene, intron, pseudogene, repetitive elements
Adapted from

39 TAKE HOME MESSAGE
What is a gene? What looks like a gene but is not a gene? How many genes do we have?
MUST KNOW:
Corrected central dogma; pro-, eukaryotic gene structure
Definition of the gene
Genome size, gene density
PROMOTER, TRANSCRIPTION START SITE, OPEN READING FRAME, ALTERNATIVE PROMOTER, TRANSCRIPT, PSEUDOGENE

