Lecture 4. Topics in Gene Regulation and Epigenomics (Basics)

Lecture 4. Topics in Gene Regulation and Epigenomics (Basics) The Chinese University of Hong Kong CSCI5050 Bioinformatics and Computational Biology

Lecture outline
Part 1: Introduction to gene regulation and epigenetics
Part 2: Experimental methods for epigenomics
Part 3: Relevant problems in computational biology and bioinformatics
Last update: 30-Jan-2018 CSCI5050 Bioinformatics and Computational Biology | Kevin Yip-cse-cuhk | Spring 2018

Part 1: Introduction to Gene Regulation and Epigenetics

Gene regulation
Here defined as the control of the amount and type of gene products.
Amount: number of transcripts; number of proteins.
Products: RNAs (in total, a particular transcript isoform, or with a particular modification); proteins (with a particular form, e.g., activated).

Gene “expression”
Gene expression is a general term for the production of gene products.
More specific terms: transcription rate (number of new transcripts per unit time); transcript level (total number of transcripts in the cell); translation rate; protein level.
These quantities are correlated but not identical, and the correlations are sometimes weak.

Gene regulation
Expression of genes needs to be tightly regulated, for example during differentiation into different cell types and in response to environmental conditions.
How are genes regulated? At the transcriptional, post-transcriptional, translational, and post-translational levels.
Analogy: lighting control.

A simple illustration
[Figure: genes G1-G4 and proteins P1, P3, P5, P6, P7, with methyl (Me) and acetyl (Ac) marks. Labeled mechanisms: miRNA-mRNA interactions; protein-RNA interactions; transcription factor binding; DNA methylation; protein-protein interactions and DNA long-range interactions; histone modifications; chromatin accessibility.]

More details and other mechanisms
Transcriptional regulation: transcription factors; binding to promoters vs. distal elements (e.g., enhancers); activators vs. repressors.
Post-transcriptional regulation: capping; poly-adenylation; splicing; RNA editing; mRNA degradation.
Translation: translational repression.
Post-translational: protein modifications (e.g., phosphorylation).
Image source: http://www.emunix.emich.edu/~rwinning/genetics/eureg.htm

Epigenetics
Wikipedia: “the study of heritable changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence”
Heritable: can be passed on to offspring (daughter cells).
Same DNA, different outcomes.
But how can these signals be inherited? Are they still based on the DNA sequence (in some complex way) or not?

Heritability
$(D_1, E_1) \rightarrow (D_2, E_2)$
D: DNA; E: epigenetic signals; 1, 2: states before and after proliferation, differentiation, or fertilization.
$D_2 = f(D_1) \approx D_1$
$E_2 = f(D_1, E_1)$
Open questions: $E_2 = f(E_1)$? $E_0 = f(D_0)$?

Active and inactive epigenetic signals
DNA methylation; chromatin remodeling; histone modifications; RNA transcripts; ... (And actually not that simple!)
Image credit: Zhou et al., Nature Reviews Genetics 12(1):7-18, (2011)

DNA methylation
A methyl group (-CH3) is added to cytosine in eukaryotic DNA, usually at a cytosine next to a guanine (in a CpG dinucleotide).
Less commonly, methylation also occurs in CpHpG or CpHpH contexts (H = A, C or T).
Methylated cytosine can be converted to other forms during TET-mediated active de-methylation.
Image credit: Song et al., Nature Biotechnology 30(11):1107-1116, (2012)
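
The three sequence contexts named above can be told apart by looking at the two bases downstream of each cytosine. The following is a minimal sketch for a single strand; the function name `cytosine_contexts` and the toy sequence are illustrative assumptions, not part of the lecture.

```python
def cytosine_contexts(seq):
    """Classify each cytosine on the given strand as CG, CHG, or CHH.

    Returns (position, context) tuples with 0-based positions.
    Cytosines that cannot be classified (too close to the end of the
    sequence, or followed by an ambiguous base) are skipped.
    """
    seq = seq.upper()
    h_bases = "ACT"  # H = A, C or T
    contexts = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        downstream = seq[i + 1:i + 3]  # up to two bases after the cytosine
        if downstream[:1] == "G":
            contexts.append((i, "CG"))
        elif len(downstream) == 2 and downstream[0] in h_bases:
            if downstream[1] == "G":
                contexts.append((i, "CHG"))
            elif downstream[1] in h_bases:
                contexts.append((i, "CHH"))
    return contexts


result = cytosine_contexts("ACGTCAGCTT")
print(result)  # [(1, 'CG'), (4, 'CHG'), (7, 'CHH')]
```

Cytosines on the opposite strand would be handled by running the same function on the reverse complement.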

DNA methylation
Hyper-methylation at CpG islands/promoters is associated with gene repression. The regulatory function of DNA methylation at other regions is not as clear.
Recent studies have suggested links between DNA methylation and protein binding, transcriptional elongation, splicing, and histone modifications.
Gene imprinting: parent-of-origin-specific expression.
Implications in diseases.
De novo vs. maintenance methylation.
Image source: http://missinglink.ucsf.edu/lm/genes_and_genomes/methylation.html

Trajectory of global DNA methylation changes
Image credit: Saitou et al., Development 139:15-31, (2012)

Chromatin remodeling
Chromatin: the compact structure of DNA and proteins.
DNA wraps around histone proteins to form nucleosomes (~146bp of DNA around each histone octamer).
Nucleosome positioning can change dynamically, affecting DNA accessibility (e.g., to binding proteins).
Image credit: Wikipedia

Histone modifications
Modification of specific residues on histone proteins: acetylation, methylation, phosphorylation, ubiquitination, etc.
Nomenclature: H3K4me3 (histone protein H3, lysine 4, tri-methylation).
Image credit: Kato et al., IBMS BoneKEy 7:314-324, (2010)
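
The nomenclature is regular enough to be split mechanically into its parts. The sketch below is one possible parser; the regular expression covers only a small, assumed set of histones and modification suffixes (me1/me2/me3, ac, ph, ub), and the name `parse_histone_mark` is hypothetical.

```python
import re

# Histone, one-letter residue code, residue position, modification suffix.
# The accepted histones and suffixes are a deliberately small, assumed subset.
_MARK_PATTERN = re.compile(r"^(H1|H2A|H2B|H3|H4)([A-Z])(\d+)(me[123]|ac|ph|ub)$")


def parse_histone_mark(name):
    """Split a mark such as 'H3K4me3' into its components."""
    match = _MARK_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized histone mark: {name!r}")
    histone, residue, position, modification = match.groups()
    return {
        "histone": histone,
        "residue": residue,
        "position": int(position),
        "modification": modification,
    }


print(parse_histone_mark("H3K4me3"))
# {'histone': 'H3', 'residue': 'K', 'position': 4, 'modification': 'me3'}
```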

Histone modifications and regulation
Histone modifications give different types of signals in gene regulation (again, a simplified view).
Image credit: Zhou et al., Nature Reviews Genetics 12(1):7-18, (2011)

Non-coding RNA
There are different types of functional RNA that are not translated into proteins.

Type                    Abbreviation   Function
Ribosomal RNA           rRNA           Translation
Transfer RNA            tRNA           Translation
Small nuclear RNA       snRNA          Splicing
Small nucleolar RNA     snoRNA         Nucleotide modifications
MicroRNA                miRNA          Gene regulation
Small interfering RNA   siRNA          Gene regulation
…

MicroRNA
Short (~22 nucleotides) RNAs that regulate gene expression by promoting mRNA degradation or repressing translation.
Image credit: Wikipedia; Sun et al., Annual Review of Biomedical Engineering 12:1-27, (2010)

Gene regulation and epigenetics
Some mechanisms are known to regulate gene expression. For example, transcription factor binding can activate or repress transcription, and miRNA-mRNA binding can promote mRNA cleavage or repress translation.
Some signals are correlated with expression, but the causal direction is not certain (or not fixed). For example: promoter DNA methylation and transcriptional repression; histone modifications and gene activation/repression.
The different mechanisms are not independent.

Part 2: Experimental Methods for Epigenomics

From epigenetics to epigenomics
The focus is not on a single gene but on the whole genome: measuring signals genome-wide, and studying general (statistical) phenomena and (biological) mechanisms.

High-throughput methods (recap)
Protein-DNA binding: ChIP-seq, ChIP-exo, ...
DNA long-range interactions: ChIA-PET, Hi-C, TCC, ...
DNA methylation: bisulfite sequencing, RRBS, MeDIP-seq, MBDCap-seq, ...
Open chromatin: DNase-seq, FAIRE-seq, ATAC-seq, ...
Histone modifications: ChIP-seq
Gene expression and isoforms: RNA-seq, CAGE, ...
Protein-RNA binding: CLIP-seq, HITS-CLIP, PAR-CLIP, RIP-seq, ...
...

ChIP-seq
Chromatin immunoprecipitation followed by sequencing.
An antibody is used to “pull down” target DNA, such as DNA bound by a certain protein or with a certain chemical modification.
Image credit: Mardis, Nature Methods 4:613-614, (2007)

ChIP-exo
Higher data resolution by adding an exonuclease digestion step.
Image credit: Wikipedia; Rhee et al., Cell 147(6):1408-1419, (2011)

Open chromatin
ATAC-seq is winning out because it requires fewer cells: as few as 50,000 cells, up to 1,000-fold fewer than MNase-seq or DNase-seq.
Image credit: Meyer and Liu, Nature Reviews Genetics 15(11):709-721, (2014)

DNA methylation: BS and oxBS
[Figure: how C, 5mC, and 5hmC/5fC are read out by standard DNA sequencing (DNA-seq), bisulfite sequencing (BS), and oxidative bisulfite sequencing (oxBS).]
Image credit: Booth et al., Nature Protocols 8(10):1841-1851, (2013)
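
In BS data, both 5mC and 5hmC are read as C, whereas unmodified C is read as T. In oxBS data, 5hmC is oxidized first and is then also read as T, so only 5mC is still read as C. Under that readout, per-site levels can be estimated by simple subtraction. The sketch below assumes this idealized model (perfect conversion, no sampling noise); the function name and read counts are hypothetical.

```python
def estimate_modification_levels(bs_c_reads, bs_total, oxbs_c_reads, oxbs_total):
    """Estimate 5mC and 5hmC fractions at one cytosine from BS and oxBS read counts.

    bs_c_reads / bs_total      : fraction read as C after bisulfite (5mC + 5hmC)
    oxbs_c_reads / oxbs_total  : fraction read as C after oxidative bisulfite (5mC only)
    """
    bs_level = bs_c_reads / bs_total
    oxbs_level = oxbs_c_reads / oxbs_total
    return {
        "5mC": oxbs_level,
        # Sampling noise can make the difference negative, so clip at zero.
        "5hmC": max(0.0, bs_level - oxbs_level),
        "other": 1.0 - bs_level,  # unmodified C and anything else read as T
    }


print(estimate_modification_levels(30, 40, 20, 40))
# {'5mC': 0.5, '5hmC': 0.25, 'other': 0.25}
```

With real data, the subtraction is noisy at low coverage, so a statistical model of the two read counts is usually preferable to the point estimate shown here.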

Part 3: Problems in Computational Biology and Bioinformatics

Some related CBB problems
Analysis of chromatin patterns
Identification of regulatory elements [lecture]
Reconstruction of transcription factor (TF) regulatory networks
Identification of non-coding RNAs
Prediction of miRNA targets
Construction of gene expression models
Inferring epigenetic signals

Analysis of chromatin patterns
Computational tasks: segmentation of the human genome (single bases/fixed-size bins or based on annotation; unsupervised clustering or supervised classification); data aggregation and integration; large-scale correlations; learning of signal shapes; visualization.
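
As a small illustration of the fixed-size-bin option, the sketch below averages a per-base signal within consecutive bins. The function name, the bin size, and the toy signal are illustrative assumptions; real analyses typically use bins of hundreds of bases.

```python
def bin_signal(signal, bin_size):
    """Average a per-base signal over consecutive fixed-size bins.

    The last bin may be shorter than bin_size if the signal length
    is not a multiple of bin_size.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    binned = []
    for start in range(0, len(signal), bin_size):
        window = signal[start:start + bin_size]
        binned.append(sum(window) / len(window))
    return binned


print(bin_signal([0, 2, 4, 6, 8, 10, 1], bin_size=3))  # [2.0, 8.0, 1.0]
```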

Genome segmentation
Using chromatin state to segment the genome: hidden Markov model; clustering.
Annotate the identified states using biological knowledge.
Image credit: Ernst and Kellis, Nature Methods 9(3):215-216, (2012)
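
To make the hidden Markov model idea concrete, the sketch below decodes a toy two-state model with the Viterbi algorithm. Everything in it is an assumption for illustration: the state names, the probabilities, and the single binary observation per bin (mark present or absent). Published segmentation tools use many marks and learn their parameters from data.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a sequence of observations."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    backpointers = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + math.log(trans_p[p][s]))
            scores[s] = (best[-1][prev] + math.log(trans_p[prev][s])
                         + math.log(emit_p[s][obs]))
            pointers[s] = prev
        best.append(scores)
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


states = ["active", "inactive"]
start_p = {"active": 0.5, "inactive": 0.5}
trans_p = {"active": {"active": 0.9, "inactive": 0.1},
           "inactive": {"active": 0.1, "inactive": 0.9}}
# Probability of observing the mark (1) or not (0) in a bin, per state.
emit_p = {"active": {1: 0.8, 0: 0.2},
          "inactive": {1: 0.1, 0: 0.9}}

bins = [0, 0, 1, 1, 1, 0, 0]
print(viterbi(bins, states, start_p, trans_p, emit_p))
```

Consecutive bins that share a decoded state form one segment, which can then be annotated using biological knowledge as described above.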

Global chromatin patterns
Many recent findings relate chromatin patterns to other features.
Global example: histone modifications, recombination rates and chromosome 1D and 3D structures in C. elegans.
Image credit: Gerstein et al., Science 330(6012):1775-1787, (2010)

Local chromatin patterns
Histone modifications and protein binding at promoters and enhancers in human.
Image credit: Heintzman et al., Nature Genetics 39(3):311-318, (2007)

Identifying regulatory elements
There are different types of protein-binding regions in the DNA: promoters, enhancers, silencers, insulators, ...
How can they be located in the genome?
Image credit: Raab and Kamakaka, Nature Reviews Genetics 11(6):439-446, (2010)

Identifying regulatory elements
Some useful information:
Genomic location, e.g., promoters are around transcription start sites.
Evolutionary conservation: functional regions are more conserved.
Protein binding signals and motifs, e.g., EP300 at enhancers, CTCF at insulators.
Chromatin features, e.g., DNase I hypersensitivity, H3K4me1 and H3K27ac at active enhancers.
Reporter assays.
...
Difficulty: integrating different types of information.

Reconstruction of TF network
Goals: identifying TF binding sites; determining the target genes of each TF (in different cell types, in different conditions); deducing how gene expression is regulated by TFs; studying how TFs interact with each other.
Methods:
1. From expression data
2. Sequence-based (motif analysis)
3. From binding experiments
The sign of regulation (activation vs. repression) is usually not determined by methods #2 and #3.

Expression-based methods
Input: expression levels of genes, usually from microarrays and often as time series data.
Output: a network (i.e., a directed graph). Each node is a gene (and its protein product). An A→B edge means A is a TF and it regulates B.
Types: differential equations; probabilistic networks; Boolean networks.

Expression-based methods: differential equations
Models (y_j: expression level of gene j, a_ji: influence of TF i on gene j): linear, sigmoidal, ...
Methods: solve the system of equations to get best-fit parameter values.
Difficulties: many parameters when there are many TFs; insufficient training data; long running time. L1 (LASSO) regularization can be used to control the number of non-zero variables.
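
The slide names the model families without giving formulas here. One common way to write them with the symbols defined above, offered as an assumption rather than the lecture's exact equations, is shown below; the sum runs over the TFs i that may regulate gene j, and the regularization weight \lambda is an added symbol.

```latex
% Linear model: the rate of change of gene j is a weighted sum of TF levels
\frac{dy_j}{dt} = \sum_i a_{ji}\, y_i

% Sigmoidal model: the same weighted sum passed through a saturating function
\frac{dy_j}{dt} = \frac{1}{1 + \exp\left(-\sum_i a_{ji}\, y_i\right)}

% L1 (LASSO) fit of the linear model to time points t, with weight \lambda \ge 0
\min_{a_{j\cdot}} \; \sum_t \left( \frac{dy_j}{dt}(t) - \sum_i a_{ji}\, y_i(t) \right)^{2}
  + \lambda \sum_i \left| a_{ji} \right|
```

A larger \lambda drives more of the a_{ji} to exactly zero, which is how the regularization limits the number of inferred regulators per gene.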

Boolean networks
Each gene is considered to be either on or off.
The gene regulatory network is treated as a Boolean network (similar to an electric circuit).
Expression of a gene at time t+1 depends on the expression of the genes that regulate it at time t.
Goal: find the logical relationships between genes.
Image credit: Akutsu and Miyano, Pacific Symposium on Biocomputing 4:17-28, (1999)
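
The update rule (state at time t+1 computed from state at time t) can be sketched with a toy three-gene network. The genes A, B, C and their logic rules are made up for illustration; inferring such rules from data is the actual research problem and is not attempted here.

```python
def step(state, rules):
    """Synchronous update: every gene's next value is computed from the current state."""
    return {gene: rule(state) for gene, rule in rules.items()}


def simulate(state, rules, n_steps):
    """Return the list of states from time 0 to time n_steps."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        trajectory.append(state)
    return trajectory


# Hypothetical rules: A is a constant input, A activates B, C represses B, B activates C.
rules = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"] and not s["C"],
    "C": lambda s: s["B"],
}

initial = {"A": True, "B": False, "C": False}
for t, state in enumerate(simulate(initial, rules, 4)):
    print(t, state)
```

With these rules the network cycles with period 4: B switches on, turns on C, is then repressed by C, and C switches off again.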

From binding data
Input: binding signals of transcription factors across the whole genome, usually from ChIP-chip or ChIP-seq, or from motifs (best to combine both).
Output: TF regulatory network.
Difficulties:
Finding binding sites: peak calling; motif analysis.
Associating binding sites with target genes: promoters (e.g., 500bp upstream of the transcription start site); more difficult for distal binding sites; expression patterns could help.
Evaluating functional effects of binding (strong vs. weak, transient binding).
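
The promoter-based association mentioned above can be sketched as a strand-aware window check. The tuple layouts, gene names, and coordinates below are hypothetical, and the quadratic loop is for clarity only; genome-scale data would call for an interval index.

```python
def promoter_window(tss, strand, upstream=500):
    """Return (start, end) of the region `upstream` bp upstream of the TSS.

    On the '-' strand, upstream means larger genomic coordinates.
    """
    if strand == "+":
        return tss - upstream, tss
    return tss, tss + upstream


def assign_peaks_to_genes(peaks, genes, upstream=500):
    """Map gene name -> list of peaks whose summit falls in the gene's promoter.

    peaks: iterable of (chrom, summit)
    genes: iterable of (name, chrom, tss, strand)
    """
    targets = {}
    for chrom, summit in peaks:
        for name, gene_chrom, tss, strand in genes:
            start, end = promoter_window(tss, strand, upstream)
            if chrom == gene_chrom and start <= summit <= end:
                targets.setdefault(name, []).append((chrom, summit))
    return targets


genes = [("geneA", "chr1", 10000, "+"), ("geneB", "chr1", 20000, "-")]
peaks = [("chr1", 9800), ("chr1", 19800), ("chr1", 20300), ("chr2", 9800)]
print(assign_peaks_to_genes(peaks, genes))
# {'geneA': [('chr1', 9800)], 'geneB': [('chr1', 20300)]}
```

The peak at chr1:19800 is left unassigned: it lies downstream of geneB's TSS on the minus strand, which illustrates why distal or ambiguous sites need additional evidence such as expression patterns.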

Combining both types of data
Use expression data to infer an initial network.
Identify potential regulators.
Search for binding motifs of these regulators.
Incorporate the global occurrence of these motifs at gene promoters to refine the network.
Image credit: Tamada et al., Bioinformatics 19(Suppl.2):ii227-ii236, (2003)

Identification of non-coding RNAs
High-throughput experiments have recently shown that a vast amount of DNA is transcribed into RNA.
What are these transcripts? Experimental artifacts? Unannotated protein-coding genes? Non-functional transcripts? Functional non-coding RNAs? Pseudogenes?

Prediction of miRNA targets
Table credit: Peterson et al., Frontiers in Genetics 5:23, (2014)

Seed match
Table source: TargetScan
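
A seed match is a stretch of the target mRNA that is perfectly complementary to the 5' end of the miRNA. The sketch below searches for one site type only, a match to miRNA nucleotides 2-8 (the site type TargetScan calls 7mer-m8). The let-7a sequence is the well-known mature miRNA; the 3' UTR fragment and the function names are made up for illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_seed_sites(mirna, utr):
    """Find perfect matches to miRNA nucleotides 2-8 in a UTR.

    Returns the site sequence expected in the mRNA and the 0-based
    start positions of all (possibly overlapping) occurrences.
    """
    seed = mirna[1:8]                # nucleotides 2-8, 1-based
    site = reverse_complement(seed)  # what the mRNA must contain, 5' to 3'
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return site, positions


let7a = "UGAGGUAGUAGGUUGUAUAGUU"
utr_fragment = "AAGCUACCUCAUUU"  # hypothetical 3' UTR fragment
print(find_seed_sites(let7a, utr_fragment))  # ('CUACCUC', [3])
```

Seed matching alone yields many false positives, which is why prediction tools combine it with further features such as conservation and site context.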

Construction of expression models
Given the many different mechanisms involved in gene regulation, how are they related to each other?
Are they redundant? Do they simply add to each other, or do they have synergistic effects? Which have more impact on final expression levels? What are their time scales? When is each mechanism used?

Construction of expression models: modeling and prediction
An indirect way to estimate how good a model is: evaluating the accuracy of predicted expression.
Prediction of expression level. Regression: $y_i \approx f(x_i)$. Classification: $(y_i > t) \approx f(x_i)$.
Prediction of differential expression.

Construction of expression models: chromatin features and expression
Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

Construction of expression models: model construction and accuracy
Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

“Histone code” hypothesis
The statistical models are good, but too complex for humans to interpret.
Is there a simple set of rules (i.e., a “code”) that can easily tell the expression level of a gene?
Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

Inferring epigenetic signals
Can we infer epigenetic signals (e.g., open chromatin, DNA methylation or histone modifications) from DNA sequence alone? In particular, can we do so for a specific cell type?

Deep learning of functional activity
Image credit: Kelley et al., Genome Research 26(7):990-999, (2016)

Summary
“Gene expression” is a general term with several possible meanings.
Gene expression is regulated by many mechanisms, including (but not limited to) transcription factor binding, DNA long-range interactions, DNA methylation, chromatin structure, histone modifications, and microRNA-mRNA binding.
A lot of new genome-wide data are available.
There are many emerging research topics in CBB.