Lecture 4. Topics in Gene Regulation and Epigenomics (Basics)

1 Lecture 4. Topics in Gene Regulation and Epigenomics (Basics)
The Chinese University of Hong Kong CSCI5050 Bioinformatics and Computational Biology

2 Lecture outline Introduction to gene regulation and epigenetics
Experimental methods for epigenomics Relevant problems in computational biology and bioinformatics Last update: 30-Jan-2018 CSCI5050 Bioinformatics and Computational Biology | Kevin Yip-cse-cuhk | Spring 2018

3 Introduction to Gene Regulation and Epigenetics
Part 1 Introduction to Gene Regulation and Epigenetics

4 Gene regulation Here defined as the control of the amounts and types of gene products. Amount: number of transcripts, number of proteins. Products: RNAs (total, a particular transcript isoform, or with a particular modification) and proteins (with a particular form, e.g., activated).

5 Gene “expression” Gene expression is a general term used to indicate the production of gene products. More specific terms: transcription rate (number of new transcripts per unit time), transcript level (total number of transcripts in the cell), translation rate, protein level. All of these are correlated but not identical, sometimes with only weak correlations.

6 Gene regulation Expression of genes needs to be tightly regulated
Differentiation into different cell types; response to environmental conditions. How are genes regulated? At the transcriptional, post-transcriptional, translational and post-translational levels. Analogy: lighting control.

7 A simple illustration [Figure: diagram with elements labeled G1-G4, P1, P3, P5, P6 and P7, plus Me and Ac marks]

8 A simple illustration [Figure: the same diagram (G1-G4, P1, P3, P5, P6, P7, Me, Ac) annotated with regulatory mechanisms: miRNA-mRNA interactions, protein-RNA interactions, transcription factor binding, DNA methylation, protein-protein interactions and DNA long-range interactions, histone modifications, chromatin accessibility]

9 More details and other mechanisms
Transcriptional regulation: transcription factors; binding to promoters vs. distal elements (e.g., enhancers); activators vs. repressors. Post-transcriptional regulation: capping, poly-adenylation, splicing, RNA editing, mRNA degradation. Translation: translational repression. Post-translational: protein modifications (e.g., phosphorylation).

10 Epigenetics Wikipedia: “the study of heritable changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence”. Heritable: can be passed on to offspring (daughter cells). Same DNA, different outcomes. But how can these signals be inherited? Are they still based on DNA sequence (in some complex way) or not?

11 Heritability (D1, E1) → (D2, E2)
D: DNA. E: epigenetic signals. 1, 2: proliferation, differentiation, fertilization. D2 = f(D1) ≈ D1. E2 = f(D1, E1). E2 = f(E1)? E0 = f(D0)?

12 Active and inactive epigenetic signals
DNA methylation, chromatin remodeling, histone modifications, RNA transcripts, ... (And actually not that simple!) Image credit: Zhou et al., Nature Reviews Genetics 12(1):7-18, (2011)

13 DNA methylation Methyl group (-CH3) added to cytosine in eukaryotic DNA, usually next to a guanine (in a CpG dinucleotide). Less common, but can also occur in CpHpG or CpHpH contexts (H = A, C or T). Can be converted to other forms during TET-mediated active demethylation. Image credit: Song et al., Nature Biotechnology 30(11): , (2012)
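The three sequence contexts above can be illustrated with a short sketch. It assumes a plain A/C/G/T string and looks only at the forward strand; the function name `cytosine_contexts` is hypothetical and not taken from the lecture.

```python
def cytosine_contexts(seq):
    """Label each forward-strand cytosine as CG, CHG or CHH (H = A, C or T).

    Assumes the sequence contains only A, C, G and T.
    """
    seq = seq.upper()
    contexts = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        nxt = seq[i + 1 : i + 3]  # up to two downstream bases
        if len(nxt) >= 1 and nxt[0] == "G":
            ctx = "CG"
        elif len(nxt) == 2 and nxt[1] == "G":
            ctx = "CHG"
        elif len(nxt) == 2:
            ctx = "CHH"
        else:
            ctx = None  # too close to the 3' end to classify
        contexts.append((i, ctx))
    return contexts


if __name__ == "__main__":
    for position, context in cytosine_contexts("ACGTCAGCTT"):
        print(position, context)
```

A real methylation caller would also scan the reverse strand, since a CpG on one strand pairs with a CpG on the other.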

14 DNA methylation Hyper-methylation at CpG islands/promoters is associated with gene repression. The regulatory function of DNA methylation at other regions is not as clear. Recent studies have suggested links between DNA methylation and protein binding, transcriptional elongation, splicing and histone modifications. Gene imprinting: parent-specific expression. Implications in diseases. De novo vs. maintenance methylation.

15 Trajectory of DNA global methylation changes
Image credit: Saitou et al., Development 139:15-31, (2012)

16 Chromatin remodeling Chromatin: compact structure of DNA and proteins
DNA wraps around histone proteins to form nucleosomes (~146 bp of DNA around each histone octamer). Nucleosome positioning can be changed dynamically, affecting DNA accessibility (e.g., to binding proteins). Image credit: Wikipedia

17 Histone modifications
Modification of specific residues on histone proteins: acetylation, methylation, phosphorylation, ubiquitination, etc. Nomenclature: H3K4me3 (histone protein H3, lysine 4, tri-methylation). Image credit: Kato et al., IBMS BoneKEy 7: , (2010)
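The nomenclature can be unpacked mechanically. The sketch below parses names of the form histone + residue + position + modification; the regular expression and lookup tables are illustrative assumptions that cover only a few common marks, not a complete standard.

```python
import re

# Hypothetical pattern for names such as H3K4me3, H3K27ac or H2BK120ub.
MARK_RE = re.compile(r"^(H(?:1|2A|2B|3|4))([A-Z])(\d+)(me[123]?|ac|ph|ub)$")

# Small illustrative lookup tables (not exhaustive).
RESIDUES = {"K": "lysine", "R": "arginine", "S": "serine", "T": "threonine"}
MODIFICATIONS = {
    "me": "methylation",
    "me1": "mono-methylation",
    "me2": "di-methylation",
    "me3": "tri-methylation",
    "ac": "acetylation",
    "ph": "phosphorylation",
    "ub": "ubiquitination",
}


def parse_mark(name):
    """Split a histone mark name into its four components."""
    match = MARK_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized histone mark: {name!r}")
    histone, residue, position, modification = match.groups()
    return {
        "histone": histone,
        "residue": RESIDUES.get(residue, residue),
        "position": int(position),
        "modification": MODIFICATIONS[modification],
    }


if __name__ == "__main__":
    print(parse_mark("H3K4me3"))
    print(parse_mark("H3K27ac"))
```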

18 Histone modifications and regulation
Histone modifications give different types of signals in gene regulation (again, a simplified view): Image credit: Zhou et al., Nature Reviews Genetics 12(1):7-18, (2011)

19 Non-coding RNA There are different types of functional RNA that are not translated into proteins
Type (abbreviation): function
Ribosomal RNA (rRNA): translation
Transfer RNA (tRNA): translation
Small nuclear RNA (snRNA): splicing
Small nucleolar RNA (snoRNA): nucleotide modifications
MicroRNA (miRNA): gene regulation
Small interfering RNA (siRNA): gene regulation

20 MicroRNA Short (~22 nucleotides) RNAs that regulate gene expression by promoting mRNA degradation or repressing translation. Image credit: Wikipedia, Sun et al., Annual Review of Biomedical Engineering 12:1-27, (2010)

21 Gene regulation and epigenetics
Some mechanisms are known to regulate gene expression. For example: transcription factor binding can activate or repress transcription; miRNA-mRNA binding can promote mRNA cleavage or repress translation. Some signals are correlated with expression, but the causal direction is not certain (or not fixed). For example: promoter DNA methylation and transcriptional repression; histone modifications and gene activation/repression. The different mechanisms are not independent.

22 Experimental Methods for Epigenomics
Part 2 Experimental Methods for Epigenomics

23 From epigenetics to epigenomics
Not focusing on a single gene, but on the whole genome. Measuring signals genome-wide. Studying general (statistical) phenomena / (biological) mechanisms.

24 High-throughput methods (recap)
Protein-DNA binding (ChIP-seq, ChIP-exo, ...); DNA long-range interactions (ChIA-PET, Hi-C, TCC, ...); DNA methylation (bisulfite sequencing, RRBS, MeDIP-seq, MBDCap-seq, ...); open chromatin (DNase-seq, FAIRE-seq, ATAC-seq, ...); histone modifications (ChIP-seq); gene expression and isoforms (RNA-seq, CAGE, ...); protein-RNA binding (CLIP-seq, HITS-CLIP, PAR-CLIP, RIP-seq, ...); ...

25 ChIP-seq Chromatin immunoprecipitation followed by sequencing
Use an antibody to “pull down” target DNA, such as DNA bound by a certain protein or carrying a certain chemical modification. Image credit: Mardis, Nature Methods 4: , (2007)

26 ChIP-exo Higher data resolution by adding a digestion step
Image credit: Wikipedia, Rhee et al., Cell 147(6): , (2011)

27 Open chromatin ATAC-seq is winning out because it requires fewer cells: as few as 50,000 cells, up to 1,000-fold fewer than MNase-seq or DNase-seq. Image credit: Meyer and Liu, Nature Reviews Genetics 15(11): , (2014)

28 DNA methylation BS and oxBS [Figure: table comparing how C, 5mC and 5hmC/5fC are read out (as C or T) by DNA-seq, BS-seq and oxBS-seq]
Image credit: Booth et al., Nature Protocols 8(10): , (2013)
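One way to see why the two protocols are run together: in bisulfite sequencing (BS) both 5mC and 5hmC are read as C, whereas in oxidative bisulfite sequencing (oxBS) only 5mC is read as C. The sketch below estimates both levels at a single cytosine by subtraction. The function name and the read counts are hypothetical, and real analyses would also model sampling noise.

```python
def methylation_levels(bs_c, bs_total, oxbs_c, oxbs_total):
    """Estimate 5mC and 5hmC fractions at one cytosine from read counts.

    bs_c / bs_total:     reads called C / all reads in the BS library
    oxbs_c / oxbs_total: reads called C / all reads in the oxBS library
    """
    bs_fraction = bs_c / bs_total        # 5mC + 5hmC
    oxbs_fraction = oxbs_c / oxbs_total  # 5mC only
    # Sampling noise can make the difference negative, so clip at zero.
    five_hmc = max(bs_fraction - oxbs_fraction, 0.0)
    return {"5mC": oxbs_fraction, "5hmC": five_hmc}


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(methylation_levels(bs_c=80, bs_total=100, oxbs_c=60, oxbs_total=100))
```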

29 Problems in Computational Biology and Bioinformatics
Part 3 Problems in Computational Biology and Bioinformatics

30 Some related CBB Problems
Analysis of chromatin patterns; identification of regulatory elements [lecture]; reconstruction of transcription factor (TF) regulatory networks; identification of non-coding RNAs; prediction of miRNA targets; construction of gene expression models; inferring epigenetic signals.

31 Analysis of chromatin patterns
Computational tasks: Segmentation of the human genome (single bases/fixed-size bins or based on annotation; unsupervised clustering or supervised classification). Data aggregation and integration. Large-scale correlations. Learning of signal shapes. Visualization.

32 Genome segmentation Using chromatin state to segment the genome
Hidden Markov model. Clustering. Annotate identified states using biological knowledge. Image credit: Ernst and Kellis, Nature Methods 9(3): , (2012)
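To make the hidden Markov model idea concrete, here is a minimal Viterbi decoder over genomic bins. It is a toy sketch, not the method of the cited paper: it assumes two hidden states, a single binary mark per bin, and hand-picked probabilities, whereas real tools learn their parameters from many marks.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state path for a sequence of observations."""
    # Work in log space to avoid numerical underflow on long sequences.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    backpointers = []
    for o in obs[1:]:
        row, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: scores[-1][p] + math.log(trans_p[p][s]))
            row[s] = (scores[-1][best_prev]
                      + math.log(trans_p[best_prev][s])
                      + math.log(emit_p[s][o]))
            pointers[s] = best_prev
        scores.append(row)
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Illustrative parameters (assumed, not learned): 1 = mark present in the bin.
STATES = ["active", "quiet"]
START = {"active": 0.5, "quiet": 0.5}
TRANS = {"active": {"active": 0.9, "quiet": 0.1},
         "quiet": {"active": 0.1, "quiet": 0.9}}
EMIT = {"active": {1: 0.8, 0: 0.2},
        "quiet": {1: 0.1, 0: 0.9}}

if __name__ == "__main__":
    bins = [0, 0, 1, 1, 1, 0, 0]
    print(viterbi(bins, STATES, START, TRANS, EMIT))
```

The decoded states are only labels; attaching meaning to them (promoter, enhancer, and so on) is the separate annotation step mentioned on the slide.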

33 Global chromatin patterns
Many recent findings relate chromatin patterns to other features. Global example: histone modifications, recombination rates and chromosome 1D and 3D structures in C. elegans. Image credit: Gerstein et al., Science 330(6012): , (2010)

34 Local chromatin patterns
Histone modifications and protein binding at promoters and enhancers in human. Image credit: Heintzman et al., Nature Genetics 39(3): , (2007)

35 Identifying regulatory elements
There are different types of protein-binding regions in the DNA: promoters, enhancers, silencers, insulators, ... How to locate them in the genome? Image credit: Raab and Kamakaka, Nature Reviews Genetics 11(6): , (2010)

36 Identifying regulatory elements
Some useful information: Genomic location (e.g., promoters are around transcription start sites). Evolutionary conservation (functional regions are more conserved). Protein binding signals and motifs (e.g., EP300 at enhancers, CTCF at insulators). Chromatin features (e.g., DNase I hypersensitivity, H3K4me1 and H3K27ac at active enhancers). Reporter assays. ... Difficulty: integrating different types of information.

37 Reconstruction of TF network
Goals: Identifying TF binding sites. Determining the target genes of each TF (in different cell types, in different conditions). Deducing how gene expression is regulated by TFs. Studying how TFs interact with each other. Methods: (1) from expression data, (2) sequence-based (motif analysis), (3) from binding experiments. Sign of regulation (activation vs. repression) is usually not determined for #2 and #3.

38 Expression-based methods
Input: expression levels of genes, usually from microarrays, often time series data. Output: a network (i.e., a directed graph). Each node is a gene (and its protein product). An A→B edge means A is a TF and it regulates B. Types: differential equations, probabilistic networks, Boolean networks.

39 Expression-based methods
Differential equations. Models (yj: expression level of gene j, aji: influence of TF i on gene j): linear, sigmoidal, ... Methods: solve the system of equations to get best-fit parameter values. Difficulties: many parameters when there are many TFs; insufficient training data (L1 (LASSO) regularization can control the number of non-zero parameters); long running time.
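The model equations on this slide did not survive in the transcript. As a hedged reconstruction, the forms below are commonly used for such models and use the slide's symbols; the sigmoid function \sigma, the discrete time points t, and the regularization weight \lambda are assumptions introduced here, not taken from the lecture.

```latex
% Linear model: the rate of change of gene j is a weighted sum of TF levels
\frac{dy_j}{dt} = \sum_i a_{ji}\, y_i

% Sigmoidal model: the same weighted sum passed through a saturating function
\frac{dy_j}{dt} = \sigma\!\Big(\sum_i a_{ji}\, y_i\Big),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}

% Fitting the linear model for gene j with an L1 (LASSO) penalty,
% using finite differences over time points t
\min_{a_{j\cdot}} \; \sum_t \Big(\frac{\Delta y_j(t)}{\Delta t} - \sum_i a_{ji}\, y_i(t)\Big)^2
+ \lambda \sum_i |a_{ji}|
```

A larger \lambda drives more of the a_{ji} to exactly zero, which is how the penalty limits the number of inferred regulators per gene.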

40 Boolean networks Considering each gene to be either on or off
Treat the gene regulatory network as a Boolean network (similar to an electric circuit). Expression of a gene at time t+1 depends on the expression of the genes that regulate it at time t. Goal: find the logical relationships between genes. Image credit: Akutsu and Miyano, Pacific Symposium on Biocomputing 4:17-28, (1999)
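The update rule (state at time t+1 computed from state at time t) can be sketched with a tiny synchronous simulator. The three genes and their logic rules below are invented for illustration and do not come from the cited paper.

```python
def step(state, rules):
    """Compute the state at time t+1 from the state at time t (synchronous update)."""
    return {gene: rule(state) for gene, rule in rules.items()}


def simulate(state, rules, n_steps):
    """Return the list of states from time 0 to time n_steps."""
    trajectory = [dict(state)]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1], rules))
    return trajectory


# Hypothetical network: A is a constant input, A activates B unless C is on,
# and B activates C (a delayed negative feedback loop on B).
RULES = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"] and not s["C"],
    "C": lambda s: s["B"],
}

if __name__ == "__main__":
    start = {"A": True, "B": False, "C": False}
    for t, state in enumerate(simulate(start, RULES, 5)):
        print(t, state)
```

The inference problem on the slide is the reverse direction: given observed on/off trajectories, find rules consistent with them.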

41 From binding data Input: binding signals of transcription factors in the whole genome, usually from ChIP-chip or ChIP-seq, or from motifs (best to combine both). Output: TF regulatory network. Difficulties: Finding binding sites (peak calling, motif analysis). Associating binding sites with target genes (promoters, e.g., 500 bp upstream of the transcription start site; more difficult for distal binding sites; expression patterns could help). Evaluating functional effects of binding (strong vs. weak, transient binding).
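The promoter-based association step can be sketched as follows, using the 500 bp upstream window mentioned above. The data layout (peak summits as tuples, genes as a dictionary) and all names and coordinates are hypothetical; distal sites are simply left unassigned here.

```python
def promoter_interval(tss, strand, upstream=500):
    """Return the (start, end) window upstream of a TSS, respecting strand."""
    if strand == "+":
        return tss - upstream, tss
    return tss, tss + upstream  # upstream of a minus-strand gene is to the right


def assign_targets(peaks, genes, upstream=500):
    """Map each TF to the genes whose promoter window contains a peak summit.

    peaks: iterable of (tf_name, chromosome, summit_position)
    genes: dict of gene_name -> (chromosome, tss_position, strand)
    """
    targets = {}
    for tf, chrom, summit in peaks:
        for gene, (gene_chrom, tss, strand) in genes.items():
            start, end = promoter_interval(tss, strand, upstream)
            if chrom == gene_chrom and start <= summit <= end:
                targets.setdefault(tf, set()).add(gene)
    return targets


if __name__ == "__main__":
    genes = {"geneA": ("chr1", 10000, "+"), "geneB": ("chr1", 20000, "-")}
    peaks = [("TF1", "chr1", 9800), ("TF1", "chr1", 20300), ("TF2", "chr1", 15000)]
    print(assign_targets(peaks, genes))
```

The nested loop is quadratic; for genome-scale data an interval index or sorted sweep would be the usual replacement.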

42 Combining both types of data
Use expression data to infer an initial network. Identify potential regulators. Search for binding motifs of these regulators. Incorporate the global occurrence of these motifs at gene promoters to refine the network. Image credit: Tamada et al., Bioinformatics 19(Suppl.2):ii227-ii236, (2003)

43 Identification of non-coding RNAs
High-throughput experiments have recently shown that a vast amount of DNA is transcribed into RNA. What are these transcripts? Experimental artifacts? Unannotated protein-coding genes? Non-functional transcripts? Functional non-coding RNAs? Pseudogenes?

44 Prediction of miRNA targets
Table credit: Peterson et al., Frontiers in Genetics 5:23, (2014)

45 Seed match Table source: TargetScan
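The table for this slide is not in the transcript, so the following is only a sketch of the basic idea of a seed match: a target site in the mRNA is the reverse complement of the miRNA seed. It assumes the seed is miRNA positions 2-8 and considers perfect Watson-Crick matches only; the example sequences and function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=2, end=8):
    """Return the mRNA site (5'->3') complementary to miRNA positions start..end."""
    seed = mirna[start - 1 : end]          # positions are 1-based, inclusive
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of perfect seed matches in a 3' UTR."""
    site = seed_site(mirna)
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i : i + len(site)] == site]


if __name__ == "__main__":
    mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # example miRNA sequence (5'->3')
    utr = "AAACUACCUCAGG"             # made-up 3' UTR fragment
    print(seed_site(mirna))
    print(find_seed_matches(mirna, utr))
```

Practical predictors add further evidence on top of the seed match, such as site context and conservation.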

46 Construction of expression models
Given the many different mechanisms involved in gene regulation, how are they related to each other? Are they redundant? Do they simply add to each other, or do they have synergistic effects? Which have more impact on final expression levels? What are their time scales? When is each mechanism used?

47 Construction of expression models
Modeling and prediction. An indirect way to estimate how good a model is: evaluating the accuracy of predicted expression. Prediction of: expression level (regression: yi ≈ f(xi); classification: (yi > t) ≈ f(xi)); differential expression.
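The two prediction settings can be contrasted in a small sketch. It assumes a single feature per gene and a straight-line f fitted by least squares, which is far simpler than the models used in practice; the data values and function names are made up.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ≈ slope * x + intercept for one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def classification_accuracy(xs, ys, slope, intercept, threshold):
    """Fraction of genes where (predicted > t) agrees with (observed > t)."""
    hits = sum(((slope * x + intercept) > threshold) == (y > threshold)
               for x, y in zip(xs, ys))
    return hits / len(xs)


if __name__ == "__main__":
    signal = [0.0, 1.0, 2.0, 3.0]      # e.g., a chromatin feature per gene
    expression = [1.0, 3.0, 5.0, 7.0]  # observed expression per gene
    slope, intercept = fit_line(signal, expression)
    print(slope, intercept)
    print(classification_accuracy(signal, expression, slope, intercept, threshold=4.0))
```

In a real evaluation the accuracy would be measured on genes held out from fitting, not on the training data as in this toy run.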

48 Construction of expression models
Chromatin features and expression. Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

49 Construction of expression models
Model construction and accuracy. Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

50 “Histone code” hypothesis
The statistical models are good, but too complex for humans to interpret. Is there a simple set of rules (i.e., a “code”) that can easily tell the expression level of a gene? Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

51 Inferring epigenetic signals
Can we infer epigenetic signals (e.g., open chromatin, DNA methylation or histone modifications) for a specific cell type from DNA sequence alone?

52 Deep learning of functional activity
Image credit: Kelley et al., Genome Research 26(7): , (2016)

53 Summary “Gene expression” is a general term with several possible meanings. Gene expression is regulated by many mechanisms, including (but not limited to) transcription factor binding, DNA long-range interactions, DNA methylation, chromatin structure, histone modifications and microRNA-mRNA binding. A lot of new genome-wide data. Many emerging research topics in CBB.

