Lecture 4. Topics in Gene Regulation and Epigenomics (Basics) The Chinese University of Hong Kong CSCI5050 Bioinformatics and Computational Biology.

1 Lecture 4. Topics in Gene Regulation and Epigenomics (Basics) The Chinese University of Hong Kong CSCI5050 Bioinformatics and Computational Biology

2 Lecture outline
1. Introduction to gene regulation and epigenetics
2. Problems in computational biology and bioinformatics
Last update: 26-Sep-2015 | CSCI5050 Bioinformatics and Computational Biology | Kevin Yip-cse-cuhk | Fall 2015

3 INTRODUCTION TO GENE REGULATION AND EPIGENETICS Part 1

4 Gene regulation
Here defined as the control of the amount and the products of a gene
Amount:
  – Number of transcripts produced
  – Number of proteins produced
Products:
  – RNAs
      Isoforms
      Modifications
  – Proteins
      Modifications

5 Gene “expression”
Gene expression is a general term used to indicate the production of gene products
More specific terms:
  – Transcription rate (number of new transcripts per unit time)
  – Transcript level (total number of transcripts in the cell)
  – Translation rate
  – Protein level
All these are correlated but not identical, and sometimes only weakly correlated

6 Gene regulation
Expression of genes needs to be tightly regulated
  – Differentiation into different cell types
  – Response to environmental conditions
How are genes regulated?
  – Transcriptional
  – Post-transcriptional
  – Translational
  – Post-translational
Analogy: lighting control

7 A simple illustration
[Figure: schematic with elements labeled G1–G4, P1, P3, P5–P7, Me and Ac]

8 A simple illustration
[Figure: the schematic with elements labeled G1–G3, P1, P3, P5–P7, Me and Ac, annotated with the following mechanisms]
  – Histone modifications
  – Chromatin accessibility
  – Protein-protein interactions and DNA long-range interactions
  – Protein-RNA interactions
  – miRNA-mRNA interactions
  – DNA methylation
  – Transcription factor binding

9 More details and other mechanisms
Transcriptional regulation
  – Transcription factors
      Binding to promoter vs. distal elements (e.g., enhancers)
      Activators vs. repressors
Post-transcriptional regulation
  – Capping
  – Poly-adenylation
  – Splicing
  – RNA editing
  – mRNA degradation
Translation
  – Translational repression
Post-translational
  – Protein modifications (e.g., phosphorylation)
Image source: http://www.emunix.emich.edu/~rwinning/genetics/eureg.htm

10 Epigenetics
Wikipedia: “the study of heritable changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence”
  – Heritable: can be passed on to offspring (daughter cells)
  – Mechanisms other than changes in DNA
Same DNA, different outcomes

11 Active and inactive epigenetic signals
DNA methylation
Chromatin remodeling
Histone modifications
RNA transcripts
...
Image credit: Zhou et al., Nature Reviews Genetics 12(1):7-18, (2011)

12 DNA methylation
Methyl group (-CH3) added to cytosine in eukaryotic DNA, usually next to a guanine (in a CpG dinucleotide)
  – Forming 5-methylcytosine
      Can be further modified into 5-hydroxymethylcytosine
Hypermethylation at promoter can cause gene repression
Recent studies have suggested links between DNA methylation and
  – Protein binding
  – Transcriptional elongation
  – Splicing
Gene imprinting: parent-of-origin-specific expression
Implications in diseases
De novo vs. maintenance methylation
Image source: http://www.zymoresearch.com/media/images/products/D5405-2.jpg, http://missinglink.ucsf.edu/lm/genes_and_genomes/methylation.html
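As a small illustration of the CpG context mentioned on this slide, the following sketch lists the positions in a DNA string where a cytosine is immediately followed by a guanine. The function name `cpg_positions` and the example sequence are made up for illustration; real methylation analyses work on aligned bisulfite-sequencing reads rather than on a bare sequence.

```python
from typing import List


def cpg_positions(seq: str) -> List[int]:
    """Return 0-based positions of the C in every CpG dinucleotide of seq."""
    s = seq.upper()
    # A CpG site is a C directly followed by a G on the same strand.
    return [i for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G"]


if __name__ == "__main__":
    example = "ACGTCGCG"  # hypothetical sequence
    print(cpg_positions(example))
```

Each returned position is a candidate methylation site; whether it is actually methylated has to come from experimental data.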

13 Chromatin remodeling
Chromatin: compact structure of DNA and proteins
  – DNA wraps around histone proteins to form nucleosomes
  – Nucleosome positioning can be changed dynamically, affecting DNA accessibility (e.g., to binding proteins)
Image credit: Wikipedia

14 Histone modifications
Modification of specific residues on histone proteins
  – Acetylation, methylation, phosphorylation, ubiquitination, etc.
  – Nomenclature: H3K4me3 (histone protein H3, lysine 4, tri-methylation)
  – Histone modifications give different types of signals in gene regulation
Image credit: Zhou et al., Nature Reviews Genetics 12(1):7-18, (2011)
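The nomenclature on this slide is regular enough to parse mechanically. The sketch below splits a mark name such as H3K4me3 into histone, residue, position and modification. The regular expression, the `HistoneMark` type and the two lookup tables are assumptions made for this example and cover only a few common residues and modifications.

```python
import re
from typing import NamedTuple


class HistoneMark(NamedTuple):
    histone: str
    residue: str
    position: int
    modification: str


# Histone name, one-letter residue, residue position, modification suffix.
_PATTERN = re.compile(r"^(H\d[A-Z]?)([A-Z])(\d+)([a-z]+\d?)$")

# Small illustrative lookup tables, not an exhaustive list.
RESIDUES = {"K": "lysine", "R": "arginine", "S": "serine", "T": "threonine"}
MODIFICATIONS = {
    "me1": "mono-methylation",
    "me2": "di-methylation",
    "me3": "tri-methylation",
    "ac": "acetylation",
    "ph": "phosphorylation",
    "ub": "ubiquitination",
}


def parse_histone_mark(name: str) -> HistoneMark:
    """Split a name like 'H3K4me3' into its four components."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised histone mark name: {name!r}")
    histone, residue, position, mod = match.groups()
    return HistoneMark(
        histone,
        RESIDUES.get(residue, residue),   # fall back to the raw letter
        int(position),
        MODIFICATIONS.get(mod, mod),      # fall back to the raw suffix
    )


if __name__ == "__main__":
    print(parse_histone_mark("H3K4me3"))
```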

15 Non-coding RNA
There are different types of functional RNA that are not translated into proteins
Type                   Abbreviation  Function
Ribosomal RNA          rRNA          Translation
Transfer RNA           tRNA          Translation
Small nuclear RNA      snRNA         Splicing
Small nucleolar RNA    snoRNA        Nucleotide modifications
MicroRNA               miRNA         Gene regulation
Small interfering RNA  siRNA         Gene regulation
…                      …             …

16 MicroRNA
Short (~22 nucleotides) RNAs that regulate gene expression by promoting mRNA degradation or repressing translation
Image credit: Wikipedia; Sun et al., Annual Review of Biomedical Engineering 12:1-27, (2010)

17 Gene regulation and epigenetics
Some mechanisms are known to regulate gene expression. For example:
  – Transcription factor binding can activate or repress transcription
  – miRNA-mRNA binding can promote mRNA cleavage or repress translation
Some signals are correlated with expression, but the causal direction is not certain (or not fixed). For example:
  – Promoter DNA methylation and transcriptional repression
  – Histone modifications and expression levels
The different mechanisms are not independent.

18 High-throughput methods (recap)
Protein-DNA binding (ChIP-seq, ChIP-exo, ...)
DNA long-range interactions (ChIA-PET, Hi-C, TCC, ...) [project]
DNA methylation (bisulfite sequencing, RRBS, MeDIP-seq, MBDCap-seq, ...) [project]
Open chromatin (DNase-seq, FAIRE-seq, ...)
Histone modifications (ChIP-seq)
Gene expression (RNA-seq, CAGE, ...), isoforms [project]
Protein-RNA binding (CLIP-Seq, HITS-CLIP, PAR-CLIP, RIP-seq, ...) [project]
...

19 PROBLEMS IN COMPUTATIONAL BIOLOGY AND BIOINFORMATICS Part 2

20 Some related CBB problems
Analysis of chromatin patterns [project]
Identification of regulatory elements [lecture, discussion paper]
Reconstruction of transcription factor (TF) regulatory networks [project]
Identification of non-coding RNAs [project]
Prediction of miRNA targets [project]
Construction of gene expression models [project]

21 Analysis of chromatin patterns
Computational tasks:
  – Segmentation of the human genome
      Fixed-size bins
      Based on annotation
      Unsupervised clustering
        – Hidden Markov models
      Supervised classification
  – Data aggregation and integration
  – Large-scale correlations
      Learning of signal shapes
  – Visualization
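The simplest segmentation listed here, fixed-size bins, can be sketched as follows: read positions within one region are counted per bin. The function name `bin_counts`, the 0-based coordinates and the example numbers are assumptions for illustration only.

```python
from typing import Iterable, List


def bin_counts(read_positions: Iterable[int], region_length: int, bin_size: int) -> List[int]:
    """Count how many read positions fall into each fixed-size bin of a region."""
    n_bins = (region_length + bin_size - 1) // bin_size  # ceiling division
    counts = [0] * n_bins
    for pos in read_positions:
        if 0 <= pos < region_length:          # ignore positions outside the region
            counts[pos // bin_size] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical read start positions in a 500 bp region, 200 bp bins.
    print(bin_counts([0, 5, 199, 200, 450], region_length=500, bin_size=200))
```

The resulting per-bin counts are the kind of input that clustering or a hidden Markov model would then segment into chromatin states.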

22 Genome segmentation
Using chromatin state to segment the genome
  – Hidden Markov model
  – Clustering
Annotate identified states using biological knowledge
Image credit: Ernst and Kellis, Nature Methods 9(3):215-216, (2012)
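To make the hidden Markov model idea concrete, here is a minimal Viterbi decoder that assigns one hidden chromatin state to each genomic bin. It is a sketch under strong assumptions: two hypothetical states ("active", "inactive"), a single binary observation per bin (mark present or absent), and hand-picked probabilities. Published segmentation tools use many marks, more states, and learn the parameters from data.

```python
import math
from typing import Dict, Hashable, List, Sequence


def viterbi(
    obs: Sequence[Hashable],
    states: Sequence[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[Hashable, float]],
) -> List[str]:
    """Return the most probable hidden-state path for obs (log-space Viterbi)."""
    if not obs:
        return []
    # scores[s] = best log-probability of any path ending in state s
    scores = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}
    backpointers = []
    for o in obs[1:]:
        new_scores, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: scores[p] + math.log(trans_p[p][s]))
            new_scores[s] = (
                scores[best_prev] + math.log(trans_p[best_prev][s]) + math.log(emit_p[s][o])
            )
            ptr[s] = best_prev
        scores = new_scores
        backpointers.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical two-state model; all numbers are illustrative assumptions.
STATES = ["active", "inactive"]
START = {"active": 0.5, "inactive": 0.5}
TRANS = {
    "active": {"active": 0.9, "inactive": 0.1},
    "inactive": {"active": 0.1, "inactive": 0.9},
}
EMIT = {
    "active": {1: 0.8, 0: 0.2},    # mark usually present in active bins
    "inactive": {1: 0.1, 0: 0.9},  # mark usually absent in inactive bins
}

if __name__ == "__main__":
    print(viterbi([1, 1, 1, 0, 0, 0], STATES, START, TRANS, EMIT))
```

Because switching states is penalised by the transition probabilities, a single isolated signal in a run of empty bins is smoothed over rather than creating a one-bin segment.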

23 Global chromatin patterns
Many recent findings relate chromatin patterns to other features
  – Global example: histone modifications, recombination rates and chromosome 1D and 3D structures in C. elegans
Image credit: Gerstein et al., Science 330(6012):1775-1787, (2010)

24 Local chromatin patterns
  – Histone modifications and protein binding at promoters and enhancers in human
Image credit: Heintzman et al., Nature Genetics 39(3):311-318, (2007)

25 Identifying regulatory elements
There are different types of protein-binding regions in the DNA
  – Promoters
  – Enhancers
  – Silencers
  – Insulators
  – ...
How to locate them in the genome?
Image credit: Raab and Kamakaka, Nature Reviews Genetics 11(6):439-446, (2010)

26 Identifying regulatory elements
Some useful information:
  – Genomic location
      E.g., promoters are around transcription start sites
  – Evolutionary conservation
      Functional regions are more conserved
  – Protein binding signals and motifs
      E.g., EP300 at enhancers, CTCF at insulators
  – Chromatin features
      E.g., DNase I hypersensitivity, H3K4me1 and H3K27ac at active enhancers
  – Reporter assays
  – ...
Difficulty: integrating different types of information

27 Reconstruction of TF network
Goals:
  – Identifying TF binding sites
  – Determining the target genes of each TF
      In different cell types
      In different conditions
  – Deducing how gene expression is regulated by TFs
  – Studying how TFs interact with each other
Methods:
  1. From expression data
  2. Sequence-based (motif analysis)
  3. From binding experiments
  – Sign of regulation (activation vs. repression) usually not determined for #2 and #3

28 Expression-based methods
Input: expression levels of genes
  – Usually from microarrays
  – Often time series data
Output: a network (i.e., directed graph)
  – Each node is a gene (and its protein product)
  – An A → B edge means A is a TF and it regulates B
Types:
  – Differential equations
  – Probabilistic networks
  – Boolean networks

29 Expression-based methods
Differential equations
  – Models (y_j: expression level of gene j, a_ji: influence of TF i on gene j):
      Linear
      Sigmoidal
      ...
  – Methods:
      Solve system of equations to get best-fit parameter values
  – Difficulties:
      Many parameters when there are many TFs
        – Insufficient training data
        – L1 (LASSO) regularization to control the number of non-zero variables
      Long running time
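The model equations on the original slide were images and did not survive in this transcript. The block below gives one commonly used form for each model, written with the slide's symbols y_j and a_ji. These forms, the logistic function σ, the regularization weight λ and the time index t are assumptions for illustration and are not necessarily the exact equations shown in the lecture.

```latex
% Linear model (assumed form): the rate of change of gene j is a weighted sum of TF levels
\[
  \frac{dy_j}{dt} = \sum_i a_{ji}\, y_i
\]

% Sigmoidal model (assumed form): the same weighted sum passed through a saturating function
\[
  \frac{dy_j}{dt} = \sigma\!\Big(\sum_i a_{ji}\, y_i\Big),
  \qquad \sigma(z) = \frac{1}{1 + e^{-z}}
\]

% L1 (LASSO) regularized least-squares fit for the linear model of gene j
\[
  \hat{a}_j = \arg\min_{a_j} \sum_t \Big(\dot{y}_j(t) - \sum_i a_{ji}\, y_i(t)\Big)^2
              + \lambda \sum_i \lvert a_{ji} \rvert
\]
```

The L1 penalty drives many a_ji to exactly zero, which is how it limits the number of non-zero variables when training data are scarce.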

30 Boolean networks
Considering each gene to be either on or off
Treat the gene regulatory network as a Boolean network (similar to an electric circuit)
  – Expression of a gene at time t+1 depends on the expression of genes that regulate it at time t
  – Goal: Find the logical relationships between genes
Image credit: Akutsu and Miyano, Pacific Symposium on Biocomputing 4:17-28, (1999)
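The update rule on this slide, where a gene's state at time t+1 is a logical function of its regulators at time t, can be simulated in a few lines. The three-gene network below (A, B, C and their rules) is invented for illustration. Inferring such rules from expression data, which is the stated goal, is the hard part and is not attempted here.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]
Rules = Dict[str, Callable[[State], bool]]


def step(state: State, rules: Rules) -> State:
    """Synchronous update: every gene reads the state at time t to get time t+1."""
    # `rules` must define an update function for every gene in the network.
    return {gene: rule(state) for gene, rule in rules.items()}


def simulate(state: State, rules: Rules, n_steps: int) -> List[State]:
    """Return the trajectory [state(0), state(1), ..., state(n_steps)]."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        trajectory.append(state)
    return trajectory


# Hypothetical rules: C represses A, A activates B, A and B jointly activate C.
RULES: Rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and s["B"],
}

if __name__ == "__main__":
    start = {"A": True, "B": False, "C": False}
    for t, s in enumerate(simulate(start, RULES, 5)):
        print(t, s)
```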

31 From binding data
Input: binding signals of transcription factors in the whole genome
  – Usually from ChIP-chip or ChIP-seq
  – Or from motifs
  – (Best to combine both)
Output: TF regulatory network
Difficulties:
  – Finding binding sites
      Peak calling
      Motif analysis
  – Associating binding sites with target genes
      Promoters (e.g., 500 bp upstream of transcription start site)
      More difficult for distal binding sites
      Expression patterns could help
  – Evaluating functional effects of binding (strong vs. weak, transient binding)
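The promoter-based association step can be sketched as a strand-aware window test: a peak is assigned to a gene if its summit lies within 500 bp upstream of that gene's transcription start site (TSS). The `Gene` and `Peak` records, the inclusive window boundaries and the example coordinates are assumptions for illustration. Distal binding sites, which the slide notes are harder, are simply left unassigned here.

```python
from typing import Dict, List, NamedTuple, Sequence, Tuple


class Gene(NamedTuple):
    name: str
    chrom: str
    tss: int
    strand: str  # "+" or "-"


class Peak(NamedTuple):
    tf: str
    chrom: str
    summit: int


def promoter_window(gene: Gene, upstream: int = 500) -> Tuple[int, int]:
    """Return (low, high) coordinates of the region upstream of the TSS."""
    if gene.strand == "+":
        return gene.tss - upstream, gene.tss
    # On the minus strand, "upstream" means larger coordinates.
    return gene.tss, gene.tss + upstream


def assign_targets(
    peaks: Sequence[Peak], genes: Sequence[Gene], upstream: int = 500
) -> Dict[str, List[str]]:
    """Map each TF to the genes whose promoter window contains one of its peaks."""
    targets: Dict[str, List[str]] = {}
    for peak in peaks:
        for gene in genes:
            low, high = promoter_window(gene, upstream)
            if peak.chrom == gene.chrom and low <= peak.summit <= high:
                targets.setdefault(peak.tf, []).append(gene.name)
    return targets


if __name__ == "__main__":
    genes = [Gene("g1", "chr1", 1000, "+"), Gene("g2", "chr1", 5000, "-")]
    peaks = [Peak("TF1", "chr1", 800), Peak("TF1", "chr1", 5300), Peak("TF2", "chr1", 3000)]
    print(assign_targets(peaks, genes))
```

The nested loop is quadratic in the number of peaks and genes; a genome-scale version would sort by coordinate or use an interval index instead.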

32 Combining both types of data
1. Use expression data to infer initial network
2. Identify potential regulators
3. Search for binding motifs of these regulators
4. Incorporate global occurrence of these motifs at gene promoters to refine the network
Image credit: Tamada et al., Bioinformatics 19(Suppl.2):ii227-ii236, (2003)

33 Identification of non-coding RNAs
High-throughput experiments have recently shown that a vast amount of DNA is transcribed into RNA
What are they?
  – Experimental artifacts?
  – Unannotated protein-coding genes?
  – Non-functional transcripts?
  – Functional non-coding RNAs?
  – Pseudogenes?

34 Construction of expression models
Given the many different mechanisms involved in gene regulation, how are they related to each other?
  – Are they redundant?
  – Do they simply add to each other, or have synergistic effects?
  – Which have more impact on final expression levels?
  – What are their time scales?
  – When is each mechanism used?

35 Construction of expression models
Modeling and prediction
  – An indirect way to estimate how good a model is: evaluating the accuracy of predicted expression
Prediction of:
  – Expression level
      Regression: y_i ≈ f(x_i)
      Classification: I(y_i > t) ≈ f(x_i)
  – Differential expression
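The two prediction settings differ only in the target: regression predicts the expression value itself, while classification predicts whether it exceeds a threshold t. The sketch below fits a one-feature least-squares line as a stand-in for f and derives binary labels from a threshold. The single feature, the function names and the toy numbers are assumptions for illustration. Real expression models use many chromatin features and more flexible learners.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def to_labels(ys: Sequence[float], t: float) -> List[int]:
    """Classification target: 1 if expression exceeds threshold t, else 0."""
    return [1 if y > t else 0 for y in ys]


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of genes whose predicted label matches the actual label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    # Toy data: x = a chromatin signal at the promoter, y = expression level.
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    slope, intercept = fit_line(xs, ys)
    predicted = [slope * x + intercept for x in xs]
    print(slope, intercept)
    print(accuracy(to_labels(predicted, 4.0), to_labels(ys, 4.0)))
```

In practice the accuracy should be measured on genes held out from fitting, otherwise it overstates how good the model is.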

36 Construction of expression models
Chromatin features and expression
Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

37 Construction of expression models
Model construction and accuracy
Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

38 “Histone code” hypothesis
The statistical models are good, but too complex for humans to interpret
Is there a simple set of rules (i.e., a “code”) that can easily tell the expression level of a gene?
Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

39 Summary
“Gene expression” is a general term with several possible meanings
Gene expression is regulated by many mechanisms, including (but not limited to)
  – Transcription factor binding
  – DNA long-range interactions
  – DNA methylation
  – Chromatin structure
  – Histone modifications
  – MicroRNA-mRNA binding
A lot of new genome-wide data
Many emerging research topics in CBB