Lecture 4. Topics in Gene Regulation and Epigenomics (Basics) The Chinese University of Hong Kong CSCI5050 Bioinformatics and Computational Biology.


Lecture outline
1. Introduction to gene regulation and epigenetics
2. Problems in computational biology and bioinformatics
Last update: 26-Sep-2015 | CSCI5050 Bioinformatics and Computational Biology | Kevin Yip-cse-cuhk | Fall 2015

Part 1: INTRODUCTION TO GENE REGULATION AND EPIGENETICS

Gene regulation
Here defined as the control of the amount and the products of a gene
Amount:
  – Number of transcripts produced
  – Number of proteins produced
Products:
  – RNAs
      Isoforms
      Modifications
  – Proteins
      Modifications

Gene “expression”
Gene expression is a general term used to indicate the production of gene products
More specific terms:
  – Transcription rate (number of new transcripts per unit time)
  – Transcript level (total number of transcripts in the cell)
  – Translation rate
  – Protein level
All of these are correlated but not identical, and sometimes only weakly correlated

Gene regulation
Expression of genes needs to be tightly regulated
  – Differentiation into different cell types
  – Response to environmental conditions
How are genes regulated?
  – Transcriptional
  – Post-transcriptional
  – Translational
  – Post-translational
Analogy: lighting control

A simple illustration
[Figure: genes G1 to G4, proteins P1, P3, P5, P6 and P7, and Me/Ac marks]

A simple illustration
[Figure: genes G1 to G3, proteins P1, P3, P5, P6 and P7, and Me/Ac marks, annotated with the following mechanisms]
  – Histone modifications
  – Chromatin accessibility
  – Protein-protein interactions and DNA long-range interactions
  – Protein-RNA interactions
  – miRNA-mRNA interactions
  – DNA methylation
  – Transcription factor binding

More details and other mechanisms
Transcriptional regulation
  – Transcription factors
      Binding to promoter vs. distal elements (e.g., enhancers)
      Activators vs. repressors
Post-transcriptional regulation
  – Capping
  – Poly-adenylation
  – Splicing
  – RNA editing
  – mRNA degradation
Translation
  – Translational repression
Post-translational
  – Protein modifications (e.g., phosphorylation)

Epigenetics
Wikipedia: “the study of heritable changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence”
  – Heritable: can be passed on to offspring (daughter cells)
  – Mechanisms other than changes in DNA
      Same DNA, different outcomes

Active and inactive epigenetic signals
  – DNA methylation
  – Chromatin remodeling
  – Histone modifications
  – RNA transcripts
  – ...
Image credit: Zhou et al., Nature Reviews Genetics 12(1):7-18, (2011)

DNA methylation
Methyl group (-CH3) added to cytosine in eukaryotic DNA, usually next to a guanine (in a CpG dinucleotide)
  – Forming 5-methylcytosine
      Can be further modified into 5-hydroxymethylcytosine
Hypermethylation at a promoter can cause gene repression
Recent studies have suggested links between DNA methylation and
  – Protein binding
  – Transcriptional elongation
  – Splicing
Gene imprinting: parent-specific expression
Implications in diseases
De novo vs. maintenance methylation
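
To make the CpG context concrete, here is a minimal sketch that lists the candidate methylation sites in a DNA string. The function name `cpg_positions` and the 0-based coordinate convention are assumptions made for illustration; real pipelines work on aligned bisulfite reads rather than raw strings.

```python
def cpg_positions(seq):
    """Return 0-based positions of the cytosine in every CpG dinucleotide.

    Hypothetical helper for illustration: a CpG is simply a 'C' followed
    immediately by a 'G' on the same strand.
    """
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


if __name__ == "__main__":
    example = "ACGTCGCG"
    sites = cpg_positions(example)
    print(f"{len(sites)} CpG sites at positions {sites}")
```

Each reported position is a cytosine that could carry a methyl group; whether it actually does has to come from an assay such as bisulfite sequencing.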

Chromatin remodeling
Chromatin: compact structure of DNA and proteins
  – DNA wraps around histone proteins to form nucleosomes
  – Nucleosome positioning can change dynamically, affecting DNA accessibility (e.g., to binding proteins)
Image credit: Wikipedia

Histone modifications
Modification of specific residues on histone proteins
  – Acetylation, methylation, phosphorylation, ubiquitination, etc.
  – Nomenclature: H3K4me3 (histone protein H3, lysine 4, tri-methylation)
  – Histone modifications give different types of signals in gene regulation
Image credit: Zhou et al., Nature Reviews Genetics 12(1):7-18, (2011)
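
The nomenclature can be read mechanically, as the following sketch shows. The parser, its name `parse_mark`, and the small lookup tables are assumptions for illustration; they cover only the core histones and a few common residue and modification codes, not the full naming scheme.

```python
import re
from typing import NamedTuple, Optional


class HistoneMark(NamedTuple):
    histone: str            # e.g. "H3"
    residue: str            # e.g. "lysine"
    position: int           # residue number in the histone tail
    modification: str       # e.g. "methylation"
    degree: Optional[int]   # 1, 2 or 3 for mono-, di-, tri-; None if not given


# Illustrative subsets only; real nomenclature has more cases.
_RESIDUES = {"K": "lysine", "R": "arginine", "S": "serine", "T": "threonine"}
_MODS = {"me": "methylation", "ac": "acetylation",
         "ph": "phosphorylation", "ub": "ubiquitination"}
_PATTERN = re.compile(r"^(H2A|H2B|H3|H4)([KRST])(\d+)(me|ac|ph|ub)([123])?$")


def parse_mark(name):
    """Split a mark name such as 'H3K4me3' into its components."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized histone mark: {name!r}")
    histone, residue, position, mod, degree = match.groups()
    return HistoneMark(histone, _RESIDUES[residue], int(position),
                       _MODS[mod], int(degree) if degree else None)


if __name__ == "__main__":
    print(parse_mark("H3K4me3"))
    print(parse_mark("H3K27ac"))
```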

Non-coding RNA
There are different types of functional RNA that are not translated into proteins

Type                  Abbreviation  Function
Ribosomal RNA         rRNA          Translation
Transfer RNA          tRNA          Translation
Small nuclear RNA     snRNA         Splicing
Small nucleolar RNA   snoRNA        Nucleotide modifications
MicroRNA              miRNA         Gene regulation
Small interfering RNA siRNA         Gene regulation
...                   ...           ...

MicroRNA
Short (~22 nucleotides) RNAs that regulate gene expression by promoting mRNA degradation or repressing translation
Image credit: Wikipedia; Sun et al., Annual Review of Biomedical Engineering 12:1-27, (2010)

Gene regulation and epigenetics
Some mechanisms are known to regulate gene expression. For example:
  – Transcription factor binding can activate or repress transcription
  – miRNA-mRNA binding can promote mRNA cleavage or repress translation
Some signals are correlated with expression, but the causal direction is not certain (or not fixed). For example:
  – Promoter DNA methylation and transcriptional repression
  – Histone modifications and expression levels
The different mechanisms are not independent.

High-throughput methods (recap)
  – Protein-DNA binding (ChIP-seq, ChIP-exo, ...)
  – DNA long-range interactions (ChIA-PET, Hi-C, TCC, ...) [project]
  – DNA methylation (bisulfite sequencing, RRBS, MeDIP-seq, MBDCap-seq, ...) [project]
  – Open chromatin (DNase-seq, FAIRE-seq, ...)
  – Histone modifications (ChIP-seq)
  – Gene expression (RNA-seq, CAGE, ...), isoforms [project]
  – Protein-RNA binding (CLIP-Seq, HITS-CLIP, PAR-CLIP, RIP-seq, ...) [project]
  – ...

Part 2: PROBLEMS IN COMPUTATIONAL BIOLOGY AND BIOINFORMATICS

Some related CBB problems
  – Analysis of chromatin patterns [project]
  – Identification of regulatory elements [lecture, discussion paper]
  – Reconstruction of transcription factor (TF) regulatory networks [project]
  – Identification of non-coding RNAs [project]
  – Prediction of miRNA targets [project]
  – Construction of gene expression models [project]

Analysis of chromatin patterns
Computational tasks:
  – Segmentation of the human genome
      Fixed-size bins
      Based on annotation
      Unsupervised clustering
        – Hidden Markov models
      Supervised classification
  – Data aggregation and integration
  – Large-scale correlations
      Learning of signal shapes
  – Visualization
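
The simplest segmentation listed above, fixed-size bins, can be sketched as follows. The function name `bin_signal`, the use of read start positions as the signal, and the 0-based coordinates are assumptions for illustration, not the course's prescribed method.

```python
def bin_signal(positions, chrom_length, bin_size):
    """Count signal positions (e.g. read starts) in fixed-size genomic bins.

    positions: iterable of 0-based coordinates on one chromosome.
    Returns a list with one count per bin; the last bin may be shorter.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    n_bins = (chrom_length + bin_size - 1) // bin_size  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        if 0 <= pos < chrom_length:  # silently skip out-of-range positions
            counts[pos // bin_size] += 1
    return counts


if __name__ == "__main__":
    reads = [0, 150, 199, 200, 950]
    print(bin_signal(reads, chrom_length=1000, bin_size=200))
```

The resulting per-bin vectors, one per chromatin mark, are the kind of input that clustering or a hidden Markov model would then operate on.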

Genome segmentation
Using chromatin state to segment the genome
  – Hidden Markov model
  – Clustering
Annotate identified states using biological knowledge
Image credit: Ernst and Kellis, Nature Methods 9(3): , (2012)
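
As a rough sketch of how a hidden Markov model segments binned data, the following runs the Viterbi algorithm on a toy two-state model. The state names, the single binary mark per bin, and all probabilities are invented for illustration; published chromatin-state models use many marks and learn their parameters from data.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path (computed in log space)."""
    if not observations:
        return []
    log = math.log
    scores = {s: log(start_p[s]) + log(emit_p[s][observations[0]])
              for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best predecessor state for ending in state s at this bin.
            best = max(states, key=lambda p: scores[p] + log(trans_p[p][s]))
            new_scores[s] = (scores[best] + log(trans_p[best][s])
                             + log(emit_p[s][obs]))
            pointers[s] = best
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy parameters (assumptions): 1 = mark present in the bin, 0 = absent.
STATES = ("active", "inactive")
START = {"active": 0.5, "inactive": 0.5}
TRANS = {"active": {"active": 0.9, "inactive": 0.1},
         "inactive": {"active": 0.1, "inactive": 0.9}}
EMIT = {"active": {1: 0.8, 0: 0.2},
        "inactive": {1: 0.1, 0: 0.9}}

if __name__ == "__main__":
    bins = [0, 0, 1, 1, 1, 0, 0]
    print(viterbi(bins, STATES, START, TRANS, EMIT))
```

Because switching states is penalized, a run of marked bins is called as one active segment, while an isolated marked bin tends to be absorbed into the surrounding inactive state.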

Global chromatin patterns
Many recent findings relate chromatin patterns to other features
  – Global example: histone modifications, recombination rates and chromosome 1D and 3D structures in C. elegans
Image credit: Gerstein et al., Science 330(6012): , (2010)

Local chromatin patterns
  – Histone modifications and protein binding at promoters and enhancers in human
Image credit: Heintzman et al., Nature Genetics 39(3): , (2007)

Identifying regulatory elements
There are different types of protein-binding regions in the DNA
  – Promoters
  – Enhancers
  – Silencers
  – Insulators
  – ...
How to locate them in the genome?
Image credit: Raab and Kamakaka, Nature Reviews Genetics 11(6): , (2010)

Identifying regulatory elements
Some useful information:
  – Genomic location
      E.g., promoters are around transcription start sites
  – Evolutionary conservation
      Functional regions are more conserved
  – Protein binding signals and motifs
      E.g., EP300 at enhancers, CTCF at insulators
  – Chromatin features
      E.g., DNase I hypersensitivity, H3K4me1 and H3K27ac at active enhancers
  – Reporter assays
  – ...
Difficulty: integrating different types of information

Reconstruction of TF network
Goals:
  – Identifying TF binding sites
  – Determining the target genes of each TF
      In different cell types
      In different conditions
  – Deducing how gene expression is regulated by TFs
  – Studying how TFs interact with each other
Methods:
1. From expression data
2. Sequence-based (motif analysis)
3. From binding experiments
  – Sign of regulation (activation vs. repression) usually not determined for #2 and #3

Expression-based methods
Input: expression levels of genes
  – Usually from microarrays
  – Often time series data
Output: a network (i.e., directed graph)
  – Each node is a gene (and its protein product)
  – An A → B edge means A is a TF and it regulates B
Types:
  – Differential equations
  – Probabilistic networks
  – Boolean networks

Expression-based methods
Differential equations
  – Models ($y_j$: expression level of gene j, $a_{ji}$: influence of TF i on gene j):
      Linear
      Sigmoidal
      ...
  – Methods:
      Solve system of equations to get best-fit parameter values
  – Difficulties:
      Many parameters when there are many TFs
        – Insufficient training data
        – L1 (LASSO) regularization to control the number of non-zero variables
      Long running time
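
The model equations themselves did not survive in this transcript. As a hedged sketch, commonly used forms of the linear and sigmoidal models and of an L1-regularized fit are written below using the slide's symbols $y_j$ and $a_{ji}$. These are assumed textbook forms, not necessarily the exact equations shown in the lecture; $\lambda$ and the time index $t$ are introduced here for illustration.

```latex
% Linear model: the rate of change of gene j is a weighted sum of TF levels.
\frac{dy_j}{dt} = \sum_i a_{ji}\, y_i

% Sigmoidal model: the same weighted sum passed through a saturating function.
\frac{dy_j}{dt} = \frac{1}{1 + \exp\!\left(-\sum_i a_{ji}\, y_i\right)}

% L1-regularized (LASSO) fit for gene j over time points t;
% lambda is an assumed tuning parameter controlling sparsity.
\hat{a}_j = \arg\min_{a_j} \sum_t \Big( \dot{y}_j(t) - \sum_i a_{ji}\, y_i(t) \Big)^2
            + \lambda \sum_i \lvert a_{ji} \rvert
```

A larger $\lambda$ drives more of the $a_{ji}$ to exactly zero, which is how the penalty limits the number of regulators assigned to each gene when training data are scarce.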

Boolean networks
Consider each gene to be either on or off
Treat the gene regulatory network as a Boolean network (similar to an electric circuit)
  – Expression of a gene at time t+1 depends on the expression of the genes that regulate it at time t
  – Goal: find the logical relationships between genes
Image credit: Akutsu and Miyano, Pacific Symposium on Biocomputing 4:17-28, (1999)
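
The synchronous update rule described above can be sketched in a few lines. The three-gene network, its gene names, and its rules are made up for illustration; the inference problem in the lecture is the reverse one, recovering such rules from observed on/off time series.

```python
def step(state, rules):
    """Compute the state at time t+1 from the state at time t.

    All genes are updated synchronously: every rule reads the old state.
    """
    return {gene: rule(state) for gene, rule in rules.items()}


def simulate(state, rules, n_steps):
    """Return the list of states from time 0 to time n_steps."""
    trajectory = [dict(state)]
    for _ in range(n_steps):
        state = step(state, rules)
        trajectory.append(state)
    return trajectory


# Hypothetical toy network: A is an input, A activates B,
# and C is on only when A is on and its repressor B is off.
RULES = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and not s["B"],
}

if __name__ == "__main__":
    start = {"A": True, "B": False, "C": False}
    for t, state in enumerate(simulate(start, RULES, 3)):
        print(t, state)
```

In this toy run C switches on for one time step and then off again once B has been activated, a pulse that only appears because the update uses the state at time t.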

From binding data
Input: binding signals of transcription factors in the whole genome
  – Usually from ChIP-chip or ChIP-seq
  – Or from motifs
  – (Best to combine both)
Output: TF regulatory network
Difficulties:
  – Finding binding sites
      Peak calling
      Motif analysis
  – Associating binding sites with target genes
      Promoters (e.g., 500 bp upstream of the transcription start site)
      More difficult for distal binding sites
      Expression patterns could help
  – Evaluating functional effects of binding (strong vs. weak, transient binding)
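
The promoter-based association step can be sketched as below, using the 500 bp upstream window mentioned above. The function names, the tuple layouts, and the 0-based half-open coordinates are assumptions for illustration; note that "upstream" depends on the gene's strand.

```python
def promoter_window(tss, strand, upstream=500):
    """Return the (start, end) half-open interval upstream of a TSS."""
    if strand == "+":
        return (max(0, tss - upstream), tss)
    if strand == "-":
        # On the minus strand, upstream means larger coordinates.
        return (tss + 1, tss + 1 + upstream)
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


def assign_peaks(peaks, genes, upstream=500):
    """Map each gene to the peak summits that fall in its promoter window.

    peaks: list of (chrom, summit) tuples.
    genes: list of (name, chrom, tss, strand) tuples.
    Genes with no peak in their promoter are omitted from the result.
    """
    targets = {}
    for name, chrom, tss, strand in genes:
        start, end = promoter_window(tss, strand, upstream)
        hits = [summit for peak_chrom, summit in peaks
                if peak_chrom == chrom and start <= summit < end]
        if hits:
            targets[name] = hits
    return targets


if __name__ == "__main__":
    genes = [("G1", "chr1", 1000, "+"), ("G2", "chr1", 5000, "-")]
    peaks = [("chr1", 800), ("chr1", 5200), ("chr1", 3000), ("chr2", 800)]
    print(assign_peaks(peaks, genes))
```

The peak at position 3000 is left unassigned, which illustrates why distal binding sites are the harder case: a fixed window says nothing about which gene they regulate.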

Combining both types of data
1. Use expression data to infer initial network
2. Identify potential regulators
3. Search for binding motifs of these regulators
4. Incorporate global occurrence of these motifs at gene promoters to refine the network
Image credit: Tamada et al., Bioinformatics 19(Suppl.2):ii227-ii236, (2003)

Identification of non-coding RNAs
High-throughput experiments have recently shown that a vast amount of DNA is transcribed into RNA
What are these transcripts?
  – Experimental artifacts?
  – Unannotated protein-coding genes?
  – Non-functional transcripts?
  – Functional non-coding RNAs?
  – Pseudogenes?

Construction of expression models
Given the many different mechanisms involved in gene regulation, how are they related to each other?
  – Are they redundant?
  – Do they simply add to each other, or have synergistic effects?
  – Which have more impact on final expression levels?
  – What are their time scales?
  – When is each mechanism used?

Construction of expression models
Modeling and prediction
  – An indirect way to estimate how good a model is: evaluating the accuracy of predicted expression
Prediction of:
  – Expression level
      Regression: $y_i \approx f(x_i)$
      Classification: $\mathbb{1}(y_i > t) \approx f(x_i)$
  – Differential expression
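
The two prediction setups can be illustrated with a deliberately tiny sketch: a one-feature least-squares line for regression, and thresholding of expression at t to form class labels. The function names and the toy numbers are assumptions for illustration; real expression models use many chromatin features and more flexible choices of f.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def binarize(ys, t):
    """Turn expression levels into class labels: 1 if y > t, else 0."""
    return [1 if y > t else 0 for y in ys]


def rmse(predicted, observed):
    """Root-mean-square error, one way to score predicted expression."""
    n = len(observed)
    return (sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n) ** 0.5


if __name__ == "__main__":
    feature = [0.0, 1.0, 2.0, 3.0]      # e.g. one chromatin signal per gene
    expression = [1.0, 3.0, 5.0, 7.0]   # toy expression levels
    slope, intercept = fit_line(feature, expression)
    predicted = [slope * x + intercept for x in feature]
    print("regression:", slope, intercept, "RMSE:", rmse(predicted, expression))
    print("class labels for t = 4:", binarize(expression, 4.0))
```

In practice the accuracy would be measured on held-out genes rather than on the training data used here.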

Construction of expression models
Chromatin features and expression
Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

Construction of expression models
Model construction and accuracy
Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

“Histone code” hypothesis
The statistical models are good, but too complex for humans to interpret
Is there a simple set of rules (i.e., a “code”) that can easily tell the expression level of a gene?
Image credit: Cheng et al., Genome Biology 12(2):R15, (2011)

Summary
“Gene expression” is a general term with several possible meanings
Gene expression is regulated by many mechanisms, including (but not limited to)
  – Transcription factor binding
  – DNA long-range interactions
  – DNA methylation
  – Chromatin structure
  – Histone modifications
  – MicroRNA-mRNA binding
A lot of new genome-wide data
Many emerging research topics in CBB