

Integrative fly analysis: specific aims
Aim 1: Comprehensive data collection
– Data QC / data standards / consistent pipelines
Aim 2: Integrative annotation
– Systematically annotate functional elements based on combined experimental information
Aim 3: Clusters of activity
– Find genes / enhancers / chromatin regions / domains of coordinated activity across conditions
Aim 4: Predictive models of gene expression
– How do motifs -> binding -> chromatin -> expr/splicing, where '->' = 'predicts'
Aim 5: Regulatory and functional networks
– Regulatory network inference
– Functional network validation
Aim 6: Comparative / evolutionary analysis
– Using conservation to assess: Function / coverage

1. Supervised learning for enhancer annotation
Logistic regression classifier recovers known CRMs
Combinations of features in each class outperform individual members of that class
Combinations of features across classes are even stronger
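To make the feature-combination point concrete, here is a minimal sketch of a logistic regression classifier in plain Python. It is not the pipeline behind the slide: the two binary features (`mark`, `tf_bound`), the four toy regions, and the helper names `fit_logistic` and `accuracy` are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, lr=1.0, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(rows, labels, weights, bias):
    """Fraction of rows whose predicted class (p > 0.5) matches the label."""
    hits = 0
    for x, y in zip(rows, labels):
        p = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
        hits += int((p > 0.5) == bool(y))
    return hits / len(rows)


# Hypothetical toy regions: (mark, tf_bound) -> is the region a known CRM?
# Here a region counts as a CRM only when both signals are present.
regions = [(0, 0), (1, 0), (0, 1), (1, 1)]
is_crm = [0, 0, 0, 1]

# Model using a single feature (mark only).
single = [(r[0],) for r in regions]
w1, b1 = fit_logistic(single, is_crm)

# Model using the combination of both features.
w2, b2 = fit_logistic(regions, is_crm)

print("mark only:", accuracy(single, is_crm, w1, b1))
print("mark + tf_bound:", accuracy(regions, is_crm, w2, b2))
```

On this toy data the single-feature model cannot get more than three of the four regions right, because two regions share the same `mark` value but carry different labels. The combined model can separate all four. Real feature sets would need held-out evaluation rather than training accuracy.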

2. Functions of 20 distinct chromatin states in fly
Figure panels: DV enhancers, AP enhancers, General TFs, Insulators, Replication, Motifs, Chromatin marks

3. Clusters of activity (e.g. CBP binding vs. TFs)
Confirmed by distinct enrichments for
– Chromatin mark combinations
– Regulatory motifs
– GO functional categories
– Developmental anatomical terms
Figure labels: Component parameters; Trx; Polycomb; Early regulators (kr, cad, hb)

Clusters of TFs vs. chromatin states
Polycomb states enriched for enhancers
AP-state 60-fold enriched in enhancers
Ubiquitous genes enriched for multiple states
Trx in enhancer states
BEAF/Chro in TSS for ubiquitous genes
Strong Su(Hw) in Negative outside promoter states

4. Motif combinations for TF binding prediction
Many motifs enriched in binding of corresponding TF (diagonal)
However, extensive cross-enrichment suggests extensive cross-talk across binding of factors
Indeed, predictive power for binding increases with motif combinations
Both synergistic and antagonistic effects
Figure labels: Fold enrichment; Motif enrichment; Transcription factor binding
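The motif-by-TF enrichment matrix described on this slide can be sketched as follows. The slide does not define fold enrichment, so this example assumes one common definition: the fraction of a TF's bound regions that contain a motif, divided by the fraction of all regions that contain it. The region IDs, the factor names `TF_A` and `TF_B`, and the function names are hypothetical.

```python
def fold_enrichment(motif_regions, bound_regions, all_regions):
    """Motif frequency within bound regions relative to its frequency overall.

    Returns 0.0 when there are no bound regions or the motif never occurs,
    since there is nothing to be enriched in either case.
    """
    bound = bound_regions & all_regions
    motif = motif_regions & all_regions
    if not bound or not motif:
        return 0.0
    freq_in_bound = len(motif & bound) / len(bound)
    freq_overall = len(motif) / len(all_regions)
    return freq_in_bound / freq_overall


def enrichment_matrix(motif_hits, binding, all_regions):
    """Map (motif name, TF name) to the fold enrichment of that motif in that TF's bound regions."""
    return {
        (motif, tf): fold_enrichment(hits, bound, all_regions)
        for motif, hits in motif_hits.items()
        for tf, bound in binding.items()
    }


# Hypothetical toy data: eight regions, two factors.
all_regions = {f"r{i}" for i in range(1, 9)}
motif_hits = {
    "TF_A": {"r1", "r2", "r3", "r5"},
    "TF_B": {"r4", "r5", "r6"},
}
binding = {
    "TF_A": {"r1", "r2", "r3", "r4"},
    "TF_B": {"r3", "r4", "r5", "r6"},
}

matrix = enrichment_matrix(motif_hits, binding, all_regions)
for (motif, tf), value in sorted(matrix.items()):
    print(f"{motif} motif in {tf}-bound regions: {value:.2f}")
```

Entries where the motif and TF names match form the diagonal of the matrix. Off-diagonal values above 1 would correspond to the cross-enrichment the slide mentions.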

5. Data integration for stage-specific regulators
abd-A motif is enriched in new H3K27me3 regions at L2
– Coincides with a drop in the expression of abd-A
– Model: sites gain H3K27me3 as abd-A binding is lost
Additional intriguing stories found, to be explored
Figure labels: Fold enrichment or over expression; H3K27me3

6. Evolutionary signatures for diverse functions
Protein-coding genes
- Codon Substitution Frequencies
- Reading Frame Conservation
RNA structures
- Compensatory changes
- Silent G-U substitutions
microRNAs
- Shape of conservation profile
- Structural features: loops, pairs
- Relationship with 3’UTR motifs
Regulatory motifs
- Mutations preserve consensus
- Increased Branch Length Score
- Genome-wide conservation
Stark et al, Nature 2007; Clark et al, Nature 2007

Assessing fraction of conserved bases ‘explained’
Figure: % of conserved bases in fly (axis ticks at 40% and 80%), shown cumulatively and per element for: +CNV, +CDS, +Pol2, +TF, +Marks, +ORC, +3’UTR, +new3’UTR, +newCDS, +new5’UTR
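The "per element" and "cumulative" views in this figure can be illustrated with a small sketch. It assumes each annotation class is given as a set of base positions and that classes are added in a fixed order. The positions, the class order, and the name `cumulative_coverage` are illustrative only and do not reproduce the data behind the slide.

```python
def cumulative_coverage(conserved, tracks):
    """Report per-track and cumulative fractions of conserved bases covered.

    conserved: set of conserved base positions.
    tracks: ordered list of (name, set of annotated base positions).
    Returns a list of (name, per_track_fraction, cumulative_fraction).
    """
    if not conserved:
        raise ValueError("conserved base set is empty")
    explained = set()
    report = []
    for name, bases in tracks:
        hits = conserved & bases      # conserved bases this track covers
        explained |= hits             # union with everything covered so far
        report.append(
            (name, len(hits) / len(conserved), len(explained) / len(conserved))
        )
    return report


# Hypothetical toy genome: ten conserved bases at positions 0..9.
conserved = set(range(10))
tracks = [
    ("CDS", set(range(0, 4))),
    ("TF", set(range(3, 6))),
    ("Marks", set(range(5, 8)) | {20}),  # position 20 is not conserved
]

for name, per_track, cumulative in cumulative_coverage(conserved, tracks):
    print(f"+{name}: per element {per_track:.0%}, cumulative {cumulative:.0%}")
```

Because tracks overlap, the cumulative fraction grows more slowly than the sum of the per-element fractions. A genome-scale version would work on sorted intervals rather than explicit position sets.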

The challenge ahead
Annotations & images for all expression patterns
Expression domain primitives reveal underlying logic
Binding sites of every developmental regulator
GAF, check; Su(Hw), check; BEAF-32, variant; Mod(mdg4), novel; CP190, novel; CTCF, check
Sequence motifs for every regulator
Understand regulatory logic specifying development
Figure labels: Anterior-Posterior; Dorsal-Ventral

Fly AWG team
Sue Celniker, Brenton Graveley, Steve Brenner, Michael Brent, Gary Karpen, Sarah Elgin, Mitzi Kuroda, Vince Pirrotta, Peter Park, Peter Kharchenko, Michael Tolstorukov, Eric Bishop, Kevin White, Casey Brown, Nicolas Negre, Nick Bild, Bob Grossman, Eric Lai, Nicolas Robine, David MacAlpine, Matthew Eaton, Steve Henikoff, Peter Bickel, Ben Brown, Lincoln Stein Group, Suzanna Lewis, Gos Micklem, Nicole Washington, EO Stinson, Marc Perry, Peter Ruzanov
AWG Fly modEncode
MIT CompBio Group: Chris Bristow, Pouya Kheradpour, Mike Lin, Rachel Sealfon, Rogerio Candeias
compbio.mit.edu