
Integrative fly analysis: specific aims Aim 1: Comprehensive data collection – Data QC / data standards / – consistent pipelines Aim 2: Integrative annotation.




1 Integrative fly analysis: specific aims
Aim 1: Comprehensive data collection
– Data QC / data standards / consistent pipelines
Aim 2: Integrative annotation
– Systematically annotate functional elements based on combined experimental information
Aim 3: Clusters of activity
– Find genes / enhancers / chromatin regions / domains of coordinated activity across conditions
Aim 4: Predictive models of gene expression
– How do motifs -> binding -> chromatin -> expr/splicing, where ‘->’ = ‘predicts’
Aim 5: Regulatory and functional networks
– Regulatory network inference
– Functional network validation
Aim 6: Comparative / evolutionary analysis
– Using conservation to assess: Function / coverage

2 1. Supervised learning for enhancer annotation
Logistic regression classifier recovers known CRMs
Combinations of features in each class outperform individual members of that class
Combinations of features across classes even stronger
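The slide's point that feature combinations outperform single features in a logistic regression classifier can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the two feature columns, the eight toy regions, and the plain gradient-descent fit are hypothetical and are not the data or pipeline behind the slide. Accuracy is measured on the training points only.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(rows, labels, lr=1.0, epochs=20000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(rows, labels, w, b):
    """Fraction of rows whose predicted class (p >= 0.5) matches the label."""
    hits = 0
    for x, y in zip(rows, labels):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        hits += int((p >= 0.5) == (y == 1))
    return hits / len(rows)

def fit_and_score(rows, labels, columns):
    """Train on the chosen feature columns and return training accuracy."""
    subset = [[row[c] for c in columns] for row in rows]
    w, b = train_logistic(subset, labels)
    return accuracy(subset, labels, w, b)

# Hypothetical toy regions: (feature_1 signal, feature_2 signal).
# Neither feature alone separates the classes, but their combination does.
FEATURES = [
    (0.2, 0.2), (0.8, 0.1), (0.1, 0.8), (0.3, 0.4),  # label 0: not a CRM
    (0.9, 0.9), (0.7, 0.6), (0.6, 0.7), (0.8, 0.8),  # label 1: known CRM
]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    for cols in ([0], [1], [0, 1]):
        print(cols, fit_and_score(FEATURES, LABELS, cols))
```

On this toy set each single feature misclassifies at least one region, while the two-feature model separates all eight.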

3 2. Functions of 20 distinct chromatin states in fly
[Figure panel labels: DV enhancers, AP enhancers, General TFs, Insulators, Replication, Motifs, Chromatin marks]

4 3. Clusters of activity (e.g. CBP binding vs. TFs)
Confirmed by distinct enrichments for
– Chromatin mark combinations
– Regulatory motifs
– GO functional categories
– Developmental anatomical terms
[Figure labels: Component parameters; Trx; Polycomb; Early regulators (kr, cad, hb)]

5 3. Clusters of TFs vs. chromatin states
[Figure: matrix of numeric values, TFs vs. chromatin states; cell values omitted because row and column labels were lost in extraction]
Polycomb states enriched for enhancers
AP-state 60-fold enriched in enhancers
Ubiquitous genes enriched for multiple states
Trx in enhancer states
BEAF/Chro in TSS for ubiquitous genes
Strong Su(Hw) in Negative outside promoter states

6 4. Motif combinations for TF binding prediction
Many motifs enriched in binding of corresponding TF (diagonal)
However, extensive cross-enrichment suggests extensive cross-talk across binding of factors
[Figure: motif enrichment vs. transcription factor binding; color scale: fold enrichment]
Indeed, predictive power for binding increases with motif combinations
Both synergistic and antagonistic effects
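A motif-by-TF fold-enrichment matrix of the kind this slide describes could be computed roughly as follows. This is a sketch under stated assumptions: fold enrichment is taken to be the fraction of a TF's bound regions that contain the motif, divided by the fraction of all regions that contain it. The slide does not give the definition actually used, and the region IDs and motif names are hypothetical.

```python
def fold_enrichment(motif_regions, bound_regions, all_regions):
    """Motif rate among bound regions divided by motif rate among all regions."""
    if not bound_regions or not all_regions:
        raise ValueError("need at least one bound region and one background region")
    background_rate = len(motif_regions & all_regions) / len(all_regions)
    if background_rate == 0:
        raise ValueError("motif never occurs in the background")
    bound_rate = len(motif_regions & bound_regions) / len(bound_regions)
    return bound_rate / background_rate

def enrichment_matrix(motif_sites, tf_bound, all_regions):
    """Return matrix[motif][tf] = fold enrichment of that motif in that TF's bound regions."""
    return {
        motif: {
            tf: fold_enrichment(sites, bound, all_regions)
            for tf, bound in tf_bound.items()
        }
        for motif, sites in motif_sites.items()
    }

# Hypothetical toy input: region IDs 0..9, two TFs, one motif per TF.
ALL_REGIONS = set(range(10))
TF_BOUND = {"A": {0, 1, 2, 3}, "B": {4, 5, 6, 7}}
MOTIF_SITES = {"A": {0, 1, 2, 8}, "B": {4, 5, 9}}

if __name__ == "__main__":
    for motif, row in enrichment_matrix(MOTIF_SITES, TF_BOUND, ALL_REGIONS).items():
        print(motif, row)
```

In this toy input each motif is enriched only under its own TF, so the diagonal stands out. Cross-enrichment of the kind the slide reports would show up as off-diagonal values above 1.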

7 5. Data integration for stage-specific regulators
[Figure axis: fold enrichment or over expression; panel label: H3K27me3]
abd-A motif is enriched in new H3K27me3 regions at L2
– Coincides with a drop in the expression of abd-A
– Model: sites gain H3K27me3 as abd-A binding lost
Additional intriguing stories found, to be explored

8 6. Evolutionary signatures for diverse functions
Protein-coding genes
- Codon Substitution Frequencies
- Reading Frame Conservation
RNA structures
- Compensatory changes
- Silent G-U substitutions
microRNAs
- Shape of conservation profile
- Structural features: loops, pairs
- Relationship with 3’UTR motifs
Regulatory motifs
- Mutations preserve consensus
- Increased Branch Length Score
- Genome-wide conservation
Stark et al, Nature 2007; Clark et al, Nature 2007

9 Assessing fraction of conserved bases ‘explained’
[Figure: Fly, % of conserved bases (axis marks at 40% and 80%), shown cumulatively and per element as annotation classes are added: +CNV, +CDS, +Pol2, +TF, +Marks, +ORC, +3’UTR, +new3’UTR, +newCDS, +new5’UTR]
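The per-element and cumulative fractions on this slide can be sketched as set operations over base positions. This is only an illustration under assumptions: positions are modeled as plain integers in Python sets, the annotation classes are added in the order given, and the toy coordinates are hypothetical rather than taken from the fly data.

```python
def coverage_table(conserved, annotations):
    """For each annotation class, in order, return
    (name, fraction of conserved bases it covers on its own,
     fraction covered by it and all earlier classes combined)."""
    if not conserved:
        raise ValueError("conserved set must not be empty")
    rows = []
    covered = set()
    for name, bases in annotations:
        own = bases & conserved      # conserved bases inside this class
        covered |= own               # running union across classes
        rows.append((name, len(own) / len(conserved), len(covered) / len(conserved)))
    return rows

# Hypothetical toy genome: ten conserved bases and three annotation classes.
CONSERVED = set(range(10))
ANNOTATIONS = [
    ("CDS", {0, 1, 2, 3, 20}),   # position 20 is annotated but not conserved
    ("TF", {3, 4, 5}),
    ("Marks", {5, 6}),
]

if __name__ == "__main__":
    for name, per_element, cumulative in coverage_table(CONSERVED, ANNOTATIONS):
        print(f"+{name}: per element {per_element:.0%}, cumulative {cumulative:.0%}")
```

Because classes overlap, the cumulative column grows more slowly than the sum of the per-element values. For genome-scale data an interval-based representation would be more practical than per-base sets.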

10 The challenge ahead
[Figure labels: Anterior-Posterior; Dorsal-Ventral]
Annotations & images for all expression patterns
Expression domain primitives reveal underlying logic
Binding sites of every developmental regulator
[Figure labels: GAF, check; Su(Hw), check; BEAF-32, variant; Mod(mdg4), novel; CP190, novel; CTCF, check]
Sequence motifs for every regulator
Understand regulatory logic specifying development

11 Fly AWG team
Sue Celniker, Brenton Graveley, Steve Brenner, Michael Brent, Gary Karpen, Sarah Elgin, Mitzi Kuroda, Vince Pirrotta, Peter Park, Peter Kharchenko, Michael Tolstorukov, Eric Bishop, Kevin White, Casey Brown, Nicolas Negre, Nick Bild, Bob Grossman, Eric Lai, Nicolas Robine, David MacAlpine, Matthew Eaton, Steve Henikoff, Peter Bickel, Ben Brown, Lincoln Stein Group, Suzanna Lewis, Gos Micklem, Nicole Washington, EO Stinson, Marc Perry, Peter Ruzanov
AWG Fly modEncode
MIT CompBio Group: Chris Bristow, Pouya Kheradpour, Mike Lin, Rachel Sealfon, Rogerio Candeias
compbio.mit.edu

