Comparative genomics, ChIP-chip and transfections to find cis-regulatory modules

Presentation transcript:

Comparative genomics, ChIP-chip and transfections to find cis-regulatory modules
Penn State University, Center for Comparative Genomics and Bioinformatics: Webb Miller, Francesca Chiaromonte, Ross Hardison
Children’s Hospital of Philadelphia: Mitch Weiss, Lou Dore
NimbleGen: Roland Green, Xinmin Zhang
Cold Spring Harbor, March 2007
What is conservation good for?

Ideal cases for interpretation by comparative genomics
[Figure: DNA similarity vs. position along chromosome, compared with the level for neutral DNA; panels: human vs mouse, human vs rhesus; P (not neutral)]
–Negative (purifying) selection: DNA segments with a function common to divergent species.
–Positive (adaptive) selection: DNA segments in which change is beneficial to at least one of the two species.

Putative transcriptional regulatory regions = pTRRs
Antibodies vs 10 sequence-specific factors:
–Sp1, Sp3, E2F1, E2F4, cMyc, STAT1, cJun, CEBPe, PU1, RA Receptor A
–High resolution ChIP-chip platforms: Affymetrix and NimbleGen
–Data from several different labs in ENCODE consortium
High likelihood hits for ChIP-chip
–5% false discovery rate
Supported by chromatin modification data
–Modified histones in chromatin: H4Ac, H3Ac, H3K4me, H3K4me2, H3K4me3, etc.
–DNase hypersensitive sites (DHSs) or nucleosome depleted sites
Result: set of 1369 pTRRs

Functional classes show distinctive trends in phylogenetic depth of conservation

Genes likely regulated by clade-specific pTRRs are enriched for distinctive functions (David King)
Percentage of pTRRs that align no further than:
–Primates: 3%
–Eutherians: 71%
–Marsupials: 21%
–Tetrapods: 4%
–Vertebrates: 1%
[Figure: time axis in millions of years; table of enriched GO categories with q-value for FDR]
Enriched GO categories:
–Immune response
–Protease inhibition
–Mitosis and cell cycle
–Transcriptional regulation
–Ion transport

Regulatory potential (RP) captures pattern, composition and constraint in alignments
Genome Research 16:1585 (2006)
High RP for an aligned sequence means it contains patterns similar to those found in gene regulatory regions
–Positive training set: Alignments of known regulatory regions
–Negative training set: Alignments of likely neutral DNA (ancestral repeats)
Human and mouse RP scores are on UCSC Genome Browser and PSU’s Galaxy
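A minimal sketch of how a log-odds score can be built from a positive and a negative training set of alignments. This is not the published RP method: the three-symbol column alphabet (M = match, X = mismatch, G = gap), the first-order Markov model, the pseudocount and the function names are all assumptions chosen for illustration.

```python
import math
from collections import Counter

ALPHABET = "MXG"  # hypothetical encoding of alignment columns: match, mismatch, gap


def train_markov(seqs, alphabet=ALPHABET, pseudocount=1.0):
    """Estimate first-order transition probabilities P(b | a) with additive smoothing."""
    counts = Counter()
    for s in seqs:
        for a, b in zip(s, s[1:]):
            counts[(a, b)] += 1
    probs = {}
    for a in alphabet:
        total = sum(counts[(a, b)] for b in alphabet) + pseudocount * len(alphabet)
        for b in alphabet:
            probs[(a, b)] = (counts[(a, b)] + pseudocount) / total
    return probs


def log_odds_score(seq, pos_model, neg_model):
    """Sum of log ratios of transition probabilities (positive model vs negative model)."""
    return sum(
        math.log(pos_model[(a, b)] / neg_model[(a, b)])
        for a, b in zip(seq, seq[1:])
    )


# Toy training data (invented): "regulatory-like" vs "neutral-like" column patterns.
pos_model = train_markov(["MMMMMM"])
neg_model = train_markov(["XGXGXG"])
```

A score above zero means the encoded alignment looks more like the positive training set than the negative one under this toy model.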

High RP plus conserved consensus motif is a good predictor of CRMs around GATA-1 regulated genes Genome Research 16:1480 (2006)

Genes Co-expressed in Late Erythroid Maturation
G1E cells: proerythroblast line lacking the transcription factor GATA-1.
G1E-ER cells: rescued by expressing an estrogen-responsive form of GATA-1
Rylski et al., Mol Cell Biol. 2003

Predict CRMs based on alignment and expression of nearby genes
–Gene is up- or down-regulated by GATA-1
–Noncoding DNA sequence
–Aligns between mouse and other mammals and has a positive RP score
–Contains a conserved consensus binding site motif for GATA-1
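The criteria above can be sketched as a simple filter. The record type, field names and example values below are hypothetical; only the four criteria come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate DNA segment; all field names are illustrative assumptions."""
    name: str
    nearby_gene_responds_to_gata1: bool   # gene is up- or down-regulated by GATA-1
    is_noncoding: bool
    aligns_to_other_mammals: bool
    rp_score: float
    has_conserved_gata1_motif: bool


def is_precrm(c: Candidate) -> bool:
    """True when a segment meets every criterion listed on the slide."""
    return (
        c.nearby_gene_responds_to_gata1
        and c.is_noncoding
        and c.aligns_to_other_mammals
        and c.rp_score > 0
        and c.has_conserved_gata1_motif
    )


# Invented example records, not data from the talk.
candidates = [
    Candidate("seg_a", True, True, True, 0.12, True),
    Candidate("seg_b", True, True, True, -0.05, True),   # negative RP score
    Candidate("seg_c", True, True, False, 0.30, False),  # no alignment, no motif
]
precrms = [c.name for c in candidates if is_precrm(c)]
```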

preCRMs with conserved consensus GATA-1 BS tend to be active on transfected plasmids

DNA segments with positive RP and a GATA-1 binding motif validate as enhancers at a good rate
Table columns: RP; consensus motif; Tested; Validated; Success (%)
Table rows (RP, consensus motif):
–Positive, conserved
–Positive, mouse
–Negative, conserved
–Negative, none

Design of ChIP-chip for occupancy by GATA-1
1. Non-overlapping tiling array with 50bp probes and 100bp resolution (NimbleGen)
2. Cover range Mouse chr7: (~70Mbp)
3. Antibody against the ER portion of GATA-1-ER protein in rescued G1E-ER4 cells
Yong Cheng (PSU), with Mitch Weiss & Lou Dore (CHoP), Roland Green, Xinmin Zhang (NimbleGen)
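A rough sketch of what such a tiling layout implies for probe count. It assumes that "50bp probe and 100bp resolution" means one 50 bp probe starting every 100 bp, and it uses a hypothetical region starting at coordinate 0, since the transcript does not preserve the chr7 coordinates.

```python
def probe_starts(region_start, region_end, probe_len=50, step=100):
    """Start coordinates (0-based) of probes tiled every `step` bp in [region_start, region_end)."""
    if probe_len > step:
        raise ValueError("probes would overlap: probe_len must not exceed step")
    # The last probe must end at or before region_end.
    return range(region_start, region_end - probe_len + 1, step)


# Hypothetical 70 Mbp region; the real start and end coordinates are not given.
starts = probe_starts(0, 70_000_000)
n_probes = len(starts)
```

Under these assumptions a 70 Mbp region needs on the order of 700,000 probes, with a 50 bp gap between consecutive probes.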

Signals in known occupied sites in Hbb LCR
1) Cluster of high signals
2) “hill” shape of the signals
[Figure labels: HS1, HS2, HS3]

ChIP-chip hits are high quality and tend to have GATA-1 binding motifs
Peak calling by Mpeak (Ren) and Tamalpais (Bieda and Farnham) gave 321 ChIP-chip hits
19 hits were tested by qPCR
–13 were validated: ~70%
267 out of the 321 (83%) have WGATAR motifs, the binding site for GATA-1
–Random sampling on average gives 102 DNA segments with the motif
–The ChIP-chip hits are 2.6-fold enriched for the GATA-1 binding site motif
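A small sketch of how a WGATAR scan and the enrichment figure could be computed. WGATAR uses IUPAC codes (W = A or T, R = A or G); scanning both strands and the function names are assumptions, while the counts 267, 321 and 102 are taken from the slide.

```python
import re

# WGATAR on the forward strand, and its reverse complement YTATCW on the other strand.
FORWARD = re.compile(r"[AT]GATA[AG]")
REVERSE = re.compile(r"[CT]TATC[AT]")


def has_wgatar(seq):
    """True if the sequence contains a WGATAR motif on either strand."""
    s = seq.upper()
    return bool(FORWARD.search(s) or REVERSE.search(s))


def fold_enrichment(observed, expected):
    """Ratio of observed motif-containing segments to the random expectation."""
    return observed / expected


total_hits = 321
hits_with_motif = 267
random_with_motif = 102  # average from random sampling, per the slide

percent_with_motif = round(100 * hits_with_motif / total_hits)            # 83
enrichment = round(fold_enrichment(hits_with_motif, random_with_motif), 1)  # 2.6
```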

Only HALF the GATA-1 binding site motifs are conserved outside rodents
Of the GATA-1 binding motifs in those 249 hits, 112 (45%) are conserved between mouse and at least one non-rodent species.

Distribution of ChIP-chip hits on 70Mb of mouse chr7 Yong Cheng, Yuepin Zhou and Christine Dorman

[Figure panels: GATA-1 occupied sites by ChIP-chip; No GATA-1]
21 out of 59 ChIP-chip hits increase activity of HBGpr-Luc in K562 cells.
36% of ChIP-chip hits act as enhancers in K562 cells

[Figure panels: GATA-1 occupied sites by ChIP-chip; No GATA-1]
15 out of 50 ChIP-chip hits increase activity of HBGpr-Luc in MEL cells.
30% of ChIP-chip hits act as enhancers in MEL cells

Validated ChIP hit, enhancer, deep conservation

Validated ChIP hit, enhancer, limited conservation

ChIP-chip hit, enhancer, rodent specific

Test of neutrality using polymorphism and divergence data

A promoter distal to the beta-like globin genes has a signal for recent purifying selection

The distal promoter is close to the locus control region for beta-globin genes

Evolutionary approaches to predicting and analyzing regulatory regions
Sequence comparison alone will not detect all regulatory regions
–Need comprehensive protein-binding data
Comparative genomics can help interpret the binding data
–Aspects of regulation of some functional groups are clade-specific
–Depth of conservation may correlate with certain types of function
–Strong constraint on basal mechanisms? Lineage-specific “fine tuning”?
A majority of sites occupied by GATA-1 in G1E-ER cells have some function other than enhancement (by our assays)
Incorporation of pattern and composition information along with conservation can lead to effective discrimination of functional classes (regulatory potential).

Many thanks …
B: Yong Cheng, Ross, Yuepin Zhou, David King
F: Ying Zhang, Joel Martin, Christine Dorman, Hao Wang
PSU Database crew: Belinda Giardine, Cathy Riemer, Yi Zhang, Anton Nekrutenko
Alignments, chains, nets, browsers, ideas, …: Webb Miller, Jim Kent, David Haussler
RP scores and other bioinformatic input: Francesca Chiaromonte, James Taylor, Shan Yang, Diana Kolbe, Laura Elnitski
Funding from NIDDK, NHGRI, Huck Institutes of Life Sciences at PSU

Categories of Tested DNA Segments

Regulatory potential (RP) to distinguish functional classes

Examples of validated preCRMs

ChIP-chip hits for GATA-1 occupancy
Technical replicates of ChIP-chip with antibody against GATA1-ER
–Mpeak: 275 hits in both
–TAMALPAIS: 276 hits in both
–total ChIP-chip hits
19 ChIP-chip hits were tested by qPCR: 13 were validated: ~70%