Genomics of Gene Regulation ANSC 497B Ross Hardison Nov. 10, 2009.


1 Genomics of Gene Regulation ANSC 497B Ross Hardison Nov. 10, 2009

2 DNA sequences involved in regulation of gene transcription Protein-DNA interactions Chromatin effects

3 Distinct classes of regulatory regions Cis-regulatory modules (CRMs) act in cis, affecting expression of a gene on the same chromosome. Maston G, Evans S and Green M (2006) Annu Rev Genomics Hum Genet 7:29-59

4 General features of promoters
A promoter is the DNA sequence required for correct initiation of transcription
It affects the amount of product from a gene, but does not affect the structure of the product
Most promoters are at the 5’ end of the gene
TATA box + Initiator: core or minimal promoter; site of assembly of the preinitiation complex
Upstream regulatory elements: regulate the efficiency of utilization of the minimal promoter
[Figure label: RNA polymerase II]
Maston, Evans & Green (2006) Annu Rev Genomics Hum Genet 7:29-59

5 Conventional view of eukaryotic gene promoters Maston, Evans & Green (2006) Annu Rev Genomics Hum Genet 7:29-59

6 Most promoters in mammals are CpG islands
TATA box, no CpG island: about 10% of promoters
CpG island, no TATA box: about 90% of promoters
Carninci … Hayashizaki (2006) Nature Genetics 38:626

7 Differences in specificity of start sites for transcription for TATA vs CpG island promoters [Figure; axis: fraction of mRNAs] Carninci … Hayashizaki (2006) Nature Genetics 38:626

8 Enhancers
Cis-acting sequences that cause an increase in expression of a gene
Act independently of position and orientation with respect to the gene
Over half of ultraconserved noncoding sequences are developmental enhancers: Pennacchio et al. (2006) Nature 444:499-502, http://enhancer.lbl.gov/
About half of the enhancers predicted by interspecies alignments are validated in erythroid cells: Wang et al. (2006) Genome Research 16:1480-1492
[Figure labels: tested UCE; UCE, pr, lacZ; CRM, pr, luciferase]

9 CRMs are clusters of specific binding sites for transcription factors Hardison (2002) on-line textbook Working with Molecular Genetics http://www.bx.psu.edu/~ross/

10 Enhancers can occur in a variety of positions with respect to genes [Figure: transcription unit with promoter (P) and exons Ex1 and Ex2; enhancer positions labeled adjacent, downstream, internal, distal, and upstream]

11 Silencer
Cis-acting sequences that cause a decrease in gene expression
Similar to an enhancer but has the opposite effect on gene expression
Gene repression: inactive chromatin structure (heterochromatin)
SIR proteins (Silent Information Regulators)
Nucleates assembly of a multi-protein complex
–hypoacetylated N-terminal tails of histones H3 and H4
–methylated N-terminal tail of H3 (Lys 9)

12 Insulators and boundaries
A boundary in chromatin marks a transition from open to closed chromatin
An insulator blocks activation of a promoter by an enhancer
–Requires CTCF
Example: HS4 from chick HBB complex has both functions
[Figure: constructs labeled neoR, Pr, enhancer, insulator, and silencer; axis: neo-resistant colonies, % of maximum]

13 Repression by PcG proteins via chromatin modification
Polycomb Group (PcG) Repressor Complex 2: ESC, E(Z), NURF-55, and PcG repressor SU(Z)12
Methylates K27 of histone H3 via the SET domain of E(Z)
[Figure labels: H3 N-tail, K27, me3, OFF]

14 trx group (trxG) proteins activate via chromatin changes
SWI/SNF nucleosome remodeling
Histone H3 and H4 acetylation
Methylation of K4 in histone H3
–Trx in Drosophila, MLL in humans
http://www.igh.cnrs.fr/equip/cavalli/link.PolycombTeaching.html#Part_3
[Figure labels: H3 N-tail, K4, Me1,2,3, ON]

15 Histone modifications modulate chromatin structure [Figure labels: H3K27me3; H3K4me2,3] Uta-Maria Bauer, http://www.imt.uni-marburg.de/bauer/images/fig2.jpg

16 Repressed and active chromatin Dustin Schones and Keji Zhao (2008) Nature Reviews Genetics 9:179

17 Biochemical features of DNA in CRMs
Accessible to cleavage: DNase hypersensitive site
Bound by specific transcription factors
Associated with RNA polymerase and general transcription factors
Nucleosomes with histone modifications:
–Acetylation of H3 and H4
–Methylation of H3K4
–Lack of methylation at H3K27 or H3K9 …
Clusters of binding site motifs
[Figure labels: Pol IIa, Pol II, coactivators]

18 Methods in Genomics of Gene Regulation

19 Chromatin immunoprecipitation: Greatly enrich for DNA occupied by a protein Elaine Mardis (2007) Nature Methods 4: 613-614

20 ChIP-chip: High throughput mapping of DNA sequences occupied by protein http://www.chiponchip.org Bing Ren’s lab

21 Enrichment of sequence tags reveals function Barbara Wold & Richard M Myers (2008) “Sequence Census Methods” Nature Methods 5:19-21
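The idea of tag enrichment can be illustrated with a minimal sketch. It assumes mapped tag start positions on a single chromosome for a ChIP sample and a control sample. The function names, the 200 bp window, the pseudocount, and the 5-fold cutoff are all illustrative choices, not parameters taken from the cited paper, and real peak callers use proper statistical models.

```python
from collections import Counter

def window_counts(tag_positions, window=200):
    """Count sequence tags per fixed-size window (0-based positions)."""
    return Counter(pos // window for pos in tag_positions)

def enriched_windows(chip_tags, control_tags, window=200,
                     min_fold=5.0, pseudocount=1.0):
    """Return (start, end, fold) for windows where ChIP tags are enriched.

    Control counts are scaled to the ChIP library size; the pseudocount
    (an assumption of this sketch) avoids division by zero.
    """
    chip = window_counts(chip_tags, window)
    ctrl = window_counts(control_tags, window)
    scale = len(chip_tags) / max(len(control_tags), 1)
    hits = []
    for w, n in sorted(chip.items()):
        expected = ctrl.get(w, 0) * scale + pseudocount
        fold = n / expected
        if fold >= min_fold:
            hits.append((w * window, (w + 1) * window, fold))
    return hits

# Toy data: 20 ChIP tags pile up between 1000 and 1100; the rest is background.
chip_tags = list(range(1000, 1100, 5)) + [50, 450, 3000]
control_tags = [i * 400 + 10 for i in range(23)]
print(enriched_windows(chip_tags, control_tags))
```

In the toy data only the window containing the pile-up of ChIP tags passes the cutoff; the background windows do not.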

22 Illumina (Solexa) short read sequencing
–8 lanes per run
–10 M to 20 M reads of 36 nucleotides (or longer) per run
–1 lane can produce enough reads to map locations of a transcription factor in a mammalian genome

23 Example of ChIP-seq
ChIP vs NRSF (neuron-restrictive silencing factor)
Jurkat human lymphoblast line
NPAS4 encodes neuronal PAS domain protein 4
Johnson DS, Mortazavi A, Myers RM, Wold B. (2007) Genome-Wide Mapping of in Vivo Protein-DNA Interactions. Science 316:1497-1502.

24 ChIP-seq for chromatin modifications Dustin Schones and Keji Zhao (2008) Nature Reviews Genetics 9:179

25 Histone modifications around HBB locus [Track labels: known CRMs; UCSC genes; DNase hypersensitive sites; Polycomb; trithorax; transcription-associated mark]

26 Distributions at all GenCode TSSs
Symmetrical distribution of:
–H3K4me3, H3K4me2
–H3Ac, H4Ac, DHS
–E2F1, E2F4, Myc, Pol II
Birney et al. (2007) Nature 447:799-816

27 Distribution of histone modifications and factor binding around regulatory regions
Promoters
–H3K4me3, H3K4me2
–E2F1, E2F4, Myc, Pol II
Distal HSs
–H3K4me1: enhancers
–CTCF: insulators
Birney et al. (2007) Nature 447:799-816

28 Enhancers predicted from chromatin signatures Heintzman et al. (2009) Nature 459:108-112

29 Enhancer predictions in human cells

30 Characteristics and validation of predicted enhancers

31 Data Resources for Genomics of Gene Regulation

32 UCSC Genome Browser
Visualize data described in publications, e.g.
–Expression
Affymetrix gene arrays, GNF, Su et al. 2004
–Regulation
Kim et al., 2005, PICs (TAF1)
Kim et al., 2008, CTCF
Boyle et al., 2008, DNase hypersensitive sites
Heintzman et al., 2009, Enhancers predicted by H3K4me1
Mikkelsen et al., 2007, Chromatin modifications in pluripotent and lineage-committed cells
ENCODE project, Production phase
–Expression
Affy high density tiling arrays
RNA-seq from several sources (CSHL, Helicos)
–Regulation
Broad histone modifications
HAIB DNA methylation
Open Chromatin
UW DNase HS
HAIB TFBS
Yale TFBS
SUNY RBP

33 Factor occupancy and DNase hypersensitivity ENCODE Tracks: Broad histone modifications, Open chromatin, UW DHS, Yale TFBSs [Figure labels: locus control region; HS5, 4, 3, 2, 1]

34 Collated sets of published regulatory regions
http://www.bx.psu.edu/~ross/dataset/Reguldata.html
–Noncoding DNA segments with high regulatory potential
–PRPs: intersection of the High RP segments and the PReMods (clusters of conserved transcription factor binding site motifs)
–Most constrained DNA segments, phastCons
–DNase hypersensitive sites in CD4+ T cells
–DNA segments occupied by CTCF in primary fibroblasts
–Preinitiation complexes (TAF1) in IMR90 cells
–Predicted erythroid cis-regulatory modules
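Intersecting two sets of genomic segments, as in the PRP definition above, can be sketched as follows. This is a minimal illustration that assumes each set is a sorted list of non-overlapping half-open (start, end) intervals on one chromosome. The function name and the toy coordinates are hypothetical, and the published PRP set may have been built with different rules (for example, keeping whole overlapping segments rather than the overlapping bases).

```python
def intersect_intervals(a, b):
    """Intersect two sorted lists of non-overlapping half-open intervals.

    Each interval is (start, end) on the same chromosome. Returns the
    list of overlapping sub-intervals, in order.
    """
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # the two current intervals share at least one base
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

# Toy coordinates (hypothetical): high-RP segments and PReMod clusters.
high_rp = [(100, 200), (300, 400), (500, 600)]
premods = [(150, 320), (550, 700)]
print(intersect_intervals(high_rp, premods))
```

Because both lists are sorted, a single linear pass is enough. For genome-scale files, a dedicated interval tool is the more usual choice.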

35 GeneTrack
Genomic data analysis and integration
–Istvan Albert, Frank Pugh, et al., PSU
–http://genetrack.bx.psu.edu/
Install on your system
Gallery of data for visualization
–Yeast H2AZ nucleosome predictions, 454 sequencing
–Drosophila H2AZ nucleosome predictions, 454 sequencing

36 Yeast nucleosome map

37 HIS3: nucleosome-free region

38 modENCODE http://www.modencode.org/ Worm and fly: gene annotations, expression, chromatin modifications, TFBSs in vivo, etc.

39 Experimental Tests in the Genomics of Gene Regulation

40 G1E-ER4 cells
GATA-1 is required for erythroid maturation
[Figure: hematopoiesis diagram; labels: hematopoietic stem cell, common myeloid progenitor, common lymphoid progenitor, MEP, myeloblast, basophil, neutrophil, eosinophil, monocyte/macrophage, GATA-1, G1E cells]
Diagram: Aria Rad, 2007, http://commons.wikimedia.org/wiki/Image:Hematopoiesis_(human)_diagram.png

41 GATA1-induced changes in gene expression and occupancy genome-wide
Genes induced or repressed after restoration of GATA1
Occupancy by TFs and histone modifications along a 60 Mb region

42 High sensitivity and specificity of high throughput occupancy data

43 High throughput occupancy matches known CRMs at Hbb locus

44 Confirmed and novel regulatory regions for Gypa [Track labels: known CRMs; Gypa gene; response; DHSs; GATA1; TAL1; Trx: H3K4me1; Trx: H3K4me3; PcG: H3K27me3; input DNA]

45 Induced genes have GATA1 occupied segments close to their TSS

46 DNA segments occupied by GATA-1 were tested for enhancer activity on transfected plasmids [Figure label: occupied segments]

47 Some of the DNA segments occupied by GATA-1 are active as enhancers Cheng et al. (2008) Genome Research 18:1896-1905

48 Binding site motifs in occupied DNA segments can be deeply preserved during evolution
Consensus binding site motif for GATA-1: WGATAR, or YTATCW on the opposite strand
[Figure labels: 5997 constrained; 7308 not constrained; 2055 no motif]
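Scanning a sequence for the GATA-1 consensus can be sketched with a regular expression built from IUPAC codes (W = A or T, R = A or G, Y = C or T). This is a minimal illustration only. The function names and the example sequence are made up, and a consensus match says nothing by itself about occupancy or conservation.

```python
import re

# IUPAC nucleotide codes needed for the two motif strings.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "R": "[AG]", "Y": "[CT]"}

def motif_regex(motif):
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    return re.compile("(?=(" + "".join(IUPAC[c] for c in motif) + "))")

def find_gata_motifs(seq):
    """Return sorted (position, strand, matched sequence) tuples.

    WGATAR is reported as '+'; YTATCW, its reverse complement, as '-'.
    Positions are 0-based on the given strand.
    """
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", "WGATAR"), ("-", "YTATCW")):
        for m in motif_regex(motif).finditer(seq):
            hits.append((m.start(), strand, m.group(1)))
    return sorted(hits)

# Made-up sequence with one site on each strand.
print(find_gata_motifs("CCAGATAAGGCTTATCTCC"))
```

Searching for YTATCW on the given strand is equivalent to searching for WGATAR on the reverse complement, so one pass over the sequence covers both orientations.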

49 All GATA1-occupied segments active as enhancers are also occupied by SCL and LDB1

50 Genetic Determinants of Variation in Gene Expression

51 Variation of gene expression among individuals
Levels of expression of many genes vary in humans (and other species)
Variation in expression is heritable
Determinants of variability map to discrete genomic intervals
Often multiple determinants
This variation indicates an abundance of cis-regulatory variation in the human genome
For example:
–Microarray expression analyses of 3554 genes in 14 families: Morley M … Cheung VG (2004) Nature 430:743-747
–Expression analysis of about 16 HapMap individuals: Storey et al. (2007) AJHG 80:502-509
–Expression analysis of all 270 individuals genotyped in HapMap: Stranger BE … Dermitzakis E (2007) Nature Genetics 39:1217-1224

52 Variation in expression between populations Figure 5. Allele-specific qPCR analysis of SH2B3. a, Log2-fold change of SH2B3 expression for all CEU and YRI individuals, relative to the average expression level in the YRI sample obtained from allele-specific qPCR. The distribution of SH2B3 expression is significantly different between samples (t-test, P = 0.0157), which confirms the microarray results. b, Allele-specific qPCR of a coding polymorphism (rs1107853), which demonstrates that the log2-fold change of the G allele relative to the A allele is significantly different between heterozygous DNA (Het DNA) and heterozygous cDNA (Het cDNA) samples (t-test, P = 0.00118). Storey et al., 2007, AJHG 80:502-509

53 Mapping determinants of expression variation
Stranger et al., 2007, Nature Genetics 39:1217-1224
Expression analysis of EBV-transformed lymphoblastoid cells from all 270 individuals genotyped in HapMap
–30 Caucasian trios (90) of European descent in Utah (CEU)
–30 Yoruba trios (90) from Ibadan, Nigeria (YRI)
–45 unrelated Chinese individuals from Beijing Univ (CHB)
–45 unrelated Japanese individuals from Tokyo (JPT)
Measure levels of expression of 47,294 probes (about 24,000 genes) in each individual
–Focus on 13,643 genes “selected on criteria of variance and population differentiation”
Already know genotypes at about 2.2 million SNPs for each individual (HapMap)
Test for significant association of variation at each SNP with variation in expression of each gene
–Linear regression model
–Spearman rank correlation test
Evaluate significance of regression P values by 10,000 permutations of the data; focus on those associations above the 0.001 permutation threshold
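The association test outlined above can be sketched for a single SNP-gene pair. This is a minimal illustration, assuming genotypes coded as 0, 1, or 2 copies of one allele and one expression value per individual. It computes a Spearman rank correlation and an empirical P value by shuffling expression values. The function names and toy data are hypothetical, and the per-pair empirical P value here is a simplification, not necessarily the exact thresholding procedure used by Stranger et al.

```python
import random

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r

def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_pvalue(genotypes, expression, n_perm=10000, seed=0):
    """Two-sided empirical P value from shuffling expression across individuals."""
    rng = random.Random(seed)
    observed = abs(spearman(genotypes, expression))
    shuffled = list(expression)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(genotypes, shuffled)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Toy data (hypothetical): expression rises with allele count.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 1, 2]
expression = [1.0, 1.1, 0.9, 2.0, 2.1, 1.9, 3.0, 3.1, 2.9, 1.05, 2.05, 3.05]
print(spearman(genotypes, expression))
print(permutation_pvalue(genotypes, expression, n_perm=2000))
```

Shuffling expression values breaks any real genotype-expression link while keeping both distributions intact, which is why the shuffled correlations serve as a null reference.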

54 Association of SNPs with expression
Stranger et al., 2007, Nature Genetics 39:1217-1224
Significant association between expression and cis-SNPs (within 1 Mb)
–831 genes in at least one population
–310 genes in at least 2 populations
–62 genes in all 4 populations
Also find associated SNPs in trans: perhaps regulatory proteins

55 Location of expression-associated SNPs
Most are “close” to the transcription start site (TSS)
Symmetrical arrangement (similar to biochemical features of promoters)
Three of the SNPs have been shown to affect promoter activity in transfection assays (Hoogendoorn et al. (2004) Human Mutation 24:35-42)
Figure 4. Properties of significant cis associations as a function of SNP distance from the transcription start site. Stranger et al., 2007, Nature Genetics 39:1217-1224

56 Relevance to human health "We predict that variants in regulatory regions make a greater contribution to complex disease than do variants that affect protein sequence" – Manolis Dermitzakis, ScienceDaily

57 Risk loci in noncoding regions (2007) Science 316: 1336-1341

58 Biochemical features of DNA in CRMs
Accessible to cleavage: DNase hypersensitive site
Bound by specific transcription factors
Associated with RNA polymerase and general transcription factors
Nucleosomes with histone modifications:
–Acetylation of H3 and H4
–Methylation of H3K4
Clusters of binding site motifs
[Figure labels: Pol IIa, Pol II, coactivators]

59 Candidate functions in T2D SNP intervals Overlap of SNP rs564398 with DHS suggests a role in transcriptional regulation, but overlap with an exon of a noncoding RNA suggests a role in post-transcriptional regulation. Different hypotheses to test in future work.

