A genomic code for nucleosome positioning
Hierarchical DNA folding in eukaryotic chromosomes: DNA double helix, nucleosomes. Felsenfeld & Groudine, Nature 421: 448-453 (2003)
Most eukaryotic DNA is sharply looped Luger et al., 1997
Physical selection for stable nucleosome formation on chemically synthetic random DNAs. Random 220-mer pool (1 each of 5 x 10^12 individuals); PCR amplify, HPLC purify; select best 10% (dialysis from 2.0 M NaCl); 2x sucrose gradient purification; extract DNA; population diversity analysis; free energy analysis; clone, sequence, analyze individuals. Lowary & Widom, 1998
Differing DNA sequences exhibit a > 5,000-fold range of affinities for nucleosome formation Lowary & Widom, 1998 Thåström et al., 1999 Widom, 2001 Thåström et al., 2004
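To put the affinity range on an energy scale, the ratio of equilibrium affinities can be converted to a free energy difference with ΔΔG = RT ln(ratio). The sketch below assumes a temperature of 298.15 K, which is not stated on the slide; the function name is illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def affinity_ratio_to_ddg(ratio, temperature_k=298.15):
    """Free energy difference (kcal/mol) corresponding to a ratio of affinities.

    temperature_k is an assumed value; the slide does not give one.
    """
    return R_KCAL * temperature_k * math.log(ratio)

ddg = affinity_ratio_to_ddg(5000.0)
print(f"5,000-fold affinity range ~ {ddg:.2f} kcal/mol")
```

Under this assumption, a 5,000-fold range corresponds to roughly 5 kcal/mol between the weakest and strongest sequences.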
Basepair steps as fundamental units of DNA mechanics Zhurkin Olson
DNA sequence motifs that stabilize nucleosomes and facilitate spontaneous sharp looping AA TT TA GC Segal et al., 2006
Why some DNA sequences have especially high affinity for the histone octamer: more or better bonds; appropriately bent; more easily bendable; appropriate twist; more easily twistable. Widom, 2001; Cloutier & Widom, 2004
Felsenfeld, G. & Groudine, M. (2003), Nature 421: 448-453
~10.2 bp periodicity of AA/TT steps in the C. elegans genome Widom, 1996
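One simple way to look for such a periodicity is to count how often two AA/TT steps occur at each separation along a sequence. The sketch below is a toy illustration of that idea, not the analysis behind the slide; the function name and `max_dist` parameter are hypothetical, and resolving a non-integer period such as ~10.2 bp would need a Fourier analysis over much longer separations.

```python
from collections import Counter

def step_distance_counts(seq, steps=("AA", "TT"), max_dist=40):
    """Count pairs of the listed dinucleotide steps at each separation 1..max_dist."""
    seq = seq.upper()
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] in steps]
    is_step = set(positions)
    counts = Counter()
    for p in positions:
        for d in range(1, max_dist + 1):
            if p + d in is_step:
                counts[d] += 1
    return counts

# Toy sequence with an AA step exactly every 10 bp (synthetic, for illustration only).
toy = ("AA" + "CGCGCGCG") * 20
counts = step_distance_counts(toy)
for d in (5, 10, 15, 20):
    print(d, counts[d])
```

In the toy sequence, counts peak at multiples of 10 bp and are zero elsewhere; a real genome would show a noisy, damped version of such peaks.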
Understanding and predicting the genome’s nucleosome-forming potential: (1) collect nucleosome-bound sequences; (2) construct nucleosome-DNA interaction model; (3) validate model; (4) predict intrinsically encoded nucleosome organization; (5) compare with in vivo positions; (6) associate intrinsic encoding with biological function. We used a combined experimental and computational approach to study this problem. We started by experimentally collecting sequences that are wrapped in nucleosomes. We then constructed a model that captures the commonalities among these sequences and so describes the DNA preferences of nucleosomes. We validated the model, used it to computationally predict the organization of the entire genome into nucleosomes, compared this intrinsic organization with in vivo positions, and studied its biological implications. Segal et al., 2006
Isolation of natural nucleosome core DNA Alberts et al., 4th ed., Fig. 4–24 (2002)
Center alignment of yeast nucleosome DNAs. Example sequences, center aligned: ACGTAGCTGTAGTGTACTGACGTACGTCGTC; ACTAGCTGATACGGAGACCCGCGCGATTTTGCGGTC; ACTGTTCGTCGTGTGTGTGTGCTGCTGTAGACTTGTGTG; TACTGTTTTTATTTTGCGGGCATGCTTGT. Findings: ~10 bp periodicity of AA/TT/TA; same period for GC, out of phase with AA/TT/TA; signals important for DNA bending; NO signal from alignment of randomly chosen genomic regions. Yeast (in vivo nucleosomes). Segal et al., 2006
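Center alignment can be sketched as follows: each sequence, whatever its length, is indexed relative to its own midpoint, and the fraction of sequences carrying an AA/TT/TA step is tallied at each offset. This is a minimal illustration under assumed conventions (the function name, the `half_width` parameter, and rounding the midpoint down are all choices made here, not taken from the slide).

```python
def center_align_profile(seqs, steps=("AA", "TT", "TA"), half_width=10):
    """Fraction of sequences with a listed step at each offset from the sequence center."""
    size = 2 * half_width + 1
    hits = [0] * size
    totals = [0] * size
    for s in seqs:
        s = s.upper()
        center = (len(s) - 2) // 2  # index of the middle dinucleotide step (rounded down)
        for offset in range(-half_width, half_width + 1):
            i = center + offset
            if 0 <= i < len(s) - 1:  # skip offsets that fall off a short sequence
                totals[offset + half_width] += 1
                if s[i:i + 2] in steps:
                    hits[offset + half_width] += 1
    return [h / t if t else 0.0 for h, t in zip(hits, totals)]

# Two toy sequences of different length, each with an AA or TT step at its center.
profile = center_align_profile(["CCAACC", "GCCTTCCG"], half_width=2)
print(profile)
```

With real nucleosome-bound sequences, a periodic signal in such a profile is what the slide reports; randomly chosen genomic regions serve as the negative control.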
Location mixture model alignment vs center alignment Wang & Widom, 2005
Center alignment of chicken nucleosome DNAs Chicken + Yeast merge Chicken (in vivo) (Satchwell et al., 1986) Yeast (in vivo) Segal et al., 2006
Alignments of nucleosomes selected in vitro Chicken (in vivo) (Satchwell et al., 1986) Yeast (in vivo) Yeast (in vitro) Mouse (in vitro) (Widlund et al., 1997) Random DNA (in vitro) (Lowary & Widom, 1998) Segal et al., 2006
In vitro experimental validation of histone-DNA interaction model Adding key motifs increases nucleosome affinity Deleting motifs or disrupting their spacing decreases affinity Segal et al., 2006
In vitro experimental validation of histone-DNA interaction model [Figure: motif positions along the nucleosomal DNA; axis marked with the dyad and positions 8 to 138 bp in steps of 10]. To make sure that these really are the rules of the game, we did in vitro experiments. We found that adding key motifs at the right positions to a starting sequence yields better binding affinity, whereas deleting these motifs or changing their spacing decreases affinity. Segal et al., 2006
Summary: Differing DNA sequences exhibit a >5,000-fold range of affinities for nucleosome formation. We have a predictive understanding of the DNA sequence motifs that are responsible.
Nucleosome–DNA model. Differences from TF model: signal is weak; information in dinucleotides (AA, TT, TA, GC); two-fold symmetry axis. Construction: position-specific dinucleotide distributions; add reverse complement of each sequence; local (3 bp) averaging. Segal et al., 2006
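The three construction steps on this slide (position-specific dinucleotide distributions, adding reverse complements, local 3 bp averaging) can be sketched as below. This is an illustrative simplification: it stores joint dinucleotide frequencies per position, uses an assumed pseudocount, expects equal-length sequences containing only A, C, G, T, and truncates the averaging window at the ends. The published model may differ in these details.

```python
from itertools import product

DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def build_dinucleotide_model(aligned, pseudocount=1.0, window=3):
    """Position-specific dinucleotide frequencies from equal-length aligned sequences."""
    seqs = [s.upper() for s in aligned]
    # Adding each reverse complement imposes the two-fold symmetry about the center.
    seqs += [reverse_complement(s) for s in seqs]
    n_steps = len(seqs[0]) - 1
    freqs = []
    for i in range(n_steps):
        counts = {d: pseudocount for d in DINUCS}  # pseudocount is an assumed choice
        for s in seqs:
            counts[s[i:i + 2]] += 1
        total = sum(counts.values())
        freqs.append({d: c / total for d, c in counts.items()})
    # Local averaging over neighboring positions (window truncated at the ends).
    half = window // 2
    smoothed = []
    for i in range(n_steps):
        lo, hi = max(0, i - half), min(n_steps, i + half + 1)
        smoothed.append(
            {d: sum(freqs[j][d] for j in range(lo, hi)) / (hi - lo) for d in DINUCS}
        )
    return smoothed

# Toy 6-bp "alignment"; real nucleosome core DNAs are 147 bp.
model = build_dinucleotide_model(["ACGTTA", "TTGACA", "AATTCG"])
print(round(model[0]["AA"], 3), round(model[-1]["TT"], 3))
```

Because of the reverse-complement step, the frequency of a dinucleotide at position i equals that of its reverse complement at the mirrored position, which is why the two printed values match.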
Summary (continued): Sequences matching these motifs are abundant in eukaryotic genomes and are occupied by nucleosomes in vivo.
Placing nucleosomes on the genome. A free energy landscape, not just scores and a threshold! [Figure: log likelihood vs. genomic location (bp)] Nucleosomes occupy 147 bp and exclude 157 bp. Segal et al., 2006
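A log-likelihood landscape of this kind can be sketched by scoring every window of the genome against a position-specific dinucleotide model relative to a background distribution. The data structures here (a list of per-position dicts for the model, one dict for the background) and the function name are assumptions made for illustration; the toy model spans 4 bp rather than 147 bp.

```python
import math

def log_likelihood_landscape(genome, model, background):
    """Log-likelihood ratio (model vs. background) for a nucleosome starting at each bp."""
    n_steps = len(model)  # the model covers n_steps + 1 bp
    scores = []
    for start in range(len(genome) - n_steps):
        total = 0.0
        for i in range(n_steps):
            d = genome[start + i:start + i + 2]
            total += math.log(model[i][d] / background[d])
        scores.append(total)
    return scores

# Toy inputs: uniform background, and a model that favors AA at its first step.
uniform = {a + b: 1 / 16 for a in "ACGT" for b in "ACGT"}
favors_aa = {d: (0.5 if d == "AA" else 0.5 / 15) for d in uniform}
toy_model = [favors_aa, uniform, uniform]
scores = log_likelihood_landscape("CCAACCC", toy_model, uniform)
print([round(s, 2) for s in scores])
```

The highest score falls at the start position where the window's first step is AA. The scores are kept as a full landscape rather than thresholded, since the placement step needs the relative weight of every allowed position.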
Equilibrium configurations of nucleosomes on the genome. One of very many possible configurations is shown, with each placed nucleosome labeled P(S)/P_B(S). Chemical potential: apparent concentration. Computed quantities: probability of placing a nucleosome starting at each allowed basepair i of S; probability of any nucleosome covering position i (average occupancy); locations i with high probability for starting a nucleosome (stable nucleosomes). Segal et al., 2006
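The start probabilities and average occupancy over all non-overlapping configurations can be computed exactly with a forward-backward recursion over partition functions. The sketch below is a simplified illustration, not the published implementation: it uses a single `size` parameter for both the footprint and the excluded length (collapsing the 147 bp vs. 157 bp distinction), takes precomputed statistical weights as input (assumed to already include the apparent concentration factor), and works in linear rather than log space, so it would overflow on genome-scale inputs.

```python
def equilibrium_occupancy(weights, size):
    """Start probabilities and average occupancy for non-overlapping rods of length `size`.

    weights[s] is the statistical weight of a nucleosome starting at position s.
    """
    n = len(weights)
    fwd = [1.0] * (n + 1)  # fwd[i]: total weight of configurations on positions [0, i)
    for i in range(1, n + 1):
        fwd[i] = fwd[i - 1]
        if i >= size:
            fwd[i] += fwd[i - size] * weights[i - size]
    bwd = [1.0] * (n + 1)  # bwd[i]: total weight of configurations on positions [i, n)
    for i in range(n - 1, -1, -1):
        bwd[i] = bwd[i + 1]
        if i + size <= n:
            bwd[i] += weights[i] * bwd[i + size]
    z = fwd[n]  # partition function: sum over all allowed configurations
    p_start = [
        fwd[s] * weights[s] * bwd[s + size] / z if s + size <= n else 0.0
        for s in range(n)
    ]
    # A position is covered by any nucleosome that starts within `size` bp upstream of it.
    occupancy = [sum(p_start[max(0, i - size + 1):i + 1]) for i in range(n)]
    return p_start, occupancy

# Toy case: 4 positions, rods of length 2, all weights equal to 1.
p_start, occupancy = equilibrium_occupancy([1.0, 1.0, 1.0, 1.0], size=2)
print(p_start, occupancy)
```

In the toy case there are five allowed configurations (empty, one rod at position 0, 1, or 2, and two rods at 0 and 2), so a rod starts at position 0 in two of five, giving a start probability of 0.4.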
Cross-correlation of average occupancies predicted using yeast and chicken models Segal et al., 2006
Nucleosome coding potential at the GAL1–10 locus: predicted distribution compared to experimental Segal et al., 2006
Nucleosome coding potential at the CHA1 locus Segal et al., 2006
Reliable classification of nucleosome occupancies on depleted vs occupied regions. [Figure: yeast ORFs depleted of vs. occupied by nucleosomes (Lee et al.) and yeast intergenics occupied by vs. depleted of nucleosomes (Bernstein et al.); plotted points (0.82, 0.68), (0.88, 0.76), (0.88, 0.55), (0.82, 0.32); p<10^-6, p<10^-9] Segal et al., 2006
In vivo nucleosome occupancies at predicted high-occupancy locations [Figure labels: + controls, - controls, high predictions]. Segal et al., 2006
In vivo nucleosome occupancies at predicted low-occupancy locations Segal et al., 2006
Summary (continued): A model based only on these DNA sequence motifs and nucleosome-nucleosome exclusion explains ~50% of in vivo nucleosome positions.
Understanding and predicting the genome’s nucleosome-forming potential: (1) collect nucleosome-bound sequences; (2) construct nucleosome-DNA interaction model; (3) validate model; (4) predict intrinsically encoded nucleosome organization; (5) compare with in vivo positions; (6) associate intrinsic encoding with biological function. Segal et al., 2006
Nucleosome occupancy varies with chromosome region type. Region types: centromere; convergent intergenic regions; autonomously replicating sequences; unique intergenic regions; ribosomal proteins; divergent intergenic regions; protein coding regions; telomeres; conserved DNA binding sites; ribosomal RNAs; tRNAs. [Figure axis: average nucleosome occupancy] Segal et al., 2006
Does the genome’s intrinsic nucleosome organization facilitate occupancy of functional binding sites? Segal et al., 2006
The yeast genome encodes low nucleosome occupancy over functional binding sites Segal et al., 2006
Distinctive nucleosome occupancy adjacent to TATA elements at yeast promoters [Figure labels: Stable nucleosome, Semi-stable nucleosomes, Permuted Model, TATA Box]. Segal et al., 2006
Nucleosome organization near 5’ ends of genes is conserved through evolution. Species: S. cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, S. bayanus, S. castellii, C. glabrata, S. kluyveri, K. waltii, K. lactis, A. gossypii, D. hansenii, C. albicans, Y. lipolytica, A. nidulans, S. pombe. Values shown in figure: -36, -41, -22, -42, -23, -53, -21, -14, -26, -3, -16, -9, -56, -2. Segal et al., 2006
Predicted nucleosome organization near 5’ ends of genes – comparison to experiment Segal et al., 2006
Summary: Differing DNA sequences exhibit a >5,000-fold range of affinities for nucleosome formation. We have a predictive understanding of the DNA sequence motifs that are responsible. Sequences matching these motifs are abundant in eukaryotic genomes and are occupied by nucleosomes in vivo. A model based only on these DNA sequence motifs and nucleosome-nucleosome exclusion explains ~50% of in vivo nucleosome positions. These intrinsically encoded nucleosome positions are correlated with, and may facilitate, essential aspects of chromosome structure and function.
Genome sequence influences the competition between nucleosomes and DNA binding proteins. [Figure: nucleosome score landscape and nucleosome occupancy for promoter regions 1 and 2, with factors present in the nucleus (None vs. +); labels: stable nucleosome, unstable nucleosome, DNA binding protein, DNA binding site; panels I, II, III] Segal et al., 2006
Toward a proper free energy model for the sequence-dependent cost of DNA wrapping AA TT TA GC Morozov, Segal, Widom, & Siggia
Acknowledgements. Nucleosome positioning in vivo: Eran Segal (Weizmann Inst.); Yvonne Fondufe-Mittendorf; Lingyi Chen; Annchristine Thåström; Yair Field (Weizmann Inst.); Irene Moore; Jiping Wang (Northwestern U. Statistics); Alexandre Morozov (Rockefeller U.); Eric Siggia (Rockefeller U.)