CHANGE!! MGL Users Group meetings will now be on the 1st Monday of each month, 3:00-4:00, Room
Note the change of time and room.
Notes on ChIP-seq library preparation and initial handling
MGL Users Group, 1/21/15
Library Preparation
What is the desired resolution of the experiment? Do we need to identify a specific recognition motif? Do we want to resolve binding to base-pair resolution, or do we just need general areas?
Target resolution will influence both sample preparation and initial library construction. It may also influence the choice between paired-end and single-end sequencing. Paired-end sequencing generally gives more reliable mapping, because the second, proximal read helps disambiguate highly similar sequences. It can also potentially resolve the ‘ends’ and sizes of protected fragments.
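As a rough illustration of how paired-end reads can resolve fragment ends and sizes, the sketch below infers a fragment span from the two mates of a pair. The `Mate` record and `fragment_span` function are hypothetical names introduced here, coordinates are assumed to be 0-based leftmost aligned positions, and soft-clipping, indels, and other details a real aligner reports are ignored.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Mate:
    """Hypothetical minimal record for one aligned mate of a read pair."""
    start: int      # 0-based leftmost aligned position on the reference
    length: int     # number of reference bases covered by the read
    reverse: bool   # True if the read aligned to the reverse strand


def fragment_span(mate1: Mate, mate2: Mate) -> Optional[Tuple[int, int, int]]:
    """Return (start, end, size) of the fragment implied by a read pair.

    The interval is half-open: [start, end). Returns None when the mates
    are not in the expected forward/reverse orientation.
    """
    # Pick out the forward-strand and reverse-strand mates.
    fwd, rev = (mate1, mate2) if not mate1.reverse else (mate2, mate1)
    if fwd.reverse or not rev.reverse:
        return None  # both mates on the same strand

    start = fwd.start               # left end of the fragment
    end = rev.start + rev.length    # right end of the fragment
    if end <= start:
        return None  # mates point away from each other
    return start, end, end - start


if __name__ == "__main__":
    pair = (Mate(1000, 50, False), Mate(1101, 50, True))
    print(fragment_span(*pair))
```

With single-end data only one end of each fragment is observed, so fragment size has to be estimated rather than read off each pair.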
HiSeq / SOLiD reads are short
Unless the ChIP targets have a small footprint, even paired-end reads may not span a full pulldown target. Only the ends are read; the intervening sequence is ‘invisible’ to overall read density. With digested nucleosome fragments, for example, a single resulting fragment may carry more than one tandem ligand/target. If resolution is key, it is often desirable to ensure efficient cleavage/digestion of DNA prior to pulldown. Additional steps may be added to size-select for fragments closest in size to the expected single-ligand protected fragment.
Always run an Input control
ChIP-seq identifies regions of enrichment in the IP pool, but apparent enrichment may be affected by the relative abundance of those regions in the input. The input control serves as a normalization for apparent peaks of enriched signal in the IP, controlling for artificial enrichment.
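One simple way to picture input normalization is a depth-scaled fold enrichment per region. This is a minimal sketch under stated assumptions, not the method used by any particular peak caller: the function name, the linear scaling by total read count, and the pseudocount are all illustrative choices.

```python
def fold_enrichment(ip_count: float, input_count: float,
                    ip_total: float, input_total: float,
                    pseudocount: float = 1.0) -> float:
    """Fold enrichment of IP over input for one region.

    ip_count / input_count: reads falling in the region in each library.
    ip_total / input_total: total mapped reads in each library.
    pseudocount: assumed small constant that avoids division by zero
                 where the input has no reads.
    """
    # Scale the input count to the sequencing depth of the IP library.
    expected = input_count * (ip_total / input_total)
    return (ip_count + pseudocount) / (expected + pseudocount)


if __name__ == "__main__":
    # Region with many IP reads and few input reads: enriched.
    print(fold_enrichment(200, 50, 10_000_000, 20_000_000))
    # Region equally abundant in input after depth scaling: not enriched.
    print(fold_enrichment(200, 400, 10_000_000, 20_000_000))
```

In the second call the region has a high raw IP count but the same depth-scaled abundance in the input, so the ratio is 1: this is the artificial enrichment that the input control is meant to expose.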
Additional treatments pre-library construction
Large pulldown fragments were subjected to additional Covaris shearing prior to library construction. Fragments >1 kb would otherwise necessitate mate-pair library construction.
[Figure: fragment size profiles, panels Input and Immuno-precipitated; labeled sizes: mono-nucleosomes ~151 nt, poly-nucleosomes ~1,000-8,000 nt]
Final library sizes were similar
After Covaris shearing, construction of standard fragment libraries proceeded as normal.
Input library: ~285 nt
Immuno-precipitated library: ~285 nt
The common workflow for ChIP-seq following sequencing
Bardet et al., Nature Protocols (2012)
Data Alignment
Alignment strategies may differ, particularly if redundant or highly similar sequences are expected.
Alignment modes:
Unique alignment: only uniquely best alignments are used
Random assignment: if equal best alignments are found, randomly assign the read to one of them
Multiple assignment: if equal best alignments are found, assign the same read to all of them
Depending on the aligner used, a randomly or multiply aligned read will often be reported with a mapping quality of 0. Visualization tools may mask reads with mapping quality below a certain threshold.
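The three alignment modes above can be sketched as a single function that decides where a read is placed once its candidate alignments have been scored. The function name, the mode strings, and the (position, score) representation are assumptions made for illustration; real aligners expose these behaviors through their own options and scoring schemes.

```python
import random
from typing import List, Optional, Tuple


def assign_read(candidates: List[Tuple[str, int]], mode: str,
                rng: Optional[random.Random] = None) -> List[str]:
    """Choose reported positions for one read.

    candidates: (position, alignment_score) pairs; a higher score is better.
    mode: "unique", "random", or "multiple" (hypothetical labels).
    Returns the list of positions the read is assigned to.
    """
    if not candidates:
        return []

    best = max(score for _, score in candidates)
    ties = [pos for pos, score in candidates if score == best]

    if mode == "unique":
        # Keep the read only if a single alignment is strictly best.
        return ties if len(ties) == 1 else []
    if mode == "random":
        # Pick one of the equally good alignments at random.
        rng = rng or random.Random()
        return [rng.choice(ties)]
    if mode == "multiple":
        # Report the read at every equally good alignment.
        return ties
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    hits = [("chr1:100", 60), ("chr5:900", 60), ("chr2:40", 35)]
    for mode in ("unique", "random", "multiple"):
        print(mode, assign_read(hits, mode, rng=random.Random(0)))
```

In this example the read has two equally good placements, so unique mode discards it, random mode reports one of the two, and multiple mode reports both. The choice matters most in repetitive regions, where unique mode can leave real binding sites with no coverage.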
Peak identification & Mining Software
Peak calling/comparisons: HOMER, MACS, ODIN, R (DBChIP, others), MAnorm
Peak annotation: HOMER, Bedtools, GREAT, Cistrome, R (ChIPpeakAnno)
Visualization: IGV, UCSC Genome Browser, NGS-Plot, R (base & other packages)
Motif identification: InterPro (known), Pfam (known), MEME (discovery/known)
Enrichment analysis: DAVID, g:Profiler, BiNGO, AmiGO, R (GOstats, others)