Tiling Arrays
Madelaine Gogol, Programmer Analyst, Microarray
Norman Pavelka, Postdoc, Rong Li Lab
Technology and Methods Seminar
March 29, 2007
Tiling Arrays - Overview
What is a tiling array?
What can I do with it?
–ChIP-chip
–CGH
–Expression
Which tiling arrays are available for my experiments?
–in-house yeast tiling array (design details)
–Agilent
–Affymetrix
How will we analyze and visualize the data?
What is a Tiling Array?
A microarray with many probes distributed at evenly spaced intervals across an entire genome.
[Figure: Gene 1, Probe 1]
What can I do with tiling arrays?
Map the transcriptome
–what’s being expressed?
ChIP-chip
–where are proteins binding?
CGH
–what are the differences in genome structure?
Other possibilities
–Map the methylome
–Genome resequencing
–Polymorphism discovery
ChIP-chip
PCR w/aa-dNTP
Hybridize to Microarray
Analyze image, calculate ratios
array CGH Hybridize to Microarray
In-house yeast tiling array (YOGie)
Covers the yeast genome
Just printed
Resolution ~ 250 bases
Freely available
Spotted Microarray Manufacturing
Operon Probe Set
6307 probes, length 70
Designed one per ORF, near 3’ end
YOG arrays (yeast oligonucleotide)
YBOX: 3072 new probes
Intergenic Probe Set: Design
Design target: yeast intergenic regions
Original goal: leave no area > 500 uncovered
Array Oligo Selector (AOS)
Fasta format 140-mer sequences tiling the intergenic regions
9, mer sequences from the intergenic regions
[Figure: Gene 1, Gene 2]
Intergenic Probe Set
mer probes
No region greater than 360 left uncovered
~ 220 bases between probes on average
Chromosome 3 completely tiled
YOGi arrays (YOG+intergenic)
5’ probe set: Design
Goal:
–fill in gaps left by operon set in 5’ region of gene
–leave no region > 500 without a probe
Targets: ORF regions 5’ of operon probe
Array Oligo Selector (AOS)
[Figure: Gene 1, operon probe, 3’ and 5’ ends]
5’ probe set: target selection
What size targets?
–355mers that overlap by (500-70)/2
[Figure: target 1; target = 355]
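The target layout on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the slide means 355-base targets whose neighbours overlap by (500-70)/2 = 215 bases. The function name `tile_targets` and the handling of the region end are hypothetical and are not taken from the original design scripts.

```python
def tile_targets(region_start, region_end, target_len=355, overlap=215):
    """Split [region_start, region_end) into overlapping target windows.

    Assumption: defaults follow one reading of the slide
    (355-mers, overlap (500-70)/2 = 215).
    Regions no longer than one target are returned as a single window.
    """
    if region_end - region_start <= target_len:
        return [(region_start, region_end)]
    step = target_len - overlap  # 140 bases with the defaults
    last_start = region_end - target_len
    windows = [(s, s + target_len)
               for s in range(region_start, last_start + 1, step)]
    # Anchor one final window at the region end so no bases are left untiled.
    if windows[-1][1] < region_end:
        windows.append((last_start, region_end))
    return windows
```

Each window would then be handed to a probe picker such as AOS, which selects one oligo per target.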
5’ probe set: Reduction
Too many probes
–reduce to 6666 or fewer (budget and printing constraints)
Probes within 260 bases of each other are winnowed by:
–Tm
–binding energy
–number of matches to genome
[Figure: Gene 1, operon probe, 3’ and 5’ ends]
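One way to picture the winnowing step is a greedy thinning pass. The sketch below is an assumption about how such a reduction could work, not the procedure actually used for YOGie. The slide lists Tm, binding energy, and number of genome matches as criteria but does not say how they were combined, so the single `score` per probe is a hypothetical stand-in, and positions are assumed to lie on one sequence.

```python
def winnow_probes(probes, min_spacing=260, max_probes=6666):
    """Greedy thinning: keep the best-scoring probes that are not
    within min_spacing bases of an already kept probe.

    probes: list of (position, score) tuples on ONE sequence;
            a higher score means a better probe.
    The score is hypothetical; it stands in for the slide's criteria
    (Tm, binding energy, number of matches to the genome).
    """
    kept = []
    # Visit probes from best to worst score.
    for pos, score in sorted(probes, key=lambda p: p[1], reverse=True):
        if len(kept) >= max_probes:
            break
        # "Within 260 bases" is read here as a distance below min_spacing.
        if all(abs(pos - k) >= min_spacing for k, _ in kept):
            kept.append((pos, score))
    return sorted(kept)
```

With probes at positions 0, 100, 300, and 500 scored 1, 5, 2, and 3, the pass keeps the probes at 100 and 500 and drops the two that sit within 260 bases of the best one.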
5’ probe set
6, mer probes
Tile the region of each gene between the operon probe and the 5' end
YOGie arrays (YOG+intergenic+enhanced)
In-house yeast tiling array: YOGie
Together, the operon, intergenic, and 5’ sets make up our homemade yeast tiling array
Freely available
Also includes tight tiles of
–all centromeres
–7 sub-genomic regions kb
Agilent Microarray Manufacturing
Agilent Tiling Arrays
ChIP-on-chip
–Arabidopsis Whole Genome
–C. elegans Whole Genome
–Drosophila Whole Genome
–Human CpG Island
–Human ENCODE 244K
–Human Promoter
–Mouse Promoter
–Yeast Whole Genome 4 x 44K
–Yeast Whole Genome 244K
–Zebrafish Expanded Promoter
–Zebrafish Proximal Promoter
–Custom ChIP-on-chip
Oligo aCGH
–Human Genome 244K
–Human Genome 105K
–Human Genome 44K
–Mouse Genome 244K
–Mouse Genome 105K
–Mouse Genome 44K
Agilent Tiling Arrays: formats and cost per hyb
Format: 244k, 105k, 44k, 15k
Cost per slide: $400, $640, $720, $800
Cost per hyb: $400, $320, $180, $100
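The per-hyb figures follow from dividing the slide price by the number of arrays on a slide. The short check below assumes 1, 2, 4, and 8 arrays per slide for the 244k, 105k, 44k, and 15k formats; those counts are not stated on this slide and are inferred because they reproduce the quoted per-hyb prices.

```python
# Slide prices are taken from the slide; arrays_per_slide values are assumed.
FORMATS = {
    "244k": {"slide_cost": 400, "arrays_per_slide": 1},
    "105k": {"slide_cost": 640, "arrays_per_slide": 2},
    "44k": {"slide_cost": 720, "arrays_per_slide": 4},
    "15k": {"slide_cost": 800, "arrays_per_slide": 8},
}


def cost_per_hyb(fmt):
    """Cost of one hybridization = slide cost / arrays on that slide."""
    entry = FORMATS[fmt]
    return entry["slide_cost"] / entry["arrays_per_slide"]


for name in FORMATS:
    print(name, cost_per_hyb(name))
```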
Agilent Custom Array Design
Take an Agilent microarray design
Remove some probes
Put in your own probes
–Design using Agilent’s web application, eArray
You can also design everything from scratch
earray
Affymetrix Microarray Manufacturing
Affymetrix Tiling Arrays
–Arabidopsis Tiling 1.0R Array
–C. elegans Tiling 1.0R Array
–Chromosome 21/ Array Set
–Chromosome 21/22 2.0R Array
–Drosophila Tiling 1.0R Array
–ENCODE Array
–Human Genome Arrays +
–Mouse Genome Arrays +
–S. cerevisiae Tiling 1.0R Array
–S. pombe Tiling 1.0FR Array
Cost ~ $500 per array, so $500 per hyb.
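Summary of Tiling Arrays (only yeast shown)
Columns: type; # spots per array; probe size; resolution; per slide cost; per array cost; customizable?
Homemade YOGie arrays: 26, between 00; customizable with some effort
Agilent 44K: 44,000 spots (4); probe size 60; resolution 160 between; $720 per slide; $180 per array; easily customizable
Agilent 244K: 244,000 spots; probe size 60; resolution overlap by 18; $400; easily customizable
Affy: 3,200,000 spots (pm/mm); probe size 25; resolution overlap by 20; $500; not customizable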
Tiling array data analysis: still an adventure
Affy
–TAS (Tiling Analysis Software)
Agilent
–ChIP Analytics & CGH Analytics
Other
–genome browsers, R packages, other people’s software, do-it-yourself, Perl, statistical models
Data Visualization: Genome Browsers: UCSC
Data Visualization: Genome Browsers: IGB
Data Analysis: Sliding windows
ChIPOTle, PeakFinder, R scripts, etc.
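A sliding-window analysis averages the log2 ratios of all probes inside a window, moves the window along the chromosome, and flags windows whose mean is high. The sketch below is a generic illustration and does not reproduce the algorithm of ChIPOTle or PeakFinder. The function names, window size, step, and threshold are all assumptions chosen for the example.

```python
def sliding_window_means(positions, log_ratios, window_bp=500, step_bp=250):
    """Average the log2 ratios of the probes that fall in each window.

    Returns a list of (window_start, mean_log_ratio, n_probes).
    Windows that contain no probes are skipped.
    """
    probes = sorted(zip(positions, log_ratios))
    if not probes:
        return []
    results = []
    start = probes[0][0]
    last = probes[-1][0]
    while start <= last:
        values = [r for p, r in probes if start <= p < start + window_bp]
        if values:
            results.append((start, sum(values) / len(values), len(values)))
        start += step_bp
    return results


def call_peaks(windows, threshold=1.5):
    """Return the start coordinates of windows whose mean passes the threshold."""
    return [start for start, mean, _ in windows if mean >= threshold]
```

Real peak callers usually add a significance estimate on top of the window mean, for example from a permutation or a Gaussian background model; the fixed threshold here only keeps the example short.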
Data Analysis: Annotating and comparing peaks
Data Analysis: Average Gene analysis
Profile of binding across an average gene
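An average-gene profile can be built by rescaling every gene to the same relative length, binning the probe values along it, and averaging each bin over all genes. The following sketch shows one assumed way to do that. The data layout (tuples of start, end, strand and of position, value) and the function name are hypothetical, and all coordinates are assumed to be on a single chromosome.

```python
def average_gene_profile(genes, probes, n_bins=10):
    """Average probe values across genes scaled to a common length.

    genes:  list of (start, end, strand) with end > start, strand "+" or "-"
    probes: list of (position, value), same chromosome as the genes
    Returns one mean per bin from the 5' end to the 3' end (None if a bin is empty).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for start, end, strand in genes:
        length = end - start
        for pos, value in probes:
            if not (start <= pos <= end):
                continue
            # Relative position runs 0 -> 1 from the 5' end on either strand.
            if strand == "+":
                rel = (pos - start) / length
            else:
                rel = (end - pos) / length
            b = min(int(rel * n_bins), n_bins - 1)
            sums[b] += value
            counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Flipping minus-strand genes matters: without it, 5' and 3' signals from the two strands would be averaged into the same bins and cancel any asymmetry in the profile.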
Summary
Tiling arrays
–ChIP-chip, CGH, expression
Which ones are available
–In-house, Agilent, Affy
Data analysis
Future...
The Future of Tiling Arrays The resolution continues to increase...
Other Future Genomic Technology
454 and Solexa/Illumina “Next Generation” sequencing
“Sequence everything in the tube”
Shares some things with tiling arrays
–even more unbiased
–vast quantities of data
–analysis methods are being developed
Thanks!
5th floor, Building 2 N: Bioinformatics, Microarray
Allison, Karin, Brian, Chris, Me
Microarray
Bioinformatics
All the labs that use microarrays!
Bing Li, Workman lab
Jennifer Bupp, Jasperson lab
Norman Pavelka, Rong Li Lab
Technology & Methods Seminar
“Adventures in Electron Microscopy”
Rhonda Allen, Histology
Thursday, April 26th, 1:00 p.m.
Classroom (1st floor, Administration Building)
Schedule with abstracts and previous presentation slides can be found on: K:\Weekly Seminar Schedule\Thursday -- Technology & Methods
Information regarding previous seminars can be found at:
Technology & Methods Seminar: “Tiling Arrays - Probing Genome and Transcriptome Structure”
March 29, 2007
Norman Pavelka (Rong Li lab)
On the use of Affymetrix Tiling Arrays for Comparative Genomic Hybridizations
Background: Role of MYO1 in cytokinesis
Phenotype of yeast cells experiencing an acute loss of MYO1:
–Severe cytokinesis defect
–Impaired cell viability
Phenotype of yeast cells experiencing a chronic loss of MYO1:
–Extremely heterogeneous
–Occasionally: full recovery of cytokinesis proficiency and of growth ability
[Figure: Myo1]
Biological question: What genome changes occurred in e-strains?
–Polyploidization?
–Aneuploidization?
–Interstitial deletions?
–Reciprocal translocations?
–Non-reciprocal translocations?
–Amplifications?
–Single-nucleotide mutations?
Albertson & Pinkel, Hum Mol Genet (2003)
Method: array-based Comparative Genomic Hybridization (aCGH)
U.C. Berkeley Division of Biostatistics Working Paper Series (2002), paper 106.
Technology: Affymetrix Yeast Tiling Arrays
~12.5 million bp in the yeast genome
~6.5 million unique probes on the chip
Designed to interrogate the yeast genome with a 5 bp resolution: Gresham et al., Science (2006)
Experimental protocol:
1. Extraction of genomic DNA with phenol / chloroform / isoamyl alcohol
2. “Controlled” fragmentation with DNase I (5 min at 37 °C)
3. End-labeling with TdT and biotin-dUTP
4. Hybridize on Affy chips
5. Stain with streptavidin-PE
6. Wash and scan chips
[Figure: gel with ladder, 75 mU and 150 mU DNase I, strain 7a-1 and strain 2b (wt); axis: fragment length (nt)]
[Figure panels: 2b (low DNase I, large fragments); 2b (high DNase I, small fragments); 7a-1 (low DNase I, large fragments); 7a-1 (high DNase I, small fragments)]
Limitations: What genome changes can we see by aCGH?
–Polyploidization?
–Aneuploidization?
–Interstitial deletions?
–Reciprocal translocations?
–Non-reciprocal translocations?
–Amplifications?
–Single-nucleotide mutations?
Albertson & Pinkel, Hum Mol Genet (2003)
Observation #1: Deletion of the MYO1 locus
[Figure: log2(ratio) along Chromosome VIII; label: MYO1 locus]
Observation #2: “Duplication” of the TRP1 locus
Caveat #1: No information on where the signal comes from!
[Figure: two log2(ratio) plots along Chromosome IV; label: TRP]
Caveat #2: Highly repetitive sequences! (aka “Saturation” effect)
[Figure: two log2(ratio) plots along Chromosome II, axis ticks 0 and +1; labels: Ty1 LTR, Full-length Ty1]
Observation #3: Gradual loss of signal towards telomeres
[Figure: two log2(ratio) plots over the full sequence of chromosome II, axis ticks 0 and +1]
Observation #4: Aneuploidies
[Figure: axis label Chr.]
Caveat #3: “Dilution” effect
Possible observation #1: Non-reciprocal translocations? Dunham et al., PNAS (2002)
Possible observation #2: Single-nucleotide changes?
Gresham et al., Science (2006)
[Figure: log2(ratio) plotted over genomic DNA and the probes on the chip]
Summary:
What can be seen by CGH on Tiling Arrays?
→Anything that causes a change in the copy number of a DNA segment, e.g. aneuploidies, deletions/amplifications, non-reciprocal translocations, etc.
→Mutations that affect the hybridization of multiple overlapping probes, i.e. single-nucleotide changes.
What cannot be seen by CGH?
→Anything that does not cause a change in the copy number of a DNA segment, e.g. polyploidization, reciprocal translocations, etc.
→If probes are too long and non-overlapping, single-nucleotide mutations will not be detectable.
What are the most common pitfalls?
→No information about where the signal actually comes from!
→No reliable information from probes hybridizing to highly repetitive sequences (because of the “saturation” effect)!
→If some chromosomes are gained or lost, this will also affect the log-ratios of all other chromosomes (because of the “dilution” effect)!
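The “dilution” effect can be illustrated with a small calculation. The model below is a simplification and an assumption, not the analysis used in this talk: it supposes that test and reference samples contribute the same total amount of DNA (or are normalized to the same total signal), so each chromosome's signal is proportional to its share of the total. The function name and inputs are hypothetical.

```python
import math


def observed_log2_ratios(test_copies, ref_copies, lengths):
    """Expected log2(test/ref) per chromosome under equal-total-DNA normalization.

    test_copies, ref_copies: copy number of each chromosome (all > 0)
    lengths: chromosome lengths, in any consistent unit
    """
    total_test = sum(c * l for c, l in zip(test_copies, lengths))
    total_ref = sum(c * l for c, l in zip(ref_copies, lengths))
    return [
        math.log2((t / total_test) / (r / total_ref))
        for t, r in zip(test_copies, ref_copies)
    ]


# Toy genome: four equal-sized chromosomes, test strain gained a copy of the first.
ratios = observed_log2_ratios([2, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1])
print([round(x, 3) for x in ratios])
```

In this toy case the duplicated chromosome reads below the naive log2(2) = 1, and the three unchanged chromosomes read below 0, even though their copy number did not change.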
Acknowledgements:
Rong Li lab:
–Giulia Rancati
–Rong Li
Microarray group:
–Karin Zueckert-Gaudenz
–Allison Peak
–Chris Seidel