Tiling arrays for genetic, epigenetic, and environmental variation in Arabidopsis thaliana Justin Borevitz Ecology & Evolution University of Chicago
Widely Distributed Olivier Loudet
Local Population Variation Scott Hodges Ivan Baxter
Seasonal Variation Matt Horton Megan Dunning
Seasons in the Growth Chamber: Changing Day length; Cycle Light Intensity; Cycle Light Colors; Cycle Temperature. Sweden, Spain. Developmental Plasticity == Behavior
Which arrays should be used? cDNA array Long oligo array BAC array
Which 25mer arrays should be used? Gene array Exon array Tiling array 35bp tile, 25mers 10bp gaps
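The 35bp tile on this slide (a 25mer followed by a 10bp gap) can be illustrated with a minimal sketch. The function name `tile_starts` and the 100bp toy region are hypothetical, and a real array design would also filter probes for uniqueness and sequence quality, which is not modelled here.

```python
def tile_starts(region_length, probe_len=25, gap=10):
    """Return 0-based start positions of probes tiled end to end with a fixed gap."""
    step = probe_len + gap  # 25mer + 10bp gap = 35bp tile
    # Only keep probes that fit entirely inside the region.
    return list(range(0, region_length - probe_len + 1, step))


def covered_fraction(probe_len=25, gap=10):
    """Fraction of each tile that is actually covered by probe sequence."""
    return probe_len / (probe_len + gap)


if __name__ == "__main__":
    print(tile_starts(100))      # probe starts in a toy 100bp region
    print(covered_fraction())    # share of bases interrogated per tile
```

Under these assumptions roughly five of every seven bases fall under a probe, which is why a single-base change in a gap is invisible to the array.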
Which 25mer arrays should be used? Tiling/SNP array SNP array Resequencing array
Universal Whole Genome Array (RNA and DNA applications): Transcriptome Atlas (expression levels, tissue specificity); Gene/Exon Discovery (gene model correction, non-coding/micro-RNA); Alternative Splicing; Comparative Genome Hybridization (CGH) (insertions/deletions, copy number polymorphisms); Methylation; Chromatin Immunoprecipitation (ChIP chip); Polymorphism SFPs (discovery/genotyping); Control for hybridization/genetic polymorphisms to understand true EXPRESSION polymorphisms; RNA Immunoprecipitation (RIP chip); Antisense transcription; Allele Specific Expression
[Figure: Improved Genome Annotation. Tracks along Chromosome (bp): conservation, SNP, SFP, ORFa start, ORFb, deletion, Transcriptome Atlas]
Talk Outline
Whole Genome Tiling Arrays
–Spatial Correction, grid alignment
–Alternative splicing
–Methylation
–Single Feature Polymorphisms (SFPs)
–Genetic Mapping
–Potential deletions/ Copy Number Variants
–Allele Specific Expression
Resequencing/ Haplotypes
–Variation Scanning
Tiling Array Re-annotation: 6.25 million probes; 3.125 million PM probes; 1.67 million unique PM probes, 17bp (blast); 736k PM features in TUs (exon array); 130k TUs; 28k genes
Spatial Correction, grid Alignment Background correction for RNA, ! For DNA
Transcription subUnits (TUs): Exon1, Intron1, Exon2; Tu1, Tu2, Tu3
Alternative Splicing [Figure: array columns V V V C C C; V = Van, C = Col] Xu Zhang
Gene/Tu model for alternative splicing
ChIP chip treatment effect! Experimental Design same protocol/antibody dynamic binding model treatment effect Actual biological signal
Potential Deletions
Methods for labeling: Extract 100ng genomic DNA (single leaf); Digest with either MspI or HpaII (CCGG); Label with biotin random primers; Hybridize to array; Fit model Y = μ + E * G + ε (E = enzyme, G = genotype)
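The model-fitting step can be sketched for a single array feature, assuming E is the restriction enzyme, G is the genotype, and E * G stands for both main effects plus their interaction. The function name `fit_two_by_two`, the genotype labels, and the intensities are hypothetical toy values. This is a plain cell-means decomposition for a balanced 2x2 design, not necessarily the fitting procedure used in the talk, and it gives effect estimates only, without test statistics.

```python
from statistics import mean


def fit_two_by_two(cells):
    """Estimate effects for one feature in a balanced 2x2 design.

    cells maps (enzyme, genotype) -> list of replicate log intensities.
    Effects use sum-to-zero (+1/-1) coding, with the alphabetically first
    level of each factor coded as -1.
    """
    e0, e1 = sorted({e for e, _ in cells})
    g0, g1 = sorted({g for _, g in cells})
    m = {key: mean(values) for key, values in cells.items()}
    mu = (m[e0, g0] + m[e0, g1] + m[e1, g0] + m[e1, g1]) / 4
    enzyme = ((m[e1, g0] + m[e1, g1]) - (m[e0, g0] + m[e0, g1])) / 4
    genotype = ((m[e0, g1] + m[e1, g1]) - (m[e0, g0] + m[e1, g0])) / 4
    interaction = ((m[e1, g1] - m[e1, g0]) - (m[e0, g1] - m[e0, g0])) / 4
    return {"mu": mu, "enzyme": enzyme, "genotype": genotype,
            "interaction": interaction}


if __name__ == "__main__":
    # Toy log intensities, two replicate arrays per cell (made-up numbers).
    toy = {
        ("HpaII", "Col"): [9.5, 10.5],
        ("HpaII", "Van"): [7.5, 8.5],
        ("MspI", "Col"): [7.5, 8.5],
        ("MspI", "Van"): [7.5, 8.5],
    }
    print(fit_two_by_two(toy))
```

A nonzero interaction term marks a feature whose enzyme sensitivity differs between genotypes. In practice each effect would be tested per feature and the resulting list controlled for false discoveries.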
SFP detection on tiling arrays. [Table: Delta, p0, FALSE, Called, FDR; values not recovered.] Intergenic, Exon, Intron SFPs, total %: 8.86%, 3.53%, 5.71%. [Table: SFPs/gene (0, >=1, >=2, >=3, >=4, >=5) vs genes; counts not recovered.]
methylated features and mSFPs: >10,000 of 100,000 at 5% FDR, Enzyme effect, on CCGG features; GxE, 276 at 15% FDR; mQTL?
Chip genotyping of a Recombinant Inbred Line 29kb interval
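Narrowing a recombination breakpoint to an interval amounts to finding adjacent markers whose parental calls differ. Below is a minimal sketch, assuming each SFP has already been called as parent "A" or parent "B" for the line. The function name `breakpoint_intervals`, the parent labels, and the positions are hypothetical toy values, not data from the talk.

```python
def breakpoint_intervals(markers):
    """Return (left_bp, right_bp, width_bp) for each switch in parental call.

    markers: list of (position_bp, parent_call) tuples sorted by position.
    """
    intervals = []
    for (pos1, call1), (pos2, call2) in zip(markers, markers[1:]):
        if call1 != call2:
            intervals.append((pos1, pos2, pos2 - pos1))
    return intervals


if __name__ == "__main__":
    # Toy SFP calls along one chromosome (made-up positions).
    toy = [(1_000, "A"), (12_000, "A"), (40_000, "B"), (75_000, "B"), (90_000, "A")]
    for left, right, width in breakpoint_intervals(toy):
        print(f"breakpoint between {left} and {right} bp ({width} bp interval)")
```

The interval width is set by marker spacing, so denser SFP calls give tighter breakpoints. A real pipeline would also smooth over isolated miscalls before reporting a switch.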
Map bibb: 100 bibb mutant plants, 100 wt mutant plants
Array Mapping Hazen et al Plant Physiology 2005
Potential Deletions (wild lines): >500 potential deletions; 45 confirmed by Ler sequence; 23 (of 114) transposons; Disease Resistance (R) gene clusters; Single R gene deletions; Genes involved in Secondary metabolism; Unknown genes
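One simple way to flag potential deletions is to look for runs of consecutive probes with much weaker hybridization than the reference. The sketch below assumes per-probe log ratios (sample versus reference) are already available. The function name `candidate_deletions`, the threshold, and the minimum run length are hypothetical choices for illustration, not the criteria used for the counts on this slide.

```python
def candidate_deletions(positions, log_ratios, threshold=-1.0, min_probes=3):
    """Return (start_bp, end_bp, n_probes) for runs of low-signal probes.

    A run is a stretch of at least min_probes consecutive probes whose
    log ratio falls below threshold. Inputs must be ordered by position.
    """
    runs, current = [], []
    for pos, ratio in zip(positions, log_ratios):
        if ratio < threshold:
            current.append(pos)
            continue
        if len(current) >= min_probes:
            runs.append((current[0], current[-1], len(current)))
        current = []
    # Close a run that extends to the last probe.
    if len(current) >= min_probes:
        runs.append((current[0], current[-1], len(current)))
    return runs


if __name__ == "__main__":
    # Toy probes on a 35bp tile with made-up log ratios.
    toy_positions = [0, 35, 70, 105, 140, 175, 210]
    toy_ratios = [0.1, -2.0, -1.8, -2.2, 0.0, -1.5, 0.2]
    print(candidate_deletions(toy_positions, toy_ratios))
```

Requiring several adjacent probes separates a deletion from a single SFP, which affects only one feature.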
Fast Neutron deletions: FKF1 80kb deletion CHR1; cry2 10kb deletion CHR1 Het
Natural Variation on Tiling Arrays
Potential Deletions Suggest Candidate Genes: FLOWERING1 QTL, Chr1 (bp). Flowering Time QTL caused by a natural deletion in FLM (Werner et al PNAS 2005)
Allele specific expression
cis regulatory variation Col/Col Col/Van Van/Col Van/Van Van allele expressed Col allele expressed Col Female imprint
Allele specific expression between Col and Van
Array Haplotyping. What about Diversity/selection across the genome? A genome wide estimate of population genetics parameters: θw, π, Tajima’s D, ρ; LD decay, Haplotype block size; Deep population structure? Accessions: Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2, Fl-1, Ita-0, Mr-0, St-0, Sah-0
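Three of the parameters listed here (θw, π and Tajima’s D) can be computed directly from a set of aligned haplotypes. The sketch below uses the standard textbook formulas and assumes haplotypes are equal-length strings of "0"/"1" calls, for example SFP absent or present relative to the reference. The function name `diversity_stats` and the toy haplotypes are hypothetical. The recombination parameter ρ is not estimated, and array ascertainment bias is ignored.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(haplotypes):
    """Return S, pi, Watterson's theta and Tajima's D for a whole region.

    Values are totals for the region, not per-bp rates.
    """
    n = len(haplotypes)
    columns = list(zip(*haplotypes))
    seg = sum(1 for col in columns if len(set(col)) > 1)  # segregating sites
    pairs = list(combinations(haplotypes, 2))
    # pi: mean number of differences between two haplotypes.
    pi = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = seg / a1
    tajima_d = None  # undefined without variation or with too few samples
    if seg > 0:
        b1 = (n + 1) / (3 * (n - 1))
        b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
        c1 = b1 - 1 / a1
        c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
        e1 = c1 / a1
        e2 = c2 / (a1 ** 2 + a2)
        variance = e1 * seg + e2 * seg * (seg - 1)
        if variance > 0:
            tajima_d = (pi - theta_w) / sqrt(variance)
    return {"S": seg, "pi": pi, "theta_w": theta_w, "tajima_d": tajima_d}


if __name__ == "__main__":
    toy = ["0000", "0001", "0011", "0111"]  # made-up calls, four accessions
    print(diversity_stats(toy))
```

Negative D values indicate an excess of rare variants and positive values an excess of intermediate-frequency variants, relative to the neutral expectation.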
Array Haplotyping: Inbred lines; Low effective recombination due to partial selfing; Extensive LD blocks. Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd. Chromosome1 ~500kb
SFPs for reverse genetics 14 Accessions 30,950 SFPs
Chromosome Wide Diversity
Diversity 50kb windows
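A windowed diversity scan can be sketched by binning polymorphic markers into fixed 50kb windows and summing per-site pairwise diversity in each. The function name `windowed_pi`, the 0/1 call encoding, and the toy data are hypothetical. Non-overlapping windows are assumed here, since the slide does not say whether the windows slide or overlap.

```python
from collections import defaultdict


def windowed_pi(positions, genotypes, window=50_000):
    """Sum per-site pairwise diversity within non-overlapping windows.

    positions: bp coordinate of each polymorphic marker.
    genotypes: for each marker, a list of 0/1 calls across accessions.
    Returns {window_start_bp: summed pi}.
    """
    totals = defaultdict(float)
    for pos, calls in zip(positions, genotypes):
        n = len(calls)
        k = sum(calls)  # accessions carrying the non-reference call
        # Fraction of accession pairs that differ at this marker.
        site_pi = k * (n - k) / (n * (n - 1) / 2)
        totals[(pos // window) * window] += site_pi
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    # Toy markers and calls for four accessions (made-up values).
    toy_positions = [10_000, 40_000, 60_000]
    toy_genotypes = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]]
    print(windowed_pi(toy_positions, toy_genotypes))
```

Dividing each window total by the number of bases actually covered by probes would turn it into a per-bp rate that is comparable across windows.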
Tajima’s D like 50kb windows RPS4 unknown
R genes vs bHLH
NaturalVariation.org USC: Magnus Nordborg, Paul Marjoram; Max Planck: Detlef Weigel; Scripps: Sam Hazen; University of Michigan: Sebastian Zollner; University of Chicago: Xu Zhang, Evadne Smith, Ken Okamoto, Yan Li; Michigan State: Shinhan Shui; Purdue: Ivan Baxter; Sainsbury Laboratory: Jonathan Jones