Haplotype mapping with Single Feature Polymorphisms in Arabidopsis Justin Borevitz Ecology & Evolution University of Chicago
Talk Outline Natural Variation in Light Response Single Feature Polymorphisms (SFPs) –Potential deletions Haplotype analysis Patterns in gene Families Aquilegia
Light Affects the Entire Plant Life Cycle Light response variation can be seen under constant conditions in the lab Natural Variation under selection? Test in field
Seasons in the Growth Chamber Changing Day length Cycle Light Intensity Cycle Light Colors Cycle Temperature
Which arrays should be used? Spotted arrays: Arizona 29, mers. ATH1, Affymetrix expression GeneChip: 202,806 unique 25 bp oligonucleotide features. AtTILE1, universal whole genome array: one feature every ~35 bp, >3 million PM features. Re-sequencing array, 120M*8bp: –20 accessions, Perlegen, –Max Planck (Weigel), USC (Nordborg)
Universal Whole Genome Array (RNA, DNA). Transcriptome Atlas: expression levels, tissue specificity. Gene Discovery: gene model correction, non-coding/micro-RNA, antisense transcription. Alternative Splicing. Comparative Genome Hybridization (CGH): insertions/deletions. Methylation. Chromatin Immunoprecipitation: ChIP-chip. Polymorphism: SFP discovery/genotyping. Array design: ~35 bp tile, non-repetitive regions, “good” binding oligos, evenly spaced
ChipViewer: Mapping of transcriptional units of ORFeome From 2000v At1g09750 (MIPS) to the latest AGI At1g v Annotation (MIPS) The latest AGI Annotation
Improved Genome Annotation [Figure: tracks along Chromosome (bp) showing SNP, SFP, conservation, ORFa start, poly-A (AAAAA), Transcriptome Atlas, and ORFb deletion]
Potential Deletions
False Discovery and Sensitivity (PM only). SAM threshold 5% FDR: GeneChip SFPs vs. nonSFPs; Cereon marker accuracy %; Sequence Sensitivity; Polymorphic %; Non-polymorphic; False Discovery rate: 3%; Test for independence of all factors: Chisq = , df = 1, p-value = 1.845e-40. SAM threshold 18% FDR: GeneChip SFPs vs. nonSFPs; Cereon marker accuracy %; Sequence Sensitivity; Polymorphic %; Non-polymorphic; False Discovery rate: 13%; Test for independence of all factors: Chisq = , df = 1, p-value = 1.309e-59. 3/4 Cvi markers were also confirmed in PHYB. 90%, 80%, 70%; 41%, 53%, 85%; 90%, 80%, 70%; 67%, 85%, 100%. Cereon may be a sequencing error; a TIGR match is a match
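The "test for independence" reported above has df = 1, which suggests a 2x2 table of SFP calls against sequence polymorphism. A minimal sketch of such a test, assuming a Pearson chi-square without continuity correction; the counts are hypothetical and are not the values from the slide, and the function name is illustrative:

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts (assumption, not slide data):
# rows = called SFP / called nonSFP, columns = polymorphic / non-polymorphic in sequence
chi2, p = chi_square_2x2(80, 20, 30, 170)
print(f"Chisq = {chi2:.2f}, df = 1, p-value = {p:.3e}")
```

Statistical packages may apply a continuity correction to 2x2 tables by default, so their output can differ slightly from this uncorrected version.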
Chip genotyping of a Recombinant Inbred Line: 29 kb interval. Discovery: 6 replicates × $500 / 12,000 SFPs = $0.25 per SFP. Typing: 1 replicate × $500 / 12,000 SFPs = $0.041 per SFP
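The per-marker cost on this slide is the array cost spread over the number of SFPs scored. A small sketch that reproduces the arithmetic from the slide's own figures (the function name is illustrative):

```python
def cost_per_sfp(replicates, cost_per_array, n_sfps):
    """Total hybridization cost divided by the number of SFPs scored."""
    return replicates * cost_per_array / n_sfps

discovery = cost_per_sfp(6, 500, 12_000)  # 6 replicate arrays for SFP discovery
typing = cost_per_sfp(1, 500, 12_000)     # 1 array to genotype a new line

print(f"discovery: ${discovery:.2f} per SFP, typing: ${typing:.3f} per SFP")
```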
Map bibb: 100 bibb mutant plants vs. 100 wild-type (wt) plants
bibb mapping (ChipMap): bulk segregant mapping using chip hybridization. bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1 (AS1)
BIBB = ASYMMETRIC LEAVES1. Sequenced AS1 coding region from bib-1 … found a g -> a change that would introduce a stop codon in the MYB domain. Alleles in the MYB domain: bib-1 W49*, as-101 Q107*. AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64 cM [Figure labels: bibb, as1-101, as1, bibb]
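Why a single g -> a change can produce W49*: tryptophan has only one codon (TGG) in the standard genetic code, and changing either G to A gives a stop codon. A short sketch, assuming the change is read on the coding strand (the slide does not say which G is affected):

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
TRP = "TGG"  # the only tryptophan (W) codon in the standard genetic code

def g_to_a_variants(codon):
    """Return every codon reachable from `codon` by one g -> a substitution."""
    return [codon[:i] + "A" + codon[i + 1:]
            for i, base in enumerate(codon) if base == "G"]

variants = g_to_a_variants(TRP)
all_stop = all(v in STOP_CODONS for v in variants)
print(variants, all_stop)
```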
Array Mapping (Hazen et al., Plant Physiology, submitted) [Figure: panels for chr1, chr2, chr3, chr4, chr5]
eXtreme Array Mapping 15 tallest RILs pooled vs 15 shortest RILs pooled
eXtreme Array Mapping: allele frequencies determined by SFP genotyping; thresholds set by simulations. Compared with Composite Interval Mapping of the red light QTL RED2 (Chromosome 2, 12 cM) from 100 Kas/Col RILs (Wolyn et al., Genetics 2004) [Figure axes: cM, LOD]
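The idea of comparing allele frequencies between extreme pools, with a threshold set by simulation, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the talk: it takes pools of 15 RILs (as on the previous slide), assumes each line carries either parental allele with probability 0.5 at an unlinked marker, and derives a single-marker (not genome-wide) threshold. All names are hypothetical.

```python
import random

def pool_frequency(genotypes):
    """Fraction of lines in a pool carrying parental allele 1 (vs. allele 0)."""
    return sum(genotypes) / len(genotypes)

def simulate_threshold(pool_size=15, n_sims=10_000, quantile=0.95, seed=1):
    """Null quantile of |high pool freq - low pool freq| at one unlinked marker."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sims):
        high = [rng.randint(0, 1) for _ in range(pool_size)]
        low = [rng.randint(0, 1) for _ in range(pool_size)]
        diffs.append(abs(pool_frequency(high) - pool_frequency(low)))
    diffs.sort()
    return diffs[int(quantile * n_sims)]

threshold = simulate_threshold()
# A marker fully linked to the QTL: every tall line has allele 1, every short line allele 0.
linked_diff = abs(pool_frequency([1] * 15) - pool_frequency([0] * 15))
print(threshold, linked_diff > threshold)
```

A genome-wide threshold would need to account for the many linked markers tested; that step is left out here.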
eXtreme Array Mapping BurC F2
XAM Lz x Col F2 QTL Lz x Ler F2 (Werner et al Genetics in press)
eXtreme Array Fine Mapping: RED2 QTL between mark1 and mark2, ~2 Mb, ~8 cM, >400 SFPs. Select recombinants by PCR: >200 from >1250 plants. High and Low pools. [Figure: Col, het, and Kas genotype classes at the two markers, with counts ~2, ~43, ~268, ~539]
Potential Deletions: >500 potential deletions; 45 confirmed by Ler sequence; 23 (of 114) transposons; Disease Resistance (R) gene clusters; single R gene deletions; genes involved in secondary metabolism; unknown genes
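The slides do not state the rule used to call a potential deletion. One plausible reading, sketched here as an assumption, is a run of consecutive array features that all hybridize much more weakly in the accession than in the reference. The threshold, the minimum run length, and the function name are hypothetical.

```python
def find_deletion_runs(log_ratios, threshold=-1.0, min_run=3):
    """Return (start, end) index pairs, end exclusive, for runs of at least
    `min_run` consecutive features whose log ratio is below `threshold`."""
    runs, start = [], None
    for i, value in enumerate(log_ratios):
        if value < threshold:
            if start is None:
                start = i          # a new low-signal run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last feature.
    if start is not None and len(log_ratios) - start >= min_run:
        runs.append((start, len(log_ratios)))
    return runs

# Hypothetical accession-vs-reference log ratios along a chromosome segment
signal = [0.1, -2.0, -2.2, -1.8, 0.0, -2.1, 0.2, -3.0, -2.5, -2.9]
runs = find_deletion_runs(signal)
print(runs)
```

Requiring several adjacent features helps separate a deletion from an isolated SFP caused by a single SNP.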
Potential Deletions Suggest Candidate Genes: FLOWERING1 QTL, a flowering time QTL caused by a natural deletion in FLM (Werner et al., PNAS 2005) [Figure: Chr1 (bp), MAF1 FLM natural deletion]
Fast Neutron deletions: FKF1 80 kb deletion (CHR1); cry2 10 kb deletion (CHR1); Het
Array Haplotyping: What about diversity/selection across the genome? A genome-wide estimate of population genetics parameters: θ_W, π, Tajima's D, ρ. LD decay, haplotype block size. Deep population structure? Accessions: Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2, Fl-1, Ita-0, Mr-0, St-0, Sah-0
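The parameters named above have standard definitions. A minimal sketch, assuming each accession is scored 0/1 at every SFP in a window and using the textbook formulas for Watterson's θ_W, π, and Tajima's D (totals per window, not per base pair). SFPs are an ascertained marker set, so values computed this way are only "Tajima's D like", as a later slide puts it. The input format and function name are assumptions.

```python
import math

def diversity_stats(haplotypes):
    """haplotypes: equal-length strings of '0'/'1', one per accession (n >= 4).
    Returns (theta_w, pi, tajima_d); tajima_d is None without segregating sites."""
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    counts = [sum(int(h[j]) for h in haplotypes) for j in range(n_sites)]
    seg = [k for k in counts if 0 < k < n]      # segregating sites only
    s = len(seg)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = s / a1
    # Mean pairwise differences, summed over sites.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in seg)
    if s == 0:
        return theta_w, pi, None
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    tajima_d = (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return theta_w, pi, tajima_d

# Hypothetical window: four accessions, four SFPs, each private to one accession
theta_w, pi, d = diversity_stats(["1000", "0100", "0010", "0001"])
print(theta_w, pi, d)
```

An excess of rare variants, as in this toy window, makes π smaller than θ_W and D negative.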
Array Haplotyping: inbred lines; low effective recombination due to partial selfing; extensive LD blocks. [Figure: Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd haplotypes on Chromosome 1, ~500 kb]
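LD between two SFPs can be summarized with r². Because the accessions are inbred, each line contributes one haplotype, so haplotype frequencies can be counted directly. A small sketch with hypothetical 0/1 genotype vectors (the function name is illustrative):

```python
def r_squared(a, b):
    """r^2 between two biallelic markers scored 0/1 across inbred lines.
    Returns None if either marker is monomorphic."""
    n = len(a)
    pa, pb = sum(a) / n, sum(b) / n
    pab = sum(x * y for x, y in zip(a, b)) / n   # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return None
    return (pab - pa * pb) ** 2 / denom

# Hypothetical genotypes across four accessions
same_block = r_squared([1, 1, 0, 0], [1, 1, 0, 0])   # markers in perfect LD
unlinked = r_squared([1, 1, 0, 0], [1, 0, 1, 0])     # markers in linkage equilibrium
print(same_block, unlinked)
```

Plotting r² against the physical distance between marker pairs gives the LD decay curve; runs of adjacent markers with high r² mark haplotype blocks.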
Distribution of T-stats: null (permutation) vs. actual. [Figure labels: Not Col, Col, NA, NA duplications; 32,427; Calls 208,729; 12,250 SFPs]
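A null distribution of t statistics can be built by shuffling the sample labels for a feature and recomputing the statistic each time. A minimal sketch for one array feature, assuming a Welch-style t statistic and hypothetical intensities; SAM, mentioned on an earlier slide, uses a modified statistic that is not reproduced here.

```python
import math
import random
import statistics

def t_stat(x, y):
    """Welch-style t statistic; returns 0.0 if both groups have zero variance."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    if se == 0:
        return 0.0
    return (statistics.mean(x) - statistics.mean(y)) / se

def permutation_null(x, y, n_perm=1000, seed=0):
    """t statistics obtained after randomly reassigning the group labels."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        null.append(t_stat(pooled[:len(x)], pooled[len(x):]))
    return null

# Hypothetical log intensities for one feature: reference vs. accession replicates
col = [10.1, 10.3, 9.9]
other = [8.0, 8.2, 7.9]
observed = t_stat(col, other)
null = permutation_null(col, other)
p_perm = sum(abs(t) >= abs(observed) for t in null) / len(null)
print(observed, p_perm)
```

With only three replicates per group there are few distinct label assignments, so the per-feature permutation p-value is coarse; pooling the null statistics across all features gives a smoother null distribution.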
Sequence confirmation of SFPs
Accession  SFP  SNP  Total  FPR  FDR    Sensitivity
bay                         %    25.0%  54.1%
bur                         %    29.8%  57.9%
cvi                         %    21.7%  58.7%
ler                         %    22.0%  62.7%
lz                          %    18.9%  75.0%
mr                          %    17.9%  63.2%
mt                          %    26.1%  70.8%
sorbo                       %    29.7%  49.1%
ws                          %    13.8%  53.2%
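The three rates in this table can be computed from counts of features that do or do not overlap a sequenced SNP. A sketch using the usual definitions, which are assumed here since the slide does not spell them out; the counts are hypothetical, not data from the table:

```python
def confirmation_rates(tp, fp, fn, tn):
    """tp: SFP called and SNP present; fp: SFP called, no SNP;
    fn: SNP present, no SFP called; tn: neither."""
    return {
        "FPR": fp / (fp + tn),          # false calls among non-polymorphic features
        "FDR": fp / (fp + tp),          # false calls among all SFP calls
        "sensitivity": tp / (tp + fn),  # SNP-containing features detected as SFPs
    }

# Hypothetical counts for one accession (assumption)
rates = confirmation_rates(tp=30, fp=10, fn=20, tn=940)
print(rates)
```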
SFPs for reverse genetics: 14 accessions, 30,950 SFPs
Chromosome Wide Diversity
Diversity 50kb windows
Tajima’s D like 50kb windows RPS4 unknown
R genes vs bHLH Theta W RPS4
R genes vs bHLH Tajima’s D RPS4
R genes vs bHLH
Review: Single Feature Polymorphisms (SFPs) can be used for: identifying recombination breakpoints; eXtreme Array Mapping; potential deletions (candidate genes); haplotyping; diversity/selection; association mapping
Aquilegia (Columbines) Recent adaptive radiation, 350Mb genome
> 20k dbEST 11/14/2003 Animal lineage: good coverage Plant lineage: crop plant coverage
NSF Genome Complexity: 45,000 ESTs, 5’ and 3’ ends. 350 arrays, RNA and genotyping –High density SFP Genetic Map. Physical Map (BAC tiling path) –Physical assignment of ESTs. QTL for pollinator preference –~400 RILs, map abiotic stress –QTL fine mapping/LD mapping. Develop transformation techniques. Scott Hodges (UCSB), Elena Kramer (Harvard), Magnus Nordborg (USC), Justin Borevitz (U Chicago), Jeff Tompkins (Clemson)
NaturalVariation.org. Salk: Jon Werner, Joanne Chory, Joseph Ecker. Max Planck: Detlef Weigel. UC San Diego: Charles Berry. Scripps: Sam Hazen, Elizabeth Winzeler. University of Chicago: Xu Zhang, Evadne Smith. UC Davis: Julin Maloof. University of Guelph, Canada: Dave Wolyn. Sainsbury Laboratory: Jonathan Jones