Array Genotyping to Dissect Quantitative Trait Loci in Arabidopsis thaliana. Justin Borevitz, Ecology and Evolution, University of Chicago. naturalvariation.org. If you haven't voted, leave now; I'll/We'll forgive you.
Talk Outline: QTL intro; transcription-based cloning; Single Feature Polymorphisms (SFPs), including potential deletions; bulk segregant mapping, including eXtreme Array Mapping; haplotype analysis; new arrays, new models; Aquilegia.
Light Affects the Entire Plant Life Cycle: de-etiolation, hypocotyl.
Quantitative Trait Loci
From experimental design to QTL gene confirmation: Experimental Design, Mapping population, Phenotyping, QTL Analysis, Fine Mapping, QTL gene Confirmation. With the Aid of Genomics (genomics path): Marker Identification, Genotyping, Candidate gene (polymorphisms, gene expression, loss of function).
Genomics to Clone QTL: Recombination Fine Mapping; Gene Expression Variation; Hybridization Polymorphism; Association Testing, LD mapping; Direct Sequencing of Candidate Gene; Quantitative Complementation; Transgenic Complementation.
Transcription-based cloning: look for gene expression differences between genotypes; identify candidate genes that map to the mutation; downstream targets that map elsewhere.
differences may be due to expression or hybridization
PAG1 is down-regulated in Cvi. The PALE GREEN1 knockout has a long hypocotyl in red light.
What is Array Genotyping? Affymetrix expression GeneChips contain 202,806 unique 25 bp oligonucleotides, with 11 features per probe set for genes. New arrays have even more. Genomic DNA is randomly labeled with biotin (product ~50 bp). 3 independent biological replicates are compared to the reference strain Col (GeneChip).
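The replicate comparison described on this slide can be illustrated with a per-feature test. The sketch below is only a stand-in: it uses a plain Welch t statistic rather than the SAM procedure named later in the talk, and the feature names, intensities, and threshold are all hypothetical.

```python
import math
from statistics import mean, variance

def sfp_t_statistic(ref_log_intensity, test_log_intensity):
    """Welch t statistic for one 25 bp feature (reference minus test)."""
    n_ref, n_test = len(ref_log_intensity), len(test_log_intensity)
    standard_error = math.sqrt(
        variance(ref_log_intensity) / n_ref
        + variance(test_log_intensity) / n_test
    )
    return (mean(ref_log_intensity) - mean(test_log_intensity)) / standard_error

# Hypothetical log2 intensities: 3 Col reference replicates, 3 test replicates.
features = {
    "feature_A": ([10.1, 10.3, 10.2], [8.0, 8.2, 7.9]),  # weaker signal in test
    "feature_B": ([9.5, 9.7, 9.6], [9.6, 9.4, 9.7]),     # similar signal
}

THRESHOLD = 6.0  # arbitrary cutoff for illustration only

# A feature is called a candidate SFP when the reference hybridizes much better.
calls = {
    name: sfp_t_statistic(ref, test) > THRESHOLD
    for name, (ref, test) in features.items()
}
```

With only three replicates per group the variance estimates are noisy, which is one reason moderated statistics such as SAM are usually preferred in practice.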
Potential Deletions
Spatial Correction: spatial artifacts; improved reproducibility. Next: Quantile Normalization.
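Quantile normalization, named here as the next step, forces every chip to share the same intensity distribution. A minimal pure-Python sketch follows; the example intensities are hypothetical and ties are broken by position, which a production implementation would handle more carefully.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length intensity lists, one list per chip."""
    n_features = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across chips.
    reference = [
        sum(s[k] for s in sorted_arrays) / len(arrays)
        for k in range(n_features)
    ]
    normalized = []
    for a in arrays:
        # Feature indices ordered from lowest to highest intensity on this chip.
        order = sorted(range(n_features), key=lambda i: a[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Hypothetical intensities for two chips and three features.
chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized_chips = quantile_normalize(chips)
```

After normalization each chip contains exactly the same set of values, only assigned to different features according to the original ranks.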
False Discovery and Sensitivity (PM only). SAM threshold 5% FDR: GeneChip SFPs vs. nonSFPs against sequence (polymorphic, non-polymorphic), by Cereon marker accuracy; sensitivity in %; False Discovery rate: 3%; test for independence of all factors: Chisq = , df = 1, p-value = 1.845e-40. SAM threshold 18% FDR: same layout; False Discovery rate: 13%; test for independence of all factors: Chisq = , df = 1, p-value = 1.309e-59. 3/4 Cvi markers were also confirmed in PHYB. Percentages shown: 90%, 80%, 70%; 41%, 53%, 85%; 90%, 80%, 70%; 67%, 85%, 100%. Cereon may be a sequencing error; a TIGR match is a match.
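The quantities on this slide (false discovery rate, sensitivity, and a 1-degree-of-freedom chi-square test of independence) all derive from a 2x2 table of array calls against sequence. The sketch below shows the arithmetic under the usual definitions; the counts are invented for illustration and are not the talk's data.

```python
def summarize_calls(tp, fp, fn, tn):
    """Summarize a 2x2 table of array calls versus sequence.

    tp: called SFP, polymorphic        fp: called SFP, not polymorphic
    fn: called nonSFP, polymorphic     tn: called nonSFP, not polymorphic
    """
    false_discovery_rate = fp / (tp + fp)
    sensitivity = tp / (tp + fn)
    n = tp + fp + fn + tn
    # Pearson chi-square for a 2x2 table (df = 1), no continuity correction.
    chi_square = (
        n * (tp * tn - fp * fn) ** 2
        / ((tp + fp) * (fn + tn) * (tp + fn) * (fp + tn))
    )
    return false_discovery_rate, sensitivity, chi_square

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
fdr, sens, chi2 = summarize_calls(tp=90, fp=10, fn=30, tn=870)
```

A chi-square value above 3.84 rejects independence at the 5% level for one degree of freedom, so a table like this one shows a strong association between array calls and sequence polymorphism.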
Chip genotyping of a Recombinant Inbred Line: 29 kb interval. Discovery: 6 replicates × $500 / 12,000 SFPs = $0.25 per SFP. Typing: 1 replicate × $500 / 12,000 SFPs ≈ $0.042 per SFP.
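The per-marker cost follows directly from the figures on the slide. A short worked version, where the constant and function names are illustrative:

```python
CHIP_COST_USD = 500      # cost of one array hybridization, from the slide
SFPS_PER_CHIP = 12_000   # number of SFPs scored per hybridization, from the slide

def cost_per_sfp(replicates):
    """Cost in USD per SFP when `replicates` chips are hybridized."""
    return replicates * CHIP_COST_USD / SFPS_PER_CHIP

discovery_cost = cost_per_sfp(6)  # marker discovery with 6 replicates
typing_cost = cost_per_sfp(1)     # genotyping one line with a single chip
```

Discovery is six times more expensive per marker than typing because the same 12,000 SFPs are shared across six hybridizations instead of one.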
Near-Isogenic Lines for LIGHT1 (Ler / Cvi). Figure labels: markers SNP377, SM184, SM50, SM35, SM106, G2395, SNP65, SM40, SEQ8298, TH1, MSAT7964, MAT7787, CER; Mb; Marker; #3; 81N-J17A-A/J; Ler; Plants; Line; RVE7; GI; Phenotype (mm).
LIGHT1 NIL
Potential Deletions: >500 potential deletions; 45 confirmed by Ler sequence; 23 (of 114) transposons; Disease Resistance (R) gene clusters; single R gene deletions; genes involved in secondary metabolism; unknown genes.
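The talk does not state how potential deletions are called. One plausible approach, sketched here purely as an assumption, is to look for runs of adjacent features whose hybridization signal is strongly reduced relative to the reference. The threshold, minimum run length, and log ratios below are all hypothetical.

```python
def find_deletion_runs(log_ratios, threshold=-1.0, min_run=3):
    """Return (start, end) index pairs of runs with log ratio <= threshold.

    log_ratios: per-feature log2(test / reference), ordered along the chromosome.
    Only runs of at least `min_run` consecutive features are reported.
    """
    runs = []
    start = None
    for i, value in enumerate(log_ratios):
        if value <= threshold:
            if start is None:
                start = i          # a new low-signal run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last feature.
    if start is not None and len(log_ratios) - start >= min_run:
        runs.append((start, len(log_ratios) - 1))
    return runs

# Hypothetical log ratios for ten adjacent features.
ratios = [0.1, -2.0, -1.8, -2.2, 0.0, -1.5, 0.2, -1.2, -1.9, -2.5]
candidate_deletions = find_deletion_runs(ratios)
```

Requiring several adjacent features separates candidate deletions from isolated SFPs, which affect a single 25 bp feature.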
Potential Deletions Suggest Candidate Genes: FLOWERING1 QTL, Chr1 (bp). Flowering Time QTL caused by a natural deletion in MAF1. Figure labels: MAF1; MAF1 natural deletion.
Fast Neutron deletions: FKF1 80 kb deletion (CHR1); cry2 10 kb deletion (CHR1); Het.
Map bibb: 100 bibb mutant plants; 100 wt plants.
bibb mapping (ChipMap): bulk segregant mapping using chip hybridization. bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1 (AS1).
BIBB = ASYMMETRIC LEAVES1. Sequenced AS1 coding region from bib-1 …found g -> a change that would introduce a stop codon in the MYB domain. Alleles: bib-1 W49*; as-101 Q107*. AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64 cM. Figure labels: bibb, as1-101, MYB, as1, bibb.
eXtreme Array Mapping 15 tallest RILs pooled vs 15 shortest RILs pooled
eXtreme Array Mapping: allele frequencies determined by SFP genotyping; thresholds set by simulations. Compared with Composite Interval Mapping (LOD vs. cM) of the red light QTL RED2 on Chromosome 2, from 100 Kas/Col RILs. RED2 QTL: 12 cM.
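The idea of reading allele frequencies from pooled hybridizations can be sketched as follows. This is an illustrative assumption, not the talk's actual model: each pool's allele frequency at an SFP is estimated by linear interpolation between the two parental signals, the tall and short pools are contrasted, and the difference is smoothed along the chromosome. All intensities are hypothetical, and the simulation-based thresholds mentioned on the slide are not reproduced.

```python
def pool_allele_frequency(pool, parent_a, parent_b):
    """Estimate the parent_b allele fraction in a pool, clipped to [0, 1].

    Assumes the pooled signal lies on a line between the parental signals,
    which requires parent_a != parent_b (true for a usable SFP).
    """
    fraction = (pool - parent_a) / (parent_b - parent_a)
    return min(1.0, max(0.0, fraction))

def sliding_mean(values, half_window=1):
    """Average each value with its neighbors within half_window positions."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half_window): i + half_window + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Hypothetical log2 signals at five SFPs ordered along one chromosome.
col = [10.0] * 5                          # parent with the perfect-match allele
kas = [8.0] * 5                           # parent carrying the SFP
tall_pool = [9.0, 8.8, 8.2, 8.8, 9.0]     # pooled tallest RILs
short_pool = [9.0, 9.2, 9.8, 9.2, 9.0]    # pooled shortest RILs

diff = [
    pool_allele_frequency(t, c, k) - pool_allele_frequency(s, c, k)
    for t, s, c, k in zip(tall_pool, short_pool, col, kas)
]
smoothed = sliding_mean(diff)
# The SFP with the largest smoothed frequency difference marks the likely QTL.
peak_index = max(range(len(smoothed)), key=lambda i: abs(smoothed[i]))
```

Unlinked regions should show no frequency difference between the extreme pools, so the peak stands out against a flat background.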
eXtreme Array Mapping BurC F2
XAM Lz x Col F2 QTL Lz x Ler F2
eXtreme Array Fine Mapping: RED2 QTL between mark1 and mark2 (~2 Mb, ~8 cM, >400 SFPs). Select recombinants by PCR: >200 from >1250 plants; High and Low pools. Two-marker genotype classes (Col, het, Kas) with approximate counts: ~268, ~43, ~2, ~539.
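The approximate class sizes on this slide look like expected two-marker genotype counts in an F2. The sketch below computes those expectations from a recombination fraction. The values `n_plants=1251` and `r=0.074` are assumptions, back-calculated so that the output lands near the slide's approximate figures; they are not stated in the talk.

```python
from itertools import product

def marker_genotype(allele1, allele2):
    """Genotype at one marker from the alleles of the two parental gametes."""
    if allele1 == allele2:
        return "Col" if allele1 == "C" else "Kas"
    return "het"

def f2_two_marker_expected(n_plants, r):
    """Expected counts of the 9 (mark1, mark2) genotype classes in an F2.

    r is the recombination fraction between the two flanking markers.
    """
    parental, recombinant = (1 - r) / 2, r / 2
    # Gametes are written as (allele at mark1, allele at mark2).
    gametes = {
        ("C", "C"): parental, ("K", "K"): parental,
        ("C", "K"): recombinant, ("K", "C"): recombinant,
    }
    expected = {}
    for (g1, f1), (g2, f2) in product(gametes.items(), repeat=2):
        cls = (marker_genotype(g1[0], g2[0]), marker_genotype(g1[1], g2[1]))
        expected[cls] = expected.get(cls, 0.0) + n_plants * f1 * f2
    return expected

expected = f2_two_marker_expected(n_plants=1251, r=0.074)  # assumed values
# Plants whose genotypes differ between the two markers carry a detectable
# recombination event inside the interval.
detectable_recombinants = sum(
    count for (m1, m2), count in expected.items() if m1 != m2
)
```

Only plants with different genotypes at the two markers are informative for fine mapping, which is why a cheap PCR screen at the flanking markers precedes array genotyping.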
Array Haplotyping: What about diversity/selection across the genome? A genome-wide estimate of population genetics parameters: θ_W, π, Tajima's D, ρ; LD decay, haplotype block size; deep population structure? Accessions: Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2; Fl-1, Ita-0, Mr-0, St-0, Sah-0.
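Three of the parameters listed here can be computed from a binary haplotype matrix using the standard formulas for Watterson's θ_W, nucleotide diversity π, and Tajima's D. The sketch below treats each inbred accession as one haplotype and reports values per region rather than per site; the example matrix is hypothetical. SFP markers carry ascertainment bias, so real estimates would need correction that is not shown.

```python
import math

def diversity_stats(haplotypes):
    """Return (theta_w, pi, tajima_d) for a list of 0/1 haplotype rows."""
    n = len(haplotypes)
    # Minor/derived allele counts at segregating sites only.
    counts = [sum(site) for site in zip(*haplotypes)]
    counts = [k for k in counts if 0 < k < n]
    s = len(counts)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = s / a1
    # Mean number of pairwise differences, summed over sites.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in counts)

    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    tajima_d = (pi - theta_w) / math.sqrt(variance) if variance > 0 else float("nan")
    return theta_w, pi, tajima_d

# Hypothetical data: 4 accessions (rows) scored at 4 markers (columns).
example = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
theta_w, pi, tajima_d = diversity_stats(example)
```

Negative D indicates an excess of rare variants and positive D an excess of intermediate-frequency variants, relative to the neutral expectation.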
A star phylogeny: 163 markers, 73 accessions, ~750 kb/marker.
Array Haplotyping: inbred lines; low effective recombination due to partial selfing; extensive LD blocks. Figure: Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd across Chromosome 1 (~500 kb).
Universal Whole Genome Array (RNA and DNA). RNA: Transcriptome Atlas (expression levels, tissue specificity); Gene Discovery (gene model correction, non-coding/micro-RNA, antisense transcription); Alternative Splicing. DNA: Comparative Genome Hybridization (CGH) (insertions/deletions); Methylation; Chromatin Immunoprecipitation (ChIP chip); Polymorphism SFPs (discovery/genotyping). Design: ~35 bp tile, non-repetitive regions, "good" binding oligos, evenly spaced.
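Linkage disequilibrium between two markers in inbred lines can be summarized with r², since each homozygous accession contributes one haplotype. A minimal sketch; the three marker vectors are hypothetical genotypes for eight accessions, not the talk's data.

```python
def r_squared(marker_a, marker_b):
    """r^2 between two biallelic markers coded 0/1, one value per inbred line.

    Both markers must be polymorphic in the sample, otherwise the
    denominator is zero.
    """
    n = len(marker_a)
    p_a = sum(marker_a) / n
    p_b = sum(marker_b) / n
    p_ab = sum(1 for x, y in zip(marker_a, marker_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b          # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical alleles for eight accessions at three markers.
marker_1 = [0, 0, 1, 1, 0, 1, 0, 1]
marker_2 = [0, 0, 1, 1, 0, 1, 0, 1]   # identical pattern: same LD block
marker_3 = [0, 1, 0, 1, 1, 0, 0, 1]   # unrelated pattern

same_block = r_squared(marker_1, marker_2)
different_block = r_squared(marker_1, marker_3)
```

Computing r² for marker pairs at increasing physical distance gives the LD decay curve, and stretches of high r² correspond to haplotype blocks.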
Transcriptome Viewer:
Improved Genome Annotation. Figure labels: Chromosome (bp); conservation; SNP; SFP; ORFa start; ORFb deletion; Transcriptome Atlas.
Review: Transcription Based Cloning. Single Feature Polymorphisms (SFPs) can be used to: find potential deletions (candidate genes); identify recombination breakpoints; eXtreme Array Mapping; haplotyping; diversity/selection; association mapping.
NSF Genomics of Adaptation to the Biotic and Abiotic Environment in Aquilegia: Scott Hodges (UCSB), Elena Kramer (Harvard), Magnus Nordborg (USC), Justin Borevitz (U Chicago), Jeff Tompkins (Clemson).
Aquilegia (Columbines) Recent adaptive radiation, 350Mb genome
NSF Genomics of Adaptation to the Biotic and Abiotic Environment in Aquilegia: 35,000 ESTs, 5' and 3'; 350 arrays, RNA and genotyping (high density SFP genetic map); physical map, BAC tiling path (physical assignment of ESTs); QTL for pollinator preference and abiotic stress (QTL fine mapping / LD mapping); develop transformation techniques.
NaturalVariation.org. Salk: Jon Werner, Sarah Liljegren, Huaming Chen, Joanne Chory, Detlef Weigel, Joseph Ecker. UC San Diego: Charles Berry. Scripps: Sam Hazen, Steve Kay, Elizabeth Winzeler. University of Chicago: Xu Zhang, Evadne Smith. Syngenta: Hur-Song Chang, Tong Zhu. UC Davis: Julin Maloof. University of Guelph, Canada: Dave Wolyn. Sainsbury Laboratory: Jonathan Jones.