High Density Oligo Arrays for Single Feature Polymorphism Genotyping and Mapping Justin Borevitz Ecology & Evolution University of Chicago

1 High Density Oligo Arrays for Single Feature Polymorphism Genotyping and Mapping Justin Borevitz Ecology & Evolution University of Chicago http://naturalvariation.org

2 Which arrays should be used?
- Spotted arrays: Arizona, 29,000 70-mers
- ATH1, Affymetrix expression GeneChip: 202,806 unique 25 bp oligonucleotide features
- AtTILE1, universal whole genome array: one probe every ~35 bp, >3 million PM features
- Re-sequencing array: 120M*8bp
  – 20 accessions, Perlegen
  – Max Planck (Weigel), USC (Nordborg)

3 Universal Whole Genome Array (RNA, DNA)
- Transcriptome Atlas: expression levels, tissue specificity
- Gene Discovery: gene model correction, non-coding/micro-RNA, antisense transcription
- Alternative Splicing
- Comparative Genome Hybridization (CGH): insertions/deletions
- Methylation
- Chromatin Immunoprecipitation (ChIP chip)
- Polymorphism: SFP discovery/genotyping
Design: ~35 bp tile, non-repetitive regions, “good” binding oligos, evenly spaced

4 ChipViewer: Mapping of transcriptional units of ORFeome, from 2000 v At1g09750 (MIPS) to the latest AGI At1g09750 [Figure panels: 2000 v Annotation (MIPS); the latest AGI Annotation]

5 Improved Genome Annotation [Figure: Transcriptome Atlas tracks along Chromosome (bp), with labels SNP, SFP, conservation, ORFa, start, ORFb, deletion]

6 Talk Outline
- Single Feature Polymorphisms (SFPs)
- Barley SFPs
- Uses of SFPs
- Haplotype analysis
- Expression

7 Potential Deletions

8 Spatial Correction: spatial artifacts, improved reproducibility. Next: Quantile Normalization
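The quantile normalization step named on this slide can be sketched as follows. This is a minimal pure-Python illustration of the general technique, not the talk's own script: the function name `quantile_normalize` and the toy intensities are hypothetical, all chips are assumed to have the same number of features, and ties are broken by position rather than averaged.

```python
def quantile_normalize(arrays):
    """Give every chip the same intensity distribution.

    arrays: list of equal-length lists, one list of feature intensities per chip.
    Returns new lists in the original feature order.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all chips must have the same number of features")

    # Sort each chip, then average across chips at each rank.
    sorted_chips = [sorted(a) for a in arrays]
    rank_means = [sum(chip[r] for chip in sorted_chips) / len(arrays)
                  for r in range(n)]

    normalized = []
    for a in arrays:
        # Feature indices ordered from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]  # replace value by the mean at its rank
        normalized.append(out)
    return normalized


# Hypothetical toy data: two chips, three features each.
chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(chips))
```

After normalization each chip contains the same set of values, so remaining differences at a feature reflect rank changes rather than chip-wide intensity shifts.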

12 False Discovery and Sensitivity (PM only)
SAM threshold: 5% FDR (Cereon marker accuracy 100%)
                  Total   SFPs    nonSFPs
GeneChip                  3806    89118
Sequence          817     121     696
Polymorphic       340     117     223      Sensitivity: 34%
Non-polymorphic   477     4       473      False Discovery rate: 3%
Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40
SAM threshold: 18% FDR (Cereon marker accuracy 100%)
                  Total   SFPs    nonSFPs
GeneChip                  10627   82297
Sequence          817     223     594
Polymorphic       340     195     145      Sensitivity: 57%
Non-polymorphic   477     28      449      False Discovery rate: 13%
Test for independence of all factors: Chisq = 265.13, df = 1, p-value = 1.309e-59
3/4 Cvi markers were also confirmed in PHYB
90% 80% 70% 41% 53% 85% 90% 80% 70% 67% 85% 100%
Cereon may be a sequencing error; TIGR match is a match
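The sensitivity and false discovery rate quoted on this slide follow directly from the sequence-verified counts. A small sketch, assuming the usual definitions (sensitivity = called SFPs among polymorphic features; FDR = non-polymorphic features among called SFPs); the function name `sfp_call_rates` is hypothetical.

```python
def sfp_call_rates(tp, fn, fp):
    """Return (sensitivity, false discovery rate) for SFP calls checked by sequence.

    tp: polymorphic features called SFP
    fn: polymorphic features called nonSFP
    fp: non-polymorphic features called SFP
    """
    sensitivity = tp / (tp + fn)
    fdr = fp / (tp + fp)
    return sensitivity, fdr


# Counts read from the slide for the two SAM thresholds.
thresholds = {
    "5% FDR": (117, 223, 4),
    "18% FDR": (195, 145, 28),
}
for name, counts in thresholds.items():
    sens, fdr = sfp_call_rates(*counts)
    print(f"SAM threshold {name}: sensitivity {sens:.0%}, FDR {fdr:.0%}")
```

Loosening the SAM threshold raises sensitivity from about 34% to about 57% at the price of a higher observed false discovery rate, about 3% versus 13%.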

13 Effect of SNP position: 340 candidate polymorphisms [Figure legend: false negative, true positive]

15 Complex Genomes? Signal to noise with large genomes. RNA: less complex, but differential expression

16 Barley SFPs

17 RNA: 2 genotypes, 18 replicates

18 False Discovery Rate, RNA
RNA hybridization: 17 Golden Promise, 19 Morex, 6 tissues
SAM Analysis for the Two-Class Unpaired Case Assuming Unequal Variances
s0 = 0.0342 (The 5 % quantile of the s values.)
Number of permutations: 500
MEAN number of falsely called genes is computed.
Delta  p0    Called  FALSE  FDR
0.5    0.95  27159   5884   0.206
1.0    0.95  17744   594    0.032
1.5    0.95  13285   65     0.005
2.0    0.95  10504   7      0.001
2.5    0.95  8583    0      0.000
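The FDR column in this SAM output is consistent with the estimate FDR = p0 × FALSE / Called, where p0 is the estimated fraction of unchanged features. The sketch below assumes that formula and checks it against the slide's rows; the function name `sam_fdr` is hypothetical, and because FALSE is a rounded mean over permutations the agreement is only to the printed precision.

```python
def sam_fdr(p0, called, false_mean):
    """Estimated false discovery rate for one SAM delta threshold."""
    if called == 0:
        return 0.0
    return p0 * false_mean / called


# (Delta, Called, FALSE) as printed on the slide; p0 = 0.95 in every row.
P0 = 0.95
rows = [
    (0.5, 27159, 5884),
    (1.0, 17744, 594),
    (1.5, 13285, 65),
    (2.0, 10504, 7),
    (2.5, 8583, 0),
]
for delta, called, false_mean in rows:
    print(f"Delta {delta}: FDR = {sam_fdr(P0, called, false_mean):.3f}")
```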

19 Barley SFPs, genomic DNA: 3 genotypes, 3 replicates

20 False Discovery Rate, DNA
Genomic DNA hybridization: 3 replicates, 3 genotypes
SAM Analysis for the Multi-Class Case with 3 Classes
s0 = 0.0123 (The 25 % quantile of the s values.)
Number of permutations: 100
MEAN number of falsely called genes is computed.
Delta  p0    Called  FALSE  FDR
1      0.95  4017    2073   0.47
2      0.95  1728    583    0.31
3      0.95  1090    258    0.22
4      0.95  789     139    0.16
5      0.95  631     86     0.13

21 Sequence Verification of SFPs (RNA, GeneChip)
Sequence 53012403075203
                 Total  mxSFP  nonSFP  gpSFP
MX               178    115    45      18
Non-polymorphic  2200   27     2045    128
GP               223    7      61      155
Chisq = 2049.2, df = 4, p-value = 0
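The test statistic on this slide can be reproduced from the three sequence classes (MX, non-polymorphic, GP) against the three chip calls (mxSFP, nonSFP, gpSFP). A minimal sketch, assuming a plain Pearson chi-square test of independence with no continuity correction; the function name `pearson_chi_square` is hypothetical.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of sequence class and chip call.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: MX, non-polymorphic, GP. Columns: mxSFP, nonSFP, gpSFP.
counts = [
    [115, 45, 18],
    [27, 2045, 128],
    [7, 61, 155],
]
chi2, df = pearson_chi_square(counts)
print(f"Chisq = {chi2:.1f}, df = {df}")
```

Most of the statistic comes from the two diagonal corner cells: features sequenced as MX and called mxSFP, and features sequenced as GP and called gpSFP.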

22 Position of SNP

23 Barley SFPs per probeset

24 Uses of SFPs
- Recombination events
- Mapping Mendelian mutations
- Mapping QTL
- Deletions
- Haplotyping

25 Chip genotyping of a Recombinant Inbred Line (29 kb interval)
Discovery: 6 replicates × $500 / 12,000 SFPs = $0.25 per SFP
Typing: 1 replicate × $500 / 12,000 SFPs = $0.041 per SFP
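The per-marker cost on this slide is the chip cost times the number of replicates, divided by the number of SFPs scored. A short sketch of that arithmetic; the function name `cost_per_sfp` is hypothetical, and the typing figure comes out at about $0.0417, which the slide shows as $0.041.

```python
def cost_per_sfp(replicates, cost_per_chip, n_sfps):
    """Dollars spent per SFP when all SFPs are scored on each hybridized chip."""
    return replicates * cost_per_chip / n_sfps


discovery = cost_per_sfp(replicates=6, cost_per_chip=500, n_sfps=12_000)
typing = cost_per_sfp(replicates=1, cost_per_chip=500, n_sfps=12_000)
print(f"Discovery: ${discovery:.4f} per SFP")
print(f"Typing:    ${typing:.4f} per SFP")
```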

26 Map bibb 100 bibb mutant plants 100 wt mutant plants

27 bibb mapping (ChipMap, AS1): bulk segregant mapping using chip hybridization. bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1

28 BIBB = ASYMMETRIC LEAVES1. Sequenced AS1 coding region from bib-1 … found g -> a change that would introduce a stop codon in the MYB domain. [Figure labels: bibb, as1-101, MYB; bib-1 W49*, as-101 Q107*; as1, bibb] AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64 cM

29 Array Mapping (Hazen et al., Plant Physiology 2005) [Figure panels: chr1, chr2, chr3, chr4, chr5]

30 eXtreme Array Mapping 15 tallest RILs pooled vs 15 shortest RILs pooled

31 eXtreme Array Mapping: allele frequencies determined by SFP genotyping; thresholds set by simulations. [Figure: LOD (0 to 16) against position (0 to 100 cM) on Chromosome 2, eXtreme Array Mapping and Composite Interval Mapping; RED2 QTL, 12 cM] Red light QTL RED2 from 100 Kas/Col RILs (Wolyn et al., Genetics 2004)

32 eXtreme Array Mapping BurC F2

33 XAM Lz x Col F2 QTL Lz x Ler F2 (Werner et al Genetics 2005)

34 eXtreme Array Fine Mapping: RED2 QTL between mark1 and mark2 (~2 Mb, ~8 cM, >400 SFPs). Select recombinants by PCR: >200 from >1250 plants; High, Low. [Figure: Col, Kas, and het genotype classes at mark1 and mark2, with counts ~2, ~43, ~268, ~539]

35 Potential Deletions
- >500 potential deletions
- 45 confirmed by Ler sequence
- 23 (of 114) transposons
- Disease Resistance (R) gene clusters
- Single R gene deletions
- Genes involved in secondary metabolism
- Unknown genes

36 Potential Deletions Suggest Candidate Genes: flowering time QTL caused by a natural deletion in FLM (Werner et al., PNAS 2005) [Figure: FLOWERING1 QTL along Chr1 (bp), with labels MAF1, FLM, FLM natural deletion]

37 Fast Neutron deletions: FKF1 80 kb deletion (CHR1); cry2 10 kb deletion (CHR1); Het

38 Array Haplotyping
What about diversity/selection across the genome?
A genome-wide estimate of population genetics parameters: θw, π, Tajima’s D, ρ
LD decay, haplotype block size
Deep population structure?
Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2
Fl-1, Ita-0, Mr-0, St-0, Sah-0
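Three of the parameters named on this slide have standard textbook estimators: Watterson's θw, nucleotide diversity π, and Tajima's D. The sketch below applies those standard formulas to a 0/1 genotype matrix for one window. It is an illustration under assumptions, not the talk's analysis: the function name `diversity_stats` and the toy calls are hypothetical, missing calls are not handled, values are per window rather than per base pair, and SFP ascertainment (polymorphisms detected relative to the reference) would bias real estimates.

```python
from math import sqrt


def diversity_stats(sites):
    """Return (theta_w, pi, tajima_d) for one window.

    sites: list of sites, each a list of 0/1 allele calls, one per accession.
    """
    n = len(sites[0])                             # number of accessions sampled
    seg = [s for s in sites if 0 < sum(s) < n]    # keep segregating sites only
    S = len(seg)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = S / a1

    # Mean number of pairwise differences between accessions.
    pairs = n * (n - 1) / 2
    pi = sum(sum(s) * (n - sum(s)) for s in seg) / pairs

    # Tajima (1989) variance constants.
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * S + e2 * S * (S - 1)

    tajima_d = (pi - theta_w) / sqrt(variance) if variance > 0 else float("nan")
    return theta_w, pi, tajima_d


# Hypothetical calls for 4 accessions at 3 sites; the last site is monomorphic.
window = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
print(diversity_stats(window))
```

Applied in sliding windows along a chromosome, the same function gives window-level diversity and a Tajima's D-like profile.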

39 Array Haplotyping: inbred lines; low effective recombination due to partial selfing; extensive LD blocks. [Figure: Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd haplotypes along Chromosome 1, ~500 kb]

40 Distribution of T-stats: null (permutation) vs actual. [Figure labels: Not Col, Col, NA, NA, duplications] 32,427; Calls 208,729; 12,250 SFPs

41 Sequence confirmation of SFPs
Accession  SFP  SNP  Total  FPR   FDR    Sensitivity
bay        44   61   1375   0.8%  25.0%  54.1%
bur        47   57   1320   1.1%  29.8%  57.9%
cvi        69   92   1325   1.2%  21.7%  58.7%
ler        41   51   1466   0.6%  22.0%  62.7%
lz         37   40   1441   0.5%  18.9%  75.0%
mr         67   87   1191   1.1%  17.9%  63.2%
mt         46   48   1413   0.9%  26.1%  70.8%
sorbo      37   53   1317   0.9%  29.7%  49.1%
ws2        29   47   1369   0.3%  13.8%  53.2%

42 SFPs for reverse genetics: http://naturalvariation.org/sfp (14 accessions, 30,950 SFPs)

43 Chromosome Wide Diversity

44 Diversity 50kb windows

45 Tajima’s D like 50kb windows RPS4 unknown

46 R genes vs bHLH

47 Consider SFPs during expression Remove SFPs Allele specific expression

50 differences may be due to expression or hybridization

51 PAG1 down-regulated in Cvi. PALE GREEN1 knockout has long hypocotyl in red light

52 References
Hazen SP, Borevitz JO, Harmon FG, Pruneda-Paz JL, Schultz TF, Yanovsky MJ, Liljegren SJ, Ecker JR, Kay SA. Rapid array mapping of circadian clock and developmental mutations in Arabidopsis. Plant Physiology (in press).
Rostoks N, Borevitz JO, Hedley PE, Russell J, Mudie S, Morris J, Cardle L, Marshall DF, Waugh R. Single Feature Polymorphism discovery in the barley transcriptome. Genome Biology (in press).
Werner JD, Borevitz JO, Uhlenhaut H, Ecker JR, Chory J, Weigel D. FRIGIDA-independent variation in flowering time of natural A. thaliana accessions. Genetics (in press).
Werner JD, Borevitz JO, Warthmann N, Trainer GT, Ecker JR, Chory J, Weigel D. Quantitative trait locus mapping and DNA array hybridization identify an FLM deletion as a cause for natural flowering-time variation. Proc Natl Acad Sci U S A. 2005 Feb 15;102(7):2460-5. Supplemental data and analysis scripts.
Wolyn DJ, Borevitz JO, Loudet O, Schwartz C, Maloof J, Ecker J, Berry CC, Chory J. Light Response QTL Identified with Composite Interval and eXtreme Array Mapping in Arabidopsis thaliana. Genetics. 2004 Jun;167(2):907-17. Supplemental data and analysis scripts.
Borevitz J, Liang D, Plouffe D, Chang H, Zhu T, Weigel D, Berry C, Winzeler E, Chory J. Large Scale Identification of Single Feature Polymorphisms in Complex Genomes. Genome Research. 2003 Mar;13(3):513-23. Supplemental data and analysis scripts.

53 Review
Single Feature Polymorphisms (SFPs) can be used to:
- Identify recombination breakpoints
- eXtreme Array Mapping
- Potential deletions (candidate genes)
- Haplotyping
- Diversity/Selection
- Association Mapping
PostDoc Positions

54 NaturalVariation.org
Salk: Jon Werner, Joanne Chory, Joseph Ecker
Max Planck: Detlef Weigel
UC San Diego: Charles Berry
Scripps: Sam Hazen, Elizabeth Winzeler
University of Chicago: Xu Zhang, Evadne Smith
UC Davis: Julin Maloof
University of Guelph, Canada: Dave Wolyn
Sainsbury Laboratory: Jonathan Jones

