Toward the genetic basis of adaptation using arrays Justin Borevitz Ecology & Evolution University of Chicago


1 Toward the genetic basis of adaptation using arrays Justin Borevitz Ecology & Evolution University of Chicago http://naturalvariation.org

2 Local Population Variation

3 Seasons in the Growth Chamber: Changing Day length, Cycle Light Intensity, Cycle Light Colors, Cycle Temperature (Sweden, Spain)

4 Universal Whole Genome Array (RNA, DNA)
Transcriptome Atlas: expression levels, tissue specificity
Gene Discovery: gene model correction, non-coding/micro-RNA, antisense transcription
Alternative Splicing
Comparative Genome Hybridization (CGH): insertions/deletions
Methylation
Chromatin Immunoprecipitation (ChIP chip)
Polymorphism: SFPs, discovery/genotyping
~35 bp tile, non-repetitive regions, “good” binding oligos, evenly spaced

5 Improved Genome Annotation [figure: array features along the chromosome (bp); labels: SNP, SFP, conservation, ORFa start, ORFb deletion, Transcriptome Atlas]

6 Alternative Splicing Col/Van

7 cDNA raw intensity 10% smoothed

8 Talk Outline Van/Col RILs Single Feature Polymorphisms (SFPs) –Potential deletions –Bulk segregant/ eXtreme Mapping Barley RNA SFPs Haplotype analysis Aquilegia

9 Advanced Intercross RILs VanC advanced intercross RIL population Backcross collections

10 Missing/Het data

11 Markers Lines VanC 50k SNP genotypes chr1 chr2 chr3 chr4 chr5

12 Genetic Map Van no mitochondrial insertion

13 Segregation Distortion

14 Flowering Time Variation/ Greenhouse

15 QTL Days to Flowering

16 QTL Total Leaf Number

17 QTL Growth Rate Residual Total Leaf Number Days til Flowering

18 QTL Erecta Residual Total Leaf Number Days til Flowering

19 Potential Deletions

20 Quality Control

21 SFP detection on tiling arrays
Delta   p0     FALSE   Called   FDR
1.00    0.95   18865   160145   11.2%
1.25    0.95   10477   132390   7.5%
1.50    0.95   6545    115042   5.4%
1.75    0.95   4484    102385   4.2%
2.00    0.95   3298    92027    3.4%
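The FDR column on slide 21 is consistent with the usual SAM estimate, FDR = p0 × FALSE / Called. The short sketch below recomputes it from the other columns as a sanity check; the function name `sam_fdr` is made up for illustration and is not part of any SAM package.

```python
def sam_fdr(p0, false_calls, called):
    """SAM-style FDR estimate: p0 * (mean falsely called) / (number called)."""
    if called == 0:
        return 0.0
    return p0 * false_calls / called


# (Delta, p0, FALSE, Called) rows as read from slide 21.
ROWS = [
    (1.00, 0.95, 18865, 160145),
    (1.25, 0.95, 10477, 132390),
    (1.50, 0.95, 6545, 115042),
    (1.75, 0.95, 4484, 102385),
    (2.00, 0.95, 3298, 92027),
]

for delta, p0, false_calls, called in ROWS:
    pct = 100 * sam_fdr(p0, false_calls, called)
    print(f"Delta {delta:.2f}: FDR = {pct:.1f}%")
```

Raising Delta trades calls for confidence: the number of called features drops from about 160,000 to about 92,000 while the estimated FDR falls from 11.2% to 3.4%.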

22 False Discovery and Sensitivity (PM only)

SAM threshold 5% FDR
GeneChip                 Total   SFPs    nonSFPs
Cereon marker accuracy           3806    89118     100%
Sequence                 817     121     696
Polymorphic              340     117     223       Sensitivity 34%
Non-polymorphic          477     4       473
False Discovery rate: 3%
Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40

SAM threshold 18% FDR
GeneChip                 Total   SFPs    nonSFPs
Cereon marker accuracy           10627   82297     100%
Sequence                 817     223     594
Polymorphic              340     195     145       Sensitivity 57%
Non-polymorphic          477     28      449
False Discovery rate: 13%
Test for independence of all factors: Chisq = 265.13, df = 1, p-value = 1.309e-59

3/4 Cvi markers were also confirmed in PHYB
90% 80% 70% / 41% 53% 85%
90% 80% 70% / 67% 85% 100%
Cereon may be a sequencing Error
TIGR match is a match
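The sensitivity and false discovery percentages on slide 22 follow directly from the sequence-verified counts: sensitivity is the fraction of sequence-confirmed polymorphic features that were called SFPs, and the false discovery rate is the fraction of called SFPs that turned out to be non-polymorphic. A minimal sketch, with hypothetical helper names:

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly polymorphic features that were called SFPs."""
    return true_pos / (true_pos + false_neg)


def false_discovery_rate(true_pos, false_pos):
    """Fraction of called SFPs that were actually non-polymorphic."""
    return false_pos / (true_pos + false_pos)


# Sequence-verified counts from slide 22:
# (polymorphic called SFP, polymorphic called nonSFP, non-polymorphic called SFP)
THRESHOLDS = {
    "5% FDR": (117, 223, 4),
    "18% FDR": (195, 145, 28),
}

for name, (tp, fn, fp) in THRESHOLDS.items():
    sens = 100 * sensitivity(tp, fn)
    fdr = 100 * false_discovery_rate(tp, fp)
    print(f"SAM threshold {name}: sensitivity {sens:.0f}%, false discovery {fdr:.0f}%")
```

The looser threshold recovers more of the 340 polymorphic features (195 versus 117) at the cost of more false calls (28 versus 4).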

23 Map bibb 100 bibb mutant plants 100 wt mutant plants

24 Array Mapping Hazen et al Plant Physiology 2005

25 LUX ARRHYTHMO encodes a Myb domain protein essential for circadian rhythms Hazen et al PNAS, 2005 Cloned with Array Mapping

26 eXtreme Array Mapping 15 tallest RILs pooled vs 15 shortest RILs pooled

27 eXtreme Array Mapping. Allele frequencies determined by SFP genotyping. Thresholds set by simulations. [figure: LOD (0 to 16) vs position (0 to 100 cM) on Chromosome 2, Composite Interval Mapping; RED2 QTL, 12cM] Red light QTL RED2 from 100 Kas/ Col RILs (Wolyn et al Genetics 2004)
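Slide 27 says thresholds were set by simulations but does not give the procedure. The sketch below shows one way a null simulation of this kind could look; it is not the method used in the talk. It assumes fully inbred RILs, an unlinked marker where each parental allele is equally likely, two pools of 15 lines as on slide 26, and a single-marker 5% cutoff. The function name, parameters, and seed are all illustrative.

```python
import random


def null_threshold(pool_size=15, n_sim=10000, alpha=0.05, seed=1):
    """Simulate the allele-frequency difference between two pools under no linkage.

    Each RIL in a pool carries parental allele A with probability 0.5
    (assumption: inbred lines, no segregation distortion). Returns the
    (1 - alpha) quantile of the absolute pool-to-pool difference.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sim):
        high = sum(rng.random() < 0.5 for _ in range(pool_size)) / pool_size
        low = sum(rng.random() < 0.5 for _ in range(pool_size)) / pool_size
        diffs.append(abs(high - low))
    diffs.sort()
    index = min(n_sim - 1, int(round((1 - alpha) * n_sim)))
    return diffs[index]


threshold = null_threshold()
print(f"Null 95% cutoff for |freq(tall pool) - freq(short pool)|: {threshold:.2f}")
```

A genome-wide threshold would also need to account for the number of markers tested and for linkage between them, which this single-marker sketch ignores.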

28 eXtreme Array Mapping BurC F2

29 XAM Lz x Col F2 QTL Lz x Ler F2 (Werner et al Genetics 2005)

30 Potential Deletions >500 potential deletions 45 confirmed by Ler sequence 23 (of 114) transposons Disease Resistance (R) gene clusters Single R gene deletions Genes involved in Secondary metabolism Unknown genes

31 Potential Deletions Suggest Candidate Genes. Flowering Time QTL caused by a natural deletion in FLM (Werner et al PNAS 2005) [figure: FLOWERING1 QTL along Chr1 (bp); MAF1, FLM natural deletion]

32 Fast Neutron deletions: FKF1 80kb deletion CHR1; cry2 10kb deletion CHR1, Het

33 Identification of the putative duplications/deletions
Observations: 0.3 0.2 -0.2 0.3 -0.3 0.1 0.0 1.2 1.5 1.8 0.6 0.5 0.9 0.3 0.1 -0.05 -0.2 0.0 0.1
States (DNA copy number): S S S S S S S D D D D D D S S S S S S

34 Identification of the putative duplications/deletions Hidden Markov Model (HMM) AIM: model that identifies changes in copy number and minimizes the number of false positives

35 Hidden Markov Models… A Markov process is a process which moves from state to state depending (only) on the previous n states. Changes in DNA copy number: 2 copies, 0 copies, 4 copies. Transition Probabilities

36 Hidden Markov Models… Observable States (Hybridization intensities): Hidden States (DNA copy number): 2 copies, 0 copies, 4 copies 0.3 1.50.0010.81.2 Pr (obs|Hid)
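To make the HMM idea concrete, here is a minimal Viterbi decoder run on the observation series from slide 33. This is a sketch under stated assumptions, not the model used in the talk: it uses only two hidden states (the S and D labels of slide 33), Gaussian emissions with assumed means 0.0 for S and 1.0 for D and a shared standard deviation of 0.4, equal starting probabilities, and a 0.9 probability of staying in the same state. None of these parameter values come from the slides.

```python
import math

# Hybridization observations from slide 33.
OBS = [0.3, 0.2, -0.2, 0.3, -0.3, 0.1, 0.0, 1.2, 1.5, 1.8,
       0.6, 0.5, 0.9, 0.3, 0.1, -0.05, -0.2, 0.0, 0.1]

STATES = ("S", "D")
MEAN = {"S": 0.0, "D": 1.0}  # assumed emission means
SD = 0.4                     # assumed shared emission standard deviation
STAY = 0.9                   # assumed probability of keeping the same state


def log_emit(state, x):
    """Log density of observation x under the state's Gaussian."""
    return -0.5 * math.log(2 * math.pi * SD ** 2) - (x - MEAN[state]) ** 2 / (2 * SD ** 2)


def log_trans(prev, cur):
    """Log transition probability between two hidden states."""
    return math.log(STAY if prev == cur else 1.0 - STAY)


def viterbi(obs):
    """Return the most probable hidden state path as a string."""
    score = {s: math.log(0.5) + log_emit(s, obs[0]) for s in STATES}
    pointers = []
    for x in obs[1:]:
        new_score, ptr = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            ptr[cur] = best_prev
            new_score[cur] = score[best_prev] + log_trans(best_prev, cur) + log_emit(cur, x)
        pointers.append(ptr)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(pointers):
        state = ptr[state]
        path.append(state)
    return "".join(reversed(path))


print(viterbi(OBS))
```

With these assumed parameters the decoded path reproduces the state labels shown on slide 33. The high stay probability is what keeps the weaker observations inside the run (0.6, 0.5, 0.9) in state D instead of flipping back and forth, which is the false-positive control the slide 34 aim describes.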

37 BoCaNi (Ivan’s) Mutation

38 Natural Variation on Tiling Arrays

39 Complex, Large Genomes? Signal to Noise with Large Genomes RNA, less complex, but differential expression Barley SFPs

40 RNA 2 genotypes, 18 replicates

41 False Discovery Rate RNA
RNA hybridization: 17 Golden Promise, 19 Morex, 6 tissues
SAM Analysis for the Two-Class Unpaired Case Assuming Unequal Variances
s0 = 0.0342 (The 5 % quantile of the s values.)
Number of permutations: 500
MEAN number of falsely called genes is computed.
Delta   p0     Called   FALSE   FDR
0.5     0.95   27159    5884    0.206
1.0     0.95   17744    594     0.032
1.5     0.95   13285    65      0.005
2.0     0.95   10504    7       0.001
2.5     0.95   8583     0       0.000

42 Sequence Verification of SFPs (RNA, GeneChip)
                  Total   mxSFP   nonSFP   gpSFP
Sequence          53012403075203
MX                178     115     45       18
Non-polymorphic   2200    27      2045     128
GP                223     7       61       155
Chisq = 2049.2, df = 4, p-value = 0
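The test statistic on slide 42 can be checked with a plain Pearson chi-square test of independence. The sketch below assumes the slide's counts form a 3 × 3 table: rows are the sequence result (MX, non-polymorphic, GP) and columns are the array call (mxSFP, nonSFP, gpSFP). The function name is illustrative.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Rows: MX, non-polymorphic, GP. Columns: mxSFP, nonSFP, gpSFP.
COUNTS = [
    [115, 45, 18],
    [27, 2045, 128],
    [7, 61, 155],
]

chi2, df = pearson_chi_square(COUNTS)
print(f"Chisq = {chi2:.1f}, df = {df}")
```

Most of the statistic comes from the diagonal cells: features sequenced as carrying the MX or GP allele are called as the matching SFP class far more often than independence would predict.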

43 Position of SNP

44 Barley SFPs Genomic DNA 3 genotypes 3 replicates

45 False Discovery Rate DNA
Genomic DNA hybridization: 3 replicates, 3 genotypes
SAM Analysis for the Multi-Class Case with 3 Classes
s0 = 0.0123 (The 25 % quantile of the s values.)
Number of permutations: 100
MEAN number of falsely called genes is computed.
Delta   p0     Called   FALSE   FDR
1       0.95   4017     2073    0.47
2       0.95   1728     583     0.31
3       0.95   1090     258     0.22
4       0.95   789      139     0.16
5       0.95   631      86      0.13

46 Array Haplotyping. What about Diversity/selection across the genome? A genome wide estimate of population genetics parameters: θw, π, Tajima’s D, ρ. LD decay, Haplotype block size. Deep population structure? Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2, Fl-1, Ita-0, Mr-0, St-0, Sah-0
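For readers unfamiliar with the parameters named on slide 46, the sketch below computes Watterson's θw, nucleotide diversity π, and Tajima's D using the standard sequence-based formulas (Tajima 1989). It treats each SFP as a biallelic site scored 0/1 per accession, which is an assumption; the talk's array-based estimates (slide 51 calls its statistic "Tajima's D like") may be computed differently. The toy haplotypes are invented for illustration.

```python
import math
from itertools import combinations


def diversity_stats(haplotypes):
    """Return (theta_w, pi, tajima_d) per locus for equal-length 0/1 haplotype strings."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    # Number of segregating sites: positions where more than one allele is present.
    seg = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    # pi: mean number of pairwise differences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_w = seg / a1
    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg + e2 * seg * (seg - 1)
    tajima_d = (pi - theta_w) / math.sqrt(variance) if variance > 0 else float("nan")
    return theta_w, pi, tajima_d


# Hypothetical window: 4 accessions scored at 3 SFPs (0 = reference-like, 1 = SFP).
TOY = ["000", "100", "110", "111"]
theta_w, pi, tajima_d = diversity_stats(TOY)
print(f"theta_w = {theta_w:.3f}, pi = {pi:.3f}, Tajima's D = {tajima_d:.3f}")
```

Applied in sliding windows (for example the 50kb windows of slides 50 and 51), an excess of intermediate-frequency variants pushes D positive and an excess of rare variants pushes it negative.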

47 Array Haplotyping. Inbred lines. Low effective recombination due to partial selfing. Extensive LD blocks. Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd; Chromosome 1, ~500kb

48 SFPs for reverse genetics http://naturalvariation.org/sfp 14 Accessions 30,950 SFPs

49 Chromosome Wide Diversity

50 Diversity 50kb windows

51 Tajima’s D like 50kb windows RPS4 unknown

52 R genes vs bHLH

53 Review Single Feature Polymorphisms (SFPs) can be used to Identify recombination breakpoints eXtreme Array Mapping Potential deletions (candidate genes) Haplotyping Diversity/Selection Association Mapping

54 Aquilegia (Columbines) Recent adaptive radiation, 350Mb genome

55 Species with > 20k ESTs 11/14/2003 Animal lineage: good coverage Plant lineage: crop plant coverage

56 Aquilegia (Columbines): 300 F3 RILs growing (Evadne Smith); 85,000 5’ 3’ ESTs -- 51,000 clones, >3500 SNPs; TIGR gene index and GenBank; arrays being designed by Nimblegen

57 Genetics of Speciation along a Hybrid Zone

58 NSF Genome Complexity Physical Map (BAC tiling path) –Physical assignment of ESTs QTL for pollinator preference –~400 RILs, map abiotic stress –QTL fine mapping/ LD mapping Develop transformation techniques http://www.AQgenome.org Scott Hodges (UCSB) Elena Kramer (Harvard) Magnus Nordborg (USC) Justin Borevitz (U Chicago) Jeff Tompkins (Clemson)

59 NaturalVariation.org
Salk: Jon Werner, Joanne Chory, Joseph Ecker
Max Planck: Detlef Weigel
UC San Diego: Charles Berry
Scripps: Sam Hazen, Elizabeth Winzeler
University of Chicago: Xu Zhang, Evadne Smith, Ken Okamoto
Purdue: Ivan Baxter
UC Davis: Julin Maloof
University of Guelph, Canada: Dave Wolyn
Sainsbury Laboratory: Jonathan Jones

