Meiotic gene conversion in humans: rate, sex ratio, and GC bias Amy L. Williams June 19, 2013 University of Chicago.


1 Meiotic gene conversion in humans: rate, sex ratio, and GC bias Amy L. Williams June 19, 2013 University of Chicago

2 Gene conversion defined Meiosis: produces haploid germ cells with recombinations Two types of recombination: crossover, gene conversion Gene conversion: short segment copied into given chromosome from other homolog [Figure labels: Meiosis, Crossover, Gene Conversion]

3 Study question 1: gene conversion rate? Number of gene conversions per meiosis? –4-15× # crossovers? Jeffreys and May (2004) Length of gene conversion tracts? –55-290 bp? Jeffreys and May (2004)

4 Study question 1: gene conversion rate? Number of gene conversions per meiosis? –4-15× # crossovers? Jeffreys and May (2004) Length of gene conversion tracts? –55-290 bp? Jeffreys and May (2004) Per base-pair rate? Fraction of genome affected –R = (number × tract length) / genome length –2.2×10^-6 to 4.4×10^-5? Jeffreys and May (2004)
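The range quoted on slide 4 can be reproduced from R = (number × tract length) / genome length. The following is a minimal sketch, assuming about 30 crossovers per meiosis and a 3×10^9 bp genome. Neither value appears on the slide; they are used here only because they give the quoted range.

```python
# Hypothetical inputs: neither value is stated on the slide.
CROSSOVERS_PER_MEIOSIS = 30   # assumed
GENOME_LENGTH_BP = 3.0e9      # assumed genome length in base pairs

def per_bp_rate(conversions_per_crossover, tract_length_bp):
    """R = (number of gene conversions x tract length) / genome length."""
    n_conversions = conversions_per_crossover * CROSSOVERS_PER_MEIOSIS
    return n_conversions * tract_length_bp / GENOME_LENGTH_BP

low = per_bp_rate(4, 55)      # fewest conversions, shortest tracts
high = per_bp_rate(15, 290)   # most conversions, longest tracts
print(f"{low:.2e} to {high:.2e} per bp per generation")
```

Under these assumptions the lower bound is 2.2×10^-6 and the upper bound is about 4.4×10^-5, matching the slide.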

5 Study question 2: male vs. female rate? Gender differences in rate? –Crossovers: female rate 1.78× male (deCODE)

6 Study questions 3 & 4: GC bias? Localization? GC bias observed in allelic transmissions? Crossover hot spots influence location? Locations of gene conversions independent in a given meiosis? Myers et al., Science 2005

7 Summary: study questions 1. Genome-wide de novo gene conversion rate? 2. Different rate between males/females? 3. Extent of GC bias in tracts? 4. Localization: Hotspots? Tracts independent?

8 Outline Background / study questions Study design and methods Results –SNP chip data –Sequence data

9 Approaches to identify gene conversions Linkage disequilibrium based –Can give rate estimate –Averaged over human history, both genders Sperm-based –Many meiotic products: per-individual estimates –Single molecule: genome-wide assays difficult Pedigree-based –De novo, per-gender events observable –Data for many samples required

10 Study design: SNP chip data for pedigrees Primary analysis: pedigree SNP chip data Challenge: small tracts –Tracts covered by ≤ 1 SNP –Not all tracts covered, but still obtain overall rate Chip data give per base-pair rate –R = # gene conversions / # informative sites

11 Datasets for analysis Mexican American pedigrees Data source 1: San Antonio Family Studies –2,490 genotyped samples, 80 pedigrees –SNP chip genotypes (Illumina 1M, 660k) –Can estimate de novo gene conversion rate

12 Datasets for analysis Mexican American pedigrees Data source 1: San Antonio Family Studies –2,490 genotyped samples, 80 pedigrees –SNP chip genotypes (Illumina 1M, 660k) –Can estimate de novo gene conversion rate Data source 2: T2D-GENES Consortium –607 sequenced samples, 20 pedigrees –Whole genome sequence (Complete Genomics) –Can examine tract length, distribution, etc. Though need deep data on single family to do so

13 Study design: SNP chip data for pedigrees Pedigree-based haplotypes/phase reveal recombinations –Heterozygous sites: informative for recombination Phasing method: Hapi –Phases nuclear families –Williams et al., Genome Biol. 2010

14 Family-based phase reveals recombinations Hapi output: paternal haplotype transmissions [Figure panel: Crossover. Labels: Haplotype 1, Haplotype 2]

15 Family-based phase reveals recombinations Hapi output: paternal haplotype transmissions [Figure panels: Crossover; Gene Conversion. Labels: Haplotype 1, Haplotype 2]

16 Other pedigree phasing methods Most pedigree phasing methods slow –Runtime complexity for phasing ~O(m·2^(2n)), n = # non-founders, m = # markers –Example: nuclear family with 11 children: 4,194,304 states per marker Can merge exponential class of states Many states extremely unlikely to be optimal
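The 4,194,304 figure on slide 16 follows from the 2^(2n) term. A short sketch of the arithmetic, assuming the usual reading that each non-founder contributes two binary choices (which haplotype it received from each parent); the function name is illustrative only:

```python
def states_per_marker(n_nonfounders):
    # Two binary choices per non-founder (one per parent),
    # so 2 ** (2 * n) combinations at each marker.
    return 2 ** (2 * n_nonfounders)

print(f"{states_per_marker(11):,}")  # 4,194,304
```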

17 Hapi: efficient phasing of nuclear families Hapi: state space reduction improves efficiency –Merges exponential class of states –Omits states that cannot yield optimal solution Applied to family with 11 children –Average per marker states: 4.2, maximum 48

18 Hapi: efficient phasing of nuclear families Hapi: state space reduction improves efficiency –Merges exponential class of states –Omits states that cannot yield optimal solution Applied to family with 11 children –Average per marker states: 4.2, maximum 48
Program      All families (N=103)
             Runtime    Speedup
Hapi         3.1 s      -
Merlin       1,005 s    323×
Allegro v2   7,661 s    2,462×
Superlink    1,393 s*   448×
* Superlink failed to analyze 11 child family; 8/11 children used

19 Hapi: efficient phasing of nuclear families Hapi: state space reduction improves efficiency –Merges exponential class of states –Omits states that cannot yield optimal solution Applied to family with 11 children –Average per marker states: 4.2, maximum 48
Program      All families (N=103)    ≤ 3 children (N=86)
             Runtime    Speedup      Runtime    Speedup
Hapi         3.1 s      -            2.2 s      -
Merlin       1,005 s    323×         8.7 s      3.8×
Allegro v2   7,661 s    2,462×       14.5 s     6.4×
Superlink    1,393 s*   448×         38.8 s     17.2×
* Superlink failed to analyze 11 child family; 8/11 children used

20 Applying Hapi to multi-generational pedigrees Hapi currently applies to nuclear families –For 3-generation pedigrees analyzed for gene conversions, omit sites with phase conflicts Will not bias results, but data are reduced

21 Applying Hapi to multi-generational pedigrees Hapi currently applies to nuclear families –For 3-generation pedigrees analyzed for gene conversions, omit sites with phase conflicts Will not bias results, but data are reduced Extension to Hapi possible to efficiently analyze arbitrarily large pedigrees –Most San Antonio Family Studies pedigrees too large to be phased in practical time

22 Approach to identifying gene conversions 1. Perform QC, phase 3-generation pedigrees 2. Find gene conversions in 2nd generation: single SNP double crossovers 3. Confirm: –Gene converted allele in 3rd generation –Other allele in 2nd generation sibling(s) False positive only if ≥ 2 genotyping errors
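Step 2 on slide 22 (a "single SNP double crossover") can be illustrated as a scan over phased transmissions. This is only a sketch under an assumed data layout: a list giving, for each informative site in order, which parental haplotype (1 or 2) the child received. It is not Hapi's output format, and the confirmation in step 3 is not modelled.

```python
def single_site_switches(transmitted):
    """Return indices where one informative site differs from both neighbours.

    `transmitted` is a hypothetical list with one entry per informative
    site: the parental haplotype (1 or 2) passed to the child there.
    """
    hits = []
    for i in range(1, len(transmitted) - 1):
        left, mid, right = transmitted[i - 1], transmitted[i], transmitted[i + 1]
        if left == right and mid != left:
            hits.append(i)
    return hits

# A crossover is a lasting switch (1 -> 2); a candidate gene conversion
# is a switch that reverts at the very next informative site.
example = [1, 1, 2, 1, 1, 1, 2, 2, 2]
print(single_site_switches(example))  # [2]
```

In the example, index 2 is flagged as a candidate, while the lasting switch at index 6 is treated as an ordinary crossover.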

23 Outline Background / study questions Study design and methods Results –SNP chip data –Sequence data

24 Current analysis dataset Analyzed SNP chip data for 16 pedigrees –Data for both parents, 3+ children, 1+ grandchild –190 samples –42 meioses (21 paternal, 21 maternal) –4.15×10^6 informative sites

25 Result 1: 33 putative gene conversions, rate Rate: 7.95×10^-6 /bp/generation –Within range of Jeffreys and May (2004) –Close to LD-based estimates [Figure labels: Male, Female]

26 Result 1: 33 putative gene conversions, rate Rate: 7.95×10^-6 /bp/generation –Within range of Jeffreys and May (2004) –Close to LD-based estimates [Figure labels: Male, Female] Are these real gene conversions?
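The reported rate is consistent with the chip-data formula from slide 10, R = # gene conversions / # informative sites, applied to the counts on slides 24 and 25. A quick check, assuming the 4.15×10^6 informative sites are already the total across all analyzed meioses:

```python
n_conversions = 33           # putative gene conversions (slide 25)
informative_sites = 4.15e6   # informative sites (slide 24)

rate = n_conversions / informative_sites
print(f"{rate:.2e} per bp per generation")  # 7.95e-06 per bp per generation
```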

27 T2D-GENES sequence confirms events 19 sites sequenced by T2D-GENES Consortium –18/19 gene conversion genotypes verified Differing site looks like sequencing artifact –2nd generation recipient has genotype mismatch 3rd generation grandchild shows same genotype –If sequence data correct, gene conversion in grandchild

28 Result 2: gene conversion rates by gender More female gene conversions than male –Females transmit 1.54× as many as males –Difference not (yet) significant; larger sample coming Different rates expected based on crossovers –Female crossover rate 1.78× male (deCODE)

29 Result 3: gene conversions localize in hotspots 2.71% of genome in ≥10 cM/Mb hotspots

30 Result 3: gene conversions localize in hotspots 2.71% of genome in ≥10 cM/Mb hotspots 10/33 gene conversions with ≥10 cM/Mb: P=1.1×10^-8
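The slide does not say which test produced the P value. One plausible reading, sketched here as an assumption, is a one-sided binomial test: if gene conversions fell uniformly across the genome, each would land in a ≥10 cM/Mb hotspot with probability 0.0271, and the question is how likely 10 or more of 33 would do so.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 10 of 33 events in hotspots covering 2.71% of the genome.
p_value = binomial_upper_tail(10, 33, 0.0271)
print(f"P = {p_value:.1e}")
```

The result is on the order of 10^-8, in line with the value quoted on the slide.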

31 Result 4: observe extreme GC bias 31 GC informative sites –A/C, A/G, T/C, T/G GC transmission in 74% of cases (95% CI 59% – 90%) –GC bias likely (P=5.3×10^-3)
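The figures on slide 31 can be approximated with elementary statistics. This sketch assumes 23 of the 31 sites transmitted the G/C allele (inferred from 74%; the count is not on the slide), a normal-approximation (Wald) confidence interval, and a one-sided exact binomial test against 50:50 transmission. The slide does not name the methods used.

```python
from math import comb, sqrt

n_sites = 31
n_gc = 23  # assumed count: 23/31 is about 74%

frac = n_gc / n_sites
half_width = 1.96 * sqrt(frac * (1 - frac) / n_sites)  # Wald interval
ci = (frac - half_width, frac + half_width)

# One-sided exact binomial test against unbiased (50:50) transmission.
p_value = sum(comb(n_sites, k) for k in range(n_gc, n_sites + 1)) / 2**n_sites

print(f"GC transmitted: {frac:.0%}, 95% CI {ci[0]:.0%} to {ci[1]:.0%}, "
      f"P = {p_value:.1e}")
```

Under these assumptions the output agrees with the slide: 74%, an interval of 59% to 90%, and P of about 5.3×10^-3.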

32 Outline Background / study questions Study design and methods Results –SNP chip data –Sequence data

33 Sequence near chip-identified gene conversions Sequence available for 11/33 putative sites

34 Sequence near chip-identified gene conversions Sequence available for 11/33 putative sites Shortest resolution for tract length ≤ 143 bp

35 Sequence near chip-identified gene conversions Sequence available for 11/33 putative sites Clustered gene conversions in 4 sequences

36 Sequence near chip-identified gene conversions Sequence available for 11/33 putative sites Clustered gene conversions in 4 sequences Boxed regions confirmed by Sanger sequencing

37 Relationship to complex crossover? [Figure labels: Haplotype 1, Haplotype 2]

38 Conclusions Estimate of de novo gene conversion rate –7.95×10^-6 /bp/generation –Females: 1.54× gene conversions vs. males Enriched in hotspots: similar mechanism to crossover GC vs AT allele transmitted ~3:1 – GC bias Complex/clustered gene conversions observed in sequence data –Suggests unique correlation within short region

39 Acknowledgements Nick Patterson, David Reich, John Blangero, Giulio Genovese, Tom Dyer, Kati Truax The T2D-GENES Consortium (NIDDK) San Antonio Family Studies (NIDDK, NIMH) NHGRI NRSA Fellowship

