Sequencing 128 Ashkenazi Genomes: Implications for Medical Genetics and History Shai Carmi Department of Computer Science Columbia University Itsik Pe’er’s.

1 Sequencing 128 Ashkenazi Genomes: Implications for Medical Genetics and History Shai Carmi Department of Computer Science Columbia University Itsik Pe’er’s lab UCLA October 2014

2 Outline Ashkenazi Jewish Genetics: Background The Ashkenazi Genome Sequencing Project Segment Sharing and Population History Opportunities and Future Directions

4 Ashkenazi Jewish (AJ) Genetics: Significance
Medical genetics: large founder population; Mendelian disorders; complex diseases (breast cancer, Parkinson’s, Crohn’s).
Population genetics: debated origins; genetics of a founder event.
References: mtDNA: Behar et al., 2004; Behar et al., 2006. Y chr: Behar et al., 2003; Behar et al., 2004. Disease genes: Risch et al., 2003; Slatkin, 2004. SNP arrays: Gusev et al., 2012; Palamara et al., 2012. Review: Ostrer and Skorecki, 2013.

5 Founder Populations: Opportunities
Recent successes: Greece (Tachmazidou et al., 2013; HDL); Finland (Kurki et al., 2014; aneurysm); Iceland (many papers; most recently Steinthorsdottir et al., 2014; T2D); Ashkenazi Jews (Hui et al., in preparation; Crohn’s).
See also: Hatzikotoulas et al., 2014; Zuk et al., 2014.
[Figure labels: Time; Population size; Founder population; Non-founder population; Bottleneck; Disease alleles; Present]
Problem: Common genotyping platforms do not include alleles rare outside the founder population.

6 Opportunities: Reduced Haplotype Diversity
[Figure labels: Chromosomes in the sample; Full sequence; Partial sequence (SNP array, low-coverage sequence); Observed data; Imputation; Inferred sequence; Nearly-complete inferred sequence]
Problem: The Ashkenazi population is missing a reference panel of complete sequences.

7 Opportunities: Personal Genomics in AJ
Personal clinical genomics is here, but genomes are hard to interpret.
Problem: The Ashkenazi population is missing a reference panel of complete sequences.

8 The Documented Ashkenazi History
Timeline: ca. 1000, small communities in Northern France and the Rhineland; migration east; expansion; migration to US and Israel.
Open questions: Origin? Founder event? European gene flow: where, when, how much? Relation to other Jews? Whole genomes?

9 Outline Ashkenazi Jewish Genetics: Background The Ashkenazi Genome Sequencing Project Segment Sharing and Population History Opportunities and Future Directions

10 The Ashkenazi Genome Consortium
NY area labs interested in specific diseases.
Goals: quantify utility in medical genetics; learn about population history.
Phase I: 128 whole genomes (completed*). Phase II: ≈500 whole genomes (NYGC; under way).
Large cohorts of AJ cases: impute.
* Carmi et al., Nat Commun, 2014

11 Technical Details
Property: Genome (exome)
Coverage: ≈56x
Fraction called: 96.7±0.3% (98.1%)
Concordance with arrays: 99.67±0.25%
Ti/Tv ratio: 2.14±0.004 (3.05)
Ashkenazi ancestry verified. Some phenotypes exist.
Sequencing by Complete Genomics in three batches, with uniform QC measures.
Error rate estimates, using runs of homozygosity and a duplicate: SNVs, ≈10-40k errors per genome (FDR: 0.3-1.3%); indels, ≈10-30k errors per genome (FDR: 2-6%).
QC: remove indels, poly-allelic variants, Hardy-Weinberg violations, low call rate. Errors after QC: ≈5k per genome.
[Figure labels: hets; roh]
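The Ti/Tv ratio in the table above is the count of transitions (A<->G, C<->T) divided by the count of transversions among called SNVs. A minimal sketch of that arithmetic follows; the function name and the toy call set are illustrative assumptions, not part of the project's pipeline.

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T);
# every other single-base change is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(snvs):
    """Return transitions / transversions for an iterable of (ref, alt) bases."""
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt:
            continue  # not a variant, skip
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ti / tv


# Toy, made-up call set (hypothetical data, for illustration only).
calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "C"), ("G", "T")]
print(ti_tv_ratio(calls))
```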

12 Comparison to Europeans
Comparison panels: 26 Flemish from Belgium (platform-matched); 87 North-West Europeans [CEU (1000 Genomes)].
[Figure panels: Fraction novel (%) (dbSNP135); Population-specific variants (25x25 genomes)]

13 AJ Clinical Genomics
An Ashkenazi reference panel filters more benign variants than a European panel.

14 AJ Medical Genetics: Imputation
An Ashkenazi reference panel improves imputation accuracy of AJ SNP arrays compared to the standard European panel.
[Figure: correlation between imputed and real data]
Rare variants (≤1%) accuracy: 87% vs 65%, using Impute2.
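The accuracy measure named on this slide is a correlation between imputed and real data. One common way to compute such a measure, sketched here under the assumption that genotypes are coded as 0/1/2 allele counts and imputed values are expected dosages, is the Pearson correlation per variant. The function name and example values are hypothetical; the slide does not specify the exact statistic used.

```python
from math import sqrt


def pearson_r(true_genotypes, imputed_dosages):
    """Pearson correlation between true genotypes (0/1/2) and imputed dosages."""
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_dosages)
    if var_t == 0 or var_i == 0:
        return float("nan")  # monomorphic in this sample: correlation undefined
    return cov / sqrt(var_t * var_i)


# Made-up example: five samples at one variant.
truth = [0, 1, 2, 0, 1]
dosage = [0.1, 0.9, 1.8, 0.0, 1.2]
print(round(pearson_r(truth, dosage), 3))
```

Rare variants are often monomorphic in small validation sets, which is why the sketch returns NaN rather than raising in that case.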

15 AJ Medical Genetics: Applications
Our consortium: an expanded carrier screening panel; pharmacogenetically-important alleles; low-frequency deletions in tumors; association studies (schizophrenia, Parkinson’s, Crohn’s, longevity, cancer).
Others: frequency lookups (clinical/pedigrees); association studies (epilepsy, autism, …).

16 Principal Component Analysis (PCA)
References: Price et al., 2008; Olshen et al., 2008; Need et al., 2009; Kopelman et al., 2009; Atzmon et al., 2010; Behar et al., 2010; Bray et al., 2010; Guha et al., 2012; Behar et al., 2014.
[Figure labels: Ashkenazi Jews; Middle-East; Europe; Druze; Palestinians; Bedouins; Sardinians; Tuscans; Italians; Basque; French; Flemish; Sephardi Jews (Italy, Turkey)]

17 The Documented Ashkenazi History Origin? Founder event? European gene flow: o Where? o When? o How much? Relation to other Jews?

18 Variant Discovery Rate
Heterozygosity paradox?
[Figure labels: Number of variants; Predicted number of new variants]

19 A Model for Ancient History
[Figure labels: Out-of-Africa; Middle-East; European gene flow into AJ; 25x25 genomes]

20 The Documented Ashkenazi History Origin? Founder event? European gene flow: o Where? o When? o How much? Relation to other Jews?

21 Outline Ashkenazi Jewish Genetics: Background The Ashkenazi Genome Sequencing Project Segment Sharing and Population History Opportunities and Future Directions

22 Identical-by-Descent (IBD) Shared Segment
Formal definition: A contiguous segment inherited from a single, recent common ancestor.
What’s “recent”?
[Figure labels: g; IBD segment. After Browning & Browning, 2012]

23 Identical-by-Descent (IBD) Shared Segment
Formal definition: A contiguous segment inherited from a single, recent common ancestor.
Practical definition: A contiguous segment nearly identical over a sequence length longer than a cutoff.
[Figure labels: g; IBD segment]
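The practical definition can be sketched as a scan for long matching runs between two phased haplotypes. The sketch below makes a simplifying assumption that is not on the slide: it requires exact identity at every marker, whereas "nearly identical" implies that real detectors tolerate some mismatches from genotyping and phasing errors. The function name, marker coding and positions are hypothetical.

```python
def shared_segments(hap_a, hap_b, positions_cm, min_length_cm):
    """Return (start_cm, end_cm) runs where two haplotypes match at every
    marker and the run spans at least min_length_cm.

    ASSUMPTION: exact matching only; no tolerance for errors.
    """
    if not (len(hap_a) == len(hap_b) == len(positions_cm)):
        raise ValueError("inputs must have equal length")
    segments = []
    run_start = None  # index of the first marker in the current matching run
    n = len(hap_a)
    for i in range(n + 1):
        # Index n acts as a sentinel mismatch that closes a trailing run.
        match = i < n and hap_a[i] == hap_b[i]
        if match and run_start is None:
            run_start = i
        elif not match and run_start is not None:
            start, end = positions_cm[run_start], positions_cm[i - 1]
            if end - start >= min_length_cm:
                segments.append((start, end))
            run_start = None
    return segments


# Made-up haplotypes over eight markers with genetic positions in cM.
hap_a = [0, 1, 1, 0, 1, 0, 0, 1]
hap_b = [0, 1, 1, 0, 0, 0, 1, 1]
positions = [0.0, 1.0, 2.0, 3.5, 4.0, 5.0, 6.0, 7.0]
print(shared_segments(hap_a, hap_b, positions, min_length_cm=3.0))
```

The length cutoff is what separates segments likely to reflect a recent common ancestor from short runs that match by chance.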

24 Applications
A segment indicates recent co-ancestry: disease mapping; pedigree reconstruction; detecting natural selection; demographic (historical) inference; estimating mutation rates.
Identical sequence across individuals: resolving haplotypes (phasing); imputation; estimating heritability; estimating genotyping error rate.
[Figure labels: g; IBD segment] (Eskin’s lab)

25 IBD Sharing Theory
Model: a population with a constant effective size N; two chromosomes of length L (Morgans); a minimal segment length m (Morgans).
Questions: What is the number of shared segments, n_m? What is the fraction of the chromosome in shared segments, f_m?
[Figure: a chromosome of length L with shared segments ℓ1, ℓ2, ℓ3 and the cutoff m]
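For the model above, a standard coalescent calculation gives a closed form for the expected fraction of the chromosome in segments longer than m. The sketch below is a textbook-style derivation under explicitly stated assumptions, not a transcription of the formula shown in the talk: N counts haploid chromosomes (for a diploid size N_d, substitute N = 2 N_d), the time to the common ancestor at a site is exponential with mean N generations, the segment around a site with ancestor t generations ago has an Erlang-2 length with rate 2t per Morgan, and chromosome-end effects (finite L) are ignored.

```python
from math import exp


def prob_segment_longer_than(m, t):
    """P(segment length > m Morgans | common ancestor t generations ago).

    Length is the sum of two Exp(2t) distances (left and right of the site),
    so P(length > m) = (1 + 2*m*t) * exp(-2*m*t).
    """
    return (1.0 + 2.0 * m * t) * exp(-2.0 * m * t)


def expected_fraction_shared(n_hap, m):
    """Expected fraction of the chromosome in shared segments longer than m.

    Averages prob_segment_longer_than over t ~ Exp(mean n_hap), which
    integrates to (1 + 4*m*N) / (1 + 2*m*N)**2.
    ASSUMPTION: constant size, n_hap haploid chromosomes, no end effects.
    """
    x = 2.0 * m * n_hap
    return (1.0 + 2.0 * x) / (1.0 + x) ** 2


# Illustrative values only: N = 10,000 chromosomes, cutoff m = 0.03 Morgans (3 cM).
print(expected_fraction_shared(10_000, 0.03))
```

Under these assumptions the expected fraction falls roughly as 2/(mN) for large mN, so larger populations and longer cutoffs both reduce sharing.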

26 Results overview Palamara et al., 2012; Carmi et al., Genetics, 2013; Carmi et al., Theor Popul Biol, 2014

27 Demographic Inference: Maximum Likelihood Carmi et al., Theor Popul Biol, 2014 Use the distribution of the number of shared segments

28 Demographic Inference: A Practical Approach
Palamara et al., 2012
Method: record IBD segments in each length bin; using Eq. (1), find the history N(t) that fits best.
[Figure: hypothetical example]
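The two steps of the method can be sketched for the simplest possible history, a constant size. Eq. (1) is not reproduced in this transcript, so the sketch substitutes a constant-size expression as a stand-in: the expected fraction of the genome in segments longer than m is taken to be (1 + 4mN)/(1 + 2mN)^2 for N haploid chromosomes, and the bin value is the difference of that expression at the bin edges. All function names, the least-squares criterion and the grid search are assumptions for illustration; the published method fits a general N(t).

```python
def fraction_longer_than(n_hap, m):
    """Stand-in model (ASSUMED): expected genome fraction in segments > m Morgans
    for a constant population of n_hap haploid chromosomes."""
    x = 2.0 * m * n_hap
    return (1.0 + 2.0 * x) / (1.0 + x) ** 2


def expected_bin_fractions(n_hap, edges):
    """Expected genome fraction in segments whose length falls in each [u, v) bin."""
    return [fraction_longer_than(n_hap, u) - fraction_longer_than(n_hap, v)
            for u, v in zip(edges, edges[1:])]


def observed_bin_fractions(segment_lengths, edges, total_length, n_pairs):
    """Step 1: record segments per length bin, as a fraction of the genome
    averaged over the n_pairs chromosome pairs that were compared."""
    sums = [0.0] * (len(edges) - 1)
    for length in segment_lengths:
        for k, (u, v) in enumerate(zip(edges, edges[1:])):
            if u <= length < v:
                sums[k] += length
                break
    return [s / (total_length * n_pairs) for s in sums]


def fit_constant_size(observed, edges, candidates):
    """Step 2: pick the candidate size whose expected bins best match the
    observed bins (least squares over a grid; a deliberately crude fit)."""
    def loss(n_hap):
        expected = expected_bin_fractions(n_hap, edges)
        return sum((o - e) ** 2 for o, e in zip(observed, expected))
    return min(candidates, key=loss)


# Synthetic check: generate bins from a known size and recover it.
edges = [0.01, 0.02, 0.03, 0.05, 0.10]  # bin edges in Morgans
synthetic = expected_bin_fractions(500, edges)
print(fit_constant_size(synthetic, edges, range(100, 2001, 100)))
```

A single constant size cannot represent a bottleneck followed by growth; the sketch only shows how binned sharing constrains a size parameter.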

29 IBD Sharing in Ashkenazi Jews
A pair of AJ individuals shares ≈50cM in ≈15 long segments (>3cM) (Gusev et al., 2012).
See also: Atzmon et al., 2010; Bray et al., 2010.
[Figure labels: AJ; EU]

30 Inferring the Bottleneck Size and Time Carmi et al., Nat. Commun., 2014 Palamara et al., 2012

32 Inferring the Bottleneck Size and Time Carmi et al., Nat. Commun., 2014 Palamara et al., 2012 Time (years)

33 Caveats
Phasing and sequencing errors; IBD detection errors.
Reasonable power only for 10-50 generations ago.
Model specification (e.g. prolonged bottleneck, admixture).
Parameter: 95% confidence interval
Ancestral size: 3654-5856
Bottleneck size: 249-419
Growth rate (per generation): 16-53%
Bottleneck time (years): 625-800
A bottleneck 700ya confirmed by an independent method: lengths of haplotypes around rare variants (Mathieson and McVean, 2014).

34 The Documented Ashkenazi History Origin? Founder event? European gene flow: o Where? o When? o How much? Relation to other Jews?

35 Outline Ashkenazi Jewish Genetics: Background The Ashkenazi Genome Sequencing Project Segment Sharing and Population History Opportunities and Future Directions

36 Coverage by Shared Segments
What fraction of the genome can we cover with shared segments?
[Figure labels: A sequenced reference panel; Partly sequenced genome; Impute; Full sequence; Partial sequence; Nearly-complete inferred sequence]
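Given the segments that a target genome shares with any member of a reference panel, the question on this slide reduces to the length of the union of those intervals divided by the genome length. A minimal sketch follows; the function name, the coordinate units and the example intervals are assumptions for illustration.

```python
def covered_fraction(segments, genome_length):
    """Fraction of [0, genome_length] covered by the union of (start, end) segments.

    Overlapping segments (shared with different panel members) count once.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0.0
    cur_start = cur_end = None
    for start, end in sorted(segments):
        # Clip to the genome and skip empty intervals.
        start, end = max(start, 0.0), min(end, genome_length)
        if end <= start:
            continue
        if cur_end is None or start > cur_end:
            # Disjoint from the current merged block: close it and start anew.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps or touches the current block: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


# Made-up segments (in cM) on a 100 cM chromosome.
print(covered_fraction([(0, 10), (5, 20), (30, 40)], 100))
```

As the panel grows, more segments are added to the union, which is the sense in which a larger reference panel moves toward near-complete coverage.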

37 The Era of Near-Complete Coverage
[Figure labels: Now; Phase II; Mine public data? Other studies?]
Opportunities: interpret personal genomes (time-stamp rare mutations); cost-effective large-scale association studies (resolve haplotypes; impute SNP arrays or low-coverage sequences; map rare variants/haplotypes).
See Carmi et al., Genetics, 2013 for a theoretical analysis.

38 The Era of Near-Complete Coverage
New algorithms needed!
Time-stamp rare mutations.
[Figure labels: g; IBD segment; Now; Phase II; Mine public data? Other studies?]

39 Ashkenazi History Origin? Founder event? European gene flow: o Where? o When? o How much? Relation to other Jews?

40 The Place of European Gene Flow
“Most of these theories … are myths or speculation … based on some vague or misunderstood references. … It will probably be impossible to say definitely where the hundreds or thousands of Jews in Poland in the 13th to 14th centuries came from.”
B. Weinryb, The Jews of Poland, 1972

41 Approach
Johnson et al., 2011; Moreno-Estrada et al., 2013
[Figure labels: EU; ME; AJ; An Ashkenazi genome; PC1; PC2]

42 Preliminary Results
Origin in the Levant.
Gene flow mostly from West-Europe, about 30 generations ago.
Sex-imbalanced history?

43 Summary
It is important to study Ashkenazi genetics.
We sequenced 128 whole genomes.
Useful for personal clinical genomics and imputation.
Segment sharing reveals a founder event and suggests opportunities.
My research statement

44 Acknowledgements
Funding: Human Frontier Science Program.
Itsik Pe’er’s lab: James Xue, Ethan Kochav, Shuo Yang, Pier Palamara, Vladimir Vacic.
TAGC consortium members: Todd Lencz, Semanti Mukherjee (LIJMC); Lorraine Clark, Xinmin Liu (CUMC); Gil Atzmon, Harry Ostrer, Danny Ben-Avraham (AECOM); Inga Peter, Judy Cho (ISMMS); Ariel Darvasi (HUJI); Joseph Vijai (MSKCC); Ken Hui (Yale); VIB Ghent, Belgium.
Harvard University: Peter Wilton, John Wakeley.
Sheba Medical Center: Eitan Friedman.
Thank you for your attention!

