Loss-of-co-Homozygosity mapping and exome sequencing of a Syrian pedigree identified the candidate causal mutation associated with rheumatoid arthritis.

Yukinori Okada 1,2, Namrata Gupta 2, Daniel Mirel 2, Stacey Gabriel 2, Thurayya Arayssi 3, Faten Mouassess 4, Walid AL. Achkar 4, Layla A. Kazkaz 5,6, Robert M. Plenge 1,2.

1. Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA.
2. Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, USA.
3. Weill Cornell Medical College-Qatar, Education City, Doha, Qatar.
4. Molecular Biology and Biotechnology Dept, Human Genetics Division, Damascus, Syria.
5. Tishreen Hospital, Damascus, Syria.
6. Syrian Association for Rheumatology, Damascus, Syria.

Background/Purpose: Although >50 rheumatoid arthritis (RA) risk loci containing common variants have been identified, no known loci harbor rare mutations that influence RA risk in a Mendelian fashion. Here, we performed whole-exome sequencing to search for rare, causal mutations in a 4-generation, 49-person consanguineous Syrian pedigree in which 8 individuals were affected with RA.

Method: We performed GWAS genotyping on 16 family members (affected and unaffected) and whole-exome sequencing in the 4 anti-CCP positive RA cases. We developed a novel non-parametric linkage analysis, which we term “Loss-of-co-Homozygosity” (LOcH) mapping, that extends homozygosity mapping to any mode of inheritance. LOcH uses genome-wide SNP data to search for regional stretches that lose one or both homozygous genotypes (i.e., lose “co-homozygosity”) in affected cases, in order to identify ancestry-shared haplotypes. Candidate mutations selected by exome sequencing and LOcH mapping were further validated by iPlex assay in 24 family members.

Result: Using GWAS data and LOcH mapping, we identified 12% of the genome in which the same ancestral haplotype was shared among all RA cases.
Exome sequencing identified 15 nonsense or missense candidate mutations shared among all cases. A validation iPlex assay found that 1 mutation preferentially segregated in cases compared to controls (P = 0.023). The mutated gene is phospholipase B1 (PLB1) at 2p23, which has been implicated in human epidermal barrier function.

Conclusion: While additional investigation of the PLB1 mutation is required, our approach highlights a novel method for statistical analysis of genome-wide sequence data.

Workflow:
▪ GWAS genotyping of 4 RA cases and 12 controls.
▪ “LOcH mapping” of genetic loci shared among the 4 RA cases.
▪ Exome sequencing of the 4 RA cases.
▪ Filtering of nonsense/missense variants shared among RA cases using the 1000 Genomes/ESP/dbSNP databases.
▪ Selection of 15 candidate causal variants.
▪ Validation of the candidate variants by iPlex assay in all 24 available family members.
▪ 4 variants shared among 5 RA cases and 1 anti-CCP antibody-positive control.

Genotype counts among cases at one SNP (+ = genotype observed, - = genotype not observed):

AA  AB  BB
+   +   +    Co-Homozygosity
+   -   +    Co-Homozygosity
-   +   -    Loss-of-co-Homozygosity
+   +   -    Loss-of-co-Homozygosity
-   -   +    Loss-of-co-Homozygosity
+   -   -    Loss-of-co-Homozygosity
-   +   +    Loss-of-co-Homozygosity

▪ LOcH mapping can impute the presence of an exome-derived mutation in each additional control using GWAS data in the LOcH stretch.
▪ When a control carries the ancestry-shared haplotype, the LOcH stretch remains after inclusion of that control.
▪ When a control does not carry the ancestry-shared haplotype, the LOcH stretch diminishes after inclusion of that control.

LOcH mapping identifies ancestry-shared haplotypes among affected cases
▪ In a family with a Mendelian disease, the causal mutation resides on the same ancestry-shared haplotype.
▪ Regardless of recessive or dominant mode of inheritance, all cases carry at least one copy of the ancestry-shared haplotype, which should appear as “Loss-of-co-Homozygosity (LOcH)” in GWAS data.
▪ LOcH mapping can screen for loci carrying the causal mutation, as an extension of homozygosity mapping to diseases with an unknown mode of inheritance.
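The LOcH idea described above can be illustrated with a minimal Python sketch. It is not the authors' Java software. It assumes genotypes are coded 0 = AA, 1 = AB, 2 = BB (None = missing), treats a SNP as co-homozygous when both AA and BB occur among the subjects, and reports runs of consecutive SNPs without co-homozygosity. The function names, the `min_snps` threshold, and the toy genotypes are hypothetical; a real analysis would also need to handle genotyping error and physical stretch length.

```python
from typing import List, Optional, Sequence, Tuple

Genotype = Optional[int]  # assumed coding: 0 = AA, 1 = AB, 2 = BB, None = missing


def is_co_homozygous(genotypes: Sequence[Genotype]) -> bool:
    """True when both homozygous genotypes (AA and BB) occur at one SNP."""
    observed = {g for g in genotypes if g is not None}
    return 0 in observed and 2 in observed


def loch_stretches(
    cases: Sequence[Sequence[Genotype]], min_snps: int = 2
) -> List[Tuple[int, int]]:
    """Return (start, end) SNP index ranges (end inclusive) free of co-homozygosity.

    cases[i][j] is the genotype of case i at SNP j; min_snps is a hypothetical
    minimum run length used here only to suppress very short runs.
    """
    n_snps = len(cases[0])
    stretches: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for j in range(n_snps):
        column = [person[j] for person in cases]
        if not is_co_homozygous(column):
            if start is None:
                start = j  # a new candidate stretch begins here
        else:
            if start is not None and j - start >= min_snps:
                stretches.append((start, j - 1))
            start = None  # co-homozygosity breaks the stretch
    if start is not None and n_snps - start >= min_snps:
        stretches.append((start, n_snps - 1))
    return stretches


def stretch_survives(
    cases: Sequence[Sequence[Genotype]],
    extra: Sequence[Genotype],
    stretch: Tuple[int, int],
) -> bool:
    """True if the stretch stays free of co-homozygosity after adding one subject."""
    first, last = stretch
    return all(
        not is_co_homozygous([person[j] for person in cases] + [extra[j]])
        for j in range(first, last + 1)
    )


# Toy data (not from the study): 3 cases genotyped at 6 SNPs.
toy_cases = [
    [0, 0, 1, 2, 0, 2],
    [0, 1, 1, 2, 2, 2],
    [1, 0, 0, 1, 0, 2],
]
toy_stretches = loch_stretches(toy_cases)
control_a = [0, 0, 1, 2, 1, 2]  # compatible with the shared haplotype
control_b = [2, 0, 1, 0, 1, 2]  # introduces co-homozygosity inside the stretch
```

In this toy example, SNP 4 shows both AA and BB among the cases, so it terminates the stretch covering SNPs 0 to 3. Adding `control_a` leaves that stretch intact, while adding `control_b` breaks it, mirroring the imputation logic in the bullets above.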
~ Study design ~
▪ 1 variant in the PLB1 gene preferentially segregated in RA cases compared to controls.

~ Syrian family with RA ~
We enrolled a consanguineous Syrian family with rheumatoid arthritis (RA). The 49 family members include 8 RA cases (II-12, 13, 14; III-3, 17, 18; IV-5, 9) and 1 anti-CCP antibody-positive control (III-2).
▪ 4 RA cases: exome sequence, GWAS genotyping, iPlex validation assay.
▪ 12 controls: GWAS genotyping, iPlex validation assay.
▪ 1 RA case, 1 anti-CCP positive control, and 6 controls: iPlex validation assay.

~ LOcH (Loss-of-co-Homozygosity) mapping ~
LOcH mapping imputes candidate mutations in non-exomed subjects.

~ Exome sequence of RA cases ~
Subjects: 4 RA cases.
Exon capture: Agilent SureSelect Human All Exon Kit v2 (~44 Mb).
Sequencer: Illumina HiSeq.
Analysis: GATK pipeline at Broad (GRCh37.64).
Mean/median depth of the variants: 290.1/204.
Genotype concordance with GWAS data: 99.56%.

Variant filtering of exome sequence data:
▪ 65,524 variants identified by exome sequence. Ts/Tv= ,804 missense / nonsense variants.
▪ 900 variants remained after excluding those present in dbSNP 132/1000G Phase I/ESP 5400 with non-reference allele frequency ≥0.05.
▪ 476/156 SNVs/indels with available genotypes in all 4 cases.
▪ 13/8 SNVs/indels with ≥1 non-reference allele in all 4 cases.

~ LOcH stretches and exome-derived mutations ~
[Figure panels: LOcH stretches and SNVs; LOcH stretches and indels]
▪ LOcH mapping of the 4 RA cases identified 36 LOcH stretches covering 12% of the genome.
▪ All exome-derived candidate causal SNVs were located in LOcH stretches (P = 1.1× ).
▪ Only 2 of 8 exome-derived candidate causal indels were located in LOcH stretches (P = 0.44).
▪ The different overlap rates of SNVs and indels suggest lower quality of the exome-derived indel calls.

▪ Java™ software for LOcH mapping and genotype imputation is available on request from the authors.
▪ Contact: Yukinori Okada, MD, PhD

~ Candidate causal SNV in PLB1 gene at 2p23 ~
▪ We conducted a validation iPlex assay of the candidate causal mutations in all 24 available family members.
▪ The 13 exome-derived SNVs and 2 indels located in LOcH stretches were selected.
▪ 3 SNVs and 1 indel were observed in all 5 RA cases and in 1 anti-CCP positive control.
▪ Of these, a non-synonymous SNV in the phospholipase B1 (PLB1) gene at 2p23 preferentially segregated in cases compared to controls (P = 0.023).
[Figure: Distribution of PLB1 mutation; legend: subjects with PLB1 mutation, subjects without PLB1 mutation]
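The poster does not state which test produced the segregation P-value of 0.023. As one illustration of how carrier status can be compared between cases and controls, the sketch below computes a one-sided Fisher's exact test on a 2×2 table. The choice of test is an assumption, the function name is hypothetical, and the counts in the example are placeholders, not the study's data.

```python
from math import comb


def fisher_one_sided(
    carrier_cases: int,
    noncarrier_cases: int,
    carrier_controls: int,
    noncarrier_controls: int,
) -> float:
    """One-sided Fisher's exact P-value for enrichment of carriers among cases.

    Sums hypergeometric probabilities of tables with at least as many
    carrier cases as observed, holding the row and column totals fixed.
    """
    n_cases = carrier_cases + noncarrier_cases
    n_carriers = carrier_cases + carrier_controls
    n_total = n_cases + carrier_controls + noncarrier_controls
    denominator = comb(n_total, n_cases)
    tail = 0
    for k in range(carrier_cases, min(n_cases, n_carriers) + 1):
        # k carriers among the cases, the remaining cases drawn from non-carriers
        tail += comb(n_carriers, k) * comb(n_total - n_carriers, n_cases - k)
    return tail / denominator


# Placeholder counts (NOT the study's data): 3 of 3 cases carry the variant,
# 0 of 3 controls carry it.
p_example = fisher_one_sided(3, 0, 0, 3)
```

With these placeholder counts only one table is as extreme as the observed one, so the P-value is 1/C(6,3) = 0.05. In a pedigree, relatedness among subjects violates the independence assumption of this test, so such a P-value should be read as descriptive.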