The DREAM Rheumatoid Arthritis Responder Challenge: Motivation, Data, Scoring and Results. Lara Mangravite, Sage Bionetworks, on behalf of the RA Challenge Organizing Team
Challenge Organizers Eli Stahl, Mt Sinai Gaurav Pandey, Mt Sinai Jing Cui, Brigham and Women’s Andre Falcao, U Lisbon Robert Plenge, Merck Peter Gregersen, Feinstein Institute Jeff Greenberg, Corrona Dimitrios Pappas, Corrona Kaleb Michaud, Arthritis Internet Registry Generators of Training Dataset Solly Sieberts Abhi Pratap Christine Suver Bruce Hoff Thea Norman Venkat Balagurusamy Stephen Friend Gustavo Stolovitzky Funders
Rheumatoid Arthritis Treatment: ~30% of RA patients fail to respond to anti-TNF therapy. -- Predicting nonresponse would assist in precision medicine, clinical trial design, and development of new therapies. Robert Plenge
Pharmacogenetics of anti-TNF response (n=2,706). Columns: Drug, N, SNP-heritability (se), P-value. All patients: (0.10), 0.02; etanercept: 7160 (0.34), 0.5; infliximab: (0.29), 0.02; adalimumab: (0.25), 0.08; infliximab + adalimumab: (0.13), 0.003. Cui and Stahl et al., PLoS Genetics 2013. Eli Stahl
Rationale: Given sizable estimated heritability, is it possible to use genetic features to predict treatment response? Polygenic approach: combined influence of weak effects. Population subtypes: not all individuals react similarly. Does genetic heritability foretell genetic prediction?
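The polygenic approach named above can be illustrated as a weighted sum of allele dosages, where each SNP contributes only a small amount. This is a minimal sketch, not the challenge's method: the function name `polygenic_score`, the SNP identifiers, and the effect weights are all hypothetical.

```python
# Hypothetical sketch: a polygenic score as a weighted sum of allele dosages.
# SNP ids and effect weights below are made up for illustration only.

def polygenic_score(dosages, weights):
    """Sum weight * dosage over the SNPs that have a weight.

    dosages: dict mapping SNP id -> allele dosage (0, 1, or 2 copies)
    weights: dict mapping SNP id -> per-allele effect estimate
    SNPs missing from `dosages` contribute nothing.
    """
    return sum(w * dosages.get(snp, 0) for snp, w in weights.items())


# Many weak effects, none decisive on its own (illustrative values).
weights = {"snp_a": 0.05, "snp_b": -0.03, "snp_c": 0.02}

patients = {
    "patient_1": {"snp_a": 2, "snp_b": 0, "snp_c": 1},
    "patient_2": {"snp_a": 0, "snp_b": 2, "snp_c": 0},
}

scores = {pid: polygenic_score(d, weights) for pid, d in patients.items()}
```

In practice the weights would come from a discovery GWAS and be evaluated on separate patients.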
RA Responder Challenge Design. Discovery (phase I): GWAS of treatment response in RA (n≈2,700 patients); genomic data (e.g., expression profiling); polygenic SNP predictor of response; refine model. Validation (phase II): submit models; GWAS of treatment response in RA (n≈1,100 patients); score models. Plenge et al., Nature Genetics 2013
RA Challenge Data. Discovery Dataset (N=2076; combined set from 4 studies): genotypes, ~2.3 million SNPs; clinical, ~6 traits; response. Test Data (N=723; generated for this challenge): genotypes, ~2.3 million SNPs; clinical, ~6 traits.
RA Challenge: Build the best possible predictors of anti-TNF response in RA. TEAM PHASE (February - June 2014): Self-aggregate into teams and build the best possible predictor of response. COMMUNITY PHASE (July - October 2014): Work together across teams to assess the contribution of genetics to prediction.
RA Responders Challenge. Subchallenge 1: Predict treatment response as measured by change in disease activity score (DAS28) in response to anti-TNF therapy. Scoring: Average rank of Pearson correlation and Spearman correlation. Subchallenge 2: Identify poor responders to anti-TNF therapy as defined by EULAR criteria. Scoring: Average rank of AUC and PR.
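One way to read "average rank of Pearson correlation and Spearman correlation" is: rank all submissions under each metric separately, then average each submission's two ranks. The sketch below follows that reading, which is an assumption about the scoring procedure rather than a statement of it. Team names and values are hypothetical, and only the standard library is used.

```python
# Sketch of "average rank of Pearson and Spearman" scoring (standard library only).
# Team names and numbers are hypothetical; the rank-averaging scheme is assumed.
from math import sqrt


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def average_ranks(values):
    """Ranks starting at 1 for the smallest value; ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(average_ranks(x), average_ranks(y))


def leaderboard(observed, submissions):
    """Return {team: mean of its Pearson rank and Spearman rank}; 1 is best."""
    teams = list(submissions)
    p = [pearson(observed, submissions[t]) for t in teams]
    s = [spearman(observed, submissions[t]) for t in teams]
    # Negate so the highest correlation receives rank 1.
    p_rank = average_ranks([-v for v in p])
    s_rank = average_ranks([-v for v in s])
    return {t: (pr + sr) / 2 for t, pr, sr in zip(teams, p_rank, s_rank)}


observed = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical change in DAS28
submissions = {
    "team_x": [1.1, 2.1, 2.9, 4.2, 4.8],  # close to observed
    "team_y": [5.0, 4.0, 3.0, 2.0, 1.0],  # reversed
    "team_z": [1.0, 3.0, 2.0, 5.0, 4.0],  # partly shuffled
}
final = leaderboard(observed, submissions)
```

The same rank-then-average pattern would apply to Subchallenge 2 by swapping in the two classification metrics.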
Team Phase Results (32 teams). Subchallenge 1: Predicting deltaDAS. Subchallenge 2: Predicting nonresponders. Best models: Team Guan Lab & Team SBI_Lab. Solly Sieberts
The Community Phase (July - October). Work in collaboration to determine: -- Whether genetic information contributes in a meaningful way to predictions. -- The best possible predictors of response. -- Which components of the modeling approaches are most beneficial for this question.
Community Phase Participants
Community Phase Logistics. First part: teams split into groups and shared knowledge to help inform one another’s efforts. Second part: all teams came together to devise an analytical plan to explicitly address these questions.
Teams share ideas and then work individually to answer: Do models using genetic features improve prediction relative to clinical models? What is the contribution of feature selection vs. modeling algorithm to performance? Does the use of biological priors in feature selection improve performance relative to random selection? Can a supervised ensemble approach improve upon individual predictions?
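The supervised ensemble question above can be made concrete with a small stacking example: learn weights for two base models' predictions by least squares, then compare the combined prediction to each model alone. This is an illustrative sketch only, not the ensemble method used in the challenge. The function names and all numbers are hypothetical, and the two models' errors are constructed to cancel, so it shows a best case.

```python
# Hypothetical sketch of a supervised (stacked) ensemble of two base models.
# Weights are fit by least squares; all numbers are made up for illustration.

def fit_stack_weights(pred_a, pred_b, target):
    """Solve the 2x2 normal equations for target ~ wa*pred_a + wb*pred_b."""
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * y for a, y in zip(pred_a, target))
    sby = sum(b * y for b, y in zip(pred_b, target))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("base predictions are collinear")
    wa = (say * sbb - sby * sab) / det
    wb = (sby * saa - say * sab) / det
    return wa, wb


def mse(pred, target):
    return sum((p - y) ** 2 for p, y in zip(pred, target)) / len(target)


target = [1.0, 2.0, 3.0, 4.0]   # hypothetical observed response
model_a = [1.2, 1.8, 3.3, 3.9]  # errors of model_a ...
model_b = [0.8, 2.2, 2.7, 4.1]  # ... are mirrored by model_b

wa, wb = fit_stack_weights(model_a, model_b, target)
ensemble = [wa * a + wb * b for a, b in zip(model_a, model_b)]
```

A real comparison would fit the weights on held-out predictions and evaluate on data not used for fitting, since weights tuned and scored on the same patients overstate the gain.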
Subchallenge 1: Predicting deltaDAS
Subchallenge 2: Predicting Nonresponders
Ensemble Modeling by Gaurav Pandey
Conclusions: Gaussian Process Regression appears to work best with this type of problem. SNP selection is more important than algorithm selection in most cases. Genetic information improves prediction of nonresponders over the use of clinical information alone. The ability to predict response based on clinical features may be valuable to clinicians in and of itself.
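For readers unfamiliar with Gaussian process regression, the following is a minimal sketch of the posterior mean prediction on a small set of SNP dosages. It is not the implementation used by any team: the RBF kernel, the length scale, the noise level, and the toy data are all assumptions chosen for illustration.

```python
# Minimal Gaussian process regression sketch (RBF kernel, standard library only).
# Not the teams' implementation; kernel, hyperparameters, and data are assumptions.
from math import exp, sqrt


def rbf(u, v, length_scale=1.0):
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return exp(-sq / (2 * length_scale ** 2))


def cholesky(m):
    """Lower-triangular factor of a symmetric positive definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    n = len(b)
    z = [0.0] * n
    for i in range(n):  # forward substitution: low @ z = b
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: low.T @ x = z
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_predict(train_x, train_y, test_x, noise=1e-2):
    """Posterior mean of a zero-mean GP with an RBF kernel."""
    n = len(train_x)
    k = [[rbf(train_x[i], train_x[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve_cholesky(cholesky(k), train_y)
    return [sum(rbf(t, train_x[i]) * alpha[i] for i in range(n)) for t in test_x]


# Toy data: dosages at two hypothetical SNPs and a made-up response value.
train_x = [[0, 1], [1, 0], [2, 2], [0, 0]]
train_y = [0.5, 1.0, 2.0, 0.0]

fitted = gp_predict(train_x, train_y, train_x)      # predictions at training points
new_patient = gp_predict(train_x, train_y, [[1, 1]])  # prediction for unseen dosages
```

With a small noise term the fitted values stay close to the training responses, and predictions for unseen dosage patterns are weighted by kernel similarity to the training patients.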
Today’s Speakers: Best Performers from Independent Team Phase. Fan Zhu on behalf of Team Guan Lab: "A generic method for predicting clinical outcomes and drug response". Javier Garcia-Garcia on behalf of Team SBI_Lab: "Predicting response to arthritis treatments: regression-based Gaussian processes on small sets of SNPs"