Estimating fitness landscapes
John Pinney


Genotype network

0 = ‘Wild type’

[Figure: genotype network with nodes 0 and Δ1]

[Figure: genotype network with nodes 0, Δ1, Δ2, Δ3, Δ4, Δ5]

[Figure: genotype network with nodes 0, Δ1, Δ1Δ2, Δ1Δ2Δ3, Δ1Δ2Δ3Δ4, Δ1Δ2Δ3Δ4Δ5]

+ Fitness values at every node = Fitness landscape
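
As a rough illustration of "genotype network + fitness values = fitness landscape", the sketch below builds the full network over a small set of mutations in plain Python. The mutation labels (`d1`, `d2`, `d3`, standing in for Δ1, Δ2, Δ3) and the fitness numbers are hypothetical placeholders, not measured data.

```python
from itertools import combinations


def genotype_network(mutations):
    """Return (nodes, edges) of the full genotype network over `mutations`.

    Each node is a frozenset of mutations (the empty set is the wild type, 0).
    An edge joins two genotypes that differ by exactly one mutation.
    """
    nodes = [
        frozenset(combo)
        for k in range(len(mutations) + 1)
        for combo in combinations(mutations, k)
    ]
    edges = [(a, b) for a, b in combinations(nodes, 2) if len(a ^ b) == 1]
    return nodes, edges


# Hypothetical labels standing in for the mutations on the slides.
nodes, edges = genotype_network(["d1", "d2", "d3"])

# Hypothetical fitness values, one per node (illustrative only).
fitness = {genotype: 1.0 + 0.1 * len(genotype) for genotype in nodes}

# The landscape is simply the network plus a fitness value at every node.
landscape = {"nodes": nodes, "edges": edges, "fitness": fitness}
```

With n mutations the full network has 2^n nodes, which is one way to see why fitness data can only ever cover a small fraction of genotypes.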

With an accurate fitness landscape we could predict:
- mutational trajectories, e.g. under drug treatment
- rates of emergence of drug resistance
- optimal drug combinations to prevent emergence of drug resistance

At best, fitness data for only relatively few genotypes will be available.

How can we estimate unobserved values?
How can we tell if these estimates are good enough for real applications of fitness landscapes?

How can we estimate unobserved values?
Specific mutations are expected to contribute to fitness in different ways => machine learning based on mutations as features.

HIV-1 drug resistance database
A great resource for exploring genotype-phenotype relationships. Includes a large amount of sequence data from clinical and lab studies from the early 1990s onwards.

In vitro data
Viruses with known sequence are assayed for their ability to replicate in vitro in the presence of various drugs.
Most of these isolates were obtained from patients who may have been untreated or on any number of drug regimens => some biases in sequence coverage.
Genotypes are described using mutations relative to a particular consensus sequence (e.g. subtype B).
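
A minimal sketch of describing genotypes as mutations relative to a consensus sequence and turning them into indicator variables. The helper names, the five-residue toy sequences, and the `L3M`-style labels (consensus residue, 1-based position, observed residue) are assumptions for illustration; real data would use aligned protease or reverse transcriptase sequences.

```python
def mutations_vs_consensus(seq, consensus):
    """List substitutions relative to the consensus, e.g. 'I3V'."""
    if len(seq) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{c}{pos}{s}"
        for pos, (c, s) in enumerate(zip(consensus, seq), start=1)
        if c != s
    ]


def indicator_matrix(mutation_lists):
    """Encode each genotype as a 0/1 row over all observed mutations."""
    features = sorted({m for muts in mutation_lists for m in muts})
    rows = [[1 if f in muts else 0 for f in features] for muts in mutation_lists]
    return features, rows


# Toy, hypothetical sequences (not real HIV data).
consensus = "PQITL"
isolates = ["PQITL", "PQVTL", "PKVTL"]

mutation_lists = [mutations_vs_consensus(s, consensus) for s in isolates]
features, rows = indicator_matrix(mutation_lists)
```

Each row of `rows` is one isolate and each column one mutation, which is the feature representation a regression model would take as input.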

Summary of PhenoSense results for a variety of protease inhibitors (PIs).

Machine learning from in vitro data
Using mutations relative to the consensus sequence as indicator variables, we can apply standard machine learning techniques to predict fitness under a given condition from the sequence.
Given the large number of uninformative features, LASSO and other techniques that include feature selection tend to do well.
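
To make the feature-selection point concrete, here is a small plain-Python LASSO fitted by coordinate descent. It is a sketch under simplifying assumptions (no intercept, a fixed number of sweeps, toy hypothetical data) and minimises (1/2n)·||y − Xβ||² + α·||β||₁; in practice one would more likely use an established library implementation.

```python
def soft_threshold(value, threshold):
    """Shrink `value` towards zero by `threshold` (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Fit LASSO coefficients (no intercept) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, alpha) / scale if scale > 0 else 0.0
    return beta


# Toy, hypothetical data: column 0 drives the phenotype, column 1 is uninformative.
X = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 1]]
y = [2.0, 2.0, 0.0, 0.0, 2.0, 0.0]

beta = lasso_coordinate_descent(X, y, alpha=0.1)
```

On this toy data the coefficient of the uninformative mutation is driven exactly to zero, while the informative one is kept (slightly shrunk), which is the behaviour that makes LASSO attractive when most mutations carry no signal.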

From Rhee et al. (2006): least-squares regression used to obtain coefficients for the contribution of each mutation to resistance against a selection of PI drugs.

From Hinkley et al. (2011), using generalised kernel ridge regression. A model using only main effects (ME) was tested against models incorporating epistasis: inter-genic, intra-genic or both (MEEP).

From Hinkley et al. (2011): these authors found an ~18% improvement in predictive power by including epistasis between mutations within the same gene, e.g. the HIV protease shown.
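
One simple way to picture the difference between a main-effects model and one with epistasis is to add a product column for every pair of mutations. This is not the generalised kernel ridge regression used by Hinkley et al.; it is only an explicit-feature sketch, and the mutation names `m1`, `m2`, `m3` are hypothetical.

```python
from itertools import combinations


def add_pairwise_epistasis(features, rows):
    """Append one interaction column per pair of mutations.

    The interaction column is 1 only when both mutations are present,
    so its coefficient captures the deviation from additive main effects.
    """
    pairs = list(combinations(range(len(features)), 2))
    names = features + [f"{features[a]}:{features[b]}" for a, b in pairs]
    expanded = [row + [row[a] * row[b] for a, b in pairs] for row in rows]
    return names, expanded


# Hypothetical mutation labels and indicator rows (main effects only).
main_features = ["m1", "m2", "m3"]
main_rows = [[1, 1, 0], [1, 1, 1]]

names, expanded_rows = add_pairwise_epistasis(main_features, main_rows)
```

The number of interaction columns grows quadratically with the number of mutations, so a model of this kind generally needs regularisation or feature selection to stay estimable.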

In vivo data
A drug resistance fitness landscape in vitro may not be the same as that experienced by the virus when exposed to the patient’s immune system.
Another approach is to learn fitness landscapes by comparing the sequences of drug-naïve viruses against those obtained from patients on a specific drug regimen.

Machine learning from in vivo data
Deforche et al. (2008) apply a Bayesian network.
Probability of a set of mutations (A_1, A_2, ..., A_n)
Fitness of a set of mutations (A_1, A_2, ..., A_n)
A phylogenetic guide tree is used to take sequence sampling bias into account.

Predicting and validating mutational trajectories

Where next?