David Kim Allergan Inc. SoCalBSI California State University, Los Angeles.



Objective Develop a model to predict corneal permeability based on literature compounds

Introduction Ocular drug delivery mechanism (through cornea and/or conjunctiva) Focus of the project is the corneal route

Three major cell layers of the Cornea

Why predict corneal permeability? Allergan, Inc. develops drugs that are administered through the eye A drug is only effective if it can reach its target tissue Determining whether a drug can pass through the cornea before it is synthesized can save the company time and money

Introduction Few models have been developed to predict corneal permeability Congeneric model (one class of compounds) Non-congeneric model (multiple classes of compounds) Develop non-congeneric model focused on drug-like compounds

Model-building workflow: Literature (compound names, log PC and log D, structures of compounds) → Generate descriptor values → Filter descriptors (intuitively) → Find optimal training and testing set percentage → Run Partial Least Squares modeling → Remove descriptor with the lowest importance → Rebuild model → Statistical analysis → Pick best model → Final model

Partition Coefficient: Log D = log of the Distribution Coefficient (pH 7.65) Log PC = log of the Permeability Coefficient (cm/s) Yoshida, F., Topliss, J.G., J. Pharm. Sci. 85, (1996)

Compounds in Literature Went through published literature Filtered compounds to keep only drug-like compounds Came up with 30 compounds and their measured permeability Next step in our model building process is to produce descriptors for each of our compounds

Descriptors Molecular weight or volume Degree of ionization Aqueous solubility Hydrogen-bonding Log D Polar surface area (PSA) pKa Solvent accessible surface area

Schrödinger Software Named after Erwin Schrödinger –Nobel prize winner for the Schrödinger equation which deals with quantum mechanics Suite of various programs dealing with computational chemistry Two programs used: Maestro – calculate descriptor values Canvas – generate model

Maestro Program

Can generate 77 descriptors Can manually input descriptors (e.g. log D) Filtered out descriptors that do not relate to permeability (intuitively) to reduce noise Came up with 30 descriptors to use Export the 30 compounds and their 30 descriptors to Canvas

Canvas Program

Partial Least Squares (PLS) modeling Can specify what descriptors to use to build the model Can specify the compounds used for training and testing the model Model assessment: corresponding statistics of the model
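
To illustrate how a PLS model arrives at one weight per descriptor, here is a minimal sketch of single-component PLS for one response (PLS1) in plain Python. It is not the Canvas implementation: the function names and the toy data are assumptions made for illustration, and real models usually use more than one component.

```python
import math

def center(column):
    """Subtract the column mean so the values are centered on zero."""
    mean = sum(column) / len(column)
    return [v - mean for v in column]

def pls1_one_component(X, y):
    """Single-component PLS1 on centered data.

    X is a list of rows (one row per compound, one column per descriptor),
    y is the centered response. Returns one regression weight per descriptor.
    """
    n_cols = len(X[0])
    # Weight vector: direction of maximum covariance between X and y.
    w = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(n_cols)]
    norm = math.sqrt(sum(wj * wj for wj in w))
    w = [wj / norm for wj in w]
    # Scores: projection of each compound onto the weight vector.
    t = [sum(row[j] * w[j] for j in range(n_cols)) for row in X]
    # Regress y on the scores.
    q = sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)
    return [wj * q for wj in w]

# Toy data (hypothetical): two identical descriptors, response = their sum.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 2.0, 3.0, 4.0]
y = [a + b for a, b in zip(x1, x2)]

X = [list(row) for row in zip(center(x1), center(x2))]
coefficients = pls1_one_component(X, center(y))
print(coefficients)  # each weight is approximately 1.0
```

Because the response here is exactly the sum of the two descriptors, the sketch recovers a weight of about 1 for each, which is a quick sanity check on the arithmetic.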

Statistics Training Set Standard deviation (SD) – low Coefficient of determination (R²) – high, close to 1 Coefficient of determination, cross validation (R²-CV) – high, close to 1 Stability – close to 1 F-statistic (overall significance of the model) – high P-value (probability that correlation happened by chance) – low, < 0.01
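
As a rough illustration of the training-set statistics listed above, the sketch below computes SD, R² and the F-statistic from observed and fitted values using the textbook regression definitions. Canvas may define these quantities slightly differently, and the numbers used here are made up, not taken from the presentation.

```python
import math

def training_stats(observed, predicted, n_terms):
    """Textbook fit statistics for a model with n_terms fitted terms."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    dof = n - n_terms - 1                      # residual degrees of freedom
    r2 = 1.0 - ss_res / ss_tot                 # coefficient of determination
    sd = math.sqrt(ss_res / dof)               # standard deviation of residuals
    f_stat = (r2 / n_terms) / ((1.0 - r2) / dof)
    return {"SD": sd, "R2": r2, "F": f_stat}

# Hypothetical log PC values for five training compounds.
observed = [-5.0, -4.5, -4.0, -5.5, -6.0]
predicted = [-5.1, -4.4, -4.2, -5.4, -5.9]

stats = training_stats(observed, predicted, n_terms=1)
print(stats)
```

A low SD together with R² close to 1 and a large F corresponds to the "good model" pattern the slide describes.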

Statistics Testing Set Root Mean Squared Error (RMSE) – low Q² – high, close to 1 Pearson correlation coefficient (r-Pearson) – high, close to 1 Important for the assessment of what percentage of the compounds we want to use for the training set Important for the assessment of our model as we start to remove unnecessary descriptors
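
The test-set statistics can be sketched the same way. The definitions below are the common ones; in particular, Q² is computed here against a reference mean passed in by the caller (often the training-set mean), which is an assumption, since the slides do not say which convention Canvas uses. The values are invented for illustration.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error of the predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def q_squared(observed, predicted, reference_mean):
    """Predictive R^2: 1 - (prediction error) / (spread around a reference mean)."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - reference_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical log PC values for three test compounds.
test_observed = [-5.0, -4.0, -6.0]
test_predicted = [-5.2, -4.1, -5.7]

test_rmse = rmse(test_observed, test_predicted)
test_q2 = q_squared(test_observed, test_predicted, reference_mean=-5.0)
test_r = pearson_r(test_observed, test_predicted)
print(test_rmse, test_q2, test_r)
```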

Finding the ideal training set percentage Ran PLS modeling specifying various percentages to use for the training set 40%, 50%, 60%, 70%, 80% Looked at the statistics of each of the models built Found that using 80% of the compounds for the training set was ideal 30 compounds found in literature 24 in training set and 6 in the testing set
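
The split sizes for each candidate percentage follow directly from the 30 compounds. The sketch below assumes a seeded random assignment, which the slides do not specify; the compound names are placeholders.

```python
import random

def split_compounds(compounds, train_fraction, seed=0):
    """Shuffle reproducibly, then cut into training and testing sets."""
    shuffled = list(compounds)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder names standing in for the 30 literature compounds.
compounds = [f"compound_{i}" for i in range(1, 31)]

sizes = {}
for fraction in (0.4, 0.5, 0.6, 0.7, 0.8):
    train, test = split_compounds(compounds, fraction)
    sizes[fraction] = (len(train), len(test))

print(sizes[0.8])  # (24, 6): the 80% split chosen in the presentation
```

Each candidate split would then be passed to the modeling step and compared on the training and testing statistics.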

bx coefficient After the PLS model is built, it gives the bx coefficient for each descriptor in order to predict permeability The bx coefficient is the weight that the model puts on the descriptor after the descriptor values have been scaled Example: log PC = 0.348(scaled MW) –0.221(scaled log D) (scaled log P)……
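
To make the role of the scaled descriptors concrete, the sketch below scales two descriptor columns and applies the two bx coefficients that are legible in the example (0.348 for MW and –0.221 for log D). The scaling method (subtract the mean, divide by the sample standard deviation) is an assumption, the descriptor values are invented, and the result is only a partial sum because the remaining terms of the model are not shown on the slide.

```python
import statistics

def autoscale(values):
    """Center on the mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def weighted_sum(scaled_row, coefficients):
    """Sum of bx * scaled descriptor value over the listed descriptors."""
    return sum(coefficients[name] * scaled_row[name] for name in coefficients)

# Invented descriptor values for three compounds.
mw = [180.0, 250.0, 320.0]
log_d = [-1.0, 0.5, 2.0]

scaled = {"MW": autoscale(mw), "logD": autoscale(log_d)}
bx = {"MW": 0.348, "logD": -0.221}   # the two coefficients shown in the example

rows = [{name: column[i] for name, column in scaled.items()} for i in range(len(mw))]
partial_sums = [weighted_sum(row, bx) for row in rows]
print(partial_sums)
```

Because every descriptor is on the same scale after this step, the magnitudes of the bx coefficients can be compared directly, which is what makes them usable as a measure of descriptor importance.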

Removal of descriptors Started with 30 descriptors and built a model Identified the descriptor with the lowest bx coefficient and removed it Rebuilt model with 29 descriptors Repeat…. while keeping track of the statistics Want to keep track of statistics to know when to stop Example: log PC = 0.348(scaled MW) –0.221(scaled log D) (scaled log P)………….(30) log PC = 0.392(scaled MW) –0.183(scaled log D)……………………………….………….(29)
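
The removal loop can be sketched as follows. "Lowest bx coefficient" is interpreted here as smallest absolute value, which is an assumption. The `fit_model` argument is a hypothetical stand-in for a Canvas PLS run, and `toy_fit` is a deliberately simple placeholder (a covariance-based weight), not PLS itself.

```python
def backward_elimination(descriptors, fit_model, min_descriptors=1):
    """Repeatedly drop the descriptor with the smallest |coefficient|.

    descriptors: dict mapping descriptor name -> list of scaled values.
    fit_model:   callable returning (coefficients dict, statistics dict).
    Returns a history of (descriptor names, statistics) for every model built,
    so the statistics can be inspected to decide where to stop.
    """
    current = dict(descriptors)
    history = []
    while True:
        coefficients, stats = fit_model(current)
        history.append((sorted(current), stats))
        if len(current) <= min_descriptors:
            break
        weakest = min(coefficients, key=lambda name: abs(coefficients[name]))
        del current[weakest]
    return history

# Hypothetical response and scaled descriptor columns for three compounds.
y = [-1.0, 0.0, 1.0]
columns = {
    "A": [-1.0, 0.0, 1.0],
    "B": [0.5, -1.0, 0.5],
    "C": [-1.0, 1.0, 0.0],
}

def toy_fit(current):
    """Placeholder model: weight = sample covariance of each column with y."""
    n = len(y)
    coefficients = {
        name: sum(x * yi for x, yi in zip(col, y)) / (n - 1)
        for name, col in current.items()
    }
    return coefficients, {"n_descriptors": len(current)}

history = backward_elimination(columns, toy_fit)
for names, stats in history:
    print(names, stats)
```

In the presentation's workflow, the recorded statistics for each rebuilt model are what indicate when further removal starts to hurt the model.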

Figure panels: Test Statistics, Training Statistics

Remaining 8 Descriptors CIQPlogS – conformation-independent predicted aqueous solubility QPlogS – predicted aqueous solubility FOSA – hydrophobic component of the total solvent accessible surface area PISA – π (carbon and attached hydrogen) component of the total solvent accessible surface area

Remaining 8 Descriptors QPlogKp - predicted skin permeability QPlogBB – predicted blood/brain partition coefficient donorHB - Estimated number of hydrogen bonds that would be donated by the solute to water molecules in an aqueous solution log D – Distribution coefficient

Permeability Model Function log PC = (scaled CIQPlogS) (scaled FOSA) (scaled PISA) (scaled QPlogBB) (scaled QPlogKp) (scaled QPlogS) (scaled donorHB) (scaled log D) SD = R² = F = (p < )

Predicted vs Observed Permeability

Conclusion Successfully created a model to predict the corneal permeability of compounds Showed that the Schrödinger software generates significant descriptors to build a permeability model

Potential Future Work Apply the model to an external test set to assess its predictive power Build a more refined model with more compounds Find descriptors other than the ones generated by Maestro and use them in the model building

Acknowledgments Dr. Ping Du Dr. Chungping Yu Pushpa Chandrasekar Noeris Salem Allergan SoCalBSI