Parallelism in practice USP Bioassay Workshop August 2010

Presentation transcript:

Parallelism in practice. USP Bioassay Workshop, August 2010. Ann Yellowlees, Kelly Fleetwood, Quantics Consulting Limited.

Contents: What is parallelism? Approaches to assessing parallelism (significance, equivalence). Experience. Discussion.

Setting the scene: Relative Potency. RP: ratio of the concentrations of reference and sample materials required to achieve the same effect. RP = Cref / Csamp.

Parallelism: one curve is a horizontal shift of the other. These are ‘parallel’ or ‘similar’ curves. Finney: a prerequisite of all dilution assays.

Real data: continuous response

Linear model (4 concentrations): Y = a + β log(C). Parallel when the slopes β are equal. NB the range: which concentrations? Do we care about the asymptotes?

Four parameter logistic model (4PL): Y = γ + (δ - γ) / [1 + exp(β log(C) - α)]. Parallel when the asymptotes γ, δ and the slope β are equal. Mention symmetry. A looks the same; B does not look the same.
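A minimal sketch of what ‘parallel’ means for the 4PL above, assuming the slide's parameterisation and natural logs. All parameter values are hypothetical and the helper names `four_pl` and `relative_potency` are illustrative only. If the sample curve shares γ, δ and β with the reference, it is the reference curve shifted horizontally on the log-concentration scale, and RP = Cref / Csamp can be read off the difference in α.

```python
import math

def four_pl(conc, alpha, beta, gamma, delta):
    """4PL response: Y = gamma + (delta - gamma) / (1 + exp(beta*log(C) - alpha))."""
    return gamma + (delta - gamma) / (1.0 + math.exp(beta * math.log(conc) - alpha))

def relative_potency(alpha_ref, alpha_samp, beta):
    """RP = Cref / Csamp for two parallel 4PL curves (shared beta, gamma, delta)."""
    return math.exp((alpha_ref - alpha_samp) / beta)

# Hypothetical reference curve parameters (not taken from the slides).
ref = dict(alpha=1.0, beta=2.6, gamma=0.1, delta=2.0)
rp = 0.5  # hypothetical relative potency

# A parallel sample curve: same asymptotes and slope, only alpha differs.
samp = dict(ref, alpha=ref["alpha"] - ref["beta"] * math.log(rp))

# The sample at concentration C gives the same response as the reference at RP * C.
for c in (0.1, 1.0, 10.0):
    assert abs(four_pl(c, **samp) - four_pl(rp * c, **ref)) < 1e-12

print(relative_potency(ref["alpha"], samp["alpha"], ref["beta"]))
```

If the asymptotes or slope differed between the two curves, no single horizontal shift would map one onto the other, and RP would depend on the response level chosen.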

Five parameter logistic model (5PL): Y = γ + (δ - γ) / [1 + exp(β log(C) - α)]^φ. Parallel when the asymptotes γ, δ, the slope β and the asymmetry φ are equal. A: same. B: not the same; slope not the same.

Tests for parallelism. Approach 1: Is there evidence that the reference and test curves ARE NOT parallel? Compare unrestricted vs restricted models: test the loss of fit when the model is restricted to parallel. ‘p value’ approaches: the traditional F test approach, as preferred by the European Pharmacopoeia; the chi-squared test approach, as recommended by Gottschalk & Dunn (2005).
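The ‘loss of fit’ comparison can be sketched as a generic extra-sum-of-squares F statistic. This is an illustration under stated assumptions, not necessarily the exact European Pharmacopoeia calculation: the function name `extra_ss_f`, the residual sums of squares, and the parameter and observation counts below are all hypothetical.

```python
def extra_ss_f(rss_parallel, rss_nonparallel, n_params_parallel,
               n_params_nonparallel, n_obs):
    """Extra-sum-of-squares F statistic for restricted (parallel) vs
    unrestricted (non-parallel) fits. Returns (F, df_num, df_den)."""
    df_num = n_params_nonparallel - n_params_parallel  # parameters constrained equal
    df_den = n_obs - n_params_nonparallel              # residual df, unrestricted fit
    f_stat = ((rss_parallel - rss_nonparallel) / df_num) / (rss_nonparallel / df_den)
    return f_stat, df_num, df_den

# Hypothetical 4PL example: two curves, 8 parameters unrestricted,
# 5 parameters when both asymptotes and the slope are shared, 32 wells.
f_stat, df_num, df_den = extra_ss_f(120.0, 100.0, 5, 8, 32)
print(f_stat, df_num, df_den)
```

The p value is the upper tail of the F distribution with (df_num, df_den) degrees of freedom at f_stat; a statistics library would normally supply that, for example `scipy.stats.f.sf` if SciPy is available.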

Approach 2: Is there evidence that the reference and test curves ARE parallel? Equivalence test approach, as recommended in the draft USP guidance (Hauck et al 2005): fit a model allowing non-parallel curves; form confidence intervals on the differences between parameters. Pharmacopoeial disharmony exists!! (existed?)

In practice... Four example data sets. Data set 1: 60 RP assays (96 well plates, OD: continuous). Data set 2: 15 RP assays (96 well plates, OD: continuous). Data set 3: 12 RP assays (96 well plates, OD: continuous). Data set 4: 60 RP assays (in vivo, survival at day x: binary*). * Treated as such for this purpose; wasteful of data.

In practice... We have applied the proposed methods in the context of individual assay ‘pass/fail’ (suitability). Data set 1: compare 2 ‘significance’ approaches; compare ‘equivalence’ with ‘significance’. Data sets 2, 3. Data set 4: compare ‘F test’ (EP) with ‘equivalence’ (USP).

Data set 1: 60 RP assays; 8 dilutions; 2 independent wells per dilution. 4PL a good fit (vs 5PL). NB precision. Model: log_e OD vs log_e conc. Average slope = 1/0.384 = 2.6.

Data set 1: F test and chi-squared test. F test: straightforward. Chi-squared test: need to establish the mean-variance relationship. This is a data-driven method!!! Very arbitrary. Establishing equivalence limits: Hauck paper: provisional capability-based limits can be set using reference vs reference assays. Not available in our dataset ...

Data set 1: F test and chi-squared test. F test: 12/60 = 20% of assays have p < 0.05. Evidence of dissimilarity? Or a precise assay? Chi-squared test: 58/60 = 97% of assays have p < 0.05! Intra-assay variability is low, so differences between the parallel and non-parallel models are exaggerated. Histograms of F-test p-values and G&D p-values, followed by an example graph to illustrate why G&D behaves so poorly: intra-assay variability is low compared to the quality of the fit, so differences between curves are exaggerated. Poor choice of statistic.

Data set 1: Comparison of approaches to parallelism

Data set 1: Comparison of approaches to parallelism. Some evidence of a ‘hook’ in the model; residual SS inflated. Note the hook.

Data set 1: Comparison of approaches to parallelism. Excluding the top 2 points because of the hook: approx. 20/60 pass. Remodelled: quadratic relationship refitted.

Data set 1: F test and chi-squared test. RSSparallel = 159; RSSnon-parallel = 112; RSSp - RSSnp = 47; Pr(χ²₃ > 47) < 0.01. F test: P = 0.03. Example where both fail.

Data set 1: F test and chi-squared test. RSSparallel = 100.2; RSSnon-parallel = 99.0; RSSp - RSSnp = 1.2; Pr(χ²₃ > 1.2) = 0.75. Example where both pass.
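The tail probabilities quoted on these two chi-squared slides can be checked with a few lines of code, assuming the reference distribution is chi-squared with 3 degrees of freedom (three 4PL parameters constrained to be equal: two asymptotes and the slope). That degrees-of-freedom value is an assumption read from the slide text, although it does reproduce the quoted 0.75. The helper name `chi2_sf_3df` is illustrative, and it uses the closed form that exists for 3 degrees of freedom so that no statistics library is needed.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability Pr(X > x) for X ~ chi-squared with 3 df.
    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

# Differences in (weighted) residual sums of squares taken from the slides.
p_pass = chi2_sf_3df(100.2 - 99.0)   # 'both pass' example, difference 1.2
p_fail = chi2_sf_3df(159.0 - 112.0)  # 'both fail' example, difference 47

print(round(p_pass, 2))
print(p_fail < 0.01)
```

Under this assumption the difference of 1.2 gives a probability of about 0.75 and the difference of 47 gives a probability far below 0.01, in line with the slides.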

Data set 1: USP methodology Prove parallel Lower asymptote:

Data set 1: USP methodology. Upper asymptote: This is interesting: it demonstrates that it's not enough just to order the data and take the 2nd from the end as your limit. Need to examine it. Check for bias!

Data set 1: USP methodology. Scale: scale for reference: 0.384 (range 0.344 to 0.416). NB scale = 1/slope.

Data set 1: USP methodology. Criteria for the 90% CI on the difference between parameter values: lower asymptotes (-0.235, 0.235); upper asymptotes (-0.213, 0.213); scales (-0.187, 0.187). Applying the criteria: 3/60 = 5% of assays fail the parallelism criteria. No assay fails more than one criterion. The ‘scale’ parameter comes from the R parameterisation, which allows log RP to be estimated as a1 - a2 (easy variance).
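Applying the criteria can be sketched as a simple interval check: an assay passes a criterion only if the whole 90% CI on the parameter difference lies inside the equivalence limits. The limits below are the ones quoted on this slide; the confidence intervals for the example assay, and the names `LIMITS` and `failed_criteria`, are hypothetical.

```python
# Equivalence limits (symmetric about zero) quoted for data set 1.
LIMITS = {"lower_asymptote": 0.235, "upper_asymptote": 0.213, "scale": 0.187}

def failed_criteria(ci_by_param, limits=LIMITS):
    """Return the parameters whose 90% CI on the reference-vs-sample
    difference is not wholly inside (-limit, +limit)."""
    return [name for name, (lo, hi) in ci_by_param.items()
            if not (-limits[name] < lo and hi < limits[name])]

# Hypothetical 90% CIs for one assay (not taken from the data set).
assay = {
    "lower_asymptote": (-0.10, 0.12),
    "upper_asymptote": (-0.05, 0.20),
    "scale": (-0.02, 0.21),
}

print(failed_criteria(assay))  # ['scale']
```

Checking each parameter separately is what lets the equivalence approach report which aspect of the curve is dissimilar, something a single F or chi-squared p value does not show.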

Data set 1: Comparison of approaches to parallelism

Data set 1: Comparison of approaches to parallelism. This plate ‘fails’ all 3 tests. USP: lower asymptote. Fails all whether or not the hook is included.

Data set 1: Comparison of approaches to parallelism. Equivalence test: scales not equivalent. F test p-value = 0.60. Chi-squared test p-value < 0.001. F test passes: high variability.

Data set 2: Comparison of approaches to parallelism. Constant variance.

Data set 3: Comparison of approaches to parallelism. Linear fit for mean-variance relationship. Again the G&D test suggests more assays “FAIL”.

In practice... Data set 4: compare ‘F test’ with ‘equivalence’. Methodology for the chi-squared test is not developed for binary data.

Data set 4: 60 RP assays; 4 dilutions; 15 animals per dilution. Actual model is a GLM (i.e. response 0, 1 dependent on survival); % survival shown for illustrative purposes only. Slopes: average = -2.41, range (-14.71, -1.03).

Data set 4: Comparison of approaches to parallelism. F test: 5/60 = 8% fail. Equivalence: fail 5% = 3. Equivalence: could choose limit to match.

Data set 4: Comparison of approaches to parallelism F-test approach and Equivalence approach could be in agreement depending on how limits are set.

Broadly... F test: fails (?wrongly?) when the assay is very precise; passes (?wrongly?) when noisy; in the linear case the p value can be adjusted to match equivalence. Chi-squared: fails when the assay is very precise (even if the difference is small); if the model fits badly, weighting inflates the RSS (e.g. hook); 2 further data sets supported this. USP: limits are set such that the extreme 5% will fail. They do! Regardless of precision, model fit etc.

Stepping back… What are we trying to do? Produce a biologic to a controlled standard that can be used in clinical practice. For a batch we need to know its potency, with appropriate precision, in order to calculate clinical dose. Perhaps add more information about precision to this.

Some thoughts. Establish a valid assay: use all development assay results unless a physical reason exists to exclude them. Statistical methodology can be used to flag possible outliers for investigation; USP <111> applies this to individual data points. Parallelism / similarity: are the parameter differences fundamentally zero? Or is there a consistent slope difference (e.g.)? Equivalence approach + judgment for acceptable margin.

Some thoughts. 2. Set number of replicates to provide required precision: combine RP values plus confidence intervals for reportable value. Per assay, use all results unless there is a physical reason not to (they are part of the continuum of assays). Flag for investigation using statistical techniques: reference behaviour, parallelism. 4. Monitor performance over time (SPC): reference stability.

Which parallelism test? Our view: the chi-squared test requires too many complex decisions and is very sensitive to the model. The F test is not generally applicable to the assay validation stage; it does not allow examination of the individual parameters and does not lend itself to judgment about ‘How parallel is parallel?’ The equivalence test approach fits in all three contexts, with adjustment of the tolerance limits as appropriate.

Thank you. USP: the invitation. Clients: use of data (BioOutsource: www.biooutsource.com; other clients who prefer to remain anonymous). Quantics staff: analysis and graphics (Kelly Fleetwood (R), Catriona Keerie (SAS)).