Parallelism in practice USP Bioassay Workshop August 2010


1 Parallelism in practice USP Bioassay Workshop August 2010
Ann Yellowlees and Kelly Fleetwood, Quantics Consulting Limited

2 Contents: What is parallelism? Approaches to assessing parallelism (significance, equivalence). Experience. Discussion.

3 Setting the scene: Relative Potency
RP is the ratio of the concentrations of reference and sample materials required to achieve the same effect: RP = C_ref / C_samp

4 Parallelism One curve is a horizontal shift of the other (on the log-concentration scale)
These are 'parallel' or 'similar' curves. Finney: a prerequisite of all dilution assays.
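A minimal sketch (with a hypothetical logistic reference curve, not the workshop's data) of why a horizontal shift on the log-concentration scale corresponds to a constant relative potency:

```python
import math

def ref_response(log_c):
    # hypothetical reference dose-response curve on the log-concentration scale
    return 1.0 / (1.0 + math.exp(-(log_c - 2.0)))

def sample_response(log_c, log_rp):
    # a parallel sample curve is the reference shifted horizontally by log(RP)
    return ref_response(log_c + log_rp)

rp = 0.5                  # assume the sample is half as potent as the reference
c_ref = math.exp(2.0)     # reference concentration giving a 50% response
c_samp = c_ref / rp       # the sample needs twice the concentration...
# ...to achieve exactly the same effect, so RP = c_ref / c_samp
same_effect = sample_response(math.log(c_samp), math.log(rp))
```

With RP = 0.5 the sample needs double the concentration of the reference for the same response, and `c_ref / c_samp` recovers 0.5.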

5 Real data: continuous response

6 Linear model (4 concentrations)
Y = α + β log(C). Parallel when the slopes β are equal. NB the range: which concentrations? Do we care about the asymptotes?
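The restricted (common-slope, i.e. parallel) and unrestricted fits for the linear model can be sketched with ordinary least squares; this is an illustrative implementation, not the workshop's code:

```python
def fit_lines(groups, common_slope):
    """Fit Y = a_i + b_i * x to each group by least squares.

    groups: list of (xs, ys) pairs, one per preparation.
    common_slope=True restricts all groups to one slope (parallel lines).
    Returns (slopes per group, residual sum of squares)."""
    def mean(v):
        return sum(v) / len(v)

    if common_slope:
        # pooled within-group slope: sum of cross-products / sum of squares
        num = den = 0.0
        for xs, ys in groups:
            xb, yb = mean(xs), mean(ys)
            num += sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
            den += sum((x - xb) ** 2 for x in xs)
        slopes = [num / den] * len(groups)
    else:
        slopes = []
        for xs, ys in groups:
            xb, yb = mean(xs), mean(ys)
            slopes.append(sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
                          / sum((x - xb) ** 2 for x in xs))

    rss = 0.0
    for (xs, ys), b in zip(groups, slopes):
        a = mean(ys) - b * mean(xs)   # intercept given the slope
        rss += sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return slopes, rss
```

The loss of fit when the slopes are forced equal (the increase in RSS) is the quantity the significance tests below examine.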

7 Four parameter logistic model
4PL: Y = γ + (δ - γ) / [1 + exp(β log(C) - α)]. Parallel when the asymptotes γ, δ and the slope β are equal. Note symmetry: curve A looks the same; curve B does not.

8 Five parameter logistic model
5PL: Y = γ + (δ - γ) / [1 + exp(β log(C) - α)]^φ. Parallel when the asymptotes γ, δ, the slope β and the asymmetry φ are equal. A: same; B: not the same (the slope differs).
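Both curves above can be written as short functions (a sketch using the slides' parameterisation; with φ = 1 the 5PL reduces to the 4PL):

```python
import math

def four_pl(c, alpha, beta, gamma, delta):
    # Y = gamma + (delta - gamma) / (1 + exp(beta*log(c) - alpha))
    # for beta > 0: Y -> delta as c -> 0, Y -> gamma as c -> infinity
    return gamma + (delta - gamma) / (1.0 + math.exp(beta * math.log(c) - alpha))

def five_pl(c, alpha, beta, gamma, delta, phi):
    # the 5PL adds the asymmetry exponent phi on the denominator
    return gamma + (delta - gamma) / (1.0 + math.exp(beta * math.log(c) - alpha)) ** phi
```

At log(C) = α/β the 4PL passes through the midpoint (γ + δ)/2; the 5PL midpoint moves with φ, which is what lets it describe asymmetric curves.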

9 Tests for parallelism: Approach 1
Is there evidence that the reference and test curves ARE NOT parallel? Compare unrestricted vs restricted models: test the loss of fit when the model is restricted to parallel curves. 'p value' approaches: the traditional F test, as preferred by the European Pharmacopoeia, and the chi-squared test, as recommended by Gottschalk & Dunn (2005).
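The loss-of-fit F statistic in Approach 1 compares the increase in residual sum of squares against the residual variance of the unrestricted fit. A minimal sketch (the degrees of freedom are supplied by the caller):

```python
def f_statistic(rss_parallel, rss_nonparallel, df_extra, df_resid):
    """Loss-of-fit F statistic when the model is restricted to parallel curves.

    df_extra: number of parameters constrained equal (3 for a pair of 4PL
              curves: both asymptotes and the slope).
    df_resid: residual degrees of freedom of the unrestricted fit."""
    return ((rss_parallel - rss_nonparallel) / df_extra) / (rss_nonparallel / df_resid)
```

The statistic is then referred to an F(df_extra, df_resid) distribution for a p-value (e.g. via `scipy.stats.f.sf` in practice).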

10 Approach 2
Is there evidence that the reference and test curves ARE parallel? Equivalence test approach, as recommended in the draft USP guidance (Hauck et al. 2005): fit a model allowing non-parallel curves, then put confidence intervals on the differences between parameters. Pharmacopoeial disharmony exists!! (existed?)

11 In practice... Four example data sets
Data set 1: 60 RP assays (96-well plates, OD: continuous). Data set 2: 15 RP assays (96-well plates, OD: continuous). Data set 3: 12 RP assays (96-well plates, OD: continuous). Data set 4: 60 RP assays (in vivo, survival at day x: binary*). * Treated as such for this purpose; wasteful of data.

12 In practice... We have applied the proposed methods in the context of individual assay 'pass/fail' (suitability). Data set 1: compare the 2 'significance' approaches, and compare 'equivalence' with 'significance'. Data sets 2, 3. Data set 4: compare the 'F test' (EP) with 'equivalence' (USP).

13 Data set 1 60 RP assays, 8 dilutions, 2 independent wells per dilution
A 4PL is a good fit (vs 5PL). NB precision. Model: log_e OD vs log_e concentration. Average slope = 1/0.384 = 2.6.

14 Data set 1: F test and chi-squared test
F test: straightforward. Chi-squared test: need to establish the mean-variance relationship. This is a data-driven method, and very arbitrary. Establishing equivalence limits: the Hauck paper suggests provisional capability-based limits can be set using reference-vs-reference assays; these were not available in our dataset...
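One common (and, as the slide says, arbitrary) data-driven choice for the mean-variance relationship is a power law fitted on the log-log scale, with weights taken as the reciprocal of the fitted variance. A hypothetical sketch:

```python
import math

def power_law_weights(means, variances):
    """Fit var = a * mean**b by least squares on the log-log scale and
    return 1/fitted-variance weights.

    This power-law form is one illustrative choice of mean-variance model,
    not the only one (the talk also mentions constant and linear fits)."""
    lx = [math.log(m) for m in means]
    ly = [math.log(v) for v in variances]
    n = len(lx)
    xb = sum(lx) / n
    yb = sum(ly) / n
    b = (sum((x - xb) * (y - yb) for x, y in zip(lx, ly))
         / sum((x - xb) ** 2 for x in lx))
    log_a = yb - b * xb
    # weight each dilution by the reciprocal of its fitted variance
    return [1.0 / math.exp(log_a + b * x) for x in lx]
```

The chi-squared statistic is then the weighted RSS difference, which is why the whole procedure is so sensitive to this modelling choice.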

15 Data set 1: F test and chi-squared test
F test: 12/60 = 20% of assays have p < 0.05. Evidence of dissimilarity? Or a precise assay? Chi-squared test: 58/60 = 97% of assays have p < 0.05! Intra-assay variability is low, so differences between the parallel and non-parallel models are exaggerated. Histograms of F-test p-values and G&D p-values, followed by an example graph to illustrate why G&D behaves so poorly: intra-assay variability is low compared to the quality of the fit, differences between curves are exaggerated; a poor choice of statistic.

16 Data set 1: Comparison of approaches to parallelism

17 Data set 1: Comparison of approaches to parallelism
Some evidence of a 'hook' in the model; the residual SS is inflated.

18 Data set 1: Comparison of approaches to parallelism
Excluding the top 2 points because of the hook: approx. 20/60 pass. Remodelled: the quadratic relationship was re-fitted.

19 Data set 1: F test and chi-squared test
RSS_parallel = 159, RSS_non-parallel = 112, so RSS_p - RSS_np = 47. Pr(χ²_3 > 47) < 0.01. F test: p = 0.03. An example where both tests fail.

20 Data set 1: F test and chi-squared test
RSS_parallel = 100.2, RSS_non-parallel = 99.0, so RSS_p - RSS_np = 1.2. Pr(χ²_3 > 1.2) = 0.75. An example where both tests PASS.
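The two worked examples can be checked numerically. For a pair of 4PL curves the parallel restriction removes 3 parameters, and the chi-squared distribution with 3 degrees of freedom has a closed-form tail probability:

```python
import math

def chi2_sf_3df(x):
    """P(X > x) for a chi-squared variable with 3 degrees of freedom.

    Closed form: 2*(1 - Phi(sqrt(x))) + sqrt(2x/pi) * exp(-x/2),
    where Phi is the standard normal CDF; avoids needing scipy."""
    phi = 0.5 * (1.0 + math.erf(math.sqrt(x / 2.0)))   # Phi(sqrt(x))
    return 2.0 * (1.0 - phi) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
```

`chi2_sf_3df(47)` is far below 0.01, while `chi2_sf_3df(1.2)` is about 0.75, matching the two slides.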

21 Data set 1: USP methodology Prove parallel
Lower asymptote:

22 Data set 1: USP methodology
Upper asymptote: this is interesting: it demonstrates that it's not enough just to order the data and take the 2nd value from the end as your limit. You need to examine it; check for bias!

23 Data set 1: USP methodology
Scale: scale for the reference (range to 0.416). NB scale = 1/slope.

24 Data set 1: USP methodology
Criteria for the 90% CI on the difference between parameter values: lower asymptotes (-0.235, 0.235); upper asymptotes (-0.213, 0.213); scales (-0.187, 0.187). Applying the criteria: 3/60 = 5% of assays fail the parallelism criteria; no assay fails more than one criterion. The 'scale' parameter from the R parameterisation allows log RP to be estimated as a1 - a2 (easy variance).
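Applying the interval criteria is mechanical: an assay passes when each 90% CI lies strictly inside its equivalence limits. A sketch using the limits quoted above (the parameter names are illustrative):

```python
def passes_equivalence(ci, limits):
    """True when a 90% CI on a parameter difference lies inside the limits."""
    lo, hi = ci
    lim_lo, lim_hi = limits
    return lim_lo < lo and hi < lim_hi

# equivalence limits from the slide, keyed by hypothetical parameter names
CRITERIA = {
    "lower_asymptote": (-0.235, 0.235),
    "upper_asymptote": (-0.213, 0.213),
    "scale": (-0.187, 0.187),
}

def assay_passes(cis):
    # cis: dict mapping parameter name -> 90% CI on the difference
    return all(passes_equivalence(cis[k], CRITERIA[k]) for k in CRITERIA)
```

Unlike a p-value, this criterion makes the acceptable difference explicit, which is the point of the equivalence approach.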

25 Data set 1: Comparison of approaches to parallelism

26 Data set 1: Comparison of approaches to parallelism
This plate 'fails' all 3 tests. USP: lower asymptote. It fails all tests whether or not the hook is included.

27 Data set 1: Comparison of approaches to parallelism
Equivalence test: scales not equivalent. F test p-value = 0.60; chi-squared test p-value < 0.001. The F test passes: high variability.

28 Data set 2: Comparison of approaches to parallelism
Constant variance.

29 Data set 3: Comparison of approaches to parallelism
Linear fit for the mean-variance relationship. Again the G&D test suggests more assays 'fail'.

30 In practice... Data set 4: Compare the 'F test' with 'equivalence'
The chi-squared test methodology has not been developed for binary data.

31 Data set 4 60 RP assays, 4 dilutions, 15 animals per dilution
The actual model is a GLM (i.e. a 0/1 response dependent on survival); % survival is shown for illustrative purposes only. Slopes: average = …; range (-14.71, -1.03).
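The GLM the slide refers to can be sketched as a logistic regression of the 0/1 survival outcome on log dose, fitted by Newton-Raphson. This is a single-curve illustration only, without the parallelism constraint and on made-up data, not the workshop's:

```python
import math

def fit_logistic(xs, ys, iters=25):
    """Newton-Raphson fit of P(survive) = 1 / (1 + exp(-(a + b*x))).

    xs: log doses; ys: 0/1 survival indicators. Returns (a, b)."""
    a = b = 0.0
    for _ in range(iters):
        # gradient and Hessian of the Bernoulli log-likelihood
        ga = gb = haa = hab = hbb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            ga += y - p
            gb += (y - p) * x
            haa += w
            hab += w * x
            hbb += w * x * x
        det = haa * hbb - hab * hab
        a += (hbb * ga - hab * gb) / det
        b += (haa * gb - hab * ga) / det
    return a, b
```

For a parallel-line assay, the same likelihood machinery is extended to two preparations with a common slope, and the F-test comparison is replaced by a deviance comparison.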

32 Data set 4: Comparison of approaches to parallelism
F test: 5/60 = 8% fail. Equivalence: 3/60 = 5% fail. Equivalence: the limits could be chosen to match.

33 Data set 4: Comparison of approaches to parallelism
The F-test approach and the equivalence approach could be brought into agreement, depending on how the limits are set.

34 Broadly... F test, chi-squared, USP
F test: fails (wrongly?) when the assay is very precise; passes (wrongly?) when noisy; in the linear case the p-value can be adjusted to match equivalence. Chi-squared: fails when the assay is very precise (even if the difference is small); if the model fits badly, the weighting inflates the RSS (e.g. the hook); 2 further data sets supported this. USP: limits are set such that the extreme 5% will fail, and they do, regardless of precision, model fit etc.

35 Stepping back… What are we trying to do?
Produce a biologic to a controlled standard that can be used in clinical practice. For a batch we need to know its potency, with appropriate precision, in order to calculate the clinical dose. Perhaps add more information about precision to this.

36 Some thoughts 1. Establish a valid assay
Use all development assay results unless a physical reason exists to exclude them; statistical methodology can be used to flag possible outliers for investigation (USP <111> applies this to individual data points). Parallelism/similarity: are the parameter differences fundamentally zero, or is there a consistent slope difference (e.g.)? Equivalence approach plus judgment for an acceptable margin.

37 Some thoughts 2. Set the number of replicates to provide the required precision. 3. Combine RP values plus confidence intervals for the reportable value: per assay, use all results unless there is a physical reason not to (they are part of the continuum of assays); flag for investigation using statistical techniques (reference behaviour, parallelism). 4. Monitor performance over time (SPC): reference stability.

38 Which parallelism test?
Our view: the chi-squared test requires too many complex decisions and is very sensitive to the model. The F test is not generally applicable at the assay validation stage: it does not allow examination of the individual parameters and does not lend itself to judgment about 'how parallel is parallel?'. The equivalence test approach fits in all three contexts, with adjustment of the tolerance limits as appropriate.

39 Thank you USP for the invitation; clients for the use of data:
BioOutsource, and other clients who prefer to remain anonymous. Quantics staff for analysis and graphics: Kelly Fleetwood (R), Catriona Keerie (SAS).


