Statistical Analysis of IC50s Nick Andrews, Statistics Unit CFI, HPA.

Outline
– Curve fitting to estimate IC50
– Determining cut-offs for outliers (high IC50)
– Monitoring trends in IC50

Curve fitting
– The data in each run consist of the RFU of a sample measured against the antiviral at different concentrations (ten 4-fold dilutions), as well as with no antiviral (virus control).
– Estimate the IC50: the concentration at which the RFU is 50% of that from the virus control.
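As a rough illustration of the IC50 definition above, the sketch below interpolates between the two dilutions that bracket 50% of the virus-control RFU. The function name, the choice to interpolate linearly in log10(concentration), and the example numbers are assumptions made for illustration; they are not taken from the Excel or GraphPad methods used in the slides.

```python
import math


def ic50_point_to_point(concs, rfus, control_rfu):
    """Point-to-point estimate of the IC50.

    concs: antiviral concentrations in increasing order (all > 0).
    rfus: RFU measured at each concentration (expected to fall as conc rises).
    control_rfu: RFU of the no-antiviral virus control.
    Returns None when the readings never cross 50% of the control.
    """
    target = 0.5 * control_rfu
    for i in range(len(concs) - 1):
        y_lo, y_hi = rfus[i], rfus[i + 1]
        if y_lo >= target >= y_hi and y_lo != y_hi:
            # Assumption: interpolate linearly in log10(concentration)
            # between the two bracketing dilutions.
            frac = (y_lo - target) / (y_lo - y_hi)
            log_lo, log_hi = math.log10(concs[i]), math.log10(concs[i + 1])
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None


# Hypothetical run: three 4-fold dilutions and a virus control of 100 RFU.
example_ic50 = ic50_point_to_point([1, 4, 16], [90, 60, 20], control_rfu=100)
print(example_ic50)
```

Returning None rather than extrapolating keeps runs whose curve never reaches 50% visible as problems instead of silently producing a value.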

Sources of variation in IC50
– Curve fitting
– Within a run (2 replicates)
– Between runs (reference samples)

Point to Point (Excel)

Smooth curves (S-shaped, GraphPad)

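Comparison: curve fitting
[Table. Columns: Sample, P-P (point to point), S-shape, Ratio (High/Low). Rows as transcribed: 292R K> V, A/Lisbon22/, A/Latvia/685/, A/Denmark/1/, A/Denmark/2/, A/Denmark/3/. Numeric values not preserved in the transcript.]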

Comments (curve fitting)
– For most samples the choice of curve-fitting method makes no more than about 5% difference to the results.
– It must be clear exactly how the curve fitting and the calculation of the IC50 work, e.g. is the IC50 based on 50% of the OD of the virus control or on 50% of the fitted upper asymptote?
– Problems in curve fitting need to be flagged.

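Between-replicate variation (point to point)
[Table. Columns: Sample, Rep1, Rep2, Ratio (High/Low). Rows as transcribed: 292R K3769> V, A/Lisbon22/, A/Latvia/685/, A/Denmark/1/, A/Denmark/2/, A/Denmark/3/. Numeric values not preserved in the transcript.]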

Comments on replicates
– Variation between replicates within a run is greater than that due to curve fitting, with most results within about 15%-20% of each other.
– Using replicates and taking the average reduces this effect.
– Large differences between replicates (e.g. >30%) should be flagged.
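The replicate check can be sketched as follows. Reading ">30%" as a High/Low ratio above 1.3, and averaging with the geometric mean, are both assumptions made here (the slides say only "taking the average"); the function name and example values are hypothetical.

```python
import math


def summarise_replicates(rep1, rep2, max_ratio=1.3):
    """Combine two replicate IC50s and flag large disagreement.

    max_ratio=1.3 encodes the ">30%" rule of thumb as a High/Low ratio
    (an assumed interpretation).
    """
    ratio = max(rep1, rep2) / min(rep1, rep2)
    return {
        "ratio": ratio,
        # Assumption: geometric mean, since IC50s are treated on a log scale.
        "average": math.sqrt(rep1 * rep2),
        "flag": ratio > max_ratio,
    }


# Hypothetical replicate pairs.
print(summarise_replicates(0.45, 0.52))
print(summarise_replicates(0.45, 0.90))
```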

Between run variation

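Comparison of runs (reference virus)
[Table. Columns: Sample, Run1, Run2, Ratio (High/Low). Rows as transcribed: 274H Osel, H Zan, Y Osel, Y Zan, V Osel, V Zan, DB Osel, DB Zan. Numeric values not preserved in the transcript.]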

Comments
– Between-run variation is higher still, at about 40% or more.
– Hence retest samples with high values, and use controls in each run to identify problems with a run.

Determining cut-offs
– The aim is to pick up outliers with high IC50 results, as they may indicate resistant strains.
– A method is needed to determine the ‘normal range’ in the absence of outliers.
– Various statistical methods may be used; the important thing is to ensure the outliers do not unduly affect the cut-off.

Methods used
– Sufficient data points are needed for an initial estimate (about 30-50); the cut-offs can be refined later with more data.
– It is advisable to transform the data to approximate normality (log-transform).
– Obtain a robust estimate of the standard deviation (1.48 × median absolute deviation, or 0.75 × inter-quartile range).
– Set the cut-offs at median + k × robust SD on the log scale, with, say, k = 1.65 and k = 3.
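The steps listed above can be sketched in a few lines. This is a minimal illustration, not the exact procedure behind the slides: the base-10 logarithm, the function name and the sample values are assumptions, and the robust SD uses the 1.48 × MAD option.

```python
import math
import statistics


def robust_cutoffs(ic50s, multipliers=(1.65, 3.0)):
    """Return (median, robust SD on the log10 scale, cut-offs on the original scale)."""
    logs = [math.log10(x) for x in ic50s]  # assumption: log base 10
    med = statistics.median(logs)
    # Median absolute deviation from the median, scaled to estimate the SD.
    mad = statistics.median([abs(v - med) for v in logs])
    robust_sd = 1.48 * mad
    # Cut-offs are set on the log scale, then back-transformed.
    cutoffs = [10 ** (med + k * robust_sd) for k in multipliers]
    return 10 ** med, robust_sd, cutoffs


# Hypothetical IC50s with one high outlier at the end.
values = [0.31, 0.38, 0.42, 0.45, 0.47, 0.50, 0.55, 0.61, 0.70, 6.5]
median_ic50, sd_log, (cut1, cut2) = robust_cutoffs(values)
print(median_ic50, sd_log, cut1, cut2)
```

Because both the median and the MAD ignore the size of extreme values, the single high result in the example has little influence on either cut-off, which is the property the slide asks for.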

Example: initial 30 results
Median = 0.47; robust SD (log scale) = 0.17; Cut1 = 1.07; Cut2 = 1.81

Example: final results 2006/07
Median = 0.48; robust SD (log scale) = 0.12; Cut1 = 0.93; Cut2 = 1.35

Do we need to log-transform?
– Most results from dilution assays are ‘geometric’, so a log-transform is likely to be sensible.
– Sometimes it does not seem to matter, but sometimes the data are skewed (e.g. the lower quartile is much closer to the median than the upper quartile).
– In that case it is important to transform, as the robust methods assume the data are normal once outliers are removed.
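One quick way to apply the quartile comparison mentioned above is to compute how far each quartile lies from the median, before and after taking logs. The helper below is a hypothetical sketch; the data are invented and chosen to be exactly geometric so that the effect is easy to see.

```python
import math
import statistics


def quartile_asymmetry(values):
    """(upper quartile - median) / (median - lower quartile); about 1 if symmetric."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    if q2 == q1:
        return math.inf  # lower quartile coincides with the median
    return (q3 - q2) / (q2 - q1)


# Hypothetical two-fold dilution-style results.
raw = [1, 2, 4, 8, 16, 32, 64]
logged = [math.log10(v) for v in raw]
print(quartile_asymmetry(raw), quartile_asymmetry(logged))
```

A ratio well above 1 on the raw scale that moves close to 1 after logging suggests the log-transform is doing its job.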

Non-log scale [figure; value label 1.6]

Log scale (y-axis) [figure; value label 2.7]

Alternative presentation: box and whisker plot
– Uses median * IQR (+2SD)
– Could be adjusted to match the scatter plots
[figure; value label 3.0]

Monitoring IC50 (outliers) over time
First look at the data (scatter plots, box and whisker plots).
Calculate statistics for different time periods:
– Median
– % outliers
– Geometric means with 95% CI (outliers removed)
– Kruskal-Wallis test for medians
– t-test/ANOVA for geometric means (outliers removed)
– Regression for time trend (outliers removed)

Monitoring IC50 (outliers) over time
Perform statistical tests:
– Kruskal-Wallis test for medians
– Chi-squared or Fisher's exact test for % outliers
– ANOVA or t-tests for comparing means (outliers removed)
Alternatively, look for non-overlapping 95% CIs, which is a conservative method (approx. p<0.007).
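Two of the points above lend themselves to a short sketch using only the Python standard library. The first function is a plain two-sided Fisher's exact test for comparing the proportion of outliers in two periods; the second shows why non-overlapping 95% CIs are conservative, under the assumption of two normally distributed estimates with equal standard errors. Function names and the example table are hypothetical; in practice a statistics package would normally be used.

```python
from math import comb
from statistics import NormalDist


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are time periods; columns are outlier / not outlier.
    Sums the probabilities of all tables no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x outliers in the first period.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def p_value_when_cis_just_touch(level=0.95):
    """Two-sided p-value for a difference whose two CIs only just fail to overlap.

    Assumes equal standard errors, so SE(difference) = sqrt(2) * SE.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    z_diff = 2 * z / 2 ** 0.5
    return 2 * (1 - NormalDist().cdf(z_diff))


# Hypothetical counts: 3 outliers of 4 in one period, 1 of 4 in another.
print(fisher_exact_two_sided(3, 1, 1, 3))
print(p_value_when_cis_just_touch())
```

Under these assumptions the second function returns a value of roughly 0.006, the same order as the approximate figure quoted on the slide, which is why requiring non-overlap is stricter than testing at the 5% level.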

Graphical Presentation – Box and Whisker

Scatter Plot

Scatter plot

Statistics: H3N2, Zanamivir, NIMR

Year      Samples   >3SD (%)    Median   Geomean (95% CI)
2005/06   88        1 (1.1%)             ( )
2006/               (2.3%)               ( )

– P-value comparing % >3SD = 0.67
– P-value comparing medians: significant, although the actual difference is small
– 95% CIs for the geometric means do not overlap

Note of caution: changes may occur within a year, so this comparison may be too simplistic. For example, there appears to be some evidence of a change within 2006/07.

Summary
– Good use of statistical methods can help interpret the IC50 results and ensure assay results are reliable.
– Many appropriate methods are already in place, but more could be incorporated.