Thoughts on Simplifying the Estimation of HIV Incidence John Hargrove, Alex Welte, Paul Mostert [and others]

Estimates of incident (new) cases are important in assessing changes in an epidemic, identifying “hot spots” and gauging the effects of interventions. HIV incidence is most accurately estimated via longitudinal studies, but these are lengthy, expensive and logistically challenging. They do provide a “gold standard” against which to judge other estimates of HIV incidence.

An alternative way of estimating incidence, involving none of the disadvantages of a longitudinal study, would be to use a single chemical test that can be used to estimate the proportions of recent vs long-established HIV infections in cross-sectional surveys.

Idea: identify an HIV test whose measured outcome is not simply +/- but a graded response that increases steadily over a long period.

One such assay is the BED-CEIA, developed by the CDC. The graph shows the result for a seroconverting client from the ZVITAMBO study carried out in Zimbabwe [14,110 post-partum women followed up at 6 weeks, 3 months, then every 3 months to two years].

The idea is to calibrate the BED assay to estimate the “average” time [or “window”] taken for a person’s BED optical density [OD] to increase to a given OD cut-off. In cross-sectional surveys, the proportion of HIV-positive people with BED OD below the cut-off allows us to calculate the proportion of new infections, and thus the incidence. Estimation of the window period is thus central to the successful application of the BED.
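As a rough illustration of how a window period turns a cross-sectional count into an incidence, the sketch below uses the simplest estimator: annual incidence ≈ (number of recent infections) / (number HIV negative × window in years). The form of the estimator, the function name and all the numbers are assumptions for illustration, not the authors' method or data, and the sketch ignores the corrections that more refined estimators apply.

```python
def annual_incidence(n_recent, n_negative, window_days):
    """Crude annual incidence: recent infections per HIV-negative person-year.

    n_recent    -- HIV-positive people with BED OD below the cut-off
    n_negative  -- HIV-negative people in the same survey
    window_days -- mean time spent below the cut-off (the "window")
    """
    if n_negative <= 0 or window_days <= 0:
        raise ValueError("n_negative and window_days must be positive")
    window_years = window_days / 365.0
    return n_recent / (n_negative * window_years)


# Hypothetical survey: 30 recent infections, 2000 HIV negatives, 183-day window.
rate = annual_incidence(n_recent=30, n_negative=2000, window_days=183)
print(f"Estimated incidence: {100 * rate:.2f} per 100 person-years")
```

Because the window appears in the denominator, any relative error in the window estimate carries straight through to the incidence estimate.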

Problem 1: Data from commercial seroconversion panels with accurately known times of seroconversion indicate a delay (~25 days) between seroconversion and the onset of the increase in BED optical density.

[Figure: timeline showing the date of infection, the date of seroconversion, the extrapolated time when OD = baseline, and the window period, with the seronegative and seropositive phases marked.]

Problem 2: Considerable variability between clients in a real population. There is no prospect of using BED to identify individual recent infections; the idea is only to estimate population incidence.

Problem 3: Follow-up is often limited: of 353 seroconverters in ZVITAMBO, 167 produced only a single HIV-positive sample. [Figure: frequency distribution of samples per client (S).]

Problem 4: The available data for a given client quite often do not span the OD cut-off. The proportion that fail to do so varies with the chosen cut-off. Failure to span increases the uncertainty in estimating the time at which the OD cut-off is crossed.

Problem 5: There is a large variation (27 to 656 days) in the time (t0) elapsing between the last negative and first positive HIV tests. The degree of uncertainty in the timing of seroconversion increases with increasing t0.

We need to consider how variation in samples per client, t0, and failure to span the cut-off affect our estimate of the window period.

How to approach the problem? A scatter-plot of the data? This makes no use of the trend for individual clients and ignores the fact that the sequential points for a given client are not independent.

An alternative which uses the trend in BED OD is suggested by an approximately linear relationship between the square root of OD and the time since the last negative HIV test (t). This allows a regression approach that takes out the variance due to t and to differences between clients.
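The square-root regression idea can be sketched for a single client as follows. This is a minimal illustration rather than the authors' analysis: the data are generated (not from ZVITAMBO), the variable names are hypothetical, a cut-off of OD = 0.8 is assumed, and the pooled model with a between-client term is not reproduced.

```python
import numpy as np

OD_CUTOFF = 0.8  # assumed BED OD cut-off

# Generated data for one client: sqrt(OD) rises linearly with t
# (days since the last negative HIV test).
t = np.array([30.0, 90.0, 150.0, 210.0, 270.0])
od = (0.3 + 0.003 * t) ** 2

# Least-squares line through sqrt(OD) vs t; polyfit returns [slope, intercept].
slope, intercept = np.polyfit(t, np.sqrt(od), 1)

# Time since last negative at which the fitted line reaches the cut-off.
t_cross = (np.sqrt(OD_CUTOFF) - intercept) / slope
print(f"Fitted OD crosses {OD_CUTOFF} at t = {t_cross:.1f} days")
```

Note that `t_cross` is measured from the last negative test, not from seroconversion, so it is only a window estimate once an assumption is made about when seroconversion occurred within the interval between tests.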

Gives consistent results, in that the results are independent of whether we insist on a minimum of 3, 4 or 5 samples per client, and of the value of t0 between 75 and 180 days.

Are we even using the right transformation? And should we be using the time of the last negative HIV test as the origin? Try instead to make a preliminary estimate of the time when OD starts to increase by fitting a quadratic polynomial to the data, then use this estimate as the origin.

Seems to suggest that the true relationship may actually be a power function. What if it really were? What would we see if we plotted OD vs time since the last negative test?

Our problem is that we do not know when seroconversion occurred. We only know the time of the last HIV-negative test, and the greater the delay between the last negative and first positive tests, the greater the uncertainty. [Figure: examples of times when HIV-negative tests might have been taken.]

For zero offset the window is UNDER-estimated; for 100-day offset it is OVER-estimated

This approach to window estimation is clearly not optimal, since the window estimate changes with the timing of the last HIV-negative test. But can we do any better?

If OD increases as a power function, fit: [equation not captured in transcript] or equivalently [equation not captured in transcript], where a and b are constants, t is the time since the last negative test and t0 is the estimated time of seroconversion. We use the data to estimate a, b and t0 by non-linear regression.

For the generated data [without noise] this approach gives the correct window, regardless of the time of the last negative test. But for real data, in 40% of 61 cases the time of seroconversion was estimated to be before the time of the last negative test or after the time of the first positive. [Work in progress]
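The behaviour on noise-free generated data can be illustrated with a small sketch. It assumes the power function has the form OD = a (t − t0)^b, which is a guess consistent with the definitions of a, b, t and t0 but is not confirmed by the slides. Instead of a general non-linear regression routine, it profiles over a grid of candidate t0 values and fits ln(OD) = ln(a) + b ln(t − t0) by linear least squares at each one. All names and numbers are hypothetical.

```python
import numpy as np

def fit_power_curve(t, od, t0_grid):
    """Fit OD = a * (t - t0)**b (assumed form) by profiling over t0.

    For each candidate t0, a and b come from a straight-line fit on the
    log-log scale; the t0 with the smallest residual sum of squares wins.
    """
    t = np.asarray(t, dtype=float)
    log_od = np.log(np.asarray(od, dtype=float))
    best = None
    for t0 in t0_grid:
        if t0 >= t.min():
            continue  # onset must precede every sample with OD > 0
        x = np.log(t - t0)
        b, log_a = np.polyfit(x, log_od, 1)
        sse = float(np.sum((log_od - (log_a + b * x)) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(t0), float(np.exp(log_a)), float(b))
    if best is None:
        raise ValueError("no candidate t0 lies before the first sample")
    return best[1], best[2], best[3]


# Generated, noise-free data; t is days since the last negative test.
true_a, true_b, true_t0 = 0.0003, 1.5, 40.0
t = np.array([60.0, 120.0, 180.0, 240.0, 300.0])
od = true_a * (t - true_t0) ** true_b

t0_hat, a_hat, b_hat = fit_power_curve(t, od, np.arange(0.0, 60.0, 1.0))
window = (0.8 / a_hat) ** (1.0 / b_hat)  # days from t0 to OD = 0.8
print(f"t0 = {t0_hat:.0f} d, a = {a_hat:.2e}, b = {b_hat:.2f}, window = {window:.1f} d")
```

Shifting every value in `t` by a constant (a different last-negative date) shifts the recovered `t0_hat` by the same amount and leaves the window unchanged, which is the property the slide describes. With noisy real data the fitted t0 is not constrained to fall between the last negative and first positive tests.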

Turnbull survival analysis: a different approach suggested by Paul Mostert (Stellenbosch Statistics Department). This is a slightly more sophisticated variant of the Kaplan-Meier survival analysis. It works on the basis that the (unknown) times of i) seroconversion and ii) OD cut-off each lie between two known times. The times of the two events are quantified using interval censoring.

Turnbull window estimates, by run:
All data (red; 183 d)
2: Excluding max OD 0.8 (green; 210 d)
4: Excluding 2 and 3 (blue; 163 d)

The window length is estimated using a non-parametric survival technique, which makes no assumptions about parametric models or underlying distributions. No interpolation is used to obtain the time at which the BED OD reaches the 0.8 cut-off, or the seroconversion time point. Only time points that define the interval boundaries were used, which means that time points beyond four for a specific woman were not fully utilised. However, as few as two time points per woman could be used in this estimation of window length.
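The interval-censoring idea can be sketched with the self-consistency (EM) iteration that underlies Turnbull's estimator. This is a simplified illustration, not the analysis behind the figures quoted here: it places probability mass on every cell between sorted interval endpoints rather than only on Turnbull's innermost intervals (mass on the other cells should shrink toward zero as the iteration proceeds), and the example intervals for window length are invented.

```python
import numpy as np

def turnbull_masses(left, right, n_iter=500):
    """Self-consistency estimate for interval-censored durations.

    Observation i is known only to lie between left[i] and right[i].
    Returns the sorted cell boundaries and the probability mass per cell.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if np.any(left >= right):
        raise ValueError("each interval needs left < right")
    cuts = np.unique(np.concatenate([left, right]))
    lo, hi = cuts[:-1], cuts[1:]
    # inside[i, j] is True when cell j lies within observation i's interval.
    inside = (left[:, None] <= lo[None, :]) & (hi[None, :] <= right[:, None])
    p = np.full(lo.size, 1.0 / lo.size)  # start from uniform mass
    for _ in range(n_iter):
        w = inside * p                      # current mass each observation can use
        w /= w.sum(axis=1, keepdims=True)   # E-step: share each observation out
        p = w.mean(axis=0)                  # M-step: average the shares
    return cuts, p


# Hypothetical window-length intervals (days) for four clients.
cuts, p = turnbull_masses([100, 150, 120, 170], [200, 260, 180, 300])
for a, b, mass in zip(cuts[:-1], cuts[1:], p):
    print(f"({a:.0f}, {b:.0f}] days: mass {mass:.3f}")
```

The estimate only says how much probability falls in each cell, not where within the cell it sits, so any single summary such as a mean window needs a further convention (for example cell midpoints).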

Conclusion: There is still no general agreement on how best to estimate the window for methods like the BED. Fortunately most of those described seem to give fairly similar answers, though it’s not clear to what extent this is happening by chance.