
Introduction to Bayesian Methods (I) C. Shane Reese Department of Statistics Brigham Young University.


1 Introduction to Bayesian Methods (I) C. Shane Reese Department of Statistics Brigham Young University

2 Outline  Definitions  Classical or Frequentist  Bayesian  Comparison (Bayesian vs. Classical)  Bayesian Data Analysis  Examples

3 Definitions  Problem: Unknown population parameter (θ) must be estimated.  EXAMPLE #1:  θ = Probability that a randomly selected person will be a cancer survivor  Data are binary, parameter is unknown and continuous  EXAMPLE #2:  θ = Mean survival time of cancer patients.  Data are continuous, parameter is continuous.

4 Definitions  Step 1 of either formulation is to pose a statistical (or probability) model for the random variable which represents the phenomenon.  EXAMPLE #1:  a reasonable choice for f(y|θ) (the sampling density or likelihood function) would be that the number of 6-month survivors (Y) follows a binomial distribution, with a total of n subjects followed and probability θ that any one subject survives.  EXAMPLE #2:  a reasonable choice for f(y|θ) would be that survival time (Y) has an exponential distribution with mean θ.
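The two sampling models on this slide can be written out directly. This is a minimal sketch using only the Python standard library; the function names and the example numbers are illustrative assumptions, not values from the slides.

```python
import math

def binomial_likelihood(y, n, theta):
    """f(y | theta): probability of y survivors among n subjects."""
    return math.comb(n, y) * theta**y * (1 - theta)**(n - y)

def exponential_density(y, theta):
    """f(y | theta): density of a survival time y when the mean is theta."""
    return math.exp(-y / theta) / theta

# Hypothetical inputs, for illustration only.
print(binomial_likelihood(7, 10, 0.7))
print(exponential_density(12.0, 10.0))
```

Viewed as a function of θ with the data held fixed, either expression is the likelihood that both the classical and the Bayesian formulations start from.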

5 Classical (Frequentist) Approach  All pertinent information enters the problem through the likelihood function in the form of data (Y1, ..., Yn)  objective in nature  software packages all have this capability  maximum likelihood, unbiased estimation, etc.  confidence intervals, which are difficult to interpret
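For the binomial example, the classical recipe can be sketched as follows. The maximum likelihood estimate of θ is y/n; the interval shown is the usual Wald formula for a proportion, chosen here as an assumption since the slides do not say which interval formula is meant. The data (7 survivors out of 10) are hypothetical.

```python
import math

def wald_interval(y, n, z=1.96):
    """Return (estimate, lower, upper) for a binomial proportion.

    The estimate y/n is the maximum likelihood estimate; the interval is
    the Wald interval, clipped to [0, 1]. z=1.96 gives roughly 95% coverage.
    """
    p_hat = y / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

estimate, lower, upper = wald_interval(y=7, n=10)  # hypothetical data
print(f"{estimate:.2f} ({lower:.2f}, {upper:.2f})")
```

The 95% here describes the long-run behavior of the procedure under repeated sampling, which is the "difficult interpretation" the slide refers to.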

6 Bayesian Data Analysis  data enter through the likelihood function, with allowance for other information through the prior  p(θ|y) = c · f(y|θ) · p(θ)  reads: the posterior distribution is a constant multiplied by the likelihood multiplied by the prior distribution  posterior distribution: our updated view of the parameter in light of the data  prior distribution: our view of the parameter before any data collection
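The statement "posterior is a constant times likelihood times prior" can be checked numerically on a grid of θ values. This is only a sketch: the grid approximation, the flat prior, and the data (7 successes in 10 trials) are assumptions made for illustration.

```python
import math

def grid_posterior(y, n, prior, grid_size=1001):
    """Approximate the posterior of theta on an evenly spaced grid over [0, 1]."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalized posterior: binomial likelihood times prior at each grid point.
    unnormalized = [
        math.comb(n, y) * t**y * (1 - t)**(n - y) * prior(t) for t in thetas
    ]
    total = sum(unnormalized)  # plays the role of 1 / constant
    weights = [u / total for u in unnormalized]
    return thetas, weights

def flat_prior(theta):
    """A flat ("noninformative") prior on [0, 1]."""
    return 1.0

thetas, weights = grid_posterior(y=7, n=10, prior=flat_prior)  # hypothetical data
posterior_mean = sum(t * w for t, w in zip(thetas, weights))
print(round(posterior_mean, 3))
```

Swapping `flat_prior` for an informative prior changes the weights but not the mechanics, which is the point of the formula.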

7 Additional Information  Prior Distributions  can come from expert opinion, historical studies, previous research, or general knowledge of a situation (see examples)  there exists a “flat” or “noninformative” prior, which represents a state of ignorance.  Controversial piece of Bayesian methods  Objective Bayes, Empirical Bayes

8 Bayesian Data Analysis  inherently subjective (prior is controversial)  few software packages have this capability  result is a probability distribution  credible intervals use the language that everyone uses anyway. (Probability that θ is in the interval is 0.95)  see examples for demonstration

9 Mammography

                            Test Result
                            Positive    Negative
  Patient Status  Cancer    88%         12%
                  Healthy   24%         76%

  Sensitivity: True Positive (Cancer ID’d!)
  Specificity: True Negative (Healthy not ID’d!)

10 Mammography Illustration  My friend (40!!!) heads into her OB/GYN for a mammography (according to Dr.’s orders) and finds a positive test result.  Does she have cancer?  Specificity, sensitivity both high! Seems likely... or does it?  Important points: incidence of breast cancer in 40 year old women is 126.2 per 100,000 women.

11 Bayes Theorem for Mammography
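The formula on this slide did not survive in the transcript. The calculation it refers to can be sketched from the numbers given on the neighboring slides (88% sensitivity, 24% positive rate among healthy patients, incidence of 126.2 per 100,000); the variable names are assumptions.

```python
sensitivity = 0.88            # P(positive | cancer), from the test result table
false_positive_rate = 0.24    # P(positive | healthy), from the test result table
prevalence = 126.2 / 100_000  # incidence of breast cancer in 40-year-old women

# Total probability of a positive test, then Bayes' theorem.
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
p_cancer_given_positive = sensitivity * prevalence / p_positive

print(f"{p_cancer_given_positive:.2%}")  # about 0.46%, the figure quoted in the slides
```

The low prior probability dominates: most positive results come from the much larger group of healthy patients.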

12 Mammography Tradeoffs  Impacts of false positive  Stress  Invasive follow-up procedures  Worth the trade-off with less than a 1% (0.46%) chance you actually have cancer???

13 Mammography Illustration  My mother-in-law had the same diagnosis in 2001.  Holden, UT is a “downwinder” city; she was 65.  Does she have cancer?  Specificity, sensitivity both high! Seems likely... or does it?  Important points: incidence of breast cancer in 65 year old women is 470 per 100,000 women, and approx 43% in “downwinder” cities.  Does this change our assessment?

14 Downwinder Mammography
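The calculation for this slide is also missing from the transcript. A sketch of how the assessment shifts when only the prior incidence changes is given below, reusing the test characteristics from the mammography table. It uses the 470 per 100,000 figure only; how the "approx 43%" figure for downwinder cities enters is not recoverable from the transcript, so it is left out. The function name is an assumption.

```python
def posterior_cancer_probability(prevalence, sensitivity=0.88, false_positive_rate=0.24):
    """P(cancer | positive test) by Bayes' theorem."""
    true_positive = sensitivity * prevalence
    false_positive = false_positive_rate * (1 - prevalence)
    return true_positive / (true_positive + false_positive)

age_40 = posterior_cancer_probability(126.2 / 100_000)  # 40-year-old incidence
age_65 = posterior_cancer_probability(470 / 100_000)    # 65-year-old incidence
print(f"age 40: {age_40:.2%}, age 65: {age_65:.2%}")
```

The same positive test carries more weight when the prior incidence is higher, which is the question the slide poses.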

15 Modified Example #1  One person in the class stands at the back and throws the ball at the target on the board (10 times).  Before we have the person throw the ball ten times, does the choice of person change the a priori belief you have about the probability they will hit the target (θ)?  Before we have the person throw the ball ten times, does the choice of target size change the a priori belief you have about the probability they will hit the target (θ)?

16 Prior Distributions  A convenient choice for this prior information is the Beta distribution, where the parameters defining the distribution are the numbers of a priori successes and failures. For example, if you believe your prior opinion is worth 8 throws and you think the person selected would hit the target drawn on the board in 6 of them, we would say that θ has a Beta(6,2) distribution.

17 Bayes for Example #1  If our data are Binomial(n, θ), then the classical approach would calculate Y/n as our estimate and use a confidence interval formula for a proportion.  If our data are Binomial(n, θ) and our prior distribution is Beta(a,b), then our posterior distribution is Beta(a+y, b+n−y).  Thus, in our example:  a = ___  b = ___  n = ___  y = ___  and so the posterior distribution is: Beta(___, ___)
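The blanks on this slide were presumably filled in during the class exercise. A sketch of the update rule follows, using the Beta(6,2) prior from the previous slide and n = 10 throws; the number of hits (y = 7) is a made-up value, since the class result is not recorded.

```python
def beta_binomial_update(a, b, y, n):
    """Posterior Beta parameters for a Beta(a, b) prior and y successes in n trials."""
    return a + y, b + n - y

# a, b from the Beta(6,2) prior; n = 10 throws; y = 7 is hypothetical.
a_post, b_post = beta_binomial_update(a=6, b=2, y=7, n=10)
posterior_mean = a_post / (a_post + b_post)
print(a_post, b_post, round(posterior_mean, 3))
```

With these inputs the posterior mean falls between the prior mean (6/8) and the sample proportion (7/10), weighted by the 8 prior throws and the 10 observed throws.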

18 Bayesian Interpretation  Therefore we can say that the probability that θ is in the interval (___, ___) is 0.95.  Notice that we don’t have to address the problem of “in repeated sampling”  this is a direct probability statement  relies on the prior distribution
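One way to obtain such an interval is to simulate from the posterior and read off its quantiles. The sketch below uses the standard library's `random.betavariate`; the Beta(13, 5) posterior is a hypothetical example (a Beta(6,2) prior combined with a made-up 7 hits in 10 throws), and the equal-tailed interval is one choice among several.

```python
import random

def beta_credible_interval(a, b, level=0.95, draws=100_000, seed=0):
    """Equal-tailed credible interval for Beta(a, b), estimated by simulation."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1 - level) / 2
    lower = samples[int(tail * draws)]
    upper = samples[int((1 - tail) * draws) - 1]
    return lower, upper

lower, upper = beta_credible_interval(13, 5)  # hypothetical posterior
print(f"P({lower:.2f} < theta < {upper:.2f}) is about 0.95")
```

Unlike a confidence interval, the printed statement is a direct probability statement about θ given the data and the prior.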

19 Example: Phase II Dose Finding  Goal:  Fit models of the form: Where And d=1,…,D is the dose level

20 Definition of Terms  ED(Q):  Lowest dose for which Q% of efficacy is achieved  Multiple definitions:  Def. 1  Def. 2  Example: Q = 0.95, the ED95 dose is the lowest dose for which 0.95 efficacy is achieved

21 Classical Approach  Completely randomized design  Perform F-test for difference between groups  If significant at the pre-specified level, then call the trial a “success”, and determine the most effective dose as the lowest dose that achieves some pre-specified criterion (ED95)

22 Bayesian Adaptive Approach  Assign patients to doses adaptively based on the amount of information about the dose-response relationship.  Goal: maximize expected change in information gain:  Weighted average of the posterior variances and the probability that a particular dose is the ED95 dose.

23 Probability of Allocation  Assign patients to doses based on Where is the probability of being assigned to dose

24 Four Decisions at Interim Looks  Stop trial for success: the trial is a success, let’s move on to the next phase.  Stop trial for futility: the trial is going nowhere, let’s stop now and cut our losses.  Stop trial because the maximum number of patients allowed is reached (Stop for cap): trial outcome is still uncertain, but we can’t afford to continue the trial.  Continue

25 Stop for Futility  The dose-finding trial is stopped because there is insufficient evidence that any of the doses is efficacious.  If the posterior probability that the mean change for the most likely ED95 dose is within a “clinically meaningful amount” of the placebo response is greater than 0.99 then the trial stops for futility.

26 Stop for Success  The dose-finding trial is stopped when the current probability that the ED95* is sufficiently efficacious is sufficiently high.  If the posterior probability that the most likely ED95 dose is better than placebo reaches 0.99 or higher, then the trial stops early for success.  Note: Posterior (after updated data) probability drives this decision.

27 Stop for Cap  Cap: If the sample size reaches the maximum (the cap) defined for all dose groups the trial stops.  Refine definition based on application. Perhaps one dose group reaching max is of interest.  Almost always $$$ driven.

28 Continue  Continue: If none of the above three conditions hold then the trial continues to accrue.  Decision to continue or stop is made at each interim look at the data (accrual is in batches)
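The four decisions can be sketched as a single rule evaluated at each interim look. The 0.99 thresholds come from the slides; the function and argument names, the order in which the stopping rules are checked, and the use of a single overall cap are assumptions. The posterior probabilities themselves would come from the fitted dose-response model, which is not shown.

```python
def interim_decision(p_better_than_placebo, p_within_meaningful_amount,
                     n_enrolled, n_cap,
                     success_cut=0.99, futility_cut=0.99):
    """Return the decision at one interim look.

    p_better_than_placebo: posterior probability that the most likely
        ED95 dose is better than placebo.
    p_within_meaningful_amount: posterior probability that the mean change
        for the most likely ED95 dose is within a clinically meaningful
        amount of the placebo response.
    """
    if p_better_than_placebo >= success_cut:
        return "stop for success"
    if p_within_meaningful_amount > futility_cut:
        return "stop for futility"
    if n_enrolled >= n_cap:
        return "stop for cap"
    return "continue"

# Hypothetical interim look.
print(interim_decision(0.95, 0.10, n_enrolled=120, n_cap=400))
```

Because accrual is in batches, this check would run once per batch rather than after every patient.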

29 Benefits of Approach  Statistical: weighting by the variance of the response at each dose allows quicker resolution of dose-response relationship.  Medical: Integrating over the probability that each dose is ED95 allows quicker allocation to more efficacious doses.

30 Example of Approach  Reduction in average number of events  Y=reduction of number of events  D=6 (5 active, 1 placebo)  Potential exists that there is a non-monotonic dose-response relationship.  Let be the dose value for dose d.

31 Model for Example

32 Dynamic Model Properties  Allows for flexibility.  Borrows strength from “neighboring” doses and similarity of response at neighboring doses.  Simplified version of Gaussian Process Models.  Potential problem: semi-parametric, thus only considers doses within dose range:

33 Example Curves

34 Simulations  5000 simulated trials at each of the 5 scenarios  Fixed dose design,  Bayesian adaptive approach as outlined above  Compare two approaches for each of 5 cases with sample size, power, and type-I error

35 Results (power & alpha)

  Case   Pr(S)   Pr(F)   Pr(cap)   P(Rej)
  1      .018    .973    .009      .049
  2      1       0       0         .235
  3      1       0       0         .759
  4      1       0       0         .241
  5      1       0       0         .802

36 Results (n)

  Case    0      10     20     40     80     120
  1       51.6   26.1   26.2   31.2   33.5   36.8
  2       28.4   10.9   13.8   18.9   22.5   19.2
  3       27.7   11.3   14.5   25.2   17     15.2
  4       31.2   10.8   13.3   19.6   22.2   27.8
  5       28.9   18.0   22.3   21.1   14.5   10.7
  Fixed   130

37 Observations  Adaptive design serves two purposes:  Get patients to efficacious doses  More efficient statistical estimation  Sample size considerations  Dose expansion -- inclusion of safety considerations  Incorporation of uncertainties!!! Predictive inference is POWERFUL!!!

38 Conclusions  Science is subjective (what about the choice of a likelihood?)  Bayes uses all available information  Makes interpretation easier  BAD NEWS: I have shown very simple cases... they get much harder.  GOOD NEWS: They are possible (and practical) with advanced computational procedures

