
Comprehensive Project By Melissa Joy. Background Information on Probability; Intro to Fay’s Formula; Notation; Overview of the method behind Fay’s.


1 Comprehensive Project By Melissa Joy

2
• Background Information on Probability
• Intro to Fay’s Formula
• Notation
• Overview of the method behind Fay’s Formula
• Breast cancer example using raw data
• Table of age-conditional breast cancer risk
• Table of age-conditional cancer risk (all sites)
• Bibliography
• Thank-yous

3
• Probability is the likelihood or chance that something will happen.
• Conditional probability is the probability of some event A, given the occurrence of some other event B.
  ◦ It is written $P(A \mid B)$.
  ◦ It is read “the probability of A, given B.”
  ◦ $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$.
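As a small illustration of the definition, the sketch below computes P(A|B) for one roll of a fair six-sided die. The die and the events A (roll is even) and B (roll is greater than 3) are assumptions chosen for this example; they are not part of the original slides.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
A = {n for n in outcomes if n % 2 == 0}  # event A: the roll is even
B = {n for n in outcomes if n > 3}       # event B: the roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


# P(A|B) = P(A ∩ B) / P(B)
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)  # 2/3
```

Knowing that the roll is greater than 3 raises the chance of an even roll from 1/2 to 2/3, because two of the three remaining outcomes (4 and 6) are even.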

4
• A probability density function (pdf) is a function $f(x)$ that describes a probability distribution through its integrals.
• The probability that x lies in the interval [a, b] is given by $\int_a^b f(x)\,dx$.
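To make the integral concrete, here is a minimal sketch that approximates the probability of landing in [a, b] numerically. The exponential density with rate 1 and the midpoint rule are assumptions picked for illustration; the slides do not specify a particular pdf or integration method.

```python
import math


def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def exp_pdf(x):
    """Hypothetical pdf: exponential distribution with rate 1 (valid for x >= 0)."""
    return math.exp(-x)


# Probability that x lies in [0, 1] under the assumed pdf.
p = integrate(exp_pdf, 0.0, 1.0)

# For this pdf the integral has the closed form 1 - e^(-1), which
# serves as a check on the numerical approximation.
exact = 1 - math.exp(-1)
print(p, exact)
```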

5 A(x,y): the age-conditional probability of getting cancer between ages x and y, given alive and cancer free up until age x.
Equivalently, the probability that an individual of age x will get cancer in the next (y - x) years, given alive and cancer free up until age x.
Goal: Write A(x,y) in terms of data that is easily found and collected.

6 Probability density functions (for simplicity, these pdf’s will be constant, so I will refer to them as probabilities):
• λ: failure rates
• S: survival rates
Subscripts:
• c: denotes incidence of cancer
• d: denotes incidence of death from cancer
• o: denotes death from other (non-cancer) causes
• An asterisk (*) signifies that the quantity is conditional on the individual being cancer free up until a particular age.

7 A(x,y): Age-conditional probability of getting cancer between x and y, given alive and cancer free up until age x.
$$A(x,y) = \frac{P(\text{first cancer occurs between age } x \text{ and } y)}{P(\text{alive and cancer free at age } x \text{, given cancer free before})} = \frac{\int_x^y f_c(a)\,da}{S^*(x)}$$
Goal: Rewrite A(x,y) with no * terms.
• $f_c(a)$: probability density function of the first occurrence of cancer happening at age a (a between x and y)
• $S^*(x)$: probability that the person is alive and cancer free at age x, given they are cancer free up until age x
Sources:
• Fay, Michael P. "Estimating Age Conditional Probability of Developing Disease From Surveillance Data." Population Health Metrics 2 (2004): 6-14.
• Fay, Michael P., Ruth Pfeiffer, Kathleen A. Cronin, Chenxiong Le, and Eric J. Feuer. "Age-Conditional Probabilities of Developing Cancer." Statistics in Medicine 22 (2003): 1837-1848.

8 Starting with the numerator
Goal: Rewrite A(x,y) with no * terms.
$$A(x,y) = \frac{\int_x^y f_c(a)\,da}{S^*(x)}$$
It is true that $f_c(a) = \lambda_c^*(a)\,S^*(a)$, so
$$P(\text{first cancer occurs between age } x \text{ and } y) = \int_x^y f_c(a)\,da = \int_x^y \lambda_c^*(a)\,S^*(a)\,da$$
and therefore
$$A(x,y) = \frac{\int_x^y \lambda_c^*(a)\,S^*(a)\,da}{S^*(x)}$$
• $f_c(a)$: probability density function of the first occurrence of cancer happening at age a (a between x and y)
• $\lambda_c^*(a)$: probability that the first cancer occurs at age a, given alive and cancer free up until age a
• $S^*(a)$: probability that the person is alive and cancer free at age a, given they are cancer free up until age a

9 We can now rewrite the numerator without * terms
$$A(x,y) = \frac{\int_x^y \lambda_c^*(a)\,S^*(a)\,da}{S^*(x)}$$
It can be shown that
$$\lambda_c(a) = \frac{f_c(a)}{S(a)} = \frac{\lambda_c^*(a)\,S^*(a)}{S(a)}$$
So by rearranging the above equation we get
$$\lambda_c(a)\,S(a) = \lambda_c^*(a)\,S^*(a)$$
and the numerator can be written without * terms:
$$A(x,y) = \frac{\int_x^y \lambda_c(a)\,S(a)\,da}{S^*(x)}$$
Goal accomplished for the numerator!
• $f_c(a)$: probability density function of the first occurrence of cancer happening at age a (a between x and y)
• $\lambda_c(a)$: probability that the first cancer occurs at age a
• $S(a)$: probability that the person is alive and cancer free at age a
• $\lambda_c^*(a)$: probability that the first cancer occurs at age a, given alive and cancer free up until age a
• $S^*(a)$: probability that the person is alive and cancer free at age a, given they are cancer free up until age a

10 Goal: Rewrite A(x,y) with no * terms.
$$A(x,y) = \frac{\int_x^y \lambda_c(a)\,S(a)\,da}{S^*(x)}$$
For the denominator, $S^*(x) = S_c^*(x)\,S_o^*(x)$, and we know $S_o^*(x) = S_o(x)$.
Through a long series of calculations we find that:
$$S_c^*(x) = 1 - \int_0^x \lambda_c(a)\,S_d(a)\,da$$
So we can rewrite the denominator as
$$S^*(x) = S_o(x)\left\{1 - \int_0^x \lambda_c(a)\,S_d(a)\,da\right\}$$
which gives
$$A(x,y) = \frac{\int_x^y \lambda_c(a)\,S(a)\,da}{S_o(x)\left\{1 - \int_0^x \lambda_c(a)\,S_d(a)\,da\right\}}$$
• $S^*(a)$: probability that the person is alive and cancer free at age a, given they are cancer free up until age a
• $S_c^*(a)$: probability that the person is cancer free at age a, given they are cancer free up until age a
• $S_o^*(a)$: probability that the person did not die from non-cancer related causes at age a, given they are cancer free up until age a
• $S_o(a)$: probability that the person did not die from non-cancer related causes at age a
• $S_d(a)$: probability that the person did not die from cancer at age a
• $\lambda_c(a)$: probability that the first cancer occurs at age a
• $S(a)$: probability that the person is alive and cancer free at age a

11 A(x,y): Age-conditional probability of getting cancer between x and y, given alive and cancer free up until age x.
We started from:
$$A(x,y) = \frac{\int_x^y f_c(a)\,da}{S^*(x)}$$
and arrived at:
$$A(x,y) = \frac{\int_x^y \lambda_c(a)\,S(a)\,da}{S_o(x)\left\{1 - \int_0^x \lambda_c(a)\,S_d(a)\,da\right\}}$$
Goal accomplished!
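The final formula can be evaluated numerically once the rates are available as functions of age. The following is only a sketch under stated assumptions: the function names, the midpoint-rule integrator, and the toy rates in the usage section are all hypothetical and are not taken from Fay's papers or from SEER data.

```python
def integrate(f, a, b, steps=1_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def age_conditional_probability(x, y, lam_c, S, S_o, S_d):
    """Evaluate A(x, y) from the final formula on this slide.

    lam_c, S, S_o and S_d are callables of age a, named after the
    symbols lambda_c(a), S(a), S_o(a) and S_d(a) in the slides.
    """
    numerator = integrate(lambda a: lam_c(a) * S(a), x, y)
    cancer_free = 1.0 - integrate(lambda a: lam_c(a) * S_d(a), 0.0, x)
    return numerator / (S_o(x) * cancer_free)


# Toy, made-up inputs (NOT real data): an incidence rate that grows
# linearly with age and constant survival probabilities.
result = age_conditional_probability(
    x=40,
    y=50,
    lam_c=lambda a: 1e-6 * a,
    S=lambda a: 0.999,
    S_o=lambda a: 0.9995,
    S_d=lambda a: 0.9999,
)
print(result)
```

Passing the rates as callables keeps the sketch usable for age-varying rates; with constant rates the integrals collapse to simple products of a rate and an interval length.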

12 Let’s find the failure rates
Failure rates are the probabilities that you will get cancer, die of cancer, or die from other causes.
Approximated SEER Data 2004:
• c: number of incidences of cancer ≈ 160
• d: number of cancer-caused deaths ≈ 20
• o: number of deaths from other causes ≈ 1500
• n: mid-interval population ≈ 3 million
$\lambda_c(a) \approx c/n$, $\lambda_d(a) \approx d/n$, $\lambda_o(a) \approx o/n$
• $\lambda_c(20) \approx 160 / 3\text{ million} = 0.00005333$
• $\lambda_d(20) \approx 20 / 3\text{ million} = 0.0000066667$
• $\lambda_o(20) \approx 1500 / 3\text{ million} = 0.0005$

13 Survival rates are the probabilities that the individual has not gotten cancer, died from cancer, or died from other causes. S (without a subscript) is the probability of being alive and cancer free.
• $S_c(20) = 1 - \lambda_c(20) = 0.99994667$
• $S_d(20) = 1 - \lambda_d(20) = 0.999993$
• $S_o(20) = 1 - \lambda_o(20) = 0.9995$
• $S(20) = 1 - \{\lambda_c(20) + \lambda_o(20)\} = 0.99944667$
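The failure and survival rates for age 20 can be reproduced with a few lines of arithmetic. This sketch uses the approximated SEER 2004 counts quoted in the slides; the variable names are my own and are only illustrative.

```python
# Approximated SEER 2004 counts for age 20, as quoted in the slides.
cancer_cases = 160        # c: incidences of cancer
cancer_deaths = 20        # d: deaths caused by cancer
other_deaths = 1500       # o: deaths from other causes
population = 3_000_000    # n: mid-interval population

# Failure rates: events divided by the mid-interval population.
lam_c = cancer_cases / population
lam_d = cancer_deaths / population
lam_o = other_deaths / population

# Survival rates: one minus the corresponding failure rate(s).
S_c = 1 - lam_c
S_d = 1 - lam_d
S_o = 1 - lam_o
S = 1 - (lam_c + lam_o)   # alive and cancer free

print(f"lam_c={lam_c:.8f}  lam_d={lam_d:.10f}  lam_o={lam_o:.4f}")
print(f"S_c={S_c:.8f}  S_d={S_d:.6f}  S_o={S_o:.4f}  S={S:.8f}")
```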

14
$$A(x,y) = \frac{\int_x^y \lambda_c(a) \cdot S(a)\,da}{S_o(x)\left\{1 - \int_0^x \lambda_c(a) \cdot S_d(a)\,da\right\}}$$
$$A(20,30) = \frac{\int_{20}^{30} \lambda_c(20) \cdot S(20)\,da}{S_o(20)\left\{1 - \int_0^{20} \lambda_c(20) \cdot S_d(20)\,da\right\}} = \frac{10\,\lambda_c(20) \cdot S(20)}{S_o(20)\left\{1 - 20\,\lambda_c(20) \cdot S_d(20)\right\}} \approx 0.000534 = 0.0534\%$$
What does this number mean?
Source: http://seer.cancer.gov/csr/1975_2004/results_merged/topic_lifetime_risk.pdf
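The arithmetic on this slide can be checked directly. The sketch below follows the slide's simplification of holding the age-20 rates constant over both integrals, so each integral reduces to a rate times the length of its interval; the variable names are illustrative only.

```python
# Age-20 rates from the approximated SEER 2004 counts in the slides.
n = 3_000_000
lam_c = 160 / n            # cancer incidence rate
lam_d = 20 / n             # cancer death rate
lam_o = 1500 / n           # other-cause death rate

S = 1 - (lam_c + lam_o)    # alive and cancer free
S_o = 1 - lam_o            # did not die from other causes
S_d = 1 - lam_d            # did not die from cancer

x, y = 20, 30

# With constant integrands, the integral over [x, y] is (y - x) * integrand
# and the integral over [0, x] is x * integrand.
numerator = (y - x) * lam_c * S
denominator = S_o * (1 - x * lam_c * S_d)
A = numerator / denominator

print(f"A(20,30) = {A:.6f} = {A * 100:.4f}%")  # A(20,30) = 0.000534 = 0.0534%
```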

15 Age-conditional breast cancer risk

Current Age   +10 years   +20 years   +30 years   Eventually
0             0 %         0 %         0.06 %      12.28 %
10            0 %         0.06 %      0.48 %      12.42 %
20            0.05 %      0.48 %      1.89 %      12.45 %
30            0.43 %      1.84 %      4.24 %      12.46 %
40            1.43 %      3.86 %      7.04 %      12.19 %
50            2.51 %      5.79 %      8.93 %      11.12 %
60            3.51 %      6.87 %      8.76 %      9.21 %
70            3.88 %      6.07 %      -           6.59 %
80            3.04 %      -           -           3.76 %

Table from Surveillance, Epidemiology and End Results (SEER) database: http://seer.cancer.gov/csr/1975_2004/results_merged/topic_lifetime_risk.pdf

16 Age-conditional cancer risk (all sites)

Current Age   +10 years   +20 years   +30 years   Eventually
0             0.16 %      0.33 %      0.75 %      40.93 %
10            0.17 %      0.60 %      1.58 %      41.33 %
20            0.43 %      1.42 %      3.93 %      41.39 %
30            1.01 %      3.55 %      9.59 %      41.49 %
40            2.60 %      8.77 %      20.01 %     41.35 %
50            6.47 %      18.27 %     31.33 %     40.67 %
60            13.16 %     27.71 %     36.08 %     38.13 %
70            18.46 %     29.07 %     -           31.67 %
80            17.10 %     -           -           21.30 %

Table from Surveillance, Epidemiology and End Results (SEER) database: http://seer.cancer.gov/csr/1975_2004/results_merged/topic_lifetime_risk.pdf

17
• Fay, Michael P. "Estimating Age Conditional Probability of Developing Disease From Surveillance Data." Population Health Metrics 2 (2004): 6-14.
• Fay, Michael P., Ruth Pfeiffer, Kathleen A. Cronin, Chenxiong Le, and Eric J. Feuer. "Age-Conditional Probabilities of Developing Cancer." Statistics in Medicine 22 (2003): 1837-1848.
• Ries LAG, Melbert D, Krapcho M, Mariotto A, Miller BA, Feuer EJ, Clegg L, Horner MJ, Howlader N, Eisner MP, Reichman M, Edwards BK (eds). SEER Cancer Statistics Review, 1975-2004, National Cancer Institute. Bethesda, MD, http://seer.cancer.gov/csr/1975_2004/results_merged/topic_lifetime_risk.pdf, based on November 2006 SEER data submission, posted to the SEER web site, 2007.
• "What Is Your Risk?" Your Disease Risk. (2005). Harvard Center For Cancer Prevention. 2 Oct 2007.

18 THANK YOU!
• Professor Lengyel
• Professor Buckmire
• Professor Knoerr
• And… the entire Oxy math department

19 Go to http://www.yourdiseaserisk.wustl.edu/ to calculate your risk and learn what could raise and lower your risk

