Applied Statistics for Biological Dosimetry, Part 1. Lecture Module 8. Purpose: To present the techniques and statistical considerations for constructing dose-effect curves. Learning objectives: Upon completion of this lecture the participants will have: been introduced to the basic physics and biological requirements for constructing a dose-response curve; been introduced to the linear quadratic model of dose response; been shown the Poisson distribution and how to test for it; been introduced to dose-response curve fitting. Duration: 1 hour.
Radiation induces Chromosome Damage The yield of chromosome damage depends on dose, dose rate and radiation type Dose estimation is made using a calibration curve and any laboratory intending to carry out biological dosimetry should establish its own dose–response data It is accepted that the yield of radiation-induced chromosome aberrations in peripheral blood lymphocytes shows a clear relationship with dose. Moreover, after G0 lymphocyte irradiation, no major differences in radiation sensitivities between individuals have been described apart from a few people suffering from radiosensitive syndromes. Therefore the use of chromosome damage as an endpoint for dose-assessment is widely accepted. However, experience has shown that despite the adoption of similar laboratory protocols, differences do exist between laboratories. Therefore the recommendation is that each laboratory should establish its own reference calibration data for qualities of radiation likely to be encountered in accidents.
How to produce a dose-effect calibration curve. Lymphocytes should be irradiated in vitro to approximate as closely as possible the in vivo situation. Freshly taken blood specimens in lithium heparin tubes should be used and irradiated as whole blood at 37°C. After irradiation they should be held for a further 2 h at 37°C. For low LET radiation, 10 or more doses should be used in the range 0 – 5.0 Gy. However, if the laboratory is capable of obtaining data at doses below 0.25 Gy, this is very desirable. For high LET radiation a maximum dose of 2.0 Gy is suggested. The IAEA manual proposes a common method that should be used to approximate as closely as possible to the in vivo situation. Freshly taken blood samples, with lithium heparin anticoagulant, should be maintained at 37°C before, during and for 2 hours after irradiation. The post-irradiation holding is to allow for repair of the induced DNA damage. Lymphocyte cultures are then set up and processed following a standard procedure that is also used for case investigations. For low-LET type radiations the dose-effect function is linear-quadratic and the data points should be fitted to this model. In order to produce a well fitted curve many dose points should be used and typically ~10 points in the range 0-5 Gy are suggested. Moreover, these points should be equally distributed below and above 1 Gy. It is highly recommended to obtain data below 0.25 Gy, for example 0.1 Gy. Because most referred cases involve low doses the quality of the calibration data at low doses is critical. This means that many thousands of cells need to be scored at several low dose data points to ensure that the linear coefficient of the fitted curve has sufficient accuracy and acceptably small statistical uncertainty. Remember that 0 Gy is also a data point which of course can be used for all the laboratory’s curves.
It is highly recommended that a laboratory should establish its own background data set from sampling healthy persons exposed only to background radiation. In most societies such background donors will have to include persons with unremarkable medical exposures such as an occasional dental x-ray.
Physics considerations The preparation of a dose–response curve must be supported by reliable and accurate physical dosimetry The irradiation should be uniform There must be sufficient material surrounding the blood to provide charged particle equilibrium The preparation of a dose–response curve must be supported by reliable and accurate physical dosimetry and for this reason several points should be considered. The blood should be exposed far enough away from the source so that the irradiation can be regarded as uniform. This means that the difference in dose rate to a cell at the entrance side of the specimen tube and a cell at exit side should be negligible. There must be sufficient material surrounding the blood for charged particle equilibrium to exist. The surrounding materials should be reduced to a minimum to avoid the complications of scattered radiation. The materials should have atomic compositions similar to blood because the dose to blood close to the specimen container wall will be caused by electrons arising from interactions within the wall. The exposure set up should be calibrated by physical measurements and most commonly an ionization chamber is used but other methods are possible. The physical dosimetry system that is used should have a calibration traceable to a national or international standard.
Biological Considerations. Stimulate the lymphocytes with a mitogen (PHA). Culture for 48-50 h in the presence of BrdU. Stain with FPG to restrict the analysis to guaranteed first division cells. Peripheral blood lymphocytes are in a non-cycling phase of the cell cycle, quiescent G0, so to obtain metaphase spreads they must be stimulated to divide with a mitogen such as phytohaemagglutinin (PHA). Because dicentrics, rings and acentrics impede the segregation of DNA during mitosis, and daughter cells normally cannot survive, the analysis should be restricted to cells in their first division. This can be guaranteed by including bromodeoxyuridine (BrdU) in the cultures in order to permit fluorescence plus Giemsa (FPG) staining. This thymidine analogue is taken up preferentially into replicating DNA. When one chromatid is bifilarly and the other one unifilarly substituted, FPG staining produces a “harlequin effect” in the metaphase chromosomes of cells which are in their second or later post-substitution division. There is no universally established concentration of BrdU that should be used, but a standard method is proposed in the IAEA Manual. Figure: MI and MII metaphases.
Number of cells to be analyzed At higher doses, scoring should aim to detect 100 dicentrics at each dose At lower doses it is difficult to achieve 100 dicentrics, and instead several thousand cells per point should be scored; a number between 3000 and 5000 is suggested In all cases, the actual number of cells scored should be dependent on the number of dose points in the low dose region, with the focus on minimizing the error on the fitted curve It is important to obtain statistically significant α and β coefficients when the dose response data are fitted to the linear quadratic model. For this reason, not only the number of doses analyzed but also the number of cells analyzed at each dose are key points.
Which aberrations to score ? Some laboratories calibrate against the dicentric frequency; others also include centric rings. Either method is acceptable. Both of these aberrations in an M1 cell should be accompanied by an acentric fragment. These fragments must not be listed with excess acentrics. Chromosome-type aberrations: dicentrics and rings
Linear quadratic function. For low LET radiation there is very strong evidence that the yields of chromosome aberrations or micronuclei (Y) are related to dose (D) by the linear quadratic equation Y = C + αD + βD². For high LET radiation, the α-term becomes large and eventually the β-term becomes biologically less relevant and also statistically ‘masked’, and the dose response is approximated by the linear equation Y = C + αD. Over the 50+ years that biological dosimetry has been practiced numerous dose response data sets have been published. The evidence is overwhelming that the data fit to a linear quadratic model, reducing to linear for densely ionising high LET radiations. This model is consistent with the biophysical understanding of how energy is deposited at the DNA molecular level.
Y = C + αD + βD². The linear term αD corresponds to one-track action, and the quadratic term βD² corresponds to two-track action. C is the background frequency of dicentrics. αD: dicentric formed by one track. βD²: dicentric formed by two tracks. This diagram represents the traditional explanation of the two coefficients. It assumes the “breakage-and-reunion” model of exchange aberration formation. Dicentric formation, like that of other aberrations, is believed to start with radiation induction of DNA double-strand breaks (DSB). The dicentric requires two DSB, each on a different chromosome. A single ionising track can produce a DSB, i.e., it is able to break both strands of the double helix. A single track may also be able to traverse two double helices, breaking both, and this is reflected by the linear yield coefficient, α. Alternatively the two double helices may each be broken by a different track, and this is the origin of the quadratic coefficient, β. A key assumption is that any DSB is made by traversal of one track, but that track can continue on and produce a second DSB elsewhere.
Implications of linear quadratic model (1). The ratio α/β delineates the dose at which the two terms contribute equally to the number of dicentrics formed (αD = βD² when D = α/β). For low-LET radiation at moderate or high doses, where there are many tracks per cell nucleus but each track has only a small probability of making one DSB and a very small probability of making more than one DSB, the quadratic term, βD², dominates the linear term.
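The role of the α/β ratio can be illustrated with a few lines of Python. This is a minimal sketch: the function names and the coefficient values are hypothetical and chosen only for illustration; they are not taken from any calibration curve in this lecture.

```python
def crossover_dose(alpha: float, beta: float) -> float:
    """Dose (Gy) at which alpha*D equals beta*D**2, i.e. D = alpha / beta."""
    if beta <= 0:
        raise ValueError("beta must be positive for a linear quadratic curve")
    return alpha / beta


def term_contributions(alpha: float, beta: float, dose: float):
    """Return the linear and quadratic contributions to the yield at a dose."""
    return alpha * dose, beta * dose ** 2


# Hypothetical low-LET coefficients (illustrative values only).
alpha, beta = 0.02, 0.06

d_eq = crossover_dose(alpha, beta)
linear_at_eq, quadratic_at_eq = term_contributions(alpha, beta, d_eq)

# Below d_eq the linear term is the larger one; above it the quadratic term is.
linear_low, quadratic_low = term_contributions(alpha, beta, 0.1)
linear_high, quadratic_high = term_contributions(alpha, beta, 3.0)
```

With these assumed coefficients the two terms are equal at D = α/β, the linear term is larger at 0.1 Gy and the quadratic term is larger at 3 Gy.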
Implications of linear quadratic model (2). For high-LET radiation at low doses, where typically there are very few radiation tracks per cell nucleus and each track typically makes a number of DSBs, the linear term, αD, dominates. At sufficiently low doses of any type of radiation, when the average number of events per cell is less than one, the function αD + βD² is reduced to αD. Close to the origin the dose response is linear.
Poisson or non-Poisson. The objective of curve fitting is to determine those values of the coefficients C, α and β which best fit the data points. For dicentrics, irradiation with X or gamma rays produces a distribution of damage which is very well represented by the Poisson distribution. The key assumption is that the variance (σ²) equals the mean (y). In contrast, neutrons and other types of high LET radiation produce distributions which display overdispersion, where the variance exceeds the mean. For micronuclei, data tend to overdispersion at all doses even with photon irradiation. This slide starts to explain the Poisson distribution. It is important to note that the distribution among cells assumed for dicentrics is not just a theoretical distribution; it affects some key features in biodosimetry. These are: the mathematical approach to obtain the fitted coefficients; the way to estimate an overexposure with its uncertainty (confidence interval); the way to deal with special irradiation situations like non-uniform exposures and accidents with high LET radiation.
However, it must be pointed out that since early radiobiology it has been accepted that cellular damage induced by radiation is generally not a simple Poisson process. Particles traversing the cell nucleus follow a Poisson process. Interaction with DNA is a random process. Repair–misrepair is another type of random process. For practical purposes the Poisson distribution is widely accepted; and in real cases, when compared with physical dosimetry, its assumption gives very accurate dose estimations. However it should be noted that in the formation of a dicentric there are many processes involved at different levels: atomic, molecular, organelle and whole cell.
Poisson-distributed events: Tracks intersecting cell nuclei. This is a way to explain a Poisson process. In this case the probability of a cell being traversed by a track follows the Poisson distribution. A nucleus can be traversed by 0, 1, 2 or more tracks. In biological dosimetry it is assumed that because the probability of a nucleus being traversed by a track is Poissonian, the final distribution of dicentrics among cells will also be Poisson.
The Poisson distribution is a discrete probability distribution that expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event. If the expected number of occurrences in this interval is λ, then the probability that there are exactly n occurrences (n being a whole number, n = 0, 1, 2, ...) is equal to: $P(n; \lambda) = \dfrac{\lambda^{n} e^{-\lambda}}{n!}$. This is the formal mathematical description of a Poisson distribution. The first 5 terms of the Poisson series are shown and for the dicentric assay this defines the probability that a cell contains 0, 1, 2, 3 or 4 dicentrics.
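The first five terms of the Poisson series can be evaluated directly. The sketch below is only an illustration: the function name `poisson_pmf` and the value λ = 0.5 dicentrics per cell are assumptions made for the example.

```python
import math


def poisson_pmf(n: int, lam: float) -> float:
    """Probability of exactly n events when the expected number is lam."""
    return lam ** n * math.exp(-lam) / math.factorial(n)


# Hypothetical mean yield of 0.5 dicentrics per cell (illustrative only).
lam = 0.5

# Probabilities that a cell contains 0, 1, 2, 3 or 4 dicentrics.
first_five = [poisson_pmf(n, lam) for n in range(5)]

for n, p in enumerate(first_five):
    print(f"P({n} dicentrics) = {p:.4f}")
```

Multiplying each probability by the number of cells scored gives the number of cells expected to carry 0, 1, 2, 3 or 4 dicentrics under the Poisson assumption.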
λ = frequency of dicentrics per cell. Figure: probability of a cell containing n dicentrics (n on the x axis), plotted for λ = 0.25, 0.50, 1 and 4. The Poisson distribution is clearly a discrete distribution because of course one can only score whole numbers of dicentrics in a cell. We use the continuous approximation to fit these curves. As λ increases the shape of the distribution tends to the normal distribution.
Test data for conformity to Poisson Because curve fitting methods are based on Poisson statistics, dicentric cell distribution should be tested for compliance with Poisson distribution for each dose point used to construct calibration curve
How to test? Parameter λ is not only the mean number of occurrences, but also its variance. One of the main characteristics of the Poisson distribution is that λ is not only its mean but also its variance. The basis of the test for Poisson is to calculate this ratio, variance : mean, which is known as the dispersion index (DI), to see if it equals 1. Values obtained, which can be higher or lower than 1.0, are then tested to see if they are significantly different from 1.0. This is done by the u-test. The mean (y) is calculated as the dicentrics per cell: total no. of dicentrics / total no. of cells scored. The variance (σ²) is the square of the standard deviation. It can be calculated by: $\sigma^2 = \dfrac{c_0 (0 - y)^2 + c_1 (1 - y)^2 + \dots + c_n (n - y)^2}{N - 1}$, where $c_k$ is the number of cells with k dicentrics and N is the number of cells scored. For a perfect Poisson distribution the dispersion index would be 1.
A normalized unit of DI is Papworth’s u test: $u = \left(\dfrac{\sigma^2}{y} - 1\right)\sqrt{\dfrac{N - 1}{2\,(1 - 1/X)}}$, where N indicates the number of cells analyzed and X the number of dicentrics detected. The most widely used method to test if the observed variance and mean are equal is the u-test (u stands for unit normal deviate). There are other statistical tests but in the biological dosimetry field this test, which was developed by David Papworth working in a cytogenetics research environment, has been adopted universally. Values of u greater than +1.96 are considered to indicate overdispersion. This is sometimes encountered and there are clear explanations (non-uniform exposure; high LET). Values less than -1.96 indicate underdispersion but rarely occur and there are no obvious physics or biological explanations. The ±1.96 limits correspond to a two-sided significance level of 0.05, i.e., 0.025 in each tail.
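The dispersion index and the u value can be computed from a cell distribution in a few lines. This is a minimal sketch, assuming the usual form of Papworth's u statistic, u = (σ²/y − 1)·√[(N − 1) / (2(1 − 1/X))]; the function name `dispersion_test` is hypothetical. The example distribution is the 0.25 Gy point of the calibration data used in this lecture: 2008 cells and 22 dicentrics, with one cell carrying 2 dicentrics, which implies 1987 cells with none and 20 cells with one.

```python
import math


def dispersion_test(distribution):
    """Return (mean, dispersion index, u) for a cell distribution.

    distribution[k] is the number of cells scored with k dicentrics.
    """
    n_cells = sum(distribution)
    n_dic = sum(k * cells for k, cells in enumerate(distribution))
    if n_cells < 2 or n_dic < 2:
        raise ValueError("need at least 2 cells and 2 dicentrics")
    mean = n_dic / n_cells
    # Sample variance of the number of dicentrics per cell.
    variance = sum(
        cells * (k - mean) ** 2 for k, cells in enumerate(distribution)
    ) / (n_cells - 1)
    di = variance / mean
    u = (di - 1.0) * math.sqrt((n_cells - 1) / (2.0 * (1.0 - 1.0 / n_dic)))
    return mean, di, u


# 0.25 Gy point: 1987 cells with 0, 20 cells with 1, 1 cell with 2 dicentrics.
mean, di, u = dispersion_test([1987, 20, 1])
print(f"mean = {mean:.4f}, DI = {di:.2f}, u = {u:.2f}")
```

For this dose point the sketch gives DI ≈ 1.08 and u ≈ 2.61, so the point is flagged as slightly overdispersed because u exceeds 1.96.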
Cell distribution of dicentrics. Columns: dose (Gy) N X cell distribution of dicentrics D u 1 2 3 4 5 6
γ-rays (Cobalt-60)
0.000 5000 8 4992 1.00 -0.07
0.100 5002 14 4988 -0.13
0.250 2008 22 1987 20 1.08 2.61
0.500 2002 55 1947 0.97 -0.86
0.750 1832 100 1736 92 1.03 0.79
1.000 1168 109 1064 99 -0.02
1.500 562 474 76 12 1.06
2.000 332 103 251 63 17 1.14 1.82
3.000 193 108 104 72 15 0.83 -1.64
4.000 35 41 21 0.88 -0.84
5.000 59 107 11 19 9 1.15 0.81
20 MeV 4He particles 7
2000 1997 -0.04
0.051 900 881 0.98 -0.44
0.104 1029 27 1004 23 1.12 2.84
0.511 1136 199 960 154 1.07 1.60
1.010 304 217 69 1.09
1.536 142 96 75 40 25 -0.20
2.050 137 120 44 16 1.20 1.65
2.526 144 148 66 34 1.40 3.40
3.029 98 47 1.56 3.93
This illustrates that the assumption of Poisson works well for low-LET radiation, but for high LET radiation types it is widely accepted that the dicentric cell distribution is usually overdispersed, with u values greater than 1.96. After low-LET radiation exposure only one dose shows overdispersion, whereas after high-LET exposure three doses showed u values >1.96. A recommended paper to follow up this concept further is: Edwards, A.A., Lloyd, D.C., and Purrott, R.J., Radiation induced chromosome aberrations and the Poisson distribution. Radiat. Environ. Biophys. 16 (1979) 89-100.
Fitting dose response data to a curve. The objective of curve fitting is to find values for C, α and β for which the curve is closest to the observed data points.
Fitting X ray data. 10 doses, in Gy, equally distributed.

Dose (Gy)   Cells   Dicentrics   Var/mean
0.0         5000    8            1
0.1         5002    14           1
0.25        2008    22           1
0.5         2002    55           1
0.75        1832    100          1
1.0         1168    109          1
1.5         562     100          1
2.0         332     103          1
3.0         193     108          1
4.0         103     103          1
5.0         59      107          1

Var/mean is set to 1 because the data conform with Poisson. Plot: observed frequency against dose. Here we have the basic scoring results which are tabulated in slide no. 19 for the x-ray calibration curve. The dicentric distributions among the scored cells were tested, by variance / mean and u-value, and all except one were within ± 1.96. The exception was only slightly overdispersed and the distribution table shows that this was due to just one cell having been scored with 2 dicentrics. Clearly the overall data set does not deviate from Poisson so, for curve fitting, 1.0 should be used at each dose rather than the absolute values of the ratios. With this information we can construct a dose-effect curve.
Y = C + αD + βD², fitted by maximizing the likelihood of the observations by the method of iteratively reweighted least squares, assuming the Poisson distribution. This, in effect, is finding the coefficients that make the data fit best to the model.
Iteratively Reweighted Least Squares. A common criterion for closeness is the sum of squared differences: $SSD = \sum (Y_0 - Y_f)^2$. Essentially this is testing the closeness of each of the data points to the best linear quadratic curve through the points. The technique is called iteratively reweighted least squares. Mathematically you are minimizing the difference between the actual data points and the fitted curve. In the equation Y0 is the observed dicentric yield and Yf is the yield of the function Y = C + αD + βD². For a Poisson set of data this gives the same result as an alternative commonly used method called maximum likelihood fitting.
The more accurate a data point is, because more aberrations have been scored, the closer the curve should lie to that point. Accuracy (%) = (SE / Y0) × 100.

Dose (Gy)   Cells   Aberrations   Accuracy (%)
0           5000    8             35.4
0.1         5002    14            26.7
0.25        2008    22            21.3
0.5         2002    55            13.5
0.75        1832    100           10.0
1           1168    109           9.6
1.5         562     100           10.0
2           332     103           9.9
3           193     108           9.6
4           103     103           9.9
5           59      107           9.7

The low-dose points at the top of the table are the least accurate and the high-dose points at the bottom are the most accurate. This means that the first coefficients obtained have a tendency to be closer to the higher doses, where more dicentrics have been scored. SE is the standard error and Y0 is the observed dicentric yield.
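The accuracy column can be reproduced from the aberration counts alone. The sketch below assumes Poisson counting statistics, so that the standard error of a yield Y0 = X/N is √X/N and the ratio SE/Y0 reduces to 1/√X; the function name `relative_se_percent` is hypothetical.

```python
import math


def relative_se_percent(n_aberrations: int) -> float:
    """Relative standard error (%) of a yield under Poisson statistics.

    With X aberrations in N cells, Y0 = X/N and SE = sqrt(X)/N,
    so SE/Y0 * 100 = 100 / sqrt(X), independent of N.
    """
    if n_aberrations <= 0:
        raise ValueError("at least one aberration is needed")
    return 100.0 / math.sqrt(n_aberrations)


# Aberration counts of the X ray calibration data, from 0 Gy up to 5 Gy.
aberrations = [8, 14, 22, 55, 100, 109, 100, 103, 108, 103, 107]
accuracy = [round(relative_se_percent(x), 1) for x in aberrations]
print(accuracy)
```

Because the ratio depends only on the number of aberrations scored, aiming for about 100 dicentrics per dose point corresponds to a relative standard error of about 10%.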
Iteratively Reweighted Least Squares. A common approach is to minimize the SSD with the data points weighted by the inverse of their variance. Because for a Poisson distribution the variance is equal to the mean, the weight used is the inverse of the mean. To address this problem each point is weighted by the inverse of its variance. Because for Poisson the variance and the mean have the same value, most programs use the inverse of the mean as the weight.
Handling overdispersion when curve fitting. For overdispersed (non-Poisson) distributions, as obtained after high LET radiation, the weights must take into account the overdispersion. If the data show a statistically significant trend of σ²/y with dose, then that trend should be used. Otherwise, the Poisson weight on each data point should be divided by the average value of σ²/y. Where most data points show overdispersion of dicentrics, such as is expected with high LET radiations, the weights need to be adjusted. One needs to inspect the data to see whether there is a systematic trend of increasing overdispersion with dose. If so, this trend should be used. Otherwise if there is just general overdispersion, but no obvious trend, then an average value should be calculated to be applied to each dose point. Currently, from reviewing published high LET dose response data sets, it is not clear whether a systematic trend with dose is the normal situation so each data set should be considered on a case-by-case basis.
Iteratively Reweighted Least Squares. First coefficients are obtained by minimising the expression Σ w (yo − yf)², where yo is the observed yield, yf the expected yield from a linear-quadratic model and w = 1/(yo/N) is the weighting factor. N is the number of cells at each dose. Then the coefficients are recalculated using as weighting factor w = 1/(yf/N), obtaining new coefficients and new expected frequencies y’f for each dose. This procedure is repeated with new weighting factors, w = 1/(y’f/N), and so on, until the coefficients do not vary. Note that the variances are based on the fitted means, not the Poisson overdispersion variance.
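The iteration described above can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not the code of any biodosimetry package: the function names `solve3` and `fit_lq_irls`, the tolerance and the iteration limit are all hypothetical choices. It uses the X ray calibration data of this lecture and the weights w = N/yo for the first pass and w = N/yf afterwards.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def fit_lq_irls(doses, cells, dics, tol=1e-10, max_iter=200):
    """Fit Y = C + alpha*D + beta*D**2 by iteratively reweighted least squares."""
    y_obs = [x / n for x, n in zip(dics, cells)]
    basis = [(1.0, d, d * d) for d in doses]
    y_fit = y_obs[:]            # first pass: weights from the observed yields
    coef = [0.0, 0.0, 0.0]
    for _ in range(max_iter):
        if min(y_fit) <= 0:
            raise ValueError("non-positive yield: Poisson weight undefined")
        w = [n / y for n, y in zip(cells, y_fit)]   # w = 1 / (y / N)
        # Weighted normal equations (A^T W A) coef = A^T W y_obs.
        ata = [[sum(wi * b[i] * b[j] for wi, b in zip(w, basis))
                for j in range(3)] for i in range(3)]
        aty = [sum(wi * b[i] * yi for wi, b, yi in zip(w, basis, y_obs))
               for i in range(3)]
        new_coef = solve3(ata, aty)
        y_fit = [sum(c * bi for c, bi in zip(new_coef, b)) for b in basis]
        converged = max(abs(p - q) for p, q in zip(new_coef, coef)) < tol
        coef = new_coef
        if converged:
            break
    return coef


doses = [0.0, 0.1, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
cells = [5000, 5002, 2008, 2002, 1832, 1168, 562, 332, 193, 103, 59]
dics = [8, 14, 22, 55, 100, 109, 100, 103, 108, 103, 107]

c_fit, alpha_fit, beta_fit = fit_lq_irls(doses, cells, dics)
print(f"C = {c_fit:.4f}, alpha = {alpha_fit:.4f}, beta = {beta_fit:.4f}")
```

The sketch stops if a fitted or observed yield is not positive, because the Poisson weight is then undefined; a dose point with zero dicentrics would need separate handling. It does not compute standard errors or apply any overdispersion correction.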
Finally, the goodness of fit of the curve and the significance of the fitted α and β coefficients should then be tested, for instance using the Chi-squared (χ²) test and an appropriate form of the F-test (e.g., F-test, z-test or t-test) respectively. If there is evidence of lack of fit (i.e. χ² is greater than the degrees of freedom (DF)), then the standard error should be increased by (χ²/DF)^(1/2). Additionally, as most programs calculate the SE values based on the sum of squares, instead of the Poisson estimate of variance, it can be considered good practice to increase the SE by (DF/χ²)^(1/2).
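The lack-of-fit adjustment can be illustrated as follows. This sketch assumes a Pearson χ² computed on dicentric counts with the Poisson variance equal to the fitted count, and DF equal to the number of dose points minus the three fitted coefficients; the function names and all numbers in the example are hypothetical.

```python
import math


def poisson_chi_squared(observed_counts, fitted_counts):
    """Pearson chi-squared with Poisson variance equal to the fitted count."""
    return sum(
        (obs - fit) ** 2 / fit
        for obs, fit in zip(observed_counts, fitted_counts)
    )


def adjust_se_for_lack_of_fit(se, chi2, df):
    """Inflate a standard error by sqrt(chi2/df) when chi2 exceeds df."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    if chi2 > df:
        return se * math.sqrt(chi2 / df)
    return se


# Hypothetical observed and fitted dicentric counts at five dose points.
observed = [10, 22, 48, 95, 160]
fitted = [12.0, 20.0, 50.0, 90.0, 165.0]

chi2 = poisson_chi_squared(observed, fitted)
df = len(observed) - 3          # three fitted coefficients: C, alpha, beta
se_alpha = 0.005                # hypothetical standard error of alpha
se_alpha_adjusted = adjust_se_for_lack_of_fit(se_alpha, chi2, df)
```

In this made-up example χ² is smaller than DF, so the standard error is left unchanged; it would be inflated only if χ² exceeded DF.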
Examples of testing the goodness of fit for two dose response curves. The p values shown indicate that the fitted data points were not statistically different from the observed ones, confirming a good fit. Moreover the significance of the linear and quadratic coefficients was also confirmed by the F-test; for each coefficient the F value was higher than 3.44 (the cut-off value for F.05 [8, 8]) and the z value was higher than 1.96.
Several programs can be used to obtain the coefficients. Some of them can only be used for fitting, while others are more user friendly and versatile, with many other biodosimetry applications. Do not despair! You do not need to blow the dust off your old student statistics text books. A lot of work has been done for you. Software is available. In particular two packages, CABAS and Dose Estimate, have been assembled by practicing biodosimetry laboratories. These are user-friendly, PC-based packages into which you essentially just need to insert the basic scoring data, and they undertake the Poisson testing and curve fitting for you. Moreover these packages also carry out other statistical operations that are specific to biodosimetry such as contaminated Poisson and Qdr partial body dose estimations. The Dose Estimate package in particular has also assembled a number of other useful standard statistical tools such as t-testing and chi-squared.