WFM 5201: Data Management and Statistical Analysis
Lecture-11: Frequency Analysis
Akm Saiful Islam
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
June, 2008
Frequency Analysis
Plotting Position Formula and Probability Plot
Analytical Frequency Analysis: Normal and Log-normal distributions; Gumbel's Extreme Value Type I distribution; Log-Pearson Type III distribution
Introduction to Frequency Analysis
The magnitude of an extreme event is inversely related to its frequency of occurrence, very severe events occurring less frequently than more moderate events. The objective of frequency analysis is to relate the magnitude of extreme events to their frequency of occurrence through the use of probability distributions. Frequency analysis is defined as the investigation of population sample data to estimate recurrence or probabilities of magnitudes. It is one of the earliest and most frequent uses of statistics in hydrology and natural sciences. Early applications of frequency analysis were largely in the area of flood flow estimation. Today nearly every phase of hydrology and natural sciences is subjected to frequency analyses.
Methods
Two methods of frequency analysis are described: a straightforward plotting technique that yields the cumulative distribution, and an analytical technique that uses frequency factors. The cumulative distribution function provides a rapid means of determining the probability of an event equal to or less than some specified quantity; its inverse is used to obtain the recurrence intervals. Analytical frequency analysis is a simplified technique based on frequency factors, which depend on the distributional assumption that is made and on the mean, the variance and, for some distributions, the coefficient of skew of the data.
Plotting Position Formula
The frequency of an event can be obtained by use of “plotting position” formulas. A commonly used general form is
$$P = \frac{m - a}{n - a - c + 1}$$
where
P = the probability of occurrence
n = the number of values
m = the rank of descending values, with the largest equal to 1
T = 1/P = the recurrence interval (return period)
a = c = parameters depending on n
Plotting Position relationship
Parameters
n    10     20     30     40     50
a    0.448  0.443  0.442  0.441  0.440
n    60     70     80     90     100
a    0.439
a is generally recommended as 0.4.
For the normal distribution, a = 3/8.
For Gumbel’s (EV1) distribution, a = 0.4.
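A minimal sketch of how the parameter a changes a plotting position. It assumes the general form P = (m - a)/(n + 1 - 2a), which is an assumption because the formula itself did not survive on the slide; the function name `plotting_position` is hypothetical. Setting a = 0 reduces it to the Weibull formula m/(n + 1) used in Exercise-1.

```python
def plotting_position(m, n, a=0.4):
    """Exceedance probability for rank m (1 = largest) out of n values.

    Assumed general form: P = (m - a) / (n + 1 - 2a).
    """
    if not 1 <= m <= n:
        raise ValueError("rank m must satisfy 1 <= m <= n")
    return (m - a) / (n + 1 - 2 * a)


n = 23  # record length used in Exercise-1
# a = 0 (Weibull), a = 3/8 (normal), a = 0.4 (generally recommended value)
for a in (0.0, 3 / 8, 0.4):
    p = plotting_position(1, n, a)
    print(f"a = {a:.3f}: P = {p:.4f}, T = {1 / p:.1f} years")
```

The largest observation receives a longer return period as a increases, which is why the choice of a matters most at the extremes of the record.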
Exercise-1: Using the 23 years of annual precipitation depths for a station given in the table below, estimate the exceedance frequency and recurrence intervals of the highest ten values using the Weibull equation.
Here, n = 23

Year              Rain depth (in)  Rank, m  P = m/(n+1)  Tr (year)
1981              24               1        0.042        24
1986, 1988        23               3        0.125        8
1978              21               4        0.167        6
1993              20               5        0.208        4.8
1999, 1998, 1980  19               8        0.333        3
1990, 1983        18               10       0.417        2.4
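The Weibull values in Exercise-1 can be checked with a short sketch. It uses only P = m/(n + 1) and Tr = 1/P with n = 23 from the exercise; the function name `weibull_position` is hypothetical. It lists every rank from 1 to 10 separately; how tied depths are assigned a rank is a separate choice that the sketch does not make.

```python
def weibull_position(m, n):
    """Weibull plotting position: exceedance probability of rank m of n."""
    return m / (n + 1)


n = 23  # years of record in Exercise-1

rows = []
for m in range(1, 11):              # the ten highest values
    p = weibull_position(m, n)
    tr = (n + 1) / m                # recurrence interval, Tr = 1/P
    rows.append((m, round(p, 3), round(tr, 1)))

print("Rank  P      Tr (year)")
for m, p, tr in rows:
    print(f"{m:<5} {p:<6} {tr}")
```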
Probability plot
A probability plot is a plot of a magnitude versus a probability. Determining the probability to assign to a data point is commonly referred to as determining the plotting position.
Plotting position may be expressed as a probability from 0 to 1 or a percent from 0 to 100. Which method is being used should be clear from the context. In some discussions of probability plotting, especially in hydrologic literature, the probability scale is used to denote $\text{prob}(X \ge x)$ or $1 - P_X(x)$. One can always transform the probability scale to $P_X(x)$ or even $T_X(x)$ if desired.
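The conversions between these probability scales are simple enough to sketch. The function names below are hypothetical, and the sketch assumes the usual definitions: exceedance probability is one minus the cumulative probability, and the return period is the reciprocal of the exceedance probability.

```python
def exceedance_from_cdf(p_cdf):
    """prob(X > x) from the cumulative probability prob(X <= x)."""
    return 1.0 - p_cdf


def return_period(p_exceed):
    """Recurrence interval T = 1 / prob(X > x)."""
    if not 0.0 < p_exceed <= 1.0:
        raise ValueError("exceedance probability must be in (0, 1]")
    return 1.0 / p_exceed


def to_percent(p):
    """Express a probability on the 0 to 100 percent scale."""
    return 100.0 * p


for p_cdf in (0.5, 0.9, 0.99):
    p_exc = exceedance_from_cdf(p_cdf)
    print(f"P = {p_cdf:.2f}  1-P = {p_exc:.2f} ({to_percent(p_exc):.0f}%)  "
          f"T = {return_period(p_exc):.0f}")
```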
Gumbel’s (1958) Criteria
1. The plotting position must be such that all observations can be plotted.
2. The plotting position should lie between the observed frequencies (m-1)/n and m/n, where m is the rank of the observation beginning with m = 1 for the largest (smallest) value and n is the number of years of record (if applicable) or the number of observations.
3. The return period of a value equal to or larger than the largest observation and the return period of a value equal to or smaller than the smallest observation should converge toward n.
4. The observations should be equally spaced on the frequency scale.
5. The plotting position should have an intuitive meaning, be analytically simple, and be easy to use.
Steps for probability plot
1. Rank the data from the largest (smallest) to the smallest (largest) value. If two or more observations have the same value, several procedures can be used for assigning a plotting position.
2. Calculate the plotting position of each data point from the plotting position relationship table presented on the earlier slide.
3. Select the type of probability paper to be used.
4. Plot the observations on the probability paper.
5. Draw a theoretical normal line. For the normal distribution, the line should pass through the mean plus one standard deviation at 84.1% and the mean minus one standard deviation at 15.9%.
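The two anchor points of the theoretical normal line in the last step can be checked with a short sketch. It uses the standard library `statistics.NormalDist`; the mean and standard deviation are illustrative values only (the ones quoted in Exercise-2), and the percentages are cumulative (non-exceedance) probabilities.

```python
from statistics import NormalDist

std_normal = NormalDist()          # mean 0, standard deviation 1

p_upper = std_normal.cdf(1.0)      # prob(Z <= +1)
p_lower = std_normal.cdf(-1.0)     # prob(Z <= -1)

# Illustrative sample statistics (assumed for this sketch).
mean, s = 41.5, 6.7

# Two points that define the straight line on normal probability paper.
line_points = [
    (100 * p_lower, mean - s),
    (100 * p_upper, mean + s),
]
for percent, value in line_points:
    print(f"{percent:.1f}% -> {value:.1f}")
```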
Analytical Frequency Analysis
Chow has proposed
$$x_T = \bar{x} + K s$$
where $x_T$ is the magnitude of the event with return period T, K is the frequency factor, s is the standard deviation and $\bar{x}$ is the mean value.
Methods of Analytical Frequency Analysis
Normal distribution; Log-normal distribution; Gumbel’s Extreme Value Type I distribution; Log-Pearson Type III distribution
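Normal Distribution
The probability that X is less than or equal to x when X is $N(\mu, \sigma^2)$ can be evaluated from
$$\text{prob}(X \le x) = P_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(t-\mu)^2}{2\sigma^2}\right] dt \qquad (4.9)$$
The parameters $\mu$ (mean) and $\sigma^2$ (variance) are denoted as location and scale parameters, respectively. The normal distribution is a bell-shaped, continuous and symmetrical distribution (the coefficient of skew is zero).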
Standard normal distribution
The probability that X is less than or equal to x when X is $N(\mu, \sigma^2)$ can be evaluated from equation (4.9).
Equation (4.9) cannot be evaluated analytically, so approximate methods of integration are required. If a tabulation of the integral was made, a separate table would be required for each value of $\mu$ and $\sigma^2$. By using the linear transformation $Z = (X - \mu)/\sigma$, the random variable Z will be N(0,1). The random variable Z is said to be standardized (has $\mu = 0$ and $\sigma^2 = 1$) and N(0,1) is said to be the standard normal distribution. The standard normal distribution is given by
$$p_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \quad -\infty < z < \infty \qquad (4.10)$$
and the cumulative standard normal is given by
$$P_Z(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\, dt \qquad (4.11)$$
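A minimal sketch of what "approximate methods of integration" can look like in practice. It evaluates the cumulative standard normal two ways: with the error function from the standard library, and with a composite Simpson's rule applied to the standard normal density. The function names, the lower integration limit of -10 and the number of steps are assumptions made for illustration, not part of the lecture.

```python
import math


def standardize(x, mu, sigma):
    """Linear transformation z = (x - mu) / sigma."""
    return (x - mu) / sigma


def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf_erf(z):
    """Cumulative standard normal via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def std_normal_cdf_simpson(z, steps=2000, lower=-10.0):
    """Cumulative standard normal by composite Simpson's rule.

    Integrates the density from `lower` (assumed far enough into the
    tail to be negligible) up to z; `steps` must be even.
    """
    h = (z - lower) / steps
    total = std_normal_pdf(lower) + std_normal_pdf(z)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * std_normal_pdf(lower + i * h)
    return total * h / 3.0


z = 2.33
print(f"erf:     {std_normal_cdf_erf(z):.4f}")
print(f"Simpson: {std_normal_cdf_simpson(z):.4f}")
```

Because any normal variable can be standardized first, one routine (or one table) for N(0,1) is enough for every mean and variance.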
Figure 4.2.1.3 Standard normal distribution
Figure 4.2.1.3 shows the standard normal distribution which, along with the transformation $Z = (X - \mu)/\sigma$, contains all of the information shown in Figures 4.1 and 4.2. Both $p_Z(z)$ and $P_Z(z)$ are widely tabulated. Most tables utilize the symmetry of the normal distribution so that only positive values of Z are shown. Tables of $P_Z(z)$ may show $\text{prob}(Z \le z)$ or $\text{prob}(0 \le Z \le z)$. Care must be exercised when using normal probability tables to see what values are tabulated.
Exercise-2: Assume the following data follows a normal distribution. Find the rain depth that would have a recurrence interval of 100 years.

Year   Annual Rainfall (in)
2000   43
1999   44
1998   38
1997   31
1996   47
…..    …..
Solution:
Mean = 41.5 in, St. Dev = 6.7 in (given)
x = Mean + St. Dev × z = 41.5 + z(6.7)
P(z) = 1/T = 1/100 = 0.01
F(z) = 1.0 - P(z) = 0.99
From interpolation using Table E.4, z = 2.33
x = 41.5 + (2.33 × 6.7) = 57.11 in
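The solution can be reproduced with a short sketch that compares the table value z = 2.33 with the value returned by the standard library's `statistics.NormalDist.inv_cdf`. The mean and standard deviation are the given values from the exercise; the variable names are illustrative.

```python
from statistics import NormalDist

mean, s = 41.5, 6.7        # given in the exercise (inches)
T = 100                    # recurrence interval in years

p_exceed = 1 / T           # P(z) = 0.01
p_cdf = 1 - p_exceed       # F(z) = 0.99

z_table = 2.33                           # value read from the z-table
z_exact = NormalDist().inv_cdf(p_cdf)    # inverse of the standard normal CDF

x_table = mean + z_table * s
x_exact = mean + z_exact * s

print(f"z (table) = {z_table:.2f}, x = {x_table:.2f} in")
print(f"z (exact) = {z_exact:.4f}, x = {x_exact:.2f} in")
```

The two answers differ only in the second decimal place, so rounding z to the table precision is adequate here.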
Table: Area under standardized normal distribution
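Linear Interpolation from Z-table
For p = 0.99, find z.
From the table, z1 = 2.32 with p1 = 0.9898 and z2 = 2.33 with p2 = 0.9901.
Linear interpolation gives $z = z_1 + (z_2 - z_1)\,\frac{p - p_1}{p_2 - p_1} = 2.32 + 0.01 \times \frac{0.0002}{0.0003} \approx 2.327$, which rounds to 2.33.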
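The interpolation between the two table entries can be written as a small sketch. The function name `interpolate_z` is hypothetical; the numbers are the table entries quoted above.

```python
def interpolate_z(p, z1, p1, z2, p2):
    """Linearly interpolate z for probability p between two table entries."""
    if not min(p1, p2) <= p <= max(p1, p2):
        raise ValueError("p must lie between p1 and p2")
    return z1 + (z2 - z1) * (p - p1) / (p2 - p1)


z = interpolate_z(0.99, z1=2.32, p1=0.9898, z2=2.33, p2=0.9901)
print(f"z = {z:.4f}")   # rounds to 2.33 at table precision
```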