1 Stochastic Hydrology Fundamentals of Hydrological Frequency Analysis
Professor Ke-sheng Cheng, Department of Bioenvironmental Systems Engineering, National Taiwan University
8/3/2019 Dept. of Bioenvironmental Systems Engineering, National Taiwan University

2 General concept of hydrological frequency analysis
Hydrological frequency analysis is the work of determining the magnitude of a hydrological variable that corresponds to a given exceedance probability. Frequency analysis can be conducted for many hydrological variables, including floods, rainfall, and droughts. The work is easier to understand if the variable of interest is treated as a random variable.

3 Let X represent the hydrological (random) variable under investigation
A value xc is chosen such that an event is said to occur if X assumes a value exceeding xc. Each time a random experiment (or trial) is conducted, the event may or may not occur. We are interested in the number of Bernoulli trials needed for the first success to occur. This number can be described by the geometric distribution.

4 Geometric distribution
The geometric distribution gives the probability that the first success occurs on the x-th trial in a sequence of independent and identical Bernoulli trials.
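The formula on this slide was not preserved in the transcript. Assuming p denotes the probability of success in a single trial (a symbol the text itself does not give), the standard geometric probability mass function can be written as:

```latex
P(X = x) = (1 - p)^{x-1}\, p, \qquad x = 1, 2, 3, \ldots
```

The first x-1 trials must all fail, each with probability 1-p, and the x-th trial must succeed.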

6 Recurrence interval vs return period
Average number of trials to achieve the first success.
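As a rough numerical illustration, the sketch below sums x * P(X = x) for a geometric distribution and compares the result with 1/p. The success probability `p` (here read as the probability that X exceeds xc in one trial, for example one year) and its value are assumptions chosen for the example, not values from the slides.

```python
def geometric_pmf(x: int, p: float) -> float:
    """Probability that the first success occurs on trial x."""
    return (1.0 - p) ** (x - 1) * p


def mean_trials_to_first_success(p: float, max_trials: int = 10_000) -> float:
    """Approximate E[X] by summing x * P(X = x) over a long, truncated range."""
    return sum(x * geometric_pmf(x, p) for x in range(1, max_trials + 1))


p = 0.01                      # hypothetical exceedance probability per trial
return_period = 1.0 / p       # average number of trials to the first exceedance
approx_mean = mean_trials_to_first_success(p)
print(return_period, round(approx_mean, 6))
```

The truncated sum agrees with 1/p, which is why an event with exceedance probability p per trial is said to have an average return period of 1/p trials.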

8 The general equation of frequency analysis
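The equation on this slide did not survive extraction. The form usually referred to by this name in hydrology textbooks is the frequency-factor equation; the symbols below (K_T for the frequency factor, mu and sigma for the mean and standard deviation of X) are assumed notation, not recovered from the slide:

```latex
x_T = \mu + K_T\,\sigma
```

Here x_T is the magnitude with return period T, and K_T depends on T and on the distribution adopted for X.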

10 Collecting required data.
Estimating the mean, standard deviation, and coefficient of skewness.
Determining the appropriate distribution.
Calculating xT using the general equation.
It is apparent that the calculation of xT involves determining the type of distribution for X and estimating its mean and standard deviation. The former can be done by goodness-of-fit (GOF) tests and the latter by parametric point estimation.
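The steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the general equation is taken to be the frequency-factor form xT = mean + K_T * std, the distribution is assumed to be Gumbel (EV1) purely for the example, and the sample values are hypothetical.

```python
import math
import statistics


def gumbel_frequency_factor(T: float) -> float:
    """Frequency factor K_T of the Gumbel (EV1) distribution for return period T > 1."""
    return -(math.sqrt(6.0) / math.pi) * (0.5772 + math.log(math.log(T / (T - 1.0))))


def design_value(sample, T: float) -> float:
    """x_T = mean + K_T * std, with moments estimated from the sample."""
    mean = statistics.mean(sample)
    std = statistics.stdev(sample)
    return mean + gumbel_frequency_factor(T) * std


# Hypothetical annual maximum rainfall depths (mm); not data from the slides.
annual_max = [112.0, 95.5, 130.2, 88.1, 150.7, 101.3, 122.8, 99.4, 140.6, 108.9]
x100 = design_value(annual_max, 100)
print(round(gumbel_frequency_factor(100), 3), round(x100, 1))
```

In practice the distribution would be chosen by a GOF test rather than assumed, and a different distribution would need its own frequency factor.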

11 Data series for frequency analysis
Complete duration series: a complete duration series consists of all the observed data.
Partial duration series: a partial duration series is a series of data selected so that their magnitude is greater than a predefined base value. If the base value is selected so that the number of values in the series equals the number of years of record, the series is called an “annual exceedance series”.

12 Peak-over-threshold series and data independence
Extreme value series: an extreme value series is a data series that includes the largest or smallest values occurring in each of the equally long time intervals of the record. If the time interval is taken as one year and the largest values are used, we have an “annual maximum series”.
Annual exceedance series and annual maximum series are different.
Peak-over-threshold series
Data independence: why is it important?
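A small sketch may help show why the annual exceedance series and the annual maximum series differ. The function names and the (year, value) records are hypothetical, made up for the example.

```python
from collections import defaultdict


def annual_maximum_series(records):
    """Largest value in each year; records is a list of (year, value) pairs."""
    by_year = defaultdict(list)
    for year, value in records:
        by_year[year].append(value)
    return [max(by_year[year]) for year in sorted(by_year)]


def annual_exceedance_series(records):
    """The n largest values overall, where n is the number of years of record."""
    n_years = len({year for year, _ in records})
    values = sorted((value for _, value in records), reverse=True)
    return values[:n_years]


records = [(2001, 120), (2001, 95), (2001, 40),
           (2002, 30), (2002, 25),
           (2003, 80), (2003, 60)]
ams = annual_maximum_series(records)
aes = annual_exceedance_series(records)
print(ams, aes)
```

The exceedance series takes two values from 2001 and none from 2002. If those two values came from the same storm they would not be independent, which is one reason independence has to be checked for threshold-based series.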

13 Parameter estimation
Method of moments
Maximum likelihood method
Method of L-moments (gaining more attention in recent years)
Depending on the distribution type, parameter estimation may involve estimating the mean, standard deviation, and/or coefficient of skewness.
Parameter estimation is exemplified here by the gamma distribution.

14 Gamma distribution parameter estimation
The gamma distribution is a special case of the Pearson type III distribution (with zero location parameter).
Gamma density: $f_X(x) = \frac{1}{\alpha\,\Gamma(\beta)}\left(\frac{x}{\alpha}\right)^{\beta-1} e^{-x/\alpha}$, $x > 0$, with $\mu = \alpha\beta$, $\sigma = \alpha\sqrt{\beta}$, and $\gamma = 2/\sqrt{\beta}$, where $\mu$, $\sigma$, and $\gamma$ are the mean, standard deviation, and coefficient of skewness of X (or Y), respectively, and $\alpha$ and $\beta$ are respectively the scale and shape parameters of the gamma distribution.

15 MOM estimators
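The estimator formulas on this slide were not preserved. As a minimal sketch, assuming the two-parameter gamma distribution with mean = scale * shape and variance = scale^2 * shape, equating these to the sample mean and variance gives the method-of-moments estimates below. The function name and data are hypothetical.

```python
import statistics


def gamma_mom(sample):
    """Method-of-moments estimates (scale, shape) for a two-parameter gamma."""
    m = statistics.mean(sample)
    v = statistics.variance(sample)   # sample variance with the n - 1 divisor
    shape = m * m / v                 # from mean^2 / variance = shape
    scale = v / m                     # from variance / mean = scale
    return scale, shape


sample = [2.0, 4.0, 4.0, 6.0, 9.0]    # hypothetical data
scale_hat, shape_hat = gamma_mom(sample)
print(scale_hat, shape_hat)
```

The shape could alternatively be estimated from the sample skewness (shape = 4 / skewness^2), but skewness estimates are much noisier in small samples.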

16 Maximum likelihood estimator
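The derivation on this slide did not survive extraction. For reference, and assuming the two-parameter gamma density with scale alpha and shape beta (symbols assumed here), the log-likelihood and the resulting likelihood equations take the standard form:

```latex
\ln L(\alpha,\beta) = -n\beta\ln\alpha - n\ln\Gamma(\beta)
  + (\beta-1)\sum_{i=1}^{n}\ln x_i - \frac{1}{\alpha}\sum_{i=1}^{n} x_i

\hat{\alpha} = \frac{\bar{x}}{\hat{\beta}}, \qquad
\ln\hat{\beta} - \psi(\hat{\beta}) = \ln\bar{x} - \frac{1}{n}\sum_{i=1}^{n}\ln x_i
```

Here psi is the digamma function. The second equation has no closed-form solution and is solved numerically for the shape estimate.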

19 Evaluating bias of different estimators of coefficient of skewness

20 Evaluating mean square error of different estimators of coefficient of skewness
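The figures for these two slides were not preserved. One plausible way to carry out such an evaluation is a Monte Carlo experiment, sketched below. The two estimators, the gamma parent distribution, the sample size, and the number of replications are all assumptions made for illustration, not the settings used in the original slides.

```python
import math
import random


def skew_g1(x):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample adjustment)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


def skew_adjusted(x):
    """Adjusted estimator G1 = g1 * sqrt(n(n-1)) / (n-2)."""
    n = len(x)
    return skew_g1(x) * math.sqrt(n * (n - 1)) / (n - 2)


def bias_and_mse(estimator, shape, n, reps, seed):
    """Simulate gamma samples (scale 1) and return (bias, MSE) of the estimator."""
    rng = random.Random(seed)
    true_skew = 2.0 / math.sqrt(shape)          # skewness of a gamma distribution
    estimates = [estimator([rng.gammavariate(shape, 1.0) for _ in range(n)])
                 for _ in range(reps)]
    bias = sum(estimates) / reps - true_skew
    mse = sum((e - true_skew) ** 2 for e in estimates) / reps
    return bias, mse


bias_g1, mse_g1 = bias_and_mse(skew_g1, shape=1.0, n=20, reps=2000, seed=1)
bias_adj, mse_adj = bias_and_mse(skew_adjusted, shape=1.0, n=20, reps=2000, seed=1)
print(bias_g1, mse_g1, bias_adj, mse_adj)
```

Because both runs use the same seed, the two estimators are compared on identical simulated samples.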

23 Techniques for goodness-of-fit test
A good reference for detailed discussion of GOF tests is: Goodness-of-Fit Techniques, edited by R.B. D’Agostino and M.A. Stephens, 1986.
Probability plotting
Chi-square test
Kolmogorov-Smirnov test
Moment-ratio diagram method
L-moments-based GOF tests

24 Probability plotting
Fundamental concept
Probability papers
Empirical CDF vs theoretical CDF
Misuse of probability plotting

25 Suppose the true underlying distribution depends on a location parameter $\mu$ and a scale parameter $\sigma$ (they need not be the mean and standard deviation, respectively). The CDF of such a distribution can be written as $F(x) = G\left(\frac{x-\mu}{\sigma}\right) = G(z)$, where $Z = (X-\mu)/\sigma$ is referred to as the standardized variable and $G(z)$ is the CDF of Z. If the random sample is truly from the cumulative distribution $F(X)$, then $Z = G^{-1}(F(X))$ and X are linearly related. In practice, Z can be found by using $Z = G^{-1}(F_n(X))$.

26 Also let $F_n(X)$ represent the empirical cumulative distribution function (ECDF) of X based on a random sample of size n. A probability plot is a plot of $z = G^{-1}(F_n(x))$ on x, where x represents the observed values of the random variable X.
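As a sketch of how the plot coordinates could be computed, the code below takes G to be the standard normal CDF and evaluates the ECDF at the i-th smallest value as i/(n+1). Both choices are assumptions for the example; i/(n+1) is used instead of i/n so that the inverse CDF stays finite at the largest observation.

```python
from statistics import NormalDist


def probability_plot_points(sample):
    """Return (x, z) pairs with z = G^{-1}(F_n(x)), G being the standard normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = NormalDist()          # mean 0, standard deviation 1
    return [(x, std_normal.inv_cdf(i / (n + 1)))
            for i, x in enumerate(xs, start=1)]


sample = [12.1, 9.8, 15.3, 11.0, 13.4]   # hypothetical observations
points = probability_plot_points(sample)
for x, z in points:
    print(x, round(z, 3))
```

If the sample really comes from a normal distribution, the (x, z) points should fall close to a straight line whose slope and intercept reflect the scale and location parameters.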

28 Most of the plotting position methods are empirical
If n is the total number of values to be plotted and m is the rank of a value in a list ordered by descending magnitude, the exceedance probability of the mth largest value, $x_m$, is, for large n, $P(X \ge x_m) = m/n$. Plotting position formulas are shown in the following table.
[Table of plotting position formulas not recovered.]
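Since the table itself is missing, the sketch below uses only the Weibull formula, m/(n+1), which this presentation names later as the formula used in WRA practice. The function name and data are hypothetical.

```python
def weibull_plotting_positions(values):
    """Return (value, exceedance probability, return period) in descending order."""
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    rows = []
    for m, x in enumerate(ordered, start=1):   # m = 1 for the largest value
        p_exceed = m / (n + 1)
        rows.append((x, p_exceed, 1.0 / p_exceed))
    return rows


rows = weibull_plotting_positions([10.0, 30.0, 20.0, 40.0])
for row in rows:
    print(row)
```

With four values, the largest is assigned an exceedance probability of 1/5 rather than 1/4, so no observation is ever assigned probability 0 or 1.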

30 Misuse of probability plotting
Log Pearson Type III?

31 Misuse of probability plotting
48-hr rainfall depth. Log Pearson Type III?

32 Fitting a probability distribution to annual maximum series (Non-parametric GOF tests)
How do we fit a probability distribution to a random sample?
What type of distribution should be adopted?
What are the parameter values for the distribution?
How good is our fit?

33 Chi-square GOF test
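The slides detailing the test were not preserved. The following is a minimal sketch of one common form of the chi-square GOF test, under assumptions made here: a normal distribution fitted by the sample mean and standard deviation, k equiprobable classes, and degrees of freedom k - 1 - 2 because two parameters are estimated. The data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def chi_square_statistic(sample, k):
    """Chi-square statistic using k equiprobable classes under a fitted normal."""
    fitted = NormalDist(mean(sample), stdev(sample))
    n = len(sample)
    expected = n / k                                      # same count in every class
    edges = [fitted.inv_cdf(j / k) for j in range(1, k)]  # k - 1 interior boundaries
    observed = [0] * k
    for x in sample:
        observed[sum(x > e for e in edges)] += 1          # class index of x
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    dof = k - 1 - 2                                       # two estimated parameters
    return chi2, dof, observed


# Hypothetical annual maximum rainfall depths (mm).
sample = [112.0, 95.5, 130.2, 88.1, 150.7, 101.3, 122.8, 99.4, 140.6, 108.9,
          117.5, 93.2, 135.0, 126.4, 104.7, 97.8, 145.3, 110.1, 119.6, 91.0,
          128.7, 106.2, 115.4, 133.9, 102.5]
chi2, dof, observed = chi_square_statistic(sample, k=5)
critical_5pct = 5.991   # chi-square critical value for 2 degrees of freedom, alpha = 0.05
print(chi2, dof, observed, chi2 > critical_5pct)
```

With 25 values and 5 classes, each class has an expected count of 5, which meets the usual rule of thumb for the chi-square approximation.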

40 Chi-square Goodness-of-fit test in R

41 Kolmogorov-Smirnov GOF test
The chi-square test compares the empirical histogram against the theoretical histogram. By contrast, the K-S test compares the empirical cumulative distribution function (ECDF) against the theoretical CDF.

45 In order to measure the difference between $F_n(X)$ and $F(X)$, ECDF statistics based on the vertical distances between $F_n(X)$ and $F(X)$ have been proposed.
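The best-known statistic of this kind is the Kolmogorov-Smirnov statistic, the largest vertical distance between the ECDF and the hypothesized CDF. The sketch below is an illustration with hypothetical names and data. The 1.36/sqrt(n) value is the usual large-sample approximation to the 5% critical value and assumes F is fully specified in advance, not fitted to the same sample.

```python
import math


def ks_statistic(sample, cdf):
    """Largest vertical distance between the ECDF of the sample and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The ECDF jumps from (i-1)/n to i/n at x, so check both sides of the step.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_critical_5pct_large_n(n):
    """Asymptotic 5% critical value of the K-S statistic."""
    return 1.36 / math.sqrt(n)


# Hypothetical sample tested against the uniform(0, 1) CDF, F(x) = x.
d_n = ks_statistic([0.1, 0.4, 0.7], lambda x: x)
print(d_n)
```

When the parameters of F are estimated from the same data, the standard critical values make the test conservative, so the null hypothesis is rejected less often than the nominal level suggests.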

47 Stochastic convergence
Almost-sure convergence, or convergence with probability 1

51 Hypothesis test using Dn

52 Critical values of $D_n$ for the Kolmogorov-Smirnov test

53 K-S Goodness-of-fit test in R (ks.test)

55 Interpretation of the probability distribution of the test statistic

57 IDF curve fitting using Horner’s equation
The intensity-duration-frequency (IDF) relationship of the design storm depths
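The equation on this slide was not preserved. A commonly quoted form of Horner's equation for a single return period is i = a / (t + b)^c, and the sketch below fits that assumed form. For each candidate b, taking logarithms gives a straight line, ln i = ln a - c ln(t + b), which is fitted by least squares, and the b with the smallest residual sum of squares is kept. The parameter names, grid, units, and synthetic data are all assumptions for illustration.

```python
import math


def fit_horner(durations, intensities, b_grid):
    """Fit i = a / (t + b)**c by scanning b and regressing ln i on ln(t + b)."""
    y = [math.log(i) for i in intensities]
    n = len(y)
    y_bar = sum(y) / n
    best = None
    for b in b_grid:
        x = [math.log(t + b) for t in durations]
        x_bar = sum(x) / n
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        slope = sxy / sxx
        intercept = y_bar - slope * x_bar
        sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
        if best is None or sse < best[0]:
            best = (sse, math.exp(intercept), b, -slope)
    _, a, b, c = best
    return a, b, c


# Synthetic intensities generated from known parameters (durations in minutes).
durations = [10, 30, 60, 120, 360, 720, 1440]
intensities = [1500.0 / (t + 10) ** 0.7 for t in durations]
a_hat, b_hat, c_hat = fit_horner(durations, intensities, b_grid=range(0, 31))
print(a_hat, b_hat, c_hat)
```

Fitting one curve per return period in this way yields one parameter set (a, b, c) for each return period.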

58 DDF curves

59 IDF curves

61 Alternative IDF fitting (Return-period specific)

64 Further discussions on frequency analysis
Extracting annual maximum series
Probabilistic interpretation of the design total depth
Joint distribution of duration and total depth
Selection of the best-fit distribution

65 Annual maximum series
Data in an annual maximum series are considered IID and therefore form a random sample. For a given design duration tr, we continuously move a window of size tr along the time axis and select, in each year, the maximum total depth within the window. Determination of the annual maximum rainfall is NOT based on the real storm duration; instead, an artificially chosen design duration is used.
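The moving-window extraction described above can be sketched as follows. The function name, the hourly time step, and the rainfall values are hypothetical. As a simplification, windows are kept within each year and do not span the year boundary.

```python
def annual_max_depth(hourly_by_year, tr):
    """For each year, the largest total depth in any window of tr consecutive hours."""
    result = {}
    for year, series in hourly_by_year.items():
        window = sum(series[:tr])          # total depth of the first window
        best = window
        for k in range(tr, len(series)):
            # Slide the window one step: add the new hour, drop the oldest one.
            window += series[k] - series[k - tr]
            best = max(best, window)
        result[year] = best
    return result


# Hypothetical hourly rainfall depths (mm) for two short "years".
hourly = {2001: [0, 5, 10, 2, 0, 0, 8],
          2002: [1, 1, 1, 20, 0, 0, 0]}
ams_2h = annual_max_depth(hourly, tr=2)
ams_3h = annual_max_depth(hourly, tr=3)
print(ams_2h, ams_3h)
```

The window does not care where a storm starts or ends, which is the point made on the slide: the resulting maxima are totals over the design duration, not totals of real storm events.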

66 Random sample for estimation of design storm depth
The design storm depth of a specified duration with return period T is the value of D(tr) whose probability of exceedance equals 1/T. Estimation of the design storm depth requires collecting a random sample of size n, i.e., {x1, x2, …, xn}. A random sample is a collection of independently observed and identically distributed (IID) data.

67 Probabilistic interpretation of the design storm depth

68 It should also be noted that since the total depth in the depth-duration-frequency relationship only represents the total amount of rainfall of the design duration (not the real storm duration), the probability distributions in the preceding figure do not represent distributions of total depth of real storm events. More specifically, the preceding figure does not represent the bivariate distribution of duration and total depth of real storm events.

69 The use of annual maximum series for rainfall frequency analysis is more of an intelligent and convenient engineering practice; the annual maximum data do not provide much information about the duration and total depth of real storm events.

70 Joint distribution of the total depth and duration
The total rainfall depth of a storm event varies with its storm duration. [A bivariate distribution for (D, tr).] For a given storm duration tr, the total depth D(tr) is considered a random variable, and its magnitudes corresponding to specific exceedance probabilities are estimated. [Conditional distribution] In general,

71 Selection of the best-fit distribution
Methods of model selection based on loss of information:
Akaike information criterion (AIC)
Schwarz's Bayesian information criterion (BIC)
Hannan-Quinn information criterion (HQIC)
Anderson-Darling criterion (ADC)
Common practices of WRA-Taiwan:
SE and U
SSE and SE

72 Information-criteria-based model selection
where $\ln L(\theta)$ is the log-likelihood function for the parameter $\theta$ associated with the model, n is the sample size, and p is the dimension of the parametric space.
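The criterion formulas themselves were lost in extraction. The sketch below uses the standard textbook definitions, AIC = -2 ln L + 2p, BIC = -2 ln L + p ln n, and HQIC = -2 ln L + 2p ln(ln n), with the log-likelihood evaluated at the maximum likelihood estimates. The two candidate distributions and the data are hypothetical choices made so that the log-likelihoods have closed forms.

```python
import math


def information_criteria(log_likelihood, n, p):
    """AIC, BIC, and HQIC from a maximized log-likelihood."""
    return {
        "AIC": -2.0 * log_likelihood + 2.0 * p,
        "BIC": -2.0 * log_likelihood + p * math.log(n),
        "HQIC": -2.0 * log_likelihood + 2.0 * p * math.log(math.log(n)),
    }


def normal_max_loglik(sample):
    """Maximized log-likelihood of a normal distribution (p = 2)."""
    n = len(sample)
    mu = sum(sample) / n
    var = sum((x - mu) ** 2 for x in sample) / n      # ML variance estimate
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def exponential_max_loglik(sample):
    """Maximized log-likelihood of an exponential distribution (p = 1)."""
    n = len(sample)
    return -n * (math.log(sum(sample) / n) + 1.0)


sample = [112.0, 95.5, 130.2, 88.1, 150.7, 101.3, 122.8, 99.4, 140.6, 108.9]
ic_normal = information_criteria(normal_max_loglik(sample), len(sample), p=2)
ic_expon = information_criteria(exponential_max_loglik(sample), len(sample), p=1)
print(ic_normal, ic_expon)
```

For each criterion, the candidate with the smaller value is preferred.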

75 WRA Practice
p: number of distribution parameters.
The Weibull plotting position formula is used for calculation of cumulative probability.

77 Model selection based on information criteria using R
The nsRFA package: MSClaio2008(x)

78 MSClaio2008

79 Indicatively, AICc should be used when (n/p) < 40.
When the sample size, n, is small relative to the number of estimated parameters, p, the AIC may perform inadequately. In those cases a second-order variant of AIC, called AICc, should be used.
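The AICc formula was not preserved on this slide. The sketch below uses the standard small-sample correction, AICc = AIC + 2p(p+1)/(n-p-1), together with the n/p < 40 rule of thumb stated above. The function names are hypothetical.

```python
def aicc(log_likelihood, n, p):
    """Second-order AIC: AIC plus the small-sample correction term."""
    if n - p - 1 <= 0:
        raise ValueError("AICc requires n > p + 1")
    aic = -2.0 * log_likelihood + 2.0 * p
    return aic + 2.0 * p * (p + 1) / (n - p - 1)


def should_use_aicc(n, p):
    """Rule of thumb from the slide: prefer AICc when n / p < 40."""
    return n / p < 40


value = aicc(-50.0, n=30, p=2)
print(value, should_use_aicc(30, 2))
```

The correction term shrinks towards zero as n grows, so AICc and AIC agree for large samples.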

81 Rationale of the information criteria
The Akaike information criterion uses the Kullback-Leibler divergence as the discrepancy measure between the true model f(x) and the approximating model g(x).
Information and entropy

82 What is information? Consider the following statements:
I will eat some food tomorrow.
A major earthquake will strike Taiwan tomorrow.
Which statement conveys more information?
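One way to make the question concrete is the self-information of an event, -log2 of its probability: the less likely the event, the more information its occurrence conveys. The probabilities below are invented purely for illustration.

```python
import math


def self_information(p: float) -> float:
    """Information content, in bits, of an event with probability p."""
    return -math.log2(p)


p_eat = 0.999          # hypothetical probability of eating some food tomorrow
p_quake = 0.0001       # hypothetical probability of a major earthquake tomorrow
info_eat = self_information(p_eat)
info_quake = self_information(p_quake)
print(round(info_eat, 4), round(info_quake, 2))
```

Under these assumed probabilities, the earthquake statement carries far more information because it describes a much rarer event.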

83 Definition of entropy
侯如真, 2001. Theoretical study of applying information entropy to rainfall station network design (in Chinese). Master's thesis, Graduate Institute of Agricultural Engineering, National Taiwan University.

86 Kullback-Leibler Divergence
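The definition on this slide was not preserved. For discrete distributions, the standard definition is D(p || q) = sum of p_i ln(p_i / q_i), which equals the cross-entropy of p relative to q minus the entropy of p. The sketch below checks that decomposition numerically. The function names and the example distributions are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution p."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Expected value of -ln q under the true distribution p."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p_true = [0.5, 0.5]        # hypothetical true distribution
q_model = [0.9, 0.1]       # hypothetical approximating model
d = kl_divergence(p_true, q_model)
print(d, cross_entropy(p_true, q_model) - entropy(p_true))
```

Because the entropy of the true distribution does not depend on the candidate model, ranking models by divergence is equivalent to ranking them by the cross-entropy term alone.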

87 If there are several candidate distributions, we only need to calculate H(X|qi(X)), since H(X|p(X)) is a constant. In practical applications, the above term is estimated as in Akaike (1973), where pj is the number of parameters of the jth model.

