Alafia river: Autocorrelation Autocorrelation of standardized flow.


1

2

3

4 Alafia river: Autocorrelation
Autocorrelation of standardized flow

5 Alafia River: Monthly streamflow distribution

6 Storage-Yield Analysis
Sequent Peak Procedure
$R_t = y$ (release set to the target yield)
$K_t = K_{t-1} + R_t - Q_t$
If $K_t < 0$, set $K_t = 0$
$S = \max_t(K_t)$
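The recursion above can be sketched in a few lines of R. This is a minimal single-pass illustration: the function name `sequent_peak` and the inflow values are made up for the example, and a full analysis would often run the record twice end to end so that a critical period spanning the end of the record is not missed.

```r
# Sequent peak procedure: storage S needed to supply a constant yield y
# from an inflow series Q (single pass through the record).
sequent_peak <- function(Q, y) {
  K <- 0   # K_{t-1}: cumulative deficit, starting from a full reservoir
  S <- 0   # largest deficit seen so far
  for (q in Q) {
    K <- max(K + y - q, 0)   # K_t = K_{t-1} + R_t - Q_t, reset to 0 if negative
    S <- max(S, K)           # S = max(K_t)
  }
  S
}

Q <- c(5, 1, 1, 6, 2, 1, 8)        # hypothetical inflows, not Alafia data
storage <- sequent_peak(Q, y = 3)  # required storage for a yield of 3
print(storage)
```

Repeating the call over a range of yields traces out a storage-yield curve.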

7

8

9 Reservoir Storage-Yield Analysis
R/Q

10 Box Plot
Outliers: beyond 1.5*IQR from the box
Whiskers: extend 1.5*IQR from the box, or to the largest (smallest) value if that is closer
Box: 25th percentile to 75th percentile
Line: Median (50th percentile) - not the mean
Note: The range shown by the box is called the “Inter-Quartile Range” or IQR. This is a robust measure of spread. It is insensitive to outliers since it is based purely on the rank of the values.
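As a small illustration of the box plot rules, the sketch below computes the quartiles, the IQR, and the 1.5*IQR fences for a made-up sample. It uses R's default `quantile()` definition; `boxplot()` itself uses hinges, which can differ slightly for some sample sizes.

```r
x <- c(2, 4, 5, 7, 8, 9, 11, 12, 40)   # hypothetical sample with one large value

q   <- quantile(x, c(0.25, 0.5, 0.75)) # box edges and median
iqr <- IQR(x)                          # inter-quartile range (box height)

lower_fence <- unname(q[1]) - 1.5 * iqr
upper_fence <- unname(q[3]) + 1.5 * iqr

# Points beyond the fences are drawn individually as outliers
outliers <- x[x < lower_fence | x > upper_fence]
print(outliers)
```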

11 Reservoir Reliability Analysis

12 General function fitting

13 General function fitting – Independent data samples
Input: x1, x2, x3, ...   Output: y
Independent data vectors
Example: linear regression $y = a x + b + \varepsilon$

14 Time series function fitting
x1, x2, ..., xt

15 Time series autoregressive function fitting – Method of delays
Embedding dimension: number of columns in the trajectory matrix (4 here)
Trajectory matrix:
x1 x2 x3 x4
x2 x3 x4 x5
x3 x4 x5 x6
………
xt-3 xt-2 xt-1 xt
Sample data vectors are constructed using lagged copies of the single time series.
Example: AR1 model $x_t = \phi x_{t-1} + \varepsilon_t$
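A rough sketch of the method of delays in R, assuming a synthetic AR1 series rather than real flows. Base R's `embed()` builds the lagged copies; it puts the most recent value first, so the columns are reversed here to match the layout of the trajectory matrix. The names `traj` and `phi_hat` are illustrative.

```r
set.seed(1)
n <- 500
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))  # synthetic AR1 series

m    <- 4                    # embedding dimension
traj <- embed(x, m)[, m:1]   # rows: x[t-3], x[t-2], x[t-1], x[t]

# AR1 fit: regress the last column on the one before it (no intercept)
fit     <- lm(traj[, m] ~ 0 + traj[, m - 1])
phi_hat <- unname(coef(fit))
print(phi_hat)
```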

16 Generating a random variable from a given distribution
Generate U from a uniform distribution between 0 and 1
Solve for $X = F^{-1}(U)$
Basis: $P(X < x) = P(F^{-1}(U) < x) = P(U < F(x)) = F(x)$
so $F^{-1}(U)$ is randomly distributed with CDF F(x)
[Figure: CDF plot with axes labeled U and X]
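A minimal sketch of this inverse-CDF approach, using the exponential distribution because its inverse CDF has a closed form. The rate value is arbitrary and chosen only for illustration.

```r
set.seed(42)
rate <- 2
u <- runif(10000)            # U ~ Uniform(0, 1)

# Exponential CDF: F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -log(1 - u) / rate
x <- -log(1 - u) / rate

# The built-in quantile function is the same inverse CDF
x_q <- qexp(u, rate = rate)

print(mean(x))               # should be near 1 / rate
```

For distributions without a closed-form inverse, the `q*` functions (`qgamma`, `qnorm`, and so on) play the role of $F^{-1}$.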

17 Fitting a probability distribution to data
Hillsborough River at Zephyr Hills, September flows
Mean = 8621 mgal
Standard deviation S = 8194 mgal
n = 31

18 Method of Moments Using the sample moments as the estimate for the population parameters

19 Method of Moments
Gamma distribution: shape = 1.1, rate = $1.3 \times 10^{-3}$
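For reference, the gamma moment estimators can be worked through with the sample statistics from slide 17, assuming the shape/rate parameterization (mean = shape/rate, variance = shape/rate^2). Under that assumption the shape reproduces the 1.1 quoted above, while the rate comes out near $1.28 \times 10^{-4}$ per mgal, so the exponent quoted on the slide may have been garbled in transcription.

```r
xbar <- 8621   # sample mean, mgal (slide 17)
s    <- 8194   # sample standard deviation, mgal (slide 17)

# Match mean = shape / rate and variance = shape / rate^2
shape <- (xbar / s)^2
rate  <- xbar / s^2

print(c(shape = shape, rate = rate))
```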

20 Method of Moments Log-Normal distribution =0.643 =8.29

21 Method of Maximum Likelihood
“Back into” the estimate by assuming the parameters we are trying to estimate from the data are known.
How likely are the sample values we have, given a certain set of parameter values?
We can express this as the joint density of the random sample given the parameter value.
After we obtain the data (random sample), we use the joint density to define the likelihood function.
Say each data point is treated as an independent sample from the probability distribution. For a given distribution, what is

22 Likelihood
ln(L) = -311 (for gamma)
ln(L) = -312 (for log-normal)
Could use maximization of L or ln(L) to select parameters rather than fitting moments
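A sketch of selecting gamma parameters by maximizing ln(L), assuming independent samples so that the log-likelihood is a sum of log densities. The data are synthetic (not the Hillsborough flows), the names are illustrative, and `optim()` works on log-parameters so that shape and rate stay positive.

```r
set.seed(7)
x <- rgamma(200, shape = 1.1, rate = 1.3e-4)   # synthetic sample

# Negative log-likelihood of a gamma sample; par holds log(shape), log(rate)
negloglik <- function(par, x) {
  -sum(dgamma(x, shape = exp(par[1]), rate = exp(par[2]), log = TRUE))
}

# Start the search from the method-of-moments estimates
start <- log(c((mean(x) / sd(x))^2, mean(x) / var(x)))
fit   <- optim(start, negloglik, x = x)

mle        <- exp(fit$par)           # maximum likelihood shape and rate
loglik_mle <- -fit$value             # ln(L) at the maximum
loglik_mom <- -negloglik(start, x)   # ln(L) at the moment estimates
print(c(loglik_mom = loglik_mom, loglik_mle = loglik_mle))
```

Comparing ln(L) across candidate distributions fitted to the same data is one way to choose between them.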

23 Normalizing Transformations and fitting a marginal distribution
Much theory relies on the central limit theorem, so it applies to Normal distributions.
Where the data is not normally distributed, normalizing transformations are used:
Log
Box-Cox (Log is a special case of Box-Cox)
A specific PDF, e.g. Gamma
A non-parametric PDF

24 Approach
Select the class of distributions you want to fit
Estimate parameters using an appropriate goodness of fit measure:
Likelihood
PPCC (Filliben’s statistic)
Kolmogorov-Smirnov p-value
Shapiro-Wilk W

25 Normalizing transformation for arbitrary distribution
Arbitrary distribution F(x), Normal distribution Fn(y)
Normalizing transformation (x to y): $y = F_n^{-1}(F(x))$
Back transformation (y to x): $x = F^{-1}(F_n(y))$
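One way to sketch this pair of transformations in R is to match cumulative probabilities between the two distributions. The example assumes the arbitrary distribution is a gamma with known parameters, and the data are synthetic.

```r
set.seed(3)
x <- rgamma(1000, shape = 2, rate = 0.5)   # synthetic skewed data

# Normalizing transformation: push x through its own CDF, then through
# the inverse standard normal CDF
y <- qnorm(pgamma(x, shape = 2, rate = 0.5))

# Back transformation: reverse the two steps
x_back <- qgamma(pnorm(y), shape = 2, rate = 0.5)

print(c(mean = mean(y), sd = sd(y)))       # roughly 0 and 1
```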

26 Kernel Density Estimate (KDE)
Place “kernels” at each data point
Sum up the kernels
Width of kernel determines level of smoothing
Determining how to choose the width of the kernel could be a full day lecture!
[Figure panels: narrow kernel, medium kernel, wide kernel; each shows the individual kernels and the sum of kernels]
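The "place kernels and sum them" idea can be written out directly. This sketch assumes Gaussian kernels and synthetic data; the function `kde` and the two bandwidths are illustrative, and in practice `density()` does the same job more efficiently.

```r
set.seed(5)
x <- rnorm(50)   # synthetic data points

# KDE at each grid value: average of Gaussian kernels centred on the data,
# each with standard deviation h (the kernel width)
kde <- function(grid, data, h) {
  sapply(grid, function(g) mean(dnorm(g, mean = data, sd = h)))
}

grid   <- seq(min(x) - 3, max(x) + 3, length.out = 512)
narrow <- kde(grid, x, h = 0.1)   # little smoothing: bumpy estimate
wide   <- kde(grid, x, h = 1.0)   # heavy smoothing: flat estimate

# Either estimate should integrate to approximately 1
area <- function(f) sum(f) * (grid[2] - grid[1])
print(c(narrow = area(narrow), wide = area(wide)))
```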

27 1-d KDE of Log-transformed Flow
[Figure panels: level of smoothing 0.2, 0.5 and 0.8; the rug plot shows the location of the data points]

28 Non parametric PDF in R
# Read in Willamette R. flow data
q=matrix(scan("willamette_data.txt"),ncol=3,byrow=TRUE)
# Assign variables
yr=q[,1]
mo=q[,2]
flow=q[,3]
# Format flows into a matrix
fmat=matrix(flow,ncol=12,byrow=TRUE)
# Focus on January and February
# Marginal distributions
# Create a histogram for each month, with actual streamflow data on the x-axis,
# and overlay a KDE of the marginal distribution using a Gaussian kernel and nrd0 bandwidth
par(mfrow=c(1,2))
for(i in 1:2){
  x=fmat[,i]
  hist(x,nclass=15,main=month.name[i],xlab="cfs",probability=TRUE)
  lines(density(x,bw="nrd0",na.rm=TRUE),col=2)
  rug(x,col=2)
  box()
}

29 Non parametric CDF in R
# Build a CDF from a density() result by cumulative summation
cdf.r=function(density)
{
  x=density$x
  yt=cumsum(density$y)
  n=length(yt)
  y=(yt-yt[1])/(yt[n]-yt[1]) # force onto the range 0,1 without checking for significant error
  list(x=x,y=y)
}
dd=density(x,bw="nrd0",na.rm=TRUE)
cdf=cdf.r(dd)
plot(cdf,type="l")
# Look up the CDF value y for a given x by linear interpolation
ylookup.r=function(x,cdf)
{
  int=sum(cdf$x<x) # This identifies the interval for interpolation
  n=length(cdf$x)
  if(int < 1){
    y=cdf$y[1]
  }else if(int > n-1){
    y=cdf$y[n]
  }else{
    y=((x-cdf$x[int])*cdf$y[int+1]+(cdf$x[int+1]-x)*cdf$y[int])/(cdf$x[int+1]-cdf$x[int])
  }
  return(y)
}
# Look up the x value for a given CDF value y by linear interpolation
xlookup.r=function(y,cdf)
{
  int=sum(cdf$y<y) # This identifies the interval for interpolation
  n=length(cdf$y)
  if(int < 1){
    x=cdf$x[1]
  }else if(int > n-1){
    x=cdf$x[n]
  }else{
    x=((y-cdf$y[int])*cdf$x[int+1]+(cdf$y[int+1]-y)*cdf$x[int])/(cdf$y[int+1]-cdf$y[int])
  }
  return(x)
}

30 Gamma Estimate parameters using moments or maximum likelihood

31 Box-Cox Normalization
The Box-Cox family of transformations includes the logarithmic transformation as a special case (λ = 0). It is defined as:
$z = (x^{\lambda} - 1)/\lambda$, for $\lambda \neq 0$
$z = \ln(x)$, for $\lambda = 0$
where z is the transformed data, x is the original data and λ is the transformation parameter.
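The definition translates into a short R function. The name `box_cox` and the tolerance used to detect λ = 0 are choices made for this sketch, and the input is assumed to be strictly positive.

```r
# Box-Cox transformation of positive data x with parameter lambda
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) {
    log(x)                       # limiting case lambda = 0
  } else {
    (x^lambda - 1) / lambda
  }
}

x <- c(1, 2, 4)
print(box_cox(x, 0))     # same as log(x)
print(box_cox(x, 1))     # x - 1
print(box_cox(x, 0.5))   # between the two
```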

32 Log normalization with lower bound
$z = \ln(x - \zeta)$, where ζ is the lower bound

33 Determining Transformation Parameters (λ, ζ)
PPCC (Filliben’s Statistic): R2 of best fit line of the QQ-plot
Kolmogorov-Smirnov (KS) Test (any distribution): p-value
Shapiro-Wilk Test for Normality: p-value

34 Quantiles Rank the data Theoretical distribution, e.g. Standard Normal
Ranked data x1, x2, x3, ..., xn, each paired with a probability pi and a quantile qi
qi is the distribution-specific theoretical quantile associated with ranked data value xi
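A small sketch of how the pairs (xi, qi) might be built for the Standard Normal. The slide does not say which plotting position is used for pi, so the Hazen form (i - 0.5)/n below is an assumption, and the data values are made up.

```r
x  <- c(12.1, 3.4, 7.8, 5.6, 9.9)   # hypothetical data
xs <- sort(x)                       # rank the data
n  <- length(xs)

p <- ((1:n) - 0.5) / n              # assumed plotting position for rank i
q <- qnorm(p)                       # Standard Normal theoretical quantiles

print(cbind(x = xs, p = p, q = q))
```

Plotting `xs` against `q` gives the quantile-quantile plot.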

35 Quantile-Quantile Plots
[Figure: QQ-plot for raw flows (xi against qi) and QQ-plot for log-transformed flows (ln(xi) against qi)]
A transformation is needed to make the raw flows normally distributed.

36 Box-Cox Normality Plot for Monthly September Flows on Alafia R.
Using PPCC: λ = -0.14, which is close to 0
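A plot of this kind could be produced along the following lines. This is only a sketch: the data are synthetic log-normal values rather than the Alafia flows, `box_cox` and `ppcc` are illustrative helper names, and the PPCC is taken as the correlation between the sorted transformed data and Standard Normal quantiles from `ppoints()`.

```r
set.seed(11)
x <- rlnorm(200, meanlog = 8, sdlog = 0.8)   # synthetic skewed "flows"

box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Probability plot correlation coefficient against the Normal distribution
ppcc <- function(z) cor(sort(z), qnorm(ppoints(length(z))))

lambdas <- seq(-1, 1, by = 0.05)
r <- sapply(lambdas, function(l) ppcc(box_cox(x, l)))

best <- lambdas[which.max(r)]    # lambda giving the straightest QQ-plot
plot(lambdas, r, type = "l", xlab = "lambda", ylab = "PPCC")
abline(v = best, lty = 2)
print(best)
```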

37 Kolmogorov-Smirnov Test
Specifically, it computes the largest difference between the target CDF FX(x) and the observed CDF, F*(X). The test statistic D2 is:
$D_2 = \max_{1 \le i \le n} \left| F^*(X_{(i)}) - F_X(X_{(i)}) \right|$
where X(i) is the ith largest observed value in the random sample of size n.
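In R the statistic is available through `ks.test()`. The sketch below also computes it by hand for a synthetic sample, checking the empirical CDF just before and just after each sorted observation. Note that when the Normal parameters are estimated from the same data, as here, the reported p-value is only approximate.

```r
set.seed(13)
z <- rnorm(31)                       # synthetic sample
n <- length(z)

# Target CDF evaluated at the sorted observations
Fz <- pnorm(sort(z), mean = mean(z), sd = sd(z))

# Largest gap between the empirical step CDF and the target CDF
D_manual <- max(pmax((1:n) / n - Fz, Fz - (0:(n - 1)) / n))

ks <- ks.test(z, "pnorm", mean = mean(z), sd = sd(z))
print(c(manual = D_manual, ks.test = unname(ks$statistic)))
```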

38 Box-Cox Normality Plot for Monthly September Flows on Alafia R.
Using Kolmogorov-Smirnov (KS) Statistic: λ = -0.39, which is not as close to 0

39 shapiro.test(x) in R
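A brief usage sketch on synthetic log-normal data (not the Alafia flows): the W statistic should sit closer to 1 after the log transformation.

```r
set.seed(17)
x <- rlnorm(200, meanlog = 8, sdlog = 1)   # synthetic skewed data

raw    <- shapiro.test(x)        # raw values: strongly non-normal
logged <- shapiro.test(log(x))   # log-transformed values

print(c(W_raw = unname(raw$statistic), W_log = unname(logged$statistic)))
print(c(p_raw = raw$p.value, p_log = logged$p.value))
```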

40 Box-Cox Normality Plot for Monthly September Flows on Alafia R.
Using Shapiro-Wilk Statistic: λ is close to 0, the same as with PPCC

41 Testing simulated marginal distributions

42 Testing correlation and skewness

43 Testing state dependent correlations

