Published by Kathryn Long. Modified over 8 years ago.
1
MEGN 537 – Probabilistic Biomechanics Ch.5 – Determining Distributions and Parameters from Observed Data Anthony J Petrella, PhD
2
Determination of Distribution: The underlying distribution can be established in one of the following ways: (1) drawing a frequency diagram; (2) plotting the data on probability paper; (3) conducting statistical tests for the distribution, known as goodness-of-fit tests.
3
Probability Paper: Gumbel (1954). Given N observations ($X_1, X_2, X_3, \ldots, X_N$), arrange the data in increasing order. The $i$th value is plotted at a CDF value of $i/(N+1)$.
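The plotting rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are hypothetical, and the conversion to a standard normal variate assumes the data are being checked against normal probability paper.

```python
from statistics import NormalDist

def plotting_positions(data):
    """Pair each sorted observation with its plotting position i/(N+1)."""
    ordered = sorted(data)
    n = len(ordered)
    return [(x, i / (n + 1)) for i, x in enumerate(ordered, start=1)]

def normal_paper_coordinates(data):
    """Convert each plotting position p to z = Phi^-1(p).

    Plotting x against z on ordinary axes mimics normal probability
    paper: normally distributed data fall close to a straight line.
    """
    std_normal = NormalDist()
    return [(x, std_normal.inv_cdf(p)) for x, p in plotting_positions(data)]

sample = [4.1, 2.7, 3.3, 5.0, 3.8]  # hypothetical observations
positions = plotting_positions(sample)
coords = normal_paper_coordinates(sample)
```

Dividing by N+1 rather than N keeps the largest observation away from a CDF value of 1, where the normal variate would be infinite.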
4
Probability Paper
5
Plotted versus Normal Distribution
6
Goodness of Fit. Question: Do two independent samples come from identical continuous distributions? Here the dataset is compared to a theoretical distribution. Restated: Is the theoretical distribution an acceptable representation of the dataset? Two tests are used: the chi-square test, based on the PDF, and the Kolmogorov-Smirnov test, based on the CDF.
7
Chi-Square Test ($\chi^2$): Based on the error between the observed and assumed PDF of the distribution. Methodology: (1) arrange the N data points in increasing order; (2) break the data into $m$ intervals; (3) determine $n_i$, the observed frequency of data points in interval $i$, and $e_i$, the theoretical frequency of data points in interval $i$.
8
Chi-Square Test ($\chi^2$): Methodology: Determine $c_{1-\alpha,f}$, where $\alpha$ = significance level (usually between 1% and 10%) and $f$ = degrees of freedom = $m - 1 - k$, with $m$ = number of intervals and $k$ = number of distribution parameters (= 2 for normal or lognormal). Obtain $c_{1-\alpha,f}$ from Appendix 3. The assumed distribution is acceptable at significance level $\alpha$ if: $\sum_{i=1}^{m} \frac{(n_i - e_i)^2}{e_i} < c_{1-\alpha,f}$. NOTE: $m$ should be $\ge 5$ to obtain satisfactory results.
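As a rough sketch of the chi-square procedure, the Python below counts observed frequencies, computes theoretical frequencies from an assumed distribution, and sums $(n_i - e_i)^2 / e_i$. The function names and the choice of interval edges are assumptions made for illustration. The critical value is not computed here: it is expected to be read from a chi-square table and passed in.

```python
from statistics import NormalDist

def chi_square_statistic(data, edges, dist):
    """Sum (n_i - e_i)^2 / e_i over the m intervals defined by `edges`.

    `edges` holds m + 1 increasing boundaries; use -inf and +inf as the
    outer edges so that every data point falls in some interval.
    """
    n_total = len(data)
    statistic = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        n_i = sum(1 for x in data if lo <= x < hi)      # observed frequency
        e_i = n_total * (dist.cdf(hi) - dist.cdf(lo))   # theoretical frequency
        statistic += (n_i - e_i) ** 2 / e_i
    return statistic

def degrees_of_freedom(m, k):
    """f = m - 1 - k for m intervals and k estimated parameters."""
    return m - 1 - k

def is_acceptable(statistic, critical_value):
    """Accept the assumed distribution if the statistic is below the table value."""
    return statistic < critical_value

# Hypothetical check of four points against a standard normal, two intervals.
inf = float("inf")
stat = chi_square_statistic([-1.0, 1.0, 2.0, 3.0], [-inf, 0.0, inf], NormalDist())
```

Two intervals are used above only to keep the arithmetic easy to follow; they are too few for a real test.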
9
Significance Level, $\alpha$: The significance level, $\alpha$, is the probability of rejecting the assumed distribution when the differences between the sample and the theoretical distribution are due to chance alone. A higher value implies a more stringent requirement to accept the proposed distribution, i.e., better agreement is needed. Values between 1% and 10% are common.
10
Example (Haldar 5.2)
11
a) Uniformly distributed random variables: ordinary graph paper can serve as probability paper. b)
12
Example (Haldar 5.2) c) $f = m - 1 - k$
13
Example (Haldar 5.5): Perform a chi-square test on the data from Problem 3.1 ($n$ = 30 data points). Can the underlying distribution be accepted as normal at a 5% significance level? $f$ = degrees of freedom = $m - 1 - k$, where $m$ = number of intervals and $k$ = number of distribution parameters.
14
Solution (Haldar 5.5a)
15
Kolmogorov-Smirnov (K-S) Test: Based on the error between the observed and assumed CDF of the distribution. Methodology: Arrange the data in increasing order and assign an index $m$ to each data point, where $m = 1, 2, \ldots, n$. Determine $S_n(x)$, the empirical CDF of the sample: $S_n(x) = 0$ for $x < x_1$; $S_n(x) = m/n$ for $x_m \le x < x_{m+1}$; $S_n(x) = 1$ for $x \ge x_n$. Determine $F_X(x_i)$, the CDF of the assumed distribution.
16
K-S Test: Methodology: Determine $D_n = \max |F_X(x_i) - S_n(x_i)|$. Determine $D_n^{\alpha}$, where $\alpha$ = significance level; the value of $D_n^{\alpha}$ is found in Appendix 4. The assumed distribution is acceptable at significance level $\alpha$ if the maximum difference $D_n$ is less than or equal to the tabulated value $D_n^{\alpha}$.
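The K-S statistic can be sketched as follows, taking the definition on the slide literally: the empirical CDF is evaluated as $m/n$ at each sorted data point and compared with the assumed CDF there. The function names are hypothetical, and the critical value is assumed to come from a K-S table rather than being computed. Many references also compare against $(m-1)/n$ just below each point; that refinement is left out here.

```python
from statistics import NormalDist

def ks_statistic(data, dist):
    """D_n = max |F_X(x_m) - S_n(x_m)| with S_n(x_m) = m/n at sorted points."""
    ordered = sorted(data)
    n = len(ordered)
    return max(abs(dist.cdf(x) - m / n) for m, x in enumerate(ordered, start=1))

def ks_acceptable(d_n, d_critical):
    """Accept the assumed distribution if D_n does not exceed the table value."""
    return d_n <= d_critical

# Hypothetical two-point sample checked against a standard normal.
d_n = ks_statistic([1.0, -1.0], NormalDist())
```

In practice `dist` would be built from the estimated parameters, for example `NormalDist(mu, sigma)` with the sample mean and standard deviation.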
17
Example (Haldar 5.8) Perform K-S test on the data from Problem 3.1. Can the underlying distribution be accepted as normal at a 5% significance level?
18
Solution (Haldar, 5.8)
19
Parameter Estimation
20
Method of Moments: Moments are statistical parameters of a dataset: the 1st moment (mean = $E(X)$), the 2nd central moment ($\mathrm{Var}(X)$), and the 3rd central moment (skewness). Distribution parameters are derived from the moments. PDF forms and parameters for distributions are given in Table 5.6 on page 118. All are based on the first two moments, $E(X)$ and $\mathrm{Var}(X)$.
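To illustrate how distribution parameters follow from the first two moments, here is a small sketch for the normal and lognormal cases. The function names and the parameter symbols $\lambda$ and $\zeta$ are assumptions; the lognormal relations used are the standard ones, $\zeta^2 = \ln(1 + \mathrm{Var}(X)/E(X)^2)$ and $\lambda = \ln E(X) - \zeta^2/2$, which may be written differently in Table 5.6.

```python
import math
from statistics import mean, variance

def normal_parameters(data):
    """Normal: mu is the sample mean, sigma the sample standard deviation."""
    return mean(data), math.sqrt(variance(data))

def lognormal_parameters(data):
    """Lognormal: lambda and zeta derived from the sample mean and variance."""
    m = mean(data)
    v = variance(data)                      # sample variance (n - 1 divisor)
    zeta_sq = math.log(1.0 + v / m ** 2)
    lam = math.log(m) - 0.5 * zeta_sq
    return lam, math.sqrt(zeta_sq)

sample = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical observations
mu, sigma = normal_parameters(sample)
lam, zeta = lognormal_parameters(sample)
```

A quick sanity check is to map the lognormal parameters back: $\exp(\lambda + \zeta^2/2)$ should reproduce the sample mean.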
21
Method of Maximum Likelihood
22
Interval Estimation: Differences exist between the expected values of populations and samples. Distribution parameters are typically estimated from samples and applied to populations. Intervals estimate the range of possible values for a parameter at a specified level of confidence.
23
Confidence Intervals: Distributions can be linked to probability, which makes it possible to predict and evaluate the likelihood of a particular occurrence. In a normal distribution, the number of standard deviations from the mean tells us what fraction of the data lies within that range, and thus the probability of occurrence.
24
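Interval Estimation for the Mean with Known Variance: $\langle\mu\rangle_{1-\alpha} = \left(\bar{x} - k_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{x} + k_{\alpha/2}\,\sigma/\sqrt{n}\right)$, where $\bar{x}$ = sample mean, $\sigma$ = standard deviation, $n$ = sample size, $(1 - \alpha)$ = confidence level, and $k_{\alpha/2}$ = value of the standard normal variate ($z$) = $\Phi^{-1}(p)$ with $p = 1 - \alpha/2$ (found using Appendix 1). Two-tailed interval!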
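A minimal sketch of the two-tailed interval for the mean with known variance, using the standard library's inverse normal CDF in place of the Appendix 1 table. The function name and the numbers in the example are hypothetical.

```python
import math
from statistics import NormalDist

def mean_interval_known_variance(x_bar, sigma, n, alpha):
    """Two-tailed (1 - alpha) interval: x_bar +/- k_{alpha/2} * sigma / sqrt(n)."""
    k_half = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # k_{alpha/2}
    half_width = k_half * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical case: sample mean 10, sigma 2, n = 25, 95% confidence.
lower, upper = mean_interval_known_variance(10.0, 2.0, 25, 0.05)
```

With these inputs the half-width is about 1.96 × 2 / 5, so the interval is roughly 9.22 to 10.78.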
25
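Lower and Upper Confidence Limits for the Mean with Known Variance: Lower confidence limit for $\mu$: $\bar{x} - k_{\alpha}\,\sigma/\sqrt{n}$. Upper confidence limit for $\mu$: $\bar{x} + k_{\alpha}\,\sigma/\sqrt{n}$. Here $k_{\alpha} = \Phi^{-1}(1 - \alpha)$. Each is a one-tailed interval!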
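The one-tailed limits differ from the two-tailed interval only in the normal variate used: the whole of $\alpha$ sits in one tail, so $k_{\alpha} = \Phi^{-1}(1 - \alpha)$ replaces $k_{\alpha/2}$. The sketch below assumes that standard form; the function names and example values are hypothetical.

```python
import math
from statistics import NormalDist

def lower_confidence_limit(x_bar, sigma, n, alpha):
    """One-tailed lower limit: x_bar - k_alpha * sigma / sqrt(n)."""
    k_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return x_bar - k_alpha * sigma / math.sqrt(n)

def upper_confidence_limit(x_bar, sigma, n, alpha):
    """One-tailed upper limit: x_bar + k_alpha * sigma / sqrt(n)."""
    k_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return x_bar + k_alpha * sigma / math.sqrt(n)

# Hypothetical case: sample mean 10, sigma 2, n = 25, 95% confidence.
lcl = lower_confidence_limit(10.0, 2.0, 25, 0.05)
ucl = upper_confidence_limit(10.0, 2.0, 25, 0.05)
```

Each limit is used on its own: one states that the mean exceeds `lcl` with 95% confidence, the other that it lies below `ucl`.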
26
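Interval Estimation for the Mean with Unknown Variance: $\langle\mu\rangle_{1-\alpha} = \left(\bar{x} - t_{\alpha/2,n-1}\,s/\sqrt{n},\ \bar{x} + t_{\alpha/2,n-1}\,s/\sqrt{n}\right)$, where $s$ = sample standard deviation and $t_{\alpha/2,n-1}$ = value of Student's t distribution, found using Appendix 5. The standard normal distribution is valid for: known population variance; large $n$ (> 30). If $n$ is small (< 10), $s \ne \sigma$; use Student's t.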
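When the variance is unknown, the sample standard deviation $s$ and a Student's t value take the place of $\sigma$ and the normal variate. The Python standard library has no inverse t distribution, so this sketch assumes the caller reads $t_{\alpha/2,n-1}$ from a table and passes it in. The function name and data are hypothetical.

```python
import math
from statistics import mean, stdev

def mean_interval_unknown_variance(data, t_critical):
    """Two-tailed interval: x_bar +/- t_{alpha/2, n-1} * s / sqrt(n).

    `t_critical` is the tabulated Student's t value for n - 1 degrees
    of freedom; it is supplied by the caller, not computed here.
    """
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)                          # sample standard deviation
    half_width = t_critical * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

sample = [1.0, 2.0, 3.0, 4.0, 5.0]           # hypothetical observations
# 2.776 is the tabulated t value for alpha/2 = 0.025 and 4 degrees of freedom.
t_lower, t_upper = mean_interval_unknown_variance(sample, 2.776)
```

For the same confidence level the t value is larger than the normal variate (2.776 versus about 1.96 here), so small samples give wider intervals.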
27
Student’s t distribution: $f = n - 1$ = degrees of freedom (DOF)
28
Interval Estimation for Variance: $c_{\alpha,n-1}$ = value of the chi-square distribution with $n - 1$ degrees of freedom, found using Appendix 3.
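The interval formula itself did not survive on this slide, so the sketch below falls back on the standard two-sided result for a normal population: $(n-1)s^2$ divided by the upper and lower chi-square values. That choice, the function name, and the data are assumptions. Both chi-square values are taken from a table and passed in, with subscripts read as cumulative probabilities.

```python
from statistics import variance

def variance_interval(data, c_lower, c_upper):
    """Two-sided interval for the population variance.

    c_lower: chi-square value with cumulative probability alpha/2
    c_upper: chi-square value with cumulative probability 1 - alpha/2
    Both are for n - 1 degrees of freedom and supplied by the caller.
    """
    n = len(data)
    s_sq = variance(data)                    # sample variance (n - 1 divisor)
    return (n - 1) * s_sq / c_upper, (n - 1) * s_sq / c_lower

sample = [1.0, 2.0, 3.0, 4.0, 5.0]           # hypothetical observations
# Tabulated chi-square values for 4 degrees of freedom at 0.025 and 0.975.
var_lower, var_upper = variance_interval(sample, 0.484, 11.143)
```

The larger table value gives the lower limit, because the chi-square value appears in the denominator.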