Published by Antonio Holmes. Modified over 11 years ago.
1
Chapter 4. Elements of Statistics
# A brief introduction to some concepts of statistics
# Descriptive statistics and inductive statistics (statistical inference)
# Classification of the field of statistics
i) Sampling theory
ii) Estimation theory
iii) Hypothesis testing
iv) Curve fitting or regression
v) Analysis of variance
2
4.2 Sampling Theory – The Sample Mean
How many samples are required for a given degree of confidence in the result?
# Terminology
- population: N (size of the population), very large or infinite
- (random) sample: n (size of the sample)
# One of the most important quantities is the sample mean. How close is the sample mean likely to be to the average value of the population?
3
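Let the sample have the numerical values $x_1, x_2, \ldots, x_n$. Then the sample mean is given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
Note that we are interested in the statistical properties of arbitrary random samples rather than those of any particular sample. That is, the sample mean becomes a random variable. Therefore, it is appropriate to denote the sample mean as $\hat{\bar{X}} = \frac{1}{n}\sum_{i=1}^{n} X_i$, where each $X_i$ is a random variable drawn from the population.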
4
We want the mean value of the sample mean to be close to the true mean value of the population, $\bar{X}$. Taking the expectation, $E[\hat{\bar{X}}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \bar{X}$: the mean value of the sample mean equals the true mean value of the population. The sample mean is therefore an unbiased estimate of the true mean. This alone, however, is not sufficient to indicate whether the sample mean is a good estimator of the true population mean.
5
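What is the variance of the sample mean? Assume $N \gg n$ (the population is much larger than the sample).
$\mathrm{Var}(\hat{\bar{X}})$ = (mean square of $\hat{\bar{X}}$) - (square of the mean) $= E[\hat{\bar{X}}^2] - \bar{X}^2$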
6
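The $X_i$ are statistically independent, so $E[X_i X_j] = \bar{X}^2$ for $i \ne j$ and $E[X_i^2] = \sigma^2 + \bar{X}^2$. Hence
$\mathrm{Var}(\hat{\bar{X}}) = \frac{1}{n^2}\left[n(\sigma^2 + \bar{X}^2) + n(n-1)\bar{X}^2\right] - \bar{X}^2 = \frac{\sigma^2}{n}$ (!)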
7
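where $\sigma^2$ is the true variance of the population. As $n \to \infty$, the variance of the sample mean $\to 0$, which means that large sample sizes lead to a better estimate.
* Note: 1) This result holds when $N \to \infty$, or when N is finite and sampling is done with replacement.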
8
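2) When N is finite and sampling is done without replacement: $\mathrm{Var}(\hat{\bar{X}}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$. As $N \to \infty$ this reduces to $\sigma^2/n$; when $N = n$ the variance is 0 (!), because the sample is the whole population.
Two examples: pp. 163-165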
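The two variance results for the sample mean (sampling with and without replacement) can be checked exactly on a tiny population by enumerating every possible sample. The sketch below is only an illustration: the six population values and the sample size n = 3 are made up and do not come from the slides or the textbook examples.

```python
from itertools import combinations, product
from statistics import fmean, pvariance

# Hypothetical population (illustrative values only).
population = [2.0, 3.0, 5.0, 7.0, 11.0, 13.0]
N, n = len(population), 3
sigma2 = pvariance(population)      # true population variance (divisor N)

# With replacement: all N**n ordered samples are equally likely.
means_with = [fmean(s) for s in product(population, repeat=n)]
var_with = pvariance(means_with)

# Without replacement: all C(N, n) subsets are equally likely.
means_without = [fmean(s) for s in combinations(population, n)]
var_without = pvariance(means_without)

formula_with = sigma2 / n
formula_without = sigma2 / n * (N - n) / (N - 1)

print("with replacement:   ", var_with, "formula:", formula_with)
print("without replacement:", var_without, "formula:", formula_without)
print("mean of sample means:", fmean(means_without),
      "population mean:", fmean(population))
```

Because every sample is enumerated, the agreement is exact up to floating-point rounding rather than a Monte Carlo approximation.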
9
4.3 Sampling Theory – The Sample Variance
The population variance is needed for determining the sample size required to achieve a desired variance of the sample mean (see eq. 4-4).
Definition (Sample Variance): $S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{\bar{X}}\right)^2$
The expected value of the sample variance can be derived easily: $E[S^2] = \frac{n-1}{n}\sigma^2$. This is not the true variance; that is, $S^2$ is a biased estimate rather than an unbiased one.
10
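Now we redefine the sample variance to obtain an unbiased estimate of the population variance: $\tilde{S}^2 = \frac{n}{n-1}S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{\bar{X}}\right)^2$, so that $E[\tilde{S}^2] = \sigma^2$.
Note that these results hold for very large N, that is, $N = \infty$. What about when the population size is not large?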
11
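# When N is not large, the expected value of $S^2$ is given by $E[S^2] = \frac{N}{N-1}\cdot\frac{n-1}{n}\sigma^2$. To obtain an unbiased estimate, we redefine $\tilde{S}^2 = \frac{N-1}{N}\cdot\frac{n}{n-1}S^2$.
# The variance of the estimates of the variance: the variance of $S^2$ and the variance of $\tilde{S}^2$ are each expressed in terms of $\mu_4$, where $\mu_4$ is the 4th central moment of the population.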
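The bias of the sample variance, for both a very large population and a small one, can be confirmed by exhaustive enumeration. This is a minimal sketch under stated assumptions: the five population values are invented, independent draws (sampling with replacement) stand in for the very large population, and the helper name `mean_of_S2` is hypothetical.

```python
from itertools import combinations, product
from statistics import fmean, pvariance

# Hypothetical population (illustrative values only).
population = [1.0, 4.0, 6.0, 9.0, 10.0]
N, n = len(population), 3
sigma2 = pvariance(population)            # true variance (divisor N)

def mean_of_S2(samples):
    """Average of the biased sample variance S^2 (divisor n) over all samples."""
    return fmean(pvariance(s) for s in samples)

# Independent draws: the model for a very large population.
e_s2_large = mean_of_S2(product(population, repeat=n))
# Small population, sampling without replacement.
e_s2_small = mean_of_S2(combinations(population, n))

print("E[S^2], independent draws:  ", e_s2_large,
      "expected:", (n - 1) / n * sigma2)
print("E[S^2], without replacement:", e_s2_small,
      "expected:", N / (N - 1) * (n - 1) / n * sigma2)

# Applying the correction factors recovers the true variance in both cases.
unbiased_large = n / (n - 1) * e_s2_large
unbiased_small = (N - 1) / N * n / (n - 1) * e_s2_small
print("corrected:", unbiased_large, unbiased_small, "true:", sigma2)
```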
12
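4.4 Sampling Distributions & Confidence Intervals
What is the probability that the estimates are within specified bounds? Answering this requires the p.d.f. of the estimate, here the sample mean.
Normalized sample mean: $Z = \frac{\hat{\bar{X}} - \bar{X}}{\sigma/\sqrt{n}}$
If the $X_i$ are Gaussian and independent => Z is Gaussian (0, 1), that is, zero mean and unit variance.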
13
If the $X_i$ are not Gaussian, then as $n \to \infty$, Z is asymptotically Gaussian by the central limit theorem ($n \ge 30$ is a rule of thumb).
H.W) Solve the problems in Chap. 4: 4-2.1, 4-2.5, 4-3.1, 4-4.1, 4-5.1, 4-6.1
14
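When the true variance is unknown and the sample estimate is used in its place, $T = \frac{\hat{\bar{X}} - \bar{X}}{\tilde{S}/\sqrt{n}}$ is no longer Gaussian => Student's t distribution with n-1 degrees of freedom (p. 170, 4-2)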
15
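pdf of Student's t distribution: $f_T(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}$, with $\nu = n - 1$, where $\Gamma$ is the gamma function.
The t distribution has heavier tails than the Gaussian and becomes nearly Gaussian for large n ($n \ge 30$).
Gamma function: $\Gamma(k+1) = k\,\Gamma(k)$ for any k, and $\Gamma(k+1) = k!$ for integer k.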
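The heavier tails of Student's t distribution can be seen by evaluating its density next to the Gaussian one. The sketch below uses the standard textbook form of the t density with ν degrees of freedom; the function names and the evaluation point t = 3 are choices made here for illustration. `math.lgamma` is used instead of `math.gamma` so that large ν does not overflow.

```python
import math

def student_t_pdf(t, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    log_coeff = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(math.pi * nu))
    return math.exp(log_coeff) * (1 + t * t / nu) ** (-(nu + 1) / 2)

def gaussian_pdf(z):
    """Density of the zero-mean, unit-variance Gaussian."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Tail comparison at t = 3: the t density is larger, most of all for small nu.
for nu in (8, 30, 1000):
    print(f"nu={nu:5d}  t pdf={student_t_pdf(3.0, nu):.6f}  "
          f"Gaussian pdf={gaussian_pdf(3.0):.6f}")

# The gamma function reduces to a factorial at integer arguments.
print(math.gamma(5), math.factorial(4))
```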
16
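Confidence interval? This is an interval estimate (as opposed to a point estimate): the q-percent confidence interval is the interval within which the estimate lies with probability q/100.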
17
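For the sample mean the interval is $\bar{X} - \frac{k\sigma}{\sqrt{n}} \le \hat{\bar{X}} \le \bar{X} + \frac{k\sigma}{\sqrt{n}}$, where k is determined by q and by the pdf of the estimate. For the Gaussian case, k is tabulated on p. 172, 4-1 (q versus k).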
18
Ex) q = 95% -> probability 0.95, which gives k = 1.96 in the Gaussian case. (For q = 99%, k = 2.58!)
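For the Gaussian case, the multiplier k for a given confidence level q can be computed instead of read from a table. The sketch below uses Python's `statistics.NormalDist`; the function names and the numbers in the usage example (true mean 10, σ = 2, n = 100) are assumptions made here for illustration only.

```python
import math
from statistics import NormalDist

def gaussian_k(q_percent):
    """k such that a Gaussian (0, 1) variable lies in [-k, k] with probability q/100."""
    return NormalDist().inv_cdf(0.5 + q_percent / 200)

def sample_mean_interval(true_mean, sigma, n, q_percent):
    """q-percent confidence interval for the sample mean of n Gaussian samples."""
    half_width = gaussian_k(q_percent) * sigma / math.sqrt(n)
    return true_mean - half_width, true_mean + half_width

for q in (90, 95, 99):
    print(f"q = {q}%  ->  k = {gaussian_k(q):.3f}")

# Hypothetical numbers: true mean 10, sigma 2, n = 100 samples.
print(sample_mean_interval(10.0, 2.0, 100, 95))
```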
19
For small samples, the interval for a given q is obtained from the probability distribution function F of Student's t distribution (see Appendix F, or Table 4-2 on page 172 for $\nu = 8$).
20
4.5 Hypothesis Testing
The question arises: how does one decide to accept or reject a given hypothesis when the sample size and the confidence level are specified?
21
Two steps:
i) make some hypothesis about the population
ii) determine whether the observed sample confirms or rejects this hypothesis
22
Two tests: one-sided or two-sided.
One-sided: the average lifetime of a light bulb is >= 1000 hours.
Two-sided: 100-ohm resistors whose values may be too high or too low.
23
One-sided test
Ex) A capacitor manufacturer claims a mean value of breakdown voltage >= 300 V. A sample of 100 capacitors is tested, and a 99% confidence level is used.
Q) Is the manufacturer's claim valid?
A) We would reject the hypothesis!
24
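Normalized r.v. Z: with n = 100 the normalized sample mean is approximately Gaussian, so the observed z is compared with the one-sided 99% critical value $z_c = -2.33$.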
25
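At a 99.5% confidence level we would accept the hypothesis: rejection becomes less likely, because a higher confidence level imposes a more severe requirement on the evidence needed to reject.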
26
Level of significance = 100% - (confidence level). A smaller level of significance means a more severe test!
27
Ex) Sample size = 9: the normalized statistic is no longer Gaussian -> Student's t distribution with $\nu = n - 1 = 8$ d.o.f. At the 99% confidence level we accept the hypothesis.
28
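With a small sample size, the t distribution has heavier tails than the Gaussian: a t variable is more likely to exceed a given value, so the critical value for the same confidence level is larger in magnitude. Tests with small samples are therefore less reliable (less severe) than tests with large samples.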
29
Two-sided test
Ex) A manufacturer of Zener diodes claims that the true mean breakdown voltage = 10 V.
Q) Hypothesis: the true mean is 10 V. Do we accept or reject it? 100 samples -> 95% confidence level.
30
A) Rejected! z is outside the interval (-1.96, 1.96).
31
Ex) With 9 samples, t is inside the interval (-2.306, 2.306) for $\nu = 8$, so the hypothesis is accepted!
– Less severe than a large-sample test
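The two-sided Zener diode test follows the same pattern, except that the statistic must fall inside a symmetric interval. The transcribed slides omit the measured values, so the sample mean (10.3 V) and sample standard deviation (1.2 V) below are assumptions for illustration only. The value 2.306 is the standard two-sided 95% table value of Student's t for 8 degrees of freedom.

```python
import math
from statistics import NormalDist

claimed_mean = 10.0    # V, the manufacturer's claim (from the slides)
sample_mean = 10.3     # V, ASSUMED for illustration
sample_std = 1.2       # V, ASSUMED for illustration

def accept_two_sided(statistic, critical):
    """Accept the hypothesis when the statistic lies inside (-critical, critical)."""
    return -critical < statistic < critical

def normalized(n):
    """Normalized deviation of the sample mean from the claimed mean."""
    return (sample_mean - claimed_mean) / (sample_std / math.sqrt(n))

# n = 100: Gaussian, 95% two-sided -> 2.5% in each tail.
z_crit = NormalDist().inv_cdf(0.975)
z = normalized(100)
accept_large = accept_two_sided(z, z_crit)
print(f"n=100: z = {z:.2f}, interval = +/-{z_crit:.2f}, accept: {accept_large}")

# n = 9: Student's t with 8 degrees of freedom.
T_CRIT_95_NU8 = 2.306  # table value, two-sided 95%, nu = 8
t = normalized(9)
accept_small = accept_two_sided(t, T_CRIT_95_NU8)
print(f"n=9:   t = {t:.2f}, interval = +/-{T_CRIT_95_NU8}, accept: {accept_small}")
```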
32
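4.6 Curve Fitting and Linear Regression
Given paired data (x, y), we look for the relationship between x and y: fit a 1st-order (linear) or 2nd-order equation, or test (correlation analysis) whether x and y are related.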
33
– Scatter diagram: a plot of the data, n samples $(x_i, y_i)$
34
– Curve fitting: to find a mathematical relationship between x and y
– Regression curve (equation): the resulting curve
35
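– What is the best fit? The best fit in a least-squares sense.
– Let $e_i$ be the errors between the regression curve and the points of the scatter diagram.
– Choose the curve so that $\sum_{i=1}^{n} e_i^2$ is a minimum.
– The type of equation to be fitted to the data must also be chosen; it should smooth the n data points rather than pass through each of them.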
36
Linear regression: $y = a + bx$. How should a and b be chosen?
37
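A) Setting the partial derivatives of $\sum_{i=1}^{n}\left[y_i - (a + b x_i)\right]^2$ with respect to a and b to zero gives
$b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$, $a = \frac{\sum y_i - b\sum x_i}{n}$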
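The closed-form least-squares solution for a straight line y = a + bx is short enough to code directly. This is a minimal sketch: the function name `linear_regression` and the data points are hypothetical, chosen so the result is easy to check by eye.

```python
def linear_regression(xs, ys):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

# Points lying exactly on y = 1 + 2x are recovered exactly.
a_exact, b_exact = linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
print(a_exact, b_exact)

# Hypothetical noisy data scattered around the same line.
a_noisy, b_noisy = linear_regression([0, 1, 2, 3, 4], [1.1, 2.9, 5.2, 6.8, 9.1])
print(a_noisy, b_noisy)
```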
38
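MATLAB built-in function: p = polyfit(x, y, n), which returns the coefficients of the least-squares polynomial of order n, highest power first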
39
A second-order regression: $y = a + bx + cx^2$ (p. 180; 4-3, 4-6)
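A second-order (or any low-order) polynomial regression can be obtained by solving the least-squares normal equations. The sketch below does this in plain Python with Gaussian elimination; the function names and the data are hypothetical. It returns the coefficients lowest power first, the reverse of the order MATLAB's polyfit uses.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * size
    for row in range(size - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, size))
        solution[row] = (aug[row][size] - tail) / aug[row][row]
    return solution

def polynomial_regression(xs, ys, order):
    """Least-squares polynomial coefficients [c0, c1, ...], lowest power first."""
    # Normal equations: sum_j c_j * sum(x^(i+j)) = sum(y * x^i) for i = 0..order.
    power_sums = [sum(x ** p for x in xs) for p in range(2 * order + 1)]
    matrix = [[float(power_sums[i + j]) for j in range(order + 1)]
              for i in range(order + 1)]
    rhs = [float(sum(y * x ** i for x, y in zip(xs, ys)))
           for i in range(order + 1)]
    return solve_linear_system(matrix, rhs)

# Points lying exactly on y = 1 + 2x + 3x^2 (hypothetical data).
xs = [-2, -1, 0, 1, 2]
ys = [9, 2, 1, 6, 17]
coeffs = polynomial_regression(xs, ys, 2)
print(coeffs)
```

For larger or ill-conditioned problems a library routine is a better choice than hand-rolled normal equations.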
40
4.7 Correlation between Two Sets of Data
Are two data sets correlated or not?
41
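Linear correlation coefficient (Pearson's r): $r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \sum_{i}(y_i - \bar{y})^2}}$
Usage: useful in determining the sources of errors.
Ex) A point-to-point digital communication link: the BER (bit error rate) measures link quality, and the BER may fluctuate randomly due to wind.
Q) Is wind the error source? Take 20 measurements of the wind and the resulting BER, then run a correlation test: r = 0.891 -> yes!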
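Pearson's r can be computed directly from its definition. The sketch below is illustrative only: the function name is a choice made here, and the wind-speed and BER values are invented, not the 20 measurements behind the r = 0.891 quoted in the example.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient between two equal-length data sets."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical measurements: wind speed and bit error rate (arbitrary units).
wind = [2.0, 5.0, 8.0, 11.0, 14.0, 17.0]
ber = [1.0, 1.4, 2.1, 2.0, 3.2, 3.5]
r_wind_ber = pearson_r(wind, ber)
print(f"r = {r_wind_ber:.3f}")

# Sanity checks: perfectly linear data give r = +1 or -1.
r_up = pearson_r([1, 2, 3], [2, 4, 6])
r_down = pearson_r([1, 2, 3], [6, 4, 2])
print(r_up, r_down)
```

A value of r close to 1 suggests a strong linear relationship, but it does not by itself establish that one quantity causes the other.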