Data Analysis and Statistical Software I (323-21-403) Quarter: Autumn 02/03
Daniela Stan, PhD
Course homepage:
Office hours (no appointment needed):
M, 3:00pm - 3:45pm at LOOP, CST 471
W, 3:00pm - 3:45pm at LOOP, CST 471
11/22/2018 Daniela Stan - CSC323

Outline
Chapter 6: Introduction to Inference
Confidence Intervals

Population: mean µ unknown, standard deviation σ known
How do we estimate µ? Draw repeated simple random samples SRS1, SRS2, ..., SRSn and compute their sample means $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n$.
The center of the distribution of the sample averages is the population average µ. A measure of the spread is the standard error.
The central limit theorem says: in repeated sampling, the sample mean $\bar{x}$ approximately follows the normal distribution centered at the unknown population mean µ and having standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
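The repeated-sampling statement can be checked numerically. The following is a minimal Python sketch (Python is used for illustration only; the course itself uses SAS). It assumes a normal population with hypothetical values µ = 50 and σ = 10, a sample size of 25, and 10,000 repeated samples; the function name `simulate_sample_means` is made up for this example.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples simple random samples of size n from N(mu, sigma)
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(num_samples)
    ]

# Hypothetical population and sample size (not taken from the slides).
mu, sigma, n = 50.0, 10.0, 25
means = simulate_sample_means(mu, sigma, n, num_samples=10_000)

center = statistics.fmean(means)    # should be close to mu
spread = statistics.stdev(means)    # should be close to sigma / sqrt(n)
standard_error = sigma / n ** 0.5   # theoretical value: 10 / 5 = 2.0

print(f"center of the sample means: {center:.3f}")
print(f"spread of the sample means: {spread:.3f} (theory: {standard_error:.3f})")
```

The center of the simulated means should land near µ and their spread near σ/√n, which is what the slide asserts.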

Confidence Intervals Population: mean µ unknown
Sample mean $\bar{x}$ is an unbiased estimator of µ.
How reliable/accurate is this estimator? What is the variability of the estimator?
The 68-95-99.7 rule says: in the normal distribution with mean µ and standard deviation σ, approximately 95% of the observations fall within 2σ of µ.
Since $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$, the rule says: approximately 95% of all samples give an $\bar{x}$ within $2\sigma/\sqrt{n}$ of µ, so approximately 95% of all samples will capture the true value µ in the interval
$(\bar{x} - 2\sigma/\sqrt{n},\ \bar{x} + 2\sigma/\sqrt{n})$

Confidence Intervals (cont.)
In the language of statistical inference, we say that we are 95% confident that the unknown population mean lies in the interval
$(\bar{x} - 2\sigma/\sqrt{n},\ \bar{x} + 2\sigma/\sqrt{n})$
This is a 95% confidence interval for µ.
In only 5% of all samples does this interval fail to cover µ (the sample mean $\bar{x}$ then lies more than $2\sigma/\sqrt{n}$ from µ); that is, 5% of all samples give inaccurate results.
A Confidence Interval (CI) has the form:
(estimate - margin of error, estimate + margin of error)
estimate is the guess for the value of the unknown population parameter
margin of error gives how accurate we believe our guess is

95% Confidence Intervals
[Figure: 95% confidence intervals computed from 25 samples with means $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{25}$. 1 out of 25 confidence intervals does not cover the true value of µ.]
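The idea that roughly 1 interval in 20 misses µ can be checked by simulation. This is a minimal Python sketch (for illustration only, not part of the course's SAS material). It assumes a normal population with hypothetical values µ = 50, σ = 10, n = 25, and uses the multiplier 2 from the rule of thumb; `coverage_rate` is a made-up helper name.

```python
import random
import statistics

def coverage_rate(mu, sigma, n, num_intervals, z=2.0, seed=1):
    """Fraction of intervals (xbar - z*sigma/sqrt(n), xbar + z*sigma/sqrt(n))
    that cover the true mean mu over repeated samples."""
    rng = random.Random(seed)      # fixed seed for a reproducible run
    margin = z * sigma / n ** 0.5  # margin of error, same for every sample
    covered = 0
    for _ in range(num_intervals):
        xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - margin < mu < xbar + margin:
            covered += 1
    return covered / num_intervals

# Hypothetical values, chosen only for the demonstration.
rate = coverage_rate(mu=50.0, sigma=10.0, n=25, num_intervals=10_000)
print(f"fraction of intervals covering mu: {rate:.3f}")
```

The printed fraction should be close to 0.95. Any single interval either covers µ or does not; the 95% describes the long-run behavior of the method.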

Properties of Confidence Intervals
It is an interval of the form (a, b), where a and b are numbers computed from the data; its purpose is to estimate an unknown parameter with an indication of how accurate the estimate is and how confident we are that the result is correct.
It has a property called a confidence level that gives the probability that the interval covers the parameter; we use C to denote the confidence level in decimal form; for example, a 95% confidence level corresponds to C = 0.95.
Users can choose the confidence level, most often 90% or higher because we want to be quite sure about our conclusions.

CI for a population mean
Choose an SRS of size n from a population having unknown mean µ and known standard deviation σ.
A level .95 confidence interval for µ is
$(\bar{x} - 2\sigma/\sqrt{n},\ \bar{x} + 2\sigma/\sqrt{n})$
In general, a level C confidence interval for µ is
$(\bar{x} - z^*\sigma/\sqrt{n},\ \bar{x} + z^*\sigma/\sqrt{n})$
The relationship between C and z*:
$P(\bar{x} - z^*\sigma/\sqrt{n} < \mu < \bar{x} + z^*\sigma/\sqrt{n}) = C$
Equivalently, z* is the value for which the area under the standard normal curve between -z* and z* is C.
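The level C interval can be computed directly once z* is known. Below is a minimal Python sketch (illustration only, not the course's SAS workflow), assuming σ is known. The function name `z_confidence_interval` and the input values (sample mean 240, σ = 15, n = 36) are hypothetical. z* is obtained from the standard normal inverse CDF at (1 + C)/2, which leaves area C between -z* and z*.

```python
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Level-C confidence interval for a population mean when the
    population standard deviation sigma is known."""
    # z* leaves area C between -z* and z* under the standard normal curve.
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sigma / n ** 0.5
    return xbar - margin, xbar + margin

# Hypothetical sample summary, chosen only for the demonstration.
low, high = z_confidence_interval(xbar=240.0, sigma=15.0, n=36, confidence=0.95)
print(f"95% CI for mu: ({low:.2f}, {high:.2f})")
```

Note that this uses the exact 95% multiplier (about 1.96) rather than the rounded value 2.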

Other confidence levels
Other confidence levels are found by checking the normal table (Table D).

C      z*      Margin of error             Confidence Interval
90%    1.645   $1.645\,\sigma/\sqrt{n}$    $\bar{x} \pm 1.645\,\sigma/\sqrt{n}$
95%    1.960   $1.960\,\sigma/\sqrt{n}$    $\bar{x} \pm 1.960\,\sigma/\sqrt{n}$
99%    2.576   $2.576\,\sigma/\sqrt{n}$    $\bar{x} \pm 2.576\,\sigma/\sqrt{n}$
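Instead of a printed table, the critical values can be computed. This is a minimal Python sketch (illustration only) that evaluates z* for the three confidence levels listed, using the standard normal inverse CDF from the standard library; `critical_value` is a made-up helper name.

```python
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the standard normal area between -z* and z* equals confidence."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.0%}: z* = {critical_value(c):.3f}")
```

To three decimals this gives 1.645, 1.960 and 2.576.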

Remarks
1. Notice the trade-off between the margin of error and the confidence level. The greater the confidence you want to place in your prediction, the larger the margin of error is (and hence the less informative you have to make your interval).
2. A C.I. gives the range of values for the unknown population parameter that are plausible in the light of the observed sample statistic. The confidence level says how plausible.
3. A C.I. is defined for the population parameter, NOT the sample statistic.
4. The confidence intervals are approximate and hold in large samples. This is because they are defined using the normal approximation.
5. To make the margin of error smaller, you can
   take a larger sample
   decrease the population standard deviation
   decrease the confidence level C
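The three ways of shrinking the margin of error can be compared numerically. The following minimal Python sketch (illustration only) assumes a baseline of σ = 10, n = 25 and C = 0.95; these numbers and the name `margin_of_error` are hypothetical.

```python
from statistics import NormalDist

def margin_of_error(sigma, n, confidence):
    """Margin of error z* * sigma / sqrt(n) for a level-C interval with known sigma."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return z_star * sigma / n ** 0.5

# Baseline and three one-at-a-time changes (hypothetical values).
base = margin_of_error(sigma=10.0, n=25, confidence=0.95)
larger_n = margin_of_error(sigma=10.0, n=100, confidence=0.95)     # 4x the sample
smaller_sigma = margin_of_error(sigma=5.0, n=25, confidence=0.95)  # half the sigma
lower_conf = margin_of_error(sigma=10.0, n=25, confidence=0.90)    # lower C

print(f"baseline:        {base:.2f}")
print(f"n = 100:         {larger_n:.2f}")
print(f"sigma = 5:       {smaller_sigma:.2f}")
print(f"C = 0.90:        {lower_conf:.2f}")
```

Because n enters through a square root, the sample size has to be quadrupled to halve the margin of error, whereas halving σ has the same effect directly.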

More on CI’s
Problems 6.7, 6.10/page 430
Choose the sample size:
In order to obtain both high confidence C and a small desired margin of error m when calculating the confidence interval of a normal mean, the following equation should be solved for n:
$m = z^*\sigma/\sqrt{n}$
Sample size: $n = (z^*\sigma/m)^2$, rounded up to the next whole number.
Smaller m gives greater n.
Problems 6.16, 6.17/page 432
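The sample-size formula is easy to apply in code. This is a minimal Python sketch (illustration only), assuming σ is known. The name `required_sample_size` and the inputs (σ = 15, desired margins of 5 and 2.5) are hypothetical. The result is rounded up because n must be a whole number and rounding down would exceed the desired margin.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which z* * sigma / sqrt(n) is at most the desired margin."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)

# Hypothetical inputs, chosen only for the demonstration.
n_wide = required_sample_size(sigma=15.0, margin=5.0, confidence=0.95)
n_narrow = required_sample_size(sigma=15.0, margin=2.5, confidence=0.95)
print(f"n for margin 5.0: {n_wide}")
print(f"n for margin 2.5: {n_narrow}")
```

Halving the desired margin roughly quadruples the required sample size.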

Summary for C.I.’s
The formula for a C.I. is valid for estimates computed from a simple random sample. It may not be valid for other types of samples.
A C.I. is used when estimating an unknown parameter from sample data.
The C.I. gives a plausible range for the unknown parameter.
The C.I. is an interval for the population parameter (the true value), NOT for the sample estimate.
The C.I. is constructed from the sample and depends on the sample!

SAS procedure for the C.I. of a population average
PROC MEANS DATA=data-name N MEAN STD CLM ALPHA=value MAXDEC=number;
  VAR measurement-variable;
RUN;
where
ALPHA=value is the (1 - confidence level) value (thus ALPHA=0.05 for a 95% C.I., ALPHA=0.1 for a 90% C.I., ALPHA=0.01 for a 99% C.I.)
CLM is the option for C.I.’s
MAXDEC=number defines how many decimal places are printed (typically 1 to 4)

C.I. for population average
SAS Example
proc means data=dist n mean std clm alpha=0.05 maxdec=4;
  var x;
  title "C.I. for population average";
run;

C.I. for population average
The MEANS Procedure
Analysis Variable : x
                        Lower 95%      Upper 95%
N    Mean    Std Dev    CL for Mean    CL for Mean
________________________________________________________
________________________________________________________
