Data Analysis and Statistical Software I (323-21-403) Quarter: Autumn 02/03
Daniela Stan, PhD
Course homepage:
Office hours (no appointment needed):
M, 3:00pm - 3:45pm at LOOP, CST 471
W, 3:00pm - 3:45pm at LOOP, CST 471
1/15/2019 Daniela Stan - CSC323

Outline
Chapter 4: Probability – The Study of Randomness
  Means and Variances of Random Variables
Chapter 5: Sampling

Outline
Chapter 4: Probability – The Study of Randomness
  The Standard Error
Chapter 5: Sampling Distributions

A measure of the chance error: the standard error
Ex: Assume that a coin is tossed repeatedly. The actual number of heads will differ from the expected value by some amount. How big is that amount on average? In 4 tosses of a fair coin, the expected number of heads is 2, and the observed number of heads is:
number of heads X = expected value + chance error
So if we observe 3 heads, the chance error is +1; if we observe 1 head, the chance error is -1.

A measure of the chance error: the standard error
The chance error is measured by the standard error. The standard error is calculated in the following 4 steps:
1. Calculate the deviation of each value of X from the expected value $\mu_X$: $x_1 - \mu_X,\ x_2 - \mu_X,\ \ldots,\ x_k - \mu_X$.
2. Square the deviations and multiply each squared deviation by its probability.
3. Add all the products.
4. Take the square root.

The expected value of the number of heads in 4 tosses of a coin
The expected value is $\mu_X = 2$.

X    Probability    Deviation X - mu_X    Squared deviation    Product = Dev^2 * Probability
0    0.0625         0 - 2 = -2            (-2)^2 = 4           4 * 0.0625 = 0.25
1    0.25           1 - 2 = -1            (-1)^2 = 1           1 * 0.25   = 0.25
2    0.375          2 - 2 =  0            (0)^2  = 0           0 * 0.375  = 0
3    0.25           3 - 2 =  1            (1)^2  = 1           1 * 0.25   = 0.25
4    0.0625         4 - 2 =  2            (2)^2  = 4           4 * 0.0625 = 0.25

Step 3: Sum the products: 0.25 + 0.25 + 0 + 0.25 + 0.25 = 1
Step 4: Take the square root. The standard error of X is $\sqrt{1} = 1$. The observed number of heads in 4 tosses of a coin is likely to be around 2, give or take 1.
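The steps above can be sketched in a few lines of Python. This is only an illustration of the slide's arithmetic; the variable names are chosen for this example, and the probability table is built by counting equally likely toss sequences with `math.comb` rather than being read from the slide.

```python
from math import comb, sqrt

# Probability table for X = number of heads in 4 tosses of a fair coin.
n_tosses = 4
values = list(range(n_tosses + 1))                       # 0, 1, 2, 3, 4
probs = [comb(n_tosses, k) / 2**n_tosses for k in values]

# Expected value: sum of value * probability.
expected = sum(x * p for x, p in zip(values, probs))

# Step 1: deviations from the expected value.
deviations = [x - expected for x in values]
# Step 2: squared deviations times their probabilities.
products = [d**2 * p for d, p in zip(deviations, probs)]
# Step 3: add the products.
total = sum(products)
# Step 4: take the square root.
standard_error = sqrt(total)

print(probs)            # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(expected)         # 2.0
print(products)         # [0.25, 0.25, 0.0, 0.25, 0.25]
print(standard_error)   # 1.0
```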

Standard Error & Probability Histograms
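The standard error measures the spread of the probability histogram.
[Figure: probability histogram of the number of heads in 4 tosses; vertical axis: chance (%), marked from 10 to 40. Marked on the horizontal axis: $\mu_X = 2$, $\mu_X - 1$ s.e. $= 2 - 1 = 1$, and $\mu_X + 1$ s.e. $= 2 + 1 = 3$.]
Remark: Observed values are rarely more than 2 or 3 standard errors away from the expected value.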

Mathematical Expressions
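Given a random variable X with probability table

X              x1    x2    x3    x4    …    xk
Probability    p1    p2    p3    p4    …    pk

The expected value is
$$\mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum_{i=1}^{k} x_i p_i$$
The standard error is
$$\sigma_X = \sqrt{\sum_{i=1}^{k} (x_i - \mu_X)^2 \, p_i}$$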
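For any probability table, the two quantities can be computed with a pair of small helper functions. The sketch below is one possible way to write them; the function names `expected_value` and `standard_error` and the input checks are assumptions made for this example, not part of the course material. The usage line applies them to a single toss of a fair coin (X = 0 or 1, each with probability 0.5).

```python
from math import isclose, sqrt

def expected_value(values, probs):
    """Sum of x_i * p_i over the probability table."""
    return sum(x * p for x, p in zip(values, probs))

def standard_error(values, probs):
    """Square root of the sum of (x_i - mu)^2 * p_i over the table."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if not isclose(sum(probs), 1.0):
        raise ValueError("probabilities must add up to 1")
    mu = expected_value(values, probs)
    return sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))

# Illustrative table (an assumption for this example): one toss of a fair coin.
one_toss_mean = expected_value([0, 1], [0.5, 0.5])
one_toss_se = standard_error([0, 1], [0.5, 0.5])
print(one_toss_mean, one_toss_se)   # 0.5 0.5
```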

Remarks on random processes
An observed value should be somewhere around the expected value; the difference is the chance error.
The likely size of the chance error is the standard error.
Observed values are rarely more than two or three standard errors away from the expected value.
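To make the last remark concrete, the chance that X falls within k standard errors of its expected value can be computed directly from a probability table. The sketch below does this for the number of heads in 4 tosses of a fair coin; the helper names `heads_table` and `prob_within` are hypothetical and introduced only for this illustration.

```python
from math import comb, sqrt

def heads_table(n):
    """Probability table for the number of heads in n tosses of a fair coin."""
    values = list(range(n + 1))
    probs = [comb(n, k) / 2**n for k in values]
    return values, probs

def prob_within(values, probs, k):
    """Chance that X is at most k standard errors from its expected value."""
    mu = sum(x * p for x, p in zip(values, probs))
    se = sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))
    return sum(p for x, p in zip(values, probs) if abs(x - mu) <= k * se)

values, probs = heads_table(4)
within_1 = prob_within(values, probs, 1)   # X in {1, 2, 3}
within_2 = prob_within(values, probs, 2)   # X in {0, 1, 2, 3, 4}
print(within_1, within_2)                  # 0.875 1.0
```

For 4 tosses, no outcome lies more than 2 standard errors from the expected value, and 87.5% of the chance lies within 1 standard error.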

Remarks on random processes (cont.)
The standard deviation is defined for a list of numbers. The standard error is defined for random processes and measures the chance error. (The difference is subtle.)
The standard error “makes more sense” if the probability histogram of the random variable is bell-shaped (similar to the normal distribution).

Recommended problems
Problems 4.60, 4.61 (page 334)

Chapter 5: Sampling Distributions
The probability distribution of a statistic from a random sample or randomized experiment is its sampling distribution.
Ex: A sample survey asks 2500 adults whether they agree that “I like buying clothes, but shopping is often frustrating and time consuming.” The number of those who say “Yes” is a random variable X. The random variable X is a count of the occurrences of some outcome in a fixed number of observations. If the number of observations is n, then the sample proportion is $\hat{p} = X/n$. For example, if 1650 of the 2500 adults say “Yes”, the sample proportion is 1650/2500 = 0.66.

The binomial distributions for sample counts
The binomial setting:
1. There is a fixed number n of observations.
2. The n observations are all independent.
3. Each observation falls into one of just two categories: “success” or “failure”.
4. The probability of success, call it p, is the same for each observation.
Example: Toss a coin n times.
- Each toss gives either heads or tails.
- The outcomes of successive tosses are independent.
- The probability p of getting heads is 0.5.

The binomial distributions for sample counts
The distribution of the count X of successes in the binomial setting is called the binomial distribution with parameters n (number of observations) and p (probability of success on any one observation). The possible values of X are the whole numbers from 0 to n. Notation: X ~ B(n, p).
When the population is much larger than the sample, the count X of successes in an SRS of size n has approximately the B(n, p) distribution if the population proportion of successes is p.
How large is large? We will use the binomial sampling distribution for counts when the population is at least 10 times as large as the sample.
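As an illustration, individual binomial probabilities can also be computed instead of being looked up. The sketch below uses the standard binomial probability formula, P(X = k) = C(n, k) p^k (1 - p)^(n - k), which is not stated on these slides; the function names `binomial_pmf` and `population_large_enough` are assumptions made for this example, the second one simply encoding the "10 times as large" rule of thumb.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), using the standard binomial formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def population_large_enough(population_size, sample_size):
    """Rule of thumb: population at least 10 times as large as the sample."""
    return population_size >= 10 * sample_size

# Number of heads in 4 tosses of a fair coin: X ~ B(4, 0.5).
coin = [binomial_pmf(k, 4, 0.5) for k in range(5)]
print(coin)                                   # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(population_large_enough(10000, 10))     # True
print(population_large_enough(50, 10))        # False
```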

Finding Binomial Probabilities: Tables
Table C: the entries in the table are the probabilities P(X = k) of individual outcomes for a binomial random variable X.
Example: A quality engineer selects an SRS of 10 switches from a large shipment for detailed inspection. Unknown to the engineer, 10% of the switches in the shipment fail to meet the specifications. What is the probability that no more than 1 of the 10 switches in the sample fails inspection?
X = count of bad switches ~ B(10, 0.1)
P(X <= 1) = P(X = 1) + P(X = 0) = 0.3874 + 0.3487 = 0.7361
Therefore, about 74% of all samples will contain no more than 1 bad switch.
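The table lookup in this example can be cross-checked numerically. The sketch below sums the standard binomial probabilities for k = 0 and k = 1; the helper name `binomial_cdf` is an assumption for this illustration, not a function from the course software.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p): sum of the individual probabilities."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Switch example: X = count of bad switches ~ B(10, 0.1).
p_at_most_one = binomial_cdf(1, 10, 0.1)
print(round(p_at_most_one, 4))   # 0.7361
```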

Finding Binomial Probabilities
Recommended problems: 5.5, 5.6 (page 385)
