Definitions Population: A collection, or set, of individuals, objects, or events whose properties are to be analyzed. Sample: A subset of the population.


1 Definitions Population: A collection, or set, of individuals, objects, or events whose properties are to be analyzed. Sample: A subset of the population. We desire knowledge about an entire population, but it is most often prohibitively expensive to study every member, so we select a representative sample from the population and study the individual items in the sample. Descriptive Statistics: The collection, presentation, and description of the sample data. Inferential Statistics: The technique of interpreting the values resulting from the descriptive techniques and making decisions and drawing conclusions about the population. (Section 1.1, Page 4)

2 Definitions Parameter: A numerical value summarizing all the data of a population. For example, the average high school grade point of all Shoreline students is 3.20. We often use Greek letters to identify parameters, μ = 3.20. Statistic: A numerical value summarizing the sample data. For example, the average grade point of a sample of Shoreline students is 3.18. We would use the symbol x̄ = 3.18. The statistic corresponds to the parameter. We usually don't know the value of the parameter, so we take a sample and estimate it with the corresponding statistic. Sampling Variation: While the parameter of a population is considered a fixed number, the corresponding statistic will vary from sample to sample. Also, different populations give rise to more or less sampling variability. Considering the variable age, samples of 60 students from a community college would have less variability than samples of the same size from a Seattle neighborhood. (Section 1.1, Page 4)
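Sampling variation can be made concrete with a small simulation. The sketch below is only an illustration: the population of 5,000 grade points is invented (drawn around 3.2 with an assumed spread of 0.4), and the names `population`, `mu`, and `sample_means` are not from the slides. It computes the parameter once and then the statistic for five separate samples of 60.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population of 5,000 grade points, invented for illustration
# and clipped to the 0.0-4.0 grading scale.
population = [min(4.0, max(0.0, random.gauss(3.2, 0.4))) for _ in range(5000)]

# The parameter: a single fixed number for the whole population.
mu = statistics.mean(population)

# The statistic: the mean of each of five different samples of 60 students.
sample_means = [statistics.mean(random.sample(population, 60)) for _ in range(5)]

print(f"parameter mu = {mu:.3f}")
for i, x_bar in enumerate(sample_means, start=1):
    print(f"sample {i}: x-bar = {x_bar:.3f}")
```

Each printed x-bar is an estimate of the same fixed mu, yet the five estimates differ from one another, which is the sampling variation the slide describes.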

3 Variables Variable: A characteristic of interest about each element of a population. Data: The set of values collected for the variable from each of the elements that belong to the sample. Variability: The extent to which data values for a particular variable differ from each other. Numerical or Quantitative Variable: A variable that quantifies an element of the population. The HS grade point of a student is a numerical variable. Numerical variables are numbers for which math operations make sense. The average grade point of a sample makes sense. Continuous Numerical Variable: The variable can take on an uncountable number of values between two points on the number line. An example is the weight of people. Discrete Numerical Variable: The variable can take on a countable number of values between two points on a number line. An example is the price of statistics textbooks. (Section 1.1, Page 8)

4 Variables (2) (Section 1.1, Page 8) Categorical or Qualitative Variable: A variable that describes or categorizes an element of a population. The gender of a person would be a categorical variable. The categories are male and female. Nominal Categorical Variable: A categorical variable that uses a number to describe or name an element of a population. An example is a telephone area code. It is a number, but not a numerical variable used in math operations. The average area code does not make sense. Ordinal Categorical Variable: A categorical variable that incorporates an ordered position or ranking. An example would be a survey response that ranks "very satisfied" ahead of "satisfied" ahead of "somewhat satisfied." Limited math operations may be done with ordinal variables.

5 Problems (Problems, Page 19)

6 Problems (Section 1.3, Page 20)

7 Observational Studies and Experiments (Section 1.3, Page 12) Observational Study: Researchers collect data without modifying the environment or controlling the process being observed. Surveys and polls are observational studies. Observational studies cannot establish causality. Example: For a randomly selected high school, researchers collect data on each student (grade point and whether the student has music training) to see if there is a relationship between the two variables. Experiments: Researchers collect data in a controlled environment. The investigator controls or modifies the environment and observes the effect of a variable under study. Experiments can establish causality. Example: Randomly divide a sample of people with migraine headaches into control and treatment groups. Give the treatment group an experimental medication and the control group a placebo, and then measure and compare the reduction in frequency and severity of headaches for both groups.

8 Single-Stage Sampling Methods (Section 1.3, Page 13) Single-stage sampling: A sample design in which the elements of the sampling frame are treated equally and there is no subdividing or partitioning of the frame. Simple Random Sample: Sample selected in such a way that every element of the population has an equal probability of being selected and all samples of size n have an equal probability of being selected. Example: Select a simple random sample of 6 students from a class of 30. 1. Number the students from 1 to 30 on the roster. 2. Get 6 non-recurring random numbers between 1 and 30. 3. The six students who match the six random numbers are the sample.
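The three steps of the class-of-30 example can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides; the function name `simple_random_sample` is made up here, and `random.sample` is used because it draws without replacement, which matches "non-recurring random numbers".

```python
import random

def simple_random_sample(population_size, sample_size, rng=random):
    """Return sample_size distinct roster numbers from 1..population_size."""
    # Step 1: number the students from 1 to population_size.
    roster = range(1, population_size + 1)
    # Step 2: draw non-recurring random numbers (sampling without replacement).
    picks = rng.sample(roster, sample_size)
    # Step 3: the students with these roster numbers are the sample.
    return sorted(picks)

chosen = simple_random_sample(30, 6)
print(chosen)
```

Because every subset of six roster numbers is equally likely under `random.sample`, both conditions in the definition (equal chance per element, equal chance per sample of size n) hold.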

9 Multistage Sampling Designs (Section 1.3, Page 15) Multistage Sampling: A sample design in which the elements of the sampling frame are subdivided and the sample is chosen in more than one stage. Stratified Random Sampling: A sample is selected by stratifying the population, or sampling frame, and then selecting a number of items from each of the strata by means of a simple random sampling technique. The strata are usually subgroups of the sampling frame that are homogeneous but different from each other. Example: Select a sample of six students from a class of 30 so that the sample contains an equal number of males and females. 1. List the males and females separately. 2. Take a simple random sample of 3 students from each group. 3. The six students selected are the sample.
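A stratified draw like the one in this example might look as follows. The roster is hypothetical (14 males and 16 females, with invented labels such as `M1` and `F1`), and `stratified_sample` is an illustrative name, not something from the slides.

```python
import random

def stratified_sample(strata, per_stratum, rng=random):
    """Take a simple random sample of per_stratum members from every stratum."""
    sample = []
    for members in strata.values():
        # Each stratum is sampled independently, without replacement.
        sample.extend(rng.sample(members, per_stratum))
    return sample

# Hypothetical class of 30, listed separately by stratum (step 1).
strata = {
    "male": [f"M{i}" for i in range(1, 15)],    # 14 students
    "female": [f"F{i}" for i in range(1, 17)],  # 16 students
}

# Steps 2 and 3: three students per stratum, six in total.
stratified = stratified_sample(strata, per_stratum=3)
print(stratified)
```

Unlike a simple random sample of six, this design guarantees the 3/3 split every time.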

10 Multi-Stage Sampling Designs (Section 1.3, Page 16) Cluster Sample: A sample obtained by stratifying the population, or sampling frame, and then selecting some or all of the items from some, but not all, of the strata. The strata are usually easily identified subgroups of the sampling frame that are similar to each other. This is often the most economical way to sample a large population. Example: Take a sample of 300 Catholics in the Seattle area. 1. Get a list of the Catholic parishes in the Seattle area. 2. Take a random sample of 3 parishes. 3. In each parish, select a simple random sample of 100 parishioners.
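The parish example is a two-stage draw: first clusters, then members within the chosen clusters. The sketch below assumes an invented frame of 10 parishes with 400 parishioners each; the names `parishes` and `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, rng=random):
    """Pick n_clusters clusters at random, then n_per_cluster members in each."""
    # Stage 1: random sample of cluster names.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: simple random sample inside each chosen cluster only.
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical sampling frame: 10 parishes of 400 parishioners each.
parishes = {
    f"Parish {k}": [f"P{k}-{i}" for i in range(1, 401)] for k in range(1, 11)
}

clustered = cluster_sample(parishes, n_clusters=3, n_per_cluster=100)
total = sum(len(members) for members in clustered.values())
print(sorted(clustered), total)
```

Only three parish rosters are ever touched, which is why the slide calls this design economical for large populations.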

11 Problem a. Find the mean, variance, and standard deviation. b. Find the 5-number summary. c. Make a box-and-whisker display and label the numbers. d. Calculate the interquartile range and the range. e. Describe the shape of the distribution. (Problems, Page 50)

12 Summary of Probability Formulas Equally Likely Outcomes: P(A) = n(A)/n. Complement: P(A) = 1 − P(not A); P(not A) = 1 − P(A). General Addition Rule: P(A or B) = P(A) + P(B) − P(A and B). If A and B are disjoint, P(A and B) = 0, which gives the Special Addition Rule: P(A or B) = P(A) + P(B). General Multiplication Rule: P(A and B) = P(A) × P(B|A). If A and B are independent, P(B|A) = P(B), which gives the Special Multiplication Rule: P(A and B) = P(A) × P(B). Odds: If the odds for A are a:b, then the odds against A are b:a. The probability of A is a/(a+b). The probability of not A is b/(a+b). (Chapter 4)
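The formulas above can be checked with exact fractions. The card-drawing example (one card from a standard 52-card deck, A = "heart", B = "king") is an illustration added here, not taken from the slides, and the helper names `p_or`, `p_and`, and `odds_to_probability` are made up.

```python
from fractions import Fraction

def p_or(p_a, p_b, p_a_and_b):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

def p_and(p_a, p_b_given_a):
    """General multiplication rule: P(A and B) = P(A) * P(B|A)."""
    return p_a * p_b_given_a

def odds_to_probability(a, b):
    """If the odds for A are a:b, then P(A) = a / (a + b)."""
    return Fraction(a, a + b)

# One card from a standard deck: A = heart, B = king.
p_heart = Fraction(13, 52)
p_king = Fraction(4, 52)
p_king_given_heart = Fraction(1, 13)  # one king among the 13 hearts

p_heart_and_king = p_and(p_heart, p_king_given_heart)
p_heart_or_king = p_or(p_heart, p_king, p_heart_and_king)

# P(king | heart) equals P(king), so the events are independent and the
# special multiplication rule gives the same answer.
independent = p_king_given_heart == p_king

print(p_heart_and_king, p_heart_or_king, independent, odds_to_probability(3, 2))
```

Using `Fraction` avoids floating-point rounding, so the equality checks between the general and special rules are exact.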

13 Problems (Problems, Page 95)

14 Problem (Problems, Page 95)

15 Problems (Problems, Page 97)

16 Problems (Problems, Page 99)

17 Z Score Problems (Problems, Page 52)

18 Problems (Problems, Page 132)

19 Problems (Problems, Page 133) 6.51 IQ scores are normally distributed with a mean of 100 and a standard deviation of 16. Find the following: a. The 66th percentile. b. The 80th percentile. c. The minimum score required to be in the top 10%. d. The minimum score to be in the top 25%. 6.52 Find the two z-scores that bound the middle 30% of the standard normal distribution.
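Percentile questions like 6.51 and 6.52 are inverse-normal lookups. The slides presumably intend a calculator or table; as an alternative sketch, Python's standard-library `statistics.NormalDist` provides `inv_cdf` for the same purpose. The variable names below are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=16)

# 6.51: a percentile is the score with the given area to its left.
p66 = iq.inv_cdf(0.66)
p80 = iq.inv_cdf(0.80)
# "Top 10%" leaves 90% below the cutoff; "top 25%" leaves 75% below.
top10_cutoff = iq.inv_cdf(0.90)
top25_cutoff = iq.inv_cdf(0.75)

# 6.52: the middle 30% leaves 35% in each tail of the standard normal.
z = NormalDist()  # mean 0, standard deviation 1
z_low = z.inv_cdf(0.35)
z_high = z.inv_cdf(0.65)

for label, value in [
    ("66th percentile", p66),
    ("80th percentile", p80),
    ("top 10% cutoff", top10_cutoff),
    ("top 25% cutoff", top25_cutoff),
    ("z low", z_low),
    ("z high", z_high),
]:
    print(f"{label}: {value:.2f}")
```

The key step in each part is translating the wording into a left-tail area before calling `inv_cdf`.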

20 Problems (Problems, Page 149)

21 Problems (Problems, Page 151)

22 Problems (Problems, Page 50)

23 Problems 23Problems, Page 179 the standard deviation is 5 seconds.

24 Problems Test the claim that the BMI of the cardiovascular technologists is different from the BMI of the general population. Use α = 0.05. Assume the population of the BMI of the cardiovascular technologists is normal. a. State the necessary hypotheses. b. Is the sampling distribution normal? Why? c. Find the p-value. d. State your conclusion. e. If you made an error, what type of error did you make? (Problems, Page 181)

25 Problems (Problems, Page 179)

26 Problems a. Find the 98% confidence interval. b. Find the critical value. c. Find the margin of error. d. Find the standard error. e. What assumption must we make about the population to have a t-sampling distribution? f. What are the proper words to describe the confidence interval? g. If you wanted to have a margin of error of one minute and the 98% confidence interval for this data, how large must the sample be? (Problems, Page 205)

27 Problems a. Find the p-value. b. State your conclusion. c. What is the name of the probability model used for the sampling distribution? d. What is the mean of the sampling distribution? e. What is the value of the standard error? f. If your conclusion is in error, what type of error is it? (Problems, Page 205)

28 Problems (Problems, Page 208)

29 Problems a. Check the conditions for a normal sampling distribution. b. State the hypotheses. c. Find the p-value. d. State your conclusion. e. If you make an error in your conclusion, what type is it? f. Find the mean of the sampling distribution. g. Find the standard error of the sampling distribution. (Problems, Page 207)

30 Dependent and Independent Samples (Section 10.1, Page 208)

31 Problems a. Test the hypothesis that the people increased their knowledge. Use α = 0.05 and assume normality. State the appropriate hypotheses. b. Find the p-value and state your conclusion. c. Find the 90% confidence interval for the mean estimate of the increase in test scores. (Problems, Page 231)

32 Problems a. State the hypotheses (assume normality). b. Find the p-value and state your conclusion. c. Find the 95% confidence interval for the difference of the means, Gouda – Brie. d. Find the mean and standard error of the sampling distribution. (Problems, Page 232)

33 Problems a. State the appropriate hypotheses. b. Find the p-value and state your conclusion. c. What model is used for the sampling distribution, and what are the mean of the sampling distribution and its standard error? d. Find the 98% confidence interval for the difference in proportions, men – women. (Problems, Page 234)

34 Summary of Chi-Square Applications Goodness of Fit Test: Given one categorical variable with a fixed set of proportions for the categories. Ha: The observed data do not fit the proportions. Calculate expected values (Ho true proportion × total observations). Observed and expected data in List Editor. PRGM: GOODFIT. Test for Independence: Given two categorical variables measured on the same population. Ha: The variables are not independent (they are related). Observed data in Matrix Editor. Stat-Tests-χ² Test. Test for Homogeneity: Given one categorical variable and two or more populations. Ha: The proportions for the categories are not the same for all populations. Observed data in Matrix Editor. Stat-Tests-χ² Test. (Chapter 12, Summary)

35 Chi-Square Distribution Fair Die Example Now we need a sampling distribution for the χ² statistic = 2.2, so we can calculate the probability of getting χ² ≥ 2.2 when the true proportions are all equal to 1/6. χ² Distribution for 5 df: This is the distribution of all possible χ² statistics calculated from all possible samples of 60 observations when there are 6 proportions or cells. Note that the degrees of freedom equal the number of proportions − 1. Finding the p-value on the TI-83, given the χ² statistic and df: PRGM – CHI2DIST; LOWER BOUND: 2.2; UPPER BOUND: 2ND E99; df: 5. Output: P-VALUE = 0.8208. The null hypothesis cannot be rejected. (Section 11.2, Page 240)
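The slide obtains the p-value from a TI-83 program. As a cross-check that needs no calculator, the upper-tail area of a χ² distribution can be computed with the standard series for the regularized lower incomplete gamma function. This is a sketch under that assumption; `chi2_upper_tail` is an illustrative name, and the fixed number of series terms is adequate for modest statistics like the one here but would need revisiting for very large values.

```python
import math

def chi2_upper_tail(stat, df, terms=200):
    """Approximate P(X >= stat) for a chi-square variable with df degrees of freedom.

    Uses P(a, x) = x^a e^(-x) / Gamma(a) * sum_{n>=0} x^n / (a (a+1) ... (a+n))
    with a = df / 2 and x = stat / 2, then returns 1 - P(a, x).
    """
    if stat <= 0:
        return 1.0
    a, x = df / 2.0, stat / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)  # next term of the series
        total += term
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return 1.0 - lower

# Fair die example from the slide: chi-square statistic 2.2 with 5 df.
p_value = chi2_upper_tail(2.2, 5)
print(f"p-value = {p_value:.4f}")
```

A p-value this large means a statistic of 2.2 or more is entirely ordinary when the die is fair, which is why the null hypothesis is not rejected.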

36 Problems a. Perform a hypothesis test to see if the preferences are not all the same. State the hypotheses. b. Find the p-value and state your conclusion. c. What is the name of the model used for the sampling distribution? (Problems, Page 252)

37 Problems a. Perform a hypothesis test to see if the preferences are not all the same. State the hypotheses. b. Find the p-value and state your conclusion. c. What is the name of the model used for the sampling distribution? (Problems, Page 252)

38 Problems a. Test the hypothesis that the size of community reared in is independent of the size of community residing in. State the appropriate hypotheses. b. Find the p-value and state your conclusion. c. What is the name of the sampling distribution? d. What are the necessary conditions, and are they satisfied? What is the value of the smallest expected cell? (Section 11.3, Page 254)

39 The F-Distribution (Sec 10.5, Page 226) Each sample must be from a normal distribution.

40 Problem (Problems, Page 234) Set up the problem so that the F-Stat > 1. a. State the necessary hypotheses. b. Find the p-value and state your conclusion. c. What is the name of the model used for the sampling distribution?

41 Problems (Sec 12.1, Page 268) a. State the necessary hypotheses. b. Sketch the side-by-side box plots. Does it appear that the means are all the same? c. Find the p-value and state your conclusion. d. What is the name of the model used for the sampling distribution?

42 Problems (Problems, Page 268)

City          Sample Size  Sample Mean  Sample St. Dev.
Atlanta       6            24.67        7.76
Boston        7            33.00        9.56
Dallas        7            30.86        7.58
Philadelphia  5            32.20        7.47
Seattle       5            27.40        9.40
St. Louis     6            25.83        10.03

a. Test the hypothesis that the mean commute times are not all the same. State the appropriate hypotheses. b. Find the p-value and state your conclusion. c. What is the name of the sampling distribution? d. What is the F-statistic, the df numerator, and the df denominator?
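When only group sizes, means, and standard deviations are available, the one-way ANOVA F-statistic can still be computed from those summaries. The sketch below assumes the flattened table on this slide parses as (sample size, sample mean, sample standard deviation) per city; that reading of the digits, and the names `commute` and `anova_from_summary`, are assumptions made for illustration. It covers the F-statistic and degrees of freedom only; the p-value still needs an F-distribution lookup.

```python
def anova_from_summary(groups):
    """One-way ANOVA F-statistic from (n, mean, sd) summaries per group."""
    k = len(groups)
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
    # Within-group variation: pooled from each group's sample variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

    df_num = k - 1
    df_den = n_total - k
    f_stat = (ss_between / df_num) / (ss_within / df_den)
    return f_stat, df_num, df_den

# Assumed parse of the slide's table: city -> (size, mean, st. dev.).
commute = {
    "Atlanta": (6, 24.67, 7.76),
    "Boston": (7, 33.00, 9.56),
    "Dallas": (7, 30.86, 7.58),
    "Philadelphia": (5, 32.20, 7.47),
    "Seattle": (5, 27.40, 9.40),
    "St. Louis": (6, 25.83, 10.03),
}

f_stat, df_num, df_den = anova_from_summary(commute)
print(f"F = {f_stat:.3f}, df numerator = {df_num}, df denominator = {df_den}")
```

An F-statistic near 1 indicates that the variation between city means is about the same size as the variation within cities.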


