STATISTICS: THE SCIENCE OF DATA
Methods for collecting, summarizing, analyzing, and interpreting data.
WHY STUDY STATISTICS?
To understand information involving “chance”: polls, advertising, sports, etc.
To read and produce research results: tables, graphs, reports.
To develop analytic and critical thinking skills.
STATISTICAL PROCESS
COLLECT DATA (usually a bunch of numbers, rather messy)
SUMMARIZE DATA (graphically or numerically)
ANALYZE DATA (use statistical methods)
DRAW CONCLUSIONS: make inferences or decisions about the population after observing only a subset of it, a sample (use more statistical methods)
POPULATION, SAMPLE AND INFERENCE
POPULATION: all the data one can collect on a topic of interest.
Examples
1. Want to know the chances of a STAT152 student getting an A. Population: all STAT152 students.
2. Want to know the “average family income” in the US. Population: incomes of all US families (over 100 million).
3. Want to know if a coin is fair, i.e. if the chances of it coming up H or T are 50% each. To figure that out we need to keep tossing the coin and recording the results. Population: an infinite number of results.
SAMPLE
Problem: Populations are often difficult or impossible to observe in full.
Solution: Use a representative subset of the population for analysis: a sample!
To make the sample representative, select units randomly. For example, in a simple random sample every element of the population has the same chance of being selected.
Advantages of random sampling:
Reduced cost
Possibility of measuring the precision of sample estimates
Good accuracy
Sometimes sampling is the only way to get information about the population.
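A minimal sketch of drawing a simple random sample, assuming the population can be listed as ID numbers. The population of 1,000 IDs, the sample size of 50, and the helper name `simple_random_sample` are all made up for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every unit has the same chance of selection."""
    rng = random.Random(seed)          # seeded generator, so the draw is repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: ID numbers 1..1000 standing in for the units of interest.
population = list(range(1, 1001))

sample = simple_random_sample(population, 50, seed=1)
print(len(sample), "units selected, all distinct:", len(set(sample)) == len(sample))
```

Fixing the seed only makes the illustration reproducible; in practice the draw would be left random.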
POPULATION PARAMETERS and SAMPLE STATISTICS
Population characteristics, like center or spread: parameters.
Sample characteristics (computed from sample values): statistics.
Example: population mean μ, sample mean x̄.
We use sample statistics as estimates of the population parameters.
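The distinction between a parameter and a statistic can be sketched in a few lines. The population here (the numbers 1 to 10,000) and the sample size of 100 are hypothetical choices, used only because the population mean is then easy to check by hand.

```python
import random
from statistics import mean

# Hypothetical population, chosen only so that the true mean is known exactly.
population = list(range(1, 10_001))
mu = mean(population)                 # parameter: population mean (5000.5)

rng = random.Random(0)                # fixed seed for a repeatable illustration
sample = rng.sample(population, 100)  # simple random sample of 100 units
x_bar = mean(sample)                  # statistic: sample mean, an estimate of mu

print("population mean:", mu)
print("sample mean:    ", x_bar)
```

A different sample would give a different sample mean, while the population mean stays fixed.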
INFERENCE
Typical statistical inference: making statements about population parameters using sample statistics (estimation, hypothesis testing). This is parametric inference.
Inference not involving parameters is nonparametric inference: not included in this class.
[Diagram: Population, Sample, Parameter, Statistic, Inference]
EXAMPLE
Is a coin fair? p = probability that the coin comes up H. Fair coin → p = 0.5
Collect Data: Toss the coin 100 times. DATA: H, H, T, H, T, T, T, ….
Summarize Data: 20 H and 80 T.
Analyze Data: Compute the sample proportion of H: p_s = 20/100 = 0.2
Inference/conclusion: It looks like the coin favors T, so the coin is not fair.
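As a rough check on the conclusion in this example, the sketch below computes the sample proportion and then asks how likely 20 or fewer heads in 100 tosses would be if the coin really were fair. The exact binomial tail calculation is an added illustration, not a method stated on the slide, and the helper name `binom_cdf` is hypothetical.

```python
from math import comb

n_tosses = 100
n_heads = 20

# Sample proportion of heads, as on the slide.
p_s = n_heads / n_tosses

def binom_cdf(k, n, p):
    """P(X <= k) when X counts successes in n independent trials with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Chance of seeing 20 or fewer heads in 100 tosses of a fair coin (p = 0.5).
tail_prob = binom_cdf(n_heads, n_tosses, 0.5)

print("sample proportion of H:", p_s)
print("P(20 or fewer H | fair coin):", tail_prob)
```

The tail probability comes out far below one in a million, which is why a result of 20 heads is hard to reconcile with a fair coin.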