Lecture 1: Introduction
Statistics is concerned with:
- Collecting and presenting data to assist decision making
- Obtaining reliable forecasts
- Processing and analyzing data
Examples involving statistics
- Predicting the outcome of an election (estimation and prediction)
- Inspecting incoming goods from a supplier (one-sample hypothesis testing)
- Determining whether a new hypertension drug lowers blood pressure (two-sample hypothesis testing)
Examples (cont'd)
- In marketing, evaluating whether higher spending on advertising is justified (simple linear regression)
- Forecasting economic indices, such as GNP and GDP, that are related to many factors (multiple linear regression)
Key Definitions
- A population (universe) is the collection of all members of a group.
- A sample is a portion of the population selected for analysis.
- A parameter is a numerical measure that describes a characteristic of a population.
- A statistic is a numerical measure that describes a characteristic of a sample.
Population vs. Sample
[Figure: a population of items labeled a through z, and a sample drawn from it: b, c, g, i, n, o, r, u, y.]
- Measures used to describe a population are called parameters.
- Measures computed from sample data are called statistics.
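The distinction between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 26 weights are randomly generated rather than real data, and the names `population`, `sample_labels`, and the seed value are hypothetical choices, not part of the lecture.

```python
import random
import statistics

# Hypothetical population: one weight (in pounds) for each member a..z.
# The values are invented for illustration only.
random.seed(1)  # fixed seed so the sketch is repeatable
population = {chr(ord("a") + i): random.randint(100, 160) for i in range(26)}

# Parameter: a numerical measure computed from the whole population.
population_mean = statistics.mean(list(population.values()))

# Sample: a portion of the population selected for analysis (9 members here).
sample_labels = random.sample(sorted(population), k=9)
sample = [population[label] for label in sample_labels]

# Statistic: a numerical measure computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```

The two printed values will usually differ somewhat, which is why inferential methods are needed to reason from a sample back to its population.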
Examples
- Population: all eligible voters. Sample: 1000 voters polled.
- Population: all light bulbs manufactured in a day. Sample: 100 light bulbs selected.
- Population: all patients with high blood pressure. Sample: 200 hypertension patients enrolled for a clinical study.
Two branches of statistics
- Descriptive statistics: collecting, presenting, and characterizing data
- Inferential statistics: drawing conclusions and/or making decisions concerning a population based only on sample data
Descriptive Statistics
- Collect data, e.g., a survey
- Present data, e.g., tables and graphs
- Characterize data, e.g., the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
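As a small worked example of characterizing data, the sample mean is the sum of the observations divided by the sample size. The five weights below are hypothetical survey responses made up for this sketch.

```python
# Hypothetical survey responses: weights (in pounds) of n = 5 respondents.
weights = [118, 125, 130, 122, 115]

n = len(weights)
sample_mean = sum(weights) / n  # sum of observations divided by sample size

print(sample_mean)  # 610 / 5 = 122.0
```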
Inferential statistics
- Population: use parameters to summarize features
- Sample: use statistics to summarize features
- Inference: drawing conclusions about a population based on sample results
Two types of Inferential Statistics
- Estimation, e.g., estimate the population mean weight using the sample mean weight
- Hypothesis testing, e.g., test the claim that the population mean weight is 120 pounds
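Both ideas can be sketched on one small data set. The eight weights below are hypothetical. The one-sample t statistic used here is one common way to test a claim about a mean; the lecture itself does not name a specific test at this point, so treat that choice as an assumption.

```python
import math
import statistics

# Hypothetical sample of weights (in pounds); invented for illustration.
weights = [118, 125, 130, 122, 115, 128, 121, 117]
claimed_mean = 120  # the claim under test: population mean weight is 120 pounds

n = len(weights)

# Estimation: use the sample mean as a point estimate of the population mean.
point_estimate = statistics.mean(weights)

# Hypothesis testing (assumed approach: one-sample t statistic).
s = statistics.stdev(weights)        # sample standard deviation (n - 1 divisor)
standard_error = s / math.sqrt(n)    # estimated standard error of the mean
t_statistic = (point_estimate - claimed_mean) / standard_error

print(f"point estimate of the mean: {point_estimate}")
print(f"t statistic against the claim: {t_statistic:.3f}")
```

Deciding whether the claim should be rejected would additionally require comparing the t statistic with a t distribution on n - 1 degrees of freedom, which this sketch leaves out.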
Reasons for Drawing a Sample
- Less time consuming than a census
- Less costly to administer than a census
- Less cumbersome and more practical to administer than a census of the population
Types of Data
- Categorical (defined categories), e.g., marital status, political party, eye color
- Numerical, discrete (counted items), e.g., number of children, defects per hour
- Numerical, continuous (measured characteristics), e.g., weight, distance
- Categorical random variables yield categorical responses, e.g., "Are you married?" Yes or No.
- Numerical random variables yield numerical responses.
- Discrete random variables yield numerical responses that arise from a counting process, e.g., "How many cars do you own?" 3 cars.
- Continuous random variables yield numerical responses that arise from a measuring process, e.g., "What is your weight?" 130 pounds.
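One way to keep this taxonomy straight is to tag each survey question with its data type explicitly. The sketch below is purely illustrative: the `DataType` enum and the `is_numerical` helper are hypothetical names introduced here, and the questions are the three examples from the slide.

```python
from enum import Enum


class DataType(Enum):
    CATEGORICAL = "categorical"  # responses fall into defined categories
    DISCRETE = "discrete"        # numerical, arising from a counting process
    CONTINUOUS = "continuous"    # numerical, arising from a measuring process


# The example questions from the lecture, tagged by the type of response.
questions = {
    "Are you married?": DataType.CATEGORICAL,
    "How many cars do you own?": DataType.DISCRETE,
    "What is your weight?": DataType.CONTINUOUS,
}


def is_numerical(data_type: DataType) -> bool:
    """Discrete and continuous variables are both numerical."""
    return data_type in (DataType.DISCRETE, DataType.CONTINUOUS)


for question, data_type in questions.items():
    kind = "numerical" if is_numerical(data_type) else "categorical"
    print(f"{question} -> {data_type.value} ({kind})")
```

The type is assigned by how the response arises (categories, counting, or measuring), not by how the value happens to be written: a weight recorded as 130 is still continuous.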