Summary to Exploring Data
Lesson 1 - 0
Chapter Objectives
Use a variety of graphical techniques to display a distribution. These should include bar graphs, pie charts, stemplots, histograms, ogives, time plots, and boxplots.
Interpret graphical displays in terms of the shape, center, and spread of the distribution, as well as gaps and outliers.
Use a variety of numerical techniques to describe a distribution. These should include mean, median, quartiles, five-number summary, interquartile range, standard deviation, range, and variance.
Interpret numerical measures in the context of the situation in which they occur.
Learn to identify outliers in a data set.
Section Objectives
Identify the individuals and variables in a set of data.
Classify variables as categorical or quantitative.
Vocabulary
Individuals – objects described by a set of data; may be people, animals, or things
Variable – any characteristic of an individual; can take on different values for different individuals
Categorical variable – places an individual into one of several groups or categories
Quantitative variable – takes numerical values for which it makes sense to find an average
Distribution – tells us what values the variable takes on and how often it takes these values
Inference – using a sample of data to infer (draw conclusions) about a larger group of data
Setup
An airline has finished training 25 first officers, 15 males and 10 females, to be captains. The airline has only 8 new captain positions to fill. Management decides to put all 25 names in a hat and draw out the lucky 8. The next day, 5 females and 3 males are announced as new captains. Some of the male pilots who were not selected complain that the selection was not random and want to file a grievance. Do they have any statistical evidence?
Activity
Hiring Discrimination – It just won’t fly! on pg 5
Captains   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5
Male
Female
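The hat-drawing activity can also be imitated on a computer. The sketch below is only an illustration, not the textbook's procedure: it assumes each trial means drawing 8 names without replacement from a hat of 15 male and 10 female names, and the function names `draw_captains` and `run_trials` and the seed value are made up for this example.

```python
import random

def draw_captains(rng, males=15, females=10, positions=8):
    """Draw names from the hat without replacement; return (males, females) chosen."""
    hat = ["M"] * males + ["F"] * females      # one slip of paper per first officer
    chosen = rng.sample(hat, positions)        # sample() never picks the same slip twice
    n_male = chosen.count("M")
    return n_male, positions - n_male

def run_trials(n_trials=5, seed=1):
    """Repeat the drawing n_trials times; the seed (an arbitrary choice) makes runs repeatable."""
    rng = random.Random(seed)
    return [draw_captains(rng) for _ in range(n_trials)]

# Fill in a table like the one in the activity
print("Trial  Male  Female")
for i, (m, f) in enumerate(run_trials(), start=1):
    print(f"{i:>5}  {m:>4}  {f:>6}")
```

Running many more than 5 trials and counting how often 3 or fewer males are chosen gives a rough sense of how unusual the announced result is.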
Activity – Computer Assisted
Using Excel to calculate the probability of each possible number of male captains being selected, with the parameters given in the activity:
Male captains   0      1      2      3      4      5      6      7      8
M-Pct           0.000  0.001  0.020  0.106  0.265  0.333  0.208  0.059  0.006
Cum                           0.021  0.127  0.392  0.725  0.933  0.992  ~1
Cumulative probabilities read from a table don’t always add to exactly 1 because of round-off error.
There is almost a 13% chance of this result (3 male captains) or a more extreme one (fewer male captains).
Later in the course we will cover this type of problem again in the non-AP portion of discrete random variables.
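The spreadsheet itself is not shown, so the following is only a sketch of one way to get such probabilities. It assumes the drawing is fair and counts combinations: the number of ways to choose k of the 15 males and 8 - k of the 10 females, divided by the number of ways to choose any 8 of the 25 names. The function name `prob_males` is made up for this example. Because the code does not round until printing, its values can differ from the table in the third decimal place.

```python
from math import comb

def prob_males(k, males=15, females=10, positions=8):
    """Probability that exactly k of the selected captains are male in a fair drawing."""
    ways_for_k = comb(males, k) * comb(females, positions - k)
    all_ways = comb(males + females, positions)
    return ways_for_k / all_ways

probs = [prob_males(k) for k in range(9)]   # k = 0, 1, ..., 8 male captains

# Print each probability with its running (cumulative) total
print("Males  Prob   Cum")
cumulative = 0.0
for k, p in enumerate(probs):
    cumulative += p
    print(f"{k:>5}  {p:.3f}  {cumulative:.3f}")

# Chance of 3 or fewer male captains: the "this or more extreme" result
p_3_or_fewer = sum(probs[:4])
print(f"P(3 or fewer males) = {p_3_or_fewer:.3f}")
```

The last line comes out to about 0.128, in line with the "almost 13%" figure on the slide.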
Categorical Variables
From Mr. Starnes's data collection sheet:
Variable          Values
Gender            M/F
Hair color        Br/Bl/Rd/Gr
Restaurant
Birth date
Dominant hand     R/L
Bathroom          Y/N
Numbers (1-4)     1/2/3/4
S or Q            S/Q
Heads or Tails    H/T
Quantitative Variables
From Mr. Starnes's data collection sheet:
Variable              Range
Hours of sleep        0-10
Number of siblings    0-8
Height                4’9” – 6’5”
SAT scores            400-800
Ounces of soda        0-64
Pulse                 40-80
Days                  0-10
Time spent            0-4
Instructor’s age      50-70
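The difference between the two kinds of variables shows up in how they are summarized. The sketch below uses a few invented responses (not data from the actual collection sheet) to show a distribution of counts for a categorical variable and an average for a quantitative one. The names `responses` and `distribution` are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical responses, invented for illustration only.
# Each dictionary is one individual; each key is a variable.
responses = [
    {"dominant_hand": "R", "hours_of_sleep": 7},
    {"dominant_hand": "R", "hours_of_sleep": 6},
    {"dominant_hand": "L", "hours_of_sleep": 8},
    {"dominant_hand": "R", "hours_of_sleep": 5},
    {"dominant_hand": "R", "hours_of_sleep": 9},
]

def distribution(values):
    """Return which values occur and how often each one occurs."""
    return dict(Counter(values))

# Categorical variable: counting the individuals in each category makes sense
hand_dist = distribution(r["dominant_hand"] for r in responses)
print("Dominant hand:", hand_dist)

# Quantitative variable: finding an average makes sense
avg_sleep = mean(r["hours_of_sleep"] for r in responses)
print("Average hours of sleep:", avg_sleep)
```

An average of the dominant-hand values would be meaningless, which is the practical test for calling a variable categorical.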
Summary and Homework
Summary
A data set contains information on a number of individuals.
Information is often values for one or more variables.
Variables can be categorical or quantitative.
The distribution of a variable describes what values it can take on and how often.
Homework
Pg 7-8, problems 1, 3, 5, 7, 8