Unit 1: Mr. Lang’s AP Statistics PowerPoint

Homework Assignment
• For the A: 1, 3, 5, 7, 8, 11–25 odd, 27–32, 37–59 odd, 60, 69–74, 79–105 odd (except 85, 99, 101), 107–110, R1–R10
• For the C: 1, 3, 5, 8, Odd, 37–59 odd, 79–103 odd (except 85, 99, 101), R1–R10
• For the D-: 1, 3, 5, 11, 15, 19, 23, 37, 41, 45, 49, 79, 83, 87, 91, 97, 103, R1–R10
All problems must be complete, including explanations in complete sentences and/or work shown when the question asks for it. All multiple-choice problems will be graded for correctness.

Statistics
• the science of collecting, analyzing, and drawing conclusions from data

Descriptive statistics
• the methods of organizing & summarizing data

Inferential statistics
• involves making generalizations from a sample to a population

Population
• The entire collection of individuals or objects about which information is desired

Sample
• A subset of the population, selected for study in some prescribed manner

Variable
• any characteristic whose value may change from one individual to another

Data
• observations on a single variable or simultaneously on two or more variables

Types of variables

Categorical variables
• or qualitative
• identifies basic differentiating characteristics of the population

Numerical variables
• or quantitative
• observations or measurements take on numerical values
• makes sense to average these values
• two types - discrete & continuous

Discrete (numerical)
• listable set of values
• usually counts of items

Continuous (numerical)
• data can take on any values in the domain of the variable
• usually measurements of something

Classification by the number of variables
• Univariate - data that describes a single characteristic of the population
• Bivariate - data that describes two characteristics of the population
• Multivariate - data that describes more than two characteristics (beyond the scope of this course)

Identify each of the following variables as numerical or categorical:
1. the income of adults in your city
2. the color of M&M candies selected at random from a bag
3. the number of speeding tickets each student in AP Statistics has received
4. the area code of an individual
5. the birth weights of female babies born at a large hospital over the course of a year

Self Check #1

Assignment #1

Graphs for categorical data

Bar Graph
• Used for categorical data
• Bars do not touch
• Categorical variable is typically on the horizontal axis
• To describe: comment on which occurred the most often or least often
• May make a double bar graph or segmented bar graph for bivariate categorical data sets

Using class survey data:
• graph birth month
• graph gender & handedness

Pie (Circle) graph
• Used for categorical data
• To make:
  – Proportion × 360°
  – Using a protractor, mark off each part
• To describe: comment on which occurred the most often or least often
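The "proportion × 360°" step can be sketched in Python. This is only an illustration: the category counts are hypothetical (they are not class survey data from the slides), and the function name `pie_angles` is made up for this example.

```python
def pie_angles(counts):
    """Return the central angle (in degrees) for each category.

    counts: dict mapping category name -> frequency.
    Each angle is the category's proportion of the total times 360.
    """
    total = sum(counts.values())
    return {category: count / total * 360 for category, count in counts.items()}

# Hypothetical counts, assumed for illustration only
counts = {"Red": 5, "Blue": 10, "Green": 5}
angles = pie_angles(counts)
for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

The angles always add up to 360°, which is a quick check before marking them off with a protractor.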

Graphs for numerical data

Dotplot
• Used with numerical data (either discrete or continuous)
• Made by putting dots (or X’s) on a number line
• Can make comparative dotplots by using the same axis for multiple groups

Stemplots (stem & leaf plots)
• Used with univariate, numerical data
• Must have a key so that we know how to read the numbers
• Can split stems when you have a long list of leaves
• Can have a comparative stemplot with two groups
Would a stemplot be a good graph for the number of pieces of gum chewed per day by AP Stat students? Why or why not?
Would a stemplot be a good graph for the number of pairs of shoes owned by AP Stat students? Why or why not?

Example: The following data are price per ounce for various brands of dandruff shampoo at a local grocery store. Can you make a stemplot with this data?

Example: Tobacco use in G-rated Movies
Total tobacco exposure time (in seconds) for Disney movies:
Total tobacco exposure time (in seconds) for other studios’ movies:
Make a comparative stemplot.

Graphing Activity

Self Check #2

Assignment #2

Histograms
• Used with numerical data
• Bars touch on histograms
• Two types
  – Discrete: bars are centered over discrete values
  – Continuous: bars cover a class (interval) of values
• For comparative histograms, use two separate graphs with the same scale on the horizontal axis
Would a histogram be a good graph for the fastest speed driven by AP Stat students? Why or why not?
Would a histogram be a good graph for the number of pieces of gum chewed per day by AP Stat students? Why or why not?

Cumulative Relative Frequency Plot (Ogive)
• ... is used to answer questions about percentiles.
• Percentiles are the percent of individuals that are at or below a certain value.
• Quartiles are located every 25% of the data. The first quartile (Q1) is the 25th percentile, while the third quartile (Q3) is the 75th percentile. What is the special name for Q2?
• Interquartile Range (IQR) is the range of the middle half (50%) of the data. IQR = Q3 – Q1
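The percentile idea ("percent of individuals at or below a certain value") can be sketched in Python. This is a simplified illustration under stated assumptions: the data are hypothetical, the function names are made up, and the cumulative relative frequencies are computed at each distinct value rather than at class boundaries as a real ogive would use.

```python
def percent_at_or_below(values, x):
    """Percent of observations that are at or below x."""
    count = sum(1 for v in values if v <= x)
    return 100 * count / len(values)

def cumulative_relative_frequencies(values):
    """Pairs (value, cumulative relative frequency) for each distinct value."""
    n = len(values)
    distinct = sorted(set(values))
    return [(x, sum(1 for v in values if v <= x) / n) for x in distinct]

# Hypothetical data, assumed for illustration only
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(percent_at_or_below(scores, 5))
print(cumulative_relative_frequencies(scores))
```

Reading an ogive works the same way in reverse: pick a cumulative percent on the vertical axis and find the value on the horizontal axis that reaches it.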

Ogive Activity

Self Check #3

Multiple Choice Test #1

Types (shapes) of Distributions

Symmetrical
• refers to data in which both sides are (more or less) the same when the graph is folded vertically down the middle
• bell-shaped is a special type
  – has a center mound with two sloping tails

Uniform
• refers to data in which every class has equal or approximately equal frequency

Skewed (left or right)
• refers to data in which one side (tail) is longer than the other side
• the direction of skewness is on the side of the longer tail

Bimodal (multi-modal)
• refers to data in which two (or more) classes have the largest frequency & are separated by at least one other class

Distribution Activity...

Self Check #4

How to describe a numerical, univariate graph

What strikes you as the most distinctive difference among the distributions of exam scores in classes A, B, & C?

1. Center
• discuss where the middle of the data falls
• three types of central tendency
  – mean, median, & mode

What strikes you as the most distinctive difference among the distributions of scores in classes D, E, & F?

2. Spread
• discuss how spread out the data is
• refers to the variability of the data
  – range, standard deviation, IQR

What strikes you as the most distinctive difference among the distributions of exam scores in classes G, H, & I?

3. Shape
• refers to the overall shape of the distribution
• symmetrical, uniform, skewed, or bimodal

What strikes you as the most distinctive difference among the distributions of exam scores in class K?

4. Unusual occurrences
• outliers - values that lie away from the rest of the data
• gaps
• clusters
• anything else unusual

5. In context
• You must write your answer in reference to the specifics in the problem, using correct statistical vocabulary and using complete sentences!

Features of the Distribution Activity

Means & Medians

Parameter
• Fixed value about a population
• Typically unknown

Statistic
• Value calculated from a sample

Measures of Central Tendency
• Median - the middle of the data; 50th percentile
  – Observations must be in numerical order
  – Is the middle single value if n is odd
  – The average of the middle two values if n is even
NOTE: n denotes the sample size
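The median rule (middle value if n is odd, average of the middle two if n is even) can be written as a short Python sketch. The two samples below are hypothetical; the original data values are not part of this transcript.

```python
def median(values):
    """Return the median of a list of numbers."""
    ordered = sorted(values)      # observations must be in numerical order
    n = len(ordered)              # n denotes the sample size
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # n odd: the single middle value
    # n even: average of the middle two values
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical samples, assumed for illustration only
odd_sample = [2, 3, 4, 8, 12]
even_sample = [2, 3, 4, 6, 8, 12]
print(median(odd_sample))    # 4
print(median(even_sample))   # 5.0
```

Python's standard library also provides `statistics.median`, which follows the same rule.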

Measures of Central Tendency
• Mean - the arithmetic average
  – Use μ to represent a population mean (a parameter)
  – Use x̄ to represent a sample mean (a statistic)
  – Formula: $\bar{x} = \frac{\sum x}{n}$
  – Σ is the capital Greek letter sigma; it means to sum the values that follow

Measures of Central Tendency
• Mode - the observation that occurs the most often
  – Can be more than one mode
  – If all values occur only once, there is no mode
  – Not used as often as mean & median

Suppose we are interested in the number of lollipops that are bought at a certain store. A sample of 5 customers buys the following number of lollipops. Find the median. The numbers are in order & n is odd, so find the middle observation. The median is 4 lollipops!

Suppose we have a sample of 6 customers that buy the following number of lollipops. Find the median. The numbers are in order & n is even, so find the middle two observations. Now, average these two values. The median is 5 lollipops!

Suppose we have a sample of 6 customers that buy the following number of lollipops. Find the mean. To find the mean number of lollipops, add the observations and divide by n.

Using the calculator...

What would happen to the median & mean if the 12 lollipops were 20? The median is... 5. The mean is... What happened?

What would happen to the median & mean if the 20 lollipops were 50? The median is... 5. The mean is... What happened?

Resistant
• Statistics that are not affected by outliers
• Is the median resistant? YES
• Is the mean resistant? NO
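Resistance can be checked numerically with a short Python sketch. The sample below is hypothetical (the slide's lollipop counts are not preserved here); only the largest observation is changed, mirroring the "what if 12 were 20, then 50" questions.

```python
from statistics import mean, median

# Hypothetical sample of lollipop counts, assumed for illustration only
base = [2, 3, 4, 6, 8, 12]

def center_summary(values):
    """Return (median, mean) for a list of numbers."""
    return median(values), mean(values)

for largest in (12, 20, 50):
    sample = base[:-1] + [largest]   # replace only the largest observation
    med, avg = center_summary(sample)
    print(f"largest={largest:>2}  median={med}  mean={avg:.2f}")
```

The median stays the same on every line while the mean keeps growing, which is what "the median is resistant, the mean is not" means in practice.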

Look at the following data set. Find the mean. Now find how each observation deviates from the mean; each difference is a deviation from the mean. What is the sum of the deviations from the mean? Will this sum always equal zero? YES
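Why the sum of the deviations is always zero can be shown in one line. This short derivation is an added illustration using the sample-mean definition $\bar{x} = \frac{\sum x}{n}$, so that $\sum x = n\bar{x}$:

```latex
\[
\sum (x - \bar{x}) \;=\; \sum x \;-\; n\bar{x} \;=\; n\bar{x} - n\bar{x} \;=\; 0
\]
```

Because the positive and negative deviations always cancel, the plain sum of deviations cannot be used to measure spread.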

Look at the following data set. Find the mean & median. Create a histogram with the data (use an x-scale of 2). Look at the placement of the mean and median in this symmetrical distribution.

Look at the following data set. Find the mean & median. Create a histogram with the data (use an x-scale of 8). Look at the placement of the mean and median in this right skewed distribution.

Look at the following data set. Find the mean & median. Create a histogram with the data. Look at the placement of the mean and median in this skewed left distribution.

Recap:
• In a symmetrical distribution, the mean and median are equal.
• In a skewed distribution, the mean is pulled in the direction of the skewness.
• In a symmetrical distribution, you should report the mean!
• In a skewed distribution, the median should be reported as the measure of center!

Trimmed mean
To calculate a trimmed mean:
• Multiply the % to trim by n
• Truncate that many observations from BOTH ends of the distribution (when listed in order)
• Calculate the mean with the shortened data set

Find a 10% trimmed mean with the following data. 10%(10) = 1, so remove one observation from each side!
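The trimmed-mean steps can be sketched in Python. The ten data values are hypothetical (the slide's data are not preserved in this transcript), and `trimmed_mean` is a made-up helper name. The trim percent is taken as a whole number so that the count to remove is computed with integer arithmetic.

```python
def trimmed_mean(values, trim_percent):
    """Mean after removing trim_percent% of the observations from EACH end.

    trim_percent is a whole-number percent, e.g. 10 for a 10% trimmed mean.
    """
    ordered = sorted(values)                   # list the data in order
    n = len(ordered)
    k = (trim_percent * n) // 100              # observations to drop per end (truncated)
    if 2 * k >= n:
        raise ValueError("trimming would remove every observation")
    kept = ordered[k:n - k]                    # drop k from both ends
    return sum(kept) / len(kept)

# Hypothetical data with n = 10, assumed for illustration only
data = [12, 14, 19, 20, 22, 24, 25, 26, 26, 50]
print(sum(data) / len(data))      # ordinary mean: 23.8
print(trimmed_mean(data, 10))     # 10% trimmed mean: 22.0
```

With n = 10, trimming 10% removes one observation from each end (12 and 50 here), so the large value 50 no longer pulls the mean upward.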

Matching Graphs Activity

Mean and Median Assignment

Why use boxplots?
• ease of construction
• convenient handling of outliers
• construction is not subjective (like histograms)
• Used with medium or large size data sets (n > 10)
• useful for comparative displays

Disadvantage of boxplots
• does not retain the individual observations
• should not be used with small data sets (n < 10)

How to construct
• find five-number summary: Min, Q1, Med, Q3, Max
• draw box from Q1 to Q3
• draw median as center line in the box
• extend whiskers to min & max

Modified boxplots
• display outliers
• fences mark off mild & extreme outliers
• whiskers extend to largest (smallest) data value inside the fence
ALWAYS use modified boxplots in this class!!!

Inner fence: Q1 – 1.5 IQR and Q3 + 1.5 IQR
Any observation outside this fence is an outlier! Put a dot for the outliers. Interquartile Range (IQR) is the range (length) of the box: IQR = Q3 – Q1

Modified Boxplot... Draw the “whisker” from each quartile to the most extreme observation that is within the fence!

Outer fence: Q1 – 3 IQR and Q3 + 3 IQR
Any observation outside this fence is an extreme outlier! Any observation between the inner and outer fences is considered a mild outlier.

For the AP Exam you just need to find outliers; you DO NOT need to identify them as mild or extreme. Therefore, you only need to use the 1.5 IQR fences.
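The 1.5 IQR rule can be sketched in Python. Two assumptions are worth flagging: the data set is hypothetical, and the quartiles are computed as the medians of the lower and upper halves of the ordered data (leaving out the middle value when n is odd). Other software may use slightly different quartile conventions, so fences can differ a little.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max)."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]          # observations below the median position
    upper = ordered[(n + 1) // 2:]     # observations above the median position
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

def modified_boxplot_parts(values, k=1.5):
    """Return (fences, whisker ends, outliers) using the k * IQR rule."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [x for x in values if low_fence <= x <= high_fence]
    outliers = sorted(x for x in values if x < low_fence or x > high_fence)
    whiskers = (min(inside), max(inside))   # most extreme values inside the fences
    return (low_fence, high_fence), whiskers, outliers

# Hypothetical data, assumed for illustration only
data = [1, 3, 4, 5, 5, 6, 7, 8, 9, 20]
print(five_number_summary(data))
print(modified_boxplot_parts(data))
```

Calling `modified_boxplot_parts(data, k=3)` gives the outer fences, which separate mild from extreme outliers.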

A report from the U.S. Department of Justice gave the following percent increase in federal prison populations in 20 northeastern & midwestern states. Create a modified boxplot. Describe the distribution. Use the calculator to create a modified boxplot.

Evidence suggests that a high indoor radon concentration might be linked to the development of childhood cancers. The data that follow are the radon concentrations in two different samples of houses. The first sample consisted of houses in which a child was diagnosed with cancer. Houses in the second sample had no recorded cases of childhood cancer. (see data on note page) Create parallel boxplots. Compare the distributions.

[Parallel boxplots of Radon concentration for the Cancer and No Cancer groups]
The median radon concentration for the no cancer group is lower than the median for the cancer group. The range of the cancer group is larger than the range for the no cancer group. Both distributions are skewed right. The cancer group has outliers at 39, 45, 57, and 210. The no cancer group has outliers at 55 and 85.

Matching Box Plots, Histograms, and Summary Statistics Activity

Self Check #5

Comparative Boxplots Assignment

Why is the study of variability important?
• Allows us to distinguish between usual & unusual values
• In some situations, want more/less variability
  – scores on standardized tests
  – time bombs
  – medicine

Measures of Variability
• range (max – min)
• interquartile range (Q3 – Q1)
• deviations
• variance
• standard deviation (σ, the lowercase Greek letter sigma)

Suppose that we have these data values: Find the mean. Find the deviations. What is the sum of the deviations from the mean?

Square the deviations: Find the average of the squared deviations:

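The average of the deviations squared is called the variance.
• Population (parameter): $\sigma^2 = \frac{\sum (x - \mu)^2}{n}$
• Sample (statistic): $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

Calculation of variance of a sample: divide the sum of the squared deviations by the degrees of freedom, df = n – 1

Degrees of Freedom (df)
• n deviations contain (n - 1) independent pieces of information about variability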

A standard deviation is a measure of the average deviation from the mean.
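The steps (find the mean, find the deviations, square them, average them, take the square root) can be sketched in Python. The data values are hypothetical, since the slide's values are not preserved here. The population version divides by n; the sample version divides by the degrees of freedom, n - 1.

```python
import math

def squared_deviations(values):
    """Squared deviation of each observation from the mean."""
    mean = sum(values) / len(values)
    return [(x - mean) ** 2 for x in values]

def population_variance(values):
    """Average of the squared deviations (divide by n)."""
    return sum(squared_deviations(values)) / len(values)

def sample_variance(values):
    """Sum of squared deviations divided by df = n - 1."""
    return sum(squared_deviations(values)) / (len(values) - 1)

# Hypothetical data, assumed for illustration only
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(data))              # 4.0
print(math.sqrt(population_variance(data)))   # 2.0
print(sample_variance(data))                  # 32/7, a little larger
print(math.sqrt(sample_variance(data)))
```

Python's `statistics.pvariance` and `statistics.variance` compute the same two quantities and can be used to check the results.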

Use calculator

Which measure(s) of variability is/are resistant?

Mean and Variance Activity

Mean and Variance Worksheet

Self Check #6

Show me the Money Assignment

Multiple Choice Test #2

Assignment #3

Linear transformation rule
• When adding a constant to a random variable, the mean changes but the standard deviation does not.
• When multiplying a random variable by a constant, both the mean and the standard deviation change.

An appliance repair shop charges a $30 service call to go to a home for a repair. It also charges $25 per hour for labor. From past history, the average length of repairs is 1 hour 15 minutes (1.25 hours) with a standard deviation of 20 minutes (1/3 hour). Including the charge for the service call, what are the mean and standard deviation of the charges?
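One way to work this example is to write the charge as Y = 30 + 25X, where X is the repair time in hours, and apply the linear transformation rule. The Python sketch below does that arithmetic; the function name `linear_transform` is made up for this illustration.

```python
def linear_transform(mean_x, sd_x, a, b):
    """Mean and SD of Y = a + b*X.

    Adding the constant a shifts the mean only.
    Multiplying by b rescales both the mean and the SD (SD uses |b|).
    """
    return a + b * mean_x, abs(b) * sd_x

# Values from the repair-shop example: $30 service call, $25 per hour,
# repair time with mean 1.25 hours and SD 1/3 hour
mean_charge, sd_charge = linear_transform(mean_x=1.25, sd_x=1 / 3, a=30, b=25)
print(f"mean charge = ${mean_charge:.2f}")   # $61.25
print(f"SD of charge = ${sd_charge:.2f}")    # $8.33
```

The $30 service call moves the mean but has no effect on the standard deviation, while the $25 hourly rate scales both.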

Rules for Combining two variables
• To find the mean for the sum (or difference), add (or subtract) the two means
• To find the standard deviation of the sum (or difference), ALWAYS add the variances, then take the square root.
• Formulas: $\mu_{X \pm Y} = \mu_X \pm \mu_Y$ and, if the variables are independent, $\sigma_{X \pm Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$

Bicycles arrive at a bike shop in boxes. Before they can be sold, they must be unpacked, assembled, and tuned (lubricated, adjusted, etc.). Based on past experience, the times for each setup phase are independent with the following means & standard deviations (in minutes). What are the mean and standard deviation for the total bicycle setup times?

Phase        Mean   SD
Unpacking
Assembly
Tuning
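The rule for combining independent variables can be sketched in Python. The means and standard deviations below are placeholder values chosen only to make the arithmetic easy; they are NOT the values from the slide's table, which did not survive in this transcript. The function name `combine_independent` is also made up.

```python
import math

def combine_independent(means, sds):
    """Mean and SD of a sum of independent variables.

    Means add directly. Standard deviations do not: add the variances
    (the squared SDs), then take the square root.
    """
    total_mean = sum(means)
    total_sd = math.sqrt(sum(sd ** 2 for sd in sds))
    return total_mean, total_sd

# Placeholder (mean, SD) pairs in minutes for three independent phases.
# These numbers are hypothetical, not the slide's data.
phases = [(3.0, 1.0), (20.0, 2.0), (12.0, 2.0)]
means = [m for m, _ in phases]
sds = [s for _, s in phases]

total_mean, total_sd = combine_independent(means, sds)
print(total_mean, total_sd)   # 35.0 3.0
```

Note that the combined SD (3.0) is smaller than the sum of the individual SDs (5.0), which is why the variances, not the standard deviations, are added.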

Self Check #7