Unit 9: Statistics By: Jamie Fu and Neha Surapaneni.

1 Unit 9: Statistics By: Jamie Fu and Neha Surapaneni

2 Central Tendency ❖ Mean/average (x̄): the sum of n numbers divided by n ❖ Median: the middle number of n numbers written in numerical order (if n is even, take the average of the middle two numbers) ❖ Mode: the number or numbers that occur most often in a set of n numbers ❖ Range: the difference between the largest and smallest numbers in a set of n numbers ❖ Standard deviation (σ): describes the typical difference between a data value and the mean: σ = √{[(x_1 − x̄)^2 + (x_2 − x̄)^2 + … + (x_n − x̄)^2]/n}, where n = number of values, x̄ = mean, and x_1, …, x_n = each of the values in the set

3 Sample Problem You are competing in a hockey tournament. The winning scores for the first 10 games are: 11, 12, 13, 13, 14, 15, 15, 15, 15, 17. Find the mean, median, mode, range, and standard deviation.
Mean: (11 + 12 + 13 + … + 17)/10 = 140/10 = 14
Median: (14 + 15)/2 = 14.5
Mode: 15
Range: 17 − 11 = 6
Std. dev.: σ = √{[(11 − 14)^2 + (12 − 14)^2 + … + (17 − 14)^2]/10} = √(28/10) ≈ 1.7
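The hand calculation above can be double-checked with a short Python sketch. It assumes the standard-library statistics module and uses pstdev (the population standard deviation, which divides by n) to match the slide's formula; the variable names are illustrative only.

```python
import statistics

# Winning scores from the sample problem
scores = [11, 12, 13, 13, 14, 15, 15, 15, 15, 17]

mean = statistics.mean(scores)          # sum of the values divided by n
median = statistics.median(scores)      # average of the two middle values when n is even
mode = statistics.mode(scores)          # most frequent value
data_range = max(scores) - min(scores)  # largest value minus smallest value
std_dev = statistics.pstdev(scores)     # population standard deviation (divides by n)

print(mean, median, mode, data_range, round(std_dev, 1))
```

Using statistics.stdev instead would divide by n − 1, which is the sample version used later for frequency tables.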

4 Scatter plots ❖ The x-axis is the independent variable ❖ The y-axis is the dependent variable ❖ Correlation coefficient (r): measures how strong the linear relationship between the points is

5 Box and Whisker plots ❖ Lower quartile = median of the values between the lowest value and the median ❖ Upper quartile = median of the values between the median and the highest value
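As a rough illustration of the quartile definitions, the sketch below splits a sorted data set into a lower and an upper half and takes the median of each. It assumes the convention that the middle value is left out of both halves when n is odd (other conventions exist), and the function name quartiles is hypothetical. The data are the hockey scores from slide 3.

```python
import statistics

def quartiles(values):
    """Return (lower quartile, median, upper quartile) by the median-of-halves rule."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower_half = data[:half]    # values below the median position
    upper_half = data[-half:]   # values above the median position
    return (statistics.median(lower_half),
            statistics.median(data),
            statistics.median(upper_half))

scores = [11, 12, 13, 13, 14, 15, 15, 15, 15, 17]
q1, med, q3 = quartiles(scores)
print(q1, med, q3)
```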

6 Histograms ❖ The bars should be touching ❖ Bars should be the same width ❖ Width should represent a quantitative value not categorical ❖ The height indicates frequency

7 Sample Problem: Box & Whisker vs. Histogram The annual incomes for 6 professions are shown below: Farming: 19,630; Sales: 28,920; Architecture: 56,330; Professional Athlete: 2,476,590; Legal: 69,030; Teaching: 39,130. Which graph will better represent the data (histogram or box and whisker)? Box and whisker, because of the large outlier (the professional athlete). Which graph will better represent the data when the outlier is removed? Histogram, because the data values are closer together.

8 Frequency Distribution ❖ Lower class limit: lowest value in a class ❖ Upper class limit: highest value in a class ❖ Class mark: midpoint of a class ❖ Formula: (lower class limit + upper class limit)/2 ❖ Class width: the range of the data divided by the number of classes, rounded up to the next whole number ❖ Formula: (high − low)/number of classes you want ❖ Class boundaries: the endpoints of the bars in a histogram

9 Frequency tables ❖ Use a frequency distribution to organize data ❖ Used in creating histograms

Class Limits (Lower-Upper) | Class Boundaries (Lower-Upper) | Tally | Frequency | Class Midpoint
1-8 | 0.5-8.5 | 11111 11111 1111 | 14 | 4.5
9-16 | 8.5-16.5 | 11111 11111 11111 11111 1 | 21 | 12.5
17-24 | 16.5-24.5 | 11111 11111 1 | 11 | 20.5
25-32 | 24.5-32.5 | 11111 1 | 6 | 28.5
33-40 | 32.5-40.5 | 1111 | 4 | 36.5
41-48 | 40.5-48.5 | 1111 | 4 | 44.5
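The boundary and midpoint columns of the table can be derived from the class limits. The sketch below assumes whole-number data, so each boundary sits half a unit beyond its class limit; the helper name describe_class is hypothetical.

```python
# Class limits (lower, upper) taken from the frequency table
class_limits = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]

def describe_class(lower, upper):
    """Return ((lower boundary, upper boundary), midpoint) for one class."""
    boundaries = (lower - 0.5, upper + 0.5)  # assumes whole-number data
    midpoint = (lower + upper) / 2           # class mark
    return boundaries, midpoint

table = [describe_class(lo, hi) for lo, hi in class_limits]

for (lo, hi), (bounds, mid) in zip(class_limits, table):
    print(f"{lo}-{hi}: boundaries {bounds[0]}-{bounds[1]}, midpoint {mid}")
```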

10 Sample Frequency Tables ❖ Sample mean: x̄ = [∑(x·f)]/n ❖ Sample standard deviation: s = √{[∑((x − x̄)^2·f)]/(n − 1)} ❖ Variance: s^2, the sample standard deviation without the square root ❖ x = class mark, f = frequency, n = number of entries

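11 Sample Frequency Tables

Class Limits | Class Mark x (class midpoint) | Frequency f | x·f | (x − x̄)^2 | (x − x̄)^2·f
1-10 | 5.5 | 34 | 187 | 112.36 | 3820.24
11-20 | 15.5 | 18 | 279 | 0.36 | 6.48
21-30 | 25.5 | 17 | 433.5 | 88.36 | 1502.12
31+ | 35.5 | 11 | 390.5 | 376.36 | 4139.96
∑ | | 80 | 1290 | | 9468.8

Mean: 1290/80 ≈ 16.1
Variance: 9468.8/79 ≈ 119.9
Standard deviation: √119.9 ≈ 10.95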
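The table's totals can be reproduced with a few lines of Python. This is only a sketch: it keeps the unrounded mean (16.125) instead of the rounded 16.1 used on the slide, so the sum of squared deviations comes out as 9468.75 rather than 9468.8, and it divides by n − 1 as the slide's variance does. Variable names are illustrative.

```python
import math

class_marks = [5.5, 15.5, 25.5, 35.5]  # x: class midpoints
frequencies = [34, 18, 17, 11]         # f: entries in each class

n = sum(frequencies)
mean = sum(x * f for x, f in zip(class_marks, frequencies)) / n

# Sum of squared deviations from the mean, weighted by frequency
sum_sq = sum((x - mean) ** 2 * f for x, f in zip(class_marks, frequencies))

variance = sum_sq / (n - 1)   # sample variance
std_dev = math.sqrt(variance) # sample standard deviation

print(n, round(mean, 1), round(variance, 1), round(std_dev, 2))
```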

12 Bell Curves ❖ Normal curve: a normal distribution is a frequency distribution, typical of data sets with a large number of values, whose graph is a symmetric, bell-shaped curve ❖ The middle value is the mean, which is 0 standard deviations away from itself ❖ Probability: when randomly choosing a value from a normally distributed set, the probability that it lies within 1 standard deviation of the mean is about 68%

13 Choosing a Graph: Use... when... ❖ Scatter plot: you have a set of points and want to find a correlation ❖ Box & whisker: the data has outliers ❖ Histogram: the numbers are close to each other and you want to show the frequency of values in a certain range ❖ Bell curve: you want to show where the majority of the data is, or find the probability of picking a certain value from the set

14 Data Classification Population: the group of people or objects you want information about Sample: a subset of the population Types of Samples ❖ Self-selected sample: members volunteer to be in the sample ❖ Systematic sample: a rule is used to select members ❖ Convenience sample: easy-to-reach members are selected ❖ Random sample: each member has an equal chance of selection Biased and Unbiased Samples ❖ Biased sample: over- or underrepresents parts of the population ❖ Unbiased sample: representative of the population you want information about

15 Margin of Error ❖ Gives a limit on how much the responses of your sample are likely to differ from the responses of the whole population ❖ Formula: ±(1/√N), where N is the sample size ❖ If a percent p of the sample responds a certain way, the percent of the population that responds the same way is likely to be between p − (1/√N) and p + (1/√N)
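A minimal sketch of the ±(1/√N) rule follows. The survey figures (400 respondents, 60% answering a certain way) are hypothetical values chosen for illustration and do not come from the slides; the function names are likewise made up.

```python
import math

def margin_of_error(sample_size):
    """Margin of error as a proportion, using the 1/sqrt(N) rule."""
    return 1 / math.sqrt(sample_size)

def likely_range(p, sample_size):
    """Range in which the population proportion is likely to fall."""
    m = margin_of_error(sample_size)
    return p - m, p + m

# Hypothetical survey: 400 people sampled, 60% respond a certain way
low, high = likely_range(0.60, 400)
print(f"margin = {margin_of_error(400):.2%}, range = {low:.2%} to {high:.2%}")
```

Because the margin shrinks with the square root of N, quadrupling the sample size only halves the margin of error.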

16 Calculating Probability ❖ What is the probability that a number x will be ≥, ≤, >, or < x̄ + nσ? (≤ and < give the same probability) ❖ Example: P(x ≥ x̄ + σ) ❖ The percent between x̄ and x̄ + σ is 34% ❖ The percent between x̄ + σ and x̄ + 2σ is 13.5% ❖ The percent between x̄ + 2σ and x̄ + 3σ is 2.35% ❖ Beyond x̄ + 3σ is 0.15% ❖ Therefore, P(x ≥ x̄ + σ) is 13.5 + 2.35 + 0.15 = 16%

17 Calculating Probability Example: A normal distribution has a mean of 27 and a standard deviation of 5. What is P(22 < x < 32)? 22 is 27 (x̄) − 5 (σ) and 32 is 27 + 5, so each is one standard deviation from the mean. Therefore, the probability is the percentage between x̄ − σ and x̄ + σ, which is 34% + 34% = 68%.
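The 68% and 16% figures are rounded values of the normal curve's areas. As a rough check, the sketch below uses statistics.NormalDist from the Python standard library (Python 3.8 or later) with the mean and standard deviation from this example; the variable names are illustrative.

```python
from statistics import NormalDist

dist = NormalDist(mu=27, sigma=5)

# P(22 < x < 32): area within one standard deviation of the mean
p_within = dist.cdf(32) - dist.cdf(22)

# P(x >= 32): area more than one standard deviation above the mean
p_above = 1 - dist.cdf(32)

print(f"within one sigma: {p_within:.1%}, above +1 sigma: {p_above:.1%}")
```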

18 Calculating Probability using z-score ❖ The standard normal distribution is the normal distribution with mean 0 and standard deviation 1 ❖ The formula z = (x − x̄)/σ can be used to transform x-values from a normal distribution with mean x̄ and standard deviation σ into z-values having a standard normal distribution with mean 0 and standard deviation 1 ❖ The z-value for a particular x-value is called the z-score for the x-value and is the number of standard deviations the x-value lies above or below the mean

19 Calculating Probability using z-score A normal distribution has a mean of 75 and a standard deviation of 10. Use the standard normal table (find the z-score first) to find P(x ≤ 70). Answer: Use (x − x̄)/σ to find the z-score: (70 − 75)/10 = −0.5. The z-score is −0.5, so choose row −0 and column .5 to form −0.5. The entry where they intersect is .3085, which is 30.85% ≈ 31%, so P(x ≤ 70) ≈ 31%. When a problem asks for P(x ≥ n), take 1 minus the table value for the z-score (not 1 minus the z-score itself).
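The table lookup can be mirrored in code. In this sketch, NormalDist().cdf from the Python standard library stands in for the standard normal table, and the helper z_score is a hypothetical name; it also shows the "1 minus the table value" step for a ≥ question.

```python
from statistics import NormalDist

def z_score(x, mean, sigma):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sigma

z = z_score(70, 75, 10)

# The cdf of the standard normal distribution plays the role of the table
p_at_most = NormalDist().cdf(z)   # P(x <= 70)
p_at_least = 1 - p_at_most        # P(x >= 70): 1 minus the table value

print(z, round(p_at_most, 4), round(p_at_least, 4))
```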

20 Standard Error of the Mean ❖ Standard error (σ_x̄) measures how well a sample mean estimates the true mean of a population ❖ Formula: σ_x̄ = σ/√n Q: The mean weight of 36 boys on the wrestling team is 136.4 lbs and the standard deviation is 4.1 lbs. What is the standard error of the mean? A: 4.1 ÷ √36 = 4.1 ÷ 6 ≈ 0.68 lbs (the standard error has the same units as the data; it is not a percentage)
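A short sketch of the same calculation, assuming the formula σ/√n; the function name standard_error is illustrative.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sigma / math.sqrt(n)

# Wrestling team example: standard deviation 4.1 lbs, 36 boys
se = standard_error(4.1, 36)
print(f"standard error = {se:.2f} lbs")
```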

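21 Confidence Interval ❖ Lower confidence limit: x̄ − z_c·σ_x̄ ❖ Upper confidence limit: x̄ + z_c·σ_x̄ ❖ Formula for the confidence interval around the population mean (μ): x̄ − z_c·σ_x̄ ≤ μ ≤ x̄ + z_c·σ_x̄ ❖ The critical confidence value (z_c) is found using a table of critical values (for example, z_c = 0.6745 for 50% confidence and z_c = 2.58 for 99% confidence)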

22 Confidence Interval Example Q: In a sample of 100 families, a TV reporter found that kids watch an average of 4.6 hrs of TV a day. The σ is 1.4 hrs.
a) Find the standard error of the mean. 1.4 ÷ √100 = 1.4 ÷ 10 = 0.14 hrs
b) Find the 50% and 99% confidence intervals for the number of hours all kids in Texas watch TV.
50%: 4.6 ± (0.6745)(0.14), so 4.51 ≤ μ ≤ 4.69
99%: 4.6 ± (2.58)(0.14), so 4.24 ≤ μ ≤ 4.96
c) Which interval has a wider range? 99%
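The intervals in part b can be reproduced with the sketch below. Instead of reading z_c from a table, it assumes the critical value can be computed as NormalDist().inv_cdf((1 + level)/2), which gives about 0.6745 for 50% and about 2.576 for 99% (the slide rounds the latter to 2.58). The function name confidence_interval is hypothetical.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sigma, n, level):
    """Return (lower limit, upper limit) for the population mean."""
    z_c = NormalDist().inv_cdf((1 + level) / 2)  # critical confidence value
    margin = z_c * sigma / math.sqrt(n)          # z_c times the standard error
    return sample_mean - margin, sample_mean + margin

# TV example: mean 4.6 hrs, sigma 1.4 hrs, 100 families
low50, high50 = confidence_interval(4.6, 1.4, 100, 0.50)
low99, high99 = confidence_interval(4.6, 1.4, 100, 0.99)

print(round(low50, 2), round(high50, 2))
print(round(low99, 2), round(high99, 2))
```

A higher confidence level needs a larger z_c, which is why the 99% interval is wider than the 50% interval.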

23 Common Mistakes ❖ Finding the proper class limits: sometimes the formula doesn't work ❖ Remember: when using a z-score and the probability asks for ≥ a number, take 1 minus the table value for the z-score ❖ Deciding which formulas to use in real-life situations ❖ Deciding on the classification of the data (which sample to use)

24 Terminology
x̄ = mean
σ = standard deviation
f = frequency
x = midpoint
σ_x̄ = standard error of the mean
z_c = critical confidence value
N = number of participants, number of entries, etc.
∑ = summation, sum of all the numbers
z = z-score, fractional part of data that lies in the interval ±tσ
t = number of standard deviations away from the mean

25 Using a Graphing Calculator Correlation Coefficient: Stat, 1, 2nd, 0, D, diagnostics on, enter, enter. stat, calc, 4 (It is r) Central Tendency: Stat, 1, stat, calc Get numbers in order: stat, 1, 2nd, stat, ops, 1, 2nd, 1, stat, 1. Line of best fit: stat, 1, 2nd, y=, plot on, stat, calc, 4

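26 Statistics Formulas
Margin of error: ±1/√N
Class width: (high − low)/number of classes
Class midpoint: (upper limit + lower limit)/2
Mean of frequency distribution: x̄ = [∑(x·f)]/n
Sample standard deviation: s = √{[∑((x − x̄)^2·f)]/(n − 1)}
z-score: z = (x − x̄)/σ
Standard error of the mean: σ_x̄ = σ/√n
Confidence interval: x̄ − z_c·σ_x̄ ≤ μ ≤ x̄ + z_c·σ_x̄
Standard deviation: σ = √{[(x_1 − x̄)^2 + (x_2 − x̄)^2 + … + (x_n − x̄)^2]/n}
Mean: x̄ = (x_1 + x_2 + … + x_n)/n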

