Unit 1A - Modelling with Statistics
Key Concepts
Normal Distribution
The Normal Curve
The Empirical Rule
Interquartile Range
Z-score
Sampling
Study Design
Normal Distribution
A normal distribution is a pattern of data that occurs in many natural and manufactured quantities, such as height, blood pressure, and the lengths of objects produced by machines.
When such data is graphed as a histogram (data values on the horizontal axis, number of observations on the vertical axis), it forms a symmetric, bell-shaped curve known as a normal curve. The curve is centered on the mean, and its width is set by the standard deviation.
The more data is collected, the more closely the histogram resembles the smooth normal curve.
The Empirical Rule
For a normal distribution, the Empirical Rule states that approximately:
68% of the distribution lies within one standard deviation of the mean.
95% of the distribution lies within two standard deviations of the mean.
99.7% of the distribution lies within three standard deviations of the mean.
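The three percentages in the Empirical Rule can be checked numerically. The sketch below is only an illustration, not part of the original slides: it uses `NormalDist` from Python's standard `statistics` module, and the helper name `area_within` is made up for this example.

```python
from statistics import NormalDist

def area_within(k, mean=0.0, sd=1.0):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mean, sd)
    # Area to the left of the upper edge minus area to the left of the lower edge.
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The results round to 68%, 95%, and 99.7%, and they do not depend on which mean and standard deviation are used.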
The Interquartile Range
The interquartile range (IQR) is a measure of variability. Quartiles divide a rank-ordered data set into four equal parts, and the IQR is the distance between the first quartile (Q1) and the third quartile (Q3), so it covers the middle 50% of the data.
How to Find the Interquartile Range
Step 1: Put the numbers in order.
Step 2: Find the median.
Step 3: Find Q1 and Q3. Q1 is the median of the lower half of the data (the values below the median). Q3 is the median of the upper half of the data (the values above the median).
Step 4: Subtract Q1 from Q3 to find the interquartile range.
Example:
Ordered data: 1, 2, 5, 6, 7, 9, 12, 15, 18, 19, 27
Median: 9
Halves: (1, 2, 5, 6, 7), 9, (12, 15, 18, 19, 27)
Q1 = 5 and Q3 = 18
IQR = 18 - 5 = 13
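The four steps can be sketched in Python as follows. This is a minimal illustration of the "median of each half" method used on this slide; the function name `iqr_by_halves` is hypothetical, and other quartile conventions (including those built into some software) can give slightly different values for the same data.

```python
from statistics import median

def iqr_by_halves(values):
    """Return (Q1, Q3, IQR) using the median-of-each-half method."""
    data = sorted(values)                      # Step 1: put the numbers in order
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]                        # values below the median
    # Step 2 is implicit: for an odd count, the middle value is the median
    # and is left out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    q1 = median(lower)                         # Step 3: median of each half
    q3 = median(upper)
    return q1, q3, q3 - q1                     # Step 4: subtract Q1 from Q3

print(iqr_by_halves([1, 2, 5, 6, 7, 9, 12, 15, 18, 19, 27]))  # (5, 18, 13)
```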
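Z-Score
A z-score is the number of standard deviations a data value lies from the mean. A positive z-score means the value is above the mean; a negative z-score means it is below the mean.
The formula for the z-score is z = (x - m) / s, where z is the z-score, x is the given value, m is the mean, and s is the standard deviation.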
How to Find Z-score
Example: The mean score on the SAT is 1500, with a standard deviation of 240. The ACT has a mean score of 21, with a standard deviation of 6. If Donald Trump scored a 2 on the SAT and Joydeep Mukherjee scored a 36 on the ACT, who did better relative to the other test takers? Use z-scores to find out.
Trump's z-score: (2 - 1500) / 240 = -6.24
Joydeep's z-score: (36 - 21) / 6 = 2.5
Joydeep scored better (good job).
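The same comparison can be done in a few lines of Python. This is a small sketch of the formula z = (x - m) / s; the function name `z_score` is chosen for this example only.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

sat_z = z_score(2, 1500, 240)   # SAT: mean 1500, standard deviation 240
act_z = z_score(36, 21, 6)      # ACT: mean 21, standard deviation 6

print(round(sat_z, 2))  # -6.24
print(round(act_z, 2))  # 2.5
print("ACT score is relatively better" if act_z > sat_z else "SAT score is relatively better")
```

Because z-scores are measured in standard deviations, they allow scores from tests with different scales to be compared directly.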
Sampling and Study Design
Types of Sampling Methods
Simple random- items are selected at random, so every item has an equal chance of being chosen
Systematic- items are selected according to a fixed rule
Convenience- items are selected because they are easy to reach
Stratified- the population is divided into categories and a random sample is taken from each category
Cluster- the population is divided into groups, a few groups are chosen at random, and everyone in the chosen groups is surveyed
Voluntary Response- participants choose for themselves whether to take part
Examples
Simple random- pick numbers out of a hat
Systematic- survey every 5th person that walks out the door
Convenience- survey the closest people
Stratified- divide the group by eye color and have each eye color group pick numbers out of a hat
Cluster- divide the group by eye color, randomly choose 2 eye color groups, and survey everyone in them
Voluntary Response- survey only the people who volunteer to answer
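The examples above can be sketched in Python to show how the methods differ in practice. Everything here is hypothetical: the 60-person population, the eye colors assigned in rotation, and all function names are made up for illustration. Voluntary response is left out because it depends on who chooses to take part, not on a selection rule.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
COLORS = ["brown", "blue", "green"]
# Hypothetical population: 60 people, eye color assigned in rotation.
population = [{"id": i, "eye": COLORS[i % 3]} for i in range(60)]

def by_eye_color(pop):
    """Group people by eye color."""
    groups = {}
    for person in pop:
        groups.setdefault(person["eye"], []).append(person)
    return groups

def simple_random(pop, n):
    """Pick n people at random, like drawing numbers out of a hat."""
    return rng.sample(pop, n)

def systematic(pop, step):
    """Pick every step-th person (the 5th, 10th, ... when step is 5)."""
    return pop[step - 1::step]

def convenience(pop, n):
    """Pick the first n people, standing in for 'the closest people'."""
    return pop[:n]

def stratified(pop, n_per_group):
    """Take a random sample of n_per_group people from every eye color group."""
    sample = []
    for members in by_eye_color(pop).values():
        sample.extend(rng.sample(members, n_per_group))
    return sample

def cluster(pop, n_groups):
    """Randomly choose n_groups eye color groups and survey everyone in them."""
    groups = by_eye_color(pop)
    chosen = rng.sample(sorted(groups), n_groups)
    return [person for color in chosen for person in groups[color]]

print([p["id"] for p in systematic(population, 5)])
print(len(stratified(population, 2)), len(cluster(population, 2)))
```

Note the difference between the last two: the stratified sample contains a few people from every eye color group, while the cluster sample contains everyone from only some of the groups.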
Bias
Bias occurs when the sample does not fairly represent the population, for example because it was not chosen randomly or because some groups are over- or under-represented.
Bias can also occur when an answer is suggested within the question, like this: “Don’t you think that Donald Trump would be a good President?” The answer “yes” is suggested within the question.
Observational Study vs Experiment
In an observational study, the researcher does not influence the actions of the participants in any way; for example, tracking the weight of several people over time.
In an experiment, the researcher instructs the participants to do something, or intentionally changes a variable; for example, telling several people to eat different types of food and then tracking their weight over time.
Using a Calculator
For the TI-83/84, the normal distribution menu is found by pressing 2nd VARS (DISTR).
normalpdf (pdf = probability density function)
normalpdf(x, mean, standard deviation)
This function returns the height of the normal curve (its y-coordinate) at x. This is not the probability of a single value; for a continuous distribution, the probability of any single exact value is zero. Use this function to graph a normal curve.
normalcdf (cdf = cumulative distribution function)
normalcdf(lower bound, upper bound, mean, standard deviation)
This function returns the probability that the random variable x falls between the lower and upper bounds, which is the area under the normal curve between them. To find the area from negative infinity up to x, enter a very large negative number (such as -1E99) as the lower bound.
invNorm (inv = inverse normal probability distribution function)
invNorm(probability, mean, standard deviation)
This function returns the x-value that has the given probability (area) to its left, based on the mean and standard deviation.
ShadeNorm
ShadeNorm(lower bound, upper bound, mean, standard deviation)
This function draws the normal curve and shades the area between the lower and upper bounds, which represents the probability of a value falling within that range. To find it, press 2nd VARS, then the right arrow.
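For readers without a TI-83/84, the three numeric functions can be approximated in Python. This is a rough sketch, not the calculator's own implementation: the lowercase function names are stand-ins chosen to mirror the calculator commands, and the work is done by `NormalDist` from Python's standard `statistics` module. The SAT figures (mean 1500, standard deviation 240) are reused from the z-score example.

```python
from statistics import NormalDist

def normalpdf(x, mean=0.0, sd=1.0):
    """Height of the normal curve at x (a density, not a probability)."""
    return NormalDist(mean, sd).pdf(x)

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

def invnorm(area, mean=0.0, sd=1.0):
    """x-value that has the given area (probability) to its left."""
    return NormalDist(mean, sd).inv_cdf(area)

# Probability that an SAT score falls within one standard deviation of the mean.
print(round(normalcdf(1260, 1740, 1500, 240), 4))

# Score with 50% of the distribution to its left (the mean).
print(round(invnorm(0.5, 1500, 240), 1))

# Height of the standard normal curve at its peak.
print(round(normalpdf(0), 4))
```

ShadeNorm has no counterpart in this sketch because it draws a graph; the shaded area it displays is the same value that `normalcdf` returns for the same bounds.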
THE END