1
EMPA P MGT 630
2
Workload ratios, efficiency measures, equity measures
You can create variables from other variables in useful ways that will allow you to understand more about the outcomes and performance indicators that interest you. Workload ratios are ratios of outputs to inputs (calls per call center worker, for example, or clients per social worker). Efficiency measures are ratios of outcomes or outputs per unit of input. These are generally the INVERSE of a cost ratio, which is input per unit produced. Equity measures examine the differences between groups (male to female ratio, etc.).
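As a minimal sketch of these derived variables, assuming made-up figures for a hypothetical call center (the function names and numbers are illustrative only, not course data):

```python
def workload_ratio(outputs, workers):
    """Outputs per worker, e.g. calls handled per call center worker."""
    return outputs / workers

def efficiency(outputs, cost):
    """Outputs per unit of input, e.g. calls handled per dollar spent."""
    return outputs / cost

def cost_ratio(outputs, cost):
    """Input per unit produced, e.g. dollars per call (inverse of efficiency)."""
    return cost / outputs

def equity_ratio(group_a, group_b):
    """One group's value relative to another's, e.g. a male to female ratio."""
    return group_a / group_b

# Hypothetical figures for illustration only.
calls, workers, budget = 12000, 40, 60000.0
print("calls per worker:", workload_ratio(calls, workers))
print("calls per dollar:", efficiency(calls, budget))
print("dollars per call:", cost_ratio(calls, budget))
print("male to female ratio:", equity_ratio(18, 22))
```

Multiplying the efficiency measure by the cost ratio gives 1, which is what "inverse" means here.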
6
Thinking About Distributions
How would a histogram of coin flips differ from a histogram of dice rolls? How would a histogram of rolls of a single die compare with a histogram of rolls of two or three dice? How would a histogram of years spent in graduate school look for the general population? How would a histogram of mean annual household income (by city) in a state differ from a histogram of annual household income (by individual; actual values)? What would a histogram of population look like, if you graphed frequency on the y-axis and time in years on the x-axis?
7
Distributions
For each distribution, look at an image (Wikipedia generally has images) and identify at least one relevant example of a variable that might result in that distribution:
Uniform distribution
Normal distribution
Binomial distribution
Student’s t-distribution
Poisson distribution
Hypergeometric distribution
Exponential distribution
Chi-squared distribution
Zero-inflated Poisson distribution
9
How would you expect a histogram of sample averages to differ from a histogram of raw sample scores?
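One way to explore this question is a small simulation. The sketch below assumes a fair six-sided die as the raw variable and hypothetical settings (2,000 samples of 30 rolls each); it compares the spread of the raw rolls with the spread of the sample averages.

```python
import random
import statistics

def simulate(n_samples=2000, sample_size=30, seed=0):
    """Roll a fair die in repeated samples; return raw rolls and sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    raw = []
    means = []
    for _ in range(n_samples):
        sample = [rng.randint(1, 6) for _ in range(sample_size)]
        raw.extend(sample)
        means.append(statistics.mean(sample))
    return raw, means

raw, means = simulate()
print("spread (std. dev.) of raw rolls:   ", round(statistics.pstdev(raw), 2))
print("spread (std. dev.) of sample means:", round(statistics.pstdev(means), 2))
```

The raw rolls are spread roughly evenly from 1 to 6, while the sample averages cluster tightly around 3.5, so their histogram is much narrower and more bell-shaped.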
10
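Because mean values are distributed differently than raw values, they have their own distribution curve. When the spread of the population has to be estimated from the sample itself, the curve used to describe the sample mean is called “Student’s t-distribution.” Unlike a normal curve, the shape of the t-distribution changes depending on the size of the sample: small samples give a curve with heavier tails, and as the sample grows the curve approaches the normal curve. The t-distribution can be used to make estimates about the population based on a sample (assuming the sample was drawn randomly).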
12
Based on what we know about a sample, we can draw a distribution of what we would expect to see in the population from which the sample was drawn. Using this new distribution as a tool, we can answer a variety of questions about the population based on our sample. We answer these questions using probabilities rather than absolutes.
13
Based on the mean or proportion observed in our sample, what would we expect the mean or proportion to be in the larger population? We answer this question with a range of values called a confidence interval. The “margin of error” is the distance, in the units of measurement of the variable, from the mean (or proportion) to either end of the confidence interval. “Within the margin of error” means “within the confidence interval” or “statistically equivalent.”
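A minimal sketch of a confidence interval and its margin of error for a sample proportion. It assumes the normal approximation (a simplification that works for reasonably large samples) and a hypothetical survey; the function name and figures are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Return (lower, upper, margin_of_error) for a sample proportion."""
    p_hat = successes / n
    # Critical value that leaves (1 - confidence) split across both tails.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin

# Hypothetical survey: 120 of 400 respondents answered "yes".
low, high, margin = proportion_ci(120, 400)
print(f"sample proportion 0.30, margin of error {margin:.3f}")
print(f"confidence interval {low:.3f} to {high:.3f}")
```

The margin of error is in the same units as the proportion itself, and the interval is the sample proportion plus or minus that margin.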
14
A confidence interval is the range of values that is bounded by the two values that mark the boundaries between being “likely” to be observed in the population based on a sample and “unlikely” to be observed. “Likely” and “unlikely” are subjective, and correspond to a particular tolerance for uncertainty. The amount of uncertainty tolerated is called α, and the corresponding “confidence level” is 1 − α. This confidence interval (range of population estimates) is centered on the sample mean or sample proportion.
15
α represents the uncertainty you are willing to live with. An α of 0.05 represents a tolerance of 5 percent uncertainty, and corresponds to a 95% confidence level. This is the standard.
α = 0.10 corresponds to a 90% confidence level
α = 0.05 corresponds to a 95% confidence level
α = 0.01 corresponds to a 99% confidence level
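To make the correspondence concrete, here is a small sketch that converts α into a confidence level and a two-tailed critical value. It uses the standard normal curve from Python's `statistics` module as a stand-in; this is an assumption for illustration, and a t-distribution gives somewhat larger critical values for small samples.

```python
from statistics import NormalDist

def confidence_level(alpha):
    """Confidence level that corresponds to a tolerated uncertainty alpha."""
    return 1 - alpha

def critical_z(alpha):
    """Two-tailed critical value: alpha is split evenly between both tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    level = confidence_level(alpha)
    print(f"alpha={alpha:.2f}  confidence level={level:.0%}  critical z={critical_z(alpha):.3f}")
```

A smaller α demands a larger critical value, which is why a 99% confidence interval is wider than a 95% interval built from the same sample.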
16
Generating a Confidence Interval
MPA Statistics/Descriptive statistics/summarize data set automatically generates confidence intervals (assuming normally distributed variables) for all numeric variables. MPA Statistics/Confidence intervals/ will allow you to select whether you have interval data (this will generate the same information as above) or binary data (which will use a binomial distribution instead of a t-distribution).
17
Using Confidence intervals
If you are extrapolating from a sample to a general population, you should report a confidence interval as your estimate (rather than, say, a mean or proportion). Confidence intervals account for sampling error. Any value INSIDE the confidence interval is STATISTICALLY EQUIVALENT to the mean or proportion. You can test whether or not a value is inside or outside a confidence interval by just determining whether the value is between the confidence interval cutoff values.
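The inside-or-outside check can be sketched as follows. The data are hypothetical response times, and the interval uses a normal approximation rather than the t-distribution, so treat it as an illustration of the logic rather than a reproduction of the MPA Statistics output.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_ci(values, alpha=0.05):
    """Approximate confidence interval for a mean (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * stdev(values) / sqrt(len(values))
    center = mean(values)
    return center - margin, center + margin

def inside(value, interval):
    """True if value lies between the confidence interval cutoff values."""
    low, high = interval
    return low <= value <= high

# Hypothetical response times in minutes (30 observations).
times = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13,
         12, 14, 15, 13, 14, 16, 12, 13, 14, 15,
         13, 14, 12, 15, 14, 13, 16, 14, 13, 15]

ci = mean_ci(times)
print("interval:", tuple(round(x, 2) for x in ci))
print("14 minutes inside?", inside(14, ci))
print("15 minutes inside?", inside(15, ci))
```

Here 14 minutes falls between the cutoffs and would be treated as statistically equivalent to the sample mean, while 15 minutes falls outside.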
18
One-sample tests
Are we likely to observe a particular mean or proportion value in the population based on what we have observed in our sample? We use probabilities to answer this question, but the answer itself is either a “yes” or “no.”
19
If a particular value were the true mean or proportion in the population, how probable would it be to draw a sample at least as far from that value as the one we observed? This probability is known as the p-value. If this probability is sufficiently small (i.e., a gap that large is unlikely to have arisen by chance), we say that the value is significantly different from the mean or proportion in our sample. The threshold for “sufficiently small” is α.
21
One important concept in tests of this kind is whether you use a one-tailed or two-tailed test of significance. A one-tailed test uses all of your available uncertainty (“wiggle room”) on one side of the distribution or the other. A two-tailed test divides the uncertainty between the two tails. A two-tailed test is always appropriate, and is a more conservative test. Therefore, use the two-tailed test. If you want a less conservative test, just change α.
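A short sketch of why the two-tailed test is the more conservative choice. It assumes a hypothetical test statistic of 1.8 measured on the standard normal curve; the function names are illustrative.

```python
from statistics import NormalDist

def one_tailed_p(z):
    """Upper-tail p-value: all of the uncertainty sits on one side."""
    return 1 - NormalDist().cdf(z)

def two_tailed_p(z):
    """Two-tailed p-value: the uncertainty is divided between both tails."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

z = 1.8  # hypothetical test statistic
alpha = 0.05
print("one-tailed p:", round(one_tailed_p(z), 4), "different?", one_tailed_p(z) < alpha)
print("two-tailed p:", round(two_tailed_p(z), 4), "different?", two_tailed_p(z) < alpha)
```

For a positive statistic the two-tailed p-value is twice the one-tailed value, so the same result can clear α = 0.05 in a one-tailed test and fail to clear it in a two-tailed test.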
23
Benchmarking
One-sample tests are a great way to test benchmark values. Use the target benchmark as the hypothesized value. The benchmark is a single value (not a distribution of values) so it makes a good hypothesis. A one-sample test allows you to determine whether or not your sample is statistically equivalent to (as compared with significantly above or below) the benchmark value.
24
Performing One-Sample Tests
Identify your benchmark value (the value you want to determine as statistically “the same” as or “different” from your sample).
Go to MPA Statistics/single sample tests and select the one for interval or binary.
Enter the hypothesized value (mean or proportion) and select the variable you’d like to compare it with.
Use a two-tailed test.
The p-value tells you how probable it would be to see a sample at least as far from the hypothesized value as yours if the hypothesized value were the true mean or proportion. If the p-value is smaller than α (0.05 by default), the value is DIFFERENT. Otherwise, the value is statistically equivalent to the mean or proportion observed in your sample.
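The same procedure can be sketched in code for binary data. This is not the MPA Statistics implementation: it swaps in a normal-approximation z-test for the binomial test, and the benchmark and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_proportion_test(successes, n, benchmark, alpha=0.05):
    """Two-tailed test of a sample proportion against a benchmark value.

    Returns (p_value, different), where different is True when the
    p-value is smaller than alpha.
    """
    p_hat = successes / n
    se = sqrt(benchmark * (1 - benchmark) / n)  # standard error under the benchmark
    z = (p_hat - benchmark) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value, p_value < alpha

# Hypothetical: 156 of 200 cases closed on time, against a benchmark of 85 percent.
p_value, different = one_sample_proportion_test(156, 200, 0.85)
print(f"p-value {p_value:.4f}, different from benchmark: {different}")

# Hypothetical: 166 of 200 cases closed on time, same benchmark.
p_value, different = one_sample_proportion_test(166, 200, 0.85)
print(f"p-value {p_value:.4f}, different from benchmark: {different}")
```

In the first case the sample falls well short of the benchmark and the p-value is below 0.05; in the second the gap is small enough to be explained by sampling error.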
25
In statistics, the difference between two values is called “statistically significant” if two values are found to be different enough that they are unlikely to occur on the same distribution. “Statistically significant” is very different from the word “significant” in English, which invokes concepts like important, substantive, noteworthy, etc. DO NOT use the term “significant” or “statistically significant” in a statistical report as though it shares the English meaning.
26
Sample size, sampling error, and statistical power
The larger the number of observations, the more precise your mean and proportion values will be; in other words, they will more accurately portray the sampled population. The more observations, the narrower the confidence interval. The number of observations you have is thus related to “statistical power,” or the ability to distinguish actual differences between values or variables from noise due to sampling error. You can pre-determine your tolerated margin of error (half the width of the confidence interval) by calculating how many observations you would need in order to have a certain level of precision. The formula for this differs from test to test, but the formulas can easily be found online. Note that you can use almost any sample size, but it will change the amount of sampling error “noise” you have to deal with in trying to answer your questions. The more noise, the less reliable the results of the tests.
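As one example of such a formula, the sketch below uses the common normal-approximation formula for a proportion, n = z² · p(1 − p) / E², where E is the tolerated margin of error. Using p = 0.5 is an assumption that gives the largest (most cautious) sample size; other tests use different formulas.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, alpha=0.05, p=0.5):
    """Observations needed so a proportion's margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

for margin in (0.10, 0.05, 0.03):
    n = sample_size_for_proportion(margin)
    print(f"margin of error {margin:.2f} -> about {n} observations")
```

Because the margin of error shrinks with the square root of the sample size, halving the margin requires roughly four times as many observations.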