PUAF 610 TA Session 4, 10/3/2015
Some words My
– Things to be discussed in TA
– Questions on the course and problem sets
Today
– Problem Set 1
– Probability
– Sampling
– Standard error
– STATA
[Diagram: types of data. Node labels: data, numerical, categorical, continuous, discrete, interval, proportion, nominal, ordinal, dichotomous, non-dichotomous]
Measurement scales
All measurements in science are made on one of four types of scales: "nominal", "ordinal", "interval" and "ratio".
Qualitative data (unordered or ordered discrete categories):
1. Nominal – numbers are used only as labels for the elements (e.g. gender, party affiliation, states of a country).
2. Ordinal – elements in the dataset can be ordered by the amount of the property being measured, and values are assigned in this same order (e.g. ratings).
Measurement scales
Quantitative data (variables have underlying continuity):
3. Interval – a measurement scale in which a given distance along the scale means the same thing no matter where on the scale you are, but where "0" (zero) does not represent the absence of the thing being measured (e.g. temperature in degrees Celsius or Fahrenheit).
4. Ratio – a measurement scale in which a given distance along the scale means the same thing no matter where on the scale you are, and where "0" (zero) represents the absence of the thing being measured (e.g. money).
Events
Event vs. observation: an event is any collection of outcomes; an observation is a single observed outcome.
– Simple event: any event that cannot be subdivided into other events.
– Compound event: any event that is composed of two or more simple events.
– Sample space: the event that contains all possible outcomes.
Events
The union of events contains the simple events that are members of at least one of the original events.
The intersection of events contains the simple events that are members of both of the original events.
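Union and intersection map directly onto set operations. The sketch below is only an illustration: the fair six-sided die and the two events `even` and `at_least_4` are assumptions chosen for the example, not part of the slides.

```python
# Sample space for one roll of a six-sided die (illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}

# Two compound events, each a set of simple events.
even = {2, 4, 6}        # "the roll is even"
at_least_4 = {4, 5, 6}  # "the roll is 4 or more"

union = even | at_least_4         # simple events in at least one event
intersection = even & at_least_4  # simple events in both events

print(sorted(union))         # [2, 4, 5, 6]
print(sorted(intersection))  # [4, 6]
```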
Events
Mutually exclusive events: events that have no simple events, and hence no observations, in common.
Independent events: the probability of one is not affected by whether the other has occurred.
Probability
Probability deals with the long-term likelihood of the occurrence of particular outcomes on variables of interest.
When all outcomes (simple events) are equally likely, the probability of an event is the ratio of the number of outcomes that include the event to the total number of outcomes.
P(A) = Number of outcomes that include A / Total number of possible outcomes
Probability
Probability of an event = p, where 0 <= p <= 1
0 = certain non-occurrence
1 = certain occurrence
Example
Choose a number at random from 1 to 5.
– What is the probability of each outcome?
– What is the probability that the number chosen is even?
– What is the probability that the number chosen is odd?
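One way to check the answers is to count outcomes directly. This is a minimal sketch that assumes each of the five numbers is equally likely; the helper name `prob` is made up for the illustration.

```python
from fractions import Fraction

outcomes = range(1, 6)  # the numbers 1 through 5, assumed equally likely

def prob(event):
    """Classical probability: favorable outcomes / total outcomes."""
    favorable = [x for x in outcomes if event(x)]
    return Fraction(len(favorable), len(outcomes))

p_each = Fraction(1, len(outcomes))  # any single number
p_even = prob(lambda x: x % 2 == 0)  # 2 and 4
p_odd = prob(lambda x: x % 2 == 1)   # 1, 3 and 5

print(p_each, p_even, p_odd)  # 1/5 2/5 3/5
```

Because "even" and "odd" are complements here, the two probabilities sum to 1.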
Example
A glass jar contains 6 red, 5 green, 8 blue and 3 yellow marbles. If a single marble is chosen at random from the jar, what is the probability of choosing a red marble? A green marble? A blue marble? A yellow marble?
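The jar example can be worked the same way, assuming every marble is equally likely to be drawn. The dictionary layout below is just one convenient way to hold the counts.

```python
from fractions import Fraction

# Marble counts from the example.
marbles = {"red": 6, "green": 5, "blue": 8, "yellow": 3}
total = sum(marbles.values())  # 22 marbles in the jar

# P(color) = marbles of that color / all marbles
probs = {color: Fraction(count, total) for color, count in marbles.items()}

for color, p in probs.items():
    print(f"P({color}) = {p}")
# P(red) = 3/11
# P(green) = 5/22
# P(blue) = 4/11
# P(yellow) = 3/22
```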
Rules of probability
Probability of the union of two events
The probability of event A OR event B is equal to the sum of their respective probabilities minus the probability of the intersection of the events:
P(A or B) = P(A) + P(B) - P(A and B)
If A and B are mutually exclusive events, then P(A and B) = 0, so
P(A or B) = P(A) + P(B)
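The union rule can be checked by counting. In this sketch the fair die and the events `a`, `b`, `c` and `d` are assumptions picked for illustration; `p` is a hypothetical helper that applies the equally-likely-outcomes formula.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed)

def p(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# Overlapping events: the intersection must be subtracted once.
a = {2, 4, 6}  # even
b = {4, 5, 6}  # 4 or more
direct = p(a | b)                 # count the union directly
by_rule = p(a) + p(b) - p(a & b)  # 1/2 + 1/2 - 1/3
print(direct, by_rule)  # 2/3 2/3

# Mutually exclusive events: the intersection is empty.
c = {1, 2}
d = {5, 6}
print(p(c | d), p(c) + p(d))  # 2/3 2/3
```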
Rules of probability
Probability of the intersection of two events
The probability that both A and B occur is equal to the probability that A occurs times the probability that B occurs given that A has occurred:
P(A and B) = P(A) * P(B|A)
If events A and B are independent, then P(B|A) = P(B), so
P(A and B) = P(A) * P(B)
Rules of probability
The probability that event A will occur is equal to 1 minus the probability that event A will not occur.
P(A) = 1 - P(A')
Rules of probability
Conditional probability
The probability of event A given that event B has occurred is equal to the probability of the intersection of the events divided by the probability of event B:
P(A|B) = P(A and B) / P(B), provided P(B) > 0
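A small counting example may make the conditional-probability rule concrete. The die and the events below are illustrative assumptions: A is "the roll is a 6" and B is "the roll is even".

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed)

def p(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

a = {6}        # the roll is a 6
b = {2, 4, 6}  # the roll is even

# P(A|B) = P(A and B) / P(B)
p_a_given_b = p(a & b) / p(b)
print(p_a_given_b)  # 1/3
```

Knowing that the roll is even shrinks the set of possible outcomes to three, of which one is a 6.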
Example
Suppose a high school consists of 25% juniors and 15% seniors, and the remaining 60% are students in other grades. What is the relative frequency of students who are either juniors or seniors?
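Since a student cannot be both a junior and a senior, the two groups are mutually exclusive and their relative frequencies simply add. A minimal sketch of the arithmetic, using exact fractions to avoid floating-point rounding:

```python
from fractions import Fraction

p_junior = Fraction(25, 100)
p_senior = Fraction(15, 100)

# Mutually exclusive events: P(A or B) = P(A) + P(B)
p_junior_or_senior = p_junior + p_senior
print(p_junior_or_senior, float(p_junior_or_senior))  # 2/5 0.4
```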
Example
Suppose we have two dice. A is the event that 6 shows on the first die, and B is the event that 6 shows on the second die. If both dice are rolled at once, what is the probability that two 6s occur?
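The two dice do not influence each other, so A and B are independent and the multiplication rule applies. The sketch below assumes fair dice and compares the rule with a direct count over all 36 ordered rolls.

```python
from fractions import Fraction
from itertools import product

# All ordered (first die, second die) results for two fair dice.
rolls = list(product(range(1, 7), repeat=2))  # 36 outcomes

both_six = [r for r in rolls if r == (6, 6)]
p_enumerated = Fraction(len(both_six), len(rolls))

# Independent events: P(A and B) = P(A) * P(B)
p_rule = Fraction(1, 6) * Fraction(1, 6)

print(p_enumerated, p_rule)  # 1/36 1/36
```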
Example
A box contains 6 red marbles and 4 black marbles. Two marbles are drawn without replacement from the box. What is the probability that both of the marbles are black?
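Without replacement, the second draw depends on the first, so the general rule P(A and B) = P(A) * P(B|A) is needed. This sketch assumes every marble is equally likely to be picked and cross-checks the rule by listing every ordered pair of distinct marbles.

```python
from fractions import Fraction
from itertools import permutations

box = ["red"] * 6 + ["black"] * 4

# Rule: P(first black) * P(second black | first black)
p_rule = Fraction(4, 10) * Fraction(3, 9)

# Check: every ordered pair of two different marbles (10 * 9 = 90 pairs).
draws = list(permutations(range(len(box)), 2))
both_black = [d for d in draws
              if box[d[0]] == "black" and box[d[1]] == "black"]
p_enumerated = Fraction(len(both_black), len(draws))

print(p_rule, p_enumerated)  # 2/15 2/15
```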
Example
Suppose we repeat the previous experiment, but this time we select marbles with replacement. That is, we select one marble, note its color, and then replace it in the box before making the second selection. When we select with replacement, what is the probability that both of the marbles are black?
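With replacement the box is the same for both draws, so the draws are independent. The sketch below assumes the same box as in the preceding example (6 red and 4 black marbles) and again cross-checks the rule by counting.

```python
from fractions import Fraction
from itertools import product

box = ["red"] * 6 + ["black"] * 4  # same box as the preceding example

# Independent draws: P(A and B) = P(A) * P(B)
p_rule = Fraction(4, 10) * Fraction(4, 10)

# Check: every ordered pair, where the same marble may be drawn twice.
draws = list(product(range(len(box)), repeat=2))  # 10 * 10 = 100 pairs
both_black = [d for d in draws
              if box[d[0]] == "black" and box[d[1]] == "black"]
p_enumerated = Fraction(len(both_black), len(draws))

print(p_rule, p_enumerated)  # 4/25 4/25
```

The result, 4/25 = 0.16, is larger than the 2/15 obtained without replacement, because the first black marble goes back into the box.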
Example
A student goes to the library. The probability that she checks out (a) a work of fiction is 0.40, (b) a work of non-fiction is 0.30, and (c) both fiction and non-fiction is ___. What is the probability that the student checks out a work of fiction, non-fiction, or both?
Example
At Kennedy Middle School, the probability that a student takes Technology and Spanish is ___. The probability that a student takes Technology is ___. What is the probability that a student takes Spanish given that the student is taking Technology?
Sampling
– Simple random sampling
– Systematic sampling
– Cluster sampling
– Stratified random sampling
– Multistage sampling
Sampling distribution of the mean
When using samples we inevitably face the problem of sampling error, which is defined as the difference between the population mean (μ) and the sample mean (x̄).
We can provide a probabilistic estimate of the accuracy of the sample mean through a theoretical sampling distribution.
Sampling distribution of the mean
Central tendency: the expected value of the mean of the distribution of sample means is equal to the population mean, E(x̄) = μ.
Variance: the expected value of the variance for the sampling distribution of the mean is σ²/n, where σ² is the variance in the population and n is the sample size.
Sampling distribution of the mean
The standard deviation of the sampling distribution of the mean is σ/√n, where σ is the standard deviation in the population (it can be approximated by the sample standard deviation) and n is the sample size.
The standard deviation of the sampling distribution of the mean is called the standard error of the mean.
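The standard error formula is a one-liner. In this sketch the function name `standard_error` and the numbers passed to it (a standard deviation of 10 and samples of 25 and 100) are hypothetical values for illustration only.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n).

    sd is the population standard deviation, or the sample standard
    deviation used as an approximation; n is the sample size.
    """
    return sd / math.sqrt(n)

print(standard_error(10, 25))   # 2.0
print(standard_error(10, 100))  # 1.0
```

Quadrupling the sample size halves the standard error, because n enters through its square root.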
Sampling distribution of the mean
As n increases, the standard error decreases.
As n increases, the shape of the sampling distribution of the mean (SDM) becomes more like the normal distribution, even if the variable is not normally distributed in the population.
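A short simulation can illustrate the first statement. Everything here is an assumption made for the sketch: the population is exponential with rate 1 (a skewed distribution whose standard deviation is 1), the sample sizes are 4, 25 and 100, and 1,000 samples are drawn for each size. It compares the observed spread of the sample means with σ/√n; it does not test normality.

```python
import random
import statistics

random.seed(610)  # fixed seed so the run is repeatable

def sample_means(n, reps=1000):
    """Means of `reps` random samples of size n from an exponential(1) population."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

results = {}
for n in (4, 25, 100):
    observed = statistics.stdev(sample_means(n))  # spread of the sample means
    theory = 1.0 / n ** 0.5                       # sigma / sqrt(n), with sigma = 1
    results[n] = (observed, theory)
    print(f"n={n:>3}  observed SE={observed:.3f}  sigma/sqrt(n)={theory:.3f}")
```

The observed values should fall close to 0.5, 0.2 and 0.1 and shrink as n grows, though the exact figures depend on the random draws.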
Standard error of a proportion
The standard error of a proportion is the standard deviation of its sampling distribution.
Because each observation has only two possible outcomes, the sampling distribution is binomial; with relatively large sample sizes, however, it approximates the normal distribution.
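For reference, the usual formula for the standard error of a proportion is sqrt(p(1 - p)/n), where p is the proportion and n is the sample size. The sketch below applies it; the function name `se_proportion` and the inputs (p = 0.5 with n = 100 and n = 400) are hypothetical.

```python
import math

def se_proportion(p, n):
    """Standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

print(round(se_proportion(0.5, 100), 3))  # 0.05
print(round(se_proportion(0.5, 400), 3))  # 0.025
```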
Standard errors and statistical precision
Statistical precision is reflected in standard errors, which measure the variability of the sampling distribution of a statistic.
Small standard errors imply a more precise estimate.
When the sample is large and representative, the standard error will be small.
STATA
Beginner’s Guide to SAS & STATA Software (Dept. of Agricultural & Applied Economics, UGA) http:// de.pdf
Learning by practice!
Stata commands
summarize
univar
Stata commands
univar, boxplot
graph box
Stata commands
hist varname, norm
Stata commands
set obs #
generate varname=rbinomial(1,p)
table varname