Published by Michael Snow. Modified over 9 years ago.
1
URBP 204A QUANTITATIVE METHODS I Statistical Analysis Lecture II Gregory Newmark San Jose State University (This lecture accords with Chapters 6, 7, & 8 of Neil Salkind’s Statistics for People Who (Think They) Hate Statistics)
2
Populations and Samples Populations – All the people in a specified group of people The population of Students at SJSU The population of Students in Urban Planning at SJSU The population of Students in 204A this semester Samples – A portion of a larger population selected for study A 500 person Sample of Students at SJSU A 50 person Sample of Students in Urban Planning A 15 person Sample of Students in 204A this semester
3
Populations and Samples Ideally, research covers entire populations – “Medicine X always cures the common cold” Financially, research is expensive – “We can’t afford to test Medicine X on everyone” Practically, we test samples of a population – “We can afford to test Medicine X on 1,000 people” Hopefully, those samples well represent the actual population – “For our results to be generalizable, our 1,000 people should approximate the characteristics of everyone”
4
Populations and Samples
5
Sampling Error – A measure of how well a sample approximates the characteristics of the larger population – The difference between a sample statistic (i.e., a value computed from the sample) and a population parameter (i.e., the corresponding value in the population) – Low sampling error means higher precision – Higher precision means more generalizability – Valuable research has a high degree of generalizability
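The definition above can be illustrated with a small simulation. This is a minimal sketch, not part of the lecture: the population is a set of simulated heights, and the names `sampling_error` and `average_abs_error` are hypothetical helpers introduced only for this example. It compares the typical sampling error of the mean for a 15 person sample and a 500 person sample.

```python
import random
import statistics

# Hypothetical population: 5,000 simulated heights in inches (assumed values,
# not data from the lecture).
rng = random.Random(204)
population = [rng.gauss(66, 3) for _ in range(5000)]
population_mean = statistics.fmean(population)  # the population parameter


def sampling_error(sample_size):
    """Return the sample mean minus the population mean for one random sample."""
    sample = rng.sample(population, sample_size)
    return statistics.fmean(sample) - population_mean


def average_abs_error(sample_size, repeats=200):
    """Average the absolute sampling error over many repeated samples."""
    return statistics.fmean(
        abs(sampling_error(sample_size)) for _ in range(repeats)
    )


small = average_abs_error(15)
large = average_abs_error(500)
print(f"n=15:  typical sampling error {small:.2f} in")
print(f"n=500: typical sampling error {large:.2f} in")
```

Under these assumptions the larger sample should show the smaller typical error, which is the sense in which low sampling error means higher precision.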
6
Questions and Hypotheses Research Questions (Problem Statements) – What you are trying to investigate Hypotheses – Translate the research question into a testable form
7
Hypotheses Null Hypothesis – Assumption that no relationship exists in population – Statements of equality – Examples “There is no relationship between reaction time and problem solving ability” “There is no difference in the average GRE scores of women and men” – Purposes (Null Hypothesis cannot be tested directly) Starting point for research – Until you prove a difference you have to assume none exists Benchmark to compare observations – Defines a range within which observed difference may be due to chance
8
Hypotheses
9
Research Hypothesis – Definitive statement that a relationship exists in a sample – Statements of inequality – Examples “There is a positive relationship between reaction time and problem solving ability” “There is a difference in the average GRE scores of women and men” – Two Types Non-directional – there is a difference but its direction is unspecified Directional – there is a difference and its direction is specified – Purpose – to provide a hypothesis for direct testing
10
Hypotheses Should be stated in a clear, forceful, declarative form – “Students who complete all assignments will get higher grades in 204A than those who do not.” Should be expressed succinctly – Avoid excessive verbiage that can confuse your readers Should posit an expected relationship between variables – This will focus the research and avoid a ‘scattershot’ approach Should reflect theory or literature – This ensures that the researcher has investigated the issue in advance Should be testable – One can actually carry out the research – Defines how measurement will happen
11
Hypotheses Quotes The great tragedy of Science - the slaying of a beautiful hypothesis by an ugly fact. – Thomas H. Huxley (1825 - 1895) There are two possible outcomes: If the result confirms the hypothesis, then you've made a measurement. If the result is contrary to the hypothesis, then you've made a discovery. – Enrico Fermi (1901 - 1954) It is a good morning exercise for a research scientist to discard a pet hypothesis every day before breakfast. It keeps him young. – Konrad Lorenz (1903 - 1989) For every fact there is an infinity of hypotheses. – Robert M. Pirsig (1928 - 2017)
12
Inferential Statistics Descriptive Statistics describe a data set – “The average height in this class is 5’6” with a standard deviation of 3”.” Inferential Statistics are used to make inferences from sample data to populations – “Based on our class data, we infer that the average height at SJSU is 5’6” with a standard deviation of 3”.”
13
Inferential Statistics
14
The Normal Curve Visual representation of a distribution of scores with the following characteristics – Mean, median, and mode are the same – Symmetry around the mean (or mode or median) – Tails of curve approach zero asymptotically
15
The Normal Curve
16
We can exploit these properties of the normal curve to compare distributions with different means and standard deviations, by putting them into standard scores based on the standard deviation Basically, we can compare curves by discussing their standard deviations
17
Z-Scores A commonly used standardized score Represent the number of standard deviations a raw score falls from the mean Result of dividing the amount that a raw score differs from the mean of a distribution by the standard deviation of that distribution z = (X - Xbar) / s, where z = z score; X = individual score; Xbar = mean; s = standard deviation
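The verbal definition of a z score translates directly into code. The following is a minimal sketch; the function name `z_score` and the two sets of test scores are assumptions made up for illustration, not values from the lecture. It shows how standard scores let us compare raw scores from distributions with different means and standard deviations.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that raw score x lies from the mean."""
    return (x - mean) / sd


# Hypothetical scores on two tests with different scales (assumed numbers).
math_z = z_score(82, mean=70, sd=8)        # (82 - 70) / 8
verbal_z = z_score(560, mean=500, sd=100)  # (560 - 500) / 100

print(f"math z = {math_z:.2f}, verbal z = {verbal_z:.2f}")
if math_z > verbal_z:
    print("The math score is further above its mean, in SD units.")
```

Although 560 is a larger raw number than 82, the math score sits further above its own mean once both are expressed in standard deviations.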
18
Z-Scores Characteristics – Z scores above the mean are: Positive To the right of the mean In the upper half of the distribution – Z scores below the mean are: Negative To the left of the mean In the lower half of the distribution – Z scores have associated probabilities
19
Z-Scores Every z score has an associated probability We can use that property to test hypotheses This property enables inferential statistics We can assess whether an event is due to chance or reflects some research finding Typically, we reject the null hypothesis if an event has less than a 5% chance of occurring when the null hypothesis is true In that case, the research hypothesis likely makes more sense
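The link between z scores and probabilities can be sketched with Python's standard library, which provides `statistics.NormalDist`. This is only an illustration: the helper names `prob_above` and `two_tailed_p` and the sample z values are assumptions introduced here, and the 5% cutoff is the conventional level mentioned above.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)
ALPHA = 0.05  # the conventional 5% cutoff


def prob_above(z):
    """Probability of a standard score greater than z under the normal curve."""
    return 1 - standard_normal.cdf(z)


def two_tailed_p(z):
    """Probability of a score at least as far from the mean as z, either side."""
    return 2 * (1 - standard_normal.cdf(abs(z)))


# Hypothetical z scores, chosen only to show the pattern.
for z in (0.5, 1.0, 1.96, 2.5):
    p = two_tailed_p(z)
    decision = "reject the null" if p < ALPHA else "keep the null"
    print(f"z = {z:4.2f}  two-tailed p = {p:.4f}  -> {decision}")
```

The further a z score lies from zero, the smaller its associated probability, and only the scores beyond roughly 1.96 fall under the 5% threshold in this two-tailed version.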
20
Class Lab Have everyone report their height in inches Determine class mean Determine class standard deviation Calculate z score for your height What percentage of the class is taller than you? (see chart in back of book or online) Have everyone move the data into SPSS and repeat the experiment
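For anyone working outside SPSS, the lab can be approximated in a few lines of Python. This is a sketch under stated assumptions: the `heights` list and `my_height` are made-up values standing in for the class data, and the "percentage taller" figure from the z score assumes heights are normally distributed, just as the chart in the back of the book does.

```python
import statistics
from statistics import NormalDist

# Hypothetical class heights in inches (made-up values standing in for the
# numbers students would report in the lab).
heights = [62, 64, 65, 66, 66, 67, 68, 69, 70, 71, 72, 63, 66, 68, 65]
my_height = 68  # assumed height of the student doing the exercise

class_mean = statistics.fmean(heights)
class_sd = statistics.stdev(heights)  # sample standard deviation (n - 1)

# z score for my height: distance from the class mean in SD units.
my_z = (my_height - class_mean) / class_sd

# Share of the class expected to be taller, from the normal curve.
pct_taller_normal = 100 * (1 - NormalDist().cdf(my_z))

# Share of the class that is actually taller, counted directly.
pct_taller_actual = 100 * sum(h > my_height for h in heights) / len(heights)

print(f"mean = {class_mean:.1f} in, sd = {class_sd:.2f} in, z = {my_z:.2f}")
print(f"taller (normal curve): {pct_taller_normal:.1f}%")
print(f"taller (actual count): {pct_taller_actual:.1f}%")
```

With a class this small the two percentages will rarely match exactly, which is a useful reminder that the normal curve is a model of the data rather than the data itself.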
21
The Normal Curve The Normal Law by W.J. Youden (1900 - 1971) THE NORMAL LAW OF ERROR STANDS OUT IN THE EXPERIENCE OF MANKIND AS ONE OF THE BROADEST GENERALIZATIONS OF NATURAL PHILOSOPHY... IT SERVES AS THE GUIDING INSTRUMENT IN RESEARCHES IN THE PHYSICAL AND SOCIAL SCIENCES AND IN MEDICINE, AGRICULTURE, AND ENGINEERING. IT IS AN INDISPENSABLE TOOL FOR THE ANALYSIS AND THE INTERPRETATION OF THE BASIC DATA OBTAINED BY OBSERVATION AND EXPERIMENT
22
Statistical Significance Refers to whether an observed effect is likely due to chance or to systematic influence. – “There is a positive statistically significant relationship between GDP and average life span.” – Statistical significance makes the null hypothesis a less attractive explanation than the research hypothesis Ideally, research would control for all other factors, but in practice there will be uncontrolled error. – “There is a chance that a low GDP nation will have a higher average life span, due to unaccounted for factors.” Researchers ultimately define the level of certainty they are willing to accept in determining significance. – “There is a 1 in 20 chance that the observed effect is not due to the hypothesized reason, and we can live with that.” – This is called the significance level (or critical p-value).
23
Significance Levels can Vary
24
Statistical Significance To review: – First, hypothesize a relationship Null Hypothesis means no relationship (often implied) Research Hypothesis means there is a relationship – Second, test the research hypothesis Define your significance level Do your experiment – Third, based on your findings either: Reject the null and accept the research hypothesis Accept the null and reject the research hypothesis
25
Statistical Significance Data and Dating – Is this enough to reject the null hypothesis?
26
Statistical Significance Null Hypotheses can be either true or false – If true, there is an equality – If false, there is an inequality The Null Hypothesis cannot be directly tested – This presents a problem because one might reject the null when it is true (Type I) or accept it when it is false (Type II) – Four options: (1) No Problem: Accept the Null Hypothesis when there is truly no difference between groups; (2) Type I Error (False Positive): Reject the Null Hypothesis when there is truly no difference between groups; (3) Type II Error (False Negative): Accept the Null Hypothesis when there truly are differences between groups; (4) No Problem: Reject the Null Hypothesis when there truly are differences between groups
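The two error types can be made concrete with a simulation. The sketch below is an illustration only: it assumes a one-sample z test with a known standard deviation, a sample size of 30, and a true effect of 0.3 standard deviations, none of which come from the lecture, and `rejects_null` is a hypothetical helper. It counts how often the null is rejected when it is true (Type I) and how often it is kept when it is false (Type II).

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(7)
ALPHA = 0.05
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-tailed cutoff


def rejects_null(true_mean, n=30, sd=1.0, null_mean=0.0):
    """Draw one sample and report whether a z test rejects the null."""
    sample = [rng.gauss(true_mean, sd) for _ in range(n)]
    z = (fmean(sample) - null_mean) / (sd / n ** 0.5)
    return abs(z) > critical_z


trials = 2000

# Null is true (no difference): every rejection is a Type I error.
type_i_rate = sum(rejects_null(0.0) for _ in range(trials)) / trials

# Null is false (assumed true mean of 0.3): every non-rejection is a Type II error.
type_ii_rate = sum(not rejects_null(0.3) for _ in range(trials)) / trials

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

The Type I rate should land near the chosen significance level, while the Type II rate depends on the size of the true difference and the sample size assumed here.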
27
Significant vs. Meaningful Statistically significant does not always imply the finding is meaningful – “There is a statistically significant ¼ inch difference in the heights of women and men.” – “There is a statistically significant $0.50 difference in the per capita tax returns of married couples versus singles.” Large samples will almost always find statistically significant differences. The researcher needs to assess the meaning of the outcomes by considering their context.
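The claim about large samples can be checked with simple arithmetic. In this sketch the ¼ inch difference comes from the slide, while the 3 inch standard deviation, the group sizes, and the function `two_sample_z` are assumptions made for illustration. It uses the z statistic for a difference between two group means with equal group sizes and a common, known standard deviation.

```python
from statistics import NormalDist

critical = NormalDist().inv_cdf(0.975)  # two-tailed cutoff at the 5% level


def two_sample_z(diff, sd, n_per_group):
    """z statistic for a mean difference between two equal-sized groups."""
    standard_error = sd * (2 / n_per_group) ** 0.5
    return diff / standard_error


# Same quarter-inch difference, increasingly large (hypothetical) samples.
for n in (100, 1000, 10000):
    z = two_sample_z(diff=0.25, sd=3.0, n_per_group=n)
    verdict = "significant" if z > critical else "not significant"
    print(f"n per group = {n:5d}  z = {z:.2f}  -> {verdict}")
```

The difference never changes, yet it crosses the significance threshold once the samples are large enough, so whether ¼ inch matters remains a question of context rather than of the test.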
28
Statistical Significance Revisited Steps: – State hypothesis – Set significance level associated with null hypothesis – Select statistical test (we will learn these soon) – Computation of obtained test statistic value – Computation of critical test statistic value – Comparison of obtained and critical values: if obtained > critical, reject the null hypothesis; if obtained < critical, stick with the null hypothesis
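The steps above can be sketched end to end. Since the specific tests come later in the course, this example assumes a one-sample z test with a known standard deviation as a stand-in; the null value of 66 inches, the standard deviation of 3 inches, and the sample heights are all made up for illustration. For a two-tailed test the comparison uses the magnitude of the obtained value.

```python
from statistics import NormalDist, fmean

# Step 1: state the hypothesis.
# Null: the mean height equals 66 inches (assumed value).
NULL_MEAN = 66.0
KNOWN_SD = 3.0  # assumed population standard deviation

# Step 2: set the significance level.
ALPHA = 0.05

# Step 3: select the test (here a two-tailed, one-sample z test).
sample = [62, 64, 65, 66, 66, 67, 68, 69, 70, 71, 72, 63, 66, 68, 65]

# Step 4: compute the obtained test statistic.
standard_error = KNOWN_SD / len(sample) ** 0.5
obtained = (fmean(sample) - NULL_MEAN) / standard_error

# Step 5: compute the critical value for the chosen significance level.
critical = NormalDist().inv_cdf(1 - ALPHA / 2)

# Step 6: compare obtained and critical values.
reject_null = abs(obtained) > critical

print(f"obtained = {obtained:.3f}, critical = {critical:.3f}")
print("Reject the null" if reject_null else "Stick with the null")
```

With these made-up heights the obtained value falls short of the critical value, so the sketch sticks with the null hypothesis.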
29
Statistical Significance Revisited One Tailed Test
30
Statistical Significance Revisited Two Tailed Test
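The difference between one-tailed and two-tailed tests shows up in the critical value. The sketch below assumes a z test at the 5% level; the obtained value of 1.80 is hypothetical and chosen only because it falls between the two cutoffs.

```python
from statistics import NormalDist

ALPHA = 0.05
standard_normal = NormalDist()

# One-tailed (directional): all 5% of the rejection region sits in one tail.
one_tailed_critical = standard_normal.inv_cdf(1 - ALPHA)

# Two-tailed (non-directional): 2.5% of the rejection region in each tail.
two_tailed_critical = standard_normal.inv_cdf(1 - ALPHA / 2)

obtained = 1.80  # hypothetical obtained z value

print(f"one-tailed critical z = {one_tailed_critical:.3f}")
print(f"two-tailed critical z = {two_tailed_critical:.3f}")
print("one-tailed: reject" if obtained > one_tailed_critical else "one-tailed: keep")
print("two-tailed: reject" if abs(obtained) > two_tailed_critical else "two-tailed: keep")
```

A directional hypothesis concentrates the rejection region in one tail, so the same obtained value can be significant in a one-tailed test and not in a two-tailed one.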
31
Inferential Statistics Revisited Inference allows decisions to be made about populations based on information about samples. Steps: – Take a representative sample – Test each member of the sample – Analyze data to determine if variation is due to chance (accept null hypothesis) or statistically significant (accept research hypothesis) – Conclusions inferred about population
32
Inferential Statistics Revisited