Psychology 230: Psychological Measurement and Statistics There are three kinds of lies: Lies, damned lies, and statistics. -Benjamin Disraeli
Interesting Statistics More than 60% of all accidents occur within 2 miles of one’s home. The national median salary is $30,000. There are more right-handed people than left-handed people. The safest way to travel is….FLYING!
How bad is this going to be? Statistics is not math (it’s a way of organizing and interpreting info…but it does use some mathematical procedures). Math fears: Basic Mathematical Review (Appendix A in your book). Don’t be overwhelmed by research papers/articles/reports. Symbols (e.g. ). Jargon (e.g. ANOVA = analysis of variance). Based in logic.
The Basics Terms and Concepts
Foundations and Common Terms: Populations vs. Samples Data: numbers or measurements collected. Population: complete set of people/objects (scores) having some common characteristic; the entire collection of events. Its size is denoted by N. Sample: subset of the population that shares the same characteristic and is used to infer characteristics of the population. Its size is denoted by n.
Describing Data: Parameters vs. Stats Parameter: value summarizing a characteristic of a population; a constant; Greek letters are used to represent parameters. Statistic: value summarizing a characteristic of a sample; a variable (it changes from sample to sample); Roman letters are used to represent statistics. Sampling error: the discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter.
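The distinction between a parameter, a statistic, and sampling error can be sketched in a few lines of Python. This is only an illustration: the population of scores is made up, and the names `mu` (population mean) and `M` (sample mean) are labels chosen for this sketch, not notation taken from the slides.

```python
import random
from statistics import mean

random.seed(230)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: N = 1000 made-up exam scores (not real data).
population = [random.randint(40, 100) for _ in range(1000)]
N = len(population)
mu = mean(population)  # parameter: a constant for this population

# Draw a sample of n = 25 scores without replacement.
sample = random.sample(population, 25)
n = len(sample)
M = mean(sample)  # statistic: its value changes from sample to sample

# Sampling error: discrepancy between the statistic and the parameter.
sampling_error = M - mu

print(f"N = {N}, population mean = {mu:.2f}")
print(f"n = {n}, sample mean = {M:.2f}")
print(f"sampling error = {sampling_error:.2f}")
```

Changing the seed draws a different sample, so `M` and the sampling error change while `mu` stays fixed.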
Research Methodology Statistics & research methods are intricately tied. The stats you perform are partially determined by the design of your experiment. Some research projects are designed for use with particular stats procedures. This is NOT a research methods class. However…
Foundations and Common Terms: Variables Variable: measurable characteristic that changes with person, environment, or experiment [e.g. height, IQ, learning (X or Y)]. Constant: a characteristic or condition that does NOT change (e.g. time of day, religion). Discrete variable: one that has a limited number of values (e.g. number of children, number of cars). Continuous variable: one that has an infinite number of values (e.g. exam scores, time, age).
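Correlational Method Correlational method: two variables are measured and compared to see if there is a relationship (observational). [Table: Subject (A-F), Education (yrs), Salary. Education values shown: 12, 18, 16, 9, 14. Salary values shown: $24,000, $50,000, $30,000, $55,000, $40,000.] [Figure: scatterplot of Salary against Education (yrs); axis ticks 9, 12, 15, 18 and 20, 30, 40, 50, 60.]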
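One common way to put a number on "is there a relationship" is the Pearson correlation coefficient, which this slide does not define. The sketch below computes it by hand. The education/salary pairs are a made-up pairing for illustration only, not the pairing in the slide's table.

```python
from math import sqrt

# Hypothetical pairs (assumption): years of education and salary in $1,000s.
education = [9, 12, 14, 16, 18]
salary_k = [24, 30, 40, 50, 55]

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of X and Y over their separate variation."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # sum of products
    ssx = sum((x - mx) ** 2 for x in xs)                   # sum of squares for X
    ssy = sum((y - my) ** 2 for y in ys)                   # sum of squares for Y
    return sp / sqrt(ssx * ssy)

r = pearson_r(education, salary_k)
print(f"r = {r:.3f}")
```

A value near +1 or -1 indicates a strong linear relationship and a value near 0 a weak one. Because nothing was manipulated, a high value says nothing about cause and effect.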
More about Variables Independent Variable (IV): variable examined to determine its effect on the outcome of interest (DV); the manipulated variable (e.g. the dose of a drug). Subject or organismic variable / time variable: naturally occurring IV; not controlled (e.g. eye color, time of day). Dependent Variable (DV): outcome of interest, measured to assess the effects of the IV; not under experimental control (e.g. how a person reacts to a drug). Confound: the DV is affected by a variable related to the IV, so you don’t know what caused the effect on the DV (e.g. herbs taken in addition to the drug). Example of a confound: test whether having a lab and lecture in a stats class is better for learning than just a lecture, but don’t equate the amount of time spent in each group. The group with the lab gets 2 extra hours of statistics per week, so you don’t know whether the manipulation or the difference in time spent studying stats caused the outcome.
Experimental Method Experimental: one variable is manipulated while changes are observed in the other. Looking for cause and effect relationships. Includes a control group and an experimental group. Random assignment: assign subjects to treatments in an EQUAL and INDEPENDENT manner to avoid bias.
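Assigning subjects to treatments in an equal and independent manner can be sketched as a shuffle followed by a split. The subject IDs and the group size of 20 are hypothetical, and this is one simple way to do it rather than a prescribed procedure.

```python
import random

random.seed(11)  # fixed seed so the sketch is repeatable

# 20 hypothetical subject IDs: S1 ... S20.
subjects = [f"S{i}" for i in range(1, 21)]

# Shuffle a copy so every subject has the same chance of landing in either group.
shuffled = subjects[:]
random.shuffle(shuffled)

half = len(shuffled) // 2
control = shuffled[:half]        # control group
experimental = shuffled[half:]   # experimental (treatment) group

print("control:     ", sorted(control))
print("experimental:", sorted(experimental))
```

Because the split depends only on the shuffle, nothing about a subject (age, motivation, arrival time) can systematically steer them into one group.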
Remaining Methodological Terms Theory: statements about underlying mechanisms of behavior. Hypothesis: a prediction about the relationship between variables. Constructs: hypothetical concepts used to organize behavior that can’t be observed (e.g. intelligence, attention). Operational definition: defines a construct in terms of specific procedures or measurements that result from them (e.g. an IQ test; eye movement to a display item).
Example: Experimental Method Which therapy works best for depressed patients? IV (treatment): talk therapy vs. meditation. DV (test): mood score. Is there a difference? Construct = depression. Operational def. = mood score. Hypothesis = Meditation will be as good a treatment for depression as talk therapy.
Experiment with Confound IV (treatment): talk therapy vs. meditation. DV (test): mood score. Confound: therapist (Dr. Smith vs. Dr. Brown).
Quasi-Experimental Method Quasi-experimental: compares groups as in the experimental method, but the grouping variable is not manipulated by the researcher. Uses subject/time variables. Example: Subject variable: 6th grade boys vs. 6th grade girls. DV (test): Basic Skills.
Example Does the number of hours of sleep before a test affect your performance on that test? Use a correlation to look at each variable’s relationship to the other. Use an experiment if you divide participants into sleep groups. Sleep time = 4, 6, or 8 hours (independent variable). Performance on the exam (dependent variable).
Example 2 Does stress affect memory? Is it differentially affected for men vs. women? The 1st question requires an experimental design. The independent variable is the amount of stress (a construct). This variable needs to be operationally defined, so look at the amount of cortisol, a correlate of stress. Give 3 groups different amounts. Can you think of another way of doing this? The dependent variable is the performance on a memory test (memory is also a construct and so needs an operational def.). The 2nd question requires a quasi-experimental design: divide the groups given cortisol into men vs. women and look at their memory test scores.
Gross Overview All Statistics: Descriptive Statistics and Inferential Statistics.
Descriptive Statistics Procedures for summarizing and describing the characteristics of a data set. Examples: Frequency or count. Ratio 13:7 women to men (do not reduce). Proportion 1/4 = .25. Percentage.
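The four summaries listed above (frequency, ratio, proportion, percentage) can be sketched with a small made-up data set. The roster of 13 women and 7 men is hypothetical, chosen only so that it reproduces the 13:7 ratio from the slide.

```python
from collections import Counter

# Hypothetical roster (assumption): 13 women and 7 men.
roster = ["woman"] * 13 + ["man"] * 7

counts = Counter(roster)                      # frequency (count) per category
ratio = f"{counts['woman']}:{counts['man']}"  # ratio, left unreduced
proportion_men = counts["man"] / len(roster)  # part divided by whole
percentage_men = proportion_men * 100         # proportion times 100

print("frequencies:", dict(counts))
print("ratio of women to men:", ratio)
print(f"proportion of men: {proportion_men:.2f}")
print(f"percentage of men: {percentage_men:.0f}%")

# The slide's proportion example: 1 out of 4.
print(1 / 4)
```

All four numbers describe this one data set only. None of them claims anything about people who are not on the roster.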
Descriptive Statistics Employs exploratory data analysis, uses visual aids, abbreviations, etc. Examples: percentage of words memorized from a list over time; ratio of students in this class who support the death penalty.
Inferential Statistics Predictive; allows drawing inferences about the characteristics of a population based on a sample. Risk: claiming to know values that were never really measured. When done correctly, highly reliable and valid. Reliability = degree to which repeated measures give the same results. Validity = degree to which a test/measure actually measures the thing of interest. Examples: Men are generally taller than women. Shoe size is not a predictor of intelligence.
Descriptive vs. Inferential What would you need to know to determine whether these statements were descriptive or inferential? Do you think these statements are descriptive or inferential? People who show up for class get better grades. Students eat pizza 2.354 times per week. Blondes have more fun. The average temperature at noon in Tucson in June is 103. In groups of 3-4 generate 3 examples of each type of statistic: descriptive and inferential.
Putting it all together: The Role of Stats in Research IV: talk therapy vs. meditation. DV: mood score. Step 1: Experiment. Collect data. Step 2: Descriptive. Organize and simplify. Step 3: Inferential. Interpret results 2 ways: (1) There actually is no difference; it’s due to chance. (2) The sample reflects a true difference. Values shown: 31, 25, 31.
The Stevens System Method of classifying data introduced by S. S. Stevens in 1946. Helps in determining which statistical test applies (this will be useful later on in the course). 4 categories of data. 1) Nominal scale. 2) Ordinal scale. 3) Interval scale. 4) Ratio scale.
The Nominal Scale Could be called labeling. Numbers are assigned to define a category. Therefore, all cases in the same category receive the same designation, the same number. Categories are independent/mutually exclusive. Makes no assumptions regarding the relationship among measures. e.g.: political party affiliation, SSNs, kinds of pets.
The Ordinal Scale Orders cases along a predetermined continuum. The distance between two successive points is not assumed to be equal. Provides “greater than” and “less than” information, without indicating how much. Andy is taller than Jane. Marva is taller than George. Is the difference between Andy & Jane’s heights greater than, less than or equal to the difference between Marva & George’s? Most common use: rank ordering. e.g. grade level, military rankings.
The Interval Scale The distance between two successive points is assumed to be equal. Can’t use ratio information because there is no true zero point, so you can’t claim that one value is twice as large as another. e.g.: temperature in degrees (Fahrenheit or Celsius), calendar.
The Ratio Scale The distance between two successive points is assumed to be equal. True zero point: a value of zero means that none of whatever is being measured is present. e.g.: height, weight, temperature in Kelvin.
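The practical difference between an interval and a ratio scale can be checked with temperature, the example used on both slides. The sketch below assumes the standard conversion K = °C + 273.15; the two temperatures are arbitrary.

```python
def celsius_to_kelvin(c):
    """Convert degrees Celsius (interval scale) to Kelvin (ratio scale)."""
    return c + 273.15

c_low, c_high = 10.0, 20.0

# On the Celsius scale the arithmetic gives 2.0, but 0 °C is not "no temperature",
# so "twice as hot" is not a valid claim.
naive_ratio = c_high / c_low

# Kelvin has a true zero, so a ratio of Kelvin values is meaningful.
k_low, k_high = celsius_to_kelvin(c_low), celsius_to_kelvin(c_high)
true_ratio = k_high / k_low

print(f"Celsius ratio: {naive_ratio:.2f} (not interpretable)")
print(f"Kelvin ratio:  {true_ratio:.3f}")

# Differences, by contrast, are the same size on both scales.
print(f"difference: {c_high - c_low:.1f} C = {k_high - k_low:.1f} K")
```

The equal-interval property survives the conversion, while the "twice as large" claim does not. That is the sense in which a ratio scale carries more information than an interval scale.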
General Properties of the Stevens Scale Order of complexity: Ratio > Interval > Ordinal > Nominal. The more complex the scale, the more sophisticated the statistical test/procedure that can be performed. Each scale has all the properties of the less complex scales. By and large, most psychological studies examine interval or ratio data. See Example 1.1 in the book, pg. 22.
Qualitative vs. Quantitative data Qualitative data: a set of observations where any single observation is a word or code that represents a class or category. Quantitative data: a set of observations where any single observation is a number that represents an amount or count. Examples: Handedness. Gender. # of men vs. women attending a jazz concert (careful here: always focus on the status of any single observation rather than the entire set).
Data Type and the Stevens Scale Qualitative: Nominal, Ordinal. Quantitative: Interval, Ratio. Examples: state where you are from; SATs; size of french fry order; response time; points on an exam.
Real Limits or True Limits Reminder: Continuous variables - an infinite number of possible values fall between 2 observed values. Discrete variables - separate, indivisible categories. Real Limits - measurements of continuous variables require assigning individuals to an interval on a number line rather than a single point (e.g. rounding your data). But…the real limits are not necessarily part of the interval due to rounding. [Number line: scores 5, 6, 7, 8, 9 with real limits at 4.5, 5.5, 6.5, 7.5, 8.5, 9.5.]
Calculating Real Limits Real Limit - #s that limit where the true value lies. +/- 1/2 the unit of measurement. To get unit of measurement: 3,4,5,6 => unit = 1; 1/2 = 0.5 (limit value). 3 + 0.5 = 3.5 (upper limit of a value). 3 - 0.5 = 2.5 (lower limit of the value).
Calculating Real Limits 5,10,15,20 => unit = 5; 5/2 = 2.5 (limit value). 10 + 2.5 = 12.5 (upper limit). 10 - 2.5 = 7.5 (lower limit). Intervals: 15-20 has real limits 14.5 and 20.5. Decimals: to find the unit of measurement, anything to the left = 0; the last # on the right = 1; all others = 0. 13.63 => unit of measure = 0.01; 0.01/2 = 0.005 (limit value). 13.63 + 0.005 = 13.635 (upper limit). 13.63 - 0.005 = 13.625 (lower limit).
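The rule "plus or minus half the unit of measurement" can be sketched as a small function. The helper names `real_limits`, `interval_limits`, and `unit_from_decimals` are made up for this illustration. `Decimal` is used instead of floats so that values such as 13.635 come out exactly.

```python
from decimal import Decimal

def real_limits(value, unit):
    """Return (lower, upper) real limits: value -/+ half the unit of measurement."""
    value, unit = Decimal(str(value)), Decimal(str(unit))
    half = unit / 2
    return value - half, value + half

def interval_limits(low, high, unit):
    """Real limits of an interval: lower limit of `low`, upper limit of `high`."""
    return real_limits(low, unit)[0], real_limits(high, unit)[1]

def unit_from_decimals(value):
    """Unit implied by the decimal places as written: '13.63' -> 0.01."""
    exponent = Decimal(str(value)).as_tuple().exponent
    return Decimal(1).scaleb(exponent)

print(real_limits("3", "1"))                              # scores 3,4,5,6: unit = 1
print(real_limits("10", "5"))                             # scores 5,10,15,20: unit = 5
print(interval_limits("15", "20", "1"))                   # interval 15-20
print(real_limits("13.63", unit_from_decimals("13.63")))  # unit = 0.01
```

The unit has to be supplied when it cannot be read off the digits (as with 5, 10, 15, 20). `unit_from_decimals` only covers the case where the last written decimal place is the unit.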
Statistical Notation Summation notation - Many statistical computations require adding up a set of scores. The Greek letter sigma, or Σ, stands for summation. X or Y: symbol for a variable. Xi or Yi: represents an individual observation. N or n: number of data points in a set.
Order of Operations Mnemonic: Please Excuse My Dear Aunt Sally (1) Parentheses (2) Exponents (squaring and the like) (3) Multiplication / Division (should be done in order from left to right) (4) Addition / Subtraction (any order you like) Caveat: Summation notation should be done after step 3 but before step 4 (summation notation is just another mathematical operation).
Examples Scores: $X_1 = 4$, $X_2 = 6$, $X_3 = 1$, $X_4 = 5$, $X_5 = 2$, $X_6 = 7$; $Y_1 = 3$, $Y_2 = 4$, $Y_3 = 6$, $Y_4 = 1$, $Y_5 = 2$, $Y_6 = 1$. (a) $\sum_{i=3}^{6} X_i$ (b) $\sum_{i=1}^{3} X_i$ (c) $\sum_{i=4}^{6} X_i^2$ (d) $\left(\sum_{i=4}^{6} X_i\right)^2$ (e) $\sum_{i=1}^{4} X_i Y_i$ (f) $\sum XY - 1$
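These exercises can be checked in Python by reading each expression as a summation (Σ) over the stated range of i. The helper `sigma` is a made-up name for this sketch. For (f) the slide gives no range, so summing over all six scores is an assumption.

```python
X = [4, 6, 1, 5, 2, 7]
Y = [3, 4, 6, 1, 2, 1]

def sigma(values, start, stop):
    """Sum values[i] for i = start..stop, using the slides' 1-based indices."""
    return sum(values[start - 1:stop])

a = sigma(X, 3, 6)                       # (a) sum of X_i, i = 3 to 6
b = sigma(X, 1, 3)                       # (b) sum of X_i, i = 1 to 3
c = sigma([x ** 2 for x in X], 4, 6)     # (c) square each score, then sum
d = sigma(X, 4, 6) ** 2                  # (d) sum first (parentheses), then square
XY = [x * y for x, y in zip(X, Y)]       # multiply before summing
e = sigma(XY, 1, 4)                      # (e) sum of X_i * Y_i, i = 1 to 4

# (f) Assumption: i runs over all six scores. Summation comes before subtraction,
# so 1 is subtracted once, after the sum.
f = sigma(XY, 1, 6) - 1
# For contrast: subtracting 1 inside the sum gives a different answer.
f_inside = sigma([xy - 1 for xy in XY], 1, 6)

print(a, b, c, d, e, f, f_inside)  # 15 11 78 196 47 57 52
```

Comparing (c) with (d), and `f` with `f_inside`, shows why the order of operations matters: the same symbols give different totals depending on what is done before the summation.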
Homework - Chapter 1: 2, 3, 6, 8, 9, 12, 16, 17, 18, 19, 22. Find the real limits for: the interval between 60-69; 10.5, 11.5; -1.02, -1.03; the interval between 25.5-27.5.