*What is Test Theory? The study of measurement problems, influence of these measurement problems on psychological inventories, and how to create methods.


1 *What is Test Theory?
The study of measurement problems, the influence of these measurement problems on psychological inventories, and how to create methods to minimize these problems.

2 UNIT I INTRODUCTION TO MEASUREMENT THEORY
CHAP 1: WHAT IS TEST THEORY
CHAP 2: STATISTICAL CONCEPTS FOR TEST THEORY
CHAP 3: INTRODUCTION TO SCALING
CHAP 4: PROCESS OF TEST CONSTRUCTION
CHAP 5: TEST SCORES AS COMPOSITES

3 UNIT II RELIABILITY
CHAP 6: RELIABILITY AND THE CLASSICAL TRUE SCORE MODEL
CHAP 7: PROCEDURES FOR ESTIMATING RELIABILITY
CHAP 8: INTRODUCTION TO GENERALIZABILITY THEORY
CHAP 9: RELIABILITY COEFFICIENTS FOR CRITERION-REFERENCED TESTS

4 UNIT III VALIDITY
CHAP 10: INTRODUCTION TO VALIDITY
CHAP 11: STATISTICAL PROCEDURES FOR PREDICTION AND CLASSIFICATION
CHAP 12: BIAS IN SELECTION
CHAP 13: FACTOR ANALYSIS

5 UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT
CHAP 14: ITEM ANALYSIS
CHAP 15: INTRODUCTION TO ITEM RESPONSE THEORY
CHAP 16: DETECTING ITEM BIAS

6 UNIT V TEST SCORING AND INTERPRETATION
CHAP 17: CORRECTING FOR GUESSING AND OTHER SCORING METHODS
CHAP 18: SETTING STANDARDS
CHAP 19: NORMS AND STANDARD SCORES
CHAP 20: EQUATING SCORES FROM DIFFERENT TESTS

7 Introduction to Classical and Modern Test Theory
Chapter 1

8 Historic Origins
Pioneer countries in test theory are: Germany, England, France, and the United States

9 Germany
Wilhelm Wundt, Ernst Weber, and Gustav Fechner used standardized procedures for collecting observations from all subjects, such as reading the instructions at the top of the test page (see next slide).

10 Germany Cont.
Multiple Choice: Identify the choice that best completes the statement or answers the question.
1. The type of sensation you experience depends on which area of the brain is activated. This is known as
a. sensory localization.
b. transduction.
c. sensory adaptation.
d. cerebralization.
2. A hypnic jerk usually occurs during
a. light sleep.
b. deep sleep.
c. episodes of hypersomnia.
d. episodes of sleep apnea.
See p. 14, Exercise 4-b

11 Germany
p. 14, Exercise 4-b
4. Consider the following testing practices and indicate which nineteenth-century psychological researcher should probably be credited with their origin.
b. A teacher about to give a test reads aloud from the test manual: “Please read the instructions at the top of the page silently while I read them aloud...” (see previous slide)

12 England
Karl Pearson: Pearson Correlation
Charles Spearman: Spearman Correlation. Used Factor Analysis in his “Theory of Intelligence.”
Galton: Categorizing; half cousin to Darwin

13 France
Alfred Binet & Theodore Simon (1905) developed the first IQ test.
IQ = (MA/CA) x 100
MA = Mental Age
CA = Chronological Age
*The Difference between Ratio IQ and Deviation IQ or Normative IQ
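
The ratio formula on this slide can be sketched in a few lines of Python. The function names and the example child are hypothetical. The deviation IQ function is not on the slide: it assumes the common convention of a mean of 100 and a standard deviation of 15 and is included only to show how a norm-based score differs from a ratio.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ = (MA / CA) x 100, with both ages in the same unit."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


def deviation_iq(z: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Deviation (normative) IQ from a z score within the person's age group.

    The mean of 100 and SD of 15 are assumed conventions, not slide content.
    """
    return mean + sd * z


# Hypothetical child: mental age 10, chronological age 8.
print(ratio_iq(10, 8))    # 125.0
# Hypothetical person scoring one SD above the age-group mean.
print(deviation_iq(1.0))  # 115.0
```

The ratio IQ depends only on the two ages, while the deviation IQ depends on where the person stands relative to a norm group.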

14 United States
James McKeen Cattell → “Mental Testing”
Thorndike: An Introduction to the Theory of Mental and Social Measurements
Trial and Error → A Theory of Learning

15 Key Terms
Test
Optimal Performance
Typical Performance
Observable Performance
Constructs
Measurement

16 Key Terms
Test: A procedure for obtaining a sample of an individual’s performance.
Optimal Performance: Performance on aptitude tests (GRE, SAT, ACT) or achievement tests (WRAT, WIAT).

17 Key Terms
Typical Performance: Performance on questionnaires and inventories used to report one’s feelings, attitudes, interests, or reactions to a situation.
Observable Performance: Performance of an observable behavior (e.g., watching children interact with each other; naturalistic observation).

18 Key Terms
Measurement: Quantifying an observable behavior, that is, assigning a quantitative value to a behavior.
See Exercises 1 & 2 on p. 14

21 Confounding Variables
Confounding variables are variables that the researcher failed to control or eliminate, damaging the internal validity of an experiment. Also known as a third variable, a confounding variable can distort the apparent relation between the independent variable and the dependent variable. Ex. Next

22 Ex. A research group might design a study to determine if heavy drinkers die at a younger age. Heavy drinkers may also be more likely to smoke or eat junk food, either of which could reduce longevity. A third variable may therefore have influenced the results.

23 Heavy drinkers die at a younger age

24 Intervening Variables
A variable that explains a relation or provides a causal link between other variables. Also called “Mediating Variable” or “intermediary variable.” Ex. Next slide

25 Intervening Variables
Ex: The statistical association between income and longevity needs to be explained because just having money does not make one live longer. Other variables intervene between money and long life. People with high incomes tend to have better medical care than those with low incomes. Medical care is an intervening variable. It mediates the relation between income and longevity.

26 Key Terms
Constructs: Constructs are hypothetical concepts or psychological attributes/traits, such as personality, anxiety, and depression. They are difficult to measure. Constructs are not physical attributes such as height and weight.

27 *Why do we have Measurement Problems in Psychology?
1. There is no single universal way of defining a psychological construct.
2. Psychological measurements are based on samples of behavior.
3. Sampling of behavior results in errors in measurement.
4. The units (scales) of measurement are not well defined.
5. The measurements must have a demonstrated relationship to other variables to have meaning.

28 Role of Test Theory in Research & Evaluation
Selecting a Problem
Operational Definitions of Variables
Instruments
Accuracy of the Instruments
Data Collection
Use of Statistics
Optometrists and Ophthalmologists

29 Chapter 2
Statistical Concepts for Test Theory

30 Population Sample

31 Population and Sample
Population: The set of all individuals of interest for a particular study. Measurements related to a population are PARAMETERS.
Sample: A set of individuals selected from a population. Measurements related to a sample are STATISTICS.

32 Statistics
The people chosen for a study are its subjects or participants, collectively called a sample.
The sample must be representative of the population.

33 Statistics
Descriptive: Describes the distribution of scores using values such as the mean, median, and mode.
Inferential: Infers or draws a conclusion about a population from a sample.

34 Key Terms
Constant: e.g., temperature held fixed in a study of learning and hunger
Variable: IV → manipulate; DV → measure
Discrete Numbers: 1, 2, 3, 14
Continuous Numbers: 1.3, 3.6

35 CONTINUOUS VERSUS DISCRETE VARIABLES
Discrete variables (categorical): Values are defined by category boundaries. E.g., gender
Continuous variables: Values can range along a continuum. E.g., height

36 Statistics
Scales of Measurement
Frequency Distributions and Graphs
Measures of Central Tendency
Standard Deviations and Variances
Z Score
Correlations: 1. Pearson 2. Spearman

37 Scales of Measurement (NOIR)
Nominal Scale
Qualities: Assignment of labels
Examples: Gender (male or female); Preference (like or dislike); Voting record (for or against)
What You Can Say: Each observation belongs in its own category.
What You Can’t Say: An observation represents “more” or “less” than another observation.

38 ORDINAL SCALE
Qualities: Assignment of values along some underlying dimension (order)
Examples: Rank in college; Order of finishing a race
What You Can Say: One observation is ranked above or below another.
What You Can’t Say: The amount that one variable is more or less than another.

39 INTERVAL SCALE
Qualities: Equal distances between points; arbitrary zero
Examples: Number of words spelled correctly; Intelligence test scores; Temperature
What You Can Say: One score differs from another on some measure that has equally appearing intervals.
What You Can’t Say: The amount of difference is an exact representation of differences of the variable being studied.

41 RATIO SCALE
Qualities: Meaningful and non-arbitrary zero (absolute zero)
Examples: Age; Weight; Time?
What You Can Say: One value is twice as much as another, or no quantity of that variable can exist.
What You Can’t Say: Not much!

42 LEVELS OF MEASUREMENT
Level of Measurement | For Example | Quality of Level
Ratio | Rachael is 5’ 10” and Gregory is 5’ 5” | Absolute zero
Interval | Rachael is 5” taller than Gregory | An inch is an inch is an inch
Ordinal | Rachael is taller than Gregory | Greater than
Nominal | Rachael is tall and Gregory is short | Different from
Variables are measured at one of these four levels.
Qualities of one level are characteristic of the next level up.
The more precise (higher) the level of measurement, the more accurate the measurement process.

43 WHAT IS ALL THE FUSS?
Measurement should be as precise as possible.
In psychology, most variables are probably measured at the nominal or ordinal level.
But how a variable is measured can determine the level of precision.

44 Frequency Distributions and Graphs

45 histogram

46 *Histogram for Test Scores

48 Polygon

49 Frequency Distributions and Graphs

53 PERCENTILES When the results of a test for a specific person are presented in terms of Percentiles, we have direct information about that person’s performance relative to a group.
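
As a rough illustration of how a percentile places one person relative to a group, the sketch below computes a percentile rank. It assumes one common definition (the percentage of scores in the group at or below the person's score); the function name and the group scores are hypothetical, not from the slides.

```python
def percentile_rank(score, group_scores):
    """Percentage of scores in the group that are at or below `score`."""
    at_or_below = sum(1 for s in group_scores if s <= score)
    return 100 * at_or_below / len(group_scores)


# Hypothetical reference group of ten test scores.
group = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(80, group))  # 60.0
```

A score of 80 here is at the 60th percentile: 6 of the 10 scores in the group are 80 or lower.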

54 Quartiles and Z-Score

59 Platykurtic, Mesokurtic, Leptokurtic

60 Frequency Distributions
2, 4, 3, 2, 5, 3, 6, 1, 1, 3, 5, 2, , 2
Σf = N = 14
p = f/N (p = proportion)
% = p x 100

61 Frequency Distributions
Columns: X, f, fX, p = f/N, % = p x 100, Cum %

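62 Frequency Distribution Table
Columns: X, f, fX, p = f/N, % = p x 100, Cumulative %
X = 6: f = 1; p = 1/14 = .07; % = 7%
X = 5: f = 2; fX = 10; p = 2/14 = .14; % = 14%; Cumulative % = 21%
X = 4: fX = 8; Cumulative % = 35%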

63 How do you Calculate Cumulative Percent?
Add each new individual percent to the running tally of the percentages that came before it. For example, if your dataset consisted of the four numbers 100, 200, 150, and 50, then their individual values, expressed as a percent of the total (in this case 500), are 20%, 40%, 30%, and 10%. Each percent takes two steps: 1. proportion (100/500 = 0.2), 2. percentage (0.2 x 100 = 20%). The cumulative percent would be:
100: 20%
200: (i.e., 20% from the step before + 40%) = 60%
150: (i.e., 60% from the step before + 30%) = 90%
50: (i.e., 90% from the step before + 10%) = 100%
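
The running tally described on this slide can be sketched with a short Python example using the slide's own four numbers. The variable names are illustrative only.

```python
from itertools import accumulate

values = [100, 200, 150, 50]  # the four numbers from the slide
total = sum(values)           # 500

# Step 1 and 2: proportion of the total, then percentage.
percents = [100 * v / total for v in values]

# Running tally: each entry adds the current percent to all earlier ones.
cumulative = list(accumulate(percents))

for v, pct, cum in zip(values, percents, cumulative):
    print(v, pct, cum)
# 100 20.0 20.0
# 200 40.0 60.0
# 150 30.0 90.0
# 50 10.0 100.0
```

The last cumulative percent is always 100, which is a quick check that no value was skipped.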

64 Frequency Distributions
X = 2, f = 4, N = 14: p = f/N = 4/14 = .29; % = p x 100 = 29%
X = 3, f = 3, N = 14: p = 3/14 = .21; % = 21%
μ = ΣfX/Σf
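
A frequency distribution table with the columns used on these slides (X, f, fX, p, %) can be built as in the sketch below. The scores are a hypothetical data set chosen for easy arithmetic, not the 14 scores from the slides, and the dictionary keys are illustrative names.

```python
from collections import Counter

# Hypothetical scores (N = 10), not the data set from the slides.
scores = [5, 4, 4, 3, 3, 3, 2, 2, 1, 1]

counts = Counter(scores)
n = len(scores)

table = []
for x in sorted(counts, reverse=True):  # highest score first, as on the slides
    f = counts[x]
    table.append({"X": x, "f": f, "fX": f * x, "p": f / n, "pct": 100 * f / n})

# Mean from the table: sum of fX divided by sum of f.
mean = sum(row["fX"] for row in table) / sum(row["f"] for row in table)

for row in table:
    print(row)
print("mean =", mean)
```

Computing the mean from the fX column gives the same value as averaging the raw scores, because each fX entry is just the score added f times.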

65 Measures of Central Tendency
Mean (interval or ratio scale): The sum of the values divided by the number of values, often called the "average."
μ = ΣX/N
Add all of the values together. Divide by the total number of values to obtain the mean.
Example: X = 7, 12, 24, 20, 19. Mean = ????

66 Statistics
The mean is: μ = ΣX/N = (7 + 12 + 24 + 20 + 19)/5 = 82/5 = 16.4

67 Measures of Central Tendency
Median (middle), ordinal scale: Divides the values into two equal halves, with half of the values being lower than the median and half higher than the median. Sort the values into ascending order. If you have an odd number of values, the median is the middle value. If you have an even number of values, the median is the arithmetic mean (see above) of the two middle values.
Ex: The median of the same five numbers (7, 12, 24, 20, 19) is ???.

68 Mode
The median is 19.
Mode (nominal scale): The most frequently occurring value (or values). Calculate the frequencies for all of the values in the data. The mode is the value (or values) with the highest frequency.
Example: For individuals having the following ages: 18, 18, 19, 20, 20, 20, 21, and 23, the mode is ????

69 CHARACTERISTICS OF MODE
Nominal Scale
Discrete Variable
Describing Shape

70 The Range
The mode is 20.
The Range: For discrete numbers (e.g., 2, 4, 7, 8, ...), the range is the highest number minus the lowest number, plus 1. For continuous numbers (e.g., 2, 4.6, 7.3, 8.4, ...), the range is the difference between the upper real limit of the highest number and the lower real limit of the lowest number.
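
The worked answers on the preceding slides (mean 16.4, median 19, mode 20) can be checked with Python's standard `statistics` module. The range helper is a hypothetical name and follows the slide's rule for discrete whole numbers (highest minus lowest, plus 1).

```python
import statistics

scores = [7, 12, 24, 20, 19]               # example from the mean and median slides
ages = [18, 18, 19, 20, 20, 20, 21, 23]    # example from the mode slide


def discrete_range(values):
    """Range for discrete whole numbers: highest - lowest + 1."""
    return max(values) - min(values) + 1


print(statistics.mean(scores))     # 16.4
print(statistics.median(scores))   # 19
print(statistics.multimode(ages))  # [20]
print(discrete_range(scores))      # 18
```

`multimode` returns a list because a distribution can have more than one mode, which matches the slide's wording "value (or values)".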

71 Variability

72 Variability
Range, Interquartile Range, Semi-Interquartile Range, Standard Deviation, and Variance are the measures of variability.
Variability is a measure of the dispersion or spread of scores around the mean, and has 2 purposes:
1. Describes the distribution
Next slide

73 Variability
2. Shows how well an individual score (or group of scores) represents the entire distribution, e.g., in the Z score.
Ex. In inferential statistics we collect information from a small sample, then generalize the results obtained from the sample to the entire population.
Next slide

74 Variability: SS, Standard Deviations, and Variances
SS = Sum of Squared Deviations from the Mean
Definition: SS = Σ(X - μ)²
Computation: SS = ΣX² - (ΣX)²/N
Population: variance σ² = SS/N; standard deviation σ = √(SS/N)
Sample: variance s² = SS/(n - 1) = SS/df; standard deviation s = √(SS/df)
Variance (σ²) is the Mean of Squared Deviations = MS
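
The two SS formulas and the population and sample versions of variance and standard deviation can be compared in a short sketch. The function names and the five scores are hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
import math


def ss_definitional(scores):
    """SS = sum of (X - mean)^2."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def ss_computational(scores):
    """SS = sum of X^2 minus (sum of X)^2 / N."""
    return sum(x * x for x in scores) - sum(scores) ** 2 / len(scores)


scores = [1, 9, 5, 8, 7]  # hypothetical scores, mean = 6
n = len(scores)

ss = ss_definitional(scores)
print(ss, ss_computational(scores))  # 40.0 40.0

pop_variance = ss / n                # treat the scores as a whole population
pop_sd = math.sqrt(pop_variance)
sample_variance = ss / (n - 1)       # treat the scores as a sample, df = n - 1
sample_sd = math.sqrt(sample_variance)
print(pop_variance, sample_variance)  # 8.0 10.0
```

Both SS formulas give the same value; dividing by n - 1 instead of N makes the sample variance slightly larger than the population variance for the same scores.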

75 Suppose you earned a score of X = 54 on an exam.
Which set of parameters would give you the highest grade?
a. μ = 50 and σ = 2 (σ² = 4)
b. μ = 50 and σ = 4 (σ² = 16)
c. μ = 54 and σ = 2 (σ² = 4)
d. μ = 54 and σ = 4 (σ² = 16)

76 Suppose you earned a score of X = 46 on an exam.
Which set of parameters would give you the highest grade?
a. μ = 50 and σ = 2 (σ² = 4)
b. μ = 50 and σ = 4 (σ² = 16)
c. μ = 54 and σ = 2 (σ² = 4)
d. μ = 54 and σ = 4 (σ² = 16)

77 Covariance

78 Covariance
Correlation is based on a statistic called covariance (COVxy or Sxy).
Covariance: COVxy = SP/(N - 1)
Correlation: r = SP/√(SSx · SSy)
Covariance is a number that reflects the degree to which 2 variables vary together.
Original Data: X, Y
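
The covariance and correlation formulas on this slide can be sketched as follows. SP is taken here to be the sum of products of deviations, Σ(X - Mx)(Y - My), which is the usual meaning in this notation; the function names and the five X, Y pairs are hypothetical.

```python
import math


def sum_of_products(xs, ys):
    """SP = sum of (X - Mx)(Y - My); needs two sets of scores."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def sum_of_squares(xs):
    """SS = sum of (X - Mx)^2; needs only one set of scores."""
    mx = sum(xs) / len(xs)
    return sum((x - mx) ** 2 for x in xs)


def covariance(xs, ys):
    return sum_of_products(xs, ys) / (len(xs) - 1)


def pearson_r(xs, ys):
    return sum_of_products(xs, ys) / math.sqrt(sum_of_squares(xs) * sum_of_squares(ys))


# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(covariance(x, y))           # 1.5
print(round(pearson_r(x, y), 3))  # 0.775
```

Covariance keeps the units of X and Y, while r rescales the same SP by the two SS values so that it always falls between -1 and +1.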

79 Spearman Correlation
Rank-order the data, then proceed with the correlation formula on the ranks.
X Y

80 Ranking/Monotonic Transformation
Columns: Score, Rank position, Final Rank
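
The ranking step can be sketched as below. The sketch assumes that "Final Rank" means tied scores share the average of their rank positions, which is the usual convention, and it computes the Spearman correlation by applying the Pearson formula to the ranks. All function names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank from lowest (1) to highest; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ssx * ssy)


def spearman_r(xs, ys):
    """Spearman correlation: Pearson formula applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


print(average_ranks([3, 5, 5, 8, 10]))  # [1.0, 2.5, 2.5, 4.0, 5.0]

# Hypothetical data with a perfectly monotonic but non-linear relation.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_r(x, y))  # 1.0
```

Because ranking is a monotonic transformation, the Spearman value is 1.0 for any consistently increasing relation, even when the raw scores do not fall on a straight line.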

82 Z Scores
Single score: Z = (X - μ)/σ
Sample mean (for research): Z = (M - μ)/σm, where σm = σ/√n
We use the Z score when σ is known.

83 Z-Scores
X = σZ + µ
µ = X - σZ
σ = (X - µ)/Z
If X = 60, µ = 50, σ = , Z = ?
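
The Z-score formulas for a single score and for a sample mean, together with the rearrangement X = σZ + µ, can be sketched as follows. The function names are hypothetical. The value σ = 5 used with X = 60 and µ = 50 is an assumption for illustration only, since the slide's σ value is missing, and the sample-mean numbers are likewise hypothetical.

```python
import math


def z_score(x, mu, sigma):
    """Z for a single score: (X - mu) / sigma."""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """Convert a Z score back to a raw score: X = sigma * Z + mu."""
    return sigma * z + mu


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def z_for_sample_mean(m, mu, sigma, n):
    """Z for a sample mean: (M - mu) / standard error."""
    return (m - mu) / standard_error(sigma, n)


# X = 60 and mu = 50 as on the slide; sigma = 5 is an assumed value.
print(z_score(60, 50, 5))      # 2.0
print(raw_score(2.0, 50, 5))   # 60.0

# Hypothetical research example: M = 106, mu = 100, sigma = 15, n = 25.
print(standard_error(15, 25))              # 3.0
print(z_for_sample_mean(106, 100, 15, 25))  # 2.0
```

The sample-mean version divides by the standard error rather than σ, so the same distance from µ yields a larger Z as n grows.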

84 Computations/Calculations
Collect data and compute the test statistic: Z score for a sample with M = 115, n = 25

85 Z Score for Research Standard Error (σm )

88 Stanines
Stanines are used to compare an individual student’s achievement with the results obtained by a national reference sample chosen to represent a certain year level (e.g., 2nd level, 3rd level). A stanine is a nine-point scale used for normalized test scores, with 1-3 below average, 4-6 average, and 7-9 above average. It is a standard-score scale with a mean of 5 and an SD of 2.
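
Given the mean of 5 and SD of 2 stated on this slide, a stanine can be approximated from a z score as 2z + 5, rounded to the nearest whole number and limited to the range 1 to 9. The sketch below assumes that linear approximation (operational stanines are often assigned from percentile bands instead), and both function names are hypothetical.

```python
import math


def stanine_from_z(z: float) -> int:
    """Approximate stanine: 2z + 5, rounded half up, clamped to 1..9."""
    raw = math.floor(2 * z + 5 + 0.5)
    return max(1, min(9, raw))


def stanine_band(stanine: int) -> str:
    """Descriptive band from the slide: 1-3, 4-6, 7-9."""
    if stanine <= 3:
        return "below average"
    if stanine <= 6:
        return "average"
    return "above average"


for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    s = stanine_from_z(z)
    print(z, s, stanine_band(s))
# -3.0 1 below average
# -1.0 3 below average
# 0.0 5 average
# 1.0 7 above average
# 3.0 9 above average
```

Clamping matters at the extremes: very high or very low z scores all collapse into stanine 9 or stanine 1.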

89 The Correlational Method
Correlational data can be graphed and a “line of best fit” can be drawn.
1. Pearson Correlations
2. Spearman

90 The Correlational Method
Correlation is the degree to which events or characteristics vary with each other.
Measures the strength of a relationship
Does not imply cause and effect

91 The Correlational Method
Correlation has 3 characteristics:
1. The Form of the Relationship
2. The Direction of the Relationship
3. The Strength or Consistency of the Relationship

92 1. The Form of the Relationship
The most common use of correlation is to measure a straight-line (linear) relationship. However, other forms of relationship do exist, and there are special correlations used to measure them.

93 2. The Direction of the Relationship
Correlational data can be graphed and a “line of best fit” can be drawn

94 Positive correlation = variables change in the same direction

95 Positive Correlation

96 Negative correlation = variables change in the opposite direction

97 Negative Correlation

98 No Correlation
Unrelated = No consistent relationship

99 No Correlation

100 The Correlational Method
The magnitude (strength) of a correlation is also important.
High magnitude = variables which vary closely together; fall close to the line of best fit
Low magnitude = variables which do not vary as closely together; loosely scattered around the line of best fit

101 3. The Strength or Consistency of the Relationship
Direction and magnitude of a correlation are often calculated statistically.
Called the “Correlation Coefficient,” symbolized by the letter “r”
Sign (+ or -) indicates direction
Number (from 0.00 to 1.00) indicates magnitude
0.00 = no consistent relationship
+1.00 = perfect positive correlation
-1.00 = perfect negative correlation
Most correlations found in psychological research fall far short of “perfect.”

102 The Correlational Method
Correlations can be trusted based on statistical probability.
“Statistical significance” means that the finding is unlikely to have occurred by chance.
By convention, if there is less than a 5% probability that findings are due to chance (p < 0.05), results are considered “significant” and thought to reflect the larger population.
Generally, confidence increases with the size of the sample (n) and the magnitude of the correlation (r).

103 The Correlational Method
Advantages of correlational studies:
Have high external validity
Can generalize findings
Can repeat (replicate) studies on other samples
Difficulties with correlational studies:
Lack internal validity
Results describe but do not explain a relationship

104 External & Internal Validity
*External Validity: External validity addresses the ability to generalize your study to other people and other situations.
*Internal Validity: Internal validity addresses the "true" causes of the outcomes that you observed in your study. Strong internal validity means that you have not only reliable measures of your independent and dependent variables but also a strong justification that causally links your independent variables (IV) to your dependent variables (DV).

105 The Correlational Method: Pearson
r = SP/√(SSx · SSy)
Original Data: X, Y
SP requires 2 sets of data; SS requires only one set of data.

106 The Correlational Method: Spearman
r = SP/√(SSx · SSy)
Original Data → Ranks: X, Y → X, Y
SP requires 2 sets of data; SS requires only one set of data.

108 Regression and Prediction
Regression Line: Y = bX + a
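
A regression line of the form Y = bX + a can be fitted as in the sketch below. The slide gives only the line equation; the slope and intercept formulas used here (b = SP/SSx and a = My - b·Mx) are the standard least-squares solution and are stated as an assumption about what the deck intends. The function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope b and intercept a for Y = bX + a."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # sum of products
    ssx = sum((x - mx) ** 2 for x in xs)                   # SS for X
    b = sp / ssx
    a = my - b * mx
    return b, a


def predict(x, b, a):
    """Predicted Y for a given X on the regression line."""
    return b * x + a


# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, a = fit_line(x, y)
print(round(b, 2), round(a, 2))     # 0.6 2.2
print(round(predict(6, b, a), 2))   # 5.8
```

The fitted line always passes through the point (Mx, My), which is why the intercept is computed from the two means.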

110 Three Levels of Analysis for Prediction/Validity
INPUTS, PROCESSES, OUTCOMES
Ex. Stress (INPUT) is an unpleasant psychological process (PROCESS) that occurs in response to environmental pressures (job) and can lead to withdrawal or quitting the job (OUTCOME).

111 prognosis

113 Please read Chapters 3 and 4 for next week.

