1 EPSY 546: LECTURE 1 SUMMARY (George Karabatsos)
2 REVIEW
3 REVIEW: Test (& types of tests)
4 REVIEW: Test (& types of tests); Item response scoring paradigms
5 REVIEW: Test (& types of tests); Item response scoring paradigms; Data paradigm of test theory (typical)
6 DATA PARADIGM
7 REVIEW: Latent Trait. Latent trait θ ∈ ℝ (unidimensional).
8 REVIEW: Latent Trait. Latent trait θ ∈ ℝ (unidimensional); real examples of latent traits.
9 REVIEW: IRF. Item Response Function (IRF).
10 REVIEW: IRF. Item Response Function (IRF) – represents different theories about latent traits.
11 REVIEW: IRF. Dichotomous response: P_j(θ) = Pr[X_j = 1 | θ] = Pr[correct response to item j | θ].
12 REVIEW: IRF. Polychotomous response: P_jk(θ) = Pr[X_j > k | θ] = Pr[exceed category k of item j | θ].
13 REVIEW: IRF. Dichotomous or polychotomous response: E_j(θ) = E[rating for item j | θ], with 0 < E_j(θ) < K.
14 IRF: Dichotomous items
15 IRF: Polychotomous items
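The definitions on slides 11–13 can be made concrete with a short sketch. This is not course code: it assumes a logistic, graded-response-style form for the cumulative category probabilities Pr[X_j ≥ k | θ] (k = 1, …, K) with made-up item parameters, and it uses the fact that the expected rating is the sum of those cumulative probabilities over k.

```python
import numpy as np

def cumulative_probs(theta, a, b_steps):
    """Pr[X_j >= k | theta] for k = 1..K, using a logistic
    (graded-response-style) form purely for illustration."""
    b_steps = np.asarray(b_steps, dtype=float)  # ordered step difficulties
    return 1.0 / (1.0 + np.exp(-a * (theta - b_steps)))

def expected_rating(theta, a, b_steps):
    """E_j(theta) = sum over k of Pr[X_j >= k | theta]; lies between 0 and K."""
    return cumulative_probs(theta, a, b_steps).sum()

if __name__ == "__main__":
    # Hypothetical 4-category item (ratings 0..3), discrimination a = 1.2
    for theta in (-2.0, 0.0, 2.0):
        print(theta, round(expected_rating(theta, a=1.2, b_steps=[-1.0, 0.0, 1.5]), 3))
```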
16 REVIEW: SCALES. The unweighted total score X_+n stochastically orders the latent trait θ (Huynh, 1994; Grayson, 1988).
17 REVIEW: SCALES. 4 Scales of Measurement – Conjoint Measurement.
18 REVIEW: Conjoint Measurement – Row Independence Axiom.
19 REVIEW: Conjoint Measurement – Row Independence Axiom. Property: ordinal scaling and unidimensionality of θ (test score).
20 INDEPENDENCE AXIOM (row)
21 REVIEW: Conjoint Measurement – Row Independence Axiom. Property: ordinal scaling and unidimensionality of θ (test score). IRF: non-decreasing over θ.
22 REVIEW: Conjoint Measurement – Row Independence Axiom. Property: ordinal scaling and unidimensionality of θ (test score). IRF: non-decreasing over θ. Models: MH, 2PL, 3PL, 4PL, True Score, Factor Analysis.
23 2PL: P_j(θ) = exp[a_j(θ - b_j)] / (1 + exp[a_j(θ - b_j)])
24 3PL: P_j(θ) = c_j + (1 - c_j) exp[a_j(θ - b_j)] / (1 + exp[a_j(θ - b_j)])
25 4PL: P_j(θ) = c_j + (d_j - c_j) exp[a_j(θ - b_j)] / (1 + exp[a_j(θ - b_j)])
26 Monotone Homogeneity (MH): nonparametric; assumes unidimensionality, local independence, and each IRF P_j(θ) non-decreasing in θ.
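A minimal sketch (not from the lecture) of the parametric IRFs on slides 23–25: a single 4PL function covers the 3PL, 2PL, and Rasch/1PL as special cases. The parameter values below are made up for illustration.

```python
import numpy as np

def irf_4pl(theta, a=1.0, b=0.0, c=0.0, d=1.0):
    """4PL item response function: c is the lower asymptote (guessing),
    d the upper asymptote. d = 1 gives the 3PL; c = 0 and d = 1 give the
    2PL; additionally fixing a = 1 gives the Rasch/1PL."""
    logistic = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return c + (d - c) * logistic

if __name__ == "__main__":
    theta = np.array([-2.0, 0.0, 2.0])
    print("2PL:", irf_4pl(theta, a=1.5, b=0.5))
    print("3PL:", irf_4pl(theta, a=1.5, b=0.5, c=0.2))
    print("4PL:", irf_4pl(theta, a=1.5, b=0.5, c=0.2, d=0.95))
```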
27 REVIEW: Conjoint Measurement – Column Independence Axiom (adding).
28 REVIEW: Conjoint Measurement – Column Independence Axiom (adding). Property: ordinal scaling and unidimensionality of both θ (test score) and item difficulty (item score).
29 INDEPENDENCE AXIOM (column)
30 REVIEW: Conjoint Measurement – Column Independence Axiom (adding). Property: ordinal scaling and unidimensionality of both θ (test score) and item difficulty (item score). IRF: non-decreasing and non-intersecting over θ.
31 REVIEW: Conjoint Measurement – Column Independence Axiom (adding). Property: ordinal scaling and unidimensionality of both θ (test score) and item difficulty (item score). IRF: non-decreasing and non-intersecting over θ. Models: DM, ISOP.
32 DM/ISOP (Scheiblechner, 1995)
33 REVIEW: Conjoint Measurement – Thomsen Condition (adding).
34 REVIEW: Conjoint Measurement – Thomsen Condition (adding). Property: interval scaling and unidimensionality of both θ (test score) and item difficulty (item score).
35 Thomsen condition (e.g., double cancellation)
36 REVIEW: Conjoint Measurement – Thomsen Condition (adding). Property: interval scaling and unidimensionality of both θ (test score) and item difficulty (item score). IRF: non-decreasing and parallel (non-intersecting) over θ.
37 REVIEW: Conjoint Measurement – Thomsen Condition (adding). Property: interval scaling and unidimensionality of both θ (test score) and item difficulty (item score). IRF: non-decreasing and parallel (non-intersecting) over θ. Models: Rasch Model, ADISOP.
38 RASCH-1PL: P_j(θ) = exp(θ - b_j) / (1 + exp(θ - b_j))
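A quick numerical check (not from the lecture) of the "parallel, non-intersecting" property behind the Rasch model on slides 36–38: in the log-odds metric, any two Rasch IRFs differ by the constant b_2 - b_1 at every θ. The item difficulties below are made up.

```python
import numpy as np

def rasch_logit(theta, b):
    """Log-odds of a correct response under the Rasch model: theta - b."""
    return theta - b

if __name__ == "__main__":
    theta = np.linspace(-3, 3, 7)
    # Two hypothetical items: their log-odds curves differ by the same
    # constant (1.0 - (-0.5) = 1.5) at every theta, so the IRFs never cross.
    print(rasch_logit(theta, b=-0.5) - rasch_logit(theta, b=1.0))
```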
39 REVIEW: 5 Challenges of Latent Trait Measurement.
40 REVIEW: 5 Challenges of Latent Trait Measurement. Test theory attempts to address these challenges.
41 REVIEW: Test Construction (10 Steps).
42 REVIEW: Test Construction (10 Steps); Basic Statistics of Test Theory.
43 REVIEW: Total test score (X_+) variance = Sum[item variances] + Sum[item covariances].
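A small numerical check (not course code) of the variance decomposition on slide 43, using a made-up 0/1 response matrix and numpy.

```python
import numpy as np

# Rows = persons, columns = items (made-up dichotomous responses).
X = np.array([[1, 0, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 1, 1],
              [0, 1, 0, 1],
              [1, 1, 0, 0]])

cov = np.cov(X, rowvar=False)               # item covariance matrix
total_var = np.var(X.sum(axis=1), ddof=1)   # variance of total scores X_+

item_vars = np.diag(cov).sum()              # Sum[item variances]
item_covs = cov.sum() - np.diag(cov).sum()  # Sum[item covariances], j != k

print(total_var, item_vars + item_covs)     # the two numbers agree
```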
44 EPSY 546: LECTURE 2. TRUE SCORE TEST THEORY AND RELIABILITY (George Karabatsos)
45 TRUE SCORE MODEL. Theory: the test score is a random variable, X_+n = T_n + e_n, where X_+n is the observed test score of person n, T_n the true test score (unknown), and e_n random error (unknown).
46 TRUE SCORE MODEL. The observed test score X_+n of person n is a random variable (according to some distribution) with mean T_n = E(X_+n) and variance σ²(X_+n) = σ²(e_n).
47 TRUE SCORE MODEL. The observed test score X_+n of person n is a random variable (according to some distribution) with mean T_n = E(X_+n) and variance σ²(X_+n) = σ²(e_n). Random error e_n = X_+n - T_n is distributed with mean E(e_n) = E(X_+n - T_n) = 0 and variance σ²(e_n) = σ²(X_+n).
48 TRUE SCORE MODEL. True score: T_n = E(X_+n) = Σ_s s·p_ns, where T_n is the true score of person n, E(X_+n) the expected score of person n, s ∈ {0, 1, …, S} a possible test score, and p_ns = Pr[person n has test score s].
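As a tiny worked example (not from the slides), with made-up score probabilities for one person on a test scored 0 to 3:

```python
# T_n = sum over s of s * p_ns, with the probabilities summing to 1.
scores = [0, 1, 2, 3]
p_ns = [0.1, 0.2, 0.4, 0.3]   # Pr[person n obtains score s]
T_n = sum(s * p for s, p in zip(scores, p_ns))
print(T_n)                     # 1.9 = expected (true) score for person n
```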
49 TRUE SCORE MODEL. 3 assumptions: 1) Over the population of examinees, error has a mean of 0: E[e] = 0. 2) Over the population of examinees, true scores and error scores have 0 correlation: ρ(T, e) = 0.
50 TRUE SCORE MODEL. 3 assumptions (continued): 3) For a set of persons, the correlation of the error scores between two testings is zero: ρ(e_1, e_2) = 0. – "Two testings": when a set of persons take two separate tests, or complete two testing occasions with the same form. – The two sets of person scores are assumed to be randomly chosen from two independent distributions of possible observed scores.
51–53 TRUE SCORE ESTIMATION
54 TRUE SCORE ESTIMATION. ρ² = σ²(T) / σ²(X_+) is the test reliability: the proportion of variance of observed scores that is explained by the variance of the true scores.
55 TEST RELIABILITY. σ(e) is the error of measurement.
56 TEST RELIABILITY. σ(e) = σ(X_+)·√(1 - ρ²) is the standard error of measurement (random error).
57 TEST RELIABILITY. σ(e) = σ(X_+)·√(1 - ρ²) is the standard error of measurement (random error). Estimated ((1 - α)·100)% confidence interval around the test score: X_+ ± z_(α/2)·σ(e).
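A short worked example (not course code) of the standard error of measurement and the confidence interval on slides 56–57, with made-up reliability and score values.

```python
import numpy as np

sd_x = 10.0          # standard deviation of observed total scores
reliability = 0.84   # rho^2, e.g., an alpha estimate
sem = sd_x * np.sqrt(1.0 - reliability)     # sigma(e) = 4.0 here

observed_score = 42.0
z = 1.96             # z_(alpha/2) for a 95% interval
low, high = observed_score - z * sem, observed_score + z * sem
print(f"SEM = {sem:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```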
58 TEST RELIABILITY. It is desirable for a test to be reliable.
59 TEST RELIABILITY. Reliability – the degree to which the respondents' test scores are consistent over repeated administrations of the same test.
60 TEST RELIABILITY. Reliability – the degree to which the respondents' test scores are consistent over repeated administrations of the same test. Indicates the precision of a set of test scores in the sample.
61 TEST RELIABILITY. Reliability – the degree to which the respondents' test scores are consistent over repeated administrations of the same test. Indicates the precision of a set of test scores in the sample. Random and systematic error can affect the reliability of a test.
62 TEST RELIABILITY. Reliability – the degree to which the respondents' test scores are consistent over repeated administrations of the same test. Test developers have a responsibility to demonstrate the reliability of scores obtained from their tests.
63 ESTIMATING RELIABILITY. Coefficient α = [J / (J - 1)] · (1 - Σ_j σ̂²_j / σ̂²(X_+)), where σ̂²_j is the estimated variance of item j, σ̂²(X_+) is the estimated total test score variance, and J is the number of items.
64 ESTIMATING RELIABILITY. Equivalently, α = [J / (J - 1)] · Σ_(i≠j) σ̂_ij / σ̂²(X_+), where σ̂_ij is the estimated covariance between items i and j and σ̂²(X_+) is the estimated total test score variance.
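A minimal sketch (not course code) of coefficient α as written above, computed from a persons-by-items score matrix; the response data are made up.

```python
import numpy as np

def coefficient_alpha(X):
    """Coefficient alpha: (J/(J-1)) * (1 - sum of item variances / total score variance)."""
    X = np.asarray(X, dtype=float)
    J = X.shape[1]
    item_vars = X.var(axis=0, ddof=1).sum()
    total_var = X.sum(axis=1).var(ddof=1)
    return (J / (J - 1)) * (1.0 - item_vars / total_var)

if __name__ == "__main__":
    X = [[1, 1, 1, 0],   # 6 persons x 4 items, made-up 0/1 responses
         [1, 0, 1, 0],
         [1, 1, 1, 1],
         [0, 0, 1, 0],
         [0, 0, 0, 0],
         [1, 1, 0, 1]]
    print(round(coefficient_alpha(X), 3))
```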
65 OTHER FORMS OF RELIABILITY. Test-retest reliability: the correlation between persons' test scores over two administrations of the same test.
66 OTHER FORMS OF RELIABILITY. Split-half reliability (using the Spearman-Brown correction for test length): reliability = 2ρ_AB / (1 + ρ_AB), where ρ_AB is the correlation between scores on the two test halves A and B.
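A minimal sketch (not course code) of split-half reliability with the Spearman-Brown correction; the odd/even item split and the simulated responses are assumptions made for illustration.

```python
import numpy as np

def split_half_reliability(X):
    """Split items into two halves, correlate the half scores, and apply
    the Spearman-Brown correction: 2*r / (1 + r)."""
    X = np.asarray(X, dtype=float)
    half_a = X[:, 0::2].sum(axis=1)   # scores on odd-numbered items
    half_b = X[:, 1::2].sum(axis=1)   # scores on even-numbered items
    r_ab = np.corrcoef(half_a, half_b)[0, 1]
    return 2.0 * r_ab / (1.0 + r_ab)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = rng.normal(size=(50, 1))        # made-up person abilities
    prob = 1.0 / (1.0 + np.exp(-theta))     # same IRF for all 10 items
    X = (rng.random((50, 10)) < prob).astype(int)
    print(round(split_half_reliability(X), 3))
```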
67 TEST VALIDITY. A test is valid if it measures what it claims to measure. Types: face, content, concurrent, predictive, construct.
68 TEST VALIDITY. Face validity: when the test items appear to measure what the test claims to measure. Content validity: when the content of the test items, according to experts, adequately represents the latent trait that the test intends to measure.
69 TEST VALIDITY. Concurrent validity: when the test, measuring a particular latent trait, correlates highly with another test that measures the same trait. Predictive validity: when the scores of the test predict some meaningful criterion.
70 TEST VALIDITY. Construct validity: a test has construct validity when the results of using the test fit hypotheses concerning the nature of the latent trait. The better the fit, the higher the construct validity.
71 RELIABILITY & VALIDITY. Up to a point, reliability and validity increase together, but beyond that point any further increase in reliability (over roughly .96) decreases validity. For example, when there is perfect reliability (perfect correlations between items), the test items are essentially paraphrases of each other.
72 RELIABILITY & VALIDITY. "If the reliability of the items were increased to unity, all correlations between items would also become unity, and a person passing one item would pass all items and another failing one item would fail all the other items. Thus all the possible scores would be a perfect score of one or zero… Is the dichotomy of scores the best that would be expected for items with equal difficulty?" (Tucker, 1946, on the attenuation paradox; see also Loevinger, 1954)