            Item
Person   I1   I2   I3
A         4    4    1
B         3    2    3
C         2    3    2
D         1    1    2

            Item
Person   I1   I2   I3
A (h)     1    1    0
B (h)     1    1    0
C (l)     0    1    1
D (l)     0    0    0

Item Variance: Rank ordering of individuals. P*Q for dichotomous items.
Item Covariance: The extent to which two items rank order individuals similarly.
Item Discrimination: The correct rank ordering of individuals. p(h) - p(l), where (h) marks the high group and (l) the low group.
Item Difficulty: Proportion of people getting an item correct.
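As a rough illustration of the definitions above, the following sketch computes difficulty, variance, and discrimination for the three dichotomous items in the second table. It assumes that persons A and B form the high group and C and D the low group, as the (h) and (l) labels suggest; the variable and function names are made up for this example.

```python
# Dichotomous (0/1) responses from the second table.
# Assumption: A and B are the high group (h), C and D the low group (l).
responses = {
    "A": [1, 1, 0],
    "B": [1, 1, 0],
    "C": [0, 1, 1],
    "D": [0, 0, 0],
}
high_group = ["A", "B"]
low_group = ["C", "D"]
item_names = ["I1", "I2", "I3"]


def proportion_correct(persons, item_index):
    """Proportion of the given persons who answered the item correctly."""
    return sum(responses[p][item_index] for p in persons) / len(persons)


item_stats = {}
for j, name in enumerate(item_names):
    p = proportion_correct(list(responses), j)  # item difficulty
    variance = p * (1 - p)                      # item variance, P*Q
    d = proportion_correct(high_group, j) - proportion_correct(low_group, j)
    item_stats[name] = (p, variance, d)         # d is p(h) - p(l)
    print(name, "difficulty:", p, "variance:", variance, "discrimination:", d)
```

Under these assumptions I1 separates the two groups perfectly (discrimination 1.0), while I3 comes out negative (-0.5): the low group does better on it than the high group, so it rank orders individuals the wrong way.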

Variance-covariance matrix (variances on the diagonal)

         1        2        3        4        5
1    23.25    15.11    15.03    17.09    15.35
2    15.11    25.11    16.10    15.59     7.52
3    15.03    16.10    25.72    15.52      .94
4    17.09    15.59    15.52    25.20    11.51
5    15.35     7.52      .94    11.51   222.07

Correlation matrix

         1        2        3        4        5
1     1.00
2    .63**     1.00
3    .62**    .63**     1.00
4    .71**    .62**    .61**     1.00
5    .21**    .10**    .01**    .15**     1.00
SD    4.82     5.01     5.07     5.02    14.90
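The two matrices are linked: each SD is the square root of a diagonal variance, and each correlation is a covariance divided by the product of the two SDs. A minimal sketch of that conversion, using the covariance values from the slide (the variable names are illustrative only):

```python
import math

# Variance-covariance matrix from the slide (variables 1 to 5).
cov = [
    [23.25, 15.11, 15.03, 17.09, 15.35],
    [15.11, 25.11, 16.10, 15.59, 7.52],
    [15.03, 16.10, 25.72, 15.52, 0.94],
    [17.09, 15.59, 15.52, 25.20, 11.51],
    [15.35, 7.52, 0.94, 11.51, 222.07],
]
n = len(cov)

# Standard deviation = square root of the variance on the diagonal.
sd = [math.sqrt(cov[i][i]) for i in range(n)]

# Correlation = covariance / (SD_i * SD_j).
corr = [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

print("SD:", [round(s, 2) for s in sd])
for row in corr:
    print([round(r, 2) for r in row])
```

The recomputed SDs match the SD row, and the recomputed correlations agree with the printed ones to within about .01, which is what rounding in the two-decimal covariances would produce.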

[Figure: item variance as a function of item difficulty]

The relationship between item difficulty and discrimination.
[Figure: maximum item discrimination (D) plotted against item difficulty (p)]
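The shape of both relationships can be tabulated with a small sketch. Item variance is p(1 - p). For maximum discrimination the sketch assumes equal-sized high and low groups, so that p = (p(h) + p(l)) / 2; the largest possible p(h) - p(l) is then 2p when p is at most .5 and 2(1 - p) otherwise. That equal-groups assumption, and the function names, are not from the slides.

```python
def item_variance(p):
    """Variance of a dichotomous item with difficulty p (P*Q)."""
    return p * (1 - p)


def max_discrimination(p):
    """Largest possible p(h) - p(l) for an item with difficulty p.

    Assumption: the high and low groups are the same size, so
    p = (p_h + p_l) / 2 with both proportions between 0 and 1.
    """
    return 2 * min(p, 1 - p)


for percent in range(0, 101, 10):
    p = percent / 100
    print(f"p = {p:.1f}  variance = {item_variance(p):.2f}  "
          f"max D = {max_discrimination(p):.1f}")
```

Both quantities peak at p = .5 and fall to zero as p approaches 0 or 1: an item that nearly everyone passes or fails cannot separate individuals.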

Combinations of Ph and Pk for which the maximum correlation is 1.0, 0.7, 0.5, 0.3, 0.1, and 0.0.
[Figure: axes Pk and Ph, each marked 0.0, 0.1, 0.3, 0.5, 0.7, 1.0]
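One way to see why the maximum correlation depends on the two difficulties is to compute the upper bound of the phi coefficient directly. The sketch below uses the standard bound: the joint proportion correct cannot exceed the smaller of the two difficulties, which gives a maximum phi of sqrt(p_low * (1 - p_high) / (p_high * (1 - p_low))). The function name is hypothetical, and returning 0.0 for an item with no variance is a convention chosen for this sketch.

```python
import math


def max_phi(p_h, p_k):
    """Upper bound of the phi correlation between two dichotomous items
    with difficulties p_h and p_k."""
    if not (0 < p_h < 1 and 0 < p_k < 1):
        # An item everyone passes or fails has no variance, so the
        # correlation is undefined; this sketch returns 0.0 by convention.
        return 0.0
    low, high = min(p_h, p_k), max(p_h, p_k)
    return math.sqrt(low * (1 - high) / (high * (1 - low)))


for p_h, p_k in [(0.5, 0.5), (0.7, 0.5), (0.5, 0.1), (0.9, 0.1)]:
    print(f"Ph = {p_h}, Pk = {p_k}: max correlation = {max_phi(p_h, p_k):.2f}")
```

Only items of equal difficulty can correlate 1.0; the further apart the two difficulties are, the lower the ceiling on their correlation.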

Reliability is the extent to which your observed score (X) represents your true score (T).

Reliability is the extent to which individual differences, or the rank ordering of individuals, based on observed scores represent those based on true scores. One operationalization of this definition is the correlation between observed scores and true scores, ρXT, which is called the reliability index. Another is the squared correlation between observed scores and true scores, ρXT², which is the proportion of observed score variance that is true score variance, or the proportion of consistent rank ordering.

In reality, it is the extent to which two tests, X and X', yield similar results or a similar rank ordering of the individuals, ρXX'.
Test-retest
Parallel form
Split half
Internal consistency

Systematic error:
1. Systematic in relation to all the examinees, e.g., an interviewer overrates everyone.
2. Systematic in relation to different groups of people. This is test bias or extraneous variance and is thus a validity threat, e.g., non-English speakers systematically suffer on an IQ test because of language difficulties.
3. Systematic in relation to individuals, e.g., test anxiety. This is also not considered in classical theory.

Random error:
ρET = 0, e.g., lucky errors are not made only by high-ability persons.
σE is constant, i.e., the average fluctuation from the true score is the same across individuals, although on any one occasion your observed score may deviate a lot or a little from the true score.
ρEE' = 0, e.g., it is unlikely that you will have the luck to see the answer key again when you retake the same test.

When ρxx' = 1,
1. the measurement has been made without error (e = 0 for all examinees).
2. X = T for all examinees.
3. all observed score variance reflects true-score variance.
4. all differences among observed scores are true-score differences.
5. the correlation between observed scores and true scores is 1.
6. the squared correlation between observed scores and true scores is 1.
7. the correlation between observed scores and errors is zero.

When ρxx' = 0,
1. only random error is included in the measurement.
2. X = E for all examinees.
3. all observed score variance reflects error variance.
4. all differences among observed scores are errors of measurement.
5. the correlation between observed scores and true scores is 0.
6. the squared correlation between observed scores and true scores is 0.
7. the correlation between observed scores and errors is 1.

When ρxx' is between zero and 1,
1. the measurement includes some error and some truth.
2. X = T + E.
3. observed score variance includes true-score variance and error variance.
4. differences among observed scores reflect true-score differences and error differences.
5. the correlation between observed scores and true scores, i.e., the reliability index, ranges between 0 and 1.
6. the squared correlation between observed scores and true scores, i.e., the reliability coefficient, ranges between 0 and 1.
7. the correlation between observed scores and errors is the square root of 1 - reliability.
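The three cases above can be checked numerically from the variance decomposition. The following sketch assumes X = T + E with uncorrelated true scores and errors, so observed variance is the sum of true-score and error variance; the function name and the variance values passed to it are hypothetical.

```python
import math


def classical_test_summary(var_true, var_error):
    """Reliability quantities implied by X = T + E with rho_ET = 0."""
    var_observed = var_true + var_error  # holds because T and E are uncorrelated
    if var_observed <= 0:
        raise ValueError("observed score variance must be positive")
    reliability = var_true / var_observed
    return {
        "reliability_coefficient": reliability,             # rho_XT squared
        "reliability_index": math.sqrt(reliability),        # rho_XT
        "corr_observed_error": math.sqrt(1 - reliability),  # rho_XE
    }


# Hypothetical variances: no error, only error, and a mixed case.
for var_true, var_error in [(16, 0), (0, 16), (16, 9)]:
    print(var_true, var_error, classical_test_summary(var_true, var_error))
```

With a true-score variance of 16 and an error variance of 9, for example, the reliability coefficient is .64, the reliability index is .8, and the correlation between observed scores and errors is .6.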

Validity

The Classics
Different kinds of validity depending on test use.
Content validity
Construct validity
(Criterion-related validity)
Predictive validity
Concurrent validity

The Contemporary
A unified approach to all validity issues in validating a test.
Construct Underrepresentation
Construct Irrelevant Variance
Validity Threat from Response Process
Validity Generalization
Consequence Validity

Content Validity

1. Define domain content

Intelligence tests and theories   20%
Personality tests                 20%
Item characteristics              10%
Reliability                       20%
Validity                          15%
Test development                  15%

Table of specifications

                                  High    Low    Total
Intelligence tests and theories     5%    15%      20%
Personality tests                   5%    15%      20%
Item characteristics                5%     5%      10%
Reliability                        15%     5%      20%
Validity                           10%     5%      15%
Test development                    5%    10%      15%
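A table of specifications is typically turned into item counts by multiplying each cell by the planned test length. The sketch below does this for a hypothetical 40-item test; the test length is an assumption, and Item characteristics is taken as 5% High and 5% Low, which is what its 10% total implies.

```python
TOTAL_ITEMS = 40  # hypothetical test length, not from the slides

# (High %, Low %) for each content area in the table of specifications.
# Assumption: Item characteristics is 5% / 5%, consistent with its 10% total.
blueprint = {
    "Intelligence tests and theories": (5, 15),
    "Personality tests": (5, 15),
    "Item characteristics": (5, 5),
    "Reliability": (15, 5),
    "Validity": (10, 5),
    "Test development": (5, 10),
}

# Integer arithmetic avoids floating-point rounding in the item counts.
item_counts = {
    area: (TOTAL_ITEMS * high // 100, TOTAL_ITEMS * low // 100)
    for area, (high, low) in blueprint.items()
}

for area, (high_n, low_n) in item_counts.items():
    print(f"{area}: {high_n} High, {low_n} Low, {high_n + low_n} total")

total_allocated = sum(h + l for h, l in item_counts.values())
print("Items allocated:", total_allocated)
```

With 40 items every cell works out to a whole number; for other test lengths the cells would need rounding and a check that the counts still sum to the intended total.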

Construct Validity: Internal Structure
[Figure: diagram with Authoritative Parenting and Authoritarian Parenting and the indicators Easygoing Responsiveness, Warmth, Inductive Reasoning, Democratic Participation, Physical Punishment, Non Reasoning, Authoritarian Directiveness, and Verbal Hostility]

Construct Validity: Network Relations
[Figure: diagram of relations among Communication Avoidance, Social Withdrawal, Assertive Leadership, Behavioral Aggression, Verbal Aggression, Perceived Social Competence (Time 1 and Time 2), and Peer Acceptance (Time 1 and Time 2, single indicator)]

Criterion-Related Validity
Predictive: A-Level and University GPA
Concurrent: SAT and A-Level

Restriction of Range Effect
[Figure: criterion plotted against test scores, with a qualifying score separating rejected from selected examinees; it contrasts the distribution of criterion scores for the selected group with the distribution of scores on the criterion if no examinees were excluded]
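The restriction of range effect can be illustrated with a small simulation. The sketch below generates test and criterion scores with an assumed population correlation of .6, selects the top 30% on the test, and compares the test-criterion correlation in the full group with that in the selected group. The correlation, sample size, selection ratio, and seed are all hypothetical choices for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_R = 0.6             # hypothetical population correlation
N = 5000                 # hypothetical number of examinees

test_scores, criterion = [], []
for _ in range(N):
    z1 = rng.gauss(0, 1)
    z2 = rng.gauss(0, 1)
    test_scores.append(z1)
    # Criterion built to correlate TRUE_R with the test in the population.
    criterion.append(TRUE_R * z1 + math.sqrt(1 - TRUE_R ** 2) * z2)

# Qualifying score: only the top 30% on the test are selected.
qualifying_score = sorted(test_scores)[int(0.7 * N)]
selected = [(x, y) for x, y in zip(test_scores, criterion) if x >= qualifying_score]

r_all = pearson_r(test_scores, criterion)
r_selected = pearson_r([x for x, _ in selected], [y for _, y in selected])
print("Correlation, all examinees:", round(r_all, 2))
print("Correlation, selected group:", round(r_selected, 2))
```

Because criterion data usually exist only for the selected group, the correlation computed there understates what the test would show if no examinees were excluded.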

An Integrated View of Intelligence Theory
