Slide 1: Measurement
Joseph Stevens, Ph.D., © 2005

Slide 2: Measurement, Assessment, and Evaluation
- Measurement: the process of assigning quantitative or qualitative descriptions to some attribute; requires operational definitions
- Assessment: the collection of measurement information, plus its interpretation, synthesis, and use
- Evaluation: value added to assessment information (e.g., good, poor, "ought", "needs improvement")

Slide 3: Assessment Decisions/Purposes
- Instructional
- Curricular
- Treatment/intervention
- Placement/classification
- Selection/admission
- Administration/policy-making
- Personal/individual
- Personnel evaluation

Slide 4: Scaling
- The process of systematically translating empirical observations into a measurement scale
- Origin
- Units
- Information
- Types of scales

Slide 5: Score Interpretation
- Direct interpretation
- Need for analysis; relative interpretation
- Normative interpretation
- Anchoring/standards

Slide 6: Frames of Reference for Interpretation
- Current versus future performance
- Typical versus maximum (potential) performance
- Standard of comparison: to self, to others, to a standard
- Formative versus summative

Slide 7: Domains
- Cognitive: ability/aptitude; achievement; memory, perception, etc.
- Affective: beliefs; attitudes; feelings, interests, preferences, emotions
- Behavior

Slide 8: Cognitive Level
- Knowledge
- Comprehension
- Application
- Analysis/synthesis
- Evaluation

Slide 9: Assessment Tasks
- Selected response: MC, T-F, matching
- Restricted response: cloze, fill-in, completion
- Constructed response: essay
- Free response/performance assessments: products, performances
- Rating
- Ranking
- Magnitude estimation

Slide 10: CRT versus NRT
- Criterion-referenced tests (CRT): comparison to a criterion/standard; items chosen to represent the domain (relevance, representativeness)
- Norm-referenced tests (NRT): comparison to a group; items chosen to discriminate one person from another

Slide 11: Kinds of Scores
- Raw scores
- Standard scores
- Developmental standard scores
- Percentile ranks (PR)
- Normal curve equivalents (NCE); the sketch below shows how z, PR, and NCE relate
- Grade equivalents (GE)
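
A minimal sketch, assuming an approximately normal score distribution, of how a raw score maps onto three of these derived scales. The raw score and norm-group mean/SD are made-up values; the constant 21.06 scales z-scores so that NCEs match percentile ranks at 1, 50, and 99.

```python
# Hypothetical raw score and norm-group statistics; not from the slides.
from scipy.stats import norm

raw, mean, sd = 62, 50, 10            # made-up norms
z = (raw - mean) / sd                 # standard (z) score: 1.20
percentile_rank = norm.cdf(z) * 100   # PR: percent of the norm group scoring below
nce = 50 + 21.06 * z                  # normal curve equivalent

print(f"z = {z:.2f}, PR = {percentile_rank:.1f}, NCE = {nce:.1f}")
# z = 1.20, PR = 88.5, NCE = 75.3
```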

Slide 12: Scoring Methods
- Objective
- Subjective: holistic or analytic

Slides 13-14: (no text content in the transcript)

Slide 15: Aggregating Scores
- Total scores
- Summated scores
- Composite scores
- Issues: intercorrelation of components, variance, reliability

Slide 16: Theories of Measurement
- Classical test theory (CTT): X = T + E, observed score = true score + error (simulated in the sketch below)
- Item response theory (IRT): http://work.psych.uiuc.edu/irt/tutorial.asp
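
A minimal simulation sketch of the CTT decomposition, assuming normally distributed true scores and error that is independent of them; all numbers are illustrative. With Var(T) = 100 and Var(E) = 25, reliability defined as Var(T)/Var(X) should come out near .80.

```python
# Simulate X = T + E and recover reliability as Var(T) / Var(X).
import numpy as np

rng = np.random.default_rng(0)
true = rng.normal(50, 10, 100_000)   # T: true scores, SD = 10
error = rng.normal(0, 5, 100_000)    # E: random error, independent of T
observed = true + error              # X = T + E

reliability = true.var() / observed.var()
print(f"reliability ≈ {reliability:.2f}")   # ≈ 100 / (100 + 25) = 0.80
```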

Slides 17-18: (no text content in the transcript)

Slide 19: Reliability
- Consistency
- Consistency of decisions
- A prerequisite to validity
- Errors in measurement

Slide 20: Reliability
- Sources of error:
  - Variations in the physical and mental condition of the person measured
  - Changes in physical or environmental conditions
  - Tasks/items
  - Administration conditions
  - Time
  - Skill to skill
  - Raters/judges
  - Test forms

Slide 21: Estimating Reliability
- Reliability versus the standard error of measurement (SEM)
- Internal consistency: Cronbach's alpha, split-half (see the alpha sketch below)
- Test-retest
- Inter-rater
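
A minimal sketch of one internal-consistency estimate, Cronbach's alpha, computed from its standard definition: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The score matrix is made up.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha; scores is a persons-by-items array."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Made-up data: 4 persons x 3 items
scores = np.array([[2, 3, 3],
                   [4, 4, 5],
                   [1, 2, 2],
                   [3, 3, 4]])
print(round(cronbach_alpha(scores), 3))   # 0.971 for this toy data
```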

Slide 22: Estimating Reliability
- Correlations: rank order versus exact agreement
- Percent agreement: exact versus close; (number of agreements / number of scores) x 100
- The problem of chance agreements

Slide 23: Estimating Reliability
- Kappa coefficient:
  - Takes chance agreements into account
  - Calculate expected frequencies and subtract
  - Kappa ≥ .70 is generally considered acceptable
  - Examine the pattern of disagreements
- Example (the table on the next slide, computed in the sketch that follows it): percent agreement = 63.8%, r = .509, kappa = .451

Slide 24: Example agreement table (rows and columns are the two raters' classifications)

            Below   Meets   Exceeds   Total
  Below         9       3         1      13
  Meets         4       8         2      14
  Exceeds       2       1         6       9
  Total        15      12         9      36
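
A minimal sketch computing percent agreement and Cohen's kappa from the table above; it reproduces the slide's kappa of .451 (exact agreement is 23/36, which rounds to 63.9% here).

```python
import numpy as np

# Agreement table from the slide (rows = one rater, columns = the other).
table = np.array([[9, 3, 1],    # Below
                  [4, 8, 2],    # Meets
                  [2, 1, 6]])   # Exceeds
n = table.sum()                                        # 36
observed = np.trace(table) / n                         # exact agreements: 23/36
expected = (table.sum(0) * table.sum(1)).sum() / n**2  # chance agreement: .343
kappa = (observed - expected) / (1 - expected)

print(f"percent agreement = {observed:.1%}")   # 63.9%
print(f"kappa = {kappa:.3f}")                  # 0.451
```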

Slide 25: Estimating Reliability
- Spearman-Brown prophecy formula (given in the sketch below)
- More (test length) is better
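
The prophecy formula itself is r_kk = k*r / (1 + (k-1)*r), where k is the factor by which the test is lengthened; a minimal sketch with illustrative numbers:

```python
def spearman_brown(r, k):
    """Predicted reliability when test length changes by factor k."""
    return k * r / (1 + (k - 1) * r)

# Doubling a test with reliability .70 (made-up value):
print(round(spearman_brown(0.70, 2), 3))   # 0.824 -- "more is better"
```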

Slide 26: Reliability as Error
- Systematic error
- Random error
- Standard error of measurement: SEM = SD x √(1 - r_xx)
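
A worked example of the SEM formula with made-up values (SD = 15, r_xx = .91):

```python
from math import sqrt

sd, r_xx = 15.0, 0.91       # hypothetical scale SD and reliability
sem = sd * sqrt(1 - r_xx)   # 15 * sqrt(.09) = 4.5
print(sem)                  # roughly a 68% band: observed score ± 4.5
```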

Slide 27: Factors Affecting Reliability
- Time limits
- Test length
- Item characteristics: difficulty, discrimination
- Heterogeneity of the sample
- Number of raters; quality of subjective scoring

Slide 28: Validity
- Accuracy
- Unified view (Messick): use and interpretation
  - Evidential basis: content, criterion, concurrent-discriminant, construct
  - Consequential basis

Slide 29: Validity
- Internal, structural
- Multitrait-multimethod (Campbell & Fiske)
- Predictive

Slide 30: Test Development
- Construct representation:
  - Content analysis
  - Review of research
  - Direct observation
  - Expert judgment (panels, ratings, Delphi)
  - Instructional objectives

Slide 31: Test Development
- Blueprint: content x process; domain sampling; item frames; matching item type and response format to purpose
- Item writing
- Item review (grammar, readability, cueing, sensitivity)

Slide 32: Test Development
- Writing instructions
- Form design (NAEP brown ink)
- Field and pilot testing
- Item analysis
- Review and revision

Slide 33: Equating
- The need to link across forms, people, or occasions
- Horizontal equating
- Vertical equating
- Designs: common item, common persons

Slide 34: Equating
- Equipercentile
- Linear (see the sketch below)
- IRT
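
A minimal sketch of linear equating, which places Form X scores on the Form Y scale by matching means and standard deviations; the group statistics are made up.

```python
mean_x, sd_x = 30.0, 6.0   # Form X statistics (illustrative)
mean_y, sd_y = 33.0, 5.0   # Form Y statistics (illustrative)

def linear_equate(x):
    """Map a Form X score to the Form Y scale (equal z-scores)."""
    return sd_y / sd_x * (x - mean_x) + mean_y

print(linear_equate(36.0))   # a Form X score of 36 (z = +1) maps to 38.0
```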

Slide 35: Bias and Sensitivity
- Sensitivity in item and test development
- Differential results versus bias:
  - Differential item functioning (DIF)
  - The importance of matching; legal versus psychometric definitions
  - Understanding diversity and individual differences

Slide 36: Item Analysis
- Difficulty, p
- Means and standard deviations
- Discrimination, point-biserial r
- Omits
- Removing or revising "bad" items
- Example: see the sketch below
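
A minimal sketch of the two core item statistics on a made-up 0/1 response matrix: difficulty as the proportion correct, and discrimination as the point-biserial correlation between an item and the rest-score (the total with that item removed, to avoid inflating the correlation).

```python
import numpy as np

resp = np.array([[1, 1, 0, 1],   # rows = persons, columns = items (made up)
                 [1, 0, 0, 1],
                 [1, 1, 1, 1],
                 [0, 0, 0, 1],
                 [1, 1, 0, 0]])

p = resp.mean(axis=0)   # difficulty: proportion answering each item correctly
print("difficulty p:", p)

for j in range(resp.shape[1]):
    rest = resp.sum(axis=1) - resp[:, j]         # total score without item j
    r_pb = np.corrcoef(resp[:, j], rest)[0, 1]   # point-biserial discrimination
    print(f"item {j}: r_pb = {r_pb:.2f}")
```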

Slide 37: Factor Analysis
- A method for evaluating structural validity and reliability
- Exploratory (EFA): see the sketch below
- Confirmatory (CFA)
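
A minimal EFA sketch on simulated data with two latent factors, using scikit-learn's FactorAnalysis (one of several EFA implementations; the six indicator variables are made up). The loading matrix should show the first three measures loading mainly on one factor and the last three on the other, at least approximately, since no rotation is applied.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
verbal = rng.normal(size=500)    # latent factor 1
spatial = rng.normal(size=500)   # latent factor 2
X = np.column_stack([verbal + rng.normal(scale=0.5, size=500) for _ in range(3)] +
                    [spatial + rng.normal(scale=0.5, size=500) for _ in range(3)])

fa = FactorAnalysis(n_components=2).fit(X)
print(np.round(fa.components_, 2))   # loadings: rows = factors, cols = measures
```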

