UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT


1 UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT
CHAP 14: ITEM ANALYSIS
CHAP 15: INTRODUCTION TO ITEM RESPONSE THEORY
CHAP 16: DETECTING ITEM BIAS

2 CHAPTER 14 ITEM ANALYSIS
*The goal of test construction is to create a test with minimum length and good reliability and validity.
*Item Analysis is the computation and examination of any statistical property of an item response distribution.
*Item Analysis is a process that we go through when constructing a new test or subtests from a pool of items with good reliability and validity.

3 *Categories of Item Parameter
*Item parameters fall into 3 categories or indices.
1. Indices that describe the distribution of responses to a single item (e.g., mean and variance of item responses).
2. Indices that describe the degree of relationship between the response to the item and some criterion of interest. Ex. next

4 CHAPTER 14 ITEM ANALYSIS
Ex. The relationship between the questions (items) and the criterion of interest (i.e., depression) in Factor Analysis.
3. Indices that are a function of both, meaning the relationship between the item variance/mean and a criterion of interest. Ex. First, find the variance/mean for your items; then calculate the relationship between these item variances and the criterion of interest (i.e., depression) for two groups.

5 Item difficulties (P) Computing item difficulties is one of the 7 steps in Item Analysis. We use item difficulties to select the best items.

6 Item difficulties (P) P = f/N, that is, the number of examinees who answered an item correctly divided by the total number of examinees (see your midterm item analysis and Chap 5). The higher the P value, the easier the item.
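
The formula P = f/N can be sketched in a few lines of Python. This is only an illustration: the function name `item_difficulty` and the response data are hypothetical, and responses are assumed to be scored 1 (correct) or 0 (incorrect).

```python
def item_difficulty(responses):
    """Return P = f / N for one item.

    responses: list of 1 (correct) / 0 (incorrect), one entry per examinee.
    """
    n = len(responses)
    if n == 0:
        raise ValueError("at least one examinee is required")
    f = sum(responses)  # number of examinees who answered correctly
    return f / n


# Hypothetical data: 10 examinees per item.
easy_item = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
hard_item = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]

print(item_difficulty(easy_item))  # 0.9 (higher P, easier item)
print(item_difficulty(hard_item))  # 0.2 (lower P, harder item)
```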

8 *Steps in Item Analysis
In a typical item analysis, the test developer takes 7 steps (they are similar to the process of test construction in Chapter 4). Next slide

9 FYI Process of Test Construction Chap IV
1- Identifying the purposes of test score use
2- Identifying behaviors to represent the construct
3- Preparing test specifications, i.e., Bloom's Taxonomy
4- Item construction
5- Item review

10 Process of Test Construction
6- Preliminary item tryouts
7- Field test
8- Statistical analysis
9- Reliability and validity
10- Guidelines

11 *7 steps in item analysis (P)
1. Describe which proportions of the test score are of greatest importance. Ex. When I select questions for your midterm/final exam, I look for similarities between the questions and those of qualifying/comprehensive or EPPP exams (portions on Reliability and Validity).

12 7 steps in item analysis (P)
2. Identify the item parameters (e.g., mean, variance) most relevant to these proportions.
3. Administer the items to a sample of examinees representative of those for whom the test is intended. Ex. IQ test for children, or depression test for adults.

13 7 steps in item analysis (P)
4. Estimate for each item the parameters identified in step 2 (e.g., variance).
5. Establish a plan for item selection. Ex. Using item difficulties (P), as in Item Analysis, to select the items.

14 7 steps in item analysis (P)
6. Select the final subset of items, or use the data (items in your Item Analysis) for test revision. Ex. Take out all questions with very high or very low item difficulties (P).
7. Conduct a cross-validation (validity) study. Ex. Use SPSS to compare the results of 2 tests or 2 classes (e.g., this year's class and last year's class).
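
Step 6 can be sketched as a simple filter on item difficulty. The cutoffs below (0.2 and 0.8) are illustrative assumptions, not values given in the slides, and the item names and P values are hypothetical.

```python
def select_items(difficulties, low=0.2, high=0.8):
    """Keep items whose difficulty P lies in [low, high].

    The default cutoffs are illustrative assumptions only.
    """
    return {item: p for item, p in difficulties.items() if low <= p <= high}


# Hypothetical item difficulties from an item analysis.
item_p = {"Q1": 0.95, "Q2": 0.55, "Q3": 0.10, "Q4": 0.70}

kept = select_items(item_p)
print(sorted(kept))  # ['Q2', 'Q4']: Q1 is too easy, Q3 is too hard
```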

15 UNIT V TEST SCORING AND INTERPRETATION
CHAP 17: CORRECTING FOR GUESSING AND OTHER SCORING METHODS
CHAP 18: SETTING STANDARDS
CHAP 19: NORMS AND STANDARD SCORES
CHAP 20: EQUATING SCORES FROM DIFFERENT TESTS

16 CHAPTER 19 NORMS AND STANDARD SCORES

17 NORMS AND STANDARD SCORES
1910 *Alfred Binet: Ratio IQ = ratio of MA/CA (mental age to chronological age)

18 NORMS AND STANDARD SCORES
1912 In Germany, the psychologist Wilhelm Stern proposed the following formula, which standardized the ratio: IQ = [Mental Age/Chronological Age] × 100. This formula works fairly well for children but not for adults. *The abbreviation "IQ" was coined by Stern for the German term Intelligenzquotient. This is the Ratio IQ.
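
As a small worked example of the ratio IQ formula, the sketch below computes IQ = (MA/CA) × 100. The function name and the ages are hypothetical; the adult case assumes, for illustration only, that measured mental age stops rising in adulthood, which is one way to see why the ratio breaks down for adults.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = (mental age / chronological age) x 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age / chronological_age


print(ratio_iq(10, 8))   # 125.0: an 8-year-old performing like a 10-year-old
print(ratio_iq(8, 10))   # 80.0: a 10-year-old performing like an 8-year-old

# Illustrative assumption: if mental age levels off around 20, the ratio
# keeps shrinking with age even though ability has not changed.
print(ratio_iq(20, 40))  # 50.0
```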

19 NORMS AND STANDARD SCORES
1916 *Lewis Terman of Stanford University published the Stanford-Binet Intelligence Test. He used the standardized version: IQ = [Mental Age/Chronological Age] × 100.

20 NORMS AND STANDARD SCORES
*Deviation IQ: uses norms to estimate the IQ. We use norms when we want to compare an examinee’s raw score on a test to the distribution of scores (scaled or standard scores) for a sample from a well-defined population. Ex. next

21 NORMS AND STANDARD SCORES
Ex. When we want to estimate the IQ of a 20-year-old person, we compare their raw score on the subtests of an IQ test with the scores of people their age, which is “their norm” (standard score). This technique tells us where they stand among people of their age.
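
The comparison to a same-age norm group can be sketched as a z-score rescaled to an IQ metric. The mean of 100 and standard deviation of 15 are the convention used by Wechsler-type scales and are an assumption here, since the slides do not state them; the norm-group scores are hypothetical.

```python
from statistics import mean, pstdev


def deviation_iq(raw_score, norm_scores, iq_mean=100, iq_sd=15):
    """Convert a raw score to a deviation IQ relative to a norm group.

    iq_mean=100 and iq_sd=15 are assumed conventions, not values from the slides.
    """
    m = mean(norm_scores)
    s = pstdev(norm_scores)  # treat the norm group as the reference population
    z = (raw_score - m) / s  # standing relative to same-age peers
    return iq_mean + iq_sd * z


# Hypothetical raw scores for a norm group of 20-year-olds.
norm_group = [40, 45, 50, 55, 60]

print(deviation_iq(50, norm_group))  # 100.0: exactly at the norm-group mean
print(deviation_iq(60, norm_group) > 100)  # True: above-average raw score
```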

22 *9 Basic Steps in Conducting a Norming Study (p.432)
1. Identify the population of interest. Ex. Students, employees of a company, inmates, patients, etc.
2. Identify the most critical statistics that will be computed for the sample data. Ex. Standard deviation σ, σ², M, SS, p

23 9 Basic Steps in Conducting a Norming Study (p.432)
3. Decide on the tolerable amount of sampling error, that is, the discrepancy between the sample statistic (M) and the population parameter (µ): M - µ. The Central Limit Theorem describes 3 characteristics of the distribution of sample means: 1. Central tendency (the mean of the sample means equals µ) 2. The shape of the distribution (normal) 3. Variability, or the standard error of the mean (σm).

24 9 Basic Steps in Conducting a Norming Study (p.432)
4. Devise a procedure for drawing a sample from the population of interest. There are 4 types of probability sampling:
I Simple Random Sampling: Give everyone in the population an equal chance of being selected. Ex. Draw names from a hat.
II Systematic Sampling: Compute K = N/n and select every Kth name on the list. Ex. CAU population N = 1500 and your sample size n = 150, so N/n = 1500/150 = 10. Select every 10th student.

25 9 Basic Steps in Conducting a Norming Study (p.432) Sampling cont.
III Stratified Sampling: “Strata” means different layers. We use stratified sampling when we want to compare 2 different groups (e.g., male and female CAU doctoral students). First we randomly select males, then we randomly select females.

26 9 Basic Steps in Conducting a Norming Study (p.432) Sampling cont.
IV Cluster Sampling: We use cluster sampling when the population consists of units rather than individuals, such as classes. Ex. Miami Dade School District. If we want to conduct research with Miami Dade 2nd graders (2nd grade classes), we randomly select about 10 of these 2nd grade classes to be in our sample, then conduct the research.
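
The four probability sampling procedures can be sketched with Python's standard `random` module. Everything here is illustrative: the function names, the 700/800 split into strata, and the 40 classes of 25 students are hypothetical, and the systematic sample uses a random starting point within the first K names, which is a common choice but not stated in the slides.

```python
import random


def simple_random_sample(population, n, rng):
    """Every member has an equal chance of selection (names from a hat)."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Select every Kth member, K = N // n, from a random start in the first K."""
    k = len(population) // n
    start = rng.randrange(k)
    return population[start::k][:n]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate random sample from each stratum (layer)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole units (e.g., classes), keeping all their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


rng = random.Random(0)  # fixed seed so the sketch is repeatable

students = list(range(1, 1501))  # N = 1500, as in the CAU example
systematic = systematic_sample(students, 150, rng)
print(len(systematic), systematic[1] - systematic[0])  # 150 10

strata = {"male": students[:700], "female": students[700:]}  # hypothetical split
by_group = stratified_sample(strata, 75, rng)

classes = {f"class_{i}": [f"c{i}_s{j}" for j in range(25)] for i in range(40)}
picked = cluster_sample(classes, 10, rng)
print(len(picked))  # 10 whole classes
```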

27 9 Basic Steps in Conducting a Norming Study (p.432)
5. Estimate the minimum sample size (n) required to hold the sampling error within the specified limits. There are different statistical procedures for estimating n; n should be ≥ 30 (law of large numbers).
1. n = (σ/d)², where d = effect size, d = (M - µ)/σ
2. n = (σ/σm)², where σm = σ/√n is the standard error of the mean for a population (Ex. z score), and Sm = S/√n is the estimated standard error of the mean for a sample (Ex. t-distribution)
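
The second formula, n = (σ/σm)², follows from rearranging σm = σ/√n. A minimal sketch, assuming a known population σ and a chosen tolerable standard error (the function name and the numbers are hypothetical):

```python
import math


def required_n(sigma, target_se):
    """Smallest n with sigma / sqrt(n) <= target_se, i.e. n = (sigma / target_se)^2."""
    if sigma <= 0 or target_se <= 0:
        raise ValueError("sigma and target_se must be positive")
    return math.ceil((sigma / target_se) ** 2)  # round up to a whole examinee


# Hypothetical values: sigma = 15 score points.
print(required_n(15, 1.5))  # 100
print(required_n(15, 2))    # 57 (56.25 rounded up)
```

Rounding up rather than to the nearest integer keeps the standard error at or below the tolerable limit.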

28 NORMS AND STANDARD SCORES

29 The Effect Size Ex. Independent-samples t-test for two groups

30 NORMS AND STANDARD SCORES

31 9 Basic Steps in Conducting a Norming Study (p.432)
6. Draw the sample and collect the data.
7. Compute the values of the group statistics of interest and their standard errors: Sm = S/√n or σm = σ/√n. Calculate the standard error of the mean, which estimates the typical size of the difference between M and µ, also known as sampling error.
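
Step 7 can be sketched for a single sample as Sm = S/√n, using the sample standard deviation. The function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error_of_mean(sample):
    """Estimated standard error of the mean: Sm = S / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))  # stdev uses n - 1 in the denominator


# Hypothetical raw scores from a norming sample.
scores = [10, 12, 14, 16, 18]

print(mean(scores))                              # 14
print(round(standard_error_of_mean(scores), 3))  # 1.414
```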

32 9 Basic Steps in Conducting a Norming Study (p.432)
8. Identify the types of normative scores that will be needed, and prepare the normative score conversion table (see the next 2 slides).
9. Prepare written documentation of the normative scores.

33 NORMS AND STANDARD SCORES
Types of Normative Scores
Raw Score: score on a subtest or a test.
Scaled Score: normative score for a specific age.

34 Normative Scores: Wechsler (pronounced “Wex-ler”)

35 *Normative Scores

36 NORMS AND STANDARD SCORES
*Usefulness of Scaled Scores
Scaled scores are useful for two purposes:
1. Scaled scores relate the examinee’s performance to the percentile rank scores of the norm group and their grade level.
2. In evaluation and research, the mean scaled score is a better estimate of average group performance than the mean raw score.
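
To illustrate the percentile ranks mentioned in point 1, the sketch below uses one common definition: the percentage of the norm group scoring below a given score, plus half of those scoring exactly at it. The definition choice, function name, and norm-group scores are assumptions for illustration.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm group below `score`, counting ties as half.

    This is one common definition; other conventions exist.
    """
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100 * (below + 0.5 * equal) / len(norm_scores)


# Hypothetical norm-group scores.
norm_group = [10, 20, 30, 40, 50]

print(percentile_rank(30, norm_group))  # 50.0
print(percentile_rank(50, norm_group))  # 90.0
```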
