UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT

Presentation transcript:

UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT CHAP 14: ITEM ANALYSIS CHAP 15: INTRODUCTION TO ITEM RESPONSE THEORY CHAP 16: DETECTING ITEM BIAS

CHAPTER 14 ITEM ANALYSIS *The goal of test construction is to create a test with minimum length and good reliability and validity. *Item Analysis is the computation and examination of any statistical property of an item response distribution. *Item Analysis is a process that we go through when constructing a new test or subtests from a pool of items with good reliability and validity.

*Categories of Item Parameters
*Item parameters fall into 3 categories of indices.
1. Indices that describe the distribution of responses to a single item (e.g., the mean and variance of item responses).
2. Indices that describe the degree of relationship between the response to the item and some criterion of interest (example on the next slide).

CHAPTER 14 ITEM ANALYSIS
Ex. for category 2: the relationship between the questions (items) and the criterion of interest, e.g., depression, as examined in factor analysis.
3. Indices that are a function of both the item variance/mean and the relationship to a criterion of interest. Ex. First, find the variance/mean of your items; then calculate the relationship between the item variances and the criterion of interest (e.g., depression) for two groups.
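As a rough illustration of a category 2 index, the sketch below correlates a right/wrong item score with a criterion score. It uses a plain Pearson correlation, which for a 0/1 item is the point-biserial correlation. The item responses and criterion scores are made-up values, not data from the course.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: 1 = item endorsed/correct, 0 = not.
item = [1, 1, 0, 1, 0, 0, 1, 0]
# Hypothetical criterion scores (e.g., totals on a depression scale).
criterion = [30, 28, 15, 25, 18, 12, 27, 20]

r = pearson_r(item, criterion)
print(f"item-criterion correlation: {r:.2f}")
```

A high positive value suggests the item separates examinees in the same direction as the criterion; a value near zero suggests the item tells us little about it.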

Item difficulties (P): Computing item difficulties is part of the 7 steps in item analysis. We use item difficulties to select the best items.

Item difficulties (P): P = f/N, where f is the number of examinees who answered the item correctly and N is the total number of examinees (see your midterm item analysis and Chap 5). The higher the P value, the easier the item.
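The formula P = f/N can be sketched in a few lines. The item names and the 0/1 responses below are invented for illustration only.

```python
def item_difficulty(responses):
    """Return P = f / N for one item; responses are 1 (correct) or 0 (incorrect)."""
    n_examinees = len(responses)   # N
    n_correct = sum(responses)     # f
    return n_correct / n_examinees

# Hypothetical responses from 10 examinees to 3 items.
scores = {
    "item_1": [1, 1, 1, 1, 0, 1, 1, 1, 1, 1],
    "item_2": [1, 0, 1, 0, 1, 0, 1, 1, 0, 0],
    "item_3": [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
}

difficulties = {name: item_difficulty(r) for name, r in scores.items()}
for name, p in difficulties.items():
    print(name, p)
```

Here item_1 is the easiest item (highest P) and item_3 the hardest (lowest P).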

CHAPTER 14 ITEM ANALYSIS *Steps in Item Analysis: In a typical item analysis the test developer takes 7 steps (they are similar to the process of test construction in Chapter 4; see the next slide).

FYI: Process of Test Construction (Chap 4)
1. Identifying the purposes of test score use
2. Identifying behaviors to represent the construct
3. Preparing test specifications (e.g., Bloom's Taxonomy)
4. Item construction
5. Item review

Process of Test Construction (cont.)
6. Preliminary item tryouts
7. Field test
8. Statistical analysis
9. Reliability and validity
10. Guidelines

*7 steps in item analysis (P)
1. Describe which properties of the test scores are of greatest importance. Ex. When I select questions for your midterm/final exam, I look for similarity to the questions on qualifying/comprehensive or EPPP exams (portions on reliability and validity).

7 steps in item analysis (P)
2. Identify the item parameters (e.g., mean, variance) most relevant to these properties.
3. Administer the items to a sample of examinees representative of those for whom the test is intended. Ex. An IQ test for children, or a depression test for adults.

7 steps in item analysis (P)
4. Estimate for each item the parameters identified in step 2 (e.g., variance).
5. Establish a plan for item selection. Ex. Use item difficulties (P), as in your item analysis, to select the items.

7 steps in item analysis (P)
6. Select the final subset of items, or use the data (the items in your item analysis) for test revision. Ex. Take out all questions with very high or very low item difficulties (P).
7. Conduct a cross-validation (validity) study. Ex. Use SPSS to compare the results of 2 tests or 2 classes (e.g., this year's class and last year's class).
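Step 6 can be sketched as a simple filter on item difficulty. The cutoffs 0.20 and 0.80 are assumptions chosen for illustration; the slides only say "very high or very low" and give no numeric limits. The question names and P values are hypothetical.

```python
def select_items(difficulties, low=0.20, high=0.80):
    """Keep items whose difficulty P lies within [low, high] (assumed cutoffs)."""
    return [name for name, p in difficulties.items() if low <= p <= high]

# Hypothetical item difficulties from an item analysis.
difficulties = {"q1": 0.95, "q2": 0.62, "q3": 0.48, "q4": 0.10, "q5": 0.80}

kept = select_items(difficulties)
dropped = [name for name in difficulties if name not in kept]
print("kept:", kept)
print("dropped:", dropped)
```

With these values q1 (too easy) and q4 (too hard) are taken out, and the remaining items would go on to the cross-validation study in step 7.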

UNIT V TEST SCORING AND INTERPRETATION CHAP 17: CORRECTING FOR GUESSING AND OTHER SCORING METHODS CHAP 18: SETTING STANDARDS CHAP 19: NORMS AND STANDARD SCORES CHAP 20: EQUATING SCORES FROM DIFFERENT TESTS

CHAP 19: NORMS AND STANDARD SCORES

NORMS AND STANDARD SCORES 1910 *Alfred Binet: Ratio IQ = ratio of MA/CA (mental age / chronological age)

1912: In Germany, psychologist Wilhelm Stern standardized it by proposing the following formula: Ratio IQ = [Mental Age / Chronological Age] × 100. This formula works fairly well for children but not for adults. *The abbreviation "IQ" was coined by Stern for the German term Intelligenz-quotient.

NORMS AND STANDARD SCORES 1916 *Lewis Terman of Stanford University publishes the Stanford-Binet Intelligence Test. He used the standardized version: IQ = [Mental Age / Chronological Age] × 100
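A small worked example of the ratio IQ formula follows. The ages are hypothetical and chosen only to show the arithmetic.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = (mental age / chronological age) * 100."""
    return (mental_age / chronological_age) * 100

# Hypothetical child: mental age 10, chronological age 8.
child = ratio_iq(10, 8)

# Hypothetical adult: mental age 20, chronological age 40.
adult = ratio_iq(20, 40)

print(child, adult)
```

The second case hints at why the formula is said to break down for adults: if the measured mental age levels off while chronological age keeps rising, the ratio falls even though ability has not changed.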

NORMS AND STANDARD SCORES *Deviation IQ: uses norms to estimate the IQ. We use norms when we want to compare an examinee's raw score on a test to the distribution of scores (scaled or standard scores) for a sample from a well-defined population (example on the next slide).

NORMS AND STANDARD SCORES Ex. When we want to estimate the IQ of a 20-year-old person, we compare their raw score on a subtest of an IQ test with the scores of people their age, which is “their norm” (standard score). This technique tells us where they stand among people of their age.
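One way to picture a deviation IQ is as a z-score within the age group, rescaled to an IQ metric. The sketch below assumes the common convention of mean 100 and standard deviation 15, which the slides do not state; the raw score and the norm-group mean and SD are hypothetical.

```python
def deviation_iq(raw_score, norm_mean, norm_sd, iq_mean=100, iq_sd=15):
    """Rescale a raw score to an IQ metric using the age-group norm.

    iq_mean=100 and iq_sd=15 are an assumed convention, not from the slides.
    """
    z = (raw_score - norm_mean) / norm_sd   # position within the age group
    return iq_mean + iq_sd * z

# Hypothetical 20-year-old: raw score 58; age-group norm has mean 50, SD 8.
iq = deviation_iq(raw_score=58, norm_mean=50, norm_sd=8)
print(iq)
```

A raw score exactly at the age-group mean would map to 100 under this assumed scale, regardless of the examinee's age.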

*9 Basic Steps in Conducting a Norming Study (p. 432)
1. Identify the population of interest. Ex. Students, employees of a company, inmates, patients, etc.
2. Identify the most critical statistics that will be computed for the sample data. Ex. Standard deviation σ, variance σ², M, SS, p

NORMS AND STANDARD SCORES *9 Basic Steps in Conducting a Norming Study (p. 432)
3. Decide on the tolerable amount of sampling error, that is, the discrepancy M − µ between the sample statistic (M) and the population parameter (µ). The Central Limit Theorem describes the distribution of sample means through 3 characteristics: 1. Central tendency (the mean of the sample means equals µ). 2. The shape of the distribution (normal). 3. Variability, or the standard error of the mean (σm).
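The three characteristics can be checked informally with a simulation. This is only a sketch: the population (mean 100, SD 15), the sample size n = 25, and the number of repeated samples are all arbitrary choices made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population of 20,000 scores.
population = [random.gauss(100, 15) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Draw many samples of size n and record each sample mean M.
n = 25
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(2_000)
]

mean_of_means = statistics.fmean(sample_means)   # should be close to mu
observed_se = statistics.stdev(sample_means)     # spread of the sample means
expected_se = sigma / n ** 0.5                   # sigma_m = sigma / sqrt(n)

print(round(mu, 2), round(mean_of_means, 2))
print(round(observed_se, 2), round(expected_se, 2))
```

In a run like this the mean of the sample means should land near µ and their spread near σ/√n; a histogram of `sample_means` would show the roughly normal shape.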

9 Basic Steps in Conducting a Norming Study (p. 432)
4. Devise a procedure for drawing a sample from the population of interest. There are 4 types of probability sampling:
I. Simple Random Sampling: Give everyone in the population an equal chance of being selected. Ex. Draw names from a hat.
II. Systematic Sampling: Compute k = N/n and select every kth name on the list. Ex. CAU population N = 1500 and sample size n = 150, so N/n = 1500/150 = 10; select every 10th student.

9 Basic Steps in Conducting a Norming Study (p. 432) Sampling cont.
III. Stratified Sampling: “Strata” means different layers. We use stratified sampling when we want to compare 2 different groups (e.g., male and female CAU doctoral students). First we randomly select males, then we randomly select females.

9 Basic Steps in Conducting a Norming Study (p. 432) Sampling cont.
IV. Cluster Sampling: We use cluster sampling when the population consists of units rather than individuals, such as classes. Ex. Miami Dade School District: if we want to conduct research with the Miami Dade 2nd graders (1,000 2nd-grade classes), we randomly select about 10 of these 1,000 classes to be in our sample, then conduct the research.
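The four sampling procedures can be sketched with Python's standard `random` module. Everything concrete here is an assumption for illustration: the student and class labels, the 700/800 split between strata, 75 draws per stratum, 20 pupils per class, and the random starting point used for systematic sampling.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

population = [f"student_{i}" for i in range(1, 1501)]  # N = 1500
n = 150

# I. Simple random sampling: every student has an equal chance.
simple = rng.sample(population, n)

# II. Systematic sampling: every kth name, k = N / n = 10,
# starting from a random position within the first k names (an assumption).
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k]

# III. Stratified sampling: sample separately within each stratum.
strata = {"male": population[:700], "female": population[700:]}  # hypothetical split
stratified = {group: rng.sample(members, 75) for group, members in strata.items()}

# IV. Cluster sampling: randomly pick whole classes, then use everyone in them.
classes = {c: [f"class{c}_pupil{j}" for j in range(20)] for c in range(1000)}
chosen_classes = rng.sample(sorted(classes), 10)
cluster = [pupil for c in chosen_classes for pupil in classes[c]]

print(len(simple), len(systematic), len(cluster))
```

Note that in cluster sampling the unit being randomized is the class, not the pupil, so the final number of individuals depends on class sizes.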

9 Basic Steps in Conducting a Norming Study (p. 432)
5. Estimate the minimum sample size (n) required to hold the sampling error within the specified limits. There are different statistical procedures for estimating n; n should be ≥ 30 (law of large numbers).
1. n = (σ/d)², where d = effect size, d = (M − µ)/σ
2. n = (σ/σm)², where σm = σ/√n is the standard error of the mean for a population (ex. z-score). Sm = S/√n is the estimated standard error of the mean for a sample (ex. t-distribution).
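Procedure 2 can be worked through numerically. The values σ = 15 and a tolerable standard error of 1.5 are hypothetical, and rounding up to the next whole examinee is an added convention, not something stated on the slide.

```python
import math

def min_sample_size(sigma, tolerable_se):
    """n = (sigma / sigma_m)^2, rounded up to a whole number of examinees."""
    return math.ceil((sigma / tolerable_se) ** 2)

# Hypothetical values: population SD of 15, tolerable standard error of 1.5.
n_needed = min_sample_size(sigma=15, tolerable_se=1.5)

# Check: with that n, sigma_m = sigma / sqrt(n) returns the tolerated value.
se_check = 15 / math.sqrt(n_needed)

print(n_needed, se_check)
```

Halving the tolerable standard error would quadruple the required n, since n grows with the square of σ/σm.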

The Effect Size. Ex. Independent-samples t-test (two groups)

9 Basic Steps in Conducting a Norming Study (p. 432)
6. Draw the sample and collect the data.
7. Compute the values of the group statistics of interest and their standard errors: Sm = S/√n or σm = σ/√n. Calculate the standard error of the mean, which estimates how far M is expected to fall from µ (the sampling error).
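For step 7, the estimated standard error Sm = S/√n can be computed directly from sample data. The ten scores below are invented for illustration.

```python
import math
import statistics

# Hypothetical raw scores from a norming sample of 10 examinees.
scores = [12, 15, 11, 18, 14, 16, 13, 17, 15, 19]

n = len(scores)
m = statistics.fmean(scores)    # sample mean M
s = statistics.stdev(scores)    # sample standard deviation S (n - 1 in the denominator)
sm = s / math.sqrt(n)           # estimated standard error of the mean, Sm = S / sqrt(n)

print(f"M = {m:.2f}, S = {s:.3f}, Sm = {sm:.3f}")
```

Because σ is unknown here, S stands in for it, which is the situation the t-distribution is used for.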

9 Basic Steps in Conducting a Norming Study (p. 432)
8. Identify the types of normative scores that will be needed, and prepare the normative score conversion table (see the next 2 slides).
9. Prepare written documentation of the normative scores.

NORMS AND STANDARD SCORES Types of Normative Scores
Raw Score: score on a subtest or a test.
Scaled Score: normative score for a specific age.

Normative Scores: Wechsler (pronounced “Wex-ler”)

*Normative Scores

NORMS AND STANDARD SCORES *Usefulness of Scaled Scores
Scaled scores are useful for two purposes:
1. Scaled scores relate the examinee's performance to the percentile rank scores of the norm group and their grade level.
2. In evaluation and research, the mean scaled score is a better estimate of average group performance than the mean raw score.
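A normative score conversion table can be sketched as a mapping from each raw score to a scaled score and a percentile rank. Several choices here are assumptions rather than anything stated in the slides: the norm-group scores are invented, the scaled-score metric uses mean 10 and SD 3, and the percentile rank counts half of the tied scores (one of several definitions in use).

```python
import statistics

def percentile_rank(score, norm_scores):
    """Percent of the norm group below the score, counting half of the ties."""
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100 * (below + 0.5 * equal) / len(norm_scores)

def scaled_score(raw, norm_scores, scale_mean=10, scale_sd=3):
    """Linear rescaling of a raw score; mean 10 and SD 3 are assumed values."""
    z = (raw - statistics.fmean(norm_scores)) / statistics.pstdev(norm_scores)
    return round(scale_mean + scale_sd * z)

# Hypothetical raw scores from one age group in the norming sample.
norm_group = [20, 22, 25, 25, 27, 28, 30, 31, 33, 35]

# Conversion table: raw score -> (scaled score, percentile rank).
conversion_table = {
    raw: (scaled_score(raw, norm_group), percentile_rank(raw, norm_group))
    for raw in sorted(set(norm_group))
}

for raw, (scaled, pr) in conversion_table.items():
    print(raw, scaled, pr)
```

Looking up an examinee's raw score in such a table gives their standing relative to the norm group without recomputing anything.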