1
Developing a CAT
2
Holistic view?
Pretty much every presentation at IACAT assumes one of two situations:
- Fundamental research on CAT methodology
- Large org already doing CAT
What about the other 99%? (image of 99% protest)
3
Holistic view?
Development of a CAT assessment requires a holistic view:
- Project management
- Financials
- Content development
- Launch, marketing, branding
- Integrations
4
Holistic view?
- If we are going to get CAT used more often, the whole package needs to be sold to stakeholders
- There’s no shortage of resources on technical aspects like IRT
5
Holistic view?
First step: planning the process
- Define the goal and work backward to the needs (no surprise)
- Evaluate feasibility and cost
6
5 step model
- Framework, not a complete recipe
- Identify choices for your org and the best way to investigate/decide
- Quality mgmt and project mgmt
- Also the foundation for validity arguments: why did you choose certain things?
7
The 5 step model
Seq. | Stage | Primary work
1 | Feasibility, applicability, and planning studies | Monte Carlo simulation; business case evaluation
2 | Develop item bank content or utilize existing bank | Item writing and review
3 | Pretest and calibrate item bank | Pretesting; item analysis
4 | Determine specifications for final CAT | Post-hoc or hybrid simulations
5 | Publish live CAT | Publishing and distribution; software development
8
CAT Components
Test development side
1. Calibrated item bank
2. Starting rule
3. Item selection rule
4. Scoring rule
5. Stopping rule
We must provide validity documentation on each
Algorithms inside your testing engine
9
1. Feasibility, applicability, planning
Big question: is CAT worth the investment? If so, how can we develop a project plan and timeline?
10
1. Feasibility, applicability, planning
First answer: simulations to evaluate psychometric feasibility
Simulate how a CAT would operate under specified conditions
Independent variables (IVs):
- Item bank size
- Item quality
- Desired precision
Dependent variables (DVs):
- Average test length
- Accuracy: CAT θ vs. true θ (or full bank)
11
1. Feasibility, applicability, planning
For those newer to CAT… Three types of simulations:
- Monte Carlo
- Post hoc (real data)
- Hybrid
12
1. Feasibility, applicability, planning
At this point, real data not likely, so Monte Carlo
Generate plausible situations:
- Item bank: 100, 200, 300…
- Item quality: a = 0.7, 0.8…; spread of b
- Desired precision: SEM = 0.2, 0.3, 0.4…
Compare results to each other and to fixed forms
Base values on reality (e.g., mean a)
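A Monte Carlo feasibility study along these lines can be sketched in a few dozen lines. This is a minimal illustration, not a production simulator: it assumes a 2PL model, a starting rule of θ = 0, maximum-information item selection, EAP scoring on a grid with a standard normal prior, and a stopping rule of "target SEM reached or bank exhausted". The function names, the spread of a (SD 0.2), and the range of b (uniform from -3 to 3) are all illustrative assumptions, not values from the presentation.

```python
import math
import random

# Quadrature grid for theta, from -4 to +4 in steps of 0.1.
GRID = [-4.0 + 0.1 * k for k in range(81)]

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_info(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def posterior_mean_sd(log_post):
    """EAP estimate and posterior SD (used as the SEM) from grid log-weights."""
    top = max(log_post)
    w = [math.exp(lp - top) for lp in log_post]
    total = sum(w)
    mean = sum(wi * t for wi, t in zip(w, GRID)) / total
    var = sum(wi * (t - mean) ** 2 for wi, t in zip(w, GRID)) / total
    return mean, math.sqrt(var)

def simulate_cat(true_theta, bank, target_sem, rng):
    """Run one simulated examinee; return (theta_hat, sem, test_length)."""
    log_post = [-0.5 * t * t for t in GRID]    # standard normal prior
    remaining = list(bank)
    theta_hat, sem, n = 0.0, 1.0, 0            # starting rule: theta = 0
    while remaining and sem > target_sem:      # stop at target SEM or empty bank
        # Item selection rule: maximum information at the current estimate.
        a, b = max(remaining, key=lambda it: item_info(theta_hat, it[0], it[1]))
        remaining.remove((a, b))
        u = rng.random() < p_correct(true_theta, a, b)   # simulated response
        for k, t in enumerate(GRID):
            p = p_correct(t, a, b)
            log_post[k] += math.log(p if u else 1.0 - p)
        theta_hat, sem = posterior_mean_sd(log_post)     # scoring rule: EAP
        n += 1
    return theta_hat, sem, n

def run_condition(bank_size, mean_a, target_sem, n_examinees=200, seed=1):
    """Return (mean test length, RMSE of theta_hat vs. true theta)."""
    rng = random.Random(seed)
    # Hypothetical bank: a ~ N(mean_a, 0.2) floored at 0.3, b ~ U(-3, 3).
    bank = [(max(0.3, rng.gauss(mean_a, 0.2)), rng.uniform(-3.0, 3.0))
            for _ in range(bank_size)]
    lengths, sq_err = [], []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)
        est, _, n = simulate_cat(theta, bank, target_sem, rng)
        lengths.append(n)
        sq_err.append((est - theta) ** 2)
    return sum(lengths) / n_examinees, math.sqrt(sum(sq_err) / n_examinees)
```

Looping `run_condition` over bank sizes (100, 200, 300) and target SEMs (0.3, 0.4) yields one row per condition, which can then be compared against the current fixed form.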
13
1. Feasibility, applicability, planning
Think of the results table you want to see
Bank size | Target SEM | Mean test length | Mean SEM
(current test) | - | 100 | 0.32
200 | 0.30 | ? |
 | 0.40 | |
300 | | |
14
1. Feasibility, applicability, planning
Think of the results table you want to see
Bank size | Target SEM | Mean test length | Mean SEM
(current test) | - | 100 | 0.32
200 | 0.30 | 47 |
 | 0.40 | 33 |
300 | | 44 |
 | | 32 |
15
1. Feasibility, applicability, planning
Big question: Business Case Evaluation
- Costs
- Benefits
Simulations focus on psychometric/content costs and benefits
You must be thoughtful on the rest
16
1. Feasibility, applicability, planning
Benefits:
- Sales
- Seat time
- Security
- Time away
Some of these are harder to put in dollars and cents than others.
Time away = time away from work or instruction (K12)
Seat time: $20/hour × 100,000 examinees is $2M for each hour of seat time saved
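The seat-time figure is simple arithmetic, but writing it as a function makes it easy to rerun with your own numbers. The sketch below assumes that "100,000" refers to examinees and that CAT saves one hour per examinee, which is what the $2M total implies; the function and parameter names are illustrative.

```python
def seat_time_savings(examinees, hours_saved_each, cost_per_hour):
    """Dollar value of reduced seat time across all examinees."""
    return examinees * hours_saved_each * cost_per_hour

# 100,000 examinees, 1 hour saved each (assumed), $20 per hour.
print(seat_time_savings(100_000, 1.0, 20.0))  # 2000000.0
```

A half-hour saving under the same assumptions would be worth half as much, so the estimate is only as good as the test-length reduction that the simulations predict.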
17
1. Feasibility, applicability, planning
Costs:
- Sales
- Support
- Dev
- Tech
Tech = hardware, devops, and related issues. Mention that Silicon Valley just had an episode complaining about AWS fees. (image?)
18
2. Develop item bank
Now that we have an idea of what we need, we need to build it
CAT-based considerations:
- Difficulty spread
- Anticipated exposure/security issues
- TIF adequacy
Normal considerations:
- Content blueprints
- Cognitive level
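One way to check TIF adequacy is to sum item information across the bank at several θ values and compare it with the information a target SEM requires, using SEM = 1/√I. The sketch below assumes a 2PL model and a bank stored as (a, b) pairs; the function names are illustrative. Note that this is whole-bank information, so it is an upper bound on what any single CAT administration can reach.

```python
import math

def item_info(theta, a, b):
    """Fisher information of a 2PL item at theta (assumed model)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def bank_information(bank, thetas):
    """Test information function of the whole bank at each theta."""
    return {t: sum(item_info(t, a, b) for a, b in bank) for t in thetas}

def weak_regions(bank, thetas, target_sem):
    """Thetas where even the full bank cannot reach the target SEM."""
    needed = 1.0 / target_sem ** 2          # SEM = 1 / sqrt(information)
    tif = bank_information(bank, thetas)
    return [t for t in thetas if tif[t] < needed]
```

Any θ returned by `weak_regions` marks a part of the scale where more items (or more discriminating ones) would need to be written.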
19
3. Pretesting and analysis
Must pretest items to obtain bank calibration
Two situations:
- New test, new scale: present large amounts of items to examinees
- Existing test, old scale: seed items
Obviously will take a longer time to pilot
Requires a linking study
20
3. Pretesting and analysis
Then calibrate, usually IRT
Also perform other due diligence:
- Dimensionality
- DIF
- Model fit
- Distractor analysis
- Remove/revise items based on stats?
- Etc.
21
4. Determine final specifications
To publish a CAT, we need to specify algorithms:
- Starting point
- Item selection
- Scoring
- Termination criterion
Also subalgorithms, such as item exposure, content, test length constraints
22
4. Determine final specifications
But we must have a reason for selecting specifications:
- Validity documentation
- Defensibility
Again, we turn to simulation studies
- Define competing conditions
- Big difference now: we have real data!
- Post hoc or hybrid simulations
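A post hoc simulation differs from a Monte Carlo one in a single place: the response to each selected item is looked up in the examinee's recorded data instead of being generated. The sketch below shows that for one examinee, assuming a 2PL calibration, maximum-information selection, and EAP scoring; the data layout (dicts keyed by item id) and all names are illustrative. A hybrid simulation would additionally generate a response whenever the selected item was never administered to that examinee, which this sketch does not do.

```python
import math

GRID = [-4.0 + 0.1 * k for k in range(81)]   # quadrature points for theta

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info_at(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def post_hoc_cat(responses, params, target_sem, max_items):
    """Replay a CAT for one examinee from recorded responses.

    responses: {item_id: 0 or 1} actually observed for this examinee
    params:    {item_id: (a, b)} from the calibration
    Returns (theta_hat, sem, list of item ids in administration order).
    """
    log_post = [-0.5 * t * t for t in GRID]          # standard normal prior
    available = [i for i in responses if i in params]
    theta_hat, sem, used = 0.0, 1.0, []
    while available and len(used) < max_items and sem > target_sem:
        # Select the most informative item this examinee actually answered.
        item = max(available, key=lambda i: info_at(theta_hat, *params[i]))
        available.remove(item)
        a, b = params[item]
        u = responses[item]                           # real, not simulated
        for k, t in enumerate(GRID):
            p = p_correct(t, a, b)
            log_post[k] += math.log(p if u else 1.0 - p)
        top = max(log_post)
        w = [math.exp(lp - top) for lp in log_post]
        total = sum(w)
        theta_hat = sum(wi * t for wi, t in zip(w, GRID)) / total
        sem = math.sqrt(sum(wi * (t - theta_hat) ** 2
                            for wi, t in zip(w, GRID)) / total)
        used.append(item)
    return theta_hat, sem, used
```

Running this over every examinee under each competing specification (for example, two termination criteria) gives the test lengths and score agreement needed to justify the final choice.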
23
4. Determine final specifications
After determining psychometric specifications, evaluate more practical issues
For example, time limits; can’t really set until you know how many items
CAT-ASVAB approach: set limits for 90-95% of population
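Setting a limit that covers 90-95% of the population amounts to taking a high percentile of completion times, whether those come from pilot data or from simulated test lengths multiplied by a time per item. The sketch below uses a nearest-rank percentile; the function name and the choice of percentile method are assumptions for illustration, not the CAT-ASVAB procedure itself.

```python
import math

def time_limit(completion_minutes, coverage=0.95):
    """Smallest observed time within which at least `coverage` of examinees finished."""
    if not completion_minutes:
        raise ValueError("need at least one completion time")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    ordered = sorted(completion_minutes)
    # Nearest-rank percentile; round() guards against float noise before ceil().
    rank = math.ceil(round(coverage * len(ordered), 9))
    return ordered[rank - 1]
```

For example, with completion times of 1 through 100 minutes, a coverage of 0.95 returns the 95th value in sorted order.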
24
5. Publish live CAT
Once you have finalized your item bank and CAT design, it is time to publish
Need to put everything into the item banker and CAT engine
First: obtain the item banker and CAT engine
- If developing your own, this can be the biggest step
- If purchasing, this is the easiest step
25
Epilogue: Maintaining CAT
Like fixed form testing, maintenance is usually necessary
Check that the CAT is performing as expected:
- Is the termination criterion being satisfied?
- Are examinees hitting test length or other constraints?
- Is the average test length what you expected?
- Are there exposure or security issues?
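These checks can be run routinely against the engine's session logs. The sketch below assumes each session record holds the administered item ids and the final SEM; that record layout, the function name, and the exposure cap of 25% are hypothetical and would need to match whatever your CAT engine actually exports.

```python
from collections import Counter

def cat_health(sessions, target_sem, max_items, exposure_cap=0.25):
    """Summarize live CAT behaviour from session records.

    sessions: list of {"items": [item ids], "final_sem": float} (assumed layout)
    """
    n = len(sessions)
    if n == 0:
        raise ValueError("no sessions to summarize")
    met = sum(1 for s in sessions if s["final_sem"] <= target_sem)
    hit_max = sum(1 for s in sessions if len(s["items"]) >= max_items)
    # Exposure rate = share of sessions in which an item appeared.
    counts = Counter(i for s in sessions for i in set(s["items"]))
    over = sorted(i for i, c in counts.items() if c / n > exposure_cap)
    return {
        "pct_met_sem": met / n,                  # termination criterion satisfied?
        "pct_hit_max_length": hit_max / n,       # hitting the length constraint?
        "mean_length": sum(len(s["items"]) for s in sessions) / n,
        "overexposed_items": over,               # exposure or security issues?
    }
```

Comparing `mean_length` and `pct_met_sem` with the values predicted by the earlier simulations is a quick way to see whether the live test behaves as designed.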
26
Back to the big picture The 5 step model focuses on the test itself
That is but one component
27
Example Launch: K12
National assessment for medium-sized country
- Teacher buy-in
- Proctor training
- Bandwidth and load testing: worst case?
- PR and stakeholder education
Any of your experience?
28
Example Launch: Employment
Suite of employment tests
- Train your sales/support staff
- Load testing
- Proctor or other training?
- Integration with ATS or virtual proctoring?
29
Example Launch: Credentialing
National certification/licensure
- Rollout with vendor
- QA on their engine
- Stakeholder education
- Sales of a sort
30
Thank you! See PARE, Volume 16, #1