Starting a Performance Based Testing Program: From Inception to Delivery

Similar presentations
The Network of Dynamic Learning Communities C 107 F N Increasing Rigor February 5, 2011.

Chapter 2 Analyzing the Business Case.
BSBIMN501A QUEENSLAND INTERNATIONAL BUSINESS ACADEMY.
You’ve Been Shown: Now It’s Your Turn to Ask! CLEAR 2004Kansas City, Missouri Responding to Your Questions About Testing.
Some Practical Steps to Test Construction
CAP 252 Lecture Topic: Requirement Analysis Class Exercise: Use Cases.
Analyzing the Business Case
Instructional Design Dr. Lam TECM 5180.
La Naturaleza.  The generic term for the five-phase instructional design model consisting of Analysis, Design, Development, Implementation, and Evaluation.
Student Learning Objectives: Setting Goals for Student Growth Countywide Professional Development Day Thursday, April 25, 2013.
Instructional Elements Key Components to the Lesson.
NEXT GENERATION BALANCED ASSESSMENT SYSTEMS ALIGNED TO THE CCSS Stanley Rabinowitz, Ph.D. WestEd CORE Summer Design Institute June 19,
BTS730 Communications Management Chapter 10, Information Technology Management, 5ed.
Edward A. Shafer, Director, CTE Technical Assistance Center of New York,
Student Learning Objectives: Setting Goals for Student Growth Countywide Professional Development Day Thursday, April 25, 2013 This presentation contains.
WELNS 670: Wellness Research Design Chapter 5: Planning Your Research Design.
Sir Furqan Ul Haque, The City School Gushan Boys Campus, Senior Section, Principles of Accoun.
Project Life Cycle.
1 Technical & Business Writing (ENG-315) Muhammad Bilal Bashir UIIT, Rawalpindi.
User Management: Understanding Roles and Permissions for Schoolnet Schoolnet II Training – Summer 2014.
Formative and Summative Assessment in the Classroom
Critical Thinking Lesson 8
Classroom Assessment (1) EDU 330: Educational Psychology Daniel Moos.
NEFIS (WP5) Evaluation Meeting, November 2004 Evaluation Metadata Aljoscha Requardt, University of Hamburg Response rate: 93% (14 of 15 partners.
Program Evaluation Making sure instruction works..
Session 6: Data Flow, Data Management, and Data Quality.
California Assessment of Student Performance and Progress CAASPP Insert Your School Logo.
P3 Business Analysis. 2 Section F: Project Management F1.The nature of projects F2. Building the Business Case F4. Planning,monitoring and controlling.
Shared Services Initiative Summary of Findings and Next Steps.
Good Morning and welcome. Thank you for attending this meeting to discuss assessment of learning, pupil progress and end of year school reports.
Designing Quality Assessment and Rubrics
NZQA Update Registration. Topics The new structure What do the levels mean Proposed entry requirements Levels and Prescriptions Transition timeline and.
Good teaching for diverse learners
APICS Certification and Endorsement Comparison Chart Designation Name
Quality Assurance processes
Helping Students Examine Their Reasoning
3 Chapter Needs Assessment.
SWIMMING IN THE DATA LAKE
The Managerial Process of Crafting and Executing Strategy
Working in Groups in Canvas
Human-Machines Systems Engineering
The importance of project management
SAMPLE Develop a Comprehensive Competency Framework
IT Project Management Version IT Industry Apprenticeship System
Intermediate Small Business Programs, Part B SBP 202 Lesson 1: Introduction February 2017 Lesson 1: Introduction.
Discover the Secrets of ITSM Licensing
APICS Certification and Endorsement Comparison Chart Designation Name
Personal Management Skills
Advance Candidacy December 19, 2017 Follow us on
By Kean Tak, MSc, Lecturer at RUPP
ECONOMETRICS ii – spring 2018
A+ certification 2015 Guidelines.
High-Leverage Practices in Special Education: Assessment ceedar.org
Counseling with Depth of Knowledge
APICS Certification and Endorsement Comparison Chart Designation Name
Integrating Outcomes Learning Community Call February 8, 2012
Elicit Written Responses to an online text
Analyze Student Work Sample 2 Instructional Next Steps
Chapter 21: Completing the Research Process
A Model for Successful Instructional Design Copyright © Media
Studio day : Monday February 4
MAP-IT: A Model for Implementing Healthy People 2020
Project Based Learning
Existing Franchisor Example
Portfolio Information PPT
Project Management.
Time Scheduling and Project management
CLASS KeysTM Module 6: Informal Observations Spring 2010
Microsoft MB-330 Microsoft Dynamics 365 Unified Operations Core.
Presentation transcript:

Starting a Performance Based Testing Program: From Inception to Delivery

We are a non-profit trade association advancing the global interests of IT professionals and companies. Certification: we are the leading provider of technology-neutral and vendor-neutral IT certifications, with 21 certification programs in the IT space, 4 of which include performance-based items.

Why Performance Based Items? Our customers are demanding it. We have the resources and bandwidth to do it. It differentiates us in the marketplace. “It looks cool.” Face validity: our tests look like they measure what they are supposed to measure. More reasons to come later…

The “Why” for CompTIA. Testing reasons: performance-based items more appropriately validate the skills and knowledge areas on our certifications, and the higher-level objectives that are critical to the certification job role are challenging to test with text-based items. Business reasons: align with real-world job situations, keep up with IT industry demands, leverage new and current technology, and stay aligned with our organization’s mission statement of innovation.

The “Why NOT” for CompTIA. Development: an extended development timeline, a higher number of revisions, and more fixed forms for beta testing. Delivery: channel upgrades and translations. Analysis: custom data tools and longer, more complicated analysis due to the data capture. The purpose of using innovative items should be very clear, and you should ask what the ROI is compared to text-based items. Your ROI might be comparable if the PBT is successful, but if it is not planned well, it will be a budget buster and result in extended project schedules.

Where do we begin? The item lifecycle runs from item idea to 1st draft review, vendor development, 2nd draft and timing review, beta test and feedback, and finally the live test.

Item Process: Item Idea. Select appropriate objectives based on taxonomy and weighting. Brainstorm and develop ideas as a group. Propose the idea to the entire group. Revise the idea and its incorrect paths. Propose the new or adjusted idea to the entire group.

Item Specification Sheet

Item Details. What is the rationale behind the item? What should the initial start point of the item look like? Are there any special environment instructions? Is anything different from real life? Can multiple objectives be referenced? Should the instructions be embedded in the graphic or just in the stem?

Item Details – spec sheet

Correct/Incorrect Details. Correct paths: map out the multiple correct paths (in order, if order matters). If you find you have too many correct paths, add more criteria or requirements to the stem; you do not want to make the item too large. Incorrect paths: should a distracter path appear as if it were a correct path? How many distracter paths are necessary for the item?
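
One way to make the path-mapping step concrete is to record the spec-sheet decisions as data before any development work starts. The sketch below only illustrates that idea; the step names and structure are hypothetical and are not CompTIA's actual spec format.

# A hypothetical way to capture correct and distracter paths from the spec
# sheet as data. Each correct path is an ordered list of candidate actions,
# with a flag for whether the order matters; distracter paths are plausible
# but wrong. All step names are made up for illustration.
ITEM_PATHS = {
    "correct_paths": [
        {"steps": ["open_router_console", "set_default_gateway", "save_config"],
         "ordered": True},
        {"steps": ["open_router_gui", "set_default_gateway", "save_config"],
         "ordered": False},
    ],
    "distracter_paths": [
        # Looks like a correct path but targets the wrong device.
        {"steps": ["open_switch_console", "set_default_gateway"]},
        # A common misconception path that should be captured, not scored.
        {"steps": ["reboot_router"]},
    ],
}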

Item Details – spec sheet

Item Details – spec sheet

Item Details - Scoring How many points are appropriate? Does each point contain several steps? If so, how are you tracking these steps? Do you have different levels of score points? Should you take away points for certain tasks?
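
To make these scoring questions concrete, here is a minimal partial-credit scorer: each scoring opportunity is worth a fixed number of points and is earned only if all of its required steps were performed. The opportunities, step names, and point values are hypothetical, and real scoring rules (including any point deductions) would be defined on the spec sheet.

# Hypothetical partial-credit scorer for a performance-based item.
# Each scoring opportunity carries a point value and a set of required steps;
# each opportunity is all-or-nothing, but the item as a whole earns partial credit.
SCORING_OPPORTUNITIES = {
    "S1": {"points": 1, "required_steps": {"open_console", "set_ip_address"}},
    "S2": {"points": 2, "required_steps": {"create_vlan", "assign_ports"}},
    "S3": {"points": 2, "required_steps": {"save_config"}},
}

def score_item(candidate_steps):
    """Return the total points earned across all scoring opportunities."""
    earned = 0
    for opportunity in SCORING_OPPORTUNITIES.values():
        if opportunity["required_steps"] <= candidate_steps:  # every required step done
            earned += opportunity["points"]
    return earned

# Example: the candidate finished S1 and S3 but only half of S2, so 3 of 5 points.
print(score_item({"open_console", "set_ip_address", "create_vlan", "save_config"}))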

Item Details – spec sheet

Item Details – spec sheet

Process: Review Phases. Core SMEs review the items with no directions, in order to confirm the candidate’s experience and the scoring. Compile notes on what works and on areas for improvement. Review all expert feedback and implement the final feedback with the development vendor. Time and responses are recorded on the second draft in order to seed the item.

Process: Beta Test Feedback. Collect candidate comments and statistics. Discuss findings with the core SME group and the psychometrician. Revise the item if needed. Roll it out to live test forms as a scored item.

Process: Live Test. Balance scored and unscored simulations across all forms using time data and estimates. Balance points and objective areas. Weigh fixed unscored simulations against unfixed unscored simulations. Performance-based items that perform acceptably can become multiple versions.

Response Data. A sample response string:

S1HostA|Router[PC[0.0.0.0; S2HostB|Switch3[Switch4[Switch5[Switch6[1.2.3.4; S3HostC|Printer[Virus[Malware[2.2.2.2; ~S1|1.5[1; S2|3.2[3; S3|1.5[1; ~Score|5

Delimiters are critical to analyzing a response and to tweaking items where necessary; make sure you choose delimiters that will not otherwise be used during the exam. In this format, | separates a key and its value (a meaningful name and the response data), [ separates values, ; separates one full response string from the next, and ~ separates the scoring from the responses (the child scores from the parent score).

When planning response data, here is what you need to ask yourself: What data points do you want to capture? Do you have enough response data to analyze what candidates are doing during the beta process? Are you sure that you did not miss any possible correct paths? Do all the distracter paths function, or are some paths not being used? Is the scoring working as expected? Is the item taking too long, or is it too difficult?
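
As a quick illustration of how those delimiters make the captured data machine-readable, the sketch below parses the sample string above into responses, child scores, and the parent score. It is a minimal sketch based only on the delimiter rules stated here; the meaning of individual fields (for example, whether "1.5" is a time or a weight) is not something this slide specifies.

# Parse one candidate's captured response string using the stated delimiters:
# '|' separates a key from its value, '[' separates values, ';' separates
# response strings, and '~' separates responses from child and parent scores.
RAW = ("S1HostA|Router[PC[0.0.0.0; "
       "S2HostB|Switch3[Switch4[Switch5[Switch6[1.2.3.4; "
       "S3HostC|Printer[Virus[Malware[2.2.2.2; "
       "~S1|1.5[1; S2|3.2[3; S3|1.5[1; "
       "~Score|5")

def parse_section(section):
    """Split a section on ';' into records, and each record on '|' into a key
    and a '['-separated list of values."""
    records = {}
    for chunk in section.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        key, _, value = chunk.partition("|")
        records[key] = value.split("[")
    return records

responses_part, child_part, parent_part = RAW.split("~")
responses = parse_section(responses_part)     # what the candidate actually did
child_scores = parse_section(child_part)      # per-scoring-opportunity data
parent_score = parse_section(parent_part)     # overall item score

print(responses["S1HostA"])   # ['Router', 'PC', '0.0.0.0']
print(child_scores["S2"])     # ['3.2', '3']
print(parent_score["Score"])  # ['5']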

Psychometric Considerations. Why include simulations? Validity, cognitive complexity, and mitigating security issues. On the other hand, they can take a lot of valuable time, and they can be good or bad, just like multiple-choice items. They need to be analyzed and evaluated, which means the proper data need to be collected: use a consistent data format, and use delimiters that can’t otherwise be used.

Psychometric Considerations 2 observations… Sims are more difficult… and more stable

Sample SIM Analysis: parent/child overall points. Possible points: 5. Scoring opportunities: 3 (worth 1, 2, and 2 points). Average item score: 1.81. Item-score correlation: .46. Median response time: 355 seconds. But are they measuring the right people and skills?
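
For readers who want to reproduce statistics like these on their own beta data, a bare-bones version of the calculation is sketched below. The candidate data are invented, and the "item-score correlation" here is a plain Pearson correlation between sim points and total exam score; the actual analysis tooling and any corrections CompTIA applies are not described on this slide.

# Classical item statistics for a 5-point sim, computed from made-up beta data.
from statistics import correlation, mean  # statistics.correlation needs Python 3.10+

MAX_POINTS = 5

# Hypothetical (sim points earned, total exam score) pairs from a beta population.
beta = [(0, 41), (1, 55), (2, 63), (5, 88), (3, 72), (0, 47), (2, 69), (4, 81)]

sim_points = [p for p, _ in beta]
exam_totals = [t for _, t in beta]

average_item_score = mean(sim_points)            # analogous to the 1.81 reported above
p_value = average_item_score / MAX_POINTS        # difficulty as proportion of points earned
item_score_corr = correlation(sim_points, exam_totals)  # analogous to the .46 reported above

print(f"average item score: {average_item_score:.2f}")
print(f"p-value: {p_value:.2f}")
print(f"item-score correlation: {item_score_corr:.2f}")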

Sample SIM Analysis, by scoring opportunity: ScoreOpp1 has a p-value of 0.438, a correlation of 0.475, and an average time of 147 seconds; ScoreOpp2 has 0.384, 0.314, and 159 seconds; ScoreOpp3 has 0.301, 0.293, and 129 seconds. A second table breaks ScoreOpp1 down by response option (S2Server|A through S2Server|null), showing each option’s p-value, correlation, average time, and counts of candidates within each total-score band (12 to 49, 50 to 60, 61 to 68, 69 to 74, and 75 to 92); the keyed option, S2Server|H2, shows the highest p-value (0.438) and a strong positive correlation (0.475), while the distracters show low or negative correlations. Scoring opportunities 2 and 3 have similar analyses.

Sample SIM Analysis for SIM21, by points earned (0 through 5): the table lists each point value’s p-value, correlation, average time, and counts by total-score band (12 to 49, 50 to 60, 61 to 68, 69 to 74, and 75 to 92). For example, zero points has a p-value of 0.428, a correlation of -0.200, and an average time of 328 seconds, while five points has a p-value of 0.120, a correlation of 0.253, and an average time of 338 seconds. A companion table reports the same statistics for each captured response string (e.g., "<response status=""…"), and so on for three dozen more rows.
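
The score-band columns in these tables are straightforward to generate once the response data have been parsed. Below is a rough sketch of that roll-up; the band edges match the column headings above, but the candidate records and option names are invented for illustration.

# Count, for each captured response option, how many candidates in each
# total-score band chose it. Bands mirror the table's column headings;
# the records themselves are hypothetical.
from collections import defaultdict

BANDS = [(12, 49), (50, 60), (61, 68), (69, 74), (75, 92)]

# Hypothetical (option chosen, candidate's total exam score) pairs.
records = [("S2Server|H2", 78), ("S2Server|H1", 52), ("S2Server|H2", 81),
           ("S2Server|null", 44), ("S2Server|H2", 66), ("S2Server|H1", 58)]

band_counts = defaultdict(lambda: [0] * len(BANDS))
for option, total in records:
    for i, (low, high) in enumerate(BANDS):
        if low <= total <= high:
            band_counts[option][i] += 1
            break

for option, counts in band_counts.items():
    print(option, counts)
# e.g. S2Server|H2 [0, 0, 1, 0, 2] -> chosen mostly by candidates in the upper score bands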

Lessons Learned. Plan response data during the initial draft phase (i.e., correct and incorrect response samples). Communicate with all vendors and stakeholders before starting your performance-based items. Realize that this is a lengthy, time-consuming process. Do NOT make multiple versions of an item until it is in a final, acceptable state. Revisit scoring: all-or-nothing vs. partial credit. Translations create new challenges.

Questions?