Lessons learned in assessment

Lessons learned in assessment: History, Research and Practical Implications. Cees van der Vleuten, Maastricht University. MHPE, Unit 1, 3 June 2010. Powerpoint at: www.fdg.unimaas.nl/educ/cees/mhpe

Medical Education Anno 2008: a steep explosion of knowledge in medical education; 10 international journals and many national ones; 2 large international conferences, 2 large regional conferences and many national ones; a mix of practice, research and theory; many training programmes, including masters and PhDs; groups of professionals appointed in medical schools. A flourishing community of practice!

Overview of presentation Where is education going? Lessons learned in assessment Areas of development and research

Where is education going? School-based learning: discipline-based curricula, (systems-)integrated curricula, problem-based curricula, outcome/competency-based curricula.

Where is education going? Underlying educational principles: continuous learning of, or practicing with, authentic tasks (in steps of complexity, with constant attention to transfer); integration of cognitive, behavioural and affective skills; active, self-directed learning, in collaboration with others; fostering domain-independent skills and competencies (e.g. team work, communication, presentation, science orientation, leadership, professional behaviour…). These principles draw on constructivism, cognitive psychology, collaborative learning theory, cognitive load theory and empirical evidence.

Where is education going? Work-based learning: practice, practice, practice…. Optimising learning by: more reflective practice; more structure in the haphazard learning process; more feedback, monitoring, guiding, reflection and role modelling; fostering of a learning culture or climate; fostering of domain-independent skills (professional behaviour, team skills, etc.). These ideas draw on deliberate practice theory, emerging work-based learning theories and empirical evidence.

Where is education going? Educational reform is on the agenda everywhere. Education is professionalizing rapidly. A lot of ‘educational technology’ is available. How about assessment?

Overview of presentation Where is education going? Lessons learned in assessment Areas of development and research

Miller’s pyramid of competence (Knows, Knows how, Shows how, Does): lessons learned while climbing this pyramid with assessment technology. Miller GE. The assessment of clinical skills/competence/performance. Academic Medicine (Supplement) 1990; 65: S63-S67.

Assessing knowing how. 1960s: written complex simulations (Patient Management Problems, PMPs).

Key findings written simulations (Van der Vleuten, 1995) Performance on one problem hardly predicted performance on another High correlations with simple MCQs Experts performed less well than intermediate experts Stimulus format more important than the response format

Assessing knowing how: specific lessons learned. Simple short scenario-based formats work best (Case & Swanson, 2002). Validity is a matter of good quality assurance around item construction (Verhoeven et al., 1999). Generally, medical schools can do a much better job (Jozefowicz et al., 2002). Sharing of (good) test material across institutions is a smart strategy (Van der Vleuten et al., 2004).

Moving from assessing knows:
What is arterial blood gas analysis most likely to show in patients with cardiogenic shock?
A. Hypoxemia with normal pH
B. Metabolic acidosis
C. Metabolic alkalosis
D. Respiratory acidosis
E. Respiratory alkalosis

To assessing knowing how:
A 74-year-old woman is brought to the emergency department because of crushing chest pain. She is restless, confused, and diaphoretic. On admission, temperature is 36.7 C, blood pressure is 148/78 mm Hg, pulse is 90/min, and respirations are 24/min. During the next hour, she becomes increasingly stuporous, blood pressure decreases to 80/40 mm Hg, pulse increases to 120/min, and respirations increase to 40/min. Her skin is cool and clammy. An ECG shows sinus rhythm and 4 mm of ST-segment elevation in leads V2 through V6. Arterial blood gas analysis is most likely to show:
A. Hypoxemia with normal pH
B. Metabolic acidosis
C. Metabolic alkalosis
D. Respiratory acidosis
E. Respiratory alkalosis

http://www.nbme.org/publications/item-writing-manual.html

Maastricht item review process (flow diagram): items drafted by the disciplines (anatomy, physiology, internal medicine, surgery, psychology) enter an item pool; a review committee performs a pre-test review before test administration; item analyses and student comments feed a post-test review; reviewed items go into the item bank and information is reported back to users.
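As an illustration of the kind of item analyses that feed such a post-test review, here is a minimal sketch in Python (not part of the original presentation; the response data are invented and the function name is arbitrary). It computes classical difficulty (proportion correct) and corrected item-total discrimination for a 0/1 scored response matrix:

import numpy as np

def item_analysis(responses):
    # rows = examinees, columns = items, entries scored 0/1
    x = np.asarray(responses, dtype=float)
    n_examinees, n_items = x.shape
    difficulty = x.mean(axis=0)            # proportion correct per item
    total = x.sum(axis=1)                  # total score per examinee
    discrimination = np.empty(n_items)
    for i in range(n_items):
        rest = total - x[:, i]             # total score with the item itself removed
        discrimination[i] = np.corrcoef(x[:, i], rest)[0, 1]
    return difficulty, discrimination

# invented toy data: 6 examinees answering 4 items
scores = [[1, 1, 0, 1],
          [1, 0, 0, 1],
          [1, 1, 1, 1],
          [0, 0, 0, 1],
          [1, 1, 0, 0],
          [0, 1, 0, 1]]
p, r_it = item_analysis(scores)
print("difficulty:     ", np.round(p, 2))
print("discrimination: ", np.round(r_it, 2))

Items with very low or negative discrimination would typically be flagged for the review committee, alongside the student comments.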

Assessing knowing how: general lessons learned. Competence is specific, not generic. Assessment is as good as you are prepared to put into it.

Assessing showing how. 1970s: performance assessment in vitro (OSCE).

Key findings around OSCEs1: performance on one station poorly predicted performance on another (many OSCEs are unreliable); validity depends on the fidelity of the simulation (many OSCEs test fragmented skills in isolation); global rating scales do well (improved discrimination across expertise groups, better inter-case reliabilities; Hodges, 2003); OSCEs impacted on the learning of students. 1Van der Vleuten & Swanson, 1990

Reliabilities across methods

Testing time in hours            1      2      4      8
MCQ (1)                        0.62   0.76   0.93     –
Case-based short essay (2)     0.68   0.73   0.84   0.82
PMP (1)                        0.36   0.53   0.69   0.82
Oral exam (3)                  0.50   0.69   0.82   0.90
Long case (4)                  0.60   0.75   0.86   0.90
OSCE (5)                       0.47   0.64   0.78   0.88

(1) Norcini et al., 1985; (2) Stalenhoef-Halling et al., 1990; (3) Swanson, 1987; (4) Wass et al., 2001; (5) Petrusa, 2002.
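The way reliability grows with testing time in this table is broadly what the Spearman-Brown prophecy formula predicts: if one hour of testing has reliability r, then k hours of comparable material have roughly kr / (1 + (k - 1)r). A small Python sketch, using the one-hour values from the table above as inputs (the projections are illustrative and will not reproduce the published multi-hour figures exactly, since those come from generalizability analyses of real data):

# Spearman-Brown prophecy: projected reliability when testing time is
# multiplied by k, assuming the added material is parallel to the original.
def spearman_brown(r_one_hour, k):
    return k * r_one_hour / (1 + (k - 1) * r_one_hour)

one_hour = {"MCQ": 0.62, "PMP": 0.36, "Oral exam": 0.50, "OSCE": 0.47}
for method, r1 in one_hour.items():
    projected = [round(spearman_brown(r1, k), 2) for k in (1, 2, 4, 8)]
    print(f"{method:10s}", projected)

For the PMP and OSCE rows the projection tracks the published figures quite closely, which is the point of the table: whatever the method, it is broader sampling (more testing time) that buys reliability.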

Reliability of the oral examination (Swanson, 1987)

Testing time in hours               1      2      4      8
Number of cases                     2      4      8     12
Same examiner for all cases       0.31   0.47   0.48     –
New examiner for each case        0.50   0.69   0.82   0.90
Two new examiners for each case   0.61   0.76   0.86   0.93

Checklist or rating scale reliability in the OSCE (figure; Van Luijk & Van der Vleuten, 1990).

Assessing showing how: specific lessons learned. OSCE-ology (patient training, checklist writing, standard setting, etc.; Petrusa, 2002). OSCEs are neither inherently valid nor reliable; that depends on the fidelity of the simulation and the sampling of stations (Van der Vleuten & Swanson, 1990).

Assessing showing how: general lessons learned. Objectivity is not the same as reliability (Van der Vleuten, Norman & De Graaff, 1991). Subjective expert judgment has incremental value (Van der Vleuten & Schuwirth, in preparation). Sampling across content and judges/examiners is eminently important. Assessment drives learning.

Assessing does. 1990s: performance assessment in vivo by judging work samples (Mini-CEX, CBD, MSF, DOPS, Portfolio).

Key findings for assessing does. Ongoing work; this is where we currently are. Reliability findings point to feasible sampling (8-10 judgments seems to be the magical number; Williams et al., 2003). Scores tend to be inflated (Govaerts et al., 2007). Qualitative/narrative information is (more) useful (Govaerts et al., 2007). Lots of work still needs to be done: how (much) to sample across instruments? How to aggregate information?

Reliabilities across methods

Testing time in hours              1      2      4      8
MCQ (1)                          0.62   0.76   0.93     –
Case-based short essay (2)       0.68   0.73   0.84   0.82
PMP (1)                          0.36   0.53   0.69   0.82
Oral exam (3)                    0.50   0.69   0.82   0.90
Long case (4)                    0.60   0.75   0.86   0.90
OSCE (5)                         0.47   0.64   0.78   0.88
Mini-CEX (6)                     0.73   0.84   0.92   0.96
Practice video assessment (7)    0.62   0.76   0.93     –
Incognito SPs (8)                0.61   0.76   0.92   0.93

(1) Norcini et al., 1985; (2) Stalenhoef-Halling et al., 1990; (3) Swanson, 1987; (4) Wass et al., 2001; (5) Petrusa, 2002; (6) Norcini et al., 1999; (7) Ram et al., 1999; (8) Gorter, 2002.

Assessing does: specific lessons learned. Reliable sampling is possible. Qualitative information carries a lot of weight. Assessment impacts on work-based learning (more feedback, more reflection…). Validity strongly depends on the users of these instruments and therefore on the quality of implementation.

Assessing does: general lessons learned. Work-based assessment cannot (yet) replace standardised assessment; or: no single measure can do it all (Tooke report, UK). Validity strongly depends on the implementation of the assessment (Govaerts et al., 2007). But there is a definite place for (more subjective) expert judgment (Van der Vleuten & Schuwirth, under editorial review).

Competency/outcome categorizations. CanMEDS roles: medical expert, communicator, collaborator, manager, health advocate, scholar, professional. ACGME competencies: medical knowledge, patient care, practice-based learning & improvement, interpersonal and communication skills, professionalism, systems-based practice.

Measuring the unmeasurable: “domain-independent” skills, set against the “domain-specific” skills of Miller’s pyramid (Knows, Knows how, Shows how, Does).

Measuring the unmeasurable. Importance of domain-independent skills: if things go wrong in practice, these skills are often involved (Papadakis et al., 2005; 2008); success in the labour market is associated with these skills (Meng, 2006); practice performance is related to school performance (Papadakis et al., 2004).

Measuring the unmeasurable: assessment (mostly in vivo) relying heavily on expert judgment and qualitative information.

Measuring the unmeasurable Self assessment Peer assessment Co-assessment (combined self, peer, teacher assessment) Multisource feedback Log book/diary Learning process simulations/evaluations Product-evaluations Portfolio assessment

Eva, K. W., & Regehr, G. (2005). Self-assessment in the health professions: a reformulation and research agenda. Acad Med, 80(10 Suppl), S46-54.

Falchikov, N., & Goldfinch, J. (2000). Student peer assessment in higher education: A meta-analysis comparing peer and teacher marks. Review of Educational Research, 70(3), 287-322.

Driessen, E., van Tartwijk, J., van der Vleuten, C., & Wass, V. (2007). Portfolios in medical education: why do they meet with mixed success? A systematic review. Med Educ, 41(12), 1224-1233.

General lessons learned Competence is specific, not generic Assessment is as good as you are prepared to put into it Objectivity is not the same as reliability Subjective expert judgment has incremental value Sampling across content and judges/examiners is eminently important Assessment drives learning No single measure can do it all Validity strongly depends on the implementation of the assessment

Practical implications: competence is specific, not generic. One measure is no measure. Increase sampling (across content, examiners, patients…) within measures. Combine information across measures and across time. Be aware of (sizable) false positive and negative decisions. Build safeguards in examination regulations.

Practical implications: assessment is as good as you are prepared to put into it. Train your staff in assessment. Implement quality assurance procedures around test construction. Share test material across institutions. Reward good assessment and assessors. Involve students as a source of quality assurance information.

Practical implications: objectivity is not the same as reliability. Don’t trivialize the assessment (and compromise on validity) with unnecessary objectification and standardization. Don’t be afraid of holistic judgment. Sample widely across sources of subjective influence (raters, examiners, patients).

Practical implications: subjective expert judgment has incremental value. Use expert judgment for assessing complex skills. Who counts as an expert depends on the assessment context (i.e. peer, patient, clerk, etc.). Invite assessors to provide qualitative information or to mediate feedback.

Practical implications: sampling across content and judges/examiners is eminently important. Use efficient test designs: a single examiner per test item (question, essay, station, encounter…) and different examiners across items. Psychometrically analyse the sources of variance affecting the measurement to optimise the sampling plan and the sample sizes needed, as sketched below.
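To make the last point concrete, here is a minimal generalizability-style sketch in Python (not from the presentation; the scores are invented) for a fully crossed persons x stations design with one score per cell. It estimates the person and residual (person-by-station plus error) variance components from the ANOVA mean squares and projects the relative G coefficient for different numbers of stations:

import numpy as np

def g_coefficient(scores, n_stations_projected):
    # scores: persons x stations matrix, one observation per cell
    x = np.asarray(scores, dtype=float)
    n_p, n_s = x.shape
    grand = x.mean()
    person_means = x.mean(axis=1)
    station_means = x.mean(axis=0)
    ms_person = n_s * np.sum((person_means - grand) ** 2) / (n_p - 1)
    resid = x - person_means[:, None] - station_means[None, :] + grand
    ms_resid = np.sum(resid ** 2) / ((n_p - 1) * (n_s - 1))
    var_person = max((ms_person - ms_resid) / n_s, 0.0)   # universe ('true') score variance
    var_resid = ms_resid                                   # person x station interaction + error
    # relative G coefficient for a test built from n_stations_projected stations
    return var_person / (var_person + var_resid / n_stations_projected)

# invented scores: 5 examinees x 4 stations, each station rated by its own examiner
scores = [[7, 6, 8, 7],
          [5, 4, 6, 5],
          [9, 8, 9, 8],
          [6, 5, 5, 6],
          [8, 6, 7, 8]]
for n in (4, 8, 16):
    print(n, "stations ->", round(g_coefficient(scores, n), 2))

This is the sort of analysis behind a statement like ‘8-10 judgments seems to be the magical number’ earlier in the talk: estimate the variance components once, then solve for the sample size at which the coefficient reaches an acceptable level.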

Practical implications: assessment drives learning. For every evaluative action there is an educational reaction. Verify and monitor the impact of assessment (evaluate the evaluation); many intended effects do not actually materialise (the hidden curriculum). No assessment without feedback! Embed the assessment within the learning programme (cf. Wilson, M., & Sloane, K. (2000). From principles to practice: An embedded assessment system. Applied Measurement in Education, 13(2), 181-208). Use assessment strategically to reinforce desirable learning behaviours.

Practical implications: no single measure can do it all. Use a cocktail of methods across the competency pyramid. Arrange methods in a programme of assessment. Any method may have utility (including the ‘old’ assessment methods), depending on its utility within the programme. Compromises on the quality of a method should be made in light of its function in the programme. Compare assessment design with curriculum design: appoint responsible people/committee(s), use an overarching structure, involve your stakeholders, and implement, monitor and change (assessment programmes ‘wear out’).

Practical implications: validity strongly depends on the implementation of the assessment. Pay special attention to implementation (good educational ideas often fail due to implementation problems). Involve your stakeholders in the design of the assessment. Many naive ideas exist around assessment; train and educate your staff and students.

Overview of presentation Where is education going? Where are we with assessment? Where are we going with assessment? Conclusions

Areas of development and research Understanding expert judgment

Understanding human judgment: How does the mind of expert judges work? How is it influenced? What is the link between clinical expertise and judgment expertise? There is a clash between the psychology literature on expert judgment and psychometric research.

Areas of development and research Understanding expert judgment Building non-psychometric rigour into assessment

Qualitative methodology as an inspiration

Criterion        Quantitative approach    Qualitative approach
Truth value      Internal validity        Credibility
Applicability    External validity        Transferability
Consistency      Reliability              Dependability
Neutrality       Objectivity              Confirmability

Strategies for establishing trustworthiness:
• Prolonged engagement
• Triangulation
• Peer examination
• Member checking
• Structural coherence
• Time sampling
• Stepwise replication
• Dependability audit
• Thick description
• Confirmability audit

Procedural measures and safeguards:
• Assessor training & benchmarking
• Appeal procedures
• Triangulation across sources, saturation
• Assessor panels
• Intermediate feedback cycles
• Decision justification
• Moderation
• Scoring rubrics
• …

Driessen, E. W., Van der Vleuten, C. P. M., Schuwirth, L. W. T., Van Tartwijk, J., & Vermunt, J. D. (2005). The use of qualitative research criteria for portfolio assessment as an alternative to reliability evaluation: a case study. Medical Education, 39(2), 214-220.

Areas of development and research Understanding expert judgment Building non-psychometric rigour into assessment Construction and governance of assessment programmes (Van der Vleuten 2005)

Assessment programmes How to design assessment programmes? Strategies for governance (implementation, quality assurance)? How to aggregate information for decision making? When is enough enough?

A model for designing assessment programmes (Dijkstra, J., et al., in preparation).

Areas of development and research Understanding expert judgment Building non-psychometric rigour into assessment Construction and governance of assessment programmes Understanding and using assessment impacting learning

Assessment impacting learning. Lab studies convincingly show that tests improve retention and performance (Larsen et al., 2008). Relatively little empirical research supports educational practice. There is an absence of theoretical insights.

Theoretical model under construction1 CONSEQUENCES OF IMPACT Theoretical model under construction1 Cognitive processing strategies choice effort persistence Dr Hanan Al-Kadri Metacognitive regulation strategies choice effort persistence OUTCOMES OF LEARNING DETERMINANTS OF ACTION impact appraisal likelihood severity response appraisal efficacy costs value perceived agency interpersonal factors normative beliefs motivation to comply SOURCES OF IMPACT Assessment assessment strategy assessment task volume of assessable material sampling cues individual assessor 1Cilliers, F. in preparation.

Areas of development and research Understanding expert judgment Building non-psychometric rigour into assessment Construction and governance of assessment programmes Understanding and using assessment impacting learning Understanding and using qualitative information.

Understanding and using qualitative information. Assessment is dominated by the quantitative discourse (Hodges, 2006). How to improve the use of qualitative information? How to aggregate qualitative information? How to combine qualitative and quantitative information? How to use expert judgment here?

Finally: assessment in medical education has a rich history of research and development with clear practical implications (we’ve covered some ground in 40 years!). We are moving beyond the psychometric discourse into an educational design discourse. We are starting to measure the unmeasurable. Expert human judgment is reinstated as an indispensable source of information, both at the method level and at the programmatic level. Lots of exciting developments still lie ahead of us!

This presentation can be found at: www.fdg.unimaas.nl/educ/cees/singapore

Literature
Cilliers, F. (In preparation). Assessment impacts on learning, you say? Please explain how. The impact of summative assessment on how medical students learn.
Driessen, E., van Tartwijk, J., van der Vleuten, C., & Wass, V. (2007). Portfolios in medical education: why do they meet with mixed success? A systematic review. Med Educ, 41(12), 1224-1233.
Driessen, E. W., Van der Vleuten, C. P. M., Schuwirth, L. W. T., Van Tartwijk, J., & Vermunt, J. D. (2005). The use of qualitative research criteria for portfolio assessment as an alternative to reliability evaluation: a case study. Medical Education, 39(2), 214-220.
Dijkstra, J., Schuwirth, L., & Van der Vleuten, C. (In preparation). A model for designing assessment programmes.
Eva, K. W., & Regehr, G. (2005). Self-assessment in the health professions: a reformulation and research agenda. Acad Med, 80(10 Suppl), S46-54.
Gorter, S., Rethans, J. J., Van der Heijde, D., Scherpbier, A., Houben, H., Van der Vleuten, C., et al. (2002). Reproducibility of clinical performance assessment in practice using incognito standardized patients. Medical Education, 36(9), 827-832.
Govaerts, M. J., Van der Vleuten, C. P., Schuwirth, L. W., & Muijtjens, A. M. (2007). Broadening perspectives on clinical performance assessment: rethinking the nature of in-training assessment. Adv Health Sci Educ Theory Pract, 12, 239-260.
Hodges, B. (2006). Medical education and the maintenance of incompetence. Med Teach, 28(8), 690-696.
Jozefowicz, R. F., Koeppen, B. M., Case, S. M., Galbraith, R., Swanson, D. B., & Glew, R. H. (2002). The quality of in-house medical school examinations. Academic Medicine, 77(2), 156-161.
Meng, C. (2006). Discipline-specific or academic? Acquisition, role and value of higher education competencies. PhD dissertation, Universiteit Maastricht, Maastricht.
Norcini, J. J., Swanson, D. B., Grosso, L. J., & Webster, G. D. (1985). Reliability, validity and efficiency of multiple choice question and patient management problem item formats in assessment of clinical competence. Medical Education, 19(3), 238-247.
Papadakis, M. A., Hodgson, C. S., Teherani, A., & Kohatsu, N. D. (2004). Unprofessional behavior in medical school is associated with subsequent disciplinary action by a state medical board. Acad Med, 79(3), 244-249.
Papadakis, M. A., Teherani, A., et al. (2005). Disciplinary action by medical boards and prior behavior in medical school. N Engl J Med, 353(25), 2673-2682.
Papadakis, M. A., Arnold, G. K., et al. (2008). Performance during internal medicine residency training and subsequent disciplinary action by state licensing boards. Annals of Internal Medicine, 148, 869-876.

Literature (continued)
Petrusa, E. R. (2002). Clinical performance assessments. In G. R. Norman, C. P. M. Van der Vleuten & D. I. Newble (Eds.), International Handbook for Research in Medical Education (pp. 673-709). Dordrecht: Kluwer Academic Publishers.
Ram, P., Grol, R., Rethans, J. J., Schouten, B., Van der Vleuten, C. P. M., & Kester, A. (1999). Assessment of general practitioners by video observation of communicative and medical performance in daily practice: issues of validity, reliability and feasibility. Medical Education, 33(6), 447-454.
Stalenhoef-Halling, B. F., Van der Vleuten, C. P. M., Jaspers, T. A. M., & Fiolet, J. B. F. M. (1990). A new approach to assessing clinical problem-solving skills by written examination: conceptual basis and initial pilot test results. Paper presented at the Teaching and Assessing Clinical Competence conference, Groningen.
Swanson, D. B. (1987). A measurement framework for performance-based tests. In I. Hart & R. Harden (Eds.), Further Developments in Assessing Clinical Competence (pp. 13-45). Montreal: Can-Heal Publications.
Van der Vleuten, C. P., Schuwirth, L. W., Muijtjens, A. M., Thoben, A. J., Cohen-Schotanus, J., & van Boven, C. P. (2004). Cross institutional collaboration in assessment: a case on progress testing. Med Teach, 26(8), 719-725.
Van der Vleuten, C. P. M., & Swanson, D. B. (1990). Assessment of clinical skills with standardized patients: state of the art. Teaching and Learning in Medicine, 2(2), 58-76.
Van der Vleuten, C. P. M., & Newble, D. I. (1995). How can we test clinical reasoning? The Lancet, 345, 1032-1034.
Van der Vleuten, C. P. M., Norman, G. R., & De Graaff, E. (1991). Pitfalls in the pursuit of objectivity: issues of reliability. Medical Education, 25, 110-118.
Van der Vleuten, C. P. M., & Schuwirth, L. W. T. (2005). Assessment of professional competence: from methods to programmes. Medical Education, 39, 309-317.
Van der Vleuten, C. P. M., & Schuwirth, L. W. T. (Under editorial review). On the value of (aggregate) human judgment. Med Educ.
Van Luijk, S. J., Van der Vleuten, C. P. M., & Schelven, R. M. (1990). The relation between content and psychometric characteristics in performance-based testing. In W. Bender, R. J. Hiemstra, A. J. J. A. Scherpbier & R. P. Zwierstra (Eds.), Teaching and Assessing Clinical Competence (pp. 202-207). Groningen: Boekwerk Publications.
Wass, V., Jones, R., & Van der Vleuten, C. (2001). Standardized or real patients to test clinical competence? The long case revisited. Medical Education, 35, 321-325.
Williams, R. G., Klamen, D. A., & McGaghie, W. C. (2003). Cognitive, social and environmental sources of bias in clinical performance ratings. Teaching and Learning in Medicine, 15(4), 270-292.