Assessment, validation and moderation TVET Australia Assessment, validation and moderation A PowerPoint presentation developed by the NQC to support information sessions on assessment, validation and moderation
Contact NQC Secretariat TVET Australia Level 22/ 390 St Kilda Road Melbourne Vic 3004 Telephone: +61 3 9832 8100 Email: nqc.secretariat@tvetaustralia.com.au Web: www.nqc.tvetaustralia.com.au
Disclaimer This work has been produced on behalf of the National Quality Council with funding provided through the Australian Government Department of Education, Employment and Workplace Relations and state and territory governments. The views expressed in this work are not necessarily those of the Australian Government or state and territory governments.
Acknowledgement This presentation was designed to support the interactive information sessions that formed part of the NQC’s communication and dissemination strategy: NQC products: validation and communication. Reports and materials which focus on validation and moderation may be downloaded from the NQC website at http://www.nqc.tvetaustralia.com.au/nqc_publications This work was produced for the National Quality Council by Andrea Bateman, Quorum QA Australia Pty Ltd, and Chloe Dyson, Quorum QA Australia Pty Ltd
Quality of assessment
Setting the Scene Concerns about the quality of assessments and comparability of standards across the VET sector OECD (2008) Reviews of VET (Australia) NQC (2008) Industry Expectations of VET Service Skills SA (2010) VETiS Project

A number of studies have recently been completed that highlight concerns from industry and government associations regarding the quality of assessment in the VET sector in general. For example, the OECD’s 2008 review of the Australian VET sector raised serious concerns about the comparability of standards across the country. The report recommended that “Training Packages be replaced by simple and much briefer statements of skills standards. Consistency in standards throughout Australia should be achieved through a common assessment procedure to determine whether the necessary skills have been acquired” (p.8). Another study, conducted by Precision Consultancy on behalf of the NQC (2008) and titled “Industry Expectations of VET”, also highlighted a number of concerns in the VET sector in relation to comparability of standards. Although at the time the NQC did not have the appetite to introduce common assessment procedures to statistically moderate RTO-based assessments to bring standards into alignment (as typically happens in many jurisdictions offering Senior Secondary School Certificates), the NQC funded a number of research and development activities to explore less intrusive options for enhancing comparability of standards across the sector whilst still maintaining a level of flexibility and autonomy at the RTO level to design and implement its own assessment procedures. For example, the NQC commissioned us to research and develop: a Professional Code of Practice for conducting validation and moderation; guidelines for designing assessment tools and for engaging industry in the assessment process; advice on determining the level of risk; and guidance on establishing and maintaining an assessment quality management framework. The NQC then funded us to disseminate the findings from our research and development to the VET sector, where we ran information sessions across the country for both RTOs and VETiS providers. Since those sessions, a number of further research and development activities have been commissioned by the NQC to address issues of concern in the quality of assessment. The NQC has now commissioned us to disseminate the findings of further research regarding validation and moderation in diverse settings, and also the Assessor Guide, which provides guidance on some critical questions asked by assessors in relation to assessment and validation.
Today’s workshop Assessment Developing Assessment Tools Competency Mapping Simulated Assessment Engaging industry Assessment Quality Management Framework Validation and moderation System considerations Diverse settings
NQC resources Guide for Developing Assessment Tools Assessment Facts Sheets Simulated Assessment Making Assessment Decisions Peer Assessment and Feedback Quality Assuring Assessment Tools Assessor Partnerships Systematic validation Assessor Guide: Validation and Moderation
Session 1: What is assessment? A purposeful process of systematically gathering, interpreting, recording and communicating to stakeholders information on student performance.
Assessment Purposes
Evaluative – designed to provide information to evaluate institutions and curriculum/standards; the primary purpose is accountability
Diagnostic – produces information about the candidate’s learning
Formative – produces evidence concerning how and where improvements in learning and competency acquisition are required
Summative – used to certify or recognise candidate achievement or potential

The assessment of competence, as is the case with assessment in other educational contexts, can serve a variety of functions. These functions can be, and traditionally have been, classified into four types: evaluative, diagnostic, formative and summative (Griffin & Nix, 1991; Wiliam, 2000). The term evaluative assessment is used to describe assessments that are designed to provide information to evaluate institutions and curriculum/standards, and therefore serve the primary purpose of accountability. The term summative assessment is used to describe assessments that are used to certify or recognise candidate achievement or potential. The term diagnostic assessment is used to refer to assessments that produce information about the candidate’s learning. It is similar in meaning to the term formative assessment, which has been used to describe assessments that produce evidence concerning how and where improvements in learning and competency acquisition are required. According to Messick (1989), the validity of any assessment depends upon the purpose of the assessment and the way in which the evidence is interpreted and used by the key stakeholders.
Evaluative purpose (high stakes) – can narrow the focus of teaching and learning to what is thought to be the intended curriculum (as opposed to the whole curriculum)
Diagnostic or formative – greater importance can be placed on aspects of learning that are easily measured (referred to as an undesirable backwash effect)
Summative – can promote competitiveness and strategic or partial learning by encouraging both assessors and candidates to focus on those things that are assessed at the expense of those that are not.
Assessment Purposes Assessment for learning occurs when teachers use inferences about student progress to inform their teaching (formative) Assessment as learning occurs when students reflect on and monitor their progress to inform their future learning goals (formative) Assessment of learning occurs when teachers use evidence of student learning to make judgements on student achievement against goals and standards (summative) http://www.education.vic.gov.au/studentlearning/assessment/preptoyear10/default.htm In educational contexts, the terms assessment “for”, “as” and “of” learning have been used instead of the terms summative and formative. Assessment tools should clearly distinguish between conducting assessment for formative or summative purposes. AQTF (2010) 1.5 refers to summative assessment.
Session 2: Developing Assessment Tools NQC Products Guide for Developing Assessment Tools Assessment Facts Sheets Simulated Assessment Making Assessment Decisions Peer Assessment and Feedback Quality Assuring Assessment Tools Assessor Guide: Validation and Moderation In this session, we will provide an overview of some of the research and development activities undertaken by the NQC on designing assessment tools. There are six publications to support this session, all of which can be found on the NQC website. The Guides are more detailed documents, whilst the Fact Sheets have been designed to be quick look-up references using plain-language statements. http://www.nqc.tvetaustralia.com.au/nqc_publications
Impact Changes to definitions within the NQC publications, the AQTF 2010 User Guide documentation and the Training Package Development Handbook: Reliability Validity Assessment tool Validation Moderation The early research findings have also led to changes to the definitions used in the AQTF User Guides and the Training Package Development Handbook. In this session, we will explain the changes to these definitions and the implications they have for assessors.
Key Stages – developing assessment tools identify and describe the purposes for the assessment identify the assessment information that can be used as evidence of competence/learning identify a range of possible methods that might be used to collect assessment information define the contexts for interpreting assessment information in ways that are meaningful for both assessor and candidate determine the decision making rules define procedures for coding and recording assessment information identify stakeholders in the assessment and define their reporting needs.
Essential Characteristics - Assessment Tool An assessment tool includes the following components: The context and conditions for the assessment The tasks to be administered to the candidate An outline of the evidence to be gathered from the candidate The evidence criteria used to judge the quality of performance (i.e., the assessment decision making rules); as well as The administration, recording and reporting requirements. The NQC Implementation Guide defines assessment tools. This definition is now the accepted definition in the AQTF (2010) and the Training Package Developers Handbook. The previous definition defined an assessment tool as including: The instruments The procedures. This definition is not in conflict with the revised definition, but focuses on the tool as a whole, rather than as separate components.
Ideal Characteristics The context Competency mapping The information to be provided to the candidate The evidence to be collected from the candidate Decision making rules Range and conditions Materials/resources required Assessor intervention Reasonable adjustments Validity evidence Reliability evidence Recording requirements Reporting requirements The NQC Implementation Guide also looked at the Ideal Characteristics listed here. The blue text indicates the aspects to be looked at today.
Competency Mapping The components of the Unit(s) of Competency that the tool should cover should be described. This could be as simple as a mapping exercise between the components within a task (eg each structured interview question) and components within a Unit or cluster of Units of Competency. The mapping will help determine the sufficiency of the evidence to be collected as well as the content validity. Advice regarding competency mapping can be found in the NQC Assessor Guide: Validation and Moderation. The NQC document notes that assessment tools need to include a competency mapping. The key purpose of the competency mapping is to ensure that the key components of the tasks are aligned with the unit(s) of competency – content validity. How this is undertaken varies, and the most recent publication, the NQC Assessor Guide, provides various options.
Competency Mapping The NQC Assessor Guide provides samples of mapping: the top sample is more detailed, whereas the bottom sample is moderately detailed.
Competency Mapping: Steps in the process Step 1: Unpack the unit of competency to identify its critical components. Step 2: For each assessment method, list the tasks to be performed by the candidate. Step 3: For each assessment method, map the critical components of the unit to each assessment task. Refer to NQC Assessor Guide: Validation and Moderation. The new NQC Assessor Guide includes some options on pages 13–20. Across the various samples you will see that the level of risk will affect the level of detail.
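To make the three steps concrete, here is a minimal illustrative sketch of a mapping coverage check. The component codes, task names and the checking logic are hypothetical examples, not an NQC-prescribed format:

```python
# Hypothetical competency mapping: unit components vs assessment tasks.
# Component codes (e.g. "PC1.1") and task names are illustrative only.

# Step 1: critical components identified by unpacking the unit of competency.
unit_components = ["PC1.1", "PC1.2", "PC2.1", "RS-literacy", "RK-legislation"]

# Steps 2 and 3: for each assessment method, list the tasks and map each task
# to the components it provides evidence for.
mapping = {
    "structured_interview_q1": ["PC1.1", "RK-legislation"],
    "structured_interview_q2": ["PC1.2"],
    "workplace_observation":   ["PC2.1", "RS-literacy"],
}

# Coverage check: every critical component should be evidenced by at least
# one task, otherwise sufficiency and content validity are at risk.
covered = {c for components in mapping.values() for c in components}
uncovered = [c for c in unit_components if c not in covered]

if uncovered:
    print("Gaps in coverage (content validity at risk):", uncovered)
else:
    print("All unit components are evidenced by at least one task.")
```

However it is recorded, the point of the exercise is the same as in the Assessor Guide samples: making the link between each task and each component explicit, so gaps are visible before the tool is used.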
Level of specificity in mapping – Risk Assessment The level of specificity will depend upon the level of risk associated with the unit of competency and the assessment. Risk can be determined by consideration of: Safety (eg potential danger to clients from an incorrect judgement) Purpose and use of the outcomes (eg selection purposes) Human capacity (eg level of expertise and experience of the assessors) Contextual factors (eg changes in technology, workplace processes, legislation, licensing requirements and/or training packages)
Decision Making Rules The rules to be used to: Check the quality of the evidence (i.e. the rules of evidence) Judge how well the candidate performed on the task according to the standard expected Synthesise evidence from multiple sources to make an overall judgement Additional advice – refer to Fact Sheet 2. The expected performance needs to be documented to ensure that there is a common understanding across assessors to inform the decision of competence. Decision making rules need to be applied at various levels. For example, for each key aspect of performance that needs to be judged for each task, some form of rules and/or guidelines for judging the quality of that performance would need to be documented (e.g. what is an acceptable response to an open ended interview question). This is referred to as the evidence criteria. There also needs to be a final decision making rule as to how to make an overall judgement of competence using evidence from multiple sources (e.g. an interview and a portfolio). Information regarding decision making rules, and making decisions, can be found in the Implementation Guide and also Fact Sheet 2. A competency based assessment system encourages the use of a range of assessment methods and tasks for gathering evidence of competence. However, the use of a combination of assessment methods and tasks yields information about performance that must first be synthesised and interpreted by the assessor to infer the competence level of the candidate, and secondly used to make a judgement as to whether the competency standards have been met (Wheeler, 1993). This assimilation process requires the assessor to evaluate the assessment information collected against pre-specified decision rules that can be internally or externally imposed, and few, if any, such guidelines exist. However, a number of alternative decision making models for using multiple assessment results have been identified (refer to Mehrens, 1990 & Scriven, 1991). Four of these are described below. Compensatory Model The Compensatory Model for using multiple assessment results can trade off weak performance on one measure with strong performance on another measure during the deliberation process. Usually, this model assumes that there is a minimum level of performance that must be demonstrated by the candidate, and explicit decision rules must be specified for minimal levels of performance for each task. The decision as to whether unacceptable performance on one task can be compensated by outstanding performance on another task will depend on the critical nature of the competency to be demonstrated. Developers of assessment tasks and assessment policy documents will need to determine the critical nature of the competencies and set minimum levels of performance for each assessment task designed. Conjunctive Model The Conjunctive Model for using multiple assessment results requires that the candidate demonstrate a minimum level of performance on each of the assessment tasks administered by the assessor. This approach is suitable when the competencies to be assessed are critical for minimal acceptable performance within an industry. Assessors who conduct assessments within high risk or emergency situations, for instance, often adopt this model, where a wrong assessment decision could place either the candidate or his/her peers in danger.
An example of this approach is the attainment of a driving licence, where the candidate must pass a written test (80% or higher), a vision test and a practical driving test prior to the issuing of the licence. Combined Model This model has features of both the Compensatory Model and the Conjunctive Model and is usually applied when there are two or more decision making levels (Wheeler, 1993). An example of this is where assessments are being carried out as part of a larger assessment system. For instance, a workplace trainer may adopt the Compensatory Model for assessing an employee against a module of training (ie using trade-offs), but the registered training provider may adopt the Conjunctive Model and insist on satisfactory performance on all modules of training prior to issuing a national qualification. Disjunctive Model Unlike the first three models discussed above, this model does not assume that there are minimal levels of performance that must be demonstrated for each assessment task for an overall competent decision to be made. According to this model, the candidate needs only to demonstrate competent performance on one of the assessment tasks. The assumption is that if the candidate can demonstrate competent performance under one circumstance, there is no need for the candidate to be further assessed in that area. Despite the cost-effective nature of this approach, the assessor’s confidence in predicting the candidate’s consistency of performance and transferability of competencies to new situations remains questionable. According to Wheeler (1993), when multiple methods and assessment tasks are used to gather evidence of competence, there must be a sound rationale for the selection of the decision making model applied. She also argues that to maximise inter-rater reliability of the assessment decisions, the decision making rules for synthesising multiple sources of evidence must be made explicit to both assessors and candidates.
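As a rough sketch of how three of these models can be expressed as explicit decision rules, consider the following. The task names, scores and cut-off values are invented for illustration and are not drawn from the NQC materials:

```python
# Illustrative decision-making rules for synthesising multiple assessment results.
# All task names, scores and cut-offs below are hypothetical.

task_scores = {"written_test": 0.85, "vision_test": 0.90, "practical_drive": 0.55}
minimums    = {"written_test": 0.80, "vision_test": 0.80, "practical_drive": 0.60}

def conjunctive(scores, mins):
    """Competent only if every task meets its minimum (cf. the driving licence example)."""
    return all(scores[t] >= mins[t] for t in mins)

def compensatory(scores, mins, overall_cutoff=0.75):
    """Strong performance on one task can offset weak performance on another,
    subject to an overall cut-off. In practice per-task floors are usually
    specified as well, reflecting the critical nature of each competency."""
    return sum(scores.values()) / len(scores) >= overall_cutoff

def disjunctive(scores, mins):
    """Competent if at least one task meets its minimum."""
    return any(scores[t] >= mins[t] for t in mins)

for model in (conjunctive, compensatory, disjunctive):
    outcome = "competent" if model(task_scores, minimums) else "not yet competent"
    print(f"{model.__name__}: {outcome}")
```

With these illustrative numbers the conjunctive rule returns "not yet competent" (the practical task misses its minimum) while the compensatory and disjunctive rules return "competent", which is exactly why Wheeler argues the chosen model and its rules must be made explicit to assessors and candidates.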
Reasonable Adjustments This section of the assessment tool should describe the guidelines for making reasonable adjustments to the way in which evidence of performance is gathered, without altering the expected performance standards (as outlined in the decision making rules). Reasonable adjustments refer to identifying a particular target group or person whose background characteristics mean that the assessment, as designed, may prevent them from completing the task and demonstrating their ‘true’ competence level (for example, setting a written test for people with low literacy). In these cases the assessment tasks will need to be adjusted or alternative tasks developed. A simple example would be responding to questions orally rather than in writing. However, remember that the adjustments must not alter the expected standard of performance specified within the unit/s of competency.
Simulated assessment For the purposes of assessment, a simulated workplace is one in which all of the required skills, with respect to the provision of paid services to an employer or the public, can be demonstrated as though the business was actually operating. In order to be valid and reliable, the simulation must closely resemble what occurs in a real work environment. The simulated workplace should involve a range of activities that reflect real work experience. It should allow the performance of all of the required skills and demonstration of the required knowledge. Ref: AQTF definition (refer to Activity Handout), Assessment Fact Sheet 1. Simulated assessment has gained particular focus in the last iteration of the AQTF. A definition has been developed and included in the AQTF glossary. Of particular concern at audit are: Lack of ‘face’ validity of assessment Lack of a simulated environment that reflects workplace realities. Assessment Fact Sheet 1 provides advice on simulated work environments.
Activity 1: Engaging Industry In your groups, discuss what input employers (you might wish to specify a vocational area) could provide to develop valid assessment tools and processes. For the following scenarios, note down 2–3 questions you could ask employers and how the responses will inform the development or review of assessment tools and/or processes.

There are a number of ways in which RTOs could engage industry in their assessment quality management systems. For example, industry could be involved in determining whether a qualification requires a moderation or a validation process. As part of this process, industry could be involved in determining the level of risk associated with conducting a false positive assessment (i.e., assessing someone as competent when in actual fact they are not yet competent). This may involve determining the critical nature of the competency, the financial/safety implications of making a wrong assessment judgement, as well as the frequency of use of the competency in the workplace. The greater the risk, the more likely the need for moderation. Other ways in which industry could be involved in the assessment quality management system have been documented in the Assessor Guide: Validation and Moderation.

Quality Assurance – panelling of assessment tools to determine:
- Relevance and realism to the workplace (face validity)
- Content validity (mapping of key components within the task to curriculum/standards)
- Technical accuracy
- Appropriateness of language/terminology
- Literacy and numeracy requirements
- Evidence criteria used to judge candidate performance for each task
- Range and conditions for the assessment (e.g., materials/equipment, facilities, time restrictions, level of support permitted)
- Any reasonable adjustments to evidence collection (as opposed to the standard expected)
- Sufficiency of evidence across time and contexts (transferability)
Consultation with industry/community representatives to identify:
- Benchmark examples of candidate work at both competent and not yet competent levels
- Exemplar assessment tasks/activities

Quality Control
- Moderation consensus panel membership
- Identifying benchmark samples of borderline cases
- Determining the level of tolerance (in relation to risk assessment)
- External moderation (if representing an organisation/association of authority or standing within the industry)

Quality Review
- Representation on the validation panel (e.g., checking content and face validity of assessment tools)
- Follow-up surveys to determine predictive validity

Relevant NQC support materials: Industry Enterprise & RTO Partnership; Assessment Fact Sheets: Assessor Partnerships; Assessor Guide: Validation and Moderation.
Activity 2: Self Assessment In groups of 3, review the assessment tool using the self assessment checklist from the NQC (2009) Implementation Guide (Template A.1, p. 45). Identify any gaps in the tool. Discuss the pros and cons of including such additional information within the tool. Provide sample assessment tool. Identify gaps.
Tool Review Has clear, documented evidence of the procedures for collecting, synthesising, judging and recording outcomes (i.e., to help improve the consistency of assessments across assessors [inter-rater reliability]). Has evidence of content validity (i.e., whether the assessment task(s), as a whole, represent the full range of knowledge and skills specified within the Unit(s) of Competency). Reflects work-based contexts, specific enterprise language and job tasks, and meets industry requirements (i.e., face validity). Adheres to the literacy and numeracy requirements of the Unit(s) of Competency (construct validity). Has been designed to assess a variety of evidence over time and contexts (predictive validity). Has been designed to minimise the influence of extraneous factors (i.e., factors that are not related to the unit of competency) on candidate performance (construct validity).
Tool Review Has clear decision making rules to ensure consistency of judgements across assessors (inter-rater reliability) as well as consistency of judgements within an assessor (intra-rater reliability). Has clear instructions on how to synthesise multiple sources of evidence to make an overall judgement of performance (inter-rater reliability). Has evidence that the principles of fairness and flexibility have been adhered to. Has been designed to produce sufficient, current and authentic evidence. Is appropriate in terms of the level of difficulty of the task(s) to be performed in relation to the skills and knowledge specified within the relevant Unit(s) of Competency. Has outlined appropriate reasonable adjustments that could be made to the gathering of assessment evidence for specific individuals and/or groups. Has adhered to the relevant organisation’s assessment policy.
Quality Checks Panel Pilot Trial Refer to Fact Sheet 4, Quality assuring assessment tools. In the development process it is recommended that the assessment tools are: Panelled – with subject matter experts and colleagues Piloted – on a small number of individuals with characteristics similar to those of the intended candidates Trialled – with a group of individuals who also have similar characteristics. Fact Sheet 4 includes advice regarding unpacking a unit of competency and on the tool development process.
Session 3: Assessment Quality Management NQC Products Code of Professional Practice: Validation & Moderation Implementation Guide: Validation and Moderation Assessment Facts Sheets Quality Assuring Assessment Tools Systematic Validation Assessor Partnerships Assessor Guide: Validation and Moderation http://www.nqc.tvetaustralia.com.au/nqc_publications
Validation Validation is a quality review process. It involves checking that the assessment tool produced valid, reliable, sufficient, current and authentic evidence to enable reasonable judgements to be made as to whether the requirements of the relevant aspects of the Training Package or accredited course had been met. It includes reviewing and making recommendations for future improvements to the assessment tool, process and/or outcomes. NQC Implementation Guide: Validation and Moderation 2009
Outcomes of validation Recommendations for future improvements Context and conditions for the assessment Task/s to be administered to the candidates Administration instructions Criteria used for judging the quality of performance (e.g. the decision making rules, evidence requirements etc) Guidelines for making reasonable adjustments to the way in which the evidence of performance was gathered to ensure that the expected standard of performance specified within the Unit(s) of Competency has not been altered Recording and reporting requirements.
Moderation Moderation is the process of bringing assessment judgements and standards into alignment. It is a process that ensures the same standards are applied to all assessment results within the same Unit(s) of Competency. It is an active process in the sense that adjustments to assessor judgements are made to overcome differences in the difficulty of the tool and/or the severity of judgements. NQC Implementation Guide: Validation and Moderation 2009. Moderation is desirable but not mandatory, and is described in the NQC documents as a quality control process. Within a moderation process, adjustments to student results should be made prior to the finalisation of the results if the judgements of the assessor have been determined to be too harsh or too lenient. Similarly, moderation can lead to adjustments to student results if the assessment tools have been determined to be too easy and/or too difficult. Adjustments are therefore made to the students’ results prior to finalisation. This process helps to bring standards across RTOs into alignment and therefore ensure fairness and comparability of standards across the sector. Although moderation is desirable within the VET sector, particularly in high risk assessments, it is not necessary under the AQTF, as in many instances the benefits may not outweigh the costs.
Outcomes of moderation Recommendations for future improvement and adjustments to assessor judgements (if required): Recommendations for improvement to the assessment tools Adjusting the results of a specific cohort of candidates prior to the finalisation of results Requesting copies of final candidate assessment results in accordance with recommended actions. Here are some examples of the outcomes of a moderation process. Like validation, it could lead to recommendations for improvements to the tool; but unlike validation, it may require altering students’ results prior to finalisation to bring standards into alignment. There may also be a requirement for some form of accountability.
Validation vs Moderation

| Feature | Validation | Moderation |
|---|---|---|
| Assessment quality management type | Quality review | Quality control |
| Primary purpose | Continuous improvement | Bringing judgements and standards into alignment |
| Timing | On-going | Prior to the finalisation of candidate results |
| Focus | Assessment tools; and candidate evidence, including assessor judgements (desirable only) | Assessment tools; and candidate evidence, including assessor judgements (mandatory) |
| Types of approaches | Assessor partnerships; consensus meetings; external (validators or panels) | Consensus meetings; external (moderators or panels); statistical |
| Outcomes | Recommendations for future improvements | Recommendations for improvements; and adjustments to assessor judgements (if required) |

This table provides a good summary of the differences between validation and moderation. It can be found in the NQC Implementation Guide.
Types of Approaches - Statistical Limited to moderation Yet to be pursued at the national level in VET Requires some form of common assessment task at the national level Adjusts the level and spread of RTO-based assessments to match the level and spread of the same candidates’ scores on a common assessment task Maintains RTO-based rank ordering but brings the distribution of scores across groups of candidates into alignment Strength Strongest form of quality control Weakness Lacks face validity, may have limited content validity

Although yet to be pursued at the national level within the VET sector, statistical moderation could be used to ensure that RTO-based assessments are comparable throughout the nation, particularly if grades or marks are to be reported. However, to implement this moderation process, some form of common assessment task(s) would need to be introduced at a national level in the VET sector (e.g., an external exam or standardised assessment tools) to moderate the organisation-based assessments. If a common assessment task was used to statistically moderate organisation-based assessments, the statistical moderation process would maintain the rank order of the candidates’ scores (as determined by the assessor/organisation) but would bring the distributions of scores across groups of candidates (from other organisations or assessors) within the same units within a qualification into alignment. That is, statistical moderation adjusts the organisation-based assessments in accordance with candidates’ performances on common external tasks. It should be acknowledged that any adjustment to a candidate’s scores is determined by the external scores for the whole organisation’s cohort, not by the candidate’s own external score. It is also important to note that statistical moderation does not change the rank order of candidates, as determined by the organisation’s scores. A candidate given the top score for an assessment task by his/her organisation would have the top score after statistical moderation, no matter how they performed on the external task. The process recognises that organisations are in the best position to make comparative judgements about the performance of their candidates, and these comparative judgements are not changed as a result of the statistical moderation. Statistical moderation entails adjusting the level and spread of each organisation’s assessments of its candidates in a particular qualification to match the level and spread of the same candidates’ scores on a common external task. If a common assessment task was to be completed by all candidates across the nation or within an industry area, it could become the common standard against which organisations’ assessments could be compared. At a national level, the organisation-based assessments could be statistically moderated using: A common exam across all qualifications based on measuring generic/employability skills. Qualification-specific national exams (similar to those used for licensing purposes). National common assessment tools within each qualification that would need to be judged centrally. The major benefit of statistical moderation is that it provides the strongest form of quality control over organisation-based assessments. It can also be less expensive to implement and maintain (if paper-based) than external moderation processes. It would, however, require the introduction of some form of common assessment task(s) at the national level.
If the common assessment task was paper-based (as has typically been the case in other educational sectors, due to the reduced costs associated with implementation and scoring), then any adjustments to candidate results would be limited to estimates of candidates’ cognitive skills (i.e., knowledge and understanding), and may therefore have limited face and content validity within the VET sector.
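As an illustrative sketch only: if the “match the level and spread” adjustment were implemented as a simple linear transformation (adjusted = external_mean + (score − internal_mean) × external_sd ÷ internal_sd; the actual scaling method would be set by whichever body operated the moderation system), statistically moderating one organisation’s cohort could look like this:

```python
import statistics

def moderate(internal_scores, external_scores):
    """Linearly rescale an organisation's internal scores so that their mean and
    spread match the same cohort's scores on the common external task.
    The rank order of candidates (by internal score) is preserved."""
    m_int, s_int = statistics.mean(internal_scores), statistics.stdev(internal_scores)
    m_ext, s_ext = statistics.mean(external_scores), statistics.stdev(external_scores)
    return [m_ext + (x - m_int) * (s_ext / s_int) for x in internal_scores]

# Hypothetical cohort: a lenient organisation (high internal scores) whose
# candidates scored lower, and more variably, on the common external task.
internal = [78, 82, 85, 90, 95]
external = [55, 60, 72, 68, 80]
print([round(x, 1) for x in moderate(internal, external)])
```

Note that each candidate’s adjustment depends only on the cohort-level statistics, not on that candidate’s own external score, and the top-ranked internal score remains top-ranked after moderation, consistent with the points above.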
Types of Approaches - External Site Visit versus Central Agency Strengths Offer authoritative interpretations of standards Improve consistency of standards across locations by identifying local bias and/or misconceptions (if any) Educative Weakness Expensive Less control than statistical

External Approaches (Validation and Moderation) There are various external approaches to assessment validation and moderation. One approach would be for an external person (or a panel of people) to visit the organisation to judge the way in which candidates’ evidence was collected and judged against the Unit(s) of Competency. Differences between the local and external assessment judgements could then be either: Discussed and reconciled accordingly (i.e., if conducted for moderation purposes); and/or Discussed to identify ways in which improvements to future assessment practices could be undertaken (i.e., if conducted for validation purposes). An alternative external approach would be for samples of assessment tools and/or judged candidate evidence to be sent to a central location for specialist assessors to review directly against the Unit(s) of Competency. The specialist external assessors could be representatives of the relevant national Industry Skills Council (ISC) and/or the relevant state/territory registering bodies. Again, differences between the organisation-based and the external assessments could then be discussed (e.g., for validation) and/or reconciled (e.g., for moderation) at a distance. There are a number of benefits of using external moderators/validators. These include the potential to: Offer authoritative interpretations of the standards specified within Units of Competency; Improve consistency of the standards across locations by identifying local bias and/or misconceptions (if any); Offer advice to organisations and assessors on assessment approaches and procedures; and Observe actual assessment processes in real time, as opposed to simply reviewing assessment products (if site visits are included). In relation to moderation, although external approaches have greater quality control over the assessment processes and outcomes than consensus meetings, they have less quality control than statistical approaches.
Types of Approaches – Assessor Partnerships Validation only Informal, self-managed, collegial Small group of assessors May involve: Sharing, discussing and/or reviewing one another’s tools and/or judgements Benefits Low costs, personally empowering, non-threatening May be easily organised Weakness Potential to reinforce misconceptions and mistakes Ref: Implementation Guide, Assessment Fact Sheet 5

Assessor partnerships involve the sharing of assessment tools and outcomes within a small group of assessors, possibly even just two assessors. Often this type of approach to quality review is informal and self-managed. The focus is on collegiality, mutual assistance and confirmation. The partnership may involve: Sharing and discussing one another’s assessment tools, processes and outcomes; Providing mutual support for reviewing one another’s assessment tools; Assisting one another in resolving any problems and/or issues (e.g., appeals); Checking one another’s judgements of candidate performance against the Unit(s) of Competency and/or the decision making rules specified within the assessment tool. A major benefit of assessor partnerships is that they can be locally organised and tend to have minimal implementation and maintenance costs. They can also be personally empowering to participants and can help build the confidence and expertise of less experienced assessors. However, as partnerships tend to be locally self-managed, other quality review mechanisms may also be required to ensure continuous improvement of assessment practices. This is because some assessor partnerships may simply reinforce each other’s misconceptions and mistakes if no other quality review processes are available.
Types of Approaches - Consensus Typically involves assessors reviewing their own and colleagues’ assessment tools and judgements as a group Can occur within and/or across organisations Strength Professional development, networking, promotes collegiality and sharing Weakness Less quality control than external and statistical approaches, as they can also be influenced by local values and expectations Requires a culture of sharing

Typically, consensus meetings involve assessors reviewing their own and their colleagues’ assessment tools and outcomes as part of a group. This can occur within and/or across organisations. It is typically based on agreement within a group on the appropriateness of the assessment tools and assessor judgements for a particular unit(s) of competency. A major strength of consensus meetings is that assessors are directly involved in all aspects of assessment and gain professionally by learning not only how and what to assess, but what standards to expect from their candidates. Consensus meetings also enable assessors to develop strong networks and promote collegiality, and they provide opportunities for sharing materials/resources among assessors. If used for moderation purposes, however, consensus meetings provide less quality control than external and statistical approaches, as they can be influenced by local values and expectations.
Systematic Validation (consensus)

Indicators (to be marked Yes/No, with actions noted):
- Is there a plan for assessment validation (including validation of RPL assessment) in place?
- Does your plan:
  - determine the sample of units of competency to be validated over a set period of time
  - provide dates for proposed validation activities
  - include details about who will participate in assessment validation, including the Chair of consensus panels, if relevant
  - include a strategy to ensure that all relevant staff are involved
  - identify what processes and materials will be used for implementing and recording the outcomes of assessment validation
- Does your RTO have terms of reference in place to guide the work of consensus panels?
- Does your RTO have validation materials (policy, procedure, forms) in place that enable participants to engage effectively in validation?
- Does your RTO have a process for monitoring the action taken as a result of validation?
- Does your RTO have a process and plan in place for reviewing the effectiveness of assessment validation?

Ref: Assessor Guide

Questions are often asked about how validation processes can be systematic. The Assessor Guide provides some guidance here as to what a plan for validation may include. Implementation of the plan goes towards the notion of ‘systematic’.
System considerations What is the most appropriate approach to validation?

- Condition: Whenever my RTO conducts internal validation, few opportunities for improvement arise.
  Suggested approach: Consider including external representation on your validation panel.
- Condition: Our assessors are contractors and cannot come to validation consensus meetings because my RTO can’t afford to pay for their time, and some are located interstate.
  Suggested approach: Consider establishing assessor validation partnerships at your local level, but ensure that improvements identified are recorded, fed back to other assessors and formalised.
- Condition: Our RTO conducts high risk units related to licensing, where the licensing authority has mandated the use of assessment tools it provides.
  Suggested approach: Consider consensus moderation, ideally with external representation on your panel.
- Condition: Our RTO is new and assessors do not have a lot of experience.
  Suggested approach: Consider inviting an external person with expertise in assessment tool design to validation consensus meetings.

The Assessor Guide provides some advice as to what validation model could be suitable for different RTOs or contexts.
Assessment Quality Management It can be seen in Table 1 that there are a number of different quality management processes that could be used to help achieve national comparability of standards, whilst still maintaining flexibility at the RTO level to design and conduct assessments. The AQTF requires validation, and states that validation is a quality review process. However, there are other strategies that RTOs deploy, and these usually fall within the quality assurance category. Note, however, that for high stakes assessment, moderation is often used in senior secondary school assessment.
Quality management in diverse settings Identified barriers: Structural (i.e., the organisational and resource aspects) – financial, variations of definitions across key documents Process (i.e., the practices and activities that take place) – rolling enrolments, partnering arrangements, workloads Personal factors (i.e., the attitudes, assessment literacy and expectations of the key players) Strategies deployed by RTOs Refer to Handout – Quality management processes in diverse settings. A project related to the initial validation and moderation papers was one undertaken by Shelley, Chloe and Andrea regarding how validation was implemented in diverse settings. The findings indicated that, in general, RTOs were struggling to implement a systematic process, and the identified barriers were those listed above. However, the RTOs were deploying a range of strategies. Refer to the Handout. Discuss this with the group.
Activity 3: Assessment Quality Management

| Approach | Strategies implemented |
|---|---|
| Quality assurance | |
| Quality control | |
| Quality management | |

Ref: Implementation Guide

Finally, the NQC Code of Practice includes this table; it is worthwhile discussing with your group what strategies your RTO uses. Get comments back. Pull all comments together and close.
Presenters
Chloe Dyson, Director, Education Consultant, Quorum QA Australia Pty Ltd
Email: chloed@alphalink.com.au Phone: 0408 124 825

Andrea Bateman, Director, Education Consultant, Quorum QA Australia Pty Ltd
Email: andrea@batemangiles.com.au Phone: 0418 585 754

Principal author
Associate Professor Shelley Gillis, Deputy Director, Work-based Education Research Centre, Victoria University
Email: shelley.gillis@vu.edu.au Phone: 0432 756 638