Assessment in Language Learning


Assessment in Language Learning By Didi Sukyadi

Evaluation, Assessment, and Testing (Cameron, 2001:222)
Testing: one technique or method of assessment, concerned with measuring learning through performance.
Assessment: concerns pupils' learning or performance, and thus provides one type of information that might be used in evaluation.
Evaluation: a process of systematically collecting information in order to make a judgment about lessons, programs, etc., through documentation, observation, interviews, questionnaires, and so on.

The Place of Evaluation in Curriculum Development Evaluation can, and should, be involved in all phases of curriculum development: from needs analysis and the statement of objectives through testing, materials development, and the teaching and learning process, to evaluation itself.

The Place of Assessment in the Teaching and Learning Process Brewster et al. (2003:247): assessment plays an extremely important part in the teaching and learning process and may heavily influence the way pupils are taught and the kinds of activities they do.

Assessment in Learning

TEST TYPES
1) Selected response (binary choice, matching, and multiple choice)
2) Constructed response (fill-in, short answer, performance format)
3) Personal response (conference, portfolio, self-assessment)

Constructed response
Advantages: virtually no guessing factor; allows for productive language use; allows for testing the interaction of receptive and productive skills.
Disadvantages: difficult and time-consuming to score; scoring is subjective.

Selected response
Advantages: requires a short time to administer; easy to score; scoring is objective.
Disadvantages: relatively difficult to create; requires no language production from the examinee.

Personal response
Advantages: directly related and integrated into the curriculum; appropriate for assessing the learning process.
Disadvantages: difficult to create and structure; scoring is subjective.

INTERPRETING THE OUTCOME OF ASSESSMENT 1) Norm-referenced tests: any test primarily designed to disperse students' performances in a normal distribution based on their general abilities or proficiencies, for the purpose of categorizing students into levels or comparing a student's performance with the performances of others who formed the normative group (Glaser, 1963).

INTERPRETING THE OUTCOME OF ASSESSMENT 2) Criterion-referenced tests: measures that assess student achievement in terms of a criterion standard, and thus provide information about the degree of competence attained by a particular student independent of reference to the performance of others. They are deliberately constructed to yield measurements that are directly interpretable in terms of specified performance standards (Glaser and Nitko, 1971).

Other names for criterion-referenced tests
1) Domain-referenced tests: tests whose items are referenced to a delineated domain of student behaviors and content.
2) Objective-referenced tests: tests constructed so that subsets of items measure the specific objectives of a course, program of study, or other clearly delineated subject-matter area.

Characteristics of CRTs
1) Emphasis on the teaching/testing match
2) Focus on instructional sensitivity
3) Curricular relevance
4) Absence of normal-distribution restrictions
5) No item-discrimination restriction

CRT AND LANGUAGE THEORY Two competing hypotheses: 1) the divisibility of language ability, and 2) communicative competence. The earlier, rather simplistic views of language ability have been abandoned, and the recent focus on performance assessment has raised new concerns.

Nature of Language and Assessment Language and language acquisition differ in nature from other educational content, as reflected in the relationship between language proficiency and communicative competence. This difference directly influences how the construct of language knowledge is defined, how language tests are operationalised, and how they are evaluated.

Language proficiency
1) Functional approach: lists the various uses to which language can be put.
2) General proficiency: individuals differ basically in the measurable amount of some indivisible body of competence they possess.
CRT is very appropriate and useful in the assessment of such clearly definable but complex language tasks.

Communicative ability
1) Grammatical competence
2) Sociolinguistic competence
3) Strategic competence
4) Organizational competence
5) Pragmatic competence

A language test should reflect that:
1) Language is used in interaction
2) Interactions are usually unpredictable
3) Language has a context
4) Language is used for a purpose
5) There is a need to examine performance
6) Language is authentic
7) Language success is behaviorally based

Testing communicative language ability should:
1) Be criterion-referenced against the operational performance of a set of language tasks.
2) Be concerned with validating itself against those criteria, addressing content, construct, and predictive validity.
3) Rely on modes of assessment that are qualitative.
4) Subordinate reliability to face validity.

Test item: a unit of measurement with a prompt and a prescribed form for responding, intended to elicit a response from an examinee from which performance on some language construct may be inferred in order to make a decision. A stem can be the question portion of a multiple-choice item, a quotation to which the student must respond, or a reading passage that the student must analyze and write about.

WRITING TEST ITEMS
1) Do not explain too much.
2) Do not use trick questions.
3) Provide only the information necessary.
4) Avoid ambiguity.
5) Be orderly in test presentation.

Linguistic confoundings
1) Items should be written at the examinee's level of proficiency.
2) Items should not contain negatives or double negatives.
3) Items should not be ambiguous.
Example of ambiguity: "Family plays an important role in life. It sometimes complicates matters. Explain this." Here "this" may refer to the role of the family or to the complication involved.

Format confoundings
1) Items should contain only relevant information.
(1) Unnecessary information included: "The following twenty vocabulary items have been selected from the second reading texts in Unit 2 of the Reading Packet. Your teacher discussed each of these words in class during the Wednesday vocabulary lesson …"
(2) Too brief: "Write an essay comparing relationships in two countries."

Format confoundings
2) Items should be independent, e.g.:
(1) What is the square root of 100?
(2) Multiply this by seven.
(The second item cannot be answered without the first.)

Format confoundings
3) Items should be clearly organized and formatted; an item and its options should appear on the same page.

VALIDITY Hughes (1989; 2003:26): a test is said to be valid if it measures accurately what it is intended to measure.
Content validity: the content of the test constitutes a representative sample of the language skills, structures, etc., that it is meant to cover.
Criterion-related validity: the degree to which results on the test agree with those provided by some independent and highly dependable assessment of the examinees' ability.
Construct validity: the degree to which a test measures the psychological construct or constructs that it claims to measure.

RELIABILITY (NRTs)
1. Test-retest reliability
2. Equivalent-forms reliability
3. Internal consistency (split-half reliability)
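As an illustration of the split-half approach, here is a minimal Python sketch; the data generator is hypothetical (a common ability parameter driving 0/1 item scores), and the Spearman-Brown formula projects the half-test correlation to full-test length.

```python
# Minimal split-half reliability sketch with the Spearman-Brown correction.
# The item data are simulated for illustration only.
import numpy as np

def split_half_reliability(items: np.ndarray) -> float:
    """items: examinees x items matrix of 0/1 scores."""
    odd = items[:, 0::2].sum(axis=1)    # score on odd-numbered items
    even = items[:, 1::2].sum(axis=1)   # score on even-numbered items
    r_half = np.corrcoef(odd, even)[0, 1]  # correlation between the two halves
    return 2 * r_half / (1 + r_half)       # Spearman-Brown full-test estimate

rng = np.random.default_rng(0)
ability = rng.normal(size=(45, 1))      # hypothetical examinee ability
# Each item is answered correctly with probability sigmoid(ability).
data = (rng.random((45, 20)) < 1 / (1 + np.exp(-ability))).astype(int)
print(round(split_half_reliability(data), 2))
```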

DEPENDABILITY IN CRTs
Threshold-loss agreement:
Po = (A + D) / N
where:
Po = agreement coefficient
N = total number of examinees
A = masters on both administrations of the test
B = masters on the first administration but non-masters on the second
C = non-masters on the first administration but masters on the second
D = non-masters on both administrations of the test

Example
Of the 45 examinees, 13 are categorized as A, 2 as B, 5 as C, and 25 as D.
Po = (A + D) / N = (13 + 25) / 45 = 38 / 45 ≈ 0.84
Consistency due to the test itself:
Pchance = [(A + B)(A + C) + (C + D)(B + D)] / N²
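To check the arithmetic, here is a short Python sketch using the counts from the worked example; the final kappa line is the standard chance-corrected agreement coefficient, added for illustration and not shown on the slide itself.

```python
# Counts from the worked example: two administrations of the same test.
A, B, C, D = 13, 2, 5, 25
N = A + B + C + D                          # 45 examinees

p_o = (A + D) / N                          # agreement coefficient: 38/45 ≈ 0.84
p_chance = ((A + B) * (A + C)
            + (C + D) * (B + D)) / N**2    # the slide's second formula
kappa = (p_o - p_chance) / (1 - p_chance)  # chance-corrected agreement (not on the slide)
print(round(p_o, 2), round(p_chance, 2), round(kappa, 2))  # 0.84 0.53 0.67
```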

Validating Test Items
Item analysis:
1) Index of difficulty / item facility / item easiness / p-value
2) Difference index
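A minimal item-analysis sketch in Python, assuming an examinees x items matrix of 0/1 scores. The difference index is computed here as posttest item facility minus pretest item facility, one common CRT formulation; the slide does not specify the exact variant.

```python
import numpy as np

def item_facility(items: np.ndarray) -> np.ndarray:
    # Proportion of examinees answering each item correctly (IF / p-value).
    return items.mean(axis=0)

def difference_index(pretest: np.ndarray, posttest: np.ndarray) -> np.ndarray:
    # Gain in item facility after instruction; larger gains suggest
    # more instructionally sensitive items.
    return item_facility(posttest) - item_facility(pretest)

rng = np.random.default_rng(0)
pre = (rng.random((45, 10)) > 0.7).astype(int)   # hypothetical pretest scores
post = (rng.random((45, 10)) > 0.3).astype(int)  # hypothetical posttest scores
print(item_facility(post).round(2))
print(difference_index(pre, post).round(2))
```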

Item validity
1) Sum items 1 through 10 to obtain each examinee's total score.
2) Correlate each item with the total score using the point-biserial correlation (which correlates nominal and interval data).
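These two steps can be sketched in a few lines of Python; with 0/1 item data, the point-biserial coefficient equals the Pearson correlation between the item and the total score, which is what np.corrcoef computes. The item data here are simulated for illustration.

```python
import numpy as np

def point_biserial(item: np.ndarray, total: np.ndarray) -> float:
    # For a dichotomous item, Pearson r against the total score
    # is the point-biserial correlation.
    return np.corrcoef(item, total)[0, 1]

rng = np.random.default_rng(1)
items = (rng.random((45, 10)) > 0.5).astype(int)  # hypothetical 45 examinees, 10 items
total = items.sum(axis=1)                         # step 1: total score per examinee
for i in range(items.shape[1]):                   # step 2: correlate each item with the total
    print(f"item {i + 1}: r_pb = {point_biserial(items[:, i], total):.2f}")
```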

Rater Consistency
1) Correlate each rater's scores with those of the other two raters using the Pearson product-moment correlation.
2) If the correlations are significant, the rating is consistent.
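A sketch of this check using scipy.stats.pearsonr, which returns both the Pearson r and its p-value; the three raters' scores below are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical scores from three raters for eight examinees.
rater1 = np.array([4, 3, 5, 2, 4, 3, 5, 4])
rater2 = np.array([4, 4, 5, 2, 3, 3, 4, 4])
rater3 = np.array([5, 3, 4, 2, 4, 2, 5, 4])

pairs = [(rater1, rater2, "1-2"), (rater1, rater3, "1-3"), (rater2, rater3, "2-3")]
for a, b, name in pairs:
    r, p = pearsonr(a, b)   # a significant r indicates consistent rating
    print(f"raters {name}: r = {r:.2f}, p = {p:.3f}")
```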

AUTHENTIC ASSESSMENT
1) Real-life, normal communication (the ability to perform particular tasks)
2) Interactional ability (total communicative effect)

What is meant by authentic?
Measures students' knowledge and skills
Requires application of knowledge
Product or performance assessment
Relevant, contextualized tasks
Both process and products can be measured
Part of the learning process
Renders a holistic description
Reflects the real world

Types of Authentic Assessment
Oral interviews
Story retelling
Teacher observation
Experiments
Demonstrations
Projects/exhibitions
Writing samples
Portfolios