VALIDITY

MAJOR KINDS OF VALIDITY Validity: the test measures what it was designed to measure. Three major kinds: content validity, criterion validity, and construct validity.

CONTENT VALIDITY Content validity refers to the degree to which the items on the test reflect the intended domain. There are two aspects to this part of validation: content relevance and content coverage.

Content Relevance Investigating content relevance requires the specification of the behavioral domain in question and the attendant specification of the task or test domain. It is generally recognized that this involves specifying the ability domain; examining content relevance also requires specifying the test method facets.

Content Coverage: the extent to which the tasks required in the test adequately represent the behavioral domain in question.

Content validity Content validity is established through a logical analysis, which is basically an analysis of the correspondence between the test items and the content being covered. In many cases this is determined by how closely the items match the objectives. Content validity is not done by statistical analysis but rather by the inspection of the items. It does not generate a validity coefficient. This is different from reliability and from other forms of validity where the evidence is in terms of test scores and their statistical properties.

Test of Content Validity A test is administered twice, once before and once after instruction; a significant difference between the two sets of scores is evidence of content validity. The paired t statistic is

t = (X̄1 − X̄2) / √((σ1² + σ2² − 2rσ1σ2) / (n − 1))

where
X̄1: mean of test 1
X̄2: mean of test 2
σ1: standard deviation of test 1
σ2: standard deviation of test 2
r: correlation coefficient between test 1 and test 2
n: number of examinees

The decision is made by comparing t against the critical value in the t-test table.
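The t statistic above can be computed directly from the two score lists. A minimal sketch in Python (the function name and sample data are illustrative, not from the slides; population standard deviations are used, matching the n in the denominator of each SD):

```python
import math

def content_validity_t(pre, post):
    """Paired t: (M1 - M2) / sqrt((s1^2 + s2^2 - 2*r*s1*s2) / (n - 1)),
    where s1, s2 are population SDs and r is the correlation between
    the two administrations."""
    n = len(pre)
    m1, m2 = sum(pre) / n, sum(post) / n
    s1 = math.sqrt(sum((x - m1) ** 2 for x in pre) / n)
    s2 = math.sqrt(sum((x - m2) ** 2 for x in post) / n)
    # correlation between the two sets of scores: cov / (s1 * s2)
    r = (sum(x * y for x, y in zip(pre, post)) / n - m1 * m2) / (s1 * s2)
    return (m1 - m2) / math.sqrt((s1 ** 2 + s2 ** 2 - 2 * r * s1 * s2) / (n - 1))
```

A large negative t (pre-instruction mean well below the post-instruction mean) is the pattern expected when the test covers the instructed content.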

CRITERION VALIDITY Criterion validity is concerned with the degree to which the test scores are accurate and useful predictors of performance on some other criterion measure. This other measure is called criterion measure, which may be a different test, a future behavior pattern, or almost any other variable of interest. Therefore, criterion validity of a test involves the relationship or correlation between the test scores and scores on some measure representing an identified criterion. The correlation coefficient can be computed between the scores on the test being validated and the scores on the criterion. A correlation coefficient so used is called a validity coefficient.
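The validity coefficient described above is simply the Pearson product-moment correlation between the two sets of scores. A minimal sketch (the function name is illustrative):

```python
import math

def validity_coefficient(test_scores, criterion_scores):
    """Pearson r between scores on the test being validated and
    scores on the criterion measure."""
    n = len(test_scores)
    mx = sum(test_scores) / n
    my = sum(criterion_scores) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(test_scores, criterion_scores))
    sx = math.sqrt(sum((x - mx) ** 2 for x in test_scores))
    sy = math.sqrt(sum((y - my) ** 2 for y in criterion_scores))
    return cov / (sx * sy)
```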

CRITERION VALIDITY There are two slightly different types of criterion validity: concurrent validity and predictive validity. Concurrent criterion information is the most commonly used in language testing. Such information typically takes one of two forms:

CRITERION VALIDITY (1) examining differences in test performance among groups of individuals at different levels of language ability, or (2) examining correlations among various measures of a given ability. The process of establishing concurrent validity is one of administering the two measures - the criterion measure and the measure being validated - at about the same time.

Test of Criterion Validity 1. Product-moment correlation coefficient: when both variables are continuous (e.g. scores). 2. Point-biserial correlation: when one variable is continuous (e.g. scores) and the other is dichotomous (e.g. sex). 3. Rank-order (Spearman) correlation: when one or both of the variables are of the grade type.

Continuum and Dichotomous r = ((X̄p − X̄q) / σ) × √(pq)
where
p: proportion of cases in the first category of the dichotomous variable
q: proportion of cases in the second category of the dichotomous variable
X̄p: mean of the continuum variable for the cases in p
X̄q: mean of the continuum variable for the cases in q
σ: standard deviation of the continuum variable
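This point-biserial form can be sketched directly; it is algebraically equivalent to the Pearson r computed with the dichotomy coded 0/1 (the function name and data are illustrative):

```python
import math

def point_biserial(scores, membership):
    """r = ((Mp - Mq) / sigma) * sqrt(p * q); membership holds 1 for the
    first category of the dichotomous variable and 0 for the second."""
    n = len(scores)
    group_p = [s for s, m in zip(scores, membership) if m == 1]
    group_q = [s for s, m in zip(scores, membership) if m == 0]
    p = len(group_p) / n
    q = len(group_q) / n
    mp = sum(group_p) / len(group_p)
    mq = sum(group_q) / len(group_q)
    mean = sum(scores) / n
    # population SD of all the continuum scores, matching the formula's sigma
    sigma = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    return ((mp - mq) / sigma) * math.sqrt(p * q)
```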

Grade Type r = 1 − (6ΣD²) / (n(n² − 1))
where
D: difference between the grades
n: number of the students
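This rank formula (Spearman's rho for untied grades) is straightforward to compute; a minimal sketch with an illustrative function name:

```python
def rank_correlation(ranks1, ranks2):
    """r = 1 - 6*sum(D^2) / (n*(n^2 - 1)), where D is the difference
    between a student's grades (ranks) on the two measures."""
    n = len(ranks1)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks1, ranks2))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))
```

Identical rankings give r = 1; completely reversed rankings give r = −1.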

Concurrent Validity Concurrent validity is involved if the scores on the criterion are obtained at the same time as the test scores. Predictive validity is involved if we are concerned about a test score's relationship with some criterion measured in the future.

Predictive Validity Predictive validity indicates the extent to which an individual's future level on the criterion is predicted from prior test performance. For example, when test scores are used for selection purposes, such as choosing individuals for jobs or acceptance for admission to college, predictive validity of the test is of concern.

CONSTRUCT VALIDITY The determination of construct validity is essentially a search for evidence that will help us understand what the test is really measuring and how the test works across a variety of settings and conditions. A construct is a trait, attribute, or quality that cannot be observed directly but is inferred from psychological theory. Tests do not measure constructs directly; rather, they measure performance or behavior that reflects constructs.

Correlational Analysis A logical analysis of test content can usually give some indication of the number and nature of the constructs reflected by the test. If tests (or items) measure the same constructs, scores on the tests should be correlated; conversely, scores on tests that measure different constructs should have low correlations.

Correlational Analysis Suppose we have a test (Test VA) that is hypothesized to measure verbal ability. Scores on this test are correlated with scores on seven other measures: six tests (including another known verbal ability test) and individual weight. The pattern of correlations is as follows:

Other Measure         Correlation with Test VA
Verbal Ability Test   .85
Creativity Test       .65
Self-Concept Test     .30
Vocabulary Test       .89
Math Aptitude Test    .71
Reading Test          .57
Individual weight     .03

Correlation Matrix One way of assessing the construct validity of a test is to correlate the different test components with each other. Since the reason for having different test components is that they all measure something different and therefore contribute to the overall picture of language ability attempted by the test, we should expect these correlations to be fairly low, possibly in the order of +.3 to +.5. If two components correlate very highly with each other, say +.9, we might wonder whether the two subtests are indeed testing different traits or skills, or whether they are testing essentially the same thing.
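Building such a component-by-component matrix is a direct application of the Pearson correlation. A minimal sketch (component names and scores are invented for illustration):

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def component_matrix(components):
    """Pairwise correlations between every pair of test components;
    components maps a component name to its list of examinee scores."""
    names = list(components)
    return {a: {b: pearson(components[a], components[b]) for b in names}
            for a in names}
```

Very high off-diagonal values (say, above +.9) would suggest that two components are measuring essentially the same thing.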

Correlation Matrix The correlations between each subtest and the whole test, on the other hand, might be expected, at least according to classical test theory, to be higher, possibly around +.7 or more.

Correlation Matrix [Correlation matrix among the subtests Reading, Cloze, Writing, and Oral, together with Total and Total minus self; the reported coefficients range from .27 to .86.]

Problem: Ambiguity It is impossible to make clear, unambiguous inferences regarding the influence of various factors on test scores on the basis of a single correlation between two tests. For example, if we found that a multiple-choice test of cohesion was highly correlated with a multiple-choice test of rhetorical organization, there are three possible inferences: (1) the test scores are affected by a common trait (textual competence); (2) they are affected by a common method (multiple-choice); or (3) they are affected by both trait and method.

Problem: Ambiguity Because of the potential ambiguities involved in interpreting a single correlation between two tests, correlational approaches to construct validation of language tests have typically involved correlations among large numbers of measures.

Factor Analysis A commonly used procedure for interpreting a large number of correlations is factor analysis, a group of analytical and statistical techniques whose common objective is to represent a set of [observed] variables in terms of a smaller number of hypothetical variables.

Factor Analysis Factor analysis is a procedure for analyzing a set of correlation coefficients between measures; the procedure analytically identifies the number and nature of the constructs underlying the measures.

Factor Analysis Factor analysis is a statistical procedure; it does not provide names for the factors or constructs. Factors may be considered artificial variables - they are not variables that are originally measured but variables generated from the data.
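As a toy illustration of what factor extraction does, power iteration on a correlation matrix recovers the dominant eigenvector; scaled by the square root of its eigenvalue, it approximates the loadings of each measure on a single common factor. This is a bare sketch of the idea, not a full factor analysis (no rotation, no communality estimation; all names are illustrative):

```python
import math

def first_factor_loadings(corr, iters=200):
    """Power iteration: find the dominant eigenvector of the correlation
    matrix, then scale it by sqrt(eigenvalue) to get each measure's
    loading on one underlying factor."""
    n = len(corr)
    v = [1.0] * n
    for _ in range(iters):
        # multiply the matrix by the current vector and renormalize
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the corresponding eigenvalue
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(n))
                     for i in range(n))
    return [math.sqrt(eigenvalue) * x for x in v]
```

For two measures correlating .8, both load about .95 on the single factor, consistent with the idea that a high correlation reflects a shared construct.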

Summary Content validity A) Compare test content with specifications/syllabus. B) Questionnaires to, and interviews with, experts such as teachers, subject specialists, and applied linguists. C) Expert judges rate test items and texts according to a precise list of criteria. Concurrent validity A) Correlate students’ test scores with their scores on other tests. B) Correlate students’ test scores with teachers’ rankings. C) Correlate students’ test scores with other measures of ability, such as students’ or teachers’ ratings.

Summary Predictive validity Correlate students’ test scores with their scores on tests taken some time later. Correlate students’ test scores with success in final exams. Correlate students’ test scores with other measures of their ability taken some time later, such as subject teachers’ assessments and language teachers’ assessments. Correlate students’ scores with success of later placement.

Summary Construct validity Correlate each subtest with other subtests. Correlate each subtest with the total test. Correlate each subtest with the total minus self. Compare students’ test scores with students’ biodata and psychological characteristics. Conduct multitrait-multimethod studies. Use factor analysis.