Educational Research: Instruments (“caveat emptor”) Richard M. Jacobs, OSA, Ph.D.
Instruments… tools researchers use to collect data for research studies (alternatively called “tests”)
The types of instruments… 1. Cognitive Instruments 2. Affective Instruments 3. Projective Instruments
1. Cognitive instruments... Measure an individual’s attainment in academic areas typically used to diagnose strengths and weaknesses
Types of cognitive instruments... achievement tests …provide information about how well the test takers have learned what they have been taught in school …achievement is determined by comparing it to the norm, the performance of a national group of similar students who have taken the same test
aptitude tests …measure the intellect and abilities not normally taught and often are used to predict future performance …typically provide an overall score, a verbal score, and a quantitative score
2. Affective instruments... Measure characteristics of individuals along a number of dimensions and assess feelings, values, and attitudes toward self, others, and a variety of other activities, institutions, and situations
Types of affective instruments... attitude scales …self-reports of an individual’s beliefs, perceptions, or feelings about self, others, and a variety of activities, institutions, and situations …frequently use Likert, semantic differential, Thurstone, or Guttman scales
values tests …measure the relative strength of an individual’s valuing of theoretical, economic, aesthetic, social, political, and religious values
personality inventories …an individual’s self-report measuring how behaviors characteristic of defined personality traits describe that individual
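The attitude scales described above often use Likert items, where each response is given a point value and the values are summed. A minimal sketch of that scoring, assuming a 5-point scale and reverse-keyed negative items; the function name, the item positions, and the responses are hypothetical and not from any published instrument:

```python
def score_likert(responses, reverse_keyed=(), points=5):
    """Sum Likert responses, flipping items that are worded negatively.

    responses     -- list of integers from 1 to `points`
    reverse_keyed -- zero-based positions of negatively worded items (assumed)
    """
    total = 0
    for position, response in enumerate(responses):
        if not 1 <= response <= points:
            raise ValueError(f"response {response} is outside 1..{points}")
        if position in reverse_keyed:
            # On a 5-point scale, 1 becomes 5, 2 becomes 4, and so on.
            total += points + 1 - response
        else:
            total += response
    return total


# Hypothetical respondent: 1 = strongly disagree ... 5 = strongly agree
responses = [5, 4, 2, 1]
total = score_likert(responses, reverse_keyed={2, 3})  # 5 + 4 + 4 + 5 = 18
```

Reverse keying keeps a high total meaning the same thing (here, a more favorable attitude) regardless of how an individual item is worded.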
3. Projective instruments... Measure a respondent’s feelings or thoughts in response to an ambiguous stimulus
Primary type of projective test... associational tests …participants react to a stimulus such as a picture, inkblot or word onto which they project a description
Selecting an instrument... 1. determine precisely the type of instrument needed 2. identify and locate appropriate instruments 3. compare and analyze instruments 4. select best instrument
Instrument sources… Buros Mental Measurements Yearbook; Tests in Print; PRO-ED Publications; Test Critiques Compendium; ETS Test Collection Database; ERIC/AE Test Review Locator; ERIC/Buros Test Publisher Directory
Rules governing the selection of instruments... 1. the highest validity 2. the highest reliability 3. the greatest ease of administration, scoring, and interpretation 4. test takers’ lack of familiarity with instrument 5. avoids potentially controversial matters
Administering the instrument... 1. make arrangements in advance 2. ensure ideal testing environment 3. be prepared for all probable contingencies
Two issues in using instruments... 1. Validity: the degree to which the instrument measures what it purports to measure 2. Reliability: the degree to which the instrument consistently measures what it purports to measure
Types of validity... 1. Content validity 2. Criterion-related validity 3. Construct validity
1. Content validity: the degree to which an instrument measures an intended content area
forms of content validity… …sampling validity: does the instrument reflect the total content area? …item validity: are the items included on the instrument relevant to the measurement of the intended content area?
2. Criterion-related validity: scores on an instrument are correlated with scores on a second measure (the criterion) to determine how well the instrument discriminates between individuals who possess a certain characteristic and those who do not
forms of criterion-related validity… …concurrent validity: the degree to which scores on one test correlate to scores on another test when both tests are administered in the same time frame …predictive validity: the degree to which a test can predict how well an individual will do in a future situation
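Both forms of criterion-related validity are typically reported as a correlation between instrument scores and criterion scores. A minimal sketch, assuming Pearson's r is the coefficient used; the helper name and all scores below are made up for illustration:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical predictive-validity data: test scores now, GPA a year later
test_scores = [52, 61, 70, 48, 66, 75]
criterion_gpa = [2.4, 2.9, 3.4, 2.1, 3.0, 3.7]

validity_coefficient = pearson_r(test_scores, criterion_gpa)
```

For concurrent validity the same computation applies; only the timing changes, with the criterion scores collected in the same time frame as the test scores.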
3. Construct validity: a series of studies validate that the instrument really measures what it purports to measure
Types of reliability... 1. Stability 2. Equivalence 3. Internal consistency
1. Stability (“test-retest”): the degree to which two scores on the same instrument are consistent over time
2. Equivalence (“equivalent forms”): the degree to which identical instruments (except for the actual items included) yield identical scores
3. Internal consistency (“split-half” reliability with Spearman-Brown correction formula, Kuder-Richardson and Cronbach’s alpha reliabilities, scorer/rater reliability): the degree to which one instrument yields consistent results
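Two of the internal consistency estimates named above can be sketched in a few lines. This is an illustrative sketch only: the function names and the response matrix are hypothetical, rows are assumed to be respondents and columns items, and population variances are used throughout.

```python
from statistics import pvariance


def spearman_brown(r_half):
    """Step a split-half correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items table of scores."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across respondents
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the total scores (row sums)
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 0/1 (wrong/right) responses: 5 respondents, 4 items
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)      # about 0.8 for this table
full_length = spearman_brown(0.6)      # about 0.75
```

When every item is scored 0 or 1, as in this table, Cronbach's alpha gives the same value as the Kuder-Richardson formula 20.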
Terms associated with instruments...
Data… …the pieces of information researchers collect through instruments to examine a topic or hypothesis
Constructs… …abstractions of behavioral factors that cannot be observed directly and which researchers invent to explain behavior
Variable… …a construct that can take on two or more values or scores
Raw scores… …the number of items an individual answered correctly on an instrument
Measurement scales… …the representation of variables so that they can be quantified
Measurement scales... Qualitative (categorical) 1. nominal variables Quantitative (continuous) 2. ordinal variables 3. interval variables 4. ratio variables
1. nominal (“categorical”): classifies persons or objects into two or more categories
2. ordinal (“order”): classifies persons or objects and ranks them in terms of the degree to which those persons or objects possess a characteristic of interest
3. interval: ranks, orders, and classifies persons or objects according to equal differences with no true zero point
4. ratio: ranks, orders, classifies persons or objects according to equal differences with a true zero point
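One practical consequence of the four scales is which summaries make sense for each. A rough sketch, assuming the common pairing of mode with nominal data, median with ordinal data, and mean with interval data; all variable names and values are hypothetical:

```python
from statistics import mean, median, mode

# Hypothetical data, one variable per measurement scale
home_language = ["Spanish", "English", "English", "Hmong"]  # nominal
class_rank = [1, 2, 3, 4, 5]                                # ordinal
temperature_f = [68, 70, 75]                                # interval (no true zero)
minutes_on_task = [10, 20, 40]                              # ratio (true zero)

# Nominal: categories can only be counted, so the mode is the usual summary.
most_common_language = mode(home_language)

# Ordinal: order is meaningful, so the median can be used.
middle_rank = median(class_rank)

# Interval: equal differences make the mean meaningful.
average_temperature = mean(temperature_f)

# Ratio: a true zero makes statements like "four times as long" meaningful.
times_as_long = minutes_on_task[2] / minutes_on_task[0]
```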
Norm reference… …provides an indication about how one individual performed on an instrument compared to the other students performing on the same instrument
Criterion reference… …involves a comparison against predetermined levels of performance
Self reference… …involves measuring how an individual’s performance changes over time
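The three ways of referencing a score can be contrasted with one hypothetical student. This is a sketch under stated assumptions: percentile rank is taken here as the percentage of the norm group scoring below the student (other definitions exist), and the cutoff, scores, and function names are invented for illustration.

```python
def percentile_rank(score, norm_group):
    """Percentage of the norm group scoring below the given score."""
    below = sum(1 for other in norm_group if other < score)
    return 100 * below / len(norm_group)


def meets_criterion(score, cutoff):
    """True when the score reaches a predetermined performance level."""
    return score >= cutoff


def gain(earlier, later):
    """Change in the same individual's score over time."""
    return later - earlier


# Hypothetical norm group and student
norm_group = [55, 60, 62, 70, 71, 75, 80, 84, 90, 95]
student_score = 78

norm_referenced = percentile_rank(student_score, norm_group)     # 60.0
criterion_referenced = meets_criterion(student_score, cutoff=80) # False
self_referenced = gain(earlier=65, later=student_score)          # 13
```

The same score of 78 looks good against the norm group, falls short of the cutoff, and shows improvement over the student's earlier performance, which is why the three references are not interchangeable.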
Operationalize… …the process of defining behavioral processes that can be observed
Standard error of measurement… …an estimate of how often a researcher can expect errors of a given size on an instrument
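The standard error of measurement is commonly estimated from the standard deviation of the scores and the reliability coefficient as SEM = SD × sqrt(1 − reliability). A minimal sketch of that formula; the function name and the numbers are illustrative assumptions:

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """Estimate SEM from the score standard deviation and a reliability coefficient."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


# Same hypothetical standard deviation, two different reliabilities
sem_high_reliability = standard_error_of_measurement(sd=10, reliability=0.91)  # about 3
sem_low_reliability = standard_error_of_measurement(sd=10, reliability=0.64)   # about 6
```

With the standard deviation held constant, the less reliable instrument has the larger standard error of measurement.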
Mini-Quiz… True or false… …a large standard error of measurement indicates a high degree of reliability false
True or false… …a large standard error of measurement indicates low reliability true
True or false… …most affective tests are projective false
True or false… …the primary source of test information for educational researchers is the Buros Mental Measurements Yearbook true
True or false… …research hypotheses are usually stated in terms of variables true
True or false… …similar to a Thurstone scale, a Guttman scale attempts to determine whether an attitude is unidimensional true
True or false… …validity requires the collection of evidence to support the desired interpretation true
True or false… …researchers should first consider developing an instrument rather than utilizing a published instrument false
True or false… …a researcher’s goal is to achieve perfect predictive validity false
True or false… …predictive validity is extremely important for instruments that are used to classify or select individuals true
True or false… …a high validity coefficient is closer to 1.00 than 0.00 true
True or false… …norm reference and criterion reference are synonymous terms false
True or false… …“criterion related” refers to correlating one instrument with a second instrument; the second instrument is the criterion against which the validity of the second instrument is judged false
True or false… …a valid test is always reliable but a reliable test is not always valid true
True or false… …it is difficult to state appropriate reliability coefficients because reliability, like validity, is dependent upon the group being tested, i.e., groups with different characteristics will produce different reliabilities true
True or false… …content validity is not compromised if the instrument covers topics not taught false
Fill in the blank… …the tendency of an individual to respond continually in a particular way response set
Fill in the blank… …a study which consists of two quantitative variables correlational
Fill in the blank… …a study which consists of one categorical and one quantitative variable experimental or causal-comparative
Fill in the blank… …a study which consists of two or more categorical variables correlational or descriptive
Fill in the blank… …data collection methods which emphasize student processes or products performance
Fill in the blank… …data collection methods including multiple-choice, true-false, and matching selection
Fill in the blank… …data collection methods in which students fill in the blank, provide a short answer, or write an essay supply
Fill in the blank… …an instrument administered, scored, and interpreted in the same way no matter where or when it is administered standardized
Fill in the blank… …the term that includes the general process of collecting, synthesizing, and interpreting information, whether formal or informal assessment
Fill in the blank… …a formal, systematic, usually paper-and-pencil procedure for gathering information about peoples’ cognitive and affective characteristics test
Fill in the blank… …the degree to which individuals seek out or participate in particular activities, objects, and ideas interests
Fill in the blank… …also called “temperament,” the characteristics that represent an individual’s typical behaviors and describe what individuals do in their natural life circumstances personality
Fill in the blank… …things individuals feel favorable or unfavorable about; the tendency to accept or reject groups, ideas, or objects attitudes
Fill in the blank… …deeply held beliefs about ideas, persons, or objects values
Fill in the blank… …requires administering the predictor instruments to a different sample from the same population and developing a new equation cross-validation
Which type of test… …Minnesota Multiphasic Personality Inventory personality inventory
Which type of test… …Stanford-Binet aptitude test
Which type of test… …Strong-Campbell Interest Inventory interest inventory
Which type of test… …SRA Survey of Basic Skills achievement test
Which type of test… …Wechsler Intelligence Scales aptitude test
Which type of test… …Gates-MacGinitie Reading Test achievement test
Which type of test… …Otis-Lennon School Ability Test aptitude test
Which type of test… …Kuder Occupational Interest Survey interest inventory
Which type of test… …Rorschach Inkblot Test projective
Which type of test… …Myers-Briggs Type Indicator personality inventory
Which type of test… …Iowa Tests of Basic Skills achievement test
Which type of test… …Thematic Apperception Test projective
Which type of validity… …compares the content of the test to the domain being measured content
Which type of validity… …Graduate Record Examination predictive
Which type of validity… …correlates scores from one instrument to scores on a criterion measure, either at the same or different time criterion-related
Which type of validity… …amasses convergent, divergent, and content-related evidence to determine that the presumed construct is what is being measured construct
Which type of reliability… …scores on one instrument are consistent over time stability (test-retest)
Which type of reliability… …the extent to which independent scorers or a single scorer over time agree on the scoring of an open-ended instrument scorer/rater
Which type of reliability… …scores correlate between similar versions of an instrument given at different times equivalence and stability
Which type of reliability… …scores correlate between two versions of a test that are intended to be equivalent equivalence (alternate forms)
Which type of reliability… …the extent to which items included on an instrument are similar to one another in content internal consistency
Which type of response scale… …an individual gives a quantitative rating to a topic where each position on the continuum has an associated score value semantic differential
Which type of response scale… …value points are assigned to a participant’s responses to a series of statements Likert
Which type of response scale… …from a list of statements that represent differing points of view, participants select those with which they agree Thurstone
This module has focused on... instruments ...the tools researchers use to gather data for a study
The next module will focus on... qualitative research