MGTO 324 Recruitment and Selections

MGTO 324 Recruitment and Selections
Scale and Test Construction
Kin Fai Ellick Wong, Ph.D.
Department of Management of Organizations
Hong Kong University of Science & Technology

Prologue
- In the last lesson, I discussed the scientific elements in testing
- Today, we focus on how a test can be constructed
- In particular, you are expected to understand the following:
  - The concepts of “item”, “scale”, and “test”
  - Different item formats
  - How to write a set of good items
  - Multiple-item scaling

Outline

Part I: Basic Concepts
What is a (psychological) test?
- A measurement device or technique
- Used to quantify behavior or to aid in the understanding and prediction of behavior
- A set of items designed to measure characteristics of human beings that pertain to behavior

Part I: Basic Concepts
What is an item?
- A specific stimulus to which a person responds overtly
- Examples:
  - “What is the English word for ‘邂逅’?”
  - “You can manage well in interpersonal relationships”, rated from 1 (strongly disagree) to 5 (strongly agree)
- Overt behaviors (scientific standard):
  - Observable
  - Measurable
  - Replicable

Part I: Basic Concepts
What is a scale?
- The quantified scores obtained from a test
- The raw scores are related to some defined theoretical or empirical distribution
- A scale is the matching between the raw score and the theoretical meaning of that score
  - E.g., 0 °C = freezing point; 100 °C = boiling point
- The same theoretical meaning can be represented by different scales
  - Temperature: degree Celsius vs. degree Fahrenheit
  - Length: meter vs. foot; kilometer vs. mile
  - Weight: lb vs. kg
  - Wealth: HK$ vs. US$

Part I: Basic Concepts
What is a scale? Examples
- Thermometer: raw score = 2.4 cm; degree Celsius = 100; theoretically (empirically) = boiling point
- HKCEE: raw score = 87; grade = A; theoretically (empirically) = excellent student
- IQ test: raw score = 2400; IQ = 130; theoretically (empirically) = gifted individual
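
The point that one theoretical meaning can sit on different scales can be sketched in a few lines of Python. The function name and the dictionary below are illustrative assumptions, not part of the lecture; only the Celsius anchor points come from the slides.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius reading to the Fahrenheit scale."""
    return celsius * 9 / 5 + 32

# Theoretical meanings anchored on the Celsius scale (from the slides).
anchors_celsius = {"freezing point": 0.0, "boiling point": 100.0}

for meaning, celsius in anchors_celsius.items():
    fahrenheit = celsius_to_fahrenheit(celsius)
    # The meaning stays the same; only the number that carries it changes.
    print(f"{meaning}: {celsius} °C = {fahrenheit} °F")
```

The same logic applies to test scores: a raw score only becomes interpretable once it is matched to a defined distribution or meaning.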

Part I: Basic Concepts
Essential steps in scale and test construction
- Have a clear definition of what the test is supposed to measure (i.e., the psychological construct)
  - E.g., locus of control; self-efficacy
- Generate a set of items that at least seems to capture the construct
- Determine the scale format
- Run pilot tests to assess face validity, reliability, construct validity, and criterion validity
- Delete or revise less useful items
- Assess reliability and validity again
- Revise again
- Assess again
- Shorten the scale
- Assess again….

Part II: Item Formats
Dichotomous formats
- Offer two alternatives for each item
  - True/False, or select the more appropriate statement
- Examples:
  - “You often spend more than three hours typing every day”: Agree vs. Disagree
  - “Generally speaking, my salary accurately represents my contribution to the organization”: Agree vs. Disagree
  - “Which job, technical or administrative, do you prefer?”: Technical vs. Administrative
- Scores: simply count the number of items a person endorses
- Commonly used in both educational and personality tests
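
The scoring rule for dichotomous items (count the endorsed items) can be sketched as follows. The function name, the answer key, and the sample responses are hypothetical and serve only as an illustration.

```python
def score_dichotomous(responses, keyed_answers):
    """Count the items whose response matches the keyed (endorsed) answer."""
    if len(responses) != len(keyed_answers):
        raise ValueError("responses and keyed_answers must have the same length")
    return sum(1 for given, key in zip(responses, keyed_answers) if given == key)

# Hypothetical five-item Agree/Disagree scale, all items keyed toward "Agree".
key = ["Agree"] * 5
answers = ["Agree", "Disagree", "Agree", "Agree", "Disagree"]

print(score_dichotomous(answers, key))  # 3
```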

Part II: Item Formats
Dichotomous formats
- Some famous scales use dichotomous formats
- Locus of control (Rotter, 1966)
  - “The degree to which people believe they are masters of their own fates”, from the OB textbook (Robbins, 2003)
  - People high in externality:
    - Are less satisfied with their jobs
    - Have higher absenteeism rates
    - Are more alienated from their work
    - Are less involved in their jobs
- Choose one:
  - A. Many of the unhappy things in people’s lives are partly due to bad luck
    B. People’s misfortunes result from the mistakes they make
  - A. Heredity plays the major role in determining one’s personality
    B. It is one’s experiences in life which determine what one is like

Part II: Item Formats
Dichotomous formats
- Advantages
  - Simple; easy to administer and score
  - Absolute judgment: people must declare one of the two alternatives
- Disadvantages
  - Effect of memorizing materials
  - Requires numerous items to produce reliable results (chance = 50%)
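
Why a 50% chance level calls for numerous items can be illustrated with a small binomial calculation. This is a sketch under stated assumptions: every answer is a pure guess, and the 80% cut-off is a hypothetical threshold chosen only for the example.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more correct answers out of n when every
    answer is a pure guess with success probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical cut-off: 80% of the items answered correctly.
for n_items, cutoff in [(5, 4), (10, 8), (50, 40)]:
    print(n_items, "items:", round(prob_at_least(cutoff, n_items), 6))
```

With five items, a pure guesser reaches the cut-off with probability 6/32 (about 19%); with ten items the probability drops to 56/1024 (about 5%), and it keeps shrinking as items are added.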

Part II: Item Formats
Polytomous formats
- Offer more than two alternatives (usually 4 to 5)
- One correct choice and several distractors
  - E.g., multiple-choice items
- Scores: the number of items correctly answered
- How many distractors?
  - More distractors may not be better than fewer distractors (Sidick, Barrett, & Doverspike, 1994)
- Less influenced by memory effects (relative to the dichotomous format)
- Mainly used in educational tests

Part II: Item Formats
Likert-type formats
- Usually, people are asked to express their degree of agreement with a statement
  - E.g., “I can manage well in interpersonal relationships”
- Monotone items
  - A higher score suggests higher agreement
- Response scale: 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree

Part II: Item Formats
Likert-type formats
- Number of alternatives
  - Odd (5, 7, or 9)
  - Even (6 or 10): avoids a midpoint
- Popular in psychological tests
- Can be subjected to various psychometric analyses, such as factor analysis
- For example, the General Self-efficacy Scale
  - Self-efficacy: “The individual’s belief that he or she is capable of performing a task”, from the OB textbook (Robbins, 2003)
  - People with higher self-efficacy:
    - Are less likely to give up under difficult situations
    - Usually perform better than

Part II: Item Formats
Cumulative (Guttman) formats
- Items on the same dimension are arranged in ascending order
- A subject with a particular attitude will agree with all items on one side of that position and disagree with the other items
- Example: addition, long division, and calculus
- Monotone items

Part II: Item Formats
Cumulative (Guttman) formats
- Advantages
  - A single score carries complete information about the response pattern, provided there is no random error
    - Calculus OK implies division OK, which implies addition OK
    - Marriage OK implies being a neighbor OK
  - Provides a test of the unidimensionality of what is being tested
    - A cumulative response pattern will not be obtained when the items measure more than one dimension
- Disadvantages
  - Problems resulting from random error
  - Difficult to find domains that are unidimensional
  - Less popular than the Likert-type format
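
The two advantages above can be sketched in Python: checking whether a response pattern is cumulative, and rebuilding the full pattern from a single score. The function names and the 0/1 coding (1 = pass or agree, items ordered from easiest to hardest) are assumptions made for this illustration.

```python
def is_cumulative(pattern):
    """True if a 0/1 pattern, ordered from easiest to hardest item,
    is a run of passes followed only by failures (a perfect Guttman pattern)."""
    seen_failure = False
    for passed in pattern:
        if not passed:
            seen_failure = True
        elif seen_failure:
            # A pass after a failure breaks the cumulative order.
            return False
    return True

def pattern_from_score(score, n_items):
    """Rebuild the only cumulative pattern consistent with a total score."""
    return [1] * score + [0] * (n_items - score)

# Items ordered by difficulty: addition, long division, calculus.
print(is_cumulative([1, 1, 0]))   # True: addition and division OK, calculus not
print(is_cumulative([1, 0, 1]))   # False: calculus OK without division
print(pattern_from_score(2, 3))   # [1, 1, 0]
```

A non-cumulative pattern such as the second one may reflect random error or items that tap more than one dimension; the check alone cannot tell which.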

Part III: Writing good items
Define clearly
- Clearly define what you want to measure
- Check the face and content validity
  - Are the items outside the syllabus?
- How many psychological factors do I want to measure? List them all
- Example: “I want to develop a test that helps me hire employees who have a strong self-learning tendency”
  - What do you want to measure with the test?
  - What is “self-learning tendency”?
  - Give a clear definition before moving to the next step

Part III: Writing good items
Think clearly about the item formats
- Think carefully about what type of test and what statistical analyses you want to use
- Some statistical analyses may not be appropriate for certain item formats
  - Rank-order data are not appropriate for a t test
  - Polytomous responses may not be analyzable by factor analysis
- The Likert-type or dichotomous format seems to be a good default choice

Part III: Writing good items
Sources of items
- Discourse and text
  - Brainstorming or informal conversation
    - Asking others the meaning of “self-learning”
  - Using qualitative methods is more systematic
    - Open-ended interviews or questions
    - Focus groups
    - Content analysis
  - Newspapers
  - Classic literature
- The goal is to generate a pool of items that seem to measure what we want to measure
  - E.g., “When I have problems, I’ll seek help from books prior to people.”

Part III: Writing good items
Nature of items
- The meaning of each item should be clear, straightforward, and easily understood
  - Avoid exceptionally long items
  - Avoid reading difficulty
  - Don’t use jargon
- Social desirability
  - E.g., “I like self-learning”
- Offensiveness
  - Pay attention to the problems of sexism and racism
- Avoid double-barreled items
  - E.g., “Seeking help from people is a better method than seeking help from books because people are more accessible.”
- Reverse items
  - “Most knowledge could not be learned without the help from others”
  - “I can learn almost all knowledge by myself”
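
Reverse items have to be recoded before they are combined with the other items. A minimal sketch, assuming a 1-to-5 Likert response scale and a simple summed total; the function names and sample responses are hypothetical.

```python
def reverse_score(raw, low=1, high=5):
    """Flip a response on a low..high scale (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    return low + high - raw

def scale_score(responses, reversed_items, low=1, high=5):
    """Sum the responses after recoding the reverse-keyed items.

    responses      -- list of raw answers, one per item
    reversed_items -- set of item positions (0-based) that are reverse-keyed
    """
    total = 0
    for index, raw in enumerate(responses):
        if not low <= raw <= high:
            raise ValueError(f"response {raw} is outside {low}..{high}")
        total += reverse_score(raw, low, high) if index in reversed_items else raw
    return total

# Item 0: "Most knowledge could not be learned without the help from others" (reverse-keyed)
# Item 1: "I can learn almost all knowledge by myself"
print(scale_score([2, 4], reversed_items={0}))  # (6 - 2) + 4 = 8
```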

Part III: Writing good items
Generate an item pool
- You first need to generate an item pool
  - The pool usually contains many more items than the final version of the scale
- Do a preliminary test
  - Select useful items
  - Discard or revise the other items (you will learn these skills later in this course)
- You may need several rounds of revision before the scale becomes reliable and valid

Part III: Writing good items
What should be included in a test apart from the basic items?
- Clear instructions for subjects
  - Give examples whenever possible
- Questions for demographic information
  - Age, gender, education level, etc.
- A declaration of how the collected information will be used
  - Is it confidential?
  - The purpose of collecting the data, who will assess the data, etc.

Part IV: Multiple-item scaling
True score theory
- Because precision is limited, measures may vary from time to time
- The true score can hardly be obtained from a single measure
  - A single measure may overestimate or underestimate the true score; the errors are assumed to be randomly distributed
- The true score can be approximated by averaging multiple responses
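
The averaging argument can be illustrated with a small simulation. This is only a sketch: the true score of 50, the normally distributed error with a standard deviation of 5, and all function names are assumptions chosen for the example, not values from the lecture.

```python
import random
from statistics import fmean

def observed_score(true_score, error_sd, rng):
    """One observed score = true score + random error with mean zero."""
    return true_score + rng.gauss(0, error_sd)

def mean_abs_error(n_responses, true_score=50.0, error_sd=5.0, trials=2000, seed=42):
    """Average distance between the true score and the mean of n responses."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        responses = [observed_score(true_score, error_sd, rng) for _ in range(n_responses)]
        errors.append(abs(fmean(responses) - true_score))
    return fmean(errors)

single_item_error = mean_abs_error(1)
ten_item_error = mean_abs_error(10)
print("one response: ", round(single_item_error, 2))
print("ten responses:", round(ten_item_error, 2))
```

Under these assumptions the mean of ten responses lands noticeably closer to the true score than a single response does, because overestimates and underestimates partly cancel out.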

Part IV: Multiple-item scaling
The concept of multiple-item scaling
- Suppose I want to measure a person’s attitude toward whether secondary students can have romantic relationships
- Single item
  - “I encourage secondary students to develop romantic relationships”
- Two items
  - “I encourage secondary students to develop romantic relationships”
  - “We should respect secondary students’ freedom to engage in romantic relationships”
- The issues of reliability and validity

Part IV: Multiple-item scaling
Advantages of multiple-item scaling
- It allows testing the nature of a construct
  - A set of items may capture more than one dimension
    - Intelligence: memory span, verbal ability, and visual-spatial ability
  - Items measuring the same dimension or the same construct should be highly inter-correlated
  - Items measuring different dimensions are supposed to have relatively low inter-item correlations
  - It allows us to check scale validity through factor analysis (discussed again later)
- It increases reliability and validity
  - Remember the concept of true score theory
  - Random errors may affect any particular single response
  - The effects of random errors can be “neutralized” by measuring multiple responses
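
The expectation about inter-item correlations can be checked directly on response data. The sketch below computes Pearson correlations by hand; the item names and the six respondents' answers are made-up data for illustration, and this is a simple pairwise check, not the factor analysis mentioned above.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)

# Hypothetical responses from six people on three 1-to-5 items.
items = {
    "verbal_1": [1, 2, 3, 4, 5, 4],
    "verbal_2": [2, 2, 3, 5, 5, 4],
    "spatial_1": [3, 5, 1, 4, 2, 3],
}

names = list(items)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        r = pearson(items[first], items[second])
        print(f"{first} vs {second}: r = {r:.2f}")
```

In this made-up data set the two verbal items correlate strongly with each other and only weakly with the spatial item, which is the pattern expected when items fall on two different dimensions.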