Bias & Fairness in Tests (Rust & Golombok)

Presentation transcript:

Bias & Fairness in Tests (Rust & Golombok). Important to note: serious consequences follow from test results! Think about it from the client’s perspective. Selection test: you don’t get the job. Academic test: you lose a year’s work. Clinical test: you get or don’t get help. It is important to make the right decisions based on the results.

Bias and fairness. How “correct” our decisions are can be thought of in terms of two properties. Fairness: the social justice issues surrounding the use of the test. Bias: a statistical artefact in the test which makes it respond differently to different groups.

Fairness in tests. This is not so much a flaw of the test as of the use of the test. We can only consider fairness in terms of the societal norms which apply. We can examine the pattern of decisions which have been made based on that test.

Fairness and Justice. Depending on the focus on authority in a society, tests might be applied more or less strictly. An “unfair test” is one whose consequences do not match the value system of the society.

Fairness: an example. Imagine we set a super hard exam for 206F. Two thirds of the class fails the test. I can do two things: (1) accept the marks as they are and see you next year, or (2) adjust the test marks to increase the pass rate. Which is the fair thing to do?

Fairness: an example. If the society’s norms are such that the institution is emphasised over the individual, accepting the results is the fair thing to do. If the society’s norms are such that the individual is emphasised over the institution, adjusting the results is the fair thing to do.

Bias in tests. A bias exists in a test if it gives different results for different populations. Example: Army Alpha. Soldiers almost always scored lower than officers.

Bias: good or bad? Good: bias can be used to identify which population a client belongs to (should he be an officer or a soldier?). Bad: it creates a false impression of difference between groups, e.g. taking a test in a foreign language can “reduce” measured intelligence.

Bias: good or bad? Irrelevant: if the test only really gets used by one population, who cares? Decide on the importance of bias based on the situation. Bias is never an issue of “right” or “wrong”; it is a purely statistical concept.

Forms of bias. Bias can appear in three forms: item bias, intrinsic test bias, and extrinsic test bias. We can examine each of these sources of bias separately and address each individually.

Item bias. The bias exists in individual questions, e.g. a question about dollars, quarters and dimes would be biased against test-takers unfamiliar with that currency. Linguistic bias (idioms, slang, etc.) is a common interracial bias. This type of item is common in IQ tests (!).

Identifying item bias. Do an item analysis, i.e. check each item of the test separately. Identify possible relevant subgroups. Work out the “facility value” of each question for each group: the proportion of people in that group who get it right.

Item bias: example. Imagine we have a test with 3 questions. We think it might be language biased. Look at the groups: native English speakers and others. Work out the facility values. Native: A: 0.68, B: 0.96, C: 0.57. Other: A: 0.59, B: 0.24, C: 0.59. There is a big difference in item B, so it is biased.
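
The item analysis above can be sketched in a few lines of Python. This is only an illustration: the function names and the 0.2 cut-off for a "big difference" are assumptions, not something the slides specify. The facility values are the ones from the example.

```python
def facility_value(responses):
    """Proportion of test-takers who got the item right (1 = right, 0 = wrong)."""
    return sum(responses) / len(responses)


def flag_biased_items(group_a, group_b, threshold=0.2):
    """Return the items whose facility values differ by more than `threshold`.

    The threshold is a hypothetical choice for this sketch; the slides only
    speak of a "big difference".
    """
    return sorted(
        item for item in group_a
        if abs(group_a[item] - group_b[item]) > threshold
    )


# Facility values from the example slide
native = {"A": 0.68, "B": 0.96, "C": 0.57}
other = {"A": 0.59, "B": 0.24, "C": 0.59}

flagged = flag_biased_items(native, other)
print(flagged)  # ['B']
```

In practice the facility values would first be computed per group from raw right/wrong responses, for example `facility_value([1, 1, 0, 1])` gives 0.75.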

Item offensiveness. E.g. shown an engineer and a psychologist: “Which is smarter?” This is related to item bias, but offensive items are not necessarily biased. Offensive items should be removed, as they may interfere with subsequent items.

Intrinsic test bias. The test has different mean scores for different groups. It does not exist in specific questions (item bias), but is rather a general phenomenon. It is common between language groups. It is a matter of degree.
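
Since intrinsic bias is a matter of degree, one way to put a number on it is a standardised mean difference between groups. The sketch below uses Cohen's d with a pooled standard deviation; that choice, the function name, and the scores are all assumptions made for illustration and do not come from the slides.

```python
import math
import statistics


def standardised_mean_difference(scores_a, scores_b):
    """Cohen's d: difference in group means divided by the pooled standard deviation."""
    n_a, n_b = len(scores_a), len(scores_b)
    var_a = statistics.variance(scores_a)  # sample variance
    var_b = statistics.variance(scores_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(scores_a) - statistics.mean(scores_b)) / pooled_sd


# Hypothetical total test scores for two language groups
group_a = [52, 60, 58, 65, 55]
group_b = [45, 50, 48, 55, 42]

mean_gap = statistics.mean(group_a) - statistics.mean(group_b)
d = standardised_mean_difference(group_a, group_b)
print(mean_gap, round(d, 2))
```

A larger value of d means the groups' mean scores are further apart relative to the spread of scores within each group. On its own it does not say whether the gap comes from the test or from a real difference between the groups.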

Causes of intrinsic bias. Tests created with a specific group in mind are biased in this way. Other groups perform worse. The more different the group, the bigger the difference.

Extrinsic test bias. This occurs when the test is unbiased, but decisions made using the test are biased. E.g. a test finds a true difference, and this leads to one group getting selected more than another. Extrinsic bias is the overlap between fairness and bias.

Extrinsic bias: example. On the SAT, poorer children tend to perform worse than richer children. This is a real difference: poorer children have less access to the requirements for academic success. The SAT was used to select university applicants, so poorer children were selected far less frequently.
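
The pattern of decisions in an example like this can be examined by comparing selection rates across groups. The following is a minimal sketch; the scores, the cut-off, and the function name are invented for illustration and are not real SAT data.

```python
def selection_rate(scores, cutoff):
    """Fraction of applicants whose score meets or exceeds the cut-off."""
    selected = [s for s in scores if s >= cutoff]
    return len(selected) / len(scores)


# Hypothetical test scores and admission cut-off
richer = [1200, 1350, 1100, 1450, 1300]
poorer = [1000, 1150, 1250, 950, 1050]
cutoff = 1200

rate_richer = selection_rate(richer, cutoff)
rate_poorer = selection_rate(poorer, cutoff)
print(rate_richer, rate_poorer)  # 0.8 0.2
```

The same cut-off applied to both groups yields very different selection rates. Whether that outcome is acceptable is a question of fairness, not something the numbers decide.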

Extrinsic bias and ideology. Do you believe in “true differences”? Or do you rather believe in “unexplored potential”? It is a fact that poorer kids did worse at university (in the USA). Do we use this as a basis for not selecting them? There is no way to take a “scientific”, ideology-free standpoint on extrinsic bias.