Accounting for Individual Differences in Bradley-Terry Models by Means of Recursive Partitioning
Carolin Strobl, Florian Wickelmaier, Achim Zeileis



Three Models
Thurstone’s law of comparative judgment: the attractiveness of a stimulus is normally distributed in the population (uses the normal ogive).
Bradley-Terry-Luce (BTL) model: the probability of choosing an alternative is the ratio of its attractiveness to the sum of the attractiveness values of all alternatives under comparison (uses the logistic function).
Rasch model: has the same form, with one attractiveness fixed at 1 and the other equal to exp(theta - beta).
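The relation between the three bullet points can be checked numerically. The sketch below is only an illustration: the function names and all numeric values are assumptions, not taken from the slides. It computes a BTL choice probability as a ratio of worths, shows that it equals the logistic function of the log-worth difference, and shows that the Rasch form is the same ratio with worths exp(theta - beta) and 1.

```python
import math

def btl_probability(worth_j, worth_k):
    """Probability that stimulus j is chosen over stimulus k under the BTL model."""
    return worth_j / (worth_j + worth_k)

def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical worth values, chosen only for illustration.
worth_a, worth_b = 3.0, 1.0
p_ratio = btl_probability(worth_a, worth_b)

# The same probability, written as a logistic function of the log-worth difference.
p_logistic = logistic(math.log(worth_a) - math.log(worth_b))

# Rasch form: one worth is exp(theta - beta), the other is 1.
theta, beta = 0.5, -0.2  # hypothetical person and item parameters
p_rasch = btl_probability(math.exp(theta - beta), 1.0)

print(p_ratio, p_logistic, p_rasch)
```

Working on the log-worth scale is often convenient because the parameters are then unconstrained real numbers.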

Research Question
Preference scaling may be heterogeneous across subjects, and this heterogeneity may be related to stimulus characteristics and/or person characteristics.
Existing approaches: fit separate BT models to different groups of subjects (McGuire & Davison, 1991; Kissler & Bäuml, 2001), or add covariates explicitly to the model (Böckenholt, 2001).
Proposal: a new model-based recursive partitioning method that incorporates subject covariates into the BT model.

Procedure of Recursive Partitioning
1. Fit a BT model to the paired comparisons of all subjects in the current subsample, starting with the full sample.
2. Assess the stability of the BT model parameters with respect to each available covariate.
3. If there is significant instability, split the sample along the covariate with the strongest instability, using the cutpoint that gives the largest improvement in model fit.
4. Repeat steps 1-3 recursively in the resulting subsamples until there are no more significant instabilities (or the subsample is too small).
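The four steps above can be sketched as a short recursive function. This is a toy illustration under strong assumptions, not the authors' implementation: there are only two stimuli (A and B), so the BT fit reduces to a win proportion; the instability test is a stand-in likelihood-ratio test for a median split rather than the fluctuation test described in the slides; and all names and data are hypothetical.

```python
import math

def fit(subjects):
    """Toy two-stimulus BT fit: the MLE of P(A over B) is the win proportion."""
    return sum(s["prefers_a"] for s in subjects) / len(subjects)

def loglik(subjects):
    """Maximized Bernoulli log-likelihood of the toy model."""
    wins = sum(s["prefers_a"] for s in subjects)
    losses = len(subjects) - wins
    return sum(k * math.log(k / len(subjects)) for k in (wins, losses) if k)

def split_at(subjects, covariate, cut):
    """Split subjects into covariate <= cut and covariate > cut."""
    left = [s for s in subjects if s[covariate] <= cut]
    right = [s for s in subjects if s[covariate] > cut]
    return left, right

def instability_pvalue(subjects, covariate):
    """Stand-in test (NOT the fluctuation test): LR test for a median split."""
    values = sorted(s[covariate] for s in subjects)
    left, right = split_at(subjects, covariate, values[(len(values) - 1) // 2])
    if not left or not right:
        return 1.0
    lr = 2.0 * (loglik(left) + loglik(right) - loglik(subjects))
    return math.erfc(math.sqrt(max(lr, 0.0) / 2.0))  # chi-square(1) upper tail

def best_cut(subjects, covariate):
    """Cutpoint with the highest partitioned log-likelihood."""
    cuts = sorted({s[covariate] for s in subjects})[:-1]
    return max(cuts, key=lambda c: sum(loglik(g) for g in split_at(subjects, covariate, c)))

def partition(subjects, covariates, alpha=0.05, min_size=5):
    node = {"n": len(subjects), "p_a": fit(subjects)}                  # step 1
    if len(subjects) < 2 * min_size:
        return node
    pvals = {c: instability_pvalue(subjects, c) for c in covariates}   # step 2
    covariate = min(pvals, key=pvals.get)
    if pvals[covariate] >= alpha:
        return node
    cut = best_cut(subjects, covariate)                                # step 3
    left, right = split_at(subjects, covariate, cut)
    if min(len(left), len(right)) < min_size:
        return node
    node["split"] = (covariate, cut)
    node["left"] = partition(left, covariates, alpha, min_size)        # step 4
    node["right"] = partition(right, covariates, alpha, min_size)
    return node

# Hypothetical data: younger subjects prefer A, older subjects prefer B.
subjects = ([{"age": 20 + i, "prefers_a": 1} for i in range(20)]
            + [{"age": 50 + i, "prefers_a": 0} for i in range(20)])
tree = partition(subjects, ["age"])
print(tree["split"], tree["left"]["p_a"], tree["right"]["p_a"])
```

Keeping the test (which covariate to split on) separate from the cutpoint search (where to split) mirrors the two-stage structure of steps 2 and 3.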

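Fitting BT models
π_j are stimulus-specific parameters (called worth parameters or merits). ν is a discrimination constant. (Böckenholt added covariates to model these parameters.)
The model assigns a probability to each of three outcomes of a comparison between stimuli j and j’: j is preferred over j’, j’ is preferred over j, or the subject is undecided (a tie).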
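The formulas for the three outcomes did not survive in the transcript. One standard formulation that is consistent with the three outcomes and with a single extra constant ν is Davidson's extension of the BT model for ties. That the slide showed exactly this form is an assumption.

```latex
\begin{align*}
P(j \succ j') &= \frac{\pi_j}{\pi_j + \pi_{j'} + \nu\sqrt{\pi_j \pi_{j'}}}
  && \text{(prefer $j$ over $j'$)} \\
P(j' \succ j) &= \frac{\pi_{j'}}{\pi_j + \pi_{j'} + \nu\sqrt{\pi_j \pi_{j'}}}
  && \text{(prefer $j'$ over $j$)} \\
P(j \sim j') &= \frac{\nu\sqrt{\pi_j \pi_{j'}}}{\pi_j + \pi_{j'} + \nu\sqrt{\pi_j \pi_{j'}}}
  && \text{(undecided, a tie)}
\end{align*}
```

The three probabilities sum to one, and setting ν = 0 recovers the BT model without ties.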

Assessing parameter instability in BT models
Fluctuation test: compute subject-wise model deviations, which should fluctuate randomly around zero under the null hypothesis of parameter stability.

Subject-wise estimating function: the derivative of each subject's log-likelihood contribution with respect to the parameter vector. With the subjects ordered by a covariate, these derivatives are cumulatively aggregated along each of the covariates.
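A minimal sketch of this cumulative aggregation, assuming a toy setting with two stimuli and one comparison per subject. The model then has a single log-odds parameter, and each subject's score is the observed preference minus the fitted probability. The function name, the scaling by the estimated score variance, and the data are assumptions made for illustration.

```python
import math

def score_process(preferences, covariate):
    """Scaled cumulative sum of subject-wise scores, ordered by the covariate.

    preferences[i] is 1 if subject i prefers A over B, else 0.
    Requires 0 < mean(preferences) < 1, otherwise the scaling is undefined.
    """
    n = len(preferences)
    p_hat = sum(preferences) / n                  # MLE of P(A over B)
    scores = [y - p_hat for y in preferences]     # derivative w.r.t. the log-odds
    order = sorted(range(n), key=lambda i: covariate[i])
    scale = math.sqrt(n * p_hat * (1.0 - p_hat))  # sqrt(n) times score std. dev.
    process, running = [], 0.0
    for i in order:
        running += scores[i]
        process.append(running / scale)
    return process

# Hypothetical data: subjects younger than 40 prefer A, older subjects prefer B.
ages = list(range(20, 60))
preferences = [1 if age < 40 else 0 for age in ages]
process = score_process(preferences, ages)
peak_index = max(range(len(process)), key=lambda i: abs(process[i]))
print(ages[peak_index], process[peak_index], process[-1])
```

Because the scores sum to zero at the maximum likelihood estimate, the process always returns to zero at the end. Under parameter stability it stays close to zero throughout, whereas a systematic change along the covariate produces a pronounced peak near the change point.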

Test statistics for systematic deviations: separate statistics are used for numeric and for categorical covariates. The variable with the smallest p-value is selected for splitting the sample, which determines the sequence of splits in the recursive partition.
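The two formulas labeled "Numeric" and "Categorical" are not preserved in the transcript. The block below gives forms commonly used for score-based fluctuation tests in model-based recursive partitioning; that these match the slide is an assumption, as is the notation: ψ is the subject-wise estimating function, \hat{V} its estimated covariance matrix, y_{(i|l)} the observations ordered by covariate l, n_q the number of subjects in category q, and \underline{i}, \overline{i} the bounds of the search interval.

```latex
\begin{align*}
W_l(t) &= \hat{V}^{-1/2}\, n^{-1/2} \sum_{i=1}^{\lfloor nt \rfloor}
          \psi\bigl(y_{(i|l)}, \hat{\theta}\bigr), \qquad 0 \le t \le 1 \\[4pt]
\text{Numeric:}\quad
S_l &= \max_{i = \underline{i}, \dots, \overline{i}}
       \left( \frac{i}{n} \cdot \frac{n-i}{n} \right)^{-1}
       \left\lVert W_l\!\left(\tfrac{i}{n}\right) \right\rVert_2^2 \\[4pt]
\text{Categorical:}\quad
S_l &= \sum_{q=1}^{Q} \left( \frac{n_q}{n} \right)^{-1}
       \left\lVert \Delta_q W_l \right\rVert_2^2
\end{align*}
```

Here Δ_q W_l denotes the increment of the process over the subjects in category q. Large values of S_l indicate systematic deviations from parameter stability along covariate l.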

Age has the smallest p-value and is therefore used as the first splitting variable in Figure 2.

Cutpoint selection in BT models
After the l-th covariate has been chosen for splitting, the cutpoint is selected by maximizing the partitioned likelihood. In the example, the selected cutpoint for age is 52.

For numeric and ordered covariates, each candidate cutpoint divides the ordered values into two groups. For unordered covariates, the Q categories of an unordered categorical covariate can be split into any two groups. The partition with the maximal likelihood is chosen.
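Both kinds of search can be sketched as an exhaustive comparison of partitioned log-likelihoods. As a simplifying assumption, the example uses only two stimuli with one comparison per subject, so the maximized log-likelihood of a group has a closed form. All function names and data are hypothetical.

```python
import math
from itertools import combinations

def loglik(outcomes):
    """Maximized log-likelihood of a toy two-stimulus BT model (1 = prefers A)."""
    wins = sum(outcomes)
    losses = len(outcomes) - wins
    return sum(k * math.log(k / len(outcomes)) for k in (wins, losses) if k)

def binary_partitions(categories):
    """All 2^(Q-1) - 1 ways to split Q categories into two non-empty groups."""
    cats = sorted(categories)
    first, rest = cats[0], cats[1:]
    parts = []
    for size in range(len(rest)):            # the second group is never empty
        for combo in combinations(rest, size):
            left = {first, *combo}
            parts.append((left, set(cats) - left))
    return parts

def best_numeric_cut(values, outcomes):
    """Cutpoint c maximizing loglik(values <= c) + loglik(values > c)."""
    def partitioned(cut):
        left = [y for x, y in zip(values, outcomes) if x <= cut]
        right = [y for x, y in zip(values, outcomes) if x > cut]
        return loglik(left) + loglik(right)
    candidates = sorted(set(values))[:-1]    # the largest value cannot be a cut
    return max(candidates, key=partitioned)

def best_categorical_split(labels, outcomes):
    """Two-group partition of the categories with the highest log-likelihood."""
    def partitioned(part):
        left = [y for c, y in zip(labels, outcomes) if c in part[0]]
        right = [y for c, y in zip(labels, outcomes) if c in part[1]]
        return loglik(left) + loglik(right)
    return max(binary_partitions(set(labels)), key=partitioned)

# Hypothetical data for illustration only.
ages = [20, 25, 30, 35, 40, 45, 50, 55]
prefers_a = [1, 1, 1, 1, 0, 0, 0, 0]
numeric_cut = best_numeric_cut(ages, prefers_a)

fields = ["a", "b", "c", "a", "b", "c"]
prefers_a_by_field = [1, 0, 1, 1, 0, 1]
categorical_split = best_categorical_split(fields, prefers_a_by_field)
print(numeric_cut, categorical_split)
```

The number of candidate partitions for an unordered covariate grows exponentially with Q, so exhaustive enumeration is only practical for a moderate number of categories.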

Application
Example 1: Germany’s Next Topmodel 2007 data

The worth parameter estimates show the preference for each candidate (in other words, the attractiveness of the candidate to the subjects). For subjects aged >=52, the probability of choosing Barbara over Anni is [value not preserved in the transcript].
[Figure: worth parameter estimates per subgroup. Panels: <=52, q2=y; <=52, q2=n, M; <=52, q2=n, F; >=52]

Example 2: CEMS university choice data

The first split is on Italian skill and the second on Spanish/French skill; then, under French, the third split is on study field.

Discussion
The recursive partitioning approach is nonparametric (data-driven) and is therefore more flexible in detecting nonlinear and interaction effects of covariates.
Numeric and categorical covariates are treated more naturally: the cutpoint for a numeric covariate is selected automatically from the data.
By contrast, the fully parametric approach requires not only an active selection of the covariates but also an explicit choice of the functional form in which they enter the model.

Latent class approach vs. recursive partitioning approach.
Future work: application of the recursive partitioning approach to the Rasch model.

Comments
A fundamental problem in this paper is that the equal-mean-difficulty (EMD) constraint was used. Consequently, the deviations from the mean sum to zero across the persons in Figure 1. Without the EMD constraint, one person's estimate could be used as an anchor. Moreover, it would be desirable to combine a purification procedure with the recursive partitioning approach.