Estimation of Ability Using Globally Optimal Scoring Weights Shin-ichi Mayekawa Graduate School of Decision Science and Technology Tokyo Institute of Technology.


2 Outline
Review of existing methods
Globally Optimal Weight: a set of weights that maximizes the Expected Test Information
Intrinsic Category Weights
Examples
Conclusions

3 Background
Estimation of IRT ability θ on the basis of the simple and weighted summed score X.
Conditional distribution of X given θ as the distribution of the weighted sum of the Scored Multinomial Distribution.
Posterior distribution of θ given X: h(θ|x) ∝ f(x|θ) h(θ).
Posterior Mean (EAP) of θ given X; Posterior Standard Deviation (PSD).
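The Bayesian step on this slide can be sketched numerically as follows (a minimal illustration, all names are ours; it assumes f(x|θ) has already been evaluated on a quadrature grid):

```python
import numpy as np

# EAP and PSD of theta given each summed score x, on a quadrature grid.
# f_x_given_theta: shape (n_scores, n_quad), entries f(x | theta_q)
# theta_grid:      shape (n_quad,), quadrature points
# prior:           shape (n_quad,), h(theta_q) times quadrature weight
def eap_psd(f_x_given_theta, theta_grid, prior):
    joint = f_x_given_theta * prior                    # f(x | theta) h(theta)
    post = joint / joint.sum(axis=1, keepdims=True)    # h(theta | x)
    eap = post @ theta_grid                            # E[theta | x]
    psd = np.sqrt(post @ theta_grid**2 - eap**2)       # SD[theta | x]
    return eap, psd
```

For a one-item binary test, f(x|θ) is just the IRF and its complement, and the two EAP values straddle the prior mean.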

4 Item Score We must choose the item weights w to calculate X. (Figure: IRF)

5 Item Score We must choose the item weights w and the category weights v to calculate X. (Figure: ICRF)

6 Conditional distribution of X given θ
Binary items: conditional distribution of the summed score X.
Simple sum: Walsh (1955), Lord (1969)
Weighted sum: Mayekawa (2003)
Polytomous items: conditional distribution of the summed score X.
Simple sum: Hanson (1994), Thissen et al. (1995)
With item weights and category weights: Mayekawa & Arai (2007)
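At a fixed θ, such a conditional distribution can be built by convolving items one at a time, a weighted/polytomous generalization of the Lord-Wingersky recursion. A sketch (names are illustrative, not from the talk):

```python
# Distribution of the weighted summed score X at one fixed theta.
# probs[j][k]  = P_jk(theta), the ICRF of category k of item j
# scores[j][k] = w_j * v_jk, the weighted category score
def score_distribution(probs, scores):
    dist = {0.0: 1.0}                      # distribution of X after 0 items
    for p_j, s_j in zip(probs, scores):
        new = {}
        for x, px in dist.items():
            for p, s in zip(p_j, s_j):
                key = round(x + s, 10)     # guard against float drift in keys
                new[key] = new.get(key, 0.0) + px * p
        dist = new
    return dist
```

With simple 0/1 scores this reduces to the familiar binomial-type recursion for the number-correct score.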

7 Example Eight Graded Response Model items, with 3 categories per item.

8 Example (choosing weights) Example: Mayekawa and Arai (2008). Small posterior variance → good weight. Large Test Information (TI) → good weight.

9 Test Information Function
The Test Information Function is proportional to the squared slope of the conditional expectation of X given θ (the TCC), and inversely proportional to the squared width of the confidence interval (CI) of θ given X.
Width of CI: proportional to the conditional standard deviation of X given θ.
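In symbols (a reconstruction of the relations stated on this slide, with T(θ) denoting the TCC, the conditional expectation of X given θ):

```latex
I(\theta) \;=\; \frac{\left[T'(\theta)\right]^2}{\sigma^2(X\mid\theta)},
\qquad
T(\theta) \;=\; E[X\mid\theta] \;=\; \sum_j \sum_k w_j\, v_{jk}\, P_{jk}(\theta),
\qquad
\text{CI width} \;\propto\; \frac{\sigma(X\mid\theta)}{T'(\theta)} \;=\; \frac{1}{\sqrt{I(\theta)}} .
```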

10 Confidence interval (CI) of θ given X

11 Test Information Function for Polytomous Items (Figure: ICRF)

12 Maximization of the Test Information when the category weights are known. Category weighted Item Score and the Item Response Function

13 Maximization of the Test Information when the category weights are known.

14 Maximization of the Test Information when the category weights are known. Test Information

15 Maximization of the Test Information when the category weights are known. First Derivative

16 Maximization of the Test Information when the category weights are known.

17 Globally Optimal Weight A set of weights that maximizes the Expected Test Information with some reference distribution of θ. It does NOT depend on θ.
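A toy sketch of this idea for binary 2PL items (an assumed functional form; the talk works with the GRM and analytic derivatives). It maximizes the expected information over a reference distribution h(θ) by numerical gradient ascent, renormalizing w since a rescaling of the weights leaves the information unchanged:

```python
import numpy as np

# Expected test information E_h[I(theta; w)] for binary 2PL items with
# discriminations a and difficulties b, on a grid theta with weights h.
def expected_info(w, a, b, theta, h):
    z = a[:, None] * (theta[None, :] - b[:, None])
    P = 1.0 / (1.0 + np.exp(-z))             # IRF of each item, shape (J, Q)
    dP = a[:, None] * P * (1 - P)            # slope of each IRF
    num = (w @ dP) ** 2                      # squared slope of the TCC
    den = (w ** 2) @ (P * (1 - P))           # conditional variance of X
    return np.sum(num / den * h)

def optimal_weights(a, b, theta, h, steps=2000, lr=0.01):
    w = np.ones(len(a))
    for _ in range(steps):
        g = np.zeros_like(w)
        for j in range(len(w)):              # numerical gradient (sketch only)
            e = np.zeros_like(w); e[j] = 1e-6
            g[j] = (expected_info(w + e, a, b, theta, h)
                    - expected_info(w - e, a, b, theta, h)) / 2e-6
        w = w + lr * g
        w = w / np.linalg.norm(w)            # scale of w is arbitrary
    return w
```

For 2PL items the optimal item weight is proportional to the discrimination a_j, so the more discriminating item should end up with the larger weight.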

18 Example (Table: weighting schemes LOx, LO, GO, GOINT, A, AINT, CONST for items Q1-Q8)

19 Maximization of the Test Information with respect to the category weights. Absorb the item weights into the category weights.

20 Maximization of the Test Information with respect to the category weights. Test Information Linear transformation of the category weights does NOT affect the information.

21 Maximization of the Test Information with respect to the category weights. First Derivative

22 Maximization of the Test Information with respect to the category weights. Locally Optimal Weight

23 Globally Optimal Weight Weights that maximize the Expected Test Information with some reference distribution of θ.

24 Intrinsic category weight A set of weights which maximizes: Since the category weights can be linearly transformed, we set v0 = 0, …, vmax = the maximum item score.
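Because an affine transformation of the category weights leaves the test information unchanged (slide 20), this anchoring is purely an identification constraint. A one-function sketch (our illustration, not code from the talk):

```python
import numpy as np

def normalize_category_weights(v, max_score):
    """Affinely rescale category weights so that v[0] = 0 and
    v[-1] = max_score. Since the information is invariant under such
    transformations, this only fixes the scale, not the solution."""
    v = np.asarray(v, dtype=float)
    return (v - v[0]) / (v[-1] - v[0]) * max_score
```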

25 Example of Intrinsic Weights

26 Example of Intrinsic Weights h(θ) = N(-0.5, 1): v0=0, v1=*, v2=2

27 Example of Intrinsic Weights h(θ) = N(0.5, 1): v0=0, v1=*, v2=2

28 Example of Intrinsic Weights h(θ) = N(1, 1): v0=0, v1=*, v2=2

29 Summary of Intrinsic Weight It does NOT depend on θ, but it does depend on the reference distribution of θ, h(θ), as follows. For the 3-category GRM, we found that for items with a high discrimination parameter, the intrinsic weights tend to become equally spaced: v0=0, v1=1, v2=2. The Globally Optimal Weight is not identical to the Intrinsic Weights.

30 Summary of Intrinsic Weight For the 3-category GRM, we found that the mid-category weight v1 increases according to the location of the peak of the ICRF. That is, the easier the category, the higher the weight. v1 is affected by the relative locations of the other two category ICRFs.

31 Summary of Intrinsic Weight For the 3-category GRM, we found that the mid-category weight v1 decreases according to the location of the reference distribution of θ, h(θ). If the location of h(θ) is high, the most difficult category gets a relatively high weight, and vice versa. When the peak of the 2nd category matches the mean of h(θ), we have equally spaced category weights: v0=0, v1=1, v2=2.

32 Globally Optimal w given v

33 Test Information (Figure: test information for weights LOx, LO, GO, GOINT, CONST)

34 Test Information

35 Bayesian Estimation of θ from X

36 Bayesian Estimation of θ from X

37 Bayesian Estimation of θ from X (1/0.18)² ≈ 30.9
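The arithmetic on this slide appears to use the standard relation between the posterior standard deviation and the information, so that a PSD of 0.18 corresponds to:

```latex
I \;\approx\; \frac{1}{\mathrm{PSD}^2}, \qquad
\mathrm{PSD} = 0.18 \;\Rightarrow\; I \approx \left(\tfrac{1}{0.18}\right)^2 \approx 30.9 .
```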

38 Conclusions Polytomous items have Intrinsic Weights. By maximizing the Expected Test Information with respect to either the item or the category weights, we can calculate the Globally Optimal Weights, which do not depend on θ. Use of the Globally Optimal Weights when evaluating the EAP of θ given X reduces the posterior variance.

39 References

40 Thank you for your attention.
