Thinning Measurement Models and Questionnaire Design


Thinning Measurement Models and Questionnaire Design
Ricardo Silva, University College London
ricardo@stats.ucl.ac.uk
Neural Information Processing Systems 2011, Granada, Spain

The Joy of Questionnaires

Postulated Usage
A measurement model for latent traits
Information of interest is postulated to be in the latent variables

Task
Given a measurement model, select a subset of given size K that preserves information about the latent traits
X = latent variables, Y = observations, Y_z = a subset of size K
Criterion: "Longer surveys take more time to complete, tend to have more missing data, and have higher refusal rates than short surveys. Arguably, then, techniques to reduce the length of scales while maintaining psychometric quality are worthwhile." (Stanton, 2002)

Optimization Problem
Solve for z maximizing the mutual information I(X; Y_z) = H_M(Y_z) − H(Y_z | X), subject to each z_i ∈ {0, 1} and Σ_i z_i = K, where H_M(Y_z) is the marginal entropy of the selected items
Notice: equivalent to minimizing the conditional entropy H(X | Y_z)
Other divergence functions could be used in principle
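
As a minimal illustration of this criterion (not the paper's code; the toy model, item probabilities, and function names below are invented), the sketch below evaluates I(X; Y_z) by brute-force enumeration for one binary latent trait and four binary items that are conditionally independent given X, then picks the best subset of size K exhaustively. The combinatorial blow-up of this exhaustive search is exactly what the methods that follow try to avoid.

```python
# Toy illustration of the selection criterion (not the paper's code): one binary
# latent trait X, four binary items Y_i conditionally independent given X.
# All probabilities below are made up.
from itertools import combinations, product
import numpy as np

p_x = np.array([0.5, 0.5])                       # prior over X in {0, 1}
p_y1_given_x = np.array([[0.2, 0.9],             # p(Y_i = 1 | X = x), row i, column x
                         [0.3, 0.8],
                         [0.4, 0.6],
                         [0.1, 0.7]])

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def joint_marginal(items):
    """p(Y_items) obtained by summing out X."""
    probs = []
    for y in product([0, 1], repeat=len(items)):
        total = 0.0
        for x in (0, 1):
            lik = p_x[x]
            for i, yi in zip(items, y):
                q = p_y1_given_x[i, x]
                lik *= q if yi == 1 else 1.0 - q
            total += lik
        probs.append(total)
    return np.array(probs)

def mutual_information(items):
    """I(X; Y_z) = H_M(Y_z) - H(Y_z | X); the conditional term factorizes over items."""
    h_marginal = entropy(joint_marginal(items))
    h_conditional = 0.0
    for x in (0, 1):
        for i in items:
            q = p_y1_given_x[i, x]
            h_conditional += p_x[x] * entropy(np.array([q, 1.0 - q]))
    return h_marginal - h_conditional

K = 2
best = max(combinations(range(4), K), key=mutual_information)
print("best subset of size", K, ":", best)   # exhaustive search: infeasible for real questionnaires
```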

Issues
Hard combinatorial optimization problem
Even the objective function cannot be calculated exactly
Focus: models where small marginals of Y can be calculated without major effort
Working example: the multivariate probit model
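
For concreteness, here is a hedged sketch of why small marginals are cheap in the multivariate probit working example: each binary item is a thresholded coordinate of an underlying Gaussian vector, so the probability of any configuration of a small subset of items is a low-dimensional Gaussian orthant probability. The mean, covariance, and function name below are illustrative only; in the fitted measurement model they would be induced by the loadings on the latent traits. Assumes scipy is available.

```python
# Sketch: small marginals in a multivariate probit model. Y_i = 1 if Z_i > 0,
# with Z ~ N(mu, Sigma); p(Y_S = y_S) for a small subset S is a low-dimensional
# Gaussian orthant probability. mu and Sigma are invented here.
import numpy as np
from scipy.stats import multivariate_normal  # assumes scipy is installed

mu = np.array([0.3, -0.2, 0.5, 0.0])
Sigma = np.array([[1.0, 0.4, 0.3, 0.2],
                  [0.4, 1.0, 0.5, 0.1],
                  [0.3, 0.5, 1.0, 0.3],
                  [0.2, 0.1, 0.3, 1.0]])

def probit_marginal(items, values):
    """p(Y_items = values): flip signs so every outcome becomes a 'below zero' event."""
    idx = list(items)
    signs = np.array([-1.0 if v == 1 else 1.0 for v in values])
    flipped_mu = signs * mu[idx]
    flipped_Sigma = np.outer(signs, signs) * Sigma[np.ix_(idx, idx)]
    return multivariate_normal(mean=flipped_mu, cov=flipped_Sigma).cdf(np.zeros(len(idx)))

print(probit_marginal([0, 2], [1, 0]))   # p(Y_1 = 1, Y_3 = 0)
```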

An Initial Approach
Gaussian approximation: approximate H_M(Y_z) by the entropy of a Gaussian with the same covariance, (1/2) log det(2πe Σ_zz); maximize this by greedy search
Covariance matrix: use the "exact" covariance matrix of Y
Doable in models such as the probit without an Expectation-Propagation scheme
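
A minimal sketch of this greedy procedure, assuming the item covariance matrix is already available: the marginal entropy of a candidate selection is replaced by the entropy of a Gaussian with the same covariance, and items are added one at a time. The cheap per-item conditional entropies H(Y_i | X) that enter the full objective are omitted here for brevity; the covariance values are made up.

```python
# Sketch of the initial approach: Gaussian entropy surrogate + greedy forward
# selection. Sigma below is a made-up item covariance matrix; in the paper's
# setting it would be the "exact" covariance of Y under the fitted probit model.
import numpy as np

def gaussian_entropy(Sigma, items):
    """Entropy of a Gaussian with covariance Sigma restricted to 'items'."""
    sub = Sigma[np.ix_(items, items)]
    return 0.5 * np.linalg.slogdet(2.0 * np.pi * np.e * sub)[1]

def greedy_select(Sigma, K):
    """Add, one at a time, the item that most increases the surrogate entropy."""
    selected, remaining = [], list(range(Sigma.shape[0]))
    for _ in range(K):
        best = max(remaining, key=lambda j: gaussian_entropy(Sigma, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

Sigma = np.array([[1.0, 0.6, 0.2],
                  [0.6, 1.0, 0.1],
                  [0.2, 0.1, 1.0]])
print(greedy_select(Sigma, 2))
```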

Alternative: Using Entropy Bounds
Upper bound the marginal entropy by a sum of conditional entropies: H(Y) ≤ Σ_i H(Y_e(i) | Y_N(e, i))
Exploit this using a given ordering e of the variables (Globerson and Jaakkola, 2007) and a choice of neighborhood N(e, i) among the predecessors of e(i)
Optimize the ordering using a permutation-optimization method to get the tightest bound, e.g., Teyssier and Koller (2005), Jaakkola et al. (2010), etc.
Notice the role of the previous assumption: each entropy term can be computed in a probit model for small N(e, i)
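
The sketch below illustrates the bound on a toy joint distribution over four binary items (invented for illustration): given an ordering e and neighborhoods N(e, i) drawn from the predecessors, the sum of conditional entropies upper-bounds the exact entropy, and each term only touches a small marginal, which is what the probit assumption makes affordable.

```python
# Toy illustration of the conditional-entropy upper bound
#   H(Y) <= sum_i H(Y_e(i) | Y_N(e, i)),  N(e, i) within the predecessors of e(i).
# The joint table is random/hypothetical; in the probit setting each term would
# instead be computed from a small model marginal.
import numpy as np

rng = np.random.default_rng(0)
n = 4
joint = rng.random((2,) * n)             # hypothetical joint over 4 binary items
joint /= joint.sum()

def entropy(p):
    p = np.atleast_1d(p).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def marginal(items):
    axes = tuple(i for i in range(n) if i not in items)
    return joint.sum(axis=axes)

def conditional_entropy(i, cond):
    """H(Y_i | Y_cond) = H(Y_i, Y_cond) - H(Y_cond); only small marginals are needed."""
    return entropy(marginal(sorted(set([i]) | set(cond)))) - entropy(marginal(sorted(cond)))

ordering = [0, 1, 2, 3]
neighborhoods = {0: [], 1: [0], 2: [1], 3: [2]}   # a |N(e, i)| = 1 choice
bound = sum(conditional_entropy(v, neighborhoods[v]) for v in ordering)
print("exact H(Y):", entropy(joint), " upper bound:", bound)
```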

Method I: Bounded Neighborhood
Plug the approximate entropy terms into the objective function: each selected item contributes a conditional entropy given the selected members of its neighborhood N(e, i)
ILP formulation: first define auxiliary variables over the power set of each neighborhood, then linearize the resulting products of selection variables following Glover and Woolsey (1974)
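
As a hedged sketch of the linearization step only (this is a toy problem, not the paper's actual ILP; the coefficients are invented and the example assumes the pulp package): a product of binary selection variables, which appears when an entropy term depends on exactly which neighbors are selected, is replaced by an auxiliary variable subject to the standard Glover and Woolsey (1974) constraints.

```python
# Toy ILP sketch of the Glover & Woolsey (1974) linearization (not the paper's
# actual formulation; coefficients are invented). Requires the 'pulp' package.
import pulp

prob = pulp.LpProblem("toy_thinning_ilp", pulp.LpMaximize)
z = [pulp.LpVariable(f"z{i}", cat="Binary") for i in range(3)]   # item selectors
w = pulp.LpVariable("w_012", lowBound=0, upBound=1)              # stands for z0*z1*z2

# Linearization of w = z0 * z1 * z2
for zi in z:
    prob += w <= zi
prob += w >= pulp.lpSum(z) - (len(z) - 1)

# Hypothetical linear objective: per-item terms plus a term attached to the product
prob += 0.7 * z[0] + 0.5 * z[1] + 0.4 * z[2] - 0.3 * w

# Select exactly K = 2 items
prob += pulp.lpSum(z) == 2

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({v.name: v.value() for v in prob.variables()})
```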

Method I: Interpretation
Entropy for the full model: that of a Bayesian network (DAG model) approximation
Entropy for submodels (K items): that of a crude marginal approximation to the DAG approximation
[Figures: the starting model; its DAG approximation with |N| = 1 for z = (1, 1, 1); the marginal of that approximation for z = (1, 0, 1); and the actual marginal used for z = (1, 0, 1)]

Method II: Tree Structures
Motivation: different selections for z don't wipe out marginal dependencies

Method II: Interpretation
For a fixed ordering, find a directed tree-structured submodel of size K
[Figures: a tree approximation for z = (1, 1, 1, 1); a tree approximation for z = (1, 1, 0, 1) in which Y_4 keeps Y_2 as its parent if it is more strongly associated with Y_2 than with Y_1; and the alternative with Y_1 as the parent]
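
A small sketch of evaluating this tree-structured approximation for a given selection and ordering (it reuses the same toy joint-table construction as the earlier bound sketch; names and numbers are invented, and the paper folds the parent choice into the optimization rather than computing it post hoc like this): each selected item conditions on the single previously selected item with which it is most strongly associated.

```python
# Toy sketch of the tree-structured entropy approximation for a fixed ordering:
#   H(Y_selected) ~ sum_i H(Y_i | Y_parent(i)),  parent(i) among earlier selected items.
# Since each item conditions on at most one predecessor, this is also an upper bound.
import numpy as np

rng = np.random.default_rng(1)
n = 4
joint = rng.random((2,) * n)             # hypothetical joint over 4 binary items
joint /= joint.sum()

def entropy(p):
    p = np.atleast_1d(p).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def marginal(items):
    axes = tuple(i for i in range(n) if i not in items)
    return joint.sum(axis=axes)

def cond_entropy(i, j):
    """H(Y_i | Y_j): needs only a 2x2 marginal."""
    return entropy(marginal(sorted([i, j]))) - entropy(marginal([j]))

def tree_entropy(selected):
    """Sum of H(Y_i | Y_parent(i)) with the parent chosen among earlier selected items."""
    total, previous = 0.0, []
    for i in selected:                        # 'selected' is listed in the fixed ordering
        if not previous:
            total += entropy(marginal([i]))   # the first selected item has no parent
        else:
            total += min(cond_entropy(i, j) for j in previous)   # strongest association
        previous.append(i)
    return total

print(tree_entropy([0, 1, 3]))   # e.g. the selection z = (1, 1, 0, 1)
```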

Evaluation
Synthetic models, evaluated against a fairly good heuristic, the reliability score
Protocol: for different choices of K, obtain a selection Y_z from each of the four methods; given the selection, evaluate how well the latent posterior expectations approximate those obtained under the model using all of Y

Evaluation

Evaluation: NHS Data
Model fit using 50,000 points, pre-selection of 60+ items
Structure follows the structure of the questionnaire
Relative improvement over the reliability score and absolute mean squared errors
Some more thoughts in the supplementary material

Conclusion
How to combine the latent-information criterion with other desirable criteria?
Integrating the order-optimization step with the item-selection step
Incorporating uncertainty in the model
Relation to adaptive methods in computer-based questionnaires