The Dirichlet Labeling Process for Functional Data Analysis
XuanLong Nguyen & Alan E. Gelfand
Duke University Machine Learning Group
Presented by Lu Ren

Outline
• Introduction
• Formalizing the model
• Properties of the Labeling Process
• Identifiability
• Model fitting and inference
• Applications
• Conclusions

Introduction
1. Functional Data
Suppose we have a collection of functions (e.g., random curves or surfaces), each viewed as a stochastic process realization with observations at a common set of locations.
2. Dirichlet Labeling Process
For a particular process realization, we assume that the observation at a given location can be allocated to one of several groups via a random allocation process.
3. The Primary Objective
Examine clustering of the set of curves.

Introduction
4. Connections with other models
• Dirichlet Process (DP) mixture model: global clustering
• Dependent Dirichlet Process (DDP) mixture model: local clustering
• Generalized spatial DP mixture model: thresholding a latent Gaussian process

Model Formalization
Noisy curve realizations $Y_1, \dots, Y_n$ over a domain $D$, obtained at local sites $x_1, \dots, x_m \in D$.
The corresponding latent curves: $\theta_1, \dots, \theta_n$.
Each curve is described by a label function $Z_i : D \to \{1, \dots, k\}$.
The Dirichlet labeling process generates a random distribution $G \sim \mathrm{DP}(\nu, G_0)$ over label functions, and also a marginal multinomial distribution with $P(Z(x) = l) = 1/k$ for $l = 1, \dots, k$.

Model Formalization
Assume a collection of "canonical" species $\theta^*_1, \dots, \theta^*_k$. The latent curve $\theta_i$ is realized at each location by indexing with the labels, i.e., $\theta_i(x) = \theta^*_l(x)$ if $Z_i(x) = l$. Equivalently,
$$\theta_i(x) = \sum_{l=1}^{k} \theta^*_l(x)\, 1(Z_i(x) = l).$$
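
To make the indexing concrete, here is a minimal numpy sketch of the equation above; the names (latent_curve, theta_star, Z_i) are illustrative rather than taken from the authors' code, and labels are 0-based.

```python
import numpy as np

def latent_curve(theta_star: np.ndarray, Z_i: np.ndarray) -> np.ndarray:
    """Compose theta_i(x_j) = theta*_{Z_i(x_j)}(x_j) by indexing.

    theta_star: (k, m) canonical curves evaluated at m sites.
    Z_i: length-m integer label vector with values in {0, ..., k-1}.
    """
    m = theta_star.shape[1]
    # At each site j, pick the value of the canonical curve chosen by Z_i[j].
    return theta_star[Z_i, np.arange(m)]

# Example: k = 3 canonical curves on m = 5 sites.
theta_star = np.vstack([np.linspace(0, 1, 5),
                        np.ones(5),
                        np.linspace(1, 0, 5)])
Z_i = np.array([0, 0, 1, 2, 2])       # label switches along the domain
print(latent_curve(theta_star, Z_i))  # [0.   0.25 1.   0.25 0.  ]
```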

Model Formalization
$G$ is a random probability measure on $\{1, \dots, k\}^D$: $G \sim \mathrm{DP}(\nu, G_0)$, where $G_0$ is a base measure on $\{1, \dots, k\}^D$ constructed such that:
1. $G_0$ has a uniform marginal distribution at every location;
2. $G_0$ inherits the spatial dependence structure via a Gaussian process $\eta$ on $D$.
Denote by $G_0^{x_1, \dots, x_m}$ the finite-dimensional distributions of $G_0$. Let $\eta$ be a Gaussian process on $D$ and consider $U(x_j) = F_{x_j}(\eta(x_j))$, where $F_{x_j}$ denotes the cumulative distribution function at $x_j$ for $\eta(x_j)$.

Model Formalization
The vector $(U(x_1), \dots, U(x_m))$ has uniform marginals and induces a joint distribution function denoted by $H$ on $[0,1]^m$. Let $0 = q_0 < q_1 < \dots < q_k = 1$ be an increasing sequence of thresholds in $[0,1]$ such that $q_l - q_{l-1} = 1/k$ for $l = 1, \dots, k$. If we define $Z(x) = l$ whenever $U(x) \in (q_{l-1}, q_l]$, then each $u$ drawn from $H$ yields a label vector $(Z(x_1), \dots, Z(x_m))$. Discretizing $[0,1]^m$ into hyper-cubes $\prod_{j=1}^m (q_{l_j-1}, q_{l_j}]$, we obtain
$$G_0^{x_1, \dots, x_m}\big(Z(x_1) = l_1, \dots, Z(x_m) = l_m\big) = H\Big(\prod_{j=1}^m (q_{l_j-1}, q_{l_j}]\Big).$$
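
The thresholding construction can be sketched in a few lines of Python. This sketch assumes an exponential covariance $\sigma^2 \exp(-\phi \lVert x - x' \rVert)$ and the parameter values shown; both are illustrative choices rather than the paper's, and labels are 0-based.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def sample_label_vector(x, k, sigma=1.0, phi=2.0):
    """Draw one label vector from G_0 by thresholding a transformed GP."""
    d = np.abs(x[:, None] - x[None, :])      # pairwise distances on 1-D sites
    K = sigma**2 * np.exp(-phi * d)          # assumed exponential covariance
    eta = rng.multivariate_normal(np.zeros(len(x)), K)
    u = norm.cdf(eta / sigma)                # uniform marginal at every site
    # Thresholds q_l = l/k cut [0, 1] into k equal cells; digitize maps
    # u in (q_{l-1}, q_l] to label l (0-based here).
    q = np.arange(1, k) / k
    return np.digitize(u, q)

x = np.linspace(0, 1, 25)
print(sample_label_vector(x, k=3))  # spatially coherent labels in {0, 1, 2}
```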

Model Formalization
According to the definition of the DP, the finite-dimensional measures satisfy $G^{x_1, \dots, x_m} \sim \mathrm{DP}(\nu, G_0^{x_1, \dots, x_m})$. Similarly, we define auxiliary variables $u_i$ on $[0,1]^m$ for $i = 1, \dots, n$: $u_i \sim Q$ such that $Z_i(x_j) = l$ iff $u_i(x_j) \in (q_{l-1}, q_l]$, where $Q \sim \mathrm{DP}(\nu, H)$.
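
One way to simulate from these finite-dimensional DPs is a truncated stick-breaking construction whose atoms are label vectors drawn from the thresholded-GP base measure. The sketch below reuses sample_label_vector from the previous block; the truncation level T is an approximation choice, not part of the model.

```python
import numpy as np

rng = np.random.default_rng(1)

def stick_breaking_labels(x, k, nu=1.0, T=50):
    """Truncated draw of G ~ DP(nu, G_0): stick weights plus label-vector atoms."""
    v = rng.beta(1.0, nu, size=T)
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    atoms = np.stack([sample_label_vector(x, k) for _ in range(T)])
    return w, atoms

def draw_curve_labels(w, atoms, n):
    """Draw label vectors for n curves; ties across curves induce clustering."""
    idx = rng.choice(len(w), size=n, p=w / w.sum())  # renormalize truncated sticks
    return atoms[idx]

x = np.linspace(0, 1, 25)
w, atoms = stick_breaking_labels(x, k=3, nu=1.0)
Z = draw_curve_labels(w, atoms, n=10)   # shared atoms => clustered curves
```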

Properties
1. Properties of the random measure $G$.
2. Properties of the label process $Z$.
Assume $\eta$ is a mean-zero, isotropic Gaussian process with covariance function $\sigma^2 \rho(\lVert x - x' \rVert; \phi)$, e.g., $\sigma^2 \exp(-\phi \lVert x - x' \rVert)$.

Properties
Under the assumptions on $\eta$, the quantile threshold functions are constant with respect to $x$, and the sequence satisfies $q_l = l/k$.
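
A quick Monte Carlo check of this property: because the marginal cdf transform makes $U(x)$ uniform at every site, the constant thresholds $q_l = l/k$ give each of the $k$ labels probability $1/k$. The values of sigma and k below are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
sigma, k, n = 1.5, 4, 200_000
eta = rng.normal(0.0, sigma, size=n)          # marginal of a stationary GP
u = norm.cdf(eta / sigma)                     # Uniform(0, 1)
labels = np.digitize(u, np.arange(1, k) / k)  # thresholds q_l = l/k
print(np.bincount(labels) / n)                # each entry close to 1/k = 0.25
```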

Identifiability
1. A larger $k$ will lead to smoother learned canonical curves that are only weakly distinguishable, while a smaller $k$ will make the curves' posteriors cover different regions of the function space.
2. As $\phi$ gets close to 0, label switching is discouraged, yielding global clustering; if the curve realizations tend to switch labels often, the canonical curves become more weakly identified.
3. Similar locations tend to be (correctly) assigned the same labels, but it is possible that a whole segment is incorrectly labeled relative to some other segments. Strong constraints (an ordering of label values) can be imposed. Model identifiability cannot be ensured even with constraints, but the mixing of the posterior inference would be expected to improve.

Model fitting and inference
The joint distribution associated with the model parameters:
$$p(Y, Z, \theta^*) = \prod_{i=1}^n \prod_{j=1}^m N\big(Y_i(x_j);\, \theta^*_{Z_i(x_j)}(x_j),\, \tau^2\big)\; p(Z_1, \dots, Z_n \mid \nu, G_0)\; p(\theta^*_1, \dots, \theta^*_k).$$
For the canonical curves, the prior for the vector $\theta^*_l = (\theta^*_l(x_1), \dots, \theta^*_l(x_m))$ is normal with mean $\mu_l$ and covariance matrix $\Sigma_l$. The full conditional for $\theta^*$ still has a Gaussian form, but it has high dimension ($mk$) for a large data set. The inference for the label vectors relies on the Pólya urn sampling scheme, in terms of $\nu$ and $G_0$:
$$p(Z_i \mid Z_{-i}) \propto \nu\, G_0^{x_1, \dots, x_m}(Z_i) + \sum_{i' \neq i} \delta_{Z_{i'}}(Z_i).$$
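
Below is a hedged sketch of one such Pólya urn update for a single curve's label vector, following the generic DP-mixture recipe: the candidate set is the label vectors currently occupied by other curves plus a fresh draw from $G_0$ (sample_label_vector above). It simplifies the paper's sampler; in particular, it ignores the $G_0$ mass that the urn formula places on already-occupied atoms.

```python
import numpy as np

rng = np.random.default_rng(3)

def loglik(y_i, z, theta_star, tau=0.5):
    """Gaussian log-likelihood of curve y_i under label vector z."""
    mu = theta_star[z, np.arange(len(z))]
    return -0.5 * np.sum((y_i - mu) ** 2) / tau**2

def polya_urn_update(i, Z, Y, theta_star, x, k, nu=1.0):
    """Resample Z[i] given the other curves' labels and the data."""
    others = np.delete(Z, i, axis=0)
    cands, counts = np.unique(others, axis=0, return_counts=True)
    fresh = sample_label_vector(x, k)             # "new table" drawn from G_0
    cands = np.vstack([cands, fresh])
    prior = np.append(counts.astype(float), nu)   # urn weights: counts and nu
    logw = np.log(prior) + np.array([loglik(Y[i], z, theta_star) for z in cands])
    w = np.exp(logw - logw.max())                 # stabilized softmax weights
    return cands[rng.choice(len(w), p=w / w.sum())]
```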

Applications
1. Synthetic Data
Specify a set of locations for model fitting while leaving another 20 locations out for validation purposes. The label vectors $Z_i$ for $i = 1, \dots, n$ are drawn iid from $G$ at those locations, where the canonical curves $\theta^*_l$ are pre-specified. The data collection is obtained by mixing $\theta^*_{Z_i}$ with an independent error process drawn from a mean-zero Gaussian.
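
An end-to-end sketch of this recipe, reusing the helpers from the earlier blocks. The canonical curves, noise level, curve count, and the 30 fitting sites are illustrative assumptions; only the 20 held-out validation locations are stated above.

```python
import numpy as np

rng = np.random.default_rng(4)

x = np.linspace(0, 1, 50)                  # 30 fitting + 20 validation sites (assumed split)
theta_star = np.vstack([np.sin(2 * np.pi * x),
                        np.cos(2 * np.pi * x),
                        0.5 * np.ones_like(x)])        # assumed k = 3 canonical curves
w, atoms = stick_breaking_labels(x, k=3, nu=1.0)       # G ~ DP(nu, G_0), truncated
Z = draw_curve_labels(w, atoms, n=40)                  # iid label vectors from G
latent = theta_star[Z, np.arange(len(x))]              # theta_i(x) for each curve
Y = latent + rng.normal(0.0, 0.3, size=latent.shape)   # add independent Gaussian error

fit, val = np.split(np.arange(len(x)), [30])           # hold out 20 sites
Y_fit, Y_val = Y[:, fit], Y[:, val]
```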

Applications
2. Progesterone modeling
The data record the natural logarithm of a progesterone metabolite during a monthly cycle for 51 female subjects. Each cycle ranges from -8 to 15 (8 days pre-ovulation to 15 days post-ovulation). There are 88 cycles in total; the first 66 cycles belong to the non-contraceptive group, and the remaining 22 cycles belong to the contraceptive group. We also consider a modified data set in which the curves of the contraceptive group are shifted down by 2. We focus our analysis on the case $k = 2$.

Applications
3. Image modeling
80 color images, each of a common size. Each image is represented by a surface realization $Y_i$, where $Y_i(x)$ is the color intensity at location $x$; $Y_i(x)$ represents the RGB color intensity. We introduce a set of canonical species curves.

Conclusions
The Dirichlet labeling process provides a highly flexible prior for modeling collections of functions. The inter-relationships between the model parameters are complex with regard to process behavior. MCMC inference is shown to mix quickly and yields good results. The model is demonstrated on two real applications.