Model Selection in Parameterizing Cell Images and Populations

Presentation transcript:

Model Selection in Parameterizing Cell Images and Populations MMBIOS, April 2015 Gregory R. Johnson

CellOrganizer. Diagram: cell images -> training -> model parameters -> synthesis -> synthetic images. Model components shown: cell shape, nuclear shape, microtubule distribution, object number, object appearance, object positions, object position probability, object distribution. This slide illustrates the central concept of our work in generative modeling. We construct models of cell components learned from many cell instances and combine them into a statistical model. Sampling from that model yields new parameter values, which we use to synthesize new cell instances.

CellOrganizer Models Cell Populations. Learn how spatial relationships of cell compartments vary across cell populations. Generate high-quality in silico representations (i.e. images) of cell shape and of the relationships of the compartments within it. Diagram: images X1, …, Xn -> parameterizations p1, …, pn -> cell morphology distribution P(pi|Ɵ) -> sampled parameterizations p1*, …, pm* -> synthesized images x1*, …, xm*. Mappings: f(x) = p; d({p1,…,pn}) = Ɵ; b(Ɵ) = p*; g(p) = x.

CellOrganizer Models Cell Populations. Represent cell morphology and the organization of components in an invertible, compact manner. Learn a distribution over these compact parameterizations.

Image To Parameterization. Represent cell morphology in a compact set of parameters. We also want the mapping to be invertible, so that the original image can be recovered: f(xi) = pi ⟺ g(pi) = xi. Diagram: image xi maps to the parameter vector pi = [pi,1, pi,2, pi,3], with one component each for the cell, the nucleus and the protein pattern.

Image parameterization is lossy. Figure panels: LAMP2 protein pattern; Gaussian fit with full covariance matrix; Gaussian fit with spherical covariance matrix; GMM parameters. Meeting notes (4/20/15 14:19): A compact parameterization can be lossy. Add the GMM parameters to the f(x) = p_i line, or pick K based on AIC or BIC. Represent the mixture from its parameters. Image parameterizations vs. number of parameters. Fitting becomes a likelihood maximization problem if K is known.
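The note about picking a model by AIC or BIC can be illustrated with a small sketch. This is not CellOrganizer code: the helper names and the synthetic, correlated 2D points are assumptions made for illustration, and only the single-component case (K = 1) is fitted, because there the maximum-likelihood Gaussian has a closed form. The sketch compares a full-covariance fit with a spherical-covariance fit by parameter count and by BIC.

```python
import math
import random

def gmm_param_count(k, d, covariance="full"):
    """Free parameters of a K-component GMM in d dimensions."""
    weights = k - 1                      # mixing weights sum to 1
    means = k * d
    if covariance == "full":
        cov = k * d * (d + 1) // 2       # symmetric d x d matrix per component
    elif covariance == "spherical":
        cov = k                          # one variance per component
    else:
        raise ValueError(covariance)
    return weights + means + cov

def aic(log_lik, n_params):
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_samples):
    return n_params * math.log(n_samples) - 2 * log_lik

def gaussian_loglik_2d(points, spherical):
    """Maximized log-likelihood of one 2D Gaussian (K = 1) on the points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    if spherical:
        var = (sxx + syy) / 2.0          # MLE of the shared variance
        det = var * var
    else:
        det = sxx * syy - sxy * sxy      # determinant of the MLE covariance
    # At the MLE the quadratic terms sum to n * d, with d = 2 here.
    return -0.5 * n * (2 * math.log(2 * math.pi) + math.log(det) + 2)

# Synthetic, strongly correlated points (illustrative data, not cell images).
rng = random.Random(0)
points = []
for _ in range(500):
    x = rng.gauss(0.0, 1.0)
    points.append((x, 0.9 * x + 0.3 * rng.gauss(0.0, 1.0)))

ll_full = gaussian_loglik_2d(points, spherical=False)
ll_sph = gaussian_loglik_2d(points, spherical=True)
bic_full = bic(ll_full, gmm_param_count(1, 2, "full"), len(points))
bic_sph = bic(ll_sph, gmm_param_count(1, 2, "spherical"), len(points))
print("BIC full:", bic_full, "BIC spherical:", bic_sph)  # lower is better
```

For K > 1 the same scoring applies, but the log-likelihood would have to come from an iterative fit such as EM rather than from the closed form used here.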

Shape Space Modeling Pipeline. Figure with panels a. to d., including an MDS step; values shown: 0.85, 0.63, 0.74, 0.90.

Image parameterization is lossy (contd.). Figure labels: images x1, x2, x3, x4 and reconstructions g(p1), g(p2), g(p3), g(p4). Fig 2 from T. Peng et al., “Instance-based generative biological shape modeling”, 2009. Meeting notes (4/20/15 14:19): By whatever criterion you choose the model, it may be imperfect.

Multidimensional Scaling. Quantities defined on the slide: the measured distance Di,j between shapes i, j; the Euclidean embeddings for all shapes; the Euclidean distance between the embedding coordinates for shapes i, j; and an indicator for whether Di,j is observed.
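A minimal sketch of how these four quantities fit together, under assumptions: it minimizes the standard weighted stress, the sum over i < j of W[i][j] * (D[i][j] - ||x_i - x_j||)^2, where W plays the role of the observed-distance indicator. The names X and W, the toy unit-square distances and the plain gradient-descent optimizer are illustrative choices, not necessarily what the talk used.

```python
import math
import random

def stress(X, D, W):
    """Weighted stress over observed pairs i < j."""
    total = 0.0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if W[i][j]:
                total += (D[i][j] - math.dist(X[i], X[j])) ** 2
    return total

def mds(D, W, dim, steps=2000, lr=0.01, seed=0):
    """Embed n shapes in `dim` dimensions by gradient descent on the stress."""
    rng = random.Random(seed)
    n = len(D)
    X = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                if not W[i][j]:
                    continue              # unobserved distance: no constraint
                d = math.dist(X[i], X[j])
                if d == 0.0:
                    continue              # gradient undefined at coincident points
                coef = -2.0 * (D[i][j] - d) / d
                for k in range(dim):
                    g = coef * (X[i][k] - X[j][k])
                    grad[i][k] += g
                    grad[j][k] -= g
        for i in range(n):
            for k in range(dim):
                X[i][k] -= lr * grad[i][k]
    return X

# Toy example: corners of a unit square, with one diagonal distance unobserved.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
D = [[math.dist(a, b) for b in corners] for a in corners]
W = [[1] * 4 for _ in range(4)]
W[0][2] = W[2][0] = 0

X_init = mds(D, W, dim=2, steps=0)   # same seed, so this is the starting point
X_fit = mds(D, W, dim=2)
print("stress before:", stress(X_init, D, W), "after:", stress(X_fit, D, W))
```

Setting more entries of W to zero, or lowering `dim`, gives a quick way to see how reconstruction depends on the number of observed distances and on the embedding dimensionality.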

Shape space dimensionality vs. reconstruction. Reconstruction depends on the number of observed distances and on the dimensionality of the embedding. Plot legend: blue = 1-dimensional embedding; red = “complete” embedding.

Prediction of cell and nuclear dependency

The “goodness” of a cell parameterization. There are many ways to measure this: pixel-to-pixel mean squared error; the Sørensen-Dice coefficient for binary images and shapes; a likelihood function…
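Two of these measures are simple enough to sketch directly. The function names and the tiny 2 x 3 binary masks below are illustrative assumptions; images are represented as nested lists so the example needs no libraries.

```python
def mean_squared_error(a, b):
    """Pixel-to-pixel mean squared error between two equally sized images."""
    pa = [v for row in a for v in row]
    pb = [v for row in b for v in row]
    if len(pa) != len(pb):
        raise ValueError("images must have the same number of pixels")
    return sum((x - y) ** 2 for x, y in zip(pa, pb)) / len(pa)

def dice_coefficient(a, b):
    """Sorensen-Dice coefficient 2|A & B| / (|A| + |B|) for binary images."""
    pa = [bool(v) for row in a for v in row]
    pb = [bool(v) for row in b for v in row]
    if len(pa) != len(pb):
        raise ValueError("images must have the same number of pixels")
    intersection = sum(x and y for x, y in zip(pa, pb))
    size = sum(pa) + sum(pb)
    return 1.0 if size == 0 else 2.0 * intersection / size

# x plays the role of an original mask, x_rec of its reconstruction g(f(x)).
x = [[1, 1, 0],
     [0, 1, 0]]
x_rec = [[1, 0, 0],
         [0, 1, 1]]
print(mean_squared_error(x, x_rec), dice_coefficient(x, x_rec))
```

Mean squared error applies to any intensity image, while the Dice coefficient only looks at overlap and is therefore restricted to binary masks.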

Parameters to distribution. Diagram: parameterizations …, pn -> distribution P(pi|Ɵ) -> sampled parameterization p*. Mappings: d({p1,…,pn}) = Ɵ; b(p|Ɵ) = p*.

Parameters to distribution (contd.). “Straightforward” distribution learning and model selection. Some parameterizations may overfit (e.g. a point mass). Many models cannot be learned via closed-form solutions. Predictive maximum likelihood: select the model by the likelihood of held-out data, where n is the number of hold-outs, xn is some hold-out subset and Ɵn is the corresponding trained model.
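The predictive-likelihood idea can be sketched as follows. The exact formula from the slide is not reproduced here; this sketch assumes the score is the mean held-out log-likelihood over folds, with each fold's model trained on the remaining data. The two candidate models (exponential and Gaussian) and the synthetic data are assumptions chosen only because both have closed-form maximum-likelihood fits.

```python
import math
import random

def fit_exponential(xs):
    return {"rate": len(xs) / sum(xs)}

def logpdf_exponential(x, theta):
    return math.log(theta["rate"]) - theta["rate"] * x

def fit_gaussian(xs):
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return {"mean": mean, "var": var}

def logpdf_gaussian(x, theta):
    return -0.5 * (math.log(2 * math.pi * theta["var"])
                   + (x - theta["mean"]) ** 2 / theta["var"])

def predictive_loglik(xs, fit, logpdf, n_folds=5):
    """Mean held-out log-likelihood per point over n_folds folds."""
    total = 0.0
    for k in range(n_folds):
        held_out = xs[k::n_folds]                        # hold-out subset
        train = [x for i, x in enumerate(xs) if i % n_folds != k]
        theta_k = fit(train)                             # model trained without it
        total += sum(logpdf(x, theta_k) for x in held_out)
    return total / len(xs)

# Synthetic data drawn from an exponential distribution (illustrative only).
rng = random.Random(0)
data = [rng.expovariate(1.0) for _ in range(200)]

score_exp = predictive_loglik(data, fit_exponential, logpdf_exponential)
score_gauss = predictive_loglik(data, fit_gaussian, logpdf_gaussian)
print("exponential:", score_exp, "gaussian:", score_gauss)  # higher is better
```

Because every point is scored by a model that never saw it, a model that merely memorizes its training data (the point-mass case) is penalized rather than rewarded.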

Distributions of object position. Figure panels: HIP1, ACBD5, SEC23B.

Possible Models. (1) Puncta depend on organelles but are independent of each other: Poisson process. (2) Puncta depend on organelles and on each other: Fiksel point process.
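As a rough illustration of the first model only, the sketch below evaluates the log-likelihood of an inhomogeneous Poisson process whose intensity depends on distance to an organelle. Everything specific is an assumption: the domain is reduced to a 1D interval of distances, the intensity is taken to be log-linear, lambda(d) = exp(beta0 + beta1 * d), and the puncta distances are made-up toy values. The Fiksel interaction model is not implemented here.

```python
import math

def intensity_integral(beta0, beta1, length):
    """Integral of exp(beta0 + beta1 * d) for d in [0, length]."""
    if beta1 == 0.0:
        return math.exp(beta0) * length
    return math.exp(beta0) * (math.exp(beta1 * length) - 1.0) / beta1

def poisson_loglik(distances, beta0, beta1, length):
    """Poisson process log-likelihood: sum of log lambda(d_i) minus integral of lambda."""
    log_terms = sum(beta0 + beta1 * d for d in distances)
    return log_terms - intensity_integral(beta0, beta1, length)

def best_beta0(n_points, beta1, length):
    """For fixed beta1, the beta0 that makes the expected count equal n_points."""
    return math.log(n_points / intensity_integral(0.0, beta1, length))

# Toy puncta distances from the nearest organelle, clustered near zero.
distances = [0.05, 0.1, 0.1, 0.2]
length = 1.0

ll_uniform = poisson_loglik(distances, best_beta0(4, 0.0, length), 0.0, length)
ll_near = poisson_loglik(distances, best_beta0(4, -3.0, length), -3.0, length)
print("uniform:", ll_uniform, "organelle-dependent:", ll_near)
```

A negative beta1 concentrates intensity close to the organelle, so the distance-dependent model scores higher on these toy points, while the puncta remain independent of each other in both cases.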

Five-fold cross-validation to choose the best model. The model with no puncta-puncta spatial interaction has the greater likelihood!

Toward Spatial Network Models. Colocalization is a complex network with interdependencies. Simplify it by using one-direction dependencies (network -> DAG). Fig 1. Representative image of a segmented Arabidopsis plant protoplast, a spatial network exhibiting negative colocalization. a) False-colored image with green indicating the autofluorescent chloroplast channel and red indicating the endoplasmic reticulum (ER). b) Autofluorescent chloroplast channel. c) ER channel. Notice the high degree of negative colocalization. Fig 2. DAG of the spatial interaction network, a diagram of a simplified spatial interaction network; N is the number of protein patterns. Node labels: dprot, dcell, dnuc, pprot, nprot, sprot, iprot; plate: Protein, N.

Pattern Modeling contd.: Generative Models. Add parameters to account for the spatial dependency of arbitrary numbers of protein patterns. Without pattern-pattern dependency: P(Chloroplast | Cell) P(ER | Cell). With dependency: P(Chloroplast | Cell) P(ER | Cell, Chloroplast). Figure: 3D rendering of a protoplast.
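One way to read the dependent factorization is as a DAG in which each pattern is conditioned on its parents, so that patterns can be synthesized parents-first. The sketch below encodes only that structure; the dictionary layout and function names are assumptions, and no conditional distributions are actually fitted or sampled.

```python
from graphlib import TopologicalSorter

# Each pattern maps to the set of patterns it is conditioned on (its parents).
parents = {
    "Cell": set(),
    "Chloroplast": {"Cell"},
    "ER": {"Cell", "Chloroplast"},
}

def sampling_order(parents):
    """Order in which patterns can be synthesized: parents before children."""
    return list(TopologicalSorter(parents).static_order())

def factorization(parents):
    """Joint distribution written as a product of conditionals."""
    terms = []
    for node in sampling_order(parents):
        given = sorted(parents[node])
        terms.append(f"P({node} | {', '.join(given)})" if given else f"P({node})")
    return " ".join(terms)

print(sampling_order(parents))
print(factorization(parents))
```

If two patterns were declared as depending on each other, `static_order()` would raise `graphlib.CycleError`, which is the practical reason for reducing the colocalization network to one-direction dependencies first.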

Big Picture… We want the most precise cell parameterization, f(x) = p, g(p) = x, and the best-generalizing distribution, d({p1,…,pn}) = Ɵ.

Master Modeling Function. How do we build a master model-selection model? We want g(pi) with the least error between xi and g(pi), and d({p1,…,pn}) = Ɵ with the greatest likelihood. Even if errtot is some sort of probabilistic model, it is not clear how to balance errtot against the likelihood of the model, especially because g(x) drastically changes the values of Ɵ. Spatial relationship model.