Modeling and Segmentation of Dynamic Textures Atiyeh Ghoreyshi, Avinash Ravichandran, René Vidal Center for Imaging Science Institute for Computational Medicine Johns Hopkins University
Modeling dynamic textures Extract a set of features from the video sequence: spatial filters, ICA/PCA, wavelets, or the intensities of all pixels. Model the spatiotemporal evolution of the features as the output of a linear dynamical system (LDS): Soatto et al. '01
Learning the model parameters The model is an LDS driven by IID white Gaussian noise. Identification is a bilinear problem, so one can use EM. Optimal solution: subspace identification (De Moor). Suboptimal solution in the absence of noise (Soatto et al. '01): compute C and z(t) from the SVD of the images; given z(t), solving for A is a linear problem
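The suboptimal SVD-based solution can be sketched in a few lines of NumPy (a minimal illustration on made-up synthetic data; the state dimension n, frame size, and synthetic model are arbitrary choices for the example, and no noise model is fitted):

```python
import numpy as np

def learn_dt_model(Y, n):
    """Suboptimal LDS identification in the absence of noise (SVD-based).
    Y: (num_pixels, num_frames) matrix, one vectorized frame per column.
    n: state dimension. Returns (A, C, Z)."""
    U, S, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n]                      # appearance: orthonormal basis of the image subspace
    Z = np.diag(S[:n]) @ Vt[:n, :]    # states z(1), ..., z(F)
    # Given z(t), A is the least-squares solution of z(t+1) = A z(t)
    A = Z[:, 1:] @ np.linalg.pinv(Z[:, :-1])
    return A, C, Z

# Toy usage on synthetic data generated from a known stable LDS
rng = np.random.default_rng(0)
n, p, F = 3, 50, 60
A_true = 0.9 * np.linalg.qr(rng.standard_normal((n, n)))[0]  # scaled rotation: stable
C_true = rng.standard_normal((p, n))
z = rng.standard_normal(n)
Y = np.empty((p, F))
for t in range(F):
    Y[:, t] = C_true @ z
    z = A_true @ z
A, C, Z = learn_dt_model(Y, n)
print(np.linalg.norm(Y - C @ Z))  # near-zero reconstruction error
```

With noisy data one would use subspace identification instead, but on noiseless rank-n data this factorization reconstructs the sequence exactly and recovers A up to a change of basis, so its eigenvalues match the true dynamics.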
Synthesizing novel sequences Once a model of a dynamic texture has been learned, one can use it to synthesize novel sequences: Schödl et al. '00, Soatto et al. '01, Doretto et al. '03, Yuan et al. '04
Classifying/recognizing novel sequences Given videos of several classes of dynamic textures, one can use their models to classify new sequences (Saisan et al. '01) Identify dynamical models for all sequences in the training set Identify a dynamical model for each novel sequence Assign the novel sequence to the class of its nearest neighbor This requires a distance between dynamical models Martin distance (Martin '00) Subspace angles (De Cock '02) Binet-Cauchy kernels (Vishwanathan-Smola-Vidal '05)
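As an illustration of such distances, a finite-horizon approximation of the Martin distance can be computed from the principal angles between extended observability subspaces (a sketch: the horizon m, the toy models, and the function name are choices made for this example; the actual definition uses infinite-horizon observability matrices):

```python
import numpy as np

# Martin distance between two LDS models (A1, C1) and (A2, C2):
# d^2 = -ln prod_i cos^2(theta_i), where theta_i are the subspace angles
# between the extended observability matrices O = [C; CA; CA^2; ...].
def martin_distance(A1, C1, A2, C2, m=50):
    def obs(A, C):
        blocks, M = [], C
        for _ in range(m):                      # finite-horizon approximation
            blocks.append(M)
            M = M @ A
        return np.linalg.qr(np.vstack(blocks))[0]   # orthonormal basis
    Q1, Q2 = obs(A1, C1), obs(A2, C2)
    cos_th = np.linalg.svd(Q1.T @ Q2, compute_uv=False)  # cosines of principal angles
    cos_th = np.clip(cos_th, 1e-12, 1.0)
    return -2.0 * np.sum(np.log(cos_th))

rng = np.random.default_rng(4)
A = 0.8 * np.linalg.qr(rng.standard_normal((3, 3)))[0]
C = rng.standard_normal((20, 3))
print(martin_distance(A, C, A, C))   # ~0 for identical systems
B = 0.5 * np.eye(3)
print(martin_distance(A, C, B, C))   # > 0 for different dynamics
```

Nearest-neighbor classification then simply assigns a novel sequence to the training model with the smallest such distance.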
Talk outline Can we recover the rigid motion of a camera observing a dynamic texture? Dynamic Texture Constancy Constraint Can we segment a scene containing multiple dynamic textures? Algebraic method Dynamical models (ARMA) Subspace clustering: GPCA Variational method Ising texture descriptors Dynamical model (AR) Level sets Dynamical shape priors
Part I Computing Optical Flow from Dynamic Textures
Optical flow of a rigid scene Brightness constancy constraint (BCC) Differential BCC Computing optical flow
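A minimal least-squares sketch of the differential BCC, Ix·u + Iy·v + It = 0 (assuming one global translational flow for the whole image, Lucas-Kanade style; the synthetic Gaussian-blob test is made up for the example):

```python
import numpy as np

def flow_from_bcc(I1, I2):
    """Solve Ix*u + Iy*v + It = 0 in the least-squares sense for one global (u, v)."""
    Iy, Ix = np.gradient(I1)          # np.gradient returns derivatives along (rows, cols)
    It = I2 - I1
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic test: a Gaussian blob translated by (0.5, 0) pixels
x, y = np.meshgrid(np.arange(64, dtype=float), np.arange(64, dtype=float))
g = lambda cx, cy: np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / 50.0)
I1, I2 = g(32.0, 32.0), g(32.5, 32.0)   # blob moves right by half a pixel
u, v = flow_from_bcc(I1, I2)
print(u, v)  # u close to 0.5, v close to 0
```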
Optical flow of a dynamic texture A time invariant model cannot capture camera motion A rigid scene is a 1st order LDS: A = 1, z(t) = 1, y(t) = C = constant Camera motion can be modeled with a time varying LDS: A models nonrigid motion, C(t) models rigid camera motion Identification of a time varying LDS is a difficult open problem Approximate solution: assume a time invariant LDS within a temporal window, and estimate (A, C(t)) by shifting the window
Optical flow of a dynamic texture Static textures: optical flow from brightness constancy constraint (BCC) Dynamic textures: optical flow from dynamic texture constancy constraint (DTCC)
Optical flow results for synthetic sequences (panels: Right, Left, Up, Down)
Optical flow of flowerbed sequence
Optical flow of rock rain sequence
Part II Segmenting Dynamic Textures Atiyeh Ghoreyshi
Segmenting non-moving dynamic textures One dynamic texture lives in the observability subspace Multiple textures live in multiple subspaces Dynamic texture segmentation is a subspace clustering problem (example: water vs. steam)
Generalized Principal Component Analysis Given points on multiple subspaces, identify The number of subspaces and their dimensions A basis for each subspace The segmentation of the data points A union of n subspaces = zero set of polynomials of degree n Coefficients of pn(x) computed linearly Normal to the subspaces computed from the gradient of the polynomial
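A toy instance of this pipeline for two lines in R^2 (degree n = 2), with all three steps, linear computation of the coefficients, normals from the gradient, and segmentation, in a few lines (the data and the grouping threshold are made up for illustration):

```python
import numpy as np

# Toy GPCA: cluster points lying on two lines through the origin in R^2.
# A union of 2 lines is the zero set of a degree-2 polynomial p2(x) = c^T v2(x),
# where v2(x) = (x1^2, x1*x2, x2^2) is the degree-2 Veronese map.
rng = np.random.default_rng(1)
t = rng.uniform(1, 2, 40)
X = np.vstack([np.c_[t[:20], np.zeros(20)],      # line 1: x2 = 0, normal (0, 1)
               np.c_[np.zeros(20), t[20:]]])     # line 2: x1 = 0, normal (1, 0)

# Coefficients c are computed linearly: null vector of the embedded data matrix
V = np.c_[X[:, 0] ** 2, X[:, 0] * X[:, 1], X[:, 1] ** 2]
c = np.linalg.svd(V)[2][-1]                      # smallest right singular vector

# Normals to the subspaces come from the gradient of the polynomial at each point
grads = np.c_[2 * c[0] * X[:, 0] + c[1] * X[:, 1],
              c[1] * X[:, 0] + 2 * c[2] * X[:, 1]]
grads /= np.linalg.norm(grads, axis=1, keepdims=True)

# Segment: points whose normals are (anti)parallel belong to the same line
labels = (np.abs(grads @ grads[0]) > 0.5).astype(int)
print(labels)  # first 20 points in one group, last 20 in the other
```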
Segmenting the water-steam sequence Multiple textures live in multiple subspaces Cluster the data using GPCA
Segmenting a cardiac-MR sequence Goal: recognize multiple types of arrhythmias using heart MRI images Model: visual dynamics are modeled as multiple dynamic textures Heart motion: nonrigid Chest motion: respiration
Overview and remaining problems We have seen so far that We can model moving dynamic textures using time varying linear dynamical models We can estimate optical flow of moving dynamic textures using DTCC We can segment dynamic textures using GPCA Problems Identification of time varying linear dynamical models, with C(t) evolving due to perspective projection of a rigid-body motion Spatial coherence of the segmentation result is not taken into account
Dynamic texture segmentation How can we incorporate spatial coherence? Level set methods Represent contour C as the zero level set of an implicit function φ, i.e., C = {(x, y) : φ(x, y) = 0} Advantages Spatial coherence is controllable Do not depend on a specific parameterization of the contour Allow topological changes of the contour during evolution
Level set intensity-based segmentation Chan-Vese energy functional Implicit methods Represent C as the zero level set of an implicit function φ, i.e. C = {(x, y) : φ(x, y) = 0} Solution Evolve φ by gradient descent on the energy; c1 and c2 are the mean intensities inside and outside the contour C
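A minimal sketch of this gradient descent (two simplifications are assumed for brevity: the length/curvature term is omitted, and the smeared delta function is replaced by 1 so the update acts on all pixels; this is not the full Chan-Vese model):

```python
import numpy as np

def chan_vese(I, phi, iters=200, dt=0.5):
    """Evolve phi by gradient descent on the Chan-Vese data terms only."""
    for _ in range(iters):
        inside, outside = phi > 0, phi <= 0
        c1 = I[inside].mean() if inside.any() else 0.0    # mean intensity inside C
        c2 = I[outside].mean() if outside.any() else 0.0  # mean intensity outside C
        phi = phi + dt * ((I - c2) ** 2 - (I - c1) ** 2)  # data-driven update
    return phi

# Synthetic image: bright square on a dark background, circle initialization
I = np.zeros((64, 64)); I[20:44, 20:44] = 1.0
yy, xx = np.mgrid[:64, :64]
phi0 = 20.0 - np.sqrt((xx - 32.0) ** 2 + (yy - 32.0) ** 2)  # signed distance to a circle
phi = chan_vese(I, phi0)
seg = phi > 0
print(seg.mean())  # fraction of pixels inside the final contour
```

On this two-intensity image the evolution converges to the bright square, since each pixel is pushed toward the region whose mean intensity it matches better.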
Prior work on dynamic texture segmentation Minimize a cost functional (Doretto et al. '03) in a level set framework, where si is a descriptor modeling the average behavior of the ith region; the discrepancy measure among regions is based on subspace angles between dynamical models Disadvantages The energy does not directly depend on the data The fitting term depends on metrics on dynamical systems Dynamical models are computed locally
Our DT segmentation approach (WDV’06) A new algorithm for the segmentation of dynamic textures Model spatial statistics of each frame with a set of Ising descriptors. Model the dynamics of these descriptors with AR models. Cast the segmentation problem in a variational framework that minimizes a spatial-temporal extension of the Chan-Vese energy. Key advantages of our approach The identification of the parameters of an AR model can be done in closed form by solving a simple linear system. Using AR models of very low order gives good segmentation results, hence we can deal with moving boundaries more efficiently. It naturally handles intensity and texture based segmentation as well as motion based video segmentation as particular cases.
Dynamics & intensity-based energy We represent the intensities of the pixels in the images as the output of a mixture of AR models of order p We propose a spatial-temporal extension of the Chan-Vese energy functional in which the fitting terms measure how well the AR model of each region predicts the pixel intensities
Dynamics & intensity-based segmentation Given the AR parameters, we can solve for the implicit function φ by solving the PDE Given the implicit function φ, we can solve for the AR parameters of the jth region by solving the linear system
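The linear system for the AR parameters of one region can be sketched as an ordinary least-squares problem (a minimal illustration; the function and variable names are made up, and every pixel in the region shares the same scalar AR(p) model as on the slide):

```python
import numpy as np

def fit_region_ar(V, mask, p=2):
    """Least-squares AR(p) fit for one region.
    V: (F, H, W) video; mask: boolean region, e.g. taken from the sign of phi.
    Stacks I(x,y,f) = sum_i a_i * I(x,y,f-i) over all region pixels and frames."""
    F = V.shape[0]
    pix = V[:, mask]                                   # (F, num_region_pixels)
    X = np.stack([pix[p - 1 - i:F - 1 - i].ravel()     # lag i+1 regressors
                  for i in range(p)], axis=1)
    y = pix[p:].ravel()
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

# Synthetic check: every pixel driven by the same known AR(2) model
rng = np.random.default_rng(2)
a_true = np.array([0.5, -0.3])
F, H, W = 100, 8, 8
V = 0.1 * rng.standard_normal((F, H, W))
for f in range(2, F):
    V[f] += a_true[0] * V[f - 1] + a_true[1] * V[f - 2]
a_hat = fit_region_ar(V, np.ones((H, W), dtype=bool))
print(a_hat)  # close to [0.5, -0.3]
```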
Dynamics & texture-based segmentation We use the Ising second order model to represent the static texture at pixel (x,y) and frame f Consider the set of all cliques as shown below in a neighborhood W of size w×w around the pixel: For each clique, one defines the function The ith entry of the texture descriptor is defined as
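An illustrative descriptor in this spirit (since the exact clique function is not reproduced on this slide, the binarization by the window median and the spin-agreement statistic below are assumptions made for the example):

```python
import numpy as np

def ising_descriptor(I, x, y, w=9):
    """5D texture descriptor in the spirit of the Ising second-order model:
    binarize a w x w window around (x, y) into spins, then measure pairwise
    spin agreement along the four clique orientations."""
    r = w // 2
    win = I[y - r:y + r + 1, x - r:x + r + 1]
    s = np.where(win > np.median(win), 1.0, -1.0)          # Ising spins
    pairs = [s[:, :-1] * s[:, 1:],                         # horizontal cliques
             s[:-1, :] * s[1:, :],                         # vertical cliques
             s[:-1, :-1] * s[1:, 1:],                      # diagonal cliques (\)
             s[:-1, 1:] * s[1:, :-1]]                      # diagonal cliques (/)
    return np.array([win.mean()] + [p.mean() for p in pairs])

# Horizontal stripes agree along rows but alternate across them
yy, xx = np.mgrid[:32, :32]
stripes = ((yy // 2) % 2).astype(float)
d = ising_descriptor(stripes, 16, 16)
print(d)  # entry 1 (horizontal agreement) exceeds entry 2 (vertical agreement)
```

Two textures with the same intensity histogram but different spatial structure map to different descriptors, which is what makes the static part of each frame separable.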
Dynamics & texture-based segmentation Same approach as before, but instead of intensities, use the 5D feature vectors We model the temporal evolution of the texture descriptors using a mixture of vector-valued AR models We segment the scene by minimizing the corresponding energy functional
Dynamics & texture-based segmentation Alternating minimization method Given the AR parameters, solve for the implicit function φ by solving the PDE. Given the implicit function φ, solve for the AR parameters of the jth region by solving the linear system. Implementation Assume that the AR parameter matrices are diagonal. Entries of the feature vector are outputs of five decoupled scalar AR models. Can estimate the AR parameters independently for each entry of the texture descriptors, as we did before for the intensities.
Experimental results Fixed boundary segmentation results and comparison: original sequence, ARMA + GPCA (Vidal CVPR'05), ARMA + subspace angles + level sets (Doretto ICCV'03), and our method
Experimental results Fixed boundary segmentation results and comparison Ocean-smoke Ocean-dynamics Ocean-appearance
Experimental results Moving boundary segmentation results and comparison Ocean-fire
Experimental results Results on a real sequence Raccoon on River
Experimental Results Segmentation of the epicardium in one axial slice of a beating heart, initialized with GPCA
Conclusions A new algorithm for the segmentation of dynamic textures Model spatial statistics of each frame with a set of Ising descriptors Model the dynamics of these descriptors with ARX models Cast the segmentation problem in a variational framework that minimizes a spatial-temporal extension of the Chan-Vese energy Key advantages of our approach The identification of the parameters of an ARX model can be done in closed form by solving a simple linear system Using ARX models of very low order gives good segmentation results, hence we can deal with moving boundaries more efficiently It naturally handles intensity and texture based segmentation as well as motion based video segmentation as particular cases
Overview and remaining problems So far, we have Modeled moving and non-moving dynamic textures Estimated optical flow of moving dynamic textures Segmented dynamic textures We have yet to Improve the accuracy of the results for more sensitive settings such as medical image segmentation. Improve the robustness to Initialization Noise Low resolution data
Segmentation using priors One solution is to incorporate prior knowledge about the attributes of the region of interest Previous work Priors on shape: Tsai et al. '03, Kaus et al. '03, Pluempitiwiriyawej et al. '05 Model-based segmentation: Horkaew and Yang '03, Biechel et al. '05 Priors on shape and intensity: Rousson and Cremers '05
Dynamic texture segmentation using priors We propose a new algorithm for the segmentation of dynamic scenes using priors on shape, intensity, and AR parameters Represent manually segmented training images by their signed distance functions. Registration within the training set, followed by PCA on the space of signed distance functions. Model the dynamics of the regions with AR models; build pdf’s of the AR parameters and intensity for each region using histograms. Maximize a log-likelihood function using level set methods.
Shape variability A two level approach: Align the training masks using similarity transformations Extract the remaining variability by applying PCA on the space of signed distance functions Registration: gradient descent on the energy functional with respect to translation, rotation, and scale PCA on the matrix whose columns are the vectorized signed distance functions of the training shapes
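The PCA step can be sketched as follows (a minimal illustration: the brute-force signed distance computation and the disc-shaped training masks are stand-ins for the registered cardiac shapes):

```python
import numpy as np

def signed_distance(mask):
    """Brute-force signed distance function: positive inside, negative outside.
    Fine for tiny images; real pipelines would use a fast distance transform."""
    pts = np.argwhere(np.ones_like(mask)).astype(float)   # all pixel coordinates
    inside = np.argwhere(mask).astype(float)
    outside = np.argwhere(~mask).astype(float)
    d_to_out = np.linalg.norm(pts[:, None] - outside[None], axis=2).min(axis=1)
    d_to_in = np.linalg.norm(pts[:, None] - inside[None], axis=2).min(axis=1)
    return np.where(mask.ravel(), d_to_out, -d_to_in)

# Training shapes: discs of varying radius (stand-ins for registered masks)
yy, xx = np.mgrid[:24, :24]
masks = [(xx - 12) ** 2 + (yy - 12) ** 2 < r ** 2 for r in (6, 7, 8, 9)]
Phi = np.stack([signed_distance(m) for m in masks], axis=1)  # one SDF per column
mean = Phi.mean(axis=1, keepdims=True)
U = np.linalg.svd(Phi - mean, full_matrices=False)[0]        # principal shape modes

# Any training shape is the mean plus a combination of the principal modes
coeffs = U.T @ (Phi[:, :1] - mean)
recon = mean + U @ coeffs
print(np.abs(recon - Phi[:, :1]).max())  # ~0: exact reconstruction within the span
```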
Intensity and AR parameters Exploit as many differences between the ROI and the background as possible Use pixel intensities and dynamics Use AR(p) models in temporal windows of size F, so that the intensity at every pixel satisfies an AR(p) relation over the window
Priors on intensity and AR parameters The least-squares solution of the previous equation, with a regularization term added to avoid singularities, yields the AR parameters Build histograms of mean intensity and AR parameters for the ROI and the background in training video sequences Use the histograms as pdf's of the parameters
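The regularized solution is ordinary ridge regression (a sketch; the regularization weight lam, the function name, and the synthetic data are made up for the example):

```python
import numpy as np

def ar_params_regularized(X, y, lam=1e-3):
    """Least-squares AR fit with Tikhonov regularization to avoid singularities:
    a = (X^T X + lam*I)^(-1) X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# With a tiny lam the solution matches ordinary least squares on well-posed data,
# but it remains defined even when X^T X is rank deficient.
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 2))
y = X @ np.array([0.5, -0.3]) + 0.01 * rng.standard_normal(200)
a = ar_params_regularized(X, y)
print(a)  # close to [0.5, -0.3]

X_sing = np.c_[X[:, :1], X[:, :1]]          # duplicated column: X^T X singular
a_sing = ar_params_regularized(X_sing, y)   # still well defined
```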
Segmentation using ML and level sets Find the parameters that maximize the log-likelihood function, which is equivalent to minimizing an energy EML, where φ and H(φ) are the current signed distance and Heaviside functions, and Pin and Pout are the joint pdf's of the intensity and AR parameters for the ROI and the background, respectively Minimize EML using gradient descent with respect to the pose parameters and the weights of the shape principal components
Experimental Results For our experiments, we use a dataset containing 7 video sequences corresponding to 7 axial slices of the heart. Each video sequence is composed of 30 frames of size 128×128 spanning one complete cardiac cycle. We use AR(2) models with temporal windows of size 5. Registration results: registered vs. unregistered mean training shape
Histograms of intensity and AR parameters We calculate the joint pdf of the variables as the product of their individual pdf's, assuming independence. (Histograms of a1, a2, and intensity; top: heart, bottom: background)
Segmentation Results First and seventh slices (green: initialization, red: final segmentation)
Comparison Statistics of Epicardium Segmentation Results with and without AR parameter priors
Dynamic textures with moving boundaries We use the fixed-boundary method on a sliding temporal window of frames: Given a user specified embedding φ0 representing an initial contour, apply the dynamic texture segmentation method to frames 1,…,F; this yields a new embedding φ1 Use φ1 as the initial embedding and apply the method to frames 2,…,F+1, yielding φ2 Repeat for all remaining frames of the sequence
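The sliding-window scheme can be sketched as a simple loop (here `segment_window` is a hypothetical stand-in that just thresholds the window mean, so only the propagation of the embedding φ is illustrated, not the actual segmentation method):

```python
import numpy as np

def segment_window(frames, phi_init):
    """Hypothetical stand-in for the fixed-boundary dynamic texture
    segmentation method: thresholds the window mean intensity."""
    m = frames.mean(axis=0)
    return np.where(m > m.mean(), 1.0, -1.0)   # new embedding, same sign convention

def segment_moving(video, phi0, F=5):
    """Slide a window of F frames; the embedding from step k initializes step k+1."""
    T = video.shape[0]
    phi, embeddings = phi0, []
    for start in range(T - F + 1):
        phi = segment_window(video[start:start + F], phi)
        embeddings.append(phi)
    return embeddings

# Toy video: the right half of the frame is bright in every frame
video = np.zeros((12, 16, 16)); video[:, :, 8:] = 1.0
phis = segment_moving(video, np.zeros((16, 16)))
print(len(phis))  # T - F + 1 embeddings
```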