Dynamic Background Learning through Deep Auto-encoder Networks. Pei Xu 1, Mao Ye 1, Xue Li 2, Qihe Liu 1, Yi Yang 2 and Jian Ding 3. 1. University of Electronic Science and Technology of China; 2. The University of Queensland; 3. Tencent Group.


Dynamic Background Learning through Deep Auto-encoder Networks. Pei Xu 1, Mao Ye 1, Xue Li 2, Qihe Liu 1, Yi Yang 2 and Jian Ding 3. 1. University of Electronic Science and Technology of China; 2. The University of Queensland; 3. Tencent Group. (Sorry for the no-show, caused by a visa delay.)

Previous Works on Dynamic Background Learning: Mixture of Gaussians [Wren et al. 2002], Hidden Markov Model [Rittscher et al. 2000], 1-SVM [Cheng et al. 2009], DCOLOR [Zhou et al. 2013].

Existing Problems: 1. Many previous works need clean background images (without foregrounds) to train the classifier. 2. To extract a clean background, some works impose assumptions on the background images (such as linear correlation).

Preliminaries about the Auto-encoder Network. In our work, we use the deep auto-encoder network proposed by Bengio et al. (2007) as the building block. 1. Encoding. In the encoding stage, the input data x is encoded by the function h1 = sigm(W1 x + b1), where W1 is a weight matrix, b1 is a hidden bias vector, and sigm(z) = 1/(1+exp(-z)) is the sigmoid function.

Then h1, taken as the input, is encoded by another function h2 = sigm(W2 h1 + b2), where W2 is a weight matrix and b2 is a bias vector.

2. Decoding. In the decoding stage, h2 is the input of the function h3 = sigm(W2^T h2 + b3), where b3 is a bias vector (the encoding weights are reused transposed).

Then the reconstructed output is computed by the decoding function x^ = sigm(W1^T h3 + b4), where b4 is a bias vector. The parameters (W1, W2 and b1, ..., b4) are learned by minimizing the cross-entropy function L(x, x^) = -sum_i [ x_i log x^_i + (1 - x_i) log(1 - x^_i) ].
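As a toy sketch of these encoding/decoding passes and the cross-entropy loss (tied weights in decoding are assumed here, and all numbers are purely illustrative):

```python
import math

def sigm(z):
    """Sigmoid activation: sigm(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(W, x, b):
    """Affine map followed by the sigmoid; W is a list of rows."""
    return [sigm(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def transpose(W):
    return [list(col) for col in zip(*W)]

def cross_entropy(x, x_hat, tiny=1e-12):
    """Reconstruction loss: -sum_i [x_i log xh_i + (1-x_i) log(1-xh_i)]."""
    return -sum(xi * math.log(xh + tiny) + (1 - xi) * math.log(1 - xh + tiny)
                for xi, xh in zip(x, x_hat))

# Toy 3->2 encoder with tied-weight decoding (illustrative values).
x  = [0.2, 0.9, 0.4]
W1 = [[0.1, -0.2, 0.3],
      [0.0,  0.4, -0.1]]
b1 = [0.0, 0.1]

h1    = layer(W1, x, b1)                      # encoding
x_hat = layer(transpose(W1), h1, [0.0] * 3)   # decoding with W1^T
loss  = cross_entropy(x, x_hat)
```

In a real network the weights and biases would of course be learned by gradient descent on this loss rather than fixed by hand.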

Proposed Method 1. Dynamic Background Modeling a. A deep auto-encoder network is used to extract background images from video frames. b. A separation function is defined to formulate the background images. c. Another deep auto-encoder network is used to learn the ‘clean’ dynamic background.

Inspired by the denoising auto-encoder (DAE) [Vincent et al. 2008], we view the dynamic background as 'clean' data and the foreground as 'noise'. A DAE, however, needs 'clean' data to which noise is added so that it can learn the distribution of the noise.

Unfortunately, in real-world applications such as traffic monitoring systems, clean background images cannot be obtained. But do we really need 'clean' data to train an auto-encoder network?

Firstly, we use a deep auto-encoder network (named the Background Extraction Network, BEN) to extract a background image from the input video frames. Its cost function consists of a cross-entropy term plus background items, where the vector B0 represents the extracted background image and ε is the tolerance value vector of B0.

Background Items: the first item forces the reconstructed frames to approach a background image B0; the regularization item controls the solution range of the tolerance ε. The basic observation of our work: in video sequences, each pixel belongs to the background most of the time.

Background Items: to be resilient to large variance tolerances, this item divides the approximation error at the ith pixel by the tolerance parameter ε_i. How do we train the parameters of the Background Extraction Network?

The cost function of the Background Extraction Network contains the parameters of the network (weights and biases) together with B0 and ε.
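A hypothetical sketch of how such a cost could be assembled for one frame; the trade-off weight `lam` and the exact combination of the terms are assumptions, since the slide's precise formula is not reproduced here:

```python
import math

def ben_cost(x, x_hat, B0, eps, lam=1.0):
    """Cross-entropy reconstruction term plus a background item that
    penalizes deviation of the reconstruction from B0, scaled per pixel
    by the tolerance eps (lam is an assumed trade-off weight)."""
    tiny = 1e-12
    ce = -sum(xi * math.log(xh + tiny) + (1 - xi) * math.log(1 - xh + tiny)
              for xi, xh in zip(x, x_hat))
    bg = sum(abs(xh - b) / e for xh, b, e in zip(x_hat, B0, eps))
    return ce + lam * bg

# A reconstruction far from B0 at a pixel is penalized by the background item.
cost = ben_cost(x=[0.2, 0.9], x_hat=[0.25, 0.8], B0=[0.2, 0.2], eps=[0.1, 0.1])
```

The background item creates the tension the slides describe: the network cannot simply reproduce every input frame, because reconstructions that stray from B0 beyond the tolerance become expensive.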

(1) The network weights and biases are updated by gradient descent with a learning rate, the gradient being obtained by back-propagation of the cost.

The second item contains an absolute value, so we adopt a sign function to roughly compute its derivative, where sign(a) = 1 if a > 0, sign(a) = 0 if a = 0, and sign(a) = -1 if a < 0.

(2) The update of B0 is an optimization problem. According to previous works on L1-norm optimization, the optimal B0 at the ith pixel is the median of the reconstructed values at that pixel across frames, for i = 1, ..., N.
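The per-pixel median update can be sketched directly (frame values below are illustrative):

```python
from statistics import median

def update_background(x_hat_frames):
    """For each pixel i, the value B0[i] minimizing
    sum_j |x_hat[j][i] - B0[i]| (an L1-norm problem) is the
    median of that pixel's values over the frames j."""
    n_pixels = len(x_hat_frames[0])
    return [median(frame[i] for frame in x_hat_frames)
            for i in range(n_pixels)]

# three reconstructed frames, two pixels each
x_hat_frames = [[0.10, 0.80],
                [0.20, 0.70],
                [0.90, 0.75]]
B0 = update_background(x_hat_frames)   # [0.20, 0.75]
```

The outlier 0.90 at the first pixel (a passing foreground object, say) does not drag B0 away from the dominant background value, which is exactly why the L1/median formulation suits this problem.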

(3) The update of ε is an optimization problem. Optimizing ε is equivalent to minimizing its logarithmic form; we set the derivative of that form to zero.

The closed-form optimum of ε then follows from this condition.

After the training of the Background Extraction Network (BEN) is finished, for the video frames x_j (j = 1, 2, ..., D) we obtain a clean, static background B0 and the tolerance measure ε of the background variations. The reconstructed output is not exactly the background image, though the deep auto-encoder network BEN can remove some foregrounds in some sense.

So we adopt a separation function to clean the output further, yielding the cleaned background images B_j.

If the ith pixel of the jth output lies within the tolerance, i.e. |x^_ji - B0_i| <= ε_i, then the ith pixel of the jth background image equals x^_ji; otherwise, it equals B0_i. For the input D video frames, we obtain a clean background image set (B_1, ..., B_D) in some sense.
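A minimal sketch of this separation rule (pixel values are illustrative):

```python
def separate(frames, B0, eps):
    """Keep a pixel when it lies within the tolerance of the static
    background B0; otherwise replace it by the background value."""
    return [[x if abs(x - b) <= e else b
             for x, b, e in zip(frame, B0, eps)]
            for frame in frames]

frames = [[0.10, 0.95],
          [0.18, 0.22]]
B0  = [0.15, 0.20]
eps = [0.10, 0.10]
cleaned = separate(frames, B0, eps)
# pixel (0,1) deviates by 0.75 > 0.10, so it is replaced by B0[1] = 0.20
```

Pixels within tolerance keep their observed (dynamic) values, so the cleaned images still carry the background's natural variation; only pixels far outside the tolerance, presumed foreground, are overwritten.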

2. Dynamic Background Learning Another deep auto-encoder network (named Background Learning Network, BLN) is used to further learn the dynamic background model.

The clean background images are used as the input data to train the parameters of the BLN. The Background Learning Network is trained with the same cross-entropy cost as before.

Online Learning. In the previous section, just D frames are used to train the dynamic background model. This limited number of samples may produce overfitting. To incorporate more data, we propose an online learning method. Our aim is to find the weight vectors whose effect on the cost function is low.

Firstly, the weight matrix W is rewritten as W = (w_1, ..., w_M), where each w_k is an N-dimensional vector and M is the number of higher-layer nodes.

Then, let δw denote a disturbance from w, giving the perturbed weights w + δw.

Using Taylor's theorem, we get L(w + δw) ≈ L(w) + g·δw + (1/2) δw^T H δw, where g is the gradient and H is the Hessian matrix of the cost function; here we ignore the third-order term.
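The expansion suggests scoring each weight vector by its second-order effect on the cost. The sketch below keeps only the diagonal of H and drops the gradient term (as at a local minimum); both are simplifying assumptions for illustration, and all numbers are made up:

```python
def saliency(weight_vectors, hessian_diags):
    """Second-order estimate 0.5 * dw^T H dw of the cost change when a
    weight vector is zeroed out (dw = -w), keeping only the diagonal
    of H -- a simplification assumed here for illustration."""
    return [0.5 * sum(h * w * w for w, h in zip(wv, hd))
            for wv, hd in zip(weight_vectors, hessian_diags)]

weight_vectors = [[0.50, -0.10],   # large weights: removing them matters
                  [0.01,  0.02]]   # tiny weights: cheap to replace
hessian_diags  = [[2.0, 2.0],
                  [2.0, 2.0]]
scores = saliency(weight_vectors, hessian_diags)
# the lowest-scoring vectors are the candidates for substitution
# when new frames arrive during online learning
```

This is the same intuition behind classical second-order pruning: a low score means the cost surface is flat along that vector, so replacing it disturbs the learned background model the least.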

For a two-hidden-layer auto-encoder network, the optimization problem is solved over the weights of the two hidden layers, where e_j is the jth column and e_k the kth column of the identity matrix.

We sort the resulting values for each j and k, respectively. A weight vector w_j whose value falls below a threshold controlled by an artificial parameter is substituted by a randomly chosen vector whose value lies above that threshold.

Experimental Results We use six publicly available video data sets in our experiments, including Jug, Railway, Lights, Trees, Water-Surface, and Fountain to evaluate the performance.

1. Parameter Setting. TPR versus the tolerance parameter on the six data sets: different values of this parameter provide different tolerances of the dynamic background.

We compute the TPR on the six data sets for different values of this parameter. In the discussion below, we choose the value corresponding to the highest TPR on each data set: 0.5, 0.4, 0.4, 0.5, 0.6, and 0.4 for Jug, Lights, Fountain, Railway, Water-Surface, and Trees, respectively.

2. Comparisons to Previous Works Comparisons of ROC Curves

Table 1: Comparisons of F-measure on Fountain, Water-Surface, Trees and Lights. Table 2: Comparisons of F-measure on Jug and Railway.

Comparisons of foreground extraction

Comparisons of the online learning strategy.

Thank you! Feel free to contact us: