A Feature-Space-Based Indicator Kriging Approach for Remote Sensing Image Classification. Ke-Sheng Cheng, Department of Bioenvironmental Systems Engineering, National Taiwan University.

Image Classification
Statistical pattern recognition techniques are widely used for land use/land cover classification. Some supervised classification algorithms:
- Parametric approaches: maximum likelihood classifier, Bayes classifier
- Non-parametric approaches: nearest neighbour classifier, artificial neural network classifier

ANN Classifiers
ANN classifiers do not treat classification features as having probability distributions; therefore, classification is not explicitly probability-based. Loosely speaking, ANN classification is a search for the weight vector that minimizes the sum of squared errors between the network outputs and the desired output responses.

ANN Classifiers It has been shown that the output of a backpropagation network can approximate the posterior density function, if its activation function is capable of representing the a posteriori probability function and the number of training samples is sufficiently large (Lee et al., 1991). Manry et al. (1996) also showed that a neural network can approximate the minimum mean square estimator arbitrarily well, provided that it is of adequate size and is well-trained.

ANN Classifiers Egmont-Petersen et al. (2002) point out that ANNs suffer from what is known as the black-box problem: given any input a corresponding output is produced, but it cannot be elucidated why this decision was reached, how reliable it is, etc.

Image Classification
The work of image classification can be considered as partitioning the feature space (a hyperspace) using discriminant rules established from training samples. Each sample point in feature space is labeled with a class index.

Difficulties Encountered in Application of Parametric Approaches
- Application of parametric approaches requires knowledge of the probability distributions of the classification features.
- Classification features often have finite mixture distributions (multi-modal class densities).
- The class distributions may be non-Gaussian.

Geostatistical Approach of Spatial Estimation
Geostatistics is a set of techniques, often referred to as kriging methods, that utilize the spatial covariance function or the semi-variogram for spatial data analysis.
- Ordinary kriging yields the best linear unbiased estimator (BLUE).
- Indicator kriging yields an estimate of the probability distribution at specified locations.

Since the probability densities of and correlations among classification features carry useful information, probability-based classification methods appeal to many researchers and practitioners. Probability-based classification can be conceived as a spatial estimation problem in which the estimates are the probabilities that a pixel with a given feature vector belongs to the different classes.

Ordinary Kriging
Ordinary kriging assumes second-order stationarity for the random field {Z(x), x ∈ Ω}: the mean E[Z(x)] = m is constant, and the covariance Cov[Z(x), Z(x+h)] = C(h) depends only on the separation vector h.

Properties of OK Estimates
The OK estimator is a weighted average of the observations, Z*(x0) = Σi λi Z(xi).
- Unbiased, i.e., E[Z*(x0)] = E[Z(x0)], which requires Σi λi = 1.
- Minimum variance of the estimation error: a conditional minimization, minimizing Var[Z*(x0) − Z(x0)] subject to Σi λi = 1 (solved with a Lagrange multiplier).

Ordinary Kriging System
Setting the derivatives of the constrained objective to zero yields the OK system:
Σj λj γ(xi, xj) + μ = γ(xi, x0),  i = 1, …, N
Σj λj = 1
where μ is the Lagrange multiplier and γ is the semi-variogram, γ(h) = (1/2) E[(Z(x+h) − Z(x))²].

Typical Form of a Variogram
The variogram characterizes the spatial variation of a random field. A typical variogram rises from the nugget at small lags and levels off at the sill once the lag distance exceeds the range.
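A common parametric model exhibiting this nugget/sill/range behaviour is the spherical variogram. A minimal sketch (the model choice and the parameter values are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def spherical_variogram(h, nugget=0.1, sill=1.0, rng=10.0):
    """Spherical semi-variogram: zero at lag 0, rises to the sill at the
    range, and stays flat for larger lags."""
    h = np.asarray(h, dtype=float)
    hr = np.minimum(h / rng, 1.0)           # lags beyond the range are capped
    gamma = nugget + (sill - nugget) * (1.5 * hr - 0.5 * hr ** 3)
    return np.where(h == 0.0, 0.0, gamma)   # gamma(0) = 0 by convention
```

In practice the nugget, sill, and range are fitted to the empirical variogram of the data.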

Matrix Form of OK System
The OK system can be written in matrix form as
[ Γ   1 ] [ λ ]   [ γ0 ]
[ 1ᵀ  0 ] [ μ ] = [ 1  ]
where Γ is the N×N matrix of semi-variogram values between observation points, γ0 is the vector of semi-variogram values between the observation points and the estimation point, λ is the weight vector, and μ is the Lagrange multiplier.
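Given a semi-variogram function of separation distance, the bordered matrix system can be assembled and solved directly; a numpy sketch (the function and variable names are my own, not from the slides):

```python
import numpy as np

def ok_weights(coords, target, gamma):
    """Assemble and solve the OK system [[Gamma, 1], [1^T, 0]] [lam; mu]
    = [gamma_0; 1]. coords: (N, d) observation locations, target: (d,)
    estimation point, gamma: semi-variogram function of distance.
    Returns the kriging weights and the Lagrange multiplier."""
    n = len(coords)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(dists)        # Gamma block (zero on the diagonal)
    A[n, n] = 0.0                   # corner of the bordered matrix
    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(coords - target, axis=-1))
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n]
```

The last row of the system forces the weights to sum to one, which is exactly the unbiasedness constraint.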

Indicator Kriging
Indicator kriging is a method of spatial estimation that yields an estimate of the probability distribution function of the random variable of interest. Consider a categorical random field of k classes defined over a spatial domain Ω. A total of n features are used for classification of the k classes. For convenience of illustration, let's assume k = 3 and n = 2. From a set of training pixels, we first establish the k-class scatter plot in feature space.

Scatter Plot in Feature Space

Indicator Variable
For a continuous random field, indicator variables defined at a set of cutoff values can be used to estimate the distribution of the random variable. The indicator variable at location x is defined as
I(x; zc) = 1 if Z(x) ≤ zc, and I(x; zc) = 0 otherwise,
where zc is a selected cutoff value.

The weighted average of the indicator variables is an estimate of the cumulative probability, i.e.,
F̂(zc) = P[Z(x0) ≤ zc] ≈ Σj λj I(xj; zc).
If Z(xj), j = 1, 2, …, N, are mutually independent, then λj = 1/N. For a random field with spatial autocorrelation, an indicator variogram must be established and used to estimate the cumulative probability of the random variable at unobserved locations.
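In the mutually independent case (λj = 1/N), the estimate reduces to the empirical CDF; a small illustrative sketch:

```python
import numpy as np

def indicator_cdf(samples, cutoffs):
    """Equal-weight indicator estimate of the cumulative probability:
    for each cutoff z_c, average the indicator I(x_j; z_c) = 1 if
    z_j <= z_c over the N samples, i.e. lambda_j = 1/N."""
    z = np.asarray(samples, dtype=float)
    return np.array([float((z <= c).mean()) for c in cutoffs])
```

With spatially correlated data, these equal weights would be replaced by kriging weights derived from the indicator variogram.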

Indicator Variable for Categorical Random Field
Similar to the case of a continuous random field, indicator variables can be used to estimate the probability that a pixel belongs to a certain class of a categorical random field. Let the indicator variable be defined as
Ij(x) = 1 if C(x) = cj, and Ij(x) = 0 otherwise,
where cj represents the j-th class and C(x) represents the class of the pixel at location x. Ij(x) is the value of the indicator variable associated with the j-th class.

The weighted average of the indicator values is an estimate of the probability that a pixel belongs to the j-th class, i.e.,
P̂[C(x0) = cj] = Σi λi Ij(xi).

Class-Specific Indicator Variable Scatter Plot in Feature Space
Three-class scatter plot of indicator variables in two-dimensional feature space, and the class-specific scatter plots of indicator variables (binary scatter plots) for Class 1, Class 2, and Class 3.

For each binary scatter plot, we consider the variation of the indicator variable as a random field associated with that particular class. By conducting ordinary kriging of the indicator variables in feature space, once for each class, we obtain the probability that the pixel of interest belongs to each individual class. Class assignment of the pixel of interest is based on the following criterion: if P̂j = maxk P̂k, then assign the pixel to class cj.
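The decision rule can be sketched end to end. For brevity this sketch substitutes equal weights over the k training pixels nearest in feature space for the kriging weights (the full method derives the weights from class-specific indicator variograms via ordinary kriging); the function name and parameters are my own:

```python
import numpy as np

def ik_classify(train_feats, train_labels, pixel_feat, k=15, n_classes=3):
    """For each class j, estimate P(pixel belongs to class j) as the
    average of the indicator I_j over the k training pixels nearest to
    the pixel's feature vector, then assign the class with the largest
    estimated probability."""
    d = np.linalg.norm(train_feats - pixel_feat, axis=1)
    nearest = np.argsort(d)[:k]     # k nearest neighbours in feature space
    probs = np.array([(train_labels[nearest] == j).mean()
                      for j in range(n_classes)])
    return int(np.argmax(probs)), probs
```

Replacing the equal weights with weights solved from the per-class indicator variograms recovers the feature-space indicator kriging classifier described above.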

Study Area and Data
An area of approximately 70 km² in central Taiwan is selected as our study site. The study area includes a small township and a nearby mountainous area. A major river flows westward along the northern edge of the area. Five land cover classes (water, built-up, forest, crop, and bare land) are identified in this study. A SPOT satellite image acquired on September 21, 2001 was used for land cover classification.

SPOT Image of the Study Area

Three classification features (green, red and near infrared bands) were used. A total of 1886 training pixels and 732 verification pixels were selected.

Confusion Matrix – ML (Training)

Confusion Matrix – IK (Training)

Confusion Matrix – IK (Verification)

A Further Test Case

Further Considerations Replicates in feature space. Anisotropic variation in feature space.

Conclusions
- The indicator kriging approach is distribution-free; it does not require knowledge of the distribution types of the classification features.
- The IK algorithm achieves high classification accuracies.

References
Bierkens, M. F. and P. A. Burrough, 1993. The indicator approach to categorical soil data. I. Theory. J. of Soil Science, 44, pp. 361-368.
Bierkens, M. F. and P. A. Burrough, 1993. The indicator approach to categorical soil data. II. Application to mapping and land use suitability analysis. J. of Soil Science, 44, pp. 369-381.
Meer, F. V. D., 1996. Classification of remotely-sensed imagery using an indicator kriging approach: application to the problem of calcite-dolomite mineral mapping. Int. J. of Remote Sensing, Vol. 17, No. 6, pp. 1233-1249.

References
Journel, A. G., 1983. Non-parametric estimation of spatial distributions. Math. Geol., 15, pp. 445-468.
Lillesand, T. M., W. L. Johnson, R. L. Deuell, O. M. Lindstrom and D. E. Meisner, 1983. Use of Landsat data to predict the trophic state of Minnesota lakes. Photogrammetric Engineering & Remote Sensing, Vol. 49, No. 2, pp. 219-229.
Lillesand, T. M. and R. W. Kiefer, 1994. Remote Sensing and Image Interpretation.