Image Classification: Supervised Methods

Image Classification: Supervised Methods. Lecture 8. Prepared by R. Lathrop 11/99, updated 3/06. Readings: ERDAS Field Guide 5th Ed., Ch. 6: 234-260

Where in the World?

Learning objectives
Remote sensing science and math concepts:
- Basic concept of supervised classification
- Major classification algorithms
- Hard vs. fuzzy classification
Skills:
- Training set selection: digitized polygon vs. seed pixel/region growing
- Training aids: plots of training data, statistical measures of separability
- Editing/evaluating signatures
- Applying classification algorithms

Supervised vs. Unsupervised Approaches
Supervised (prior decision): the image analyst "supervises" the selection of spectral classes that represent patterns or land cover features the analyst can recognize.
Unsupervised (posterior decision): statistical "clustering" algorithms select the spectral classes inherent to the data; more computer-automated.

Supervised vs. Unsupervised: workflow
Supervised: select training fields ---> edit/evaluate signatures ---> classify image ---> evaluate classification
Unsupervised: run clustering algorithm ---> identify classes ---> edit/evaluate signatures ---> evaluate classification

Supervised vs. Unsupervised
Supervised (prior decision): from information classes in the image to spectral classes in feature space.
Unsupervised (posterior decision): from spectral classes in feature space to information classes in the image.
[Figure: feature space plot with Red and NIR axes]

Training
Training: the process of defining the criteria by which spectral patterns are recognized.
Spectral signature: the result of training that defines a training sample or cluster.
- Parametric: based on statistical parameters that assume a normal distribution (e.g., mean, covariance matrix)
- Nonparametric: not based on statistics but on discrete objects (polygons) in feature space

Supervised Training Set Selection
Objective: select a homogeneous (unimodal) area for each apparent spectral class.
- Digitized polygons: high degree of user control; often results in an overestimate of spectral class variability
- Seed pixel: region-growing technique that reduces within-class variability; the analyst sets a threshold of acceptable variance, the total number of pixels, and the adjacency criteria (horizontal/vertical, diagonal)
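The seed pixel technique can be sketched roughly as follows. This is a minimal single-band illustration, not the ERDAS implementation: the function name `grow_region`, the acceptance test (absolute DN difference from the seed, standing in for the spectral distance or variance threshold), and the sample grid are all assumptions made for the example.

```python
from collections import deque


def grow_region(image, seed, max_distance, max_pixels, diagonal=False):
    """Grow a training region outward from a seed pixel (hypothetical helper).

    image: 2-D list of DNs for one band; seed: (row, col).
    A neighbour is accepted when its DN is within max_distance of the seed DN.
    Growth stops when no neighbours qualify or max_pixels is reached.
    """
    rows, cols = len(image), len(image[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # horizontal/vertical adjacency
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seed_dn = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue and len(region) < max_pixels:
        r, c = queue.popleft()
        for dr, dc in steps:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if (nr, nc) in region:
                continue
            if abs(image[nr][nc] - seed_dn) <= max_distance:
                region.add((nr, nc))
                queue.append((nr, nc))
                if len(region) >= max_pixels:
                    break
    return region


image = [
    [10, 11, 30, 31],
    [12, 10, 29, 30],
    [11, 12, 11, 32],
    [50, 51, 12, 10],
]
region = grow_region(image, seed=(0, 0), max_distance=3, max_pixels=100)
small_region = grow_region(image, seed=(0, 0), max_distance=3, max_pixels=4)
```

Raising `max_distance` lets the region spread into spectrally more varied pixels, which is the trade-off between the two spectral distance settings compared on the region growing slides.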

ERDAS Area of Interest (AOI) tools Seed pixel or region growing dialog

Region Growing: good for linear features. [Figure panels: Spectral Distance = 7; Spectral Distance = 10]

Region Growing: good for spectrally heterogeneous features. [Figure panels: Spectral Distance = 5; Spectral Distance = 10]

Supervised Training Set Selection: whether using the digitized polygon or the seed pixel technique, the analyst should select multiple training sites to identify the many possible spectral classes in each information class of interest.

Guided Clustering: hybrid supervised/unsupervised approach
1) Polygonal areas of known land cover type are delineated as training sites.
2) ISODATA unsupervised clustering is performed on these training sites.
3) Clusters are evaluated and then combined into a single training set of spectral signatures.

Training Stage
Training set ---> training vector
The training vector for each spectral class represents a sample in n-dimensional measurement space, where n = number of bands. For a given spectral class j:
$$X_j = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}$$
where $X_1$ = mean DN in band 1, $X_2$ = mean DN in band 2, and so on for each band.
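As a small illustration of how a training vector is obtained, the sketch below computes the per-band mean vector and the sample covariance matrix from a handful of training pixels. The function name `signature` and the pixel values are made up for the example; the covariance uses the usual n - 1 denominator, which is an assumption about how the signature statistics are computed.

```python
def signature(samples):
    """Return (mean_vector, covariance_matrix) for a list of pixel vectors."""
    n = len(samples)
    bands = len(samples[0])
    # Mean DN in each band: this is the training vector X_j.
    mean = [sum(p[b] for p in samples) / n for b in range(bands)]
    # Sample covariance between every pair of bands.
    cov = [
        [
            sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in samples) / (n - 1)
            for j in range(bands)
        ]
        for i in range(bands)
    ]
    return mean, cov


# Hypothetical training pixels: (band 1 DN, band 2 DN)
training_pixels = [(40, 90), (42, 94), (38, 88), (44, 96)]
mean, cov = signature(training_pixels)
```

The mean vector locates the class in feature space, while the covariance matrix is what the parametric classifiers later use to describe the class's spread and orientation.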

Classification Training Aids
Goal: evaluate spectral class separability
1) Graphical plots of training data: histograms, coincident spectral plots, scatter plots
2) Statistical measures of separability: divergence, Mahalanobis distance
3) Training area classification
4) Quick alarm classification: parallelepiped

Parametric vs. Nonparametric Distance Approaches
Parametric: based on statistical parameters that assume a normal distribution of the clusters (e.g., mean, standard deviation, covariance).
Nonparametric: not based on "normal" statistics, but on discrete objects and simple spectral distance in feature space.

Parametric Assumption: each spectral class exhibits a unimodal normal distribution. [Figure: histogram of number of pixels vs. Digital Number (axis to 255); a bimodal histogram indicates a mix of Class 1 and Class 2]

Training Aids: graphical portrayals of training data. Histogram (check for normality). [Figure: "good" and "bad" example histograms]

Training Aids: graphical portrayals of training data. Coincident spectral mean plots.

Training Aids
Scatter plots: each training set sample constitutes an ellipse in feature space, which provides 3 pieces of information:
- Location of the ellipse: mean vector
- Shape of the ellipse: covariance
- Orientation of the ellipse: slope and sign of the covariance
Requires the training vector and the covariance matrix.

Examine ellipses for gaps and overlaps. Overlapping ellipses are acceptable within an information class; limit overlap between information classes. [Figure: feature space ellipses labeled Mix: grass/trees, Broadleaf, and Conifer]

Training Aids
Are some training sets redundant, or do they overlap too greatly?
Statistical measures of separability: expressions of statistical distance that are sensitive to both mean and variance
- Divergence
- Mahalanobis distance
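One way to make a separability measure concrete is the standard divergence formula for two normally distributed classes, $D_{ij} = \tfrac{1}{2}\,\mathrm{tr}\big[(C_i - C_j)(C_j^{-1} - C_i^{-1})\big] + \tfrac{1}{2}\,\mathrm{tr}\big[(C_i^{-1} + C_j^{-1})(\mu_i - \mu_j)(\mu_i - \mu_j)^T\big]$. The sketch below evaluates it for two-band signatures only, so that the 2x2 inverse can be written out by hand; the helper names and the example signatures are hypothetical.

```python
def inverse_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def trace_of_product(x, y):
    """Trace of the matrix product x @ y for 2x2 matrices."""
    return sum(x[i][k] * y[k][i] for i in range(2) for k in range(2))


def divergence(mean_i, cov_i, mean_j, cov_j):
    """Divergence between two 2-band class signatures."""
    inv_i, inv_j = inverse_2x2(cov_i), inverse_2x2(cov_j)
    cov_diff = [[cov_i[r][c] - cov_j[r][c] for c in range(2)] for r in range(2)]
    inv_diff = [[inv_j[r][c] - inv_i[r][c] for c in range(2)] for r in range(2)]
    inv_sum = [[inv_i[r][c] + inv_j[r][c] for c in range(2)] for r in range(2)]
    delta = [mean_i[0] - mean_j[0], mean_i[1] - mean_j[1]]
    outer = [[delta[r] * delta[c] for c in range(2)] for r in range(2)]
    covariance_term = 0.5 * trace_of_product(cov_diff, inv_diff)  # shape difference
    mean_term = 0.5 * trace_of_product(inv_sum, outer)            # mean separation
    return covariance_term + mean_term


identity = [[1.0, 0.0], [0.0, 1.0]]
wide = [[4.0, 0.0], [0.0, 4.0]]
d_same = divergence([0, 0], identity, [0, 0], identity)   # identical signatures
d_means = divergence([0, 0], identity, [3, 4], identity)  # means differ
d_covs = divergence([0, 0], identity, [0, 0], wide)       # only variances differ
```

Identical signatures give a divergence of zero, and the value grows both when the means move apart and when the covariances differ, which is the sense in which the measure is sensitive to both mean and variance.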

Training Aids
Training/test area classification: look for misclassification between information classes; training areas can be biased, so it is better to use independent test areas.
Quick alarm classification: on-screen evaluation of all pixels that fall within the training decision region (e.g., parallelepiped).

Classification Decision Process
Decision rule: a mathematical algorithm that, using the data contained in the signature, performs the actual sorting of pixels into discrete classes.
Decision rules may be parametric or nonparametric.

Parallelepiped or Box Classifier
Decision region: the rectangular area defined by the highest and lowest DNs in each band; specified by range (min/max) or by standard deviations.
- Pro: takes variance into account (Con: lacks sensitivity to covariance)
- Pro: computationally efficient, useful as a first pass
- Pro: nonparametric
- Con: decision regions may overlap; some pixels may remain unclassified
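A minimal sketch of the box rule, under assumed class limits: each class is a pair of per-band lower and upper limits, and a pixel belongs to every box that contains it. The function names, class names, and DN limits below are illustrative only.

```python
def box_from_stats(mean, std, k=2.0):
    """Build box limits as mean +/- k standard deviations in each band."""
    lows = [m - k * s for m, s in zip(mean, std)]
    highs = [m + k * s for m, s in zip(mean, std)]
    return lows, highs


def parallelepiped_classify(pixel, boxes):
    """Return the names of all classes whose box contains the pixel.

    An empty list means the pixel is unclassified; more than one name
    means the pixel falls in an overlap region.
    """
    hits = []
    for name, (lows, highs) in boxes.items():
        if all(lo <= v <= hi for v, lo, hi in zip(pixel, lows, highs)):
            hits.append(name)
    return hits


# Hypothetical (red, NIR) limits per class: (lows, highs)
boxes = {
    "water": ([10, 5], [30, 20]),
    "soil": ([60, 50], [110, 90]),
    "vegetation": ([25, 80], [70, 160]),
}

inside_one = parallelepiped_classify((20, 10), boxes)
in_overlap = parallelepiped_classify((65, 85), boxes)
outside_all = parallelepiped_classify((200, 200), boxes)
```

The three example pixels show the three possible outcomes: a single class, an overlap between two boxes that needs a tie-breaking rule, and no class at all.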

Parallelepiped or Box Classifier: the upper and lower limits of each box are set by either range (min/max) or number of standard deviations. Note the overlap in the Red band but not in the NIR band.

Parallelepipeds have "corners". [Figure, adapted from ERDAS Field Guide: parallelepiped boundary and signature ellipse plotted in Red reflectance vs. NIR reflectance space, with the class means μred and μNIR and a candidate pixel marked]

Parallelepiped or Box Classifier: problems. [Figure, adapted from Lillesand & Kiefer, 1994: boxes for Veg 1-3, Soil 1-3, and Water 1-2 in Red reflectance vs. NIR reflectance space, showing unclassified pixels, an overlap region, and a misclassified pixel]

Minimum Distance to Means
Compute the mean of each desired class, then classify unknown pixels into the class with the closest mean using simple Euclidean distance.
- Con: insensitive to variance and covariance
- Pro: computationally efficient
- Pro: all pixels are classified; thresholding can be used to eliminate pixels far from the means

Minimum Distance to Means Classifier. [Figure, adapted from Lillesand & Kiefer, 1994: class means for Veg 1-3, Soil 1-3, and Water 1-2 in Red reflectance vs. NIR reflectance space]

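Minimum Distance to Means Classifier: Euclidean Spectral Distance
Example for the points (92, 153) and (180, 85): Xd = 180 - 92 = 88 and Yd = 85 - 153 = -68, so $\text{Distance} = \sqrt{X_d^2 + Y_d^2} = \sqrt{88^2 + (-68)^2} \approx 111.2$.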
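The minimum distance rule and the distance example can be sketched in a few lines. The function name, the class means, and the optional `max_distance` threshold are assumptions for illustration; `math.dist` requires Python 3.8 or later.

```python
import math


def min_distance_classify(pixel, class_means, max_distance=None):
    """Assign the pixel to the class with the nearest mean (Euclidean).

    If max_distance is given, pixels farther than that from every mean
    are labeled "unclassified".
    """
    best_name, best_dist = None, math.inf
    for name, mean in class_means.items():
        dist = math.dist(pixel, mean)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if max_distance is not None and best_dist > max_distance:
        return "unclassified", best_dist
    return best_name, best_dist


# The distance example from the slide: points (92, 153) and (180, 85).
example_distance = math.dist((92, 153), (180, 85))

# Hypothetical (red, NIR) class means.
class_means = {"water": (20, 10), "soil": (85, 70), "vegetation": (45, 120)}
label, distance = min_distance_classify((50, 110), class_means)
thresholded, _ = min_distance_classify((50, 110), class_means, max_distance=5)
```

Without a threshold every pixel receives a label, however far it lies from the nearest mean; the threshold is what lets distant pixels remain unclassified.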

Feature Space Classification
The image analyst draws decision regions directly on the feature space image using AOI tools; often useful for a first-pass broad classification.
Pixels that fall within a user-defined feature space class are assigned to that class.
- Pro: good for classes with a non-normal distribution
- Con: potential problem with overlap and unclassified pixels

Feature Space Classifier Analyst draws decision regions in feature space

Statistically-based Classifiers
Define a probability density (statistical) surface.
Each pixel is evaluated for its statistical probability of belonging to each category and is assigned to the class with the maximum probability.
The probability density function for each spectral class can be completely described by the mean vector and covariance matrix.

2-D vs. 1-D views of class overlap. [Figure: classes wi and wj shown in Band 1 vs. Band 2 feature space, and as histograms of number of pixels vs. Band 1 Digital Number (axis to 255)]

Probabilities used in the likelihood ratio. [Figure: histograms of classes wi and wj, number of pixels vs. Digital Number (axis to 255), with p(x | wi) and p(x | wj) marked at a value x]

Spectral classes as probability surfaces: ellipses are defined by the class mean and covariance, creating likelihood contours around each spectral class.

Sensitive to large covariance values: some classes may have large variance and greatly overlap other spectral classes.

Mahalanobis Distance Classifier
$$D = (X - M_c)^T \, \mathrm{COV}_c^{-1} \, (X - M_c)$$
where D = Mahalanobis distance, c = a particular class, X = measurement vector of the candidate pixel, $M_c$ = mean vector of class c, $\mathrm{COV}_c$ = covariance matrix of class c, $\mathrm{COV}_c^{-1}$ = inverse of the covariance matrix, and T = transposition.
- Pro: takes the variability of the classes into account using information from the COV matrix
- Similar to maximum likelihood but without the weighting factors
- Con: parametric, therefore sensitive to large variances
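The distance D can be evaluated directly for a two-band case, where the covariance inverse has a simple closed form. The sketch below is illustrative only: the function name and the two signatures, "tight" and "broad", are hypothetical.

```python
def mahalanobis_2band(pixel, mean, cov):
    """D = (X - Mc)^T COVc^-1 (X - Mc) for a two-band pixel."""
    dx, dy = pixel[0] - mean[0], pixel[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


# Hypothetical signatures: (mean vector, covariance matrix)
signatures = {
    "tight": ((50, 50), [[4.0, 0.0], [0.0, 4.0]]),
    "broad": ((70, 50), [[100.0, 0.0], [0.0, 100.0]]),
}

pixel = (58, 50)
distances = {
    name: mahalanobis_2band(pixel, mean, cov)
    for name, (mean, cov) in signatures.items()
}
label = min(distances, key=distances.get)
```

The pixel is closer to the "tight" mean in Euclidean terms (8 DN against 12 DN), yet it is assigned to "broad" because that class's large variance shrinks its Mahalanobis distance. This is the sensitivity to large variances listed as a Con.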

Maximum Likelihood Classifier
- Pro: potentially the most accurate classifier, as it incorporates the most information (mean vector and COV matrix)
- Con: parametric procedure that assumes the spectral classes are normally distributed
- Con: sensitive to large values in the covariance matrix
- Con: computationally intensive
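For normally distributed classes with equal prior probabilities, maximizing the likelihood amounts to maximizing $-\tfrac{1}{2}\ln|\mathrm{COV}_c| - \tfrac{1}{2}D$, where D is the Mahalanobis distance; the determinant term is one of the weighting factors that the Mahalanobis classifier leaves out. The two-band sketch below uses hypothetical signatures and helper names, and drops the constant term shared by all classes.

```python
import math


def mahalanobis_2band(pixel, mean, cov):
    """D = (X - Mc)^T COVc^-1 (X - Mc) for a two-band pixel."""
    dx, dy = pixel[0] - mean[0], pixel[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def ml_score(pixel, mean, cov):
    """Gaussian log-likelihood up to a constant shared by all classes."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    return -0.5 * math.log(det) - 0.5 * mahalanobis_2band(pixel, mean, cov)


# Hypothetical signatures: (mean vector, covariance matrix)
signatures = {
    "tight": ((50, 50), [[4.0, 0.0], [0.0, 4.0]]),
    "broad": ((70, 50), [[100.0, 0.0], [0.0, 100.0]]),
}

pixel = (55, 50)
scores = {n: ml_score(pixel, m, c) for n, (m, c) in signatures.items()}
distances = {n: mahalanobis_2band(pixel, m, c) for n, (m, c) in signatures.items()}
ml_label = max(scores, key=scores.get)
mahalanobis_label = min(distances, key=distances.get)
```

For this pixel the two rules disagree: the Mahalanobis rule picks "broad", while the likelihood score penalizes that class's large covariance determinant and picks "tight".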

Bayes Optimal Approach
Designed to minimize the average (expected) cost of misclassification in the maximum likelihood approach.
Uses an a priori (prior probability) term to weight decisions, weighting more heavily towards common classes.
Example: if the prior probability suggests that 60% of the pixels are forest, the classifier weights more heavily towards forest in borderline cases.
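The effect of the prior term can be shown with a borderline pixel. The sketch assumes equal misclassification costs, so the rule reduces to picking the class with the largest posterior, proportional to p(x | class) × P(class). The likelihood values are invented, and the forest prior of 0.6 assumes the example's figure of 60 is a percentage.

```python
def bayes_classify(likelihoods, priors):
    """Pick the class with the largest posterior probability.

    likelihoods: p(x | class) for the pixel; priors: P(class).
    Returns the winning class and the normalized posteriors.
    """
    scores = {c: likelihoods[c] * priors[c] for c in likelihoods}
    total = sum(scores.values())
    posteriors = {c: s / total for c, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors


# Hypothetical borderline pixel: the likelihoods slightly favor "field".
likelihoods = {"forest": 0.010, "field": 0.012}
priors = {"forest": 0.6, "field": 0.4}

without_priors = max(likelihoods, key=likelihoods.get)
with_priors, posteriors = bayes_classify(likelihoods, priors)
```

On likelihood alone the pixel would go to "field"; weighting by the priors tips the borderline case to "forest".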

Hybrid Classification
Various classification algorithms can easily be mixed in a multi-step process.
First pass: a nonparametric rule (feature space or parallelepiped) handles the most obvious cases; pixels that remain unclassified or fall in overlap regions go to the second pass.
Second pass: a parametric rule handles the difficult cases; the training data can be derived from unsupervised or supervised techniques.

Thresholding
Statistically-based classifiers perform most poorly near the tails of the training sample data distributions.
Thresholds can be used to define those pixels that have a higher probability of misclassification; these pixels can be excluded and labeled unclassified, or retrained using a cluster-busting type of approach.

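Thresholding: define those pixels that have a higher probability of misclassification. [Figure: histograms of Class 1 and Class 2, number of pixels vs. Digital Number (axis to 255), with threshold lines marking the unclassified regions in the tails]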

Thresholding: the chi-square distribution is used to help define a one-tailed threshold. [Figure: number of pixels vs. chi-square value; pixels with values above the threshold remain unclassified]
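Under the normality assumption, the Mahalanobis-type distance of a pixel to its own class follows a chi-square distribution with as many degrees of freedom as there are bands. For the two-band case the tail probability has the closed form $P(D > t) = e^{-t/2}$, so the threshold can be computed without a statistics library. The sketch below covers only that special case; the function names and the 95% level are assumptions.

```python
import math


def chi_square_threshold_2bands(confidence):
    """Chi-square critical value for 2 degrees of freedom (two bands).

    Solves exp(-t / 2) = 1 - confidence for t.
    """
    return -2.0 * math.log(1.0 - confidence)


def apply_threshold(label, distance, threshold):
    """Keep the label only if the pixel's distance is within the threshold."""
    return label if distance <= threshold else "unclassified"


threshold = chi_square_threshold_2bands(0.95)
kept = apply_threshold("forest", 1.44, threshold)
rejected = apply_threshold("forest", 16.0, threshold)
```

With a 95% level the threshold is about 5.99, so a pixel at distance 16 from its best class is left unclassified rather than forced into that class.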

Hard vs. Fuzzy Classification Rules
Hard: a "binary" either/or situation; a pixel belongs to one and only one class.
Fuzzy: soft boundaries; a pixel can have partial membership in more than one class.

Hard vs. Fuzzy Classification. [Figure, adapted from Jensen, 2nd ed., 1996: hard classification and fuzzy classification of Water, Forested Wetland, and Forest]

Hard vs. Fuzzy Classification. [Figure, adapted from Jensen, 2nd ed., 1996: hard decision boundaries between Water, Forested Wetland, and Forest in NIR reflectance vs. MIR reflectance space]

Fuzzy Classification in ERDAS
Fuzzy Classification: in the Supervised Classification option, the analyst can choose Fuzzy Classification and then choose the number of "best classes" per pixel. This creates multiple output classification layers, as many as the number of best classes chosen.
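The idea of keeping several "best classes" per pixel can be sketched by ranking the classes by distance and keeping the top few, with each rank corresponding to one output layer. This is not the ERDAS algorithm: the function name, the use of Euclidean distance to class means, and the class means themselves are assumptions for illustration (`math.dist` requires Python 3.8 or later).

```python
import math


def best_classes(pixel, class_means, n_best):
    """Return the n_best closest classes as (name, distance), nearest first."""
    ranked = sorted(
        class_means, key=lambda name: math.dist(pixel, class_means[name])
    )
    return [(name, math.dist(pixel, class_means[name])) for name in ranked[:n_best]]


# Hypothetical two-band class means.
class_means = {
    "forest": (30, 90),
    "forested wetland": (28, 70),
    "water": (10, 15),
}

layers = best_classes((29, 78), class_means, n_best=2)
layer_names = [name for name, _ in layers]
```

A hard classification would keep only the first entry; the second entry records that the pixel also lies close to another class, which is the information a fuzzy output preserves.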

Fuzzy Classification in ERDAS
Fuzzy Convolution: calculates the total weighted inverse distance of all the classes in a window of pixels and assigns the center pixel to the class with the largest total inverse distance summed over the entire set of fuzzy classification layers. This has the effect of creating a context-based classification. Classes with a very small distance value remain unchanged, while classes with higher distance values may change to a neighboring value if a sufficient number of neighboring pixels have that class value and small corresponding distance values.

Main points of the lecture
Training:
- Training set selection: digitized polygon vs. seed pixel/region growing
- Training aids: plots of training data, statistical measures of separability
- Editing/evaluating signatures
Classification algorithms: box classifier, minimum distance to means classifier, feature space classifier, statistically-based classifiers (maximum likelihood classifier, Mahalanobis distance classifier)
Hybrid classification: statistical + threshold method
Hard vs. fuzzy classification

Homework
1) Unsupervised classification (hand in your Excel file and process figure)
2) Reading: textbook Ch. 9: 337-389
3) Reading: Field Guide Ch. 7: 226-231, 235-253