DATA INTEGRATION AND ANALYSIS: IMAGE CLASSIFICATION Source & Courtesy: Evren Bakılan
What is Remote Sensing? The science and art of obtaining information about an object, area, or phenomenon through the analysis of data acquired by a device that is not in contact with the object, area, or phenomenon under investigation. The practice of deriving information about the earth's land and water surfaces using images acquired from an overhead perspective, using electromagnetic radiation in one or more regions of the electromagnetic spectrum, reflected or emitted from the earth's surface.
Applications of Remote Sensing Meteorology (Weather Prediction) Climatology Oceanography Coastal Studies Water Resources Geology Archeology Land cover/land use Classification and monitoring of urban, agricultural and marine environments from satellite images
Principle of Remote Sensing Interaction between incident radiation and the targets of interest Energy Source or Illumination (A) Radiation and the Atmosphere (B) Interaction with the Target (C) Recording of Energy by the Sensor (D) Transmission, Reception, and Processing (E) Interpretation and Analysis (F) Application (G)
Reflected Light
The “PIXEL”
Wavelength (Bands)
Band Combinations 3,2,1 4,3,2 5,4,3
Feature space image A graphical representation of the pixels, made by plotting 2 bands against each other (here Band 3 vs. Band 4) For a 6-band Landsat image, there are 15 feature space images
Each color represents a different “cluster” of pixels that may correspond to the land cover classes you are interested in
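The count of 15 follows from choosing 2 bands out of 6. A minimal sketch, assuming the six reflective Landsat TM bands (1, 2, 3, 4, 5 and 7); the band list is an assumption, not something stated on the slide:

```python
from itertools import combinations

# Assumed band numbers: the six reflective Landsat TM bands (thermal band 6 left out).
bands = [1, 2, 3, 4, 5, 7]

# Every unordered pair of bands gives one feature space image.
band_pairs = list(combinations(bands, 2))

print(len(band_pairs))  # number of feature space images
```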
Image Classification Why classify? Make sense of a landscape Place landscape into categories (classes) Forest, Agriculture, Water, etc. Classification scheme = structure of classes Depends on needs of users
What is a Classified Image? Image has been processed to put each pixel into a category Result is a vegetation map, land use map, or other map grouping related features Categories are defined by the intended use of the map Can be few or many categories, depending on the purpose of the map and available resources
Land Cover Classification Defining the pieces that make up the puzzle
Land cover classification steps Define why you want a classified image, how will it be used? Decide if you really need a classified image Define the study area Select or develop a classification scheme (legend) Select imagery Prepare imagery for classification Collect ancillary data Choose classification method and classify Adjust classification and assess accuracy
Image Classification
Example Uses Provide context: Landscape planning or assessment, Research projects Drive models: Global carbon budgets, Meteorology, Biodiversity
Application in Agriculture A - color infrared photograph of big lake B - classified image of big lake C - QuickBird image of big lake D - classified satellite image
Basic Strategy: How do you do it? Use radiometric properties of remote sensor Different objects have different spectral signatures
Basic Strategy: How do you do it? In an easy world, all “Vegetation” pixels would have exactly the same spectral signature Then we could just say that any pixel in an image with that signature was vegetation We’d do the same for soil, etc. and end up with a map of classes
Image Classification Example: Near Mary’s Peak Input data (Digital) -> classification process -> Output data (Thematic)
Image Classification black = water yellow = open/field dark green = dense forest light green = sparse forest bronze = mixed urban red = dense urban
Query Formulation Query “patch” – pertaining to some semantics, e.g. mountains Satellite Image Database Ranked Results Purpose? Geography – Find mountainous regions with snow-caps (low-level semantics). Forestry – Find forests of a certain density, analyze deforestation (mid-level semantics). Military – Find air-bases in certain regions of the world (high-level semantics).
Basic strategy: Dealing with variability Classification: Delineate boundaries of classes in n-dimensional space Assign class names to pixels using those boundaries
Information Classes vs. Spectral Classes Information classes are categorical, such as crop type, forest type, tree species, different geologic units or rock types, etc. Spectral classes are groups of pixels that are uniform (or near-similar) with respect to their brightness values in the different spectral channels of the data.
Classification Strategies Two basic strategies Supervised classification We impose our perceptions on the spectral data Unsupervised classification Spectral data imposes constraints on our interpretation
Image Classification
Supervised Classification
Statistical Techniques (based on probability distribution models, which may be parametric or nonparametric): Gaussian maximum likelihood classifier
Distribution Free: Euclidean classifier, K-nearest neighbour, Minimum distance, Decision Tree
Unsupervised Classification (Clustering)
No extensive prior knowledge required
Unknown, but distinct, spectral classes are generated
Limited control over classes and identities
No detailed information
Classification Approaches
Supervised Classification
Supervised Classification Supervised classification requires the analyst to select training areas where he/she knows what is on the ground and then digitize a polygon within that area… The computer then creates mean spectral signatures. [Figure: digital image with a known conifer area, a known water area and a known deciduous area, and the mean spectral signatures for Conifer, Water and Deciduous]
Supervised Classification [Figure: the spectral signature of the next pixel to be classified in the multispectral image is compared with the mean spectral signatures (Conifer, Deciduous, Water, Unknown) to produce the information layer (classified image)]
Areas of Interest
Detection of geometric features
Simple line detection filters:
 2 -1 -1     -1 -1  2     -1  2 -1     -1 -1 -1
-1  2 -1     -1  2 -1     -1  2 -1      2  2  2
-1 -1  2      2 -1 -1     -1  2 -1     -1 -1 -1
Detection of geometric features, e.g. buildings: high-pass filtering & thresholding, template matching
(Temporal) change detection: subtraction, ratio, correlation (movement), comparison of classified images
Areas of Interest Image Segmentation detection of homogeneous surfaces by means of thresholding or edge-detection Region-growing algorithm High-pass? Std-filtering? NDI?
Areas of Interest
1) Line-detection filters
 2 -1 -1     -1 -1  2     -1  2 -1     -1 -1 -1
-1  2 -1     -1  2 -1     -1  2 -1      2  2  2
-1 -1  2      2 -1 -1     -1  2 -1     -1 -1 -1
2) Averaging filter
1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9
3) Thresholding
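The line-detection filters, the averaging filter and thresholding listed above can share one small routine. The sketch below is a minimal illustration, not the implementation behind the slides: `apply_kernel`, the 5x5 test image and the threshold of 100 are all hypothetical, and border pixels are simply left at 0.

```python
def apply_kernel(image, kernel):
    """Apply a 3x3 kernel to every interior pixel; border pixels stay 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(
                kernel[i][j] * image[r + i - 1][c + j - 1]
                for i in range(3) for j in range(3)
            )
    return out

# Two of the four line-detection kernels, plus the averaging kernel.
HORIZONTAL = [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]
VERTICAL = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]
AVERAGE = [[1 / 9] * 3 for _ in range(3)]

# Hypothetical image: a bright horizontal line on a darker background.
image = [[10] * 5, [10] * 5, [50] * 5, [10] * 5, [10] * 5]

# 1) Line detection: strong response along the line, none from the vertical kernel.
response = apply_kernel(image, HORIZONTAL)
vertical_response = apply_kernel(image, VERTICAL)

# 2) Averaging: each interior pixel becomes the mean of its 3x3 neighbourhood.
smoothed = apply_kernel(image, AVERAGE)

# 3) Thresholding: keep only pixels with a strong line response (threshold is arbitrary).
mask = [[1 if value > 100 else 0 for value in row] for row in response]
```

Only the horizontal kernel responds to this image, which is why a bank of differently oriented kernels is used in practice.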
Supervised Classification “Training”
Training Areas
Supervised Classification “Segmentation”
Supervised Classification Common Classifiers: Parallelepiped Minimum distance to mean Maximum likelihood
Supervised Classification: Statistical Approaches Minimum distance to mean Find mean value of pixels of training sets in n-dimensional space All pixels in image classified according to the class mean to which they are closest
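The two steps above (compute class means from the training sets, then assign each pixel to the nearest mean) can be sketched as follows. The training pixels are hypothetical (red, near-IR) brightness values, and Euclidean distance is assumed as the distance measure.

```python
import math

def class_means(training):
    """Return the mean vector of each class from {name: [pixel, ...]}."""
    means = {}
    for name, pixels in training.items():
        n_bands = len(pixels[0])
        means[name] = [sum(p[b] for p in pixels) / len(pixels) for b in range(n_bands)]
    return means

def classify_min_distance(pixel, means):
    """Assign the pixel to the class whose mean is nearest (Euclidean distance)."""
    return min(means, key=lambda name: math.dist(pixel, means[name]))

# Hypothetical training pixels, two bands each.
training = {
    "water": [(20, 10), (22, 12), (18, 8)],
    "conifer": [(30, 90), (34, 94), (32, 86)],
}
means = class_means(training)
label = classify_min_distance((25, 70), means)
print(label)
```

Every pixel receives a label, however far it lies from all class means.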
Supervised Classification Parallelepiped Approach Pros: Simple Makes few assumptions about character of the classes
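A minimal sketch of the parallelepiped idea, assuming each class box is defined by the minimum and maximum training value in every band (other definitions, such as mean plus or minus a number of standard deviations, are also common). The function names and training values are hypothetical.

```python
def class_boxes(training):
    """Per class, the (min, max) range of the training pixels in every band."""
    boxes = {}
    for name, pixels in training.items():
        n_bands = len(pixels[0])
        boxes[name] = [
            (min(p[b] for p in pixels), max(p[b] for p in pixels))
            for b in range(n_bands)
        ]
    return boxes

def classify_parallelepiped(pixel, boxes):
    """Return every class whose box contains the pixel (may be none or several)."""
    return [
        name for name, ranges in boxes.items()
        if all(lo <= value <= hi for value, (lo, hi) in zip(pixel, ranges))
    ]

# Hypothetical training pixels, two bands each.
training = {
    "water": [(18, 8), (22, 12), (20, 10)],
    "conifer": [(30, 86), (34, 94), (32, 90)],
}
boxes = class_boxes(training)
inside = classify_parallelepiped((21, 9), boxes)
outside = classify_parallelepiped((25, 70), boxes)
```

The function returns a list because a pixel can fall in no box (left unclassified) or in several overlapping boxes.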
Supervised Classification
Supervised Classification: Minimum Distance Pros: All regions of n-dimensional space are classified Allows for diagonal boundaries (and hence no overlap of classes)
Maximum Likelihood Classifier [Figure: relative reflectance of Mean Signature 1, Mean Signature 2 and the candidate pixel across the Blue, Green, Red, Near-IR and Mid-IR bands] It appears that the candidate pixel is closest to Signature 1. However, when we consider the variance around the signatures…
Maximum Likelihood Classifier [Figure: the same plot with the variance around each mean signature shown] The candidate pixel clearly belongs to the signature 2 group.
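The effect of variance can be reproduced with a small numeric sketch. It assumes Gaussian classes with independent bands and equal prior probabilities, which is a simplification (maximum likelihood classifiers normally use the full covariance matrix of each class). The signature values are hypothetical.

```python
import math

def log_likelihood(pixel, mean, std):
    """Gaussian log-likelihood with independent bands (constant term dropped)."""
    return sum(
        -math.log(s) - (x - m) ** 2 / (2 * s ** 2)
        for x, m, s in zip(pixel, mean, std)
    )

def classify_max_likelihood(pixel, signatures):
    """Pick the signature under which the pixel is most probable."""
    return max(signatures, key=lambda name: log_likelihood(pixel, *signatures[name]))

# Hypothetical signatures: (mean per band, standard deviation per band).
signatures = {
    "signature 1": ((40, 40), (2, 2)),     # tight class
    "signature 2": ((60, 60), (15, 15)),   # widely spread class
}
candidate = (48, 48)

# Distance to the mean alone favours signature 1 ...
nearest_mean = min(signatures, key=lambda n: math.dist(candidate, signatures[n][0]))
# ... but the likelihood, which accounts for the spread, favours signature 2.
most_likely = classify_max_likelihood(candidate, signatures)
```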
Supervised Classification In addition to classified image, you can construct a “distance” image For each pixel, calculate the distance between its position in n-dimensional space and the center of class in which it is placed Regions poorly represented in the training dataset will likely be relatively far from class center points May give an indication of how well your training set samples the landscape
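A distance image can be sketched in a few lines, assuming Euclidean distance and a classified image stored as nested lists. The class means, pixel values and labels below are hypothetical.

```python
import math

def distance_image(image, labels, means):
    """Distance from each pixel to the mean of the class it was assigned to."""
    return [
        [math.dist(pixel, means[label]) for pixel, label in zip(img_row, lab_row)]
        for img_row, lab_row in zip(image, labels)
    ]

# Hypothetical class means and a 2x2 two-band image with its classification.
means = {"water": (20, 10), "conifer": (32, 90)}
image = [[(20, 10), (23, 14)],
         [(32, 90), (25, 70)]]
labels = [["water", "water"],
          ["conifer", "conifer"]]

distances = distance_image(image, labels, means)
```

Large values, such as the lower-right pixel here, flag pixels that the training set represents poorly.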
Supervised Classification Some advanced techniques Neural networks Use flexible, not-necessarily-linear functions to partition spectral space Contextual classifiers Incorporate spatial or temporal conditions Linear regression Instead of discrete classes, apply proportional values of classes to each pixel; i.e. 30% forest + 70% grass
Decision Rules in Spectral Feature Space Maximum Likelihood (Discriminant Analysis) Parallelepiped Minimum Distance to Means
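The proportional approach can be illustrated for the simplest case of two classes. The sketch assumes that a pixel is a linear mixture of one pure forest signature and one pure grass signature and finds the forest fraction by least squares; the function name and all signature values are hypothetical.

```python
def forest_fraction(pixel, forest, grass):
    """Least-squares fraction f such that pixel is close to f*forest + (1-f)*grass."""
    direction = [a - b for a, b in zip(forest, grass)]
    offset = [p - b for p, b in zip(pixel, grass)]
    f = sum(o * d for o, d in zip(offset, direction)) / sum(d * d for d in direction)
    # Clamp to a physically meaningful fraction between 0 and 1.
    return min(1.0, max(0.0, f))

# Hypothetical pure signatures in two bands.
forest = (30, 80)
grass = (60, 120)

# A pixel built as 30% forest + 70% grass.
mixed_pixel = (51, 108)
fraction = forest_fraction(mixed_pixel, forest, grass)
```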
Classified Image
Unsupervised Classification
Unsupervised Classification In unsupervised classification, the spectral data imposes constraints on our interpretation How? Rather than defining training sets and carving out pieces of n-dimensional space, we define no classes beforehand and instead use statistical approaches to divide the n-dimensional space into clusters with the best separation After the fact, we assign class names to those clusters
Unsupervised Classification Clustering
Unsupervised Classification The analyst requests the computer to examine the image and extract a number of spectrally distinct clusters… [Figure: digital image and six spectrally distinct clusters, Cluster 1 to Cluster 6]
Unsupervised Classification [Figure: the next pixel to be classified is compared with the saved clusters (Cluster 1 to Cluster 6, or Unknown) to produce the output classified image]
Unsupervised Classification The result of the unsupervised classification is not yet information until… The analyst determines the ground cover for each of the clusters… [Figure: the six clusters labelled Water, Water, Conifer, Conifer, Hardwood, Hardwood]
Unsupervised Classification It is a simple process to regroup (recode) the clusters into meaningful information classes (the legend). The result is essentially the same as that of the supervised classification. [Figure: land cover map with legend Water, Conifer, Hardwood]
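Recoding amounts to a lookup table from cluster number to information class. In the sketch below, the cluster-to-class assignments and the small cluster image are hypothetical; in practice the analyst decides them by inspecting each cluster.

```python
# Hypothetical analyst-assigned labels for six clusters.
recode = {
    1: "Water", 2: "Water",
    3: "Conifer", 4: "Conifer",
    5: "Hardwood", 6: "Hardwood",
}

# Hypothetical clustered image: each value is a cluster number.
cluster_image = [[1, 2, 3],
                 [4, 5, 6],
                 [3, 3, 1]]

# Replace every cluster number with its information class.
land_cover = [[recode[cluster] for cluster in row] for row in cluster_image]
legend = sorted(set(recode.values()))
```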
Unsupervised Classification
Evaluating Signatures--Signature Ellipses
Sources of Errors
Acquisition (Geometric Aspects, Sensor Systems, Platforms, Ground Control, Scene Considerations)
Data Processing (Geometric Rectification, Radiometric Rectification, Data Conversion)
Data Analysis (Quantitative Analysis, Classification System, Data Generalization)
Data Conversion (Raster to Vector, Vector to Raster)
Error Assessment (Sampling, Error Matrix, Locational Accuracy…)
Final Product Presentation (Spatial Error, Thematic Error)
Decision Making
Implementation
Preprocessing the images with appropriate de-noising and enhancement algorithms has increased the efficiency of the classification.
Unsupervised Classification Post classification sorting - ‘labeling’ Cluster 1 Class 1 Cluster 2 Cluster 3 Class 2 Cluster 4 Cluster 5 Class 3 Cluster 6 Cluster 7 Cluster 8
Unsupervised Classification Pros Takes maximum advantage of spectral variability in an image Cons The maximally-separable clusters in spectral space may not match our perception of the important classes on the landscape
Unsupervised Classification Results from Clustering - Spectral Classes
Data Analysis Input data is digital data Image Rectification and Restoration: Geometric Correction, Radiometric Correction, Noise Removal Image Enhancement: The objective is to create “new” images from the original image data in order to increase the amount of information that can be visually interpreted from the data. Image Classification: pixelwise classification
ISODATA Procedure 1) Arbitrary cluster means are established 2) The image is classified using a minimum distance classifier 3) A new mean for each cluster is calculated 4) The image is classified again using the new cluster means 5) Another new mean for each cluster is calculated 6) The image is classified again...
ISODATA Procedure After each iteration, the algorithm calculates the percentage of pixels that remained in the same cluster between iterations When this percentage exceeds T (convergence threshold), the program stops or… If the convergence threshold is never met, the program will continue for M iterations and then stop.
ISODATA -- A Special Case of Minimum Distance Clustering “Iterative Self-Organizing Data Analysis Technique” Parameters you must enter include: N - the maximum number of clusters that you want T - a convergence threshold and M - the maximum number of iterations to be performed.
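The procedure and its three parameters can be sketched as a short loop. This is a simplified illustration under stated assumptions, not a full ISODATA implementation: it keeps a fixed number of clusters and omits the cluster splitting and merging that full ISODATA performs, and the way the initial means are chosen is arbitrary. The function name and the test pixels are hypothetical.

```python
import math

def isodata_sketch(pixels, n_clusters, threshold, max_iterations):
    """Iterate minimum-distance assignment and mean updates.

    n_clusters is N, threshold is T (fraction of pixels that kept their
    cluster), max_iterations is M. Splitting and merging are omitted.
    """
    # Arbitrary initial means: evenly spaced picks from the sorted pixels.
    ordered = sorted(pixels)
    step = max(1, len(ordered) // n_clusters)
    means = [ordered[min(i * step, len(ordered) - 1)] for i in range(n_clusters)]
    assignment = [None] * len(pixels)
    iteration = 0

    for iteration in range(1, max_iterations + 1):
        # Classify every pixel with a minimum distance classifier.
        new_assignment = [
            min(range(len(means)), key=lambda k: math.dist(p, means[k]))
            for p in pixels
        ]
        # Fraction of pixels that stayed in the same cluster.
        unchanged = sum(a == b for a, b in zip(assignment, new_assignment)) / len(pixels)
        assignment = new_assignment
        if unchanged >= threshold:
            break
        # Recompute the mean of every non-empty cluster.
        for k in range(len(means)):
            members = [p for p, a in zip(pixels, assignment) if a == k]
            if members:
                means[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, means, iteration

# Hypothetical two-band pixels forming two obvious groups.
pixels = [(10, 12), (11, 10), (12, 11), (80, 82), (81, 80), (82, 81)]
assignment, means, iterations = isodata_sketch(
    pixels, n_clusters=2, threshold=0.95, max_iterations=20
)
```

With well-separated groups the loop stops as soon as no pixel changes cluster, long before M iterations are reached.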
ISODATA Clusters
ISODATA Pros and Cons Pros: Not biased to the top pixels in the image (as sequential clustering can be) Non-parametric--data does not need to be normally distributed Very successful at finding the “true” clusters within the data if enough iterations are allowed Cluster signatures saved from ISODATA are easily incorporated and manipulated along with (supervised) spectral signatures Cons: Slowest (by far) of the clustering procedures.
Classification -- Final Thoughts Classifications are never complete -- they end when time and money run out Classification is iterative -- it’s tough to get it right the first few iterations Consider a hybrid classification -- part supervised, part unsupervised Manual Classification and/or Editing is not cheating!
Landsat ETM+ Digital color infrared Acquired: April 21, 2003 Spatial resolution: 30 meters
Landsat TM Digital color infrared Acquired: February 17, 1989 Spatial resolution: 30 meters
Landsat MSS Digital color infrared Acquired: March 14, 1975 Spatial resolution: 57 meters
Corona Panchromatic (b/w) film Acquired: March 2, 1969 Spatial Resolution: 3 meters
Examples of Classification Results
Scene Classification
Some sample patches Car Pavement Road Tree
Thanks…