Ch 4. Image Quality Assessment and Statistical Evaluation (www.Remote-Sensing.info)

Many remote sensing datasets contain high-quality, accurate data. Unfortunately, error (or noise) is sometimes introduced into the remote sensor data by:
- the environment (e.g., atmospheric scattering),
- random or systematic malfunction of the remote sensing system (e.g., an uncalibrated detector creates striping), or
- improper airborne or ground processing of the remote sensor data prior to actual data analysis (e.g., inaccurate analog-to-digital conversion).

Therefore, the person responsible for analyzing the digital remote sensor data should first assess its quality and statistical characteristics. This is normally accomplished by:
- looking at the frequency of occurrence of individual brightness values in the image, displayed in a histogram,
- viewing on a computer monitor individual pixel brightness values at specific locations or within a geographic area,
- computing univariate descriptive statistics to determine if there are unusual anomalies in the image data, and
- computing multivariate statistics to determine the amount of between-band correlation (e.g., to identify redundancy).

Remote Sensing Sampling Theory

A population is an infinite or finite set of elements. An infinite population could be all possible images that might be acquired of the Earth. All Landsat 7 ETM+ images of Charleston, S.C. in 2004 form a finite population.

A sample is a subset of the elements taken from a population and used to make inferences about certain characteristics of the population. For example, we might decide to analyze a June 1, 2004, Landsat image of Charleston. If observations with certain characteristics are systematically excluded from the sample, either deliberately or inadvertently (such as by selecting only images obtained in the spring of the year), it is a biased sample.

Sampling error is the difference between the true value of a population characteristic and the value of that characteristic inferred from a sample.
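To make the idea of a biased sample and sampling error concrete, here is a minimal sketch. The twelve monthly values are invented for illustration only, and the helper name `sampling_error` is hypothetical; the population characteristic being estimated is assumed to be the mean.

```python
from statistics import mean

# Invented population: mean scene brightness for 12 monthly images (Jan..Dec).
population = [40, 42, 55, 70, 82, 90, 94, 91, 78, 60, 47, 41]


def sampling_error(population, sample):
    """True population mean minus the mean inferred from the sample."""
    return mean(population) - mean(sample)


# Biased sample: only the spring images (March, April, May).
spring_only = population[2:5]

# Less biased sample: one image from each quarter of the year.
one_per_season = population[0::3]

print("spring only   :", sampling_error(population, spring_only))
print("one per season:", sampling_error(population, one_per_season))
```

With these invented numbers the spring-only sample misses the population mean by more than the sample that covers all four seasons.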

Large samples drawn randomly from natural populations usually produce a symmetrical frequency distribution. Most values are clustered around some central value, and the frequency of occurrence declines away from this central point. A graph of the distribution appears bell shaped and is called a normal distribution.

Many statistical tests used in the analysis of remotely sensed data assume that the brightness values recorded in a scene are normally distributed. Unfortunately, remotely sensed data may not be normally distributed, and the analyst must be careful to identify such conditions. In such instances, nonparametric statistical theory may be preferred.

Common Symmetric and Skewed Distributions in Remotely Sensed Data

The histogram is a useful graphic representation of the information content of a remotely sensed image. It is instructive to review how a histogram of a single band of imagery, k, composed of i rows and j columns with a brightness value $BV_{ijk}$ at each pixel location, is constructed.
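The construction can be sketched as a simple tally: visit every pixel in the band and count how often each brightness value occurs. The function name `band_histogram`, the tiny 3 x 3 band, and the 16-level range are assumptions made for illustration.

```python
from collections import Counter


def band_histogram(band, levels=256):
    """Return a list where entry v is the number of pixels with brightness v.

    band is a list of rows, each row a list of integer brightness values.
    Values outside 0..levels-1 are not reported.
    """
    counts = Counter(bv for row in band for bv in row)
    return [counts.get(v, 0) for v in range(levels)]


# Invented 3 x 3 single-band image (i = 3 rows, j = 3 columns).
band_k = [
    [10, 10, 12],
    [11, 10, 12],
    [12, 12, 13],
]

hist = band_histogram(band_k, levels=16)

# Crude text rendering of the histogram: one '#' per pixel.
for value, freq in enumerate(hist):
    if freq:
        print(f"BV {value:3d}: {'#' * freq} ({freq})")
```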

Histogram of A Single Band of Landsat Thematic Mapper Data of Charleston, SC

Histogram of Thermal Infrared Imagery of a Thermal Plume in the Savannah River

Cursor and Raster Display of Brightness Values

Two- and Three- Dimensional Evaluation of Pixel Brightness Values within a Geographic Area

Univariate Descriptive Image Statistics

Measures of Central Tendency in Remote Sensor Data

The mode is the value that occurs most frequently in a distribution and is usually the highest point on the curve (histogram). It is common, however, to encounter more than one mode in a remote sensing dataset. The histograms of the Landsat TM image of Charleston, SC and the predawn thermal infrared image of the Savannah River have multiple modes. They are nonsymmetrical (skewed) distributions.

The median is the value midway in the frequency distribution. One-half of the area below the distribution curve is to the right of the median, and one-half is to the left.
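A short sketch of both measures follows. The brightness values are invented, and the helper `modes` is a hypothetical name; it returns every value tied for the highest frequency, since remote sensing data often have more than one mode.

```python
from collections import Counter
from statistics import median


def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


# Invented brightness values with two equally frequent peaks.
bvs = [20, 20, 20, 35, 41, 60, 60, 60, 62]

print("modes :", modes(bvs))
print("median:", median(bvs))
```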

The mean is the arithmetic average and is defined as the sum of all brightness value observations divided by the number of observations. It is the most commonly used measure of central tendency. The mean ($\mu_k$) of a single band of imagery composed of n brightness values ($BV_{ik}$) is computed using the formula:

$$\mu_k = \frac{\sum_{i=1}^{n} BV_{ik}}{n}$$

The sample mean, $\mu_k$, is an unbiased estimate of the population mean. For symmetrical distributions, the sample mean tends to be closer to the population mean than any other unbiased estimate (such as the median or mode).

Measures of Dispersion

Measures of the dispersion about the mean of a distribution provide valuable information about the image. For example, the range of a band of imagery ($\text{range}_k$) is computed as the difference between the maximum ($\text{max}_k$) and minimum ($\text{min}_k$) values; that is,

$$\text{range}_k = \text{max}_k - \text{min}_k$$

Unfortunately, when the minimum or maximum values are extreme or unusual observations (i.e., possibly data blunders), the range could be a misleading measure of dispersion. Such extreme values are not uncommon because remote sensor data are often collected by detector systems with delicate electronics that can experience spikes in voltage and other unfortunate malfunctions. When unusual values are not encountered, the range is a very important statistic often used in image enhancement functions such as min-max contrast stretching.
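The sensitivity of the range to a single bad pixel can be sketched as follows. The data are invented, and `minmax_stretch` is an assumed, simplified linear stretch to 0..255 rather than any particular software package's implementation.

```python
def band_range(values):
    """Range of a band: maximum brightness value minus minimum."""
    return max(values) - min(values)


def minmax_stretch(values, out_max=255):
    """Linearly rescale values so the minimum maps to 0 and the maximum to out_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant band has no contrast to stretch.
        return [0 for _ in values]
    return [round((v - lo) / (hi - lo) * out_max) for v in values]


# Invented low-contrast band, then the same band with one spurious spike.
clean = [52, 55, 57, 60, 61, 63]
with_spike = clean + [255]

print("range without spike:", band_range(clean))
print("range with spike   :", band_range(with_spike))
print("stretch without spike:", minmax_stretch(clean))
print("stretch with spike   :", minmax_stretch(with_spike))
```

With the spike present, the stretch spends almost the entire output range on one bad pixel, and the genuine values stay compressed near zero.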

The variance of a sample is the average squared deviation of all possible observations from the sample mean. The variance of a band of imagery, $\text{var}_k$, is computed using the equation:

$$\text{var}_k = \frac{\sum_{i=1}^{n} (BV_{ik} - \mu_k)^2}{n}$$

The numerator of the expression is the corrected sum of squares (SS). If the sample mean ($\mu_k$) were actually the population mean, this would be an accurate measurement of the variance.

Unfortunately, there is some underestimation because the sample mean was calculated in a manner that minimized the squared deviations about it. Therefore, the denominator of the variance equation is reduced to n - 1, producing a slightly larger, unbiased estimate of the variance:

$$\text{var}_k = \frac{\sum_{i=1}^{n} (BV_{ik} - \mu_k)^2}{n - 1}$$

The standard deviation is the positive square root of the variance. The standard deviation of the pixel brightness values in a band of imagery, $s_k$, is computed as

$$s_k = \sqrt{\text{var}_k}$$
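The mean, variance, and standard deviation described above can be sketched in a few lines. The five brightness values are invented for illustration, and the `unbiased` flag is a hypothetical parameter that switches between the n and n - 1 denominators.

```python
import math


def mean(values):
    """Arithmetic mean: sum of brightness values divided by their count."""
    return sum(values) / len(values)


def variance(values, unbiased=True):
    """Corrected sum of squares divided by n - 1 (unbiased) or by n."""
    mu = mean(values)
    ss = sum((bv - mu) ** 2 for bv in values)  # corrected sum of squares (SS)
    denominator = len(values) - 1 if unbiased else len(values)
    return ss / denominator


def std_dev(values):
    """Positive square root of the unbiased variance."""
    return math.sqrt(variance(values))


# Invented brightness values for one band.
band = [10, 20, 30, 40, 50]

print("mean            :", mean(band))
print("variance (n)    :", variance(band, unbiased=False))
print("variance (n - 1):", variance(band))
print("std deviation   :", std_dev(band))
```

Dividing by n - 1 always gives the larger of the two variance values, which is the correction for underestimation described in the text.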

Hypothetical Dataset of Brightness Values (table: rows are pixels (1,1) through (1,5); columns are Band 1 (green), Band 2 (red), Band 3 (near-infrared), and Band 4 (near-infrared))

Univariate Statistics for the Hypothetical Example Dataset (Jensen, 2004) (table: rows are mean ($\mu_k$), variance ($\text{var}_k$), standard deviation ($s_k$), minimum ($\text{min}_k$), maximum ($\text{max}_k$), and range ($BV_r$); columns are Band 1 (green), Band 2 (red), Band 3 (near-infrared), and Band 4 (near-infrared))

Measures of Distribution (Histogram) Asymmetry and Peak Sharpness

Skewness is a measure of the asymmetry of a histogram. A perfectly symmetric histogram has a skewness value of zero.

A histogram may be symmetric but have a peak that is very sharp or one that is subdued when compared with a perfectly normal distribution. A perfectly normal distribution (histogram) has zero kurtosis. The greater the positive kurtosis value, the sharper the peak in the distribution when compared with a normal histogram. Conversely, a negative kurtosis value suggests that the peak in the histogram is less sharp than that of a normal distribution.
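The skewness and kurtosis formulas themselves are not reproduced in this transcript, so the sketch below assumes the common standardized-moment definitions: the average cubed standardized deviation for skewness, and the average fourth-power standardized deviation minus 3 for kurtosis, which gives a value near zero for a normal distribution. The use of the n - 1 standard deviation and all data values are also assumptions.

```python
import math


def _standardized(values):
    """Deviations from the mean divided by the (n - 1) standard deviation."""
    n = len(values)
    mu = sum(values) / n
    s = math.sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    return [(v - mu) / s for v in values]


def skewness(values):
    """Assumed definition: mean of the cubed standardized deviations."""
    z = _standardized(values)
    return sum(x ** 3 for x in z) / len(z)


def kurtosis(values):
    """Assumed definition: mean of the fourth-power standardized deviations, minus 3."""
    z = _standardized(values)
    return sum(x ** 4 for x in z) / len(z) - 3


symmetric = [1, 2, 3, 4, 5]       # invented, perfectly symmetric
right_skewed = [1, 1, 1, 2, 10]   # invented, long tail toward high values

print("skewness (symmetric)   :", skewness(symmetric))
print("skewness (right-skewed):", skewness(right_skewed))
print("kurtosis (symmetric)   :", kurtosis(symmetric))
```

The flat, symmetric sample yields a skewness of essentially zero and a negative kurtosis, while the long right tail produces a positive skewness.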

Remote Sensing Multivariate Statistics

Remote sensing research is often concerned with the measurement of how much radiant flux is reflected or emitted from an object in more than one band (e.g., in red and near-infrared bands). It is useful to compute multivariate statistical measures such as covariance and correlation among the several bands to determine how the measurements covary. Later it will be shown that variance-covariance and correlation matrices are used in remote sensing principal components analysis (PCA), feature selection, classification, and accuracy assessment.

To calculate covariance, we first compute the corrected sum of products (SP) defined by the equation:

$$SP_{kl} = \sum_{i=1}^{n} (BV_{ik} - \mu_k)(BV_{il} - \mu_l)$$

Just as simple variance was calculated by dividing the corrected sum of squares (SS) by (n - 1), covariance is calculated by dividing SP by (n - 1). Therefore, the covariance between brightness values in bands k and l, $\text{cov}_{kl}$, is equal to:

$$\text{cov}_{kl} = \frac{SP_{kl}}{n - 1}$$
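As a rough illustration of these two steps, the sketch below computes the corrected sum of products and divides it by n - 1, then fills a variance-covariance matrix for three bands. The band values and function names are invented for the example.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(band_k, band_l):
    """Corrected sum of products (SP) divided by n - 1."""
    if len(band_k) != len(band_l):
        raise ValueError("bands must contain the same number of pixels")
    mu_k, mu_l = mean(band_k), mean(band_l)
    sp = sum((a - mu_k) * (b - mu_l) for a, b in zip(band_k, band_l))
    return sp / (len(band_k) - 1)


def covariance_matrix(bands):
    """Square matrix with variances on the diagonal and covariances elsewhere."""
    return [[covariance(k, l) for l in bands] for k in bands]


# Invented brightness values for the same five pixels in three bands.
green = [10, 20, 30, 40, 50]
red = [12, 24, 33, 41, 55]
nir = [50, 40, 35, 20, 5]

for row in covariance_matrix([green, red, nir]):
    print(["%8.2f" % value for value in row])
```

The covariance of a band with itself is simply its variance, and the matrix is symmetric because the covariance of bands k and l equals that of bands l and k.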

Format of a Variance-Covariance Matrix

          Band 1 (green)   Band 2 (red)   Band 3 (near-infrared)   Band 4 (near-infrared)
Band 1    SS_1             cov_1,2        cov_1,3                  cov_1,4
Band 2    cov_2,1          SS_2           cov_2,3                  cov_2,4
Band 3    cov_3,1          cov_3,2        SS_3                     cov_3,4
Band 4    cov_4,1          cov_4,2        cov_4,3                  SS_4

Variance-Covariance Matrix of the Sample Data (Jensen, 2004) (table: rows are Bands 1 to 4; columns are Band 1 (green), Band 2 (red), Band 3 (near-infrared), and Band 4 (near-infrared))

Correlation between Multiple Bands of Remotely Sensed Data

To estimate the degree of interrelation between variables in a manner not influenced by measurement units, the correlation coefficient, r, is commonly used. The correlation between two bands of remotely sensed data, $r_{kl}$, is the ratio of their covariance ($\text{cov}_{kl}$) to the product of their standard deviations ($s_k s_l$); thus:

$$r_{kl} = \frac{\text{cov}_{kl}}{s_k s_l}$$

If we square the correlation coefficient ($r_{kl}$), we obtain the sample coefficient of determination ($r^2$), which expresses the proportion of the total variation in the values of "band l" that can be accounted for or explained by a linear relationship with the values of the random variable "band k." Thus a correlation coefficient ($r_{kl}$) of 0.70 results in an $r^2$ value of 0.49, meaning that 49% of the total variation of the values of "band l" in the sample is accounted for by a linear relationship with values of "band k."
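The correlation coefficient and the coefficient of determination can be sketched directly from the definitions above. The two bands are invented values, and the function names are chosen only for this example.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(band_k, band_l):
    """Corrected sum of products divided by n - 1."""
    mu_k, mu_l = mean(band_k), mean(band_l)
    sp = sum((a - mu_k) * (b - mu_l) for a, b in zip(band_k, band_l))
    return sp / (len(band_k) - 1)


def std_dev(band):
    """Square root of the variance, i.e. of the covariance of a band with itself."""
    return math.sqrt(covariance(band, band))


def correlation(band_k, band_l):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(band_k, band_l) / (std_dev(band_k) * std_dev(band_l))


# Invented brightness values for the same five pixels in two bands.
band_k = [10, 20, 30, 40, 50]
band_l = [12, 24, 33, 41, 55]

r = correlation(band_k, band_l)
print("r   =", round(r, 4))
print("r^2 =", round(r ** 2, 4))
```

Because the units cancel in the ratio, r always lies between -1 and +1 regardless of the brightness value scale.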

Correlation Matrix of the Sample Data (table: rows are Bands 1 to 4; columns are Band 1 (green), Band 2 (red), Band 3 (near-infrared), and Band 4 (near-infrared))

Univariate and Multivariate Statistics of Landsat TM Data of Charleston, SC (table: minimum, maximum, mean, and standard deviation for each band, followed by the between-band covariance matrix and correlation matrix)

Feature Space Plots

The univariate and multivariate statistics discussed provide accurate, fundamental information about the individual band statistics, including how the bands covary and correlate. Sometimes, however, it is useful to examine statistical relationships graphically. Individual bands of remotely sensed data are often referred to as features in the pattern recognition literature. To truly appreciate how two bands (features) in a remote sensing dataset covary and whether they are correlated, it is often useful to produce a two-band feature space plot.

A two-dimensional feature space plot extracts the brightness value for every pixel in the scene in two bands and plots the frequency of occurrence in a 256 by 256 feature space (assuming 8-bit data, with brightness values from 0 to 255). The greater the frequency of occurrence of unique pairs of values, the brighter the feature space pixel.
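A feature space plot is essentially a two-dimensional histogram, which the following sketch builds as a grid of counts with one cell per possible pair of brightness values. The function name, the flattened pixel lists, and the reduced 4-level range used for printing are illustrative assumptions.

```python
def feature_space(band_a, band_b, levels=256):
    """Return a levels x levels grid of counts.

    grid[a][b] is the number of pixels whose value is a in band_a and b in band_b.
    Both bands are flat lists of integers covering the same pixels in the same order.
    """
    if len(band_a) != len(band_b):
        raise ValueError("bands must contain the same number of pixels")
    grid = [[0] * levels for _ in range(levels)]
    for a, b in zip(band_a, band_b):
        grid[a][b] += 1
    return grid


# Invented 2-bit example (values 0..3) so the whole grid can be printed.
band_3 = [0, 1, 1, 2, 3, 3, 1]
band_4 = [0, 2, 2, 2, 3, 3, 1]

grid = feature_space(band_3, band_4, levels=4)
for row in grid:
    print(row)
```

In a displayed plot, each count would be mapped to a gray level so that frequently occurring value pairs appear brighter.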

Two-dimensional Feature Space Plot of Landsat Thematic Mapper Band 3 and 4 Data of Charleston, SC obtained on November 11, 1982