Published by Bryan McLaughlin. Modified over 9 years ago.
1
Ch 4. Image Quality Assessment and Statistical Evaluation
www.Remote-Sensing.info
2
Image Quality Assessment and Statistical Evaluation
Many remote sensing datasets contain high-quality, accurate data. Unfortunately, error (or noise) is sometimes introduced into the remote sensor data by: the environment (e.g., atmospheric scattering), random or systematic malfunction of the remote sensing system (e.g., an uncalibrated detector creates striping), or improper airborne or ground processing of the remote sensor data prior to actual data analysis (e.g., inaccurate analog-to-digital conversion).
3
Image Quality Assessment and Statistical Evaluation
Therefore, the person responsible for analyzing the digital remote sensor data should first assess its quality and statistical characteristics. This is normally accomplished by: looking at the frequency of occurrence of individual brightness values in the image, displayed as a histogram; viewing individual pixel brightness values at specific locations or within a geographic area on a computer monitor; computing univariate descriptive statistics to determine if there are unusual anomalies in the image data; and computing multivariate statistics to determine the amount of between-band correlation (e.g., to identify redundancy).
4
Remote Sensing Sampling Theory
A population is an infinite or finite set of elements. An infinite population could be all possible images that might be acquired of the Earth in 2004. The set of all Landsat 7 ETM+ images of Charleston, S.C., acquired in 2004 is a finite population.
A sample is a subset of the elements taken from a population and used to make inferences about certain characteristics of the population. For example, we might decide to analyze a June 1, 2004, Landsat image of Charleston. If observations with certain characteristics are systematically excluded from the sample, either deliberately or inadvertently (such as selecting only images obtained in the spring of the year), it is a biased sample. Sampling error is the difference between the true value of a population characteristic and the value of that characteristic inferred from a sample.
5
Remote Sensing Sampling Theory
Large samples drawn randomly from natural populations usually produce a symmetrical frequency distribution. Most values are clustered around some central value, and the frequency of occurrence declines away from this central point. A graph of the distribution appears bell shaped and is called a normal distribution.
Many statistical tests used in the analysis of remotely sensed data assume that the brightness values recorded in a scene are normally distributed. Unfortunately, remotely sensed data may not be normally distributed and the analyst must be careful to identify such conditions. In such instances, nonparametric statistical theory may be preferred.
6
Common Symmetric and Skewed Distributions in Remotely Sensed Data
7
Remote Sensing Sampling Theory
The histogram is a useful graphic representation of the information content of a remotely sensed image. It is instructive to review how a histogram of a single band of imagery, $k$, composed of $i$ rows and $j$ columns with a brightness value $BV_{ijk}$ at each pixel location is constructed.
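The construction of a single-band histogram can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure of any particular image-processing package: the function name `band_histogram` and the tiny 3 x 4 band are hypothetical, and 8-bit data (256 possible brightness values) is assumed.

```python
def band_histogram(band, levels=256):
    """Count how often each brightness value occurs in one band.

    `band` is a list of rows, each row a list of integer brightness values.
    `levels` is the number of possible values (assumed 256 for 8-bit data).
    """
    counts = [0] * levels
    for row in band:            # loop over the i rows
        for bv in row:          # loop over the j columns
            counts[bv] += 1     # tally the brightness value at this pixel
    return counts


# Hypothetical 3 x 4 band of 8-bit brightness values (invented for illustration).
band_k = [
    [10, 10, 12, 200],
    [10, 12, 12, 200],
    [11, 12, 13, 255],
]

hist = band_histogram(band_k)

# Print only the brightness values that actually occur.
for bv, freq in enumerate(hist):
    if freq:
        print(bv, freq)
```

Plotting `hist` with brightness value on the x-axis and frequency on the y-axis gives the histogram of the band.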
8
Histogram of A Single Band of Landsat Thematic Mapper Data of Charleston, SC
9
Histogram of Thermal Infrared Imagery of a Thermal Plume in the Savannah River
10
Cursor and Raster Display of Brightness Values
11
Two- and Three-Dimensional Evaluation of Pixel Brightness Values within a Geographic Area
12
Univariate Descriptive Image Statistics
Measures of Central Tendency in Remote Sensor Data
The mode is the value that occurs most frequently in a distribution and is usually the highest point on the curve (histogram). It is common, however, to encounter more than one mode in a remote sensing dataset. The histograms of the Landsat TM image of Charleston, SC and the predawn thermal infrared image of the Savannah River have multiple modes. They are nonsymmetrical (skewed) distributions.
The median is the value midway in the frequency distribution. One-half of the area below the distribution curve is to the right of the median, and one-half is to the left.
13
Univariate Descriptive Image Statistics
The mean is the arithmetic average and is defined as the sum of all brightness value observations divided by the number of observations. It is the most commonly used measure of central tendency. The mean ($\mu_k$) of a single band of imagery composed of $n$ brightness values ($BV_{ik}$) is computed using the formula:
$$\mu_k = \frac{\sum_{i=1}^{n} BV_{ik}}{n}$$
The sample mean, $\mu_k$, is an unbiased estimate of the population mean. For symmetrical distributions, the sample mean tends to be closer to the population mean than any other unbiased estimate (such as the median or mode).
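The three measures of central tendency (mode, median, and mean) can be compared on a small example. The sketch below uses Python's standard `statistics` module; the twelve brightness values are invented for illustration and are not taken from any of the images discussed here.

```python
from statistics import mean, median, multimode

# Hypothetical brightness values for one band (invented for illustration).
# A few very bright pixels (200, 255) make the distribution skewed.
bv = [10, 10, 12, 200, 10, 12, 12, 200, 11, 12, 13, 255]

band_mode = multimode(bv)   # list of all most frequently occurring values
band_median = median(bv)    # value midway in the sorted observations
band_mean = mean(bv)        # sum of observations divided by n

print("mode(s):", band_mode)
print("median :", band_median)
print("mean   :", round(band_mean, 2))
```

In this skewed example the few bright pixels pull the mean far above the median and the mode, which is why all three measures are worth inspecting before analysis.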
14
Remote Sensing Univariate Statistics - Variance
Measures of Dispersion
Measures of the dispersion about the mean of a distribution provide valuable information about the image. For example, the range of a band of imagery ($range_k$) is computed as the difference between the maximum ($max_k$) and minimum ($min_k$) values; that is,
$$range_k = max_k - min_k$$
Unfortunately, when the minimum or maximum values are extreme or unusual observations (i.e., possibly data blunders), the range could be a misleading measure of dispersion. Such extreme values are not uncommon because the remote sensor data are often collected by detector systems with delicate electronics that can experience spikes in voltage and other unfortunate malfunctions. When unusual values are not encountered, the range is a very important statistic often used in image enhancement functions such as min–max contrast stretching.
15
Remote Sensing Univariate Statistics - Variance
Measures of Dispersion
The variance of a sample is the average squared deviation of all possible observations from the sample mean. The variance of a band of imagery, $var_k$, is computed using the equation:
$$var_k = \frac{\sum_{i=1}^{n} (BV_{ik} - \mu_k)^2}{n}$$
The numerator of the expression is the corrected sum of squares (SS). If the sample mean ($\mu_k$) were actually the population mean, this would be an accurate measurement of the variance.
16
Remote Sensing Univariate Statistics
Unfortunately, there is some underestimation because the sample mean was calculated in a manner that minimized the squared deviations about it. Therefore, the denominator of the variance equation is reduced to $n - 1$, producing a larger, unbiased estimate of the sample variance:
$$var_k = \frac{\sum_{i=1}^{n} (BV_{ik} - \mu_k)^2}{n - 1}$$
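The effect of the two denominators can be checked with a short calculation. The sketch below uses the five Band 1 (green) values from the hypothetical dataset tabulated later in this chapter; the variable names are illustrative only, and the standard-library functions `pvariance` (denominator n) and `variance` (denominator n - 1) serve as a cross-check.

```python
from statistics import pvariance, variance

# Band 1 (green) values of the five-pixel hypothetical dataset.
band1 = [130, 165, 100, 135, 145]

n = len(band1)
mu = sum(band1) / n                        # sample mean
ss = sum((bv - mu) ** 2 for bv in band1)   # corrected sum of squares (SS)

var_n = ss / n                 # denominator n (underestimates)
var_n_minus_1 = ss / (n - 1)   # denominator n - 1 (unbiased estimate)

print("SS            :", ss)
print("SS / n        :", var_n)
print("SS / (n - 1)  :", var_n_minus_1)

# Cross-check against the standard library.
print(pvariance(band1), variance(band1))
```

With only five observations the difference is large (450 versus 562.5); for a full scene with millions of pixels the two estimates are nearly identical.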
17
Remote Sensing Univariate Statistics
The standard deviation is the positive square root of the variance. The standard deviation of the pixel brightness values in a band of imagery, $s_k$, is computed as
$$s_k = \sqrt{var_k}$$
19
Hypothetical Dataset of Brightness Values

Pixel    Band 1 (green)    Band 2 (red)    Band 3 (near-infrared)    Band 4 (near-infrared)
(1,1)    130               57              180                       205
(1,2)    165               35              215                       255
(1,3)    100               25              135                       195
(1,4)    135               50              200                       220
(1,5)    145               65              205                       235
20
Univariate Statistics for the Hypothetical Example Dataset (Jensen, 2004)

Statistic                       Band 1 (green)    Band 2 (red)    Band 3 (near-infrared)    Band 4 (near-infrared)
Mean ($\mu_k$)                  135               46.40           187                       222
Variance ($var_k$)              562.50            264.80          1007.50                   570
Standard deviation ($s_k$)      23.71             16.27           31.74                     23.87
Minimum ($min_k$)               100               25              135                       195
Maximum ($max_k$)               165               65              215                       255
Range ($BV_r$)                  65                40              80                        60
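The univariate statistics for the hypothetical dataset can be recomputed directly from the five pixels in each band. The following is a minimal sketch using Python's standard `statistics` module; the helper name `univariate_stats` and the dictionary layout are illustrative choices, not part of the original material.

```python
from statistics import mean, stdev, variance

# Brightness values of the five hypothetical pixels, one list per band.
bands = {
    "Band 1 (green)": [130, 165, 100, 135, 145],
    "Band 2 (red)": [57, 35, 25, 50, 65],
    "Band 3 (near-infrared)": [180, 215, 135, 200, 205],
    "Band 4 (near-infrared)": [205, 255, 195, 220, 235],
}


def univariate_stats(values):
    """Return the descriptive statistics discussed in this chapter."""
    return {
        "mean": mean(values),
        "variance": variance(values),   # n - 1 denominator
        "std_dev": stdev(values),       # positive square root of the variance
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


stats = {name: univariate_stats(values) for name, values in bands.items()}

for name, s in stats.items():
    print(name, {key: round(value, 2) for key, value in s.items()})
```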
21
Measures of Distribution (Histogram) Asymmetry and Peak Sharpness
Skewness is a measure of the asymmetry of a histogram and is computed using the formula:
$$skewness_k = \frac{\sum_{i=1}^{n} \left( \frac{BV_{ik} - \mu_k}{s_k} \right)^3}{n}$$
A perfectly symmetric histogram has a skewness value of zero.
22
Measures of Distribution (Histogram) Asymmetry and Peak Sharpness
A histogram may be symmetric but have a peak that is very sharp or one that is subdued when compared with a perfectly normal distribution. A perfectly normal distribution (histogram) has zero kurtosis. The greater the positive kurtosis value, the sharper the peak in the distribution when compared with a normal histogram. Conversely, a negative kurtosis value suggests that the peak in the histogram is less sharp than that of a normal distribution.
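Skewness and kurtosis can be illustrated with a short sketch. The formulas used here are an assumption, since the slide equations did not survive extraction: skewness is taken as the mean of the cubed standardized deviations, and kurtosis as the mean of the fourth-power standardized deviations minus 3 (so that a normal distribution scores about zero). The standard deviation uses the n - 1 denominator. Other packages use slightly different definitions, so values may not match them exactly. The function name and both test datasets are hypothetical.

```python
from math import sqrt


def skewness_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis (assumed definitions)."""
    n = len(values)
    mu = sum(values) / n
    # Assumption: sample standard deviation with the n - 1 denominator.
    s = sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    z = [(v - mu) / s for v in values]          # standardized deviations
    skew = sum(zi ** 3 for zi in z) / n         # zero for a symmetric histogram
    kurt = sum(zi ** 4 for zi in z) / n - 3     # about zero for a normal histogram
    return skew, kurt


# Hypothetical brightness values (invented for illustration).
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

skew_sym, kurt_sym = skewness_and_kurtosis(symmetric)
skew_right, kurt_right = skewness_and_kurtosis(right_skewed)

print("symmetric   :", round(skew_sym, 3), round(kurt_sym, 3))
print("right-skewed:", round(skew_right, 3), round(kurt_right, 3))
```

The symmetric sample yields a skewness of zero and a negative kurtosis (a flat, subdued peak), while the sample with one very bright value yields a positive skewness.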
23
Remote Sensing Multivariate Statistics
Remote sensing research is often concerned with the measurement of how much radiant flux is reflected or emitted from an object in more than one band (e.g., in red and near-infrared bands). It is useful to compute multivariate statistical measures such as covariance and correlation among the several bands to determine how the measurements covary. Later it will be shown that variance–covariance and correlation matrices are used in remote sensing principal components analysis (PCA), feature selection, classification, and accuracy assessment.
24
Remote Sensing Multivariate Statistics
To calculate covariance, we first compute the corrected sum of products (SP) defined by the equation:
$$SP_{kl} = \sum_{i=1}^{n} (BV_{ik} - \mu_k)(BV_{il} - \mu_l)$$
25
Remote Sensing Multivariate Statistics
Just as simple variance was calculated by dividing the corrected sums of squares (SS) by ($n - 1$), covariance is calculated by dividing SP by ($n - 1$). Therefore, the covariance between brightness values in bands $k$ and $l$, $cov_{kl}$, is equal to:
$$cov_{kl} = \frac{SP_{kl}}{n - 1}$$
26
Format of a Variance-Covariance Matrix

          Band 1 (green)    Band 2 (red)    Band 3 (near-infrared)    Band 4 (near-infrared)
Band 1    SS_1              cov_1,2         cov_1,3                   cov_1,4
Band 2    cov_2,1           SS_2            cov_2,3                   cov_2,4
Band 3    cov_3,1           cov_3,2         SS_3                      cov_3,4
Band 4    cov_4,1           cov_4,2         cov_4,3                   SS_4
27
Variance-Covariance Matrix of the Sample Data (Jensen, 2004)

          Band 1 (green)    Band 2 (red)    Band 3 (near-infrared)    Band 4 (near-infrared)
Band 1    562.50            -               -                         -
Band 2    135               264.80          -                         -
Band 3    718.75            275.25          1007.50                   -
Band 4    537.50            64              663.75                    570
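The variance-covariance matrix of the five-pixel hypothetical dataset can be recomputed from the corrected sum of products. The sketch below is a plain-Python illustration; the function name `covariance` and the list-of-lists layout are assumptions made for the example.

```python
# Brightness values of the five hypothetical pixels, one list per band.
bands = [
    [130, 165, 100, 135, 145],   # Band 1 (green)
    [57, 35, 25, 50, 65],        # Band 2 (red)
    [180, 215, 135, 200, 205],   # Band 3 (near-infrared)
    [205, 255, 195, 220, 235],   # Band 4 (near-infrared)
]


def covariance(band_k, band_l):
    """Covariance of two bands: corrected sum of products divided by n - 1."""
    n = len(band_k)
    mu_k = sum(band_k) / n
    mu_l = sum(band_l) / n
    # Corrected sum of products (SP) of the deviations from each band mean.
    sp = sum((a - mu_k) * (b - mu_l) for a, b in zip(band_k, band_l))
    return sp / (n - 1)


# Full symmetric matrix; the diagonal holds each band's variance.
cov = [[covariance(k, l) for l in bands] for k in bands]

for row in cov:
    print([round(c, 2) for c in row])
```

Because the covariance of a band with itself is its variance, the diagonal of the printed matrix repeats the variances from the univariate statistics.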
28
Correlation between Multiple Bands of Remotely Sensed Data
To estimate the degree of interrelation between variables in a manner not influenced by measurement units, the correlation coefficient, $r$, is commonly used. The correlation between two bands of remotely sensed data, $r_{kl}$, is the ratio of their covariance ($cov_{kl}$) to the product of their standard deviations ($s_k s_l$); thus:
$$r_{kl} = \frac{cov_{kl}}{s_k s_l}$$
29
Correlation between Multiple Bands of Remotely Sensed Data
If we square the correlation coefficient ($r_{kl}$), we obtain the sample coefficient of determination ($r^2$), which expresses the proportion of the total variation in the values of “band l” that can be accounted for or explained by a linear relationship with the values of the random variable “band k.” Thus a correlation coefficient ($r_{kl}$) of 0.70 results in an $r^2$ value of 0.49, meaning that 49% of the total variation of the values of “band l” in the sample is accounted for by a linear relationship with values of “band k.”
30
Correlation Matrix of the Sample Data

          Band 1 (green)    Band 2 (red)    Band 3 (near-infrared)    Band 4 (near-infrared)
Band 1    -                 -               -                         -
Band 2    0.35              -               -                         -
Band 3    0.95              0.53            -                         -
Band 4    0.94              0.16            0.87                      -
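The correlation coefficients and the corresponding coefficients of determination for the five-pixel hypothetical dataset can be recomputed as the ratio of the covariance to the product of the standard deviations. This is a minimal plain-Python sketch; the function name `correlation` is an illustrative choice. The computed values should agree with the tabulated coefficients to within about 0.01, since the table appears to truncate rather than round.

```python
from math import sqrt

# Brightness values of the five hypothetical pixels, one list per band.
bands = [
    [130, 165, 100, 135, 145],   # Band 1 (green)
    [57, 35, 25, 50, 65],        # Band 2 (red)
    [180, 215, 135, 200, 205],   # Band 3 (near-infrared)
    [205, 255, 195, 220, 235],   # Band 4 (near-infrared)
]


def correlation(band_k, band_l):
    """Correlation coefficient: covariance over the product of std deviations."""
    n = len(band_k)
    mu_k = sum(band_k) / n
    mu_l = sum(band_l) / n
    cov_kl = sum((a - mu_k) * (b - mu_l) for a, b in zip(band_k, band_l)) / (n - 1)
    s_k = sqrt(sum((a - mu_k) ** 2 for a in band_k) / (n - 1))
    s_l = sqrt(sum((b - mu_l) ** 2 for b in band_l) / (n - 1))
    return cov_kl / (s_k * s_l)


r = [[correlation(k, l) for l in bands] for k in bands]

# Print the lower triangle with r and r squared for each band pair.
for k in range(len(bands)):
    for l in range(k):
        print(f"Band {k + 1} vs Band {l + 1}: "
              f"r = {r[k][l]:.4f}, r^2 = {r[k][l] ** 2:.4f}")
```

Squaring the Band 3 versus Band 1 coefficient, for example, gives an r squared of roughly 0.91, meaning that about 91% of the variation in one band is accounted for by a linear relationship with the other in this small sample.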
31
Univariate and Multivariate Statistics of Landsat TM Data of Charleston, SC

Band    Min    Max    Mean          Standard Deviation
1       51     242    65.163137     10.231356
2       17     115    25.797593     5.956048
3       14     131    23.958016     8.469890
4       5      105    26.550666     15.690054
5       0      193    32.014001     24.296417
6       0      128    15.103553     12.738188
7       102    124    110.734372    4.305065

Covariance Matrix
Band    Band 1        Band 2       Band 3        Band 4        Band 5        Band 6        Band 7
1       104.680654    58.797907    82.602381     69.603136     142.947000    94.488082     24.464596
2       58.797907     35.474507    48.644220     45.539546     90.661412     57.877406     14.812886
3       82.602381     48.644220    71.739034     76.954037     149.566052    91.234270     23.827418
4       69.603136     45.539546    76.954037     246.177785    342.523400    157.655947    46.815767
5       142.947000    90.661412    149.566052    342.523400    590.315858    294.019002    82.994241
6       94.488082     57.877406    91.234270     157.655947    294.019002    162.261439    44.674247
7       24.464596     14.812886    23.827418     46.815767     82.994241     44.674247     18.533586

Correlation Matrix
Band    Band 1      Band 2      Band 3      Band 4      Band 5      Band 6      Band 7
1       1.000000    0.964874    0.953195    0.433582    0.575042    0.724997    0.555425
2       0.964874    1.000000    0.964263    0.487311    0.626501    0.762857    0.577699
3       0.953195    0.964263    1.000000    0.579068    0.726797    0.845615    0.653461
4       0.433582    0.487311    0.579068    1.000000    0.898511    0.788821    0.693087
5       0.575042    0.626501    0.726797    0.898511    1.000000    0.950004    0.793462
6       0.724997    0.762857    0.845615    0.788821    0.950004    1.000000    0.814648
7       0.555425    0.577699    0.653461    0.693087    0.793462    0.814648    1.000000
32
Feature Space Plots
The univariate and multivariate statistics discussed provide accurate, fundamental information about the individual band statistics, including how the bands covary and correlate. Sometimes, however, it is useful to examine statistical relationships graphically.
Individual bands of remotely sensed data are often referred to as features in the pattern recognition literature. To truly appreciate how two bands (features) in a remote sensing dataset covary and whether they are correlated, it is often useful to produce a two-band feature space plot.
33
Feature Space Plots
A two-dimensional feature space plot extracts the brightness value for every pixel in the scene in two bands and plots the frequency of occurrence in a 256 by 256 feature space (assuming 8-bit data, with brightness values from 0 to 255). The greater the frequency of occurrence of unique pairs of values, the brighter the feature space pixel.
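The frequency table behind a two-band feature space plot can be sketched as a two-dimensional histogram. The example below is a minimal plain-Python illustration that assumes 8-bit data, so the table has 256 by 256 cells (one per possible pair of values from 0 to 255). The function name `feature_space` and the six pixel pairs are hypothetical.

```python
def feature_space(band_x, band_y, levels=256):
    """Build a 2-D frequency table of brightness-value pairs.

    counts[y][x] is the number of pixels whose value is x in the first band
    and y in the second band. `levels` is assumed to be 256 for 8-bit data.
    """
    counts = [[0] * levels for _ in range(levels)]
    for bv_x, bv_y in zip(band_x, band_y):
        counts[bv_y][bv_x] += 1
    return counts


# Hypothetical flattened pixel values for two co-registered 8-bit bands.
band3 = [20, 20, 21, 40, 40, 40]
band4 = [90, 90, 95, 15, 15, 15]

fs = feature_space(band3, band4)

# The most frequent pair would be drawn as the brightest feature space pixel.
peak = max(max(row) for row in fs)
print(fs[90][20], fs[15][40], peak)
```

Displaying `fs` as an image, with each cell's brightness scaled by its count, gives the feature space plot: clusters of bright cells reveal groups of pixels with similar values in both bands.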
34
Two-dimensional Feature Space Plot of Landsat Thematic Mapper Band 3 and 4 Data of Charleston, SC obtained on November 11, 1982