
1 Multiple Criteria for Evaluating Land Cover Classification Algorithms
Summary of a paper by R.S. DeFries and Jonathan Cheung-Wai Chan
Remote Sensing of Environment, April 2000

2 Premise of the paper
 Land cover monitoring using remotely sensed satellite data is becoming more common.
 Larger volumes of higher-quality data are becoming more readily available.
 For land cover classification analysis to be operational, more automated procedures will be required.
 No single machine learning algorithm has been shown to be superior in all situations.
 The paper therefore proposes criteria for assessing algorithms for supervised land cover classification.

3 Supervised classification
 Training stage: define useful land cover categories with spectral response patterns from training data of known cover.
 Classification stage: assign image pixels to land cover categories based on the match with the defined spectral attributes.
 Output stage: deliver the categorized data set as maps, tables, or GIS data files.
The three stages are sketched in code below.
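A minimal sketch of the three stages, assuming scikit-learn (the paper's C5.0 software is not part of scikit-learn, so an ordinary CART decision tree stands in) and purely illustrative pixel values:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Training stage: spectral values (one row per pixel, one column per band)
# and known cover labels. Shapes and values here are illustrative only.
X_train = np.array([[0.12, 0.40, 0.55],   # forest pixel
                    [0.30, 0.25, 0.10],   # bare soil pixel
                    [0.05, 0.08, 0.02]])  # water pixel
y_train = np.array(["forest", "soil", "water"])

clf = DecisionTreeClassifier().fit(X_train, y_train)

# Classification stage: assign every image pixel to a category.
image = np.random.rand(100, 100, 3)         # hypothetical 3-band scene
labels = clf.predict(image.reshape(-1, 3))  # flatten to (pixels, bands)

# Output stage: reshape back into a categorized map for export.
label_map = labels.reshape(100, 100)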

4 Example: Supervised classification
[Figure: training data (left) and the resulting classified image (right)]

5 Objectives of this study
Compare three machine learning algorithms for supervised land cover classification, based on four criteria, using two different data sets.

6 Data sets
 8 km AVHRR data (Advanced Very High Resolution Radiometer, from NOAA)
 30 m Landsat Thematic Mapper scene (from the Pucallpa, Peru area)
Note: Reliable land cover classifications had previously been derived for both data sets based on expert knowledge (used in place of ground measurements).

7 The 1984 AVHRR data included 6 channels at 8 km resolution.
[Figure: the 6 channels grouped by type, including one visible channel]

8 The 1996 Landsat TM scene included 5 bands at 30 m resolution.
Approximately 9000 pixels can be overlaid on the 8 km AVHRR data.

9 8 km AVHRR data
To train the classifiers
 Overlaid Landsat scenes on the AVHRR data.
 Each pixel was labeled as a cover type based on interpretation of the Landsat scene.
To test the classification results
 Obtained a random sample of 10,000 pixels from the final classification results of a previous study (the authors believe this test data has a high degree of confidence).

10 30 m Landsat Thematic Mapper scene
To train the classifiers
 Data were selected by sampling the results of a previous study (5958 pixels).
To test the classification results
 Data were randomly selected for an additional 12,084 pixels (although not independently derived, they were used to illustrate the evaluation criteria).

11 The three algorithms compared
1. C5.0 decision tree (standard)
2. Decision tree with "Bagging"
3. Decision tree with "Boosting"
Note: Bagging and boosting (2 and 3) are refinements of (1) that build multiple iterations of classifiers. They can be applied to any supervised classification algorithm.

12 What is a decision tree?
 A machine learning technique (algorithm) that analyzes data, recognizes patterns, and makes predictions through repeated learning instances.
 Useful when it is important for humans to understand the classification structure.
 Successfully applied to satellite data for extraction of land cover categories.

13 1. C5.0 decision tree
 Predicts classes by repeatedly partitioning a data set into more homogeneous subsets.
 Variables are used to split subsets into further subsets.
 The most important component is the method used to estimate the splits at each "node" of the tree (see the sketch below).
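A minimal sketch of the recursive partitioning, with scikit-learn's CART-style tree standing in for C5.0 (criterion="entropy" approximates C5.0's information-based splits); the synthetic data and band names are assumptions:

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic "spectral" data: 200 pixels, 4 bands, 3 cover classes.
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3, random_state=0)

# A shallow tree, fit with an information-based split criterion.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                              random_state=0).fit(X, y)

# Each printed node shows the variable and threshold chosen to split
# the data into more homogeneous subsets.
print(export_text(tree, feature_names=["band1", "band2", "band3", "band4"]))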

14 2. C5.0 decision tree with "Bagging"
 Multiple bootstrap samples are drawn (with replacement) from the training data, and a decision tree is generated for each sample.
 A final classification result is obtained by plurality vote of the individual classifiers (sketched below).
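A minimal bagging sketch using scikit-learn's BaggingClassifier, whose default base estimator is a decision tree; the synthetic data and the choice of 10 trees are assumptions, not values from the paper:

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=200, n_features=4, n_classes=3,
                           n_informative=3, n_redundant=0, random_state=0)

# Each of the 10 trees is trained on a bootstrap sample drawn with
# replacement from (X, y).
bagger = BaggingClassifier(n_estimators=10, random_state=0).fit(X, y)

# predict() returns the plurality vote across the individual trees.
print(bagger.predict(X[:5]))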

15 3. C5.0 decision tree with "Boosting"
 The entire training set is used to generate the decision tree, with a weight assigned to each training observation.
 Subsequent decision tree iterations focus on the misclassified observations.
 A final classification result is obtained by plurality vote of the individual classifiers (sketched below).
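A minimal boosting sketch using scikit-learn's AdaBoostClassifier as a stand-in for the boosting built into C5.0; note that AdaBoost combines its trees by a weighted rather than a plain plurality vote, and the data and iteration count are assumptions:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=200, n_features=4, n_classes=3,
                           n_informative=3, n_redundant=0, random_state=0)

# Each round reweights the training observations so the next tree
# concentrates on the pixels the previous trees misclassified.
booster = AdaBoostClassifier(n_estimators=10, random_state=0).fit(X, y)

# The final prediction is a weighted vote across the boosted trees.
print(booster.predict(X[:5]))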

16 One of the most important criteria in selecting an appropriate algorithm
 The degree of human interpretation and involvement in the classification process.
 Example: supervised classification (needs time-intensive collection of training data) vs. unsupervised classification (no training data).

17 As a result
There are always trade-offs among accuracy, computational speed, and the ability to automate the classification process.

18 Four assessment criteria were evaluated in the study
 Classification accuracy: overall, mean class, and adjusted (accounts for unequal costs of misclassification, which will vary by application); all three are sketched after this list.
 Computational resources required.
 Stability of the algorithm with respect to minor variability in the input data.
 Robustness to noise in the training data (both random noise in the inputs and mislabeling of cover type in the training data).
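The slides do not reproduce the paper's exact cost weighting, so the following is only an illustrative sketch of the three accuracy variants, with a made-up confusion matrix and cost matrix:

import numpy as np

# Hypothetical confusion matrix (rows = true class, cols = predicted).
conf = np.array([[50,  5,  0],
                 [ 4, 60,  6],
                 [ 1,  2, 72]])

# Overall accuracy: fraction of pixels on the diagonal.
overall = np.trace(conf) / conf.sum()

# Mean class accuracy: average of the per-class accuracies.
per_class = np.diag(conf) / conf.sum(axis=1)
mean_class = per_class.mean()

# Adjusted accuracy (illustrative only): weight each error by a made-up
# cost, e.g. confusing class 0 with class 2 is worse than with class 1.
cost = np.array([[0.0, 0.5, 1.0],
                 [0.5, 0.0, 0.5],
                 [1.0, 0.5, 0.0]])
adjusted = 1.0 - (conf * cost).sum() / (conf.sum() * cost.max())

print(overall, mean_class, adjusted)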

19 Summary: Results
 Accuracy is comparable among the three algorithms on both data sets.
 The bagging and boosting algorithms are more stable and more robust to noise in the training data.
 The bagging algorithm is the most costly, and the standard decision tree the least costly, in terms of computational resources.

20 The End
Thank you for listening.

21 Accuracy
 Accuracy is one of the primary criteria for comparing algorithms in the literature.
 Accuracy = % of pixels correctly classified in the test set (computed as in the sketch below).
 In this study, all three algorithms provide fairly similar accuracies (generally within 5%).
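A minimal sketch of this accuracy measure on hypothetical labels:

import numpy as np

# Hypothetical true and predicted classes for a small test set.
y_true = np.array([0, 1, 1, 2, 0, 2, 1, 0])
y_pred = np.array([0, 1, 2, 2, 0, 2, 1, 1])

# Accuracy = % of test pixels whose predicted class matches the truth.
accuracy = 100.0 * np.mean(y_true == y_pred)
print(f"{accuracy:.1f}% correctly classified")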

22 Computational resources
 Likely to be a key consideration in machine learning, where "amount of work done" is used as a measure of the operations performed.
 Standard tree: requires the fewest resources.
 Bagging: the number of operations increases in proportion to the number of samples used.
 Boosting: the number of operations increases in proportion to the number of iterations used.
A rough timing comparison is sketched below.
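One rough way to see these proportional costs, using wall-clock training time as a crude proxy for operations performed; the scikit-learn stand-ins, data sizes, and estimator counts are assumptions:

import time
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=6, n_classes=3,
                           n_informative=4, n_redundant=0, random_state=0)

# Training time should grow roughly with the number of trees built.
for name, clf in [("standard tree", DecisionTreeClassifier()),
                  ("bagging x10", BaggingClassifier(n_estimators=10)),
                  ("boosting x10", AdaBoostClassifier(n_estimators=10))]:
    start = time.perf_counter()
    clf.fit(X, y)
    print(name, f"{time.perf_counter() - start:.3f}s")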

23 Stability of the algorithm
 An algorithm should ideally produce stable results under minor variability in the input data; otherwise it may incorrectly indicate land cover changes when none occurred.
 Variable input data can be common even when training data are drawn from the same locations.
 Test method: random sampling generated 10 training sets (to approximate minor variation); see the sketch below.
 Bagging and boosting provide more stable classification (less sensitivity to variation) than a standard decision tree.
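A minimal sketch of this stability test, training a standard tree on 10 randomly resampled training sets and measuring how often the resulting classifiers agree; the data sizes and the agreement measure are assumptions, and BaggingClassifier could be swapped in for comparison:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=4, n_classes=3,
                           n_informative=3, n_redundant=0, random_state=0)
pool_X, pool_y, X_test = X[:800], y[:800], X[800:]
rng = np.random.default_rng(0)

# Train on 10 slightly different random subsets and record predictions.
preds = []
for _ in range(10):
    idx = rng.choice(800, size=600, replace=False)
    preds.append(DecisionTreeClassifier().fit(pool_X[idx], pool_y[idx])
                 .predict(X_test))
preds = np.array(preds)

# Fraction of test pixels that all 10 classifiers labeled identically:
# a stable algorithm should keep this number high.
agreement = np.mean((preds == preds[0]).all(axis=0))
print(f"{agreement:.0%} of test pixels classified identically across runs")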

24 Robustness to noise
 Remotely sensed data are likely to be noisy due to signal saturation, missing scans, mislabeling, and problems with the sensor or viewing geometry.
 Test methods: (1) random noise in the inputs (zero values introduced at random to simulate missing data; sketched below); (2) mislabeling of cover type in the training data (one class was assigned to all training pixels derived from 3 Landsat scenes).
 Bagging and boosting appear substantially more robust than the standard C5.0 decision tree.
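A minimal sketch of the first test, zeroing out a fraction of the training inputs at random; the 10% noise rate, the data, and the scikit-learn stand-ins are assumptions:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=4, n_classes=3,
                           n_informative=3, n_redundant=0, random_state=0)
X_train, y_train, X_test, y_test = X[:800], y[:800], X[800:], y[800:]

# Simulate missing data: zero out 10% of the training values at random
# (the paper introduced zero values; the 10% rate here is arbitrary).
rng = np.random.default_rng(0)
X_noisy = X_train.copy()
mask = rng.random(X_noisy.shape) < 0.10
X_noisy[mask] = 0.0

# Compare how well each classifier holds up on clean test data.
for name, clf in [("standard tree", DecisionTreeClassifier(random_state=0)),
                  ("bagging x10", BaggingClassifier(n_estimators=10,
                                                    random_state=0))]:
    acc = clf.fit(X_noisy, y_train).score(X_test, y_test)
    print(name, f"accuracy with noisy training data: {acc:.2f}")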

25 Noise: random noise
 The standard C5.0 decision tree clearly has higher error rates and lower stability.
 Bagging appears slightly more stable than boosting for the Landsat data.

26 Noise: mislabeling of cover type in training data
 Causes more problems for the stability of the decision tree algorithms than random noise does.
 The standard C5.0 decision tree is the least stable and has the highest error of all the algorithms.

27 Some applications of the results
These same criteria can be applied to other types of algorithms, such as
 Neural networks
 Maximum likelihood
 Unsupervised classification

