Reference Data & Accuracy Assessment
Dr. Russ Congalton, Kamini Yadav
Generating more No-Crop samples for Australia
The error matrix was generated from 1,118 ground samples collected by Pardha in Australia in August 2014.
The error matrix and accuracy estimates were neither statistically valid nor balanced.
The Crop and No-Crop samples were neither balanced nor proportional to the crop/no-crop area in the GCE v.2 map.
Crop/No-Crop Area Proportionality
[Table, values missing] Columns: GCE v.2 Class, Pixel Count (PC), Area in sq. m. (PC * 250 * 250), Area %. Rows: six Cropland classes, Total Cropland, No-Crop, Total Area.
[Table: Samples in Crop/No-Crop, values missing except the total] Columns: Class, No. of Samples, Sample %, Crop Area %. Rows: Crop, No-Crop, Total (929).
Maintain samples proportional to the GCE v.2 crop/no-crop area.
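The area and allocation arithmetic on this slide can be sketched as follows. This is a minimal illustration, not the procedure actually used: the pixel counts are hypothetical placeholders (the slide's table values are not available), the function names are invented for the example, and largest-remainder rounding is an assumed way to make the per-class counts add up to the 929 total.

```python
PIXEL_SIZE_M = 250  # GCE v.2 pixel edge length in metres, as stated on the slide


def class_areas(pixel_counts, pixel_size_m=PIXEL_SIZE_M):
    """Area in square metres per class: pixel count * 250 * 250."""
    return {name: count * pixel_size_m * pixel_size_m
            for name, count in pixel_counts.items()}


def proportional_allocation(pixel_counts, total_samples):
    """Split total_samples across classes in proportion to class area.

    All pixels have the same size, so area share equals pixel-count share.
    Integer arithmetic plus largest-remainder rounding (an assumption of this
    sketch) guarantees the allocations sum to total_samples.
    """
    total_pixels = sum(pixel_counts.values())
    quotas = {n: (total_samples * c) // total_pixels
              for n, c in pixel_counts.items()}
    remainders = {n: (total_samples * c) % total_pixels
                  for n, c in pixel_counts.items()}
    leftover = total_samples - sum(quotas.values())
    # Hand the leftover samples to the classes with the largest remainders.
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[name] += 1
    return quotas


# Hypothetical pixel counts, NOT the GCE v.2 values.
example_counts = {"Cropland": 1000, "No-Crop": 9000}
example_allocation = proportional_allocation(example_counts, 929)
```

With these placeholder counts, cropland covers 10% of the area and therefore receives roughly 10% of the 929 samples.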
No-Crop Samples
Only 36 of the 1,118 samples (1/3 of the ground validation samples) were No-Crop.
800 random samples have been generated in the No-Crop region of the map.
The central part of Australia was excluded from sampling because there is almost no possibility of cropland there.
Crop Samples
Samples are generated separately for the Crop and No-Crop regions.
106 Crop samples were randomly selected from the 1,082 ground-collected Crop samples to balance the ratio.
No-Crop samples: 36 original + 787 of the 800 new samples = 823.
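A rough sketch of the balancing step, assuming the samples are identified by integer IDs and drawn with a simple seeded random draw. The IDs, the seed, and the function name are placeholders for illustration; the slide does not say which tool performed the selection.

```python
import random


def balance_samples(crop_ids, n_crop, no_crop_ids, seed=0):
    """Randomly keep n_crop crop samples and all no-crop samples.

    The seed only makes the illustration repeatable; it is not from the source.
    """
    rng = random.Random(seed)
    chosen_crop = rng.sample(crop_ids, n_crop)  # sampling without replacement
    return chosen_crop, list(no_crop_ids)


# Placeholder IDs standing in for the real sample records.
crop_ids = list(range(1082))                      # 1,082 ground-collected crop samples
no_crop_ids = list(range(1082, 1082 + 36 + 787))  # 36 original + 787 new no-crop samples

crop_subset, no_crop_set = balance_samples(crop_ids, 106, no_crop_ids)
total_samples = len(crop_subset) + len(no_crop_set)  # 106 + 823 = 929
```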
The Error Matrix: Original and Balanced
Buffer GCE v.2 Map Cropland Area
Generate a Euclidean distance layer from the GCE v.2 cropland class to No-Crop.
It calculates the distance of crop pixels from No-Crop pixels.
Within the Australia boundary, pixel values range from 0 to 24; the ranges 0-1 and 0-2 map units are selected for Buffer 1 and Buffer 2.
[count missing] and 800 random samples (250 x 250 m) have been generated for Buffer 1 and Buffer 2, respectively, proportional to crop area, using Google imagery.
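The buffering idea can be sketched on a toy raster. This is an illustration under stated assumptions, not the GIS workflow used for the map: the grid is hypothetical, distances are measured from each cell to the nearest crop cell in pixel units (so 1 unit corresponds to 250 m and 2 units to 500 m), and the brute-force search is only practical for very small grids.

```python
import math


def distance_to_crop(grid):
    """Euclidean distance (in pixel units) from every cell to the nearest crop cell.

    grid is a list of rows; 1 marks cropland, 0 marks no-crop.
    Raises ValueError if the grid contains no crop cell.
    """
    crop_cells = [(r, c) for r, row in enumerate(grid)
                  for c, value in enumerate(row) if value == 1]
    return [[min(math.hypot(r - cr, c - cc) for cr, cc in crop_cells)
             for c in range(len(row))]
            for r, row in enumerate(grid)]


def buffer_mask(distances, max_distance):
    """True where a cell lies within max_distance map units of cropland."""
    return [[d <= max_distance for d in row] for row in distances]


# Hypothetical 5 x 5 raster with a single crop pixel in the centre.
grid = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

distances = distance_to_crop(grid)
buffer_1 = buffer_mask(distances, 1)  # 0-1 map units, about 250 m
buffer_2 = buffer_mask(distances, 2)  # 0-2 map units, about 500 m
```

Random sample locations would then be drawn only from cells where the chosen mask is true.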
Buffered GCE v.2 Cropland Map
Crop Buffer 1 (250m)
Samples proportional to Crop Area in Buffer 1
[Table, values missing] Columns: Crop Buffer 0-1 ED*, Area (sq. m.), Area %, No. of Samples, Sample %. Rows: Crop, No Crop, Total (Buffer Area 1).
[Figure: The Error Matrix]
*ED: Euclidean Distance in map units
Crop Buffer 2 (500m)
Samples proportional to Crop Area in Buffer 2
[Table, values missing] Columns: Crop Buffer 0-2 ED*, Area (sq. m.), Area %, No. of Samples, Sample %. Rows: Crop, No Crop, Total (Buffer Area 2).
*ED: Euclidean Distance in map units
Error Matrices Comparison
Crop/No-Crop accuracy matrices (map data in rows, reference data in columns). Cell counts, user's accuracies and Kappa values are missing; the surviving values are:

Sampling scheme                        Producer's accuracy   Producer's accuracy   Overall accuracy
                                       (Cropland)            (No-Crop)
Sampling separately in Crop/No-Crop    96.23%                98.42%                98.17%
Sampling in Crop Buffer of 250m        53.40%                97.49%                91.00%
Sampling in Crop Buffer of 500m        70.73%                95.68%                93.13%
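For reference, the quantities reported in these matrices (user's accuracy, producer's accuracy, overall accuracy, Kappa) can be computed from an error matrix as sketched below. The counts are hypothetical, since the original cell values are not available, and the function name is invented for the example.

```python
def accuracy_metrics(matrix):
    """Accuracy measures for a square error matrix.

    matrix[i][j] = number of samples mapped as class i whose reference
    label is class j (map data in rows, reference data in columns).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(n))
    row_sums = [sum(row) for row in matrix]                                # map totals
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]     # reference totals

    overall = diagonal / total
    users = [matrix[i][i] / row_sums[i] for i in range(n)]      # correct / mapped as class
    producers = [matrix[j][j] / col_sums[j] for j in range(n)]  # correct / reference in class
    # Chance agreement expected from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return {"overall": overall, "users": users,
            "producers": producers, "kappa": kappa}


# Hypothetical counts, NOT the Australia results. Order: [Cropland, No-Crop].
example_matrix = [
    [45, 5],     # mapped Cropland: 45 reference Cropland, 5 reference No-Crop
    [10, 140],   # mapped No-Crop: 10 reference Cropland, 140 reference No-Crop
]
metrics = accuracy_metrics(example_matrix)
```

In this placeholder matrix the overall accuracy is 92.5% while the cropland producer's accuracy is about 82%, which shows how a large, well-classified No-Crop class can dominate the overall figure.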
Conclusion
High overall accuracy (98%) when sampling the whole continent, stratified by Crop and No-Crop class.
91% overall accuracy when sampling within a crop buffer of 250m.
93% overall accuracy when sampling within a crop buffer of 500m.
Non-proportional sampling results in a high accuracy that might not be a valid assessment.
Area-proportional sampling results in reasonable, statistically valid accuracy estimates.
North America Accuracy Assessment: Data Received
The crop type maps of AEZ 5, 6 and 7, with 7 composite classified labels.
The CDL reference map, cross-walked and resampled to the classification scheme of the classified map (7 CDL composite labels).
Sub-region boundaries for each AEZ of North America.
CSV files of the generated reference data for the three AEZs (5, 6 and 7). The files consist of training, validation and untouched samples.
Steps to Perform Accuracy Assessment
The quality of the data received from the North America team has been examined.
All possible issues and questions have been documented and given to Russ.
Some decisions and actions are still needed before the accuracy assessment can be performed.
Thanks !!