Reference Data Collection & Accuracy Assessment: Some Results
The Error Matrices
Dr. Russ Congalton & Kamini Yadav
GFSAD30 Meeting, Menlo Park, 19th-21st January, 2016

Theory vs. Practice
Accuracy by Continent
Continents: Australia, Africa, North America, Europe, South America, Asia
Each continent has different characteristics:
- Differences in land and water resources, including soil and terrain conditions
- Topology and climatology (such as annual precipitation and thermal zones)
- Variability in growing periods of different crops
Each continent has a different suitability for growing crops. It is therefore necessary to consider common agronomic and ecological matching zones when assessing agricultural resources and potential, in order to estimate cropland extent and crop types.
Assumptions and Issues
Assumed:
- Create a crop/no-crop map as the first level of stratification before beginning crop type mapping
- Map each continent by some type of agricultural eco-zone to reduce variability and improve accuracy
- Every team maps the same 8 crop types we agreed on
Issues:
- Must determine the appropriate population to sample
- Must then balance the sample size based on the proportion of area in crop vs. no-crop
- Must account for both commission and omission error
Important note: conducting this assessment was highly dependent on using HSI for validation samples
Different Accuracy Assessment Strategies by Continent
Each continent was treated differently when performing the accuracy assessment:
- Australia: buffering the crop/no-crop region using the Euclidean distance between crop and no-crop pixels
- Africa: considering the IIASA crop proportion in each AEZ and the asymptotic behavior of the number of samples to determine the sample proportion in each zone
- North America: a minimum number of crop samples in each agro-ecological zone (AEZ)

Accuracy Assessment: Australia
Generating more No-Crop samples for Australia
- The error matrix was generated from 1,118 ground samples collected by Pardha in Australia in August 2014
- The error matrix and accuracy estimates were neither statistically valid nor balanced
- The crop and no-crop samples were neither balanced nor proportional to the crop/no-crop area in the GCE v.2 map
Crop/No-Crop Area Proportionality
Goal: maintain samples proportional to the GCE v.2 crop/no-crop area
[Table: GCE v.2 Class | Pixel Count (PC) | Area, sq. m (PC * 250 * 250) | Area %. Rows: the six cropland classes below, Total Cropland, No-Crop, Total Area. Cell values not recovered.]
[Table: Samples in Crop/No-Crop. Class | No. of Samples | Sample % | Crop Area %. Rows: Crop, No-Crop, Total. Only the total of 929 samples was recovered.]
GCE v.2 cropland classes:
1. Croplands, RF, all crops
2. Croplands, RF, Pastures
3. Croplands, Irr., all crops
4. Croplands, Irr., Pastures
5. Croplands, Irr., Orchards
6. Croplands, Fallow
No-Crop Samples
- 1,118 reference samples were received (1/3rd of them ground-collected samples)
- Only 36 of the 1,118 samples were No-Crop
- An additional 800 random samples were generated in the No-Crop region of Australia
- The central part of Australia was excluded from sampling because there is almost no possibility of cropland there
Crop Samples
- Samples are generated separately for the Crop and No-Crop regions
- 106 Crop samples were randomly selected from the 1,082 ground-collected samples to balance the proportion with the No-Crop samples (i.e., 36 original + 787 of the 800 additional samples)
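The balancing step above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the crop area fraction of 0.114 is an assumption back-calculated from the slide's own counts (106 of 929 samples), because the GCE v.2 area table did not survive in the source.

```python
import random


def crop_samples_needed(n_no_crop, crop_area_fraction):
    """Number of crop samples so that crop / (crop + no-crop)
    matches the crop share of the mapped area."""
    if not 0 < crop_area_fraction < 1:
        raise ValueError("crop_area_fraction must be strictly between 0 and 1")
    return round(n_no_crop * crop_area_fraction / (1 - crop_area_fraction))


def subsample(sample_ids, k, seed=0):
    """Draw k samples without replacement, reproducibly."""
    return random.Random(seed).sample(sample_ids, k)


n_no_crop = 36 + 787          # counts stated on the slide
crop_fraction = 0.114         # ASSUMED, see the note above
n_crop = crop_samples_needed(n_no_crop, crop_fraction)

ground_crop_ids = list(range(1082))   # placeholder IDs for the 1,082 crop samples
chosen = subsample(ground_crop_ids, n_crop)
print(n_crop, len(chosen))
```

With these inputs the function returns 106, matching the number of crop samples reported on the slide. The fixed seed only makes the draw repeatable; any seed gives an equally valid random subset.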
Buffer GCE v.2 Map Cropland Area
GOAL: include omission error
Procedure:
- Generate a Euclidean distance layer from GCE v.2 Crop to No-Crop
- Calculate the distance of Crop pixels from No-Crop pixels
- Within the bounds of Australia, the distance layer ranged from 0 to 24 pixels (map units)
- Two buffer zones (Buffer 1 and Buffer 2) were generated using the ranges 0-1 and 0-2 map units
- 700 and 800 random samples (250 x 250 m) were generated for Buffer 1 and Buffer 2 respectively, proportional to cropland area, using Google Earth imagery
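A small sketch of the buffering idea, under one reading of the procedure: compute each pixel's Euclidean distance (in pixels) to the nearest crop pixel, then keep everything within 1 or 2 map units. The grid, the function names, and the brute-force search are all illustrative assumptions; the slides do not say which tool produced the distance layer.

```python
import math


def distance_to_crop(grid):
    """grid: list of rows, 1 = crop, 0 = no-crop.
    Returns, per pixel, the Euclidean distance in pixels to the nearest crop pixel."""
    crop = [(r, c) for r, row in enumerate(grid)
            for c, value in enumerate(row) if value == 1]
    if not crop:
        raise ValueError("grid contains no crop pixels")
    return [[min(math.hypot(r - cr, c - cc) for cr, cc in crop)
             for c in range(len(row))]
            for r, row in enumerate(grid)]


def buffer_mask(dist, max_dist):
    """True where a pixel lies within max_dist map units of cropland."""
    return [[d <= max_dist for d in row] for row in dist]


# Hypothetical 5 x 5 map with a single crop pixel in the center.
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1

dist = distance_to_crop(grid)
buffer1 = buffer_mask(dist, 1)   # 0-1 map units
buffer2 = buffer_mask(dist, 2)   # 0-2 map units

print(sum(map(sum, buffer1)), sum(map(sum, buffer2)))  # 5 13
```

The brute-force search is only practical for tiny grids; a continental 250 m raster would need a proper distance transform from a GIS or raster library. Sampling inside the buffer rather than only inside mapped cropland is what lets cropland missed by the map (omission error) appear in the reference data.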
Buffered GCE v.2 Cropland Map
[Figure: buffered GCE v.2 cropland map]

Crop Buffer 1
[Figure: Crop Buffer 1]
Samples proportional to Crop Area in Buffer 1
[Table: Crop Buffer 0-1 ED | Area (sq. m) | Area % | No. of Samples | Sample %. Rows: Crop, No-Crop, Total (Buffer Area 1). Cell values not recovered.]
*ED: Euclidean Distance in map units
The Error Matrix (map data in rows, reference data in columns; cell counts and user's accuracies not recovered)
Producer's accuracy: Cropland 53.40%, No-Crop 97.49%
Overall accuracy: 91.00%
Crop Buffer 2
[Figure: Crop Buffer 2]
Samples proportional to Crop Area in Buffer 2
[Table: Crop Buffer 0-2 ED | Area (sq. m) | Area % | No. of Samples | Sample %. Rows: Crop, No-Crop, Total (Buffer Area 2). Cell values not recovered.]
*ED: Euclidean Distance in map units
The Error Matrix (map data in rows, reference data in columns; cell counts and user's accuracies not recovered)
Producer's accuracy: Cropland 70.73%, No-Crop 95.68%
Overall accuracy: 93.13%
Result: The Error Matrix
[Figure panels: Unbalanced samples; Balanced by Crop Proportion]
Crop/No-Crop Accuracy Matrix using Crop Buffers (Population)
Crop Buffer 1 (0-1 Euclidean Distance): producer's accuracy Cropland 53.40%, No-Crop 97.49%; overall accuracy 91.00%
Crop Buffer 2 (0-2 Euclidean Distance): producer's accuracy Cropland 70.73%, No-Crop 95.68%; overall accuracy 93.13%
(Cell counts and user's accuracies not recovered.)
Conclusion/Summary
[Table: Class description (Name) | Histogram SMT-2014 (# Pixels, %) | Percent of croplands | No. of samples (#) | % of samples. Cell values not recovered. Rows:]
1. Croplands, rainfed, SC (season 1 & 2), all crops
2. Croplands, rainfed, SC, pastures
3. Croplands, irrigated, SC, DC (season 1 & 2), all crops
4. Croplands, irrigated, SC, pastures
5. Croplands, irrigated, continuous, orchards
6. Croplands, fallow
7. Noncropland
Total croplands
Information provided by Pardha about increasing the reference data proportionally with crop area:
- How was the percent of each cropland type determined? From the classified map or from some other independent reference source?
- What percent of the area is covered by non-croplands? This is needed to decide the number of No-Crop samples and their distribution
- ISSUE: the area percentages of Class 1 and Class 2 are almost the same, but the percentages of samples do not match this area proportionality (working with Pardha on this now)
- The number of samples for fallow is larger than its area extent warrants
- ISSUE: methods of augmenting additional reference samples need to be discussed

Accuracy Assessment: Africa
Africa Cropland Products
- L1: Cropland Extent map, 2014, with Irrigated, Rain-fed, and No-Crop class labels
- L2: Crop Intensity, 2014 (limited reference data)
- L3: Crop Dominance (limited reference data)
An agro-ecological layer was provided, but this layer was not used for the mapping.
Training Data for Crop Mapping
- Curt Reynold's data
- Visual interpretation from Google Earth
- Ground data from Murali
- Corn samples in South Africa
- Sugarcane samples
- Irrigation samples in Egypt
Distribution of Ground Collected Validation Samples
Samples collected by different independent projects:
- Irrigated and rain-fed samples from Mutlu
- Mali ground-collected data
- LULC independent dataset
- East Africa dataset
Issue: Some of the "validation samples" had already been used for training
Example: the Malawi data were used in training

Issue: Validation samples are redundant, overlapping, and spatially autocorrelated
[Figure: cleaned-up ground-collected validation data]
Even after clean-up, the reference samples are unevenly distributed across zones, which complicates the accuracy assessment
Agro-Ecological Zones in Africa
[Table: Zones | Growing Days. Rows: Zone 1 to Zone 8. Only Zone 1 (0 days) was recovered.]
Assessment Performed by Ag-Eco Zone
- Determination of the number of samples needed per zone
- Analysis of the proportion of crop/no-crop by zone
- Initial use of the hybrid crop probability layer
- Evaluated using 50, 100, 150, 200, and 250 samples per zone
- Highly dependent on using HSI for the reference data
- Produced error matrices by zone and in total for Africa
[Figure: Zone 3]
Hybrid Crop Probability Layer (IIASA, Fritz et al. 2015)
[Table: one column per zone (Zone 1 to Zone 8). Rows: crop probability bins (0-10)%, (10-20)%, (20-40)%, (40-60)%, (60-80)%, (80-100)%, then Weighted crop, Zone Area, Weighted %, Total Crop, Crop %. Cell values not recovered.]

Crop/No-Crop Proportion in each zone
[Table: Zones | Crop Samples | No-Crop Samples | Zone Area (sq. km) | Sample Proportion (%) | IIASA Total Crop Proportion (%). Rows: Zone 1 to Zone 8. Cell values not recovered.]
[Figure: Sample Proportion % and IIASA Weighted % by zone (Zone 1 to Zone 8) for increasing numbers of samples]
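Crop Sample Proportion in the Agro-ecological Zones of Africa
[Figure: crop sample proportion by number of samples for each zone]
The level at which the sample proportion flattens out (its asymptote) does not reach the IIASA crop proportion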
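One way to make "the number of samples stabilizes" concrete is to look for the smallest sample size after which the crop proportion changes by less than some tolerance. The sketch below assumes that reading; the tolerance, the function name, and all the numbers are hypothetical, since the plotted values are not recoverable from the slide.

```python
def stabilizing_size(proportion_by_size, tolerance=1.0):
    """Smallest sample size from which every later step changes the crop
    proportion by less than `tolerance` percentage points; None if never."""
    sizes = sorted(proportion_by_size)
    values = [proportion_by_size[s] for s in sizes]
    for i in range(len(sizes) - 1):
        later_steps = [abs(values[j + 1] - values[j])
                       for j in range(i, len(sizes) - 1)]
        if all(step < tolerance for step in later_steps):
            return sizes[i]
    return None


# HYPOTHETICAL crop proportions (%) observed at the sample sizes tried per zone.
zone_example = {50: 14.0, 100: 10.0, 150: 8.7, 200: 8.2, 250: 8.0}
iiasa_proportion = 12.0   # HYPOTHETICAL reference value for the same zone

stable_at = stabilizing_size(zone_example)
gap = zone_example[250] - iiasa_proportion
print(stable_at, gap)
```

In this made-up zone the proportion settles from 150 samples onward, yet it settles about 4 points below the reference proportion, which is the kind of mismatch the slide describes.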
Results: Error Matrices
Six crop/no-crop error matrices, one per zone (map data in rows, reference data in columns). Cell counts and user's accuracies were not recovered; the matrices are listed in extraction order and the panel labels appear as Zone 3, Zone 4, Zone 5, Zone 8, Zone 7, Zone 6.
- Matrix 1: producer's accuracy Cropland 66.67%, No-Crop 92.82%; overall accuracy 88.98%
- Matrix 2: producer's accuracy Cropland 37.93%, No-Crop 94.12%; overall accuracy 87.12%
- Matrix 3: producer's accuracy Cropland 50.00%, No-Crop 95.19%; overall accuracy 91.23%
- Matrix 4: producer's accuracy Cropland 57.14%, No-Crop 99.18%; overall accuracy 98.00%
- Matrix 5: producer's accuracy Cropland 66.67%, No-Crop 97.02%; overall accuracy 96.27%
- Matrix 6: producer's accuracy Cropland 23.08%, No-Crop 98.21%; overall accuracy 94.86%
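Result: Overall Accuracy for Africa
[Table: Zones | Crop Samples | No-Crop Samples | Zone Area (sq. km) | Overall Accuracy %. Rows: Zone 1 to Zone 8. Cell values not recovered. Note on slide: "Almost No-Crop zones".]
[Table: Zones | Crop Samples | No-Crop Samples | Total Samples | Zone Area (sq. km) | Area ratio | Overall Accuracy, OA % | Area Ratio * OA. Rows: six zones and Total. Cell values not recovered.]
The Error Matrix (all zones combined; map data in rows, reference data in columns)
- Map No-Crop row: 57 reference Cropland, 1,328 reference No-Crop
- Reference totals (Sum Points): 114 Cropland, 1,379 No-Crop, 1,493 in all
- Producer's accuracy: Cropland 50.00%, No-Crop 96.30%
- Overall accuracy: 92.77%
(The map Cropland row and the user's accuracies were not recovered.)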
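The "Area Ratio * OA" column suggests that the per-zone accuracies are combined by weighting each zone's overall accuracy by its share of the total area. A minimal sketch under that assumption follows; the zone names, areas, and accuracies are hypothetical because the table values did not survive.

```python
def area_weighted_accuracy(zones):
    """zones: {name: (area, overall_accuracy_percent)}.
    Returns the sum over zones of (area / total area) * overall accuracy."""
    total_area = sum(area for area, _ in zones.values())
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(area / total_area * oa for area, oa in zones.values())


# HYPOTHETICAL zones: (area in sq. km, overall accuracy in %)
example_zones = {
    "Zone A": (100.0, 90.0),
    "Zone B": (300.0, 80.0),
    "Zone C": (600.0, 95.0),
}

weighted_oa = area_weighted_accuracy(example_zones)
print(round(weighted_oa, 2))  # 90.0
```

Weighting by area keeps a small, easy zone from dominating the continental figure; an unweighted mean of the three hypothetical accuracies would be about 88.3% instead.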
Conclusion
- Each agro-ecological zone (AEZ) has a different crop area proportion, so the sample distribution must be calculated per zone
- The number of samples needed stabilizes at a different sample size in each AEZ
- It is important to assess accuracy in each zone, as results vary with the complexity of the AEZ
- It is easy to use HSI for Crop/No-Crop reference data, BUT validating the Crop Intensity and Crop Dominance products will require ground-collected reference data

Accuracy Assessment of Cropland Products of North America
Cropland Data for North America
- Resampled CDL map at 250 m pixel resolution with 7 translated crop types
- Zone-wise validation samples for each crop type
- Classified mosaic map with 7 crop types
- Map of 13 agro-ecological zones
Classified crop types: Alfalfa, Corn-Soybean, Rice, Cotton, Potato, Wheat-Barley, Other Crops
Steps to perform the accuracy assessment of the North America crop type maps:
- The accuracy assessment will be performed using reference data from the resampled 250 m Cropland Data Layer (CDL) for the year [year not recovered].
- The reference data will consist of randomly generated, homogeneous 250 m samples.
- The composite labels (i.e., the combined crop types of the classified map) will be compared with the combined CDL reference labels.
- Accuracy estimates will be provided for each agro-ecological zone in North America.
- The accuracy assessment will be reported as an error matrix with overall accuracy, producer's accuracy, and user's accuracy.
Step 1: Crop Proportion in each Zone
[Table: Zone | Zone area (sq. km) | Total Crop (sq. km) | Total Crop %. Rows: Zone 1 to Zone 13. Cell values not recovered, apart from a fragment "71" following Zone 1.]

Sample Proportion in each Zone
[Table: Zone | Total Crop % | No-Crop % | Crop Samples | No-Crop Samples | Total Samples. Rows: Zone 1 to Zone 13 and Total. Per-zone values not recovered.]
Total: 1,958 crop samples, 31,672 no-crop samples, 33,630 samples in all
Crop/No-Crop Error Matrices
Twelve crop/no-crop error matrices, one per zone (map data in rows, reference data in columns). Matrices are listed in extraction order; user's accuracies and most cell counts were not recovered.

Panel labels: Zone 2, Zone 3, Zone 4, Zone 5, Zone 6, Zone 7
- Matrix 1: producer's accuracy Cropland 72.12%, No-Crop 99.84%; overall accuracy 99.47%; map No-Crop row 29 and 7,875; reference totals 104 Cropland, 7,888 No-Crop, 7,992 in all
- Matrix 2: producer's accuracy Cropland 80.56%, No-Crop 68.64%; overall accuracy 68.80%; map Cropland row 145 and 4,298; map No-Crop row 35 and 9,409; reference totals 180 Cropland, 13,707 No-Crop, 13,887 in all
- Matrix 3: producer's accuracy Cropland 84.86%, No-Crop 75.95%; overall accuracy 76.57%; map No-Crop row 28 and 1,866; reference totals 185 Cropland, 2,457 No-Crop, 2,642 in all
- Matrix 4: producer's accuracy Cropland 88.24%, No-Crop 76.89%; overall accuracy 78.59%; reference totals 187 Cropland, 1,060 No-Crop, 1,247 in all
- Matrix 5: producer's accuracy Cropland 86.79%, No-Crop 78.01%; overall accuracy 80.41%
- Matrix 6: producer's accuracy Cropland 81.36%, No-Crop 78.95%; overall accuracy 79.67%

Contd. Panel labels: Zone 8, Zone 9, Zone 10, Zone 11, Zone 12, Zone 13
- Matrix 7: producer's accuracy Cropland 63.64%, No-Crop 66.96%; overall accuracy 66.39%
- Matrix 8: producer's accuracy Cropland 76.35%, No-Crop 76.95%; overall accuracy 76.84%
- Matrix 9: producer's accuracy Cropland 87.56%, No-Crop 75.24%; overall accuracy 77.58%
- Matrix 10: producer's accuracy Cropland 76.73%, No-Crop 75.74%; overall accuracy 75.85%; map No-Crop row 37 and 974; reference totals 159 Cropland, 1,286 No-Crop, 1,445 in all
- Matrix 11: producer's accuracy Cropland 86.17%, No-Crop 67.16%; overall accuracy 68.87%
- Matrix 12: producer's accuracy Cropland 90.48%, No-Crop 67.19%; overall accuracy 69.29%
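Result: Overall Accuracy
Overall accuracy with 33,630 samples in 13 zones
[Table: Zone | Crop Samples | No-Crop Samples | No. of Samples | Zone Area | Area %. Rows: zones and Total. Cell values not recovered.]
[Table: Zone | Crop Samples | No-Crop Samples | No. of Samples | Zone Area | Area ratio | Overall Accuracy, OA % | Area ratio * OA. Rows: zones and Total. Cell values not recovered.]
The Error Matrix (all zones combined; map data in rows, reference data in columns)
- Map Cropland row: 1,583 reference Cropland, 6,921 reference No-Crop
- Map No-Crop row: 375 reference Cropland, 24,751 reference No-Crop
- Reference totals (Sum Points): 1,958 Cropland, 31,672 No-Crop, 33,630 in all
- Producer's accuracy: Cropland 80.85%, No-Crop 78.15%
- Overall accuracy: 78.31%
(Row totals and user's accuracies were truncated in extraction.)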
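The three accuracy measures reported throughout can be recomputed from the four cell counts of the combined North America matrix, which survive on this slide. The sketch below uses the standard definitions (user's accuracy along map rows, producer's accuracy along reference columns); the function and variable names are illustrative.

```python
def accuracy_measures(matrix, classes):
    """matrix[i][j] = number of samples mapped as classes[i]
    whose reference label is classes[j]."""
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    # User's accuracy: correct / row total (sensitive to commission error).
    user = {classes[i]: matrix[i][i] / sum(matrix[i]) for i in range(n)}
    # Producer's accuracy: correct / column total (sensitive to omission error).
    producer = {classes[j]: matrix[j][j] / sum(matrix[i][j] for i in range(n))
                for j in range(n)}
    return {"overall": correct / total, "user": user, "producer": producer}


classes = ["Cropland", "No-Crop"]
north_america = [
    [1583, 6921],    # mapped Cropland: reference Cropland, reference No-Crop
    [375, 24751],    # mapped No-Crop:  reference Cropland, reference No-Crop
]

result = accuracy_measures(north_america, classes)
for name in classes:
    print(name,
          f"user {result['user'][name]:.2%}",
          f"producer {result['producer'][name]:.2%}")
print(f"overall {result['overall']:.2%}")
```

The producer's accuracies (80.85% and 78.15%) and the overall accuracy (78.31%) reproduce the figures on the slide. The Cropland user's accuracy, which was lost in extraction, comes out at roughly 18.6% from these counts: most samples mapped as cropland are No-Crop in the reference data, which is a large commission error in the Cropland class.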
Spatial distribution of disagreement in the Crop/No-Crop map of North America
[Figure: map of crop/no-crop disagreement]

Conclusion/Summary
- The crop/no-crop accuracy assessment shows commission error in the Cropland class
- Need to check the percent of cropland vs. no-cropland area in North America against the CDL
- Need to have a Crop/No-Crop extent map
- Need to check the accuracy of crop types in each zone based on their area proportionality