
1 Reference Data Collection & Accuracy Assessment
Dr. Russ Congalton & Kamini Yadav
GFSAD30 Meeting, Wisconsin, 14th–16th July 2015
Reference Map

2 Outline
 Reference Data
 Africa
 Australia
 North America
 Europe & South America
 VHRI data collection
 RHSeg work
 eCognition / R program
 Accuracy assessment (Australia)
 Future work

3 Discussion with Each Group on Reference Data
 Africa (Jun)
 Received reference data form
 Had first conference call on April 16, 2015
 Australia (Pardha)
 Received reference data form
 Conference call on May 28, 2015; follow-up call on June 30, 2015
 North America (Richard/Teki)
 Received reference data form
 Conference call on June 3, 2015
 Europe (Aparna/Mutlu)
 Received reference data form
 Unable to schedule the call yet
 South America (Ying/Chandra)
 Not yet approached; no reference data form received

4 Outcome of Reference Data Calls
 These calls proved excellent for coordinating efforts between the mapping teams and the accuracy team and for determining mutually integrated needs and analyses.
 They resulted in detailed knowledge of each mapping team and how they are generating their product, which will help us perform validation for the respective products.
 Determined action items specific to each team.
 Compiled all the possible sources of reference data and discussed how to collect or build the necessary independent reference data for each continent.

5 Ground Data Sources
 Ground data collected by our team (including Murali)
 Received shapefiles for Ethiopia, Tanzania, Malawi, Rwanda, Burundi
 India (South India, Rajasthan)
 Ground data sourced from other projects (e.g., CORINE)
 Curt Reynolds's field data from USDA/FAS
 2015 corn map for South Africa and 2014 cotton/rice map for Australia (GDA Corp)
 Ground data from global collections (e.g., Mutlu's)
 To be used purely for accuracy assessment
 Ground data from the literature
 Authors contacted to obtain the reference data they used or the map they produced (if willing to share)
 Reference data from other work (e.g., USDA CDL, Agriculture and Agri-Food Canada)
Reference data categories: Statistical Data (FAOSTAT), Ground Data, Existing Cropland Data, Very High Resolution Imagery

6 Africa Reference Data
Source: with the help of Jim Tilton, received VHRI from the NGA

7 Australia Reference Data
 The statistical data from FAOSTAT are used only to evaluate the sub-pixel/actual crop area for the year 2014
 Before using the ground data, the crosswalk between its classification scheme and the classified map must be defined
 The Dynamic Land Cover Dataset of Australia (DLCD) is used for crop-area comparison only; it is available for the years 2000–2008
 Create random samples from the DLCD map for accuracy assessment (a sampling sketch follows below)
 Decide the strategy and perform accuracy assessment for the GCE v2.0 250 m product
 Select VHRI and perform reference classification to generate ground data
Agriculture Ecological Zones: Zones 3–9
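As a rough illustration of the "create random samples from the DLCD map" step, here is a minimal R sketch using the raster package; the file name, the cropland class codes, and the sample size of 50 per class are assumptions, not the team's actual settings.

```r
# Minimal sketch: stratified random samples from a classified raster (assumed DLCD file)
library(raster)

dlcd <- raster("dlcd_australia.tif")   # hypothetical path to the DLCD raster
names(dlcd) <- "class"                 # name the layer so the output column is predictable

set.seed(42)
# Draw up to 50 cells per class; returns cell numbers, coordinates, and class values
samples <- sampleStratified(dlcd, size = 50, xy = TRUE, na.rm = TRUE)

head(samples)                                        # columns: cell, x, y, class
# Keep only the cropland strata (class codes 9 and 10 are placeholders, not DLCD's real codes)
crop_samples <- samples[samples[, "class"] %in% c(9, 10), ]
write.csv(crop_samples, "dlcd_reference_samples.csv", row.names = FALSE)
```

The resulting point list could then be photo-interpreted on VHRI or Google Earth before being used as independent reference data.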

8 North America Reference Data
 CDL: the 56 m cropland layer from USDA-NASS has been resampled to 250 m and used to build pure pixels (6,415,797 out of 26,444,974 total crop pixels). The remaining 20,029,177 unused pixels are potential locations where homogeneous samples can be chosen to create the validation dataset (a resampling/pure-pixel sketch follows below)
 The NLCD 30 m cropland layer can be a possible source for creating a crop/no-crop mask
 The ground data used for NLCD validation and accuracy assessment may also be a possible source??

Type                     Number of Pixels (250 m)
Total Crop (Original)    26,444,974
Total Crop (Buffered)     6,415,797
Total Unused Pixels      20,029,177
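The slide does not spell out how the 56 m CDL layer was resampled to 250 m, so the following is only a minimal R sketch of one way to derive a crop fraction and keep "pure" coarse cells; the file name, the crop-code list, and the integer aggregation factor (which gives roughly, not exactly, 250 m) are all assumptions.

```r
# Minimal sketch: crop/no-crop mask from CDL and selection of homogeneous coarse cells
library(raster)

cdl <- raster("cdl_56m.tif")                       # hypothetical CDL file
crop_codes <- c(1:61, 66:80, 195:254)              # assumption: illustrative CDL crop codes

# 1 = crop, 0 = everything else
is_crop <- calc(cdl, fun = function(x) as.integer(x %in% crop_codes))

# Fraction of crop within each coarse cell; fact = 4 only approximates 250 m from 56 m
crop_frac <- aggregate(is_crop, fact = 4, fun = mean, na.rm = TRUE)

# "Pure" cells: entirely crop or entirely non-crop at the coarse resolution
pure_crop    <- crop_frac == 1
pure_noncrop <- crop_frac == 0
writeRaster(pure_crop, "cdl_pure_crop_coarse.tif", overwrite = TRUE)
```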

9 Canada Reference Data
 Generate reference data from the AAFC cropland layer as homogeneous samples for 250 m mapping (in the same way as the USDA NASS CDL layer)
 Based on field size, decide the homogeneous-pixel criteria for labeling a 3x3 block of 30 m pixels or a 250 m pixel
 Process VHRI in some areas and use it to test the accuracy of the cropland layer
Agriculture Ecological Zones: Zones 3–6, 12 & 13

10 AAFC Cropland Map Labels (AAFC: Agriculture & Agri-Food Canada)

Code  Label                           Definition
120   Agriculture (undifferentiated)  Agricultural land, including annual and perennial crops; and would exclude grassland. This class is mapped only if the distinction of sub-agricultural covers (classes 132-199) is not possible.
122   Pasture / Forages               Periodically cultivated. Includes tame grasses and other perennial crops such as alfalfa and clover grown alone or as mixtures for hay, pasture or seed.
130   Too Wet to be Seeded            Agricultural fields that are normally seeded that remain unseeded due to excess spring moisture.
131   Fallow                          Plowed and harrowed fields that are left unsown for the growing season.
132   Cereals                         This class is mapped only if the distinction of sub-cereal covers (classes 133-146) is not possible.
133   Barley
134   Other Grains
135   Millet
136   Oats
137   Rye
138   Spelt
139   Triticale
140   Wheat                           This sub-cereal class is mapped only if the distinction of sub-wheat covers (classes 145-146) is not possible.
141   Switch grass
145   Winter Wheat
146   Spring Wheat
147   Corn
148   Tobacco
149   Ginseng
150   Oilseeds                        This class is mapped only if the distinction of sub-oilseed covers (classes 151-158) is not possible.
151   Borage
152   Camelina
153   Canola / Rapeseed
154   Flaxseed
155   Mustard
156   Safflower
157   Sunflower
158   Soybeans
160   Pulses                          This class is mapped only if the distinction of sub-pulse covers (classes 162-174) is not possible.
162   Peas
167   Beans
174   Lentils
175   Vegetables                      This class is mapped only if the distinction of sub-vegetable covers (classes 176-179) is not possible.
176   Tomatoes
177   Potatoes
178   Sugar beets
179   Other Vegetables
180   Fruits                          This class is mapped only if the distinction of sub-fruit covers (classes 181-190) is not possible.
181   Berries
183   Cranberry
188   Orchards
189   Other Fruits
190   Vineyards
191   Hops
192   Sod
193   Herbs
194   Nursery
195   Buckwheat
196   Canary Seed
197   Hemp
198   Vetch
199   Other Crops

Issue in crosswalking from AAFC to the map classification: standard criteria?? (An illustrative crosswalk sketch follows below.)
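As one illustrative way to apply such a crosswalk, the R sketch below reclassifies AAFC codes into three broad classes with the raster package; the grouping (pasture vs. crop vs. fallow/unseeded) and the file name are assumptions, not the standard criteria the slide asks about.

```r
# Minimal sketch: code crosswalk from AAFC crop-inventory labels to broad map classes
library(raster)

aafc <- raster("aafc_crop_inventory.tif")          # hypothetical AAFC raster

# Lookup table covering the AAFC code ranges; unused codes simply never match
crosswalk <- data.frame(aafc_code = c(120, 122, 130, 131, 132:158, 160:199))
# 1 = cropland, 2 = pasture/forages, 3 = fallow / not seeded (illustrative grouping)
crosswalk$map_class <- ifelse(crosswalk$aafc_code == 122, 2,
                       ifelse(crosswalk$aafc_code %in% c(130, 131), 3, 1))

# Substitute AAFC codes (column 1) with the crosswalked class codes (column 2)
map_classes <- subs(aafc, crosswalk, by = 1, which = 2)
writeRaster(map_classes, "aafc_crosswalked.tif", overwrite = TRUE)
```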

11 Reference Data from the Literature
1. Crop area mapping in West Africa using landscape stratification of MODIS time series and comparison with existing global land products. International Journal of Applied Earth Observation and Geoinformation, Volume 14, Issue 1, February 2012, Pages 83–93. Contact: elodie.vintrou@cirad.fr, elodie.vintrou@gmail.com. Data: ground data set collected during the 2009 and 2010 cropping seasons (744 GPS waypoints at the validation sites).
2. Generating plausible crop distribution maps for Sub-Saharan Africa using a spatially disaggregated data fusion and optimization approach. Agricultural Systems, Volume 99, Issues 2–3, February 2009, Pages 126–140. Contact: L.YOU@CGIAR.ORG. Data: crop distribution map of sub-Saharan Africa.
3. Generating global crop distribution maps: From census to grid. Agricultural Systems, Volume 127, May 2014, Pages 53–60. Contact: L.YOU@CGIAR.ORG. Data: global rainfed/irrigated crop map.
4. Disaggregating and mapping crop statistics using hypertemporal remote sensing. International Journal of Applied Earth Observation and Geoinformation, Volume 12, Issue 1, February 2010, Pages 36–46. Contact: Khan@ITC.nl. Data: wheat, sunflower, and barley crop maps of southern Spain.
5. Global rain-fed, irrigated, and paddy croplands (GRIPC). J. Meghan Salmon, Mark A. Friedl, Steve Frolking, Dominik Wisser, Ellen M. Douglas. Data: irrigated/rainfed map (https://dl.dropboxusercontent.com/u/12683052/GRIPCmap.zip).
6. Finer resolution observation and monitoring of global land cover: first mapping results with Landsat TM and ETM+ data. International Journal of Remote Sensing, Volume 34, Issue 7, 2013. Contact: penggong@berkeley.edu. Data: Landsat/MODIS mapping; 91,000 training samples and 38,000 test samples.

12 Reference Data from the Literature (continued)
7. Data Mining, a Promising Tool for Large-Area Cropland Mapping. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol. 6, No. 5, October 2013. Contact: elodie.vintrou@cirad.fr. Data: field surveys conducted in Mali during the 2009 and 2010 crop seasons (980 waypoints).
8. GlobeLand30 (http://www.globallandcover.com/GLC30Download/index.aspx). ISPRS Journal of Photogrammetry and Remote Sensing 103 (2015) 7–27. Contact: chenjun@nsdi.gov.cn. Data: 154,587 pixel samples for the year 2010.
9. Mapping and discrimination of soybean and corn crops using spectro-temporal profiles of vegetation indices. International Journal of Remote Sensing, 2015, Vol. 36, No. 7, 1809–1824. Contact: carlos_hws@hotmail.com. Data: field data from 19 different croplands (state of Paraná, located in the south of Brazil).
10. Improving Crop Area Estimation in West Africa Using Multiresolution Satellite Data. Proceedings of Global Geospatial Conference 2013. Contact: gerald.forkuor@uni-wuerzburg.de. Data: field survey conducted between May and July 2012.
11. Impact of feature selection on the accuracy and spatial uncertainty of per-field crop classification using Support Vector Machines. ISPRS Journal of Photogrammetry and Remote Sensing 85 (2013) 102–119. Contact: fabian.loew@uni-wuerzburg.de. Data: extensive field survey conducted in four test sites in Middle Asia.
12. MODIS Collection 5 global land cover: Algorithm refinements and characterization of new datasets. Remote Sensing of Environment 114 (2010) 168–182. Contact: friedl@bu.edu. Data: 1,860 training sites globally.
13. Cropland for sub-Saharan Africa: A synergistic approach using five land cover data sets. Calibrated synergy map for Africa (http://onlinelibrary.wiley.com/doi/10.1029/2010GL046213/abstract). Contact: fritz@iiasa.ac.at. Data: 2,553 samples distributed over Africa.

13 RHSeg Algorithm: Segmentation
 The NTF-to-TIFF conversion using GDAL does not match the conversion done in Erdas/ArcMap (e.g., the FCC (GDAL) and NCC (Erdas) satellite images on the left have different extents); a quick diagnostic sketch follows below
 The output from RHSeg does not overlay the image in eCognition
 There is an issue converting the RHSeg results from raster to vector format, because segmentation results are represented more efficiently in vector format
RHSeg result overlaid on WorldView-1 (FCC, NCC)
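A quick diagnostic sketch for the extent mismatch: read both converted images in R and compare their georeferencing; the file names below are assumptions.

```r
# Minimal sketch: compare georeferencing of the two conversions of the same NTF scene
library(raster)

gdal_tif  <- raster("wv_gdal_converted.tif")    # hypothetical: NTF -> TIFF via GDAL
erdas_tif <- raster("wv_erdas_converted.tif")   # hypothetical: NTF -> TIFF via Erdas/ArcMap

extent(gdal_tif); extent(erdas_tif)             # compare corner coordinates
crs(gdal_tif);    crs(erdas_tif)                # compare projections
res(gdal_tif);    res(erdas_tif)                # compare pixel sizes

# If only the extent/grid differs, one output can be snapped onto the other's grid
aligned <- resample(gdal_tif, erdas_tif, method = "ngb")
```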

14 Adding Thematic Layers to the RHSeg Result
 The object labels from the RHSeg segmentation need to be appended with a number of parameters: border index, homogeneity, contrast, band values, rectangular fit, etc.
 Jim Tilton has worked on an NDVI layer, which can be computed either prior to processing with RHSeg (pre-processing) or on the RHSeg output (post-processing); a zonal-statistics sketch of the post-processing route follows below
eCognition segmentation output example
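A minimal sketch of the post-processing route: compute NDVI from the imagery and attach its per-segment mean to the RHSeg labels with zonal statistics. File names and band order are assumptions here, not the project's actual inputs.

```r
# Minimal sketch: mean NDVI per RHSeg segment via zonal statistics
library(raster)

img      <- brick("multispectral_scene.tif")     # hypothetical multispectral image
segments <- raster("rhseg_labels.tif")           # hypothetical RHSeg region-label raster

red <- img[[3]]; nir <- img[[4]]                 # assumption: band 3 = red, band 4 = NIR
ndvi <- (nir - red) / (nir + red)

# Mean NDVI per segment label (post-processing the RHSeg output)
seg_ndvi <- zonal(ndvi, segments, fun = "mean")
head(seg_ndvi)                                   # columns: zone (segment id), mean NDVI
```

The same pattern could be repeated for any other thematic layer, producing one attribute column per layer in the object table.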

15 Classification with Random Forest
R program:
 The RHSeg segmentation result is used for random forest classification in R (a minimal sketch follows below)
 The variables can be used either in raster (thematic) form or attached to the region objects
 This allows integration of the RHSeg output into the random forest classification
 Useful variables will be selected based on the accuracy metrics reported in R
eCognition software:
 A random selection of objects created from multi-resolution segmentation is used as training data to perform random forest classification
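A minimal sketch of the R side, assuming the RHSeg object attributes have already been exported to a table with one row per object; the file name and column names (mean_ndvi, border_index, homogeneity, rect_fit, class) are assumptions.

```r
# Minimal sketch: random forest on per-object features derived from RHSeg segments
library(randomForest)

objects <- read.csv("rhseg_object_features.csv")   # hypothetical object attribute table
objects$class <- as.factor(objects$class)          # training label per object (NA = unlabeled)

train_idx <- which(!is.na(objects$class))
rf <- randomForest(class ~ mean_ndvi + border_index + homogeneity + rect_fit,
                   data = objects[train_idx, ],
                   ntree = 500, importance = TRUE)

print(rf)                          # out-of-bag error as a first accuracy metric
varImpPlot(rf)                     # which object variables are useful
objects$predicted <- predict(rf, objects)          # label every segment
```

Variable importance from the fitted model is one way to shortlist the "useful variables" mentioned above before re-running the classification.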

16 North America Reference Data
CDL class labels (examples): Speltz, Misc. Vegetables & Fruits, Watermelon, Hops, Sod, Switch Grass, Wildflowers, Other Tree Crops, Pistachio, Carrot, Garlic, Cantaloupes, Pumpkin, Broccoli, Caneberries, Cranberries
The study area showing WorldView-1 imagery and the Cropland Data Layer (CDL), Yolo, California
Crosswalk between CDL labels and the classification; methodology development

17 North America Reference Data
Random distribution of samples: vegetation samples appear more often because a larger proportion of the area in this particular scene is vegetation
The error matrix

18 Accuracy Assessment of the GCE v.2 Australia Map
 One-third validation set received: 1,118 ground reference data locations
 The ground data were collected from Zone 3 to Zone 9; some of the locations are not on the GCE map
 Both scale 1 (90 x 90 m) and scale 2 (250 x 250 m) labels are part of the ground reference observations
 Ground data label "Land prepared for season 2": fallow or rainfed cropland??

19 Crosswalk between Ground and Map Labels

GCE v.2 classification scheme:
1. Rainfed Single Crop, all crops
2. Rainfed Single Crop, Pastures
3. Irrigated Single Crop, Double Crop, all crops
4. Irrigated Single Crop, Pastures
5. Irrigated Continuous, Orchards
6. Fallow

Ground data classification labels include: Rainfed Grazing; Rainfed Continuous Orchard; Rainfed Single flowers; Rainfed Plantation; Irrigated single vegetables; Land prepared for season 2; Season 2 Crop; Rainfed/Irrigated Single; No-Crop

Ground classes and their crosswalked categories (ID, class name, category):
1   Alfalfa: Pasture
2   Barley: Crops
3   Beans: Crops
4   Canola: Crops
5   Lentils: Crops
6   Lupin: Crops
7   Oats: Crops
8   Peas: Crops
9   Wheat: Crops
10  Cropland, Irrigated, Continuous, Orchard: Orchard/Continuous crops
11  Cropland, single, sown-pasture: Pasture
12  Cropland, single, land prepared for S2: Crops
13  Crop harvested: Crops
14  Rainfed vegetables: Crops
15  Plantation: Crops
16  Cropland, RF, single, Crop: Crops
20  RF, Grazing: Pasture
30  No crop: Non-crop

Final 6 classes used to generate the error matrix: 1. Rainfed croplands, 2. Rainfed pastures, 3. Irrigated croplands, 4. Irrigated pastures, 5. Cropland, Irrigated, Continuous, Orchard??, 6. Fallow

20 Accuracy Assessment for the SMT- and ACCA-Generated Maps
Crop/no-crop accuracy
The error matrices for 6 classes after truncating the samples (ACCA and SMT); a sketch of how an error matrix and its accuracy measures are computed follows below
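For reference, a minimal R sketch of building an error matrix and the usual accuracy measures from paired map and reference labels, assuming both columns already use the same crosswalked class codes so the matrix is square; the file and column names are assumptions.

```r
# Minimal sketch: error matrix, overall, user's, and producer's accuracy
samples <- read.csv("australia_reference_samples.csv")   # hypothetical table of labeled samples

# Force a common set of class levels so the matrix is square even if a class is missing
levels_all <- sort(union(samples$map_class, samples$ref_class))
map_lab <- factor(samples$map_class, levels = levels_all)
ref_lab <- factor(samples$ref_class, levels = levels_all)

cm <- table(Map = map_lab, Reference = ref_lab)
print(cm)

overall   <- sum(diag(cm)) / sum(cm)        # overall accuracy
users     <- diag(cm) / rowSums(cm)         # user's accuracy (commission error complement)
producers <- diag(cm) / colSums(cm)         # producer's accuracy (omission error complement)
round(cbind(users, producers), 3); overall
```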

21 More Possible Error Matrices
 6 classes + no-crop (ACCA)
 6 classes + no-crop (SMT)
 4 classes (rainfed and irrigated merged), ACCA
 4 classes (rainfed and irrigated merged), SMT
The irrigation map has been used as a mask to label the ground samples

22 90 m Samples Verified over Google Earth
The 90 m samples are not adequate for the 250 m map

23 250 m Samples Verified over Google Earth
 Some samples are very near a road and need to be placed closer to the center of the field
 Placing samples too close together may result in spatial autocorrelation
 Most of the 250 m samples are valid; a few of them need to be revised
 Most of the 90 m samples are good for validating the 30 m map but not for the 250 m map

24 Consider Scale 2: 250 m Samples Only
The error matrices for 7 classes after removing the 90 m samples (ACCA and SMT)

25 Conclusions
 The ground data with a 250 m homogeneous area are mostly good, but the 90 m samples do not cover 250 m of homogeneity in all cases
 A number of error matrices can be generated to present the accuracy; the rows and columns can be reduced or expanded in detail as necessary
 The objective is to generate a valid, balanced, statistically sound error matrix with a proportionally representative number of samples for each class

26 REMEMBER
 Anyone can generate an error matrix with any data.
 Just because there is an error matrix does not mean that there is a valid accuracy assessment.
 We have already provided you with a number of resources, including a full reference data collection document, to help you.
 Our goal is to work with each team to make sure that you are thinking about all the requirements now so that our accuracy assessments are valid.

27 Some Key Topics
 Classification Scheme
 Sample Unit
 Sample Size
 Sampling Scheme
 Spatial Autocorrelation

28 1. Classification Scheme
 Key to any mapping project.
 Must be done at the beginning of the project.
 We have done this for our project, BUT...
 Requirements of the classification scheme:
 Meets the user's needs
 Consists of both labels and rules (definitions) that are mutually exclusive, totally exhaustive, and hierarchical
 Includes a minimum mapping unit
 The issue for us is crosswalking all the different classification schemes used for the various reference data sets we are all using to our map classification scheme.
 This can introduce serious error!

29 2. Sample Unit
 Must consider positional accuracy and the minimum mapping unit (mmu)
 We have selected a homogeneous 3x3 block of TM pixels (90 m x 90 m) as the sample unit for the Landsat accuracy assessment.
 We need at least a single homogeneous MODIS pixel (250 m x 250 m) for the MODIS accuracy assessment.

30 3. Sample Size
 A useful rule of thumb: 50 sample units per map class
 Need to balance the proportion of samples in each map class with ensuring that enough samples are taken per class to know the accuracy of each map class (a small allocation sketch follows below)
 Need enough samples to ensure a good distribution across the map (avoid spatial autocorrelation)
 Samples MUST BE INDEPENDENT of the training data
 We all must keep this in mind; this is the #1 reason for our coordination calls with all the mapping teams
 Need Justin's help here
 If we assess the map of a continent, the results are for the entire continent; if an eco-region or country estimate is needed, the assessment must be done at that level
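A small sketch of balancing proportional allocation against the 50-per-class rule of thumb; the class proportions and total sample budget below are illustrative numbers only, not our actual figures.

```r
# Minimal sketch: proportional sample allocation with a 50-per-class floor
class_area <- c(rainfed_crop = 0.42, rainfed_pasture = 0.18,
                irrigated_crop = 0.25, irrigated_pasture = 0.05,
                orchard = 0.04, fallow = 0.06)       # assumed map proportions (sum to 1)

total_n      <- 600
proportional <- round(class_area * total_n)          # pure proportional allocation
allocation   <- pmax(proportional, 50)               # never fewer than 50 samples per class

rbind(proportional, allocation)
sum(allocation)                                      # total budget after enforcing the minimum
```

Enforcing the floor inflates the total, so in practice the budget is either increased or the proportional shares of the large classes are trimmed back.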

31 4. Sampling Scheme

32 5. Spatial Autocorrelation
 Spatial autocorrelation occurs when the presence, absence, or degree of a certain characteristic affects the presence, absence, or degree of the same characteristic in neighboring units (Cliff and Ord 1973)
 Samples must be adequately spaced apart or they will be spatially autocorrelated (a thinning sketch follows below)
 This is true whether we are collecting reference data on the ground or from very high resolution imagery.
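A minimal sketch of thinning candidate samples so that no two retained points are closer than a minimum separation distance; the 1 km threshold, the file name, and the x/y column names (projected coordinates in metres) are assumptions.

```r
# Minimal sketch: greedy minimum-distance thinning of candidate sample points
pts <- read.csv("candidate_samples.csv")      # hypothetical: columns x, y in metres
min_dist <- 1000                              # keep samples at least 1 km apart (assumed)

d    <- as.matrix(dist(pts[, c("x", "y")]))   # pairwise Euclidean distances
keep <- rep(TRUE, nrow(pts))
for (i in seq_len(nrow(pts))) {
  if (!keep[i]) next
  # Drop any later point that falls within the minimum distance of a kept point
  too_close <- which(d[i, ] < min_dist & seq_len(nrow(pts)) > i)
  keep[too_close] <- FALSE
}
thinned <- pts[keep, ]
nrow(pts); nrow(thinned)                      # how many candidates survive the spacing rule
```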

33 Future Work
 Implement object statistics on the RHSeg results
 Resolve the issue of the differing extents of the converted satellite imagery
 Perform random forest classification on the RHSeg result in R
 Generate pure reference samples from the AAFC cropland layer for Canada
 Generate reference samples from the CDL for North America
 Continue our coordination calls with each mapping team
