1
Data compilation and gaps assessment
MODULE 3: GEOSPATIAL DATA MANAGEMENT
Session 3.4: Implementing the geospatial data management cycle (Part 3): Data compilation and gaps assessment
To better understand this session, you are encouraged to read Health GeoLab Collaborative Guidance Document Part 2.3, Compiling existing data and identifying gaps.
2
Key terms used in this session
Basemap: Collection of GIS data and/or orthorectified imagery that forms the background setting for a map. The function of the basemap is to provide the background detail necessary to orient the location of the map.
Data: Facts and statistics collected for reference or analysis.
Geographic feature: Man-made or naturally created feature of the Earth.
Geographic object: Whereas features are in the real world (mountain, river, church, etc.), geographic objects are computer representations of features.
3
Key terms used in this session
Geospatial data: Also referred to as spatial data. Information about the locations and shapes of geographic features and the relationships between them, usually stored as coordinates and topology.
Master list: An authoritative, complete, up-to-date and uniquely coded list, officially curated by the mandated agency, of all the active (and past active) records for a given geographic object (e.g. health facilities, administrative divisions, villages).
Metadata: Information that describes the content, quality, condition, origin, and other characteristics of data or other pieces of data.
4
Key terms used in this session
Statistical data: Also called attribute data. Nonspatial information about a geographic feature, usually stored in a table that can be attached to a geographic object through the use of a unique identifier (ID).
Thematic layer: A spatial representation of analyzed geospatial data for elements of the same type (health facilities, roads, districts, etc.).
5
Compiling existing data
Once the data needs have been identified and before collecting data in the field, existing data must first be compiled. The defined data specifications and ground reference will then be used as references to assess the compiled data.
By compiling existing data first, you will know what data are already available to you and what data you would need to create to meet all your data needs. Compiling existing data first prevents duplication of efforts, saves time and money, and allows identification of potential gaps.
6
What is needed?
The following must be compiled in order to have a quality dataset:
- The master list for the geographic features considered in the data model
- The thematic layers containing the geospatial representation (geographic objects) for the considered geographic features
- The statistical data to be attached to these features
- Basemaps to serve as ground reference when checking the geospatial data that has been collected
- Metadata
The data that need to be compiled for the different features considered in public health are explained for each type of object in the next slides.
7
What is needed?
1. Fixed objects that can be represented by a point (health facility, village, school, …)
For fixed objects that can be represented by a point, the following must be compiled:
- A complete, up-to-date master list of the object with the unique ID, administrative divisions, and geographic coordinates (latitude/longitude). The geographic coordinates will be used to create a GIS layer (points).
- A statistics/information table that includes the ID of each object. The unique ID is used to join the table to the GIS layer in order to create maps.
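The two steps described for point objects (coordinates become a point layer, the unique ID joins the statistics) can be sketched as follows. This is only an illustration under stated assumptions: the field names (`id`, `lat`, `lon`, `outpatients`), the sample records, and the choice of a GeoJSON-style dictionary as the output are hypothetical and do not come from the session material.

```python
# Hypothetical master list: unique ID, name, admin division, latitude, longitude.
master_list = [
    {"id": "HF001", "name": "Clinic A", "district": "North", "lat": 1.50, "lon": 33.60},
    {"id": "HF002", "name": "Clinic B", "district": "South", "lat": 1.72, "lon": 33.45},
]

# Hypothetical statistics table carrying the same unique ID.
statistics = [
    {"id": "HF001", "outpatients": 120},
    {"id": "HF003", "outpatients": 45},  # ID absent from the master list
]


def build_point_layer(master_rows, stats_rows):
    """Create GeoJSON-style point features and join statistics by unique ID."""
    stats_by_id = {row["id"]: row for row in stats_rows}
    features = []
    for rec in master_rows:
        # Keep every master list field except the coordinates as attributes.
        props = {k: v for k, v in rec.items() if k not in ("lat", "lon")}
        # Join: copy the statistics of the record with the same ID, if any.
        stats = stats_by_id.get(rec["id"], {})
        props.update({k: v for k, v in stats.items() if k != "id"})
        features.append({
            "type": "Feature",
            # GeoJSON stores coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [rec["lon"], rec["lat"]]},
            "properties": props,
        })
    # Statistics rows whose ID is not in the master list cannot be mapped.
    master_ids = {rec["id"] for rec in master_rows}
    unmatched = sorted(set(stats_by_id) - master_ids)
    return {"type": "FeatureCollection", "features": features}, unmatched


layer, unmatched_ids = build_point_layer(master_list, statistics)
```

Returning the unmatched IDs alongside the layer is a design choice of this sketch: statistics that cannot be joined are a sign of a gap between the table and the master list.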
8
What is needed?
2. Objects that can be represented by a line or polygon to which information/statistics can be linked (administrative boundaries, roads, rivers, …)
For objects that can be represented by a line or polygon to which information/statistics can be linked, the following must be compiled:
1. A complete, up-to-date master list of the object with the unique ID
2. Thematic layers (e.g. shapefiles) containing the geospatial representation (geographic objects) for the considered geographic features, with the unique ID included in the attribute table
3. A statistics/information table that includes the ID of each object
As with point features, the unique ID is used to join the statistical data to the GIS layer in order to create maps.
9
What is needed?
3. “Mobile” objects (patients, health workers, vehicles, …)
For “mobile” objects, the following must be compiled:
1. A complete, up-to-date master list of the object with the unique ID and fields to capture the place of care, of residence, etc. through the respective unique IDs of those places (e.g. the unique ID of the village)
2. A statistics/information table that includes the ID of each object
You might also want to track some of these objects. This can be done through the use of GPS receivers.
10
What is needed?
4. Continuous geospatial data (Digital Elevation Model (DEM), land cover, …)
For continuous geospatial data, the following must be compiled:
- The GIS layer
- The classification table, if not already included in the layer itself
11
What is needed? Basemaps
- File with a selection of layers combined into a single raster or vector file
- Used as the bottom layer/background for a map
- Examples: Google Maps, Esri via ArcGIS Online, OpenStreetMap
Basemaps provide the background detail necessary to orient the location of the map. They are used for visualization purposes only.
12
What is needed? Metadata
The corresponding metadata should also be collected along with the data (geospatial and statistical data), as it will be used later to assess the appropriateness of the compiled data. At a minimum, the metadata should answer the following questions in order to be useful:
- What is the data about?
- Who created the data?
- When was it created/collected/last updated?
- How was the data created?
- What are the data specifications (geographic coordinate/projection system, scale, accuracy, language, etc.)?
- Are there any use or redistribution restrictions attached to the data?
- Who can I contact if I have questions about the data?
The metadata contains the information about the data, and this information is important in assessing the appropriateness of the data against the identified data needs and data set specifications. It should therefore always be collected along with the data. If the metadata is not directly attached to the data file itself, it should be collected separately and kept together with the data (e.g., in the same folder).
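The minimum metadata questions listed above can be turned into a simple completeness check. The sketch below is an assumption-laden illustration: the field names (`description`, `creator`, `date`, `method`, `specifications`, `restrictions`, `contact`) are hypothetical labels for the seven questions and do not follow any particular metadata standard.

```python
# Hypothetical field names, one per minimum metadata question.
REQUIRED_METADATA_FIELDS = [
    "description",     # What is the data about?
    "creator",         # Who created the data?
    "date",            # When was it created/collected/last updated?
    "method",          # How was the data created?
    "specifications",  # Coordinate/projection system, scale, accuracy, language
    "restrictions",    # Use or redistribution restrictions
    "contact",         # Who to contact with questions
]


def missing_metadata_fields(metadata):
    """Return the required fields that are absent or blank in a metadata record."""
    missing = []
    for field in REQUIRED_METADATA_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical metadata record with two questions left unanswered.
example_metadata = {
    "description": "Health facility locations",
    "creator": "Ministry of Health",
    "date": "2019",
    "method": "GPS survey",
    "specifications": "WGS 84, decimal degrees",
    "restrictions": "",
}

gaps = missing_metadata_fields(example_metadata)
```

A non-empty result flags a dataset whose appropriateness cannot be fully assessed until the missing answers are obtained from the source.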
13
Sources of data
Where should you look for the needed master lists? Master lists should only come from government entities which have the official mandate over the considered geographic feature(s). The table on the slide lists, for the geographic features most often used in public health, the governmental entities generally in charge of the master list and associated thematic layer:
- Health facilities: Ministry of Health, NGO (WHO)
- Health districts or other reporting divisions
- Administrative divisions and villages: Ministry of Interior, National Statistical Agency, National Mapping Agency
- Transportation network (master list not necessary): Ministry of Public Works, Ministry of Transportation
- Hydrographic network: Ministry of Environment/Agriculture
- Climate data (temperature, precipitation, etc.) (master list not applicable): Ministry of Meteorology, Meteorological agency
- Digital Elevation Model (DEM): National Mapping Agency
- Land cover: National Mapping Agency, Ministry of Environment/Agriculture
14
Sources of data
What about other data, that is, geospatial and statistical data and basemaps? Other sources can be considered for these, depending on the needs identified at the beginning of the process and on their availability. It is important to consider all of these sources, as they might be complementary and subject to different use and redistribution rights.
By considering all the potential sources, you may be able to combine some of the compiled data into a complete set. This will also allow you to assess the different use and redistribution rights of the compiled data. Later slides discuss how these rights can affect the way you use the compiled data.
15
Potential sources of data
Aside from governmental entities, other potential sources of local, regional or global geospatial and statistical data are:
- NGOs (UN, …) and volunteer groups (e.g. OSM): administrative boundaries, road network, hydrographic network, populated places, …
- Research groups/universities: population distribution grids, land cover
- Other types of institutions: satellite images
- Private sector, including GIS software companies (e.g. Esri): basemap layers
16
Potential sources of data
Online free shapefiles (for download)
An internet search will turn up hundreds of websites offering free shapefiles. Here are just some examples:
- DIVA GIS
- United Nations: wide range of spatial data (not all freely accessible)
- Global Administrative Unit Layers (need to apply for access)
- OpenStreetMap: openstreetmap.org, openstreetmapdata.com
17
Potential sources of data
Online free shapefiles (for download) (continued)
- ISCGM: government data
- Natural Earth
- GADM (Global Administrative Boundaries)
18
Potential sources of data
Online free population data (for download)
- Worldpop
- GEOHIVE
- Gridded Population of the World v4
- Socioeconomic Data and Applications Center (SEDAC)
Here are examples of potential sources for population data, satellite images, and other raster data.
19
Potential sources of data
Free satellite images and other raster data (for download)
- Global Land Cover 30
- GLCF: Landsat, Aster, SRTM, forest cover, etc.
- CGIAR: SRTM 90m
- Global Forest Change 2000–2018 Data Download: global-forest/download_v1.6.html
- NASA Satellite Data
20
Organizing compiled data
Once the available existing data are compiled, they must be organized on a computer in such a way that they are easy for you or other people to find. The suggested folder organization structure is the following:
1. Data category, corresponding to the different geographic features being collected (health facilities, administrative divisions, Digital Elevation Model (DEM))
21
Organizing compiled data
2. Data type. Four main types are generally considered:
a. DOCUMENTS: for reports, publications and other narrative documents
b. GIS: for thematic layers saved in a GIS-compatible format (shapefile, GeoJSON, raster, etc.)
c. MAPS: for maps saved in PDF, MS Word, or image formats
d. TABLES: for any data saved in a tabular format (Excel, csv, dbf, etc.)
22
Organizing compiled data
3. Data source, with one folder for each separate source and the corresponding data saved in each of these folders. When known, the year of data production is mentioned together with the source in the folder name.
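The three folder levels (data category, data type, data source) can be sketched as a small path builder. The root folder name, the category and source labels, and the `SOURCE_YEAR` naming pattern used here are assumptions made for illustration. The session only says that the year is mentioned together with the source when known.

```python
from pathlib import PurePosixPath

# The four data types suggested in the session.
DATA_TYPES = ("DOCUMENTS", "GIS", "MAPS", "TABLES")


def data_folder(root, category, data_type, source, year=None):
    """Build the folder path: root / data category / data type / data source."""
    if data_type not in DATA_TYPES:
        raise ValueError(f"Unknown data type: {data_type}")
    # Assumed convention: append the production year to the source when known.
    source_folder = f"{source}_{year}" if year is not None else source
    return PurePosixPath(root) / category / data_type / source_folder


# Hypothetical examples.
with_year = data_folder("DATA", "HEALTH_FACILITIES", "GIS", "MOH", 2019)
without_year = data_folder("DATA", "ADMIN_DIVISIONS", "TABLES", "NSO")
```

`PurePosixPath` only manipulates the path text and does not touch the disk, so the same function can be used to plan a folder structure before creating it.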
23
Assessing compiled data
The compiled and organized data must be assessed in order to identify whether at least one source has been found for each of the data needs identified at the beginning of the process.
Check all compiled data against the list of data needs. Ideally, there should be at least one source for each of the needed data. If not, make a list of the data that it has not been possible to find.
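Checking the compiled data against the list of data needs amounts to listing every need that has no source. A minimal sketch, in which the data needs and source names are hypothetical:

```python
def find_missing_data(data_needs, compiled_sources):
    """Return the data needs for which no source has been compiled."""
    return [need for need in data_needs if not compiled_sources.get(need)]


# Hypothetical list of data needs identified at the beginning of the process.
data_needs = ["health facilities", "administrative divisions", "road network"]

# Hypothetical compilation result: data need -> list of sources found.
compiled_sources = {
    "health facilities": ["MOH 2019"],
    "administrative divisions": ["NSO 2018", "OSM"],
    "road network": [],
}

missing = find_missing_data(data_needs, compiled_sources)
```

Needs that are absent from the dictionary and needs with an empty list of sources are both reported as missing.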
24
Assessing compiled data
The assessment must also identify which source(s) is/are of the best quality according to the six (6) dimensions (Completeness, Uniqueness, Timeliness, Validity, Accuracy, Consistency):
- Do they comply with the defined data specifications?
- Are they consistent with the ground references?
Using the data specification document and the ground references (remote sensing imagery and master lists), check the compiled data for completeness, uniqueness, timeliness, validity, accuracy, and consistency. If the compiled data do not comply with the data set specifications or are not consistent with the ground references, you may look for other sources, fill the identified gaps, or make do with what you have.
Example of geospatial data specifications document: MOHS Myanmar
25
Criteria for assessing compiled data
Criteria for assessing the compiled data across the 6 dimensions of data quality
This table (found in Annex 1 of the HGLC guidance document volume 2.3, Compiling existing data and identifying gaps) contains the criteria that can be used to assess the compiled data according to the 6 dimensions of data quality. For example, to assess the completeness of vector geospatial data, check that the data contains all the records reported in the master list for the time period considered for the project. By using this table, you will have a better idea of the quality of the data you have compiled.
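The completeness example above (all master list records present in the layer) can be sketched by comparing unique IDs. The same comparison also surfaces duplicated IDs, which relates to the uniqueness dimension. The IDs below are hypothetical, and this sketch covers only these two checks, not the full set of criteria in Annex 1.

```python
from collections import Counter


def check_ids(master_ids, layer_ids):
    """Compare the IDs in a layer's attribute table with the master list IDs."""
    counts = Counter(layer_ids)
    return {
        # Completeness: master list records with no object in the layer.
        "missing": sorted(set(master_ids) - set(counts)),
        # Objects in the layer that the master list does not know about.
        "unexpected": sorted(set(counts) - set(master_ids)),
        # Uniqueness: IDs used by more than one object in the layer.
        "duplicated": sorted(i for i, n in counts.items() if n > 1),
    }


# Hypothetical IDs from a master list and from a compiled layer.
report = check_ids(
    master_ids=["D01", "D02", "D03"],
    layer_ids=["D01", "D02", "D02", "D09"],
)
```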
26
Possible issues when assessing data
While assessing the data, you might come across issues such as:
Temporal discrepancies: Data collected at different points in time can correspond to different geographies. In the example, notice that the district of Soroti has been divided several times over the years to form other districts. Therefore, statistical data for Soroti from different years do not correspond to the same geography. Great care must be taken to attach the statistical data to the correct geography.
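One way to guard against temporal discrepancies is to pick, for each statistical table, the boundary layer version that was in force in the year the statistics refer to. The sketch below assumes that each version of the layer is labelled with the year from which it applies. The years are hypothetical and are not the actual dates of the Soroti subdivisions.

```python
from bisect import bisect_right


def boundary_version_for(stats_year, version_years):
    """Return the most recent boundary version in force in stats_year, or None."""
    years = sorted(version_years)
    # Number of versions whose starting year is <= stats_year.
    idx = bisect_right(years, stats_year)
    if idx == 0:
        return None  # The statistics predate every available boundary layer.
    return years[idx - 1]


# Hypothetical years in which a new version of the district layer took effect.
available_versions = [2000, 2005, 2010]

version_2008 = boundary_version_for(2008, available_versions)
version_1999 = boundary_version_for(1999, available_versions)
```

A `None` result signals a gap: no compiled boundary layer matches the geography of the statistics, so they should not be joined to a later layer without further checks.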
27
Possible issues when assessing data
Other issues that may arise during the assessment:
Lack of documentation: The metadata is missing or incomplete, which can lead to technical issues or make it impossible to determine whether the data is appropriate for your use based on the data specifications. It can also lead to the use of data that was not meant to be used or shared in the first place: if you use data without knowing its use and redistribution rights, you may run into problems with the owner.
Big gaps in authoritative data compared to other sources: The gaps in the authoritative data from the government might be too big, compared to other sources, for the authoritative data to be considered.
28
Result of data assessment
At the end of the assessment, you might find that some or all of the compiled data:
- May not be of sufficient quality: they do not comply with the data specifications and/or are not consistent with the ground references. This can lead to low-quality data products.
- May have use restrictions: you can use the data only under some conditions, or cannot use them at all.
- May have sharing restrictions: you can use the data to make a map but cannot share the data with a third party.
This is why it is always important to have the metadata, as it will inform you of these restrictions.
You might also find that you have gaps in your data at the end of the assessment process.
29
Addressing the data gaps
If after the assessment you find that you have gaps in your data, the following options should be considered:
- Look for additional sources that might have been missed during the first round of data compilation
- Identify whether combining different sources of data could help cover the gap(s)
Gaps may remain in your data set even after looking for additional sources or combining different sources. If this is the case, the remaining gaps should be properly documented and mentioned not only in the metadata profile but also on any maps created using this data.
30
Addressing the data gaps
How do you address these remaining gaps when possible? The next session, Implementing the geospatial data management cycle (Part 4): Data collection and extraction, discusses how to extract data and collect data in the field to fill the data gaps.