Published by Patrik Jensen. Modified over 5 years ago.
1
MODULE 3: GEOSPATIAL DATA MANAGEMENT
Session 3.2: Implementing the geospatial data management (Part 1): Documenting the process and defining the data needs
To better understand this session, you are encouraged to read Health GeoLab Collaborative Guidance Document Part 2.1, Documenting the process and defining the data needs:
2
Key terms used in this session
Here are some key terms that are used in this session.
Data: Facts and statistics collected for reference or analysis.
Geographic data: Information describing the location and attributes of things, including their shapes and representation. Geographic data is the composite of spatial data and attribute data.
Geographic feature: Man-made or naturally created features of the Earth.
Geographic object: Whereas features are in the real world (mountain, river, church, etc.), geographic objects are computer representations of features.
3
Key terms used in this session
Geographic Information: Spatial and/or geographic data organized and presented to create some value and to answer questions.
Geographic Information System (GIS): An integrated collection of computer software and data used to view and manage information about geographic places, analyze spatial relationships, and model spatial processes.
Geospatial data: Also referred to as spatial data. Information about the locations and shapes of geographic features and the relationships between them, usually stored as coordinates and topology.
Statistical data: Also referred to as attribute data. Non-spatial information about a geographic feature, usually stored in a table that can be attached to a geographic object through the use of a unique identifier or ID.
4
Documenting the process
The geospatial data management cycle comprises several steps. Implementing these steps takes a long time and may involve different individuals.
Documenting each step from the beginning, as precisely as possible, ensures that the process can be replicated.
5
Documenting the process
How are the steps documented?
Some steps in the geospatial data management cycle require only simple documentation, such as describing the choices made and why. Examples are defining the vocabulary and defining the data set specification and ground reference.
Other steps require a lengthier description of the processes and other elements involved:
Compiling existing data
Collecting or extracting data
Cleaning and validating data
Using the data
For example, documenting the collection of data covers the method and technology of data collection, the preparation made to implement the process, the data collection team, etc.
6
Documenting the process
Example of a report documenting the process.
The report on the geographic accessibility analysis done by the World Health Organization (WHO) to support maternal and newborn health in Cambodia is an example. All the elements and processes involved (indicators, targets, assumptions, tools, analyses, data, and norms) are fully documented in the report. You can access this document:
A well-documented report such as this allows someone else to obtain the same results.
7
Defining the data needs
How do you define the data needs?
With the documentation of the whole data management cycle in mind, the process of acquiring the data for the data products can begin by defining the data needs.
Start by making a list of all the data, geospatial or statistical, that are needed to address the pre-defined objectives.
This is an important process because the amount of data you need gives you an idea of the scope of your project.
8
Geospatial data
When making a list of data, consider the different features of interest in Public Health. These features need to be translated into geographic objects, which are computer representations of real-world objects on a map.
As the priorities of Public Health exist in the real world, there needs to be a way to represent them on a map in order to analyze and process them. Think of how real-world objects such as a health facility, a village, or a province can be captured in GIS and represented on a map.
The next slides show the features considered in Public Health and how they are captured in GIS.
9
Geospatial data
What are the features considered in Public Health? How are they captured in GIS?
There are four main groups of features considered in Public Health when looking at how they would be captured in GIS:
Fixed features represented as a point
Fixed features represented as a line or polygon
Mobile features
Continuous features
(Refer to the diagram in the slide. Each group is explained further in the next slides.)
10
Geospatial data
These features are:
Fixed, and for which the geography can be simplified by a point (examples: household, health facility, village when boundaries are not available, ...). The geography of these objects is obtained through their geographic coordinates.
Fixed as well, but for which the geography has to be represented by polygons due to their much larger extent (examples: administrative divisions, health districts, ...) or by a line (examples: road, river, ...).
11
Geospatial data
Mobile (examples: individuals, patients, vehicles, ...). The geography of these objects is obtained either by considering them attached to a fixed object or by simplifying them as a point located through its geographic coordinates (latitude and longitude) taken at a given time.
Continuous: some elements of our environment are not defined objects per se and are not associated with one specific location, but are rather distributed spatially. These are better represented using a continuous surface (e.g. terrain, land surface attributes, population distribution).
12
Types of geospatial data
These features are represented as either vector or raster format geospatial data.
Vector format (shapefile)
As mentioned in the previous slides, some features can be represented as a point, line, or polygon. These are vector format geospatial data. Vector is a coordinate-based data model of which there are three types:
Point: The simplest type, a geometric element defined by a pair of x,y coordinates. A point feature is a map feature that has neither length nor area at a given scale, such as a city on a world map or a building on a city map.
Line: A shape defined by a connected series of unique x,y coordinates. A line may be straight or curved. A line feature is a map feature that has length but not area at a given scale, such as a river on a world map or a street on a city map.
Polygon: A closed shape defined by a connected sequence of x,y coordinate pairs (lines), where the first and last coordinates are the same and all the other pairs are unique. A polygon feature is a map feature that bounds an area at a given scale, such as a country on a world map or a district on a city map.
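The three vector types can be sketched as plain coordinate structures. This is only an illustration, not how a shapefile stores data: the feature names and coordinate values are hypothetical, and the helper `is_closed_ring` is an assumed name that checks the polygon rule stated above (first and last coordinates identical, all others unique).

```python
from typing import List, Tuple

Coordinate = Tuple[float, float]  # (x, y), for example (longitude, latitude)

# Hypothetical example features; the coordinate values are illustrative only.
health_facility: Coordinate = (104.92, 11.56)  # point: one x,y pair

road: List[Coordinate] = [  # line: connected series of x,y pairs
    (104.90, 11.55),
    (104.95, 11.57),
    (105.00, 11.60),
]

district: List[Coordinate] = [  # polygon: closed sequence of x,y pairs
    (104.8, 11.4),
    (105.1, 11.4),
    (105.1, 11.7),
    (104.8, 11.7),
    (104.8, 11.4),  # repeats the first pair to close the shape
]


def is_closed_ring(ring: List[Coordinate]) -> bool:
    """Return True if the first and last pairs match and all others are unique."""
    if len(ring) < 4 or ring[0] != ring[-1]:
        return False
    interior = ring[:-1]  # drop the closing pair before checking uniqueness
    return len(set(interior)) == len(interior)
```

Calling `is_closed_ring(district)` returns `True`, while the open `road` sequence is rejected.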
13
Types of geospatial data
Raster format (GeoTIFF, GRID)
The raster format is a spatial data model that defines space as an array of equally sized cells arranged in rows and columns (such as the third image in the slide) and composed of single or multiple bands. Each cell contains an attribute value and location coordinates. Unlike a vector structure, which stores coordinates explicitly, raster coordinates are contained in the ordering of the matrix.
Continuous features such as a Digital Elevation Model (DEM) or land cover are represented as raster format data.
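To illustrate how raster coordinates follow from the ordering of the matrix, the sketch below derives the center coordinate of a cell from its row and column. It assumes a north-up grid with square cells whose origin is the top-left corner; the function name `cell_center`, the parameter names, and the elevation values are hypothetical.

```python
from typing import List, Tuple

# Hypothetical single-band raster: 2 rows x 3 columns of elevation values.
elevation: List[List[int]] = [
    [120, 125, 130],
    [118, 122, 127],
]


def cell_center(
    row: int, col: int, x_min: float, y_max: float, cell_size: float
) -> Tuple[float, float]:
    """Return the (x, y) center of a cell in a north-up grid.

    Assumes (x_min, y_max) is the top-left corner of the raster, so x grows
    with the column index and y shrinks with the row index.
    """
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return (x, y)


# The value stored at row 1, column 2 and the location implied by its position.
value = elevation[1][2]
location = cell_center(1, 2, x_min=100.0, y_max=20.0, cell_size=1.0)
```

No coordinate is stored per cell: only the corner, the cell size, and the position in the array are needed.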
14
Statistical data
Statistical data, or attribute data, are non-spatial information about a geographic feature, usually stored in a table that contains programmatic information. These data can be attached to a geographic object through the use of a unique identifier or ID, so it is important for statistical data to contain the unique identifier of the objects that the data are about.
Examples of statistical data are:
Number of doctors in hospitals
Population count of provinces or districts
Number of positive malaria cases reported in a health facility
15
Linking geospatial and statistical data
[Diagram: map objects (Country; Province/Municipality; District/Town/City; wards/communes/townships; Villages; Health facility; Patient) linked through a unique identifier to attributes (health program specific statistics/information). Other labels: Common assets, Master lists, Table/graph, Information.]
The statistical data are linked to the corresponding geographic object through the unique identifier or ID of each object. The official list of unique identifiers can be found in the master list of each object (more on master lists in Session 3.3).
When geospatial data and statistical data are combined, the result is geographic data. When geographic data is organized and presented to create some value and to answer questions, it becomes geographic information.
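A minimal sketch of this link, assuming the geographic objects and the statistical records are held as plain Python dictionaries. The identifiers, field names, values, and the function `link_statistics` are all hypothetical; in practice a GIS or database join plays the same role.

```python
from typing import Dict, List, Tuple

# Hypothetical geographic objects keyed by their unique identifier.
facilities: Dict[str, dict] = {
    "HF001": {"name": "Facility A", "lon": 104.92, "lat": 11.56},
    "HF002": {"name": "Facility B", "lon": 104.99, "lat": 11.61},
}

# Hypothetical statistical records; each carries the identifier of its object.
malaria_cases: List[dict] = [
    {"facility_id": "HF001", "positive_cases": 12},
    {"facility_id": "HF002", "positive_cases": 3},
    {"facility_id": "HF999", "positive_cases": 7},  # ID absent from the list
]


def link_statistics(
    objects: Dict[str, dict], records: List[dict], id_field: str
) -> Tuple[Dict[str, dict], List[str]]:
    """Attach each record to the object sharing its ID; report unmatched IDs."""
    linked: Dict[str, dict] = {}
    unmatched: List[str] = []
    for record in records:
        object_id = record[id_field]
        if object_id in objects:
            attributes = {k: v for k, v in record.items() if k != id_field}
            linked[object_id] = {**objects[object_id], **attributes}
        else:
            unmatched.append(object_id)
    return linked, unmatched
```

Records whose identifier is missing from the object list are returned separately rather than dropped silently, since an unmatched ID usually points to a coding problem in one of the two sources.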
16
Developing the data model
While making the list of needed data, it is useful to look at how they relate to one another in the final database. This can be done through the development of a data model.
A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world.
17
Developing the data model
A data model has three (3) main levels (1). They can be differentiated as:
Conceptual data model, which identifies the highest-level relationships between the different entities/objects (2)
Logical data model, which describes the data in as much detail as possible, without regard to how they will be physically implemented in the database
Physical data model, which represents how the model will be built in the database
1
2 Refers to any person, place, or thing that data can represent on a map
18
Developing the data model
Each of these models fulfils a different function and contains different features.
[Table: features included in each data model. Columns: Conceptual, Logical, Physical. Rows: Entity names, Entity relationships, Attributes, Primary keys, Foreign keys, Table names, Column names, Column data types.]
Attributes: Data or information attached to a particular entity/object.
Primary key: Key in a relational database that is unique for each record.
Foreign key: Set of one or more columns in a table that refers to the primary key in another table.
Table name: Name of the table containing the information about a particular entity/object.
Column data type: Format in which the data/information is captured for each field in the table (integer, character, date, ...).
The next slides present examples of these data models.
19
Conceptual data model
Example of conceptual data model: Malaria elimination
Different shapes are used to differentiate between:
Features for which a master list (official, complete, up-to-date, and uniquely coded list) is needed [2]. These are represented by rectangles.
Features for which a master list is not needed or applicable due to their continuous nature. These are represented by ovals.
Groups of attributes, which are represented by white parallelograms. While generally not included in a conceptual data model (Table 1), they have been included here because they add value to the overall model without overloading it.
Colors are used to differentiate between entities/objects that relate to:
Health (in blue)
Spatially distributed malaria hazard (in orange)
Malaria risk mapping and elimination (in grey)
Arrows indicate a relationship between two features or between a feature and a group of attributes. This relationship can be of different types, namely:
Geographic (for example, a health facility is located within an administrative division)
Network based (a laboratory is working with a health facility)
Analytical (the spatial distribution of the vector habitat is obtained by combining the spatial distribution of water bodies, topography, land cover, temperature, and rainfall)
Only the first two types of relationships will be captured in the final database structure.
20
Logical data model
Example of logical data model, derived from the Malaria elimination conceptual data model
This particular model mainly applies to data stored partly or fully in tabular form. Continuous geospatial data such as the Digital Elevation Model (DEM) or land cover, which do not contain a table, therefore remain represented as they are in the conceptual model.
The logical model also contains attributes as well as primary and foreign keys. Relationships between entities/objects are specified using primary keys (PK) and foreign keys (FK), which makes explicit which attributes are used for each relationship.
21
Physical data model
Example of physical data model, derived from the Malaria elimination logical data model
The final step in the process consists of expanding and slightly modifying the logical data model so that it becomes a physical data model, which will be used to build the database itself. This is done by applying the following to the logical model:
Converting entities into specific table names
Converting attributes into column names
Specifying the data type for each column
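As a rough sketch of what a physical data model can turn into, the example below builds two tables in an in-memory SQLite database using Python's standard `sqlite3` module. The table names, column names, data types, and sample rows are hypothetical and are not taken from the Malaria elimination model. It only shows entities becoming tables, attributes becoming typed columns, and a geographic relationship expressed through a primary key and a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    -- Entity "administrative division" becomes a table with typed columns.
    CREATE TABLE admin_division (
        admin_id   TEXT PRIMARY KEY,  -- unique identifier from the master list
        admin_name TEXT NOT NULL
    );

    -- Entity "health facility"; the foreign key records which division
    -- the facility is located within.
    CREATE TABLE health_facility (
        facility_id   TEXT PRIMARY KEY,
        facility_name TEXT NOT NULL,
        latitude      REAL,
        longitude     REAL,
        admin_id      TEXT NOT NULL REFERENCES admin_division(admin_id)
    );
    """
)

# Hypothetical sample rows.
conn.execute("INSERT INTO admin_division VALUES ('AD01', 'Example District')")
conn.execute(
    "INSERT INTO health_facility VALUES "
    "('HF001', 'Example Health Centre', 11.56, 104.92, 'AD01')"
)

# Join the two tables through the key pair to list facilities per division.
rows = conn.execute(
    """
    SELECT f.facility_name, a.admin_name
    FROM health_facility AS f
    JOIN admin_division AS a ON a.admin_id = f.admin_id
    """
).fetchall()
```

With foreign keys enabled, inserting a facility whose `admin_id` does not exist in `admin_division` raises `sqlite3.IntegrityError`, which helps keep the database consistent with the master lists.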
22
Evolution of the data models
How should the data models evolve?
The development of the data models should not stop after their initial creation. The different data models should evolve as the project implementation progresses and should also be improved, completed, and potentially updated through a consultative process among involved stakeholders.