Introduction to GIS: Data Management
CGIS-NUR, Introduction to ArcGIS I

Lesson 2 overview
- Geographic phenomena
- Spatial data presentation
- Data management in ArcCatalog
- Data in ArcMap

Geographic phenomena (1)
Geographic phenomena exist in the real world. A geographic phenomenon is a manifestation of an entity or process that:
- can be named
- can be georeferenced (it is geographic)
- can be assigned a time at which it is or was present
There are different types of phenomena. By learning to recognize these types, we can select the correct way to store them for use in a GIS.

Geographic phenomena (2)
Geographic phenomena are the study objects of a GIS. They exist in the real world: everything you see outside is a geographic phenomenon. Some things you cannot see, such as temperature, are geographic phenomena too.
[Figure labels: air temperature, shoreline, soil type, rocks, elevation, water temperature]

Geographic phenomena (3)
We need a digital representation of each geographic phenomenon in order to store it in a GIS. This is not easy, because different phenomena require different digital representations, and multiple representations are possible for the same phenomenon.
[Figure labels: tessellation, isolines, TIN]

Geographic phenomena (6)
There are two types of geographic data: discrete data and continuous data.
- In continuous data the underlying function is assumed to be continuous: all changes in field values are gradual (for example, elevation).
- Discrete data cuts the study space into mutually exclusive, bounded parts, with all locations in one part having the same field value (for example, land use).

Geographic phenomena (8)
Continuous means that all changes in field values are gradual. In a differentiable field we can also measure the change. In the elevation example shown on the slide, the gradient (slope) is measured as the change of elevation over distance.
[Figure label: slope]

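A minimal sketch of measuring the gradient of an elevation field stored on a regular grid. The elevation values, the 10 m cell size, the function name `slope_percent` and the central-difference scheme are all assumptions made for illustration; they do not come from the slides.

```python
import math

def slope_percent(elev, row, col, cell_size):
    """Slope at an interior cell, in percent, from central differences."""
    # Change in elevation per meter along the columns (x) and the rows (y).
    dz_dx = (elev[row][col + 1] - elev[row][col - 1]) / (2 * cell_size)
    dz_dy = (elev[row + 1][col] - elev[row - 1][col]) / (2 * cell_size)
    return 100 * math.hypot(dz_dx, dz_dy)

# Hypothetical elevations in meters; the terrain rises steadily from left to right.
elevation = [
    [100, 102, 104],
    [100, 102, 104],
    [100, 102, 104],
]

print(slope_percent(elevation, 1, 1, 10.0))
```

Here the elevation rises 4 m over 20 m horizontally, so the centre cell has a slope of about 20 percent.
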
Geographic phenomena (9)
Discrete fields cut the study space into subparts with clear boundaries, with all locations in one part having the same value. Typical examples are land classifications, geological classes, soil types, land use types, crop types and natural vegetation types.
[Figure labels: forest, agriculture, road]

Geographic phenomena (10)
Objects are discrete and bounded entities. The space between objects is potentially "empty" or "undetermined". It is not really empty: it may contain roads, gardens, driveways and so on, but these are not houses or buildings.
[Figure labels: house, house, no house: empty]

Geographic phenomena (11)
The position of an object in space is determined by a combination of one or more of the following parameters:
- Location (where is it?)
- Shape (what form?): point, line, polygon (volume feature)
- Size (how big?)
- Orientation (direction?)
Example: the river is an object with a location, a shape and a direction.

Boundaries (1)
Two different types of boundaries:
- Crisp boundaries
- Fuzzy boundaries
[Figure labels: crisp boundary, fuzzy boundary]

Data model and data structure (1)
So far we have only discussed geographic phenomena; in the following sections we discuss computer representations. Computer representations can be divided into two groups: tessellations (tilings), also called the raster data model, and the vector data model. The next step is to understand how these data models can be applied to represent geographic fields and objects.
Data structure: data structures provide the information that the computer requires to reconstruct the spatial data model in digital form. The diversity of data structures makes exchanging spatial data very difficult.

Data model and data structure (2)
[Figure: the same scene (house, lake, river, forest) represented in the raster data model and in the vector data model]

Spatial data model: Raster model (1)
In the raster data model, individual cells are the building blocks of the map. The cells all have the same shape and size, and the field attribute value assigned to a cell is associated with the entire area occupied by that cell.
Notes on the phrase "partition of space into pairwise disjoint cells":
- Partition means to divide.
- Pairwise disjoint means that no two cells overlap: each cell is separate from its neighbours.
- Important: the raster covers the total study area, so it cannot be used for objects, only for fields (a cell can hold "no data").
- Grid should only be used for points.
[Figure labels: reality, building, road, field]

Spatial data model: Raster model (2)
- Pixel or cell: a square representing a specific portion of an area, always of the same size.
- Rows and columns: a Cartesian matrix. Each cell has a unique row/column address.
- Values: one value per pixel.
[Figure labels: origin, pixel (cell), column 3, line 9, value]

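As a rough illustration of the row/column addressing described above, the sketch below stores a raster as a list of rows and looks up the single value held by one cell. The grid values and the helper name `cell_value` are hypothetical, and zero-based indices counted from a top-left origin are an assumption.

```python
# A tiny raster: a list of rows, each row a list of cell values.
# Hypothetical class codes: 1 = field, 2 = road, 3 = building.
raster = [
    [1, 1, 2, 1],
    [1, 3, 2, 1],
    [1, 3, 2, 1],
]

def cell_value(grid, row, col):
    """Return the one value stored at the (row, col) address."""
    return grid[row][col]

n_rows, n_cols = len(raster), len(raster[0])
print(n_rows, n_cols)            # grid dimensions
print(cell_value(raster, 1, 1))  # value of the cell in row 1, column 1
```
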
Raster data model: Examples
- Satellite imagery
- Classified image
- Scanned image
- Aerial photography

Spatial data model: Vector model (1)
- A vector data model uses two-dimensional Cartesian (x, y) coordinates to store the shape of a spatial entity.
- In the vector world the point is the basic building block from which all spatial entities are constructed; it is also the simplest spatial entity.
- Lines and polygons are constructed by connecting a series of points into chains and polygons.
- The more complex the shape, the greater the number of points needed.

Spatial data model: Vector model (2)
Points are defined as single coordinate pairs (x, y) when we work in 2D, or as coordinate triplets (x, y, z) when we work in 3D. Points are best used to represent single-locality features that are treated as having no shape or size.
[Figure: points representing trees along a road]

Spatial data model: Vector model (3)
Line representations are used to represent one-dimensional objects (roads, railroads, canals, rivers...). A line is defined by two end nodes and zero or more internal nodes that define its shape. An internal node, or vertex, is like a point that only serves to define the line.
[Figure labels: begin node, vertex, line or arc, end node]

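A minimal sketch of how points and lines might be held as coordinate lists, following the description above. The coordinates and the names `tree`, `road` and `line_length` are hypothetical; real GIS formats store more than this.

```python
import math

# A point is a single coordinate pair (x, y).
tree = (2.0, 1.0)

# A line is an ordered list of coordinate pairs:
# begin node, zero or more vertices, end node.
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

def line_length(coords):
    """Sum the straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

begin_node, end_node = road[0], road[-1]
vertices = road[1:-1]  # internal nodes that only shape the line
print(begin_node, vertices, end_node)
print(line_length(road))
```
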
Spatial data model: Vector model (4)
Area representations: when area objects are stored using a vector approach, the usual technique is to apply a boundary model. The area is defined by its boundary, and it is the boundary that is stored.

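To illustrate that storing the boundary is enough to describe the area, the sketch below computes the enclosed area from boundary points alone, using the shoelace formula. The square `parcel` and the function name `ring_area` are hypothetical.

```python
def ring_area(ring):
    """Area enclosed by a boundary given as a list of (x, y) points."""
    closed = ring + [ring[0]]  # repeat the first point to close the boundary
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(closed, closed[1:]))
    return abs(twice_area) / 2

# Hypothetical 10 m by 10 m parcel, stored only as its boundary.
parcel = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(ring_area(parcel))
```
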
Spatial data model: Vector model (5)
A simple but naïve representation of area features would list, for each polygon, the lines that describe its boundary. This is called a polygon-by-polygon representation. Each line in the list is a sequence of points that starts with a node and ends with a node.
[Figure label: total boundary of the polygon]

Spatial data model: Vector model (6)
This is not a good representation because of data redundancy: shared boundaries between polygons are stored twice.
[Figure label: when storing the second boundary, some line segments are duplicated]

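The redundancy can be made visible with a small sketch, assuming two hypothetical unit squares that share one edge and are stored polygon by polygon. The helper name `segments` is illustrative only.

```python
def segments(ring):
    """Boundary segments of a closed ring, ignoring direction."""
    closed = ring + [ring[0]]
    return [frozenset((a, b)) for a, b in zip(closed, closed[1:])]

# Two hypothetical neighbouring squares, each storing its full boundary.
poly_a = [(0, 0), (1, 0), (1, 1), (0, 1)]
poly_b = [(1, 0), (2, 0), (2, 1), (1, 1)]

all_segments = segments(poly_a) + segments(poly_b)
shared = {s for s in all_segments if all_segments.count(s) > 1}

print(len(all_segments))       # segments stored polygon by polygon
print(len(set(all_segments)))  # distinct segments actually needed
print(shared)                  # the edge that is stored twice
```
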
Spatial data model: Vector model (7)
The boundary model, or topological data model, is an improvement on the polygon-by-polygon representation. It stores the parts of a polygon's boundary as separate line segments.
[Figure labels: Line 1, Line 2, Line 3]

Spatial data model: Vector model (8)
It also indicates which polygon is on the left and which is on the right of each arc.
[Figure: a network of arcs (Lines D to Q) and numbered nodes (3 to 15) bounding polygons such as XX, ZZ and QQ, with a table whose columns are Line, From Node, To Node, Left polygon and Right polygon. For example, Line I runs from node 9 to node 10 with polygon XX on its left and QQ on its right.]

Spatial data model: Vector model (9)
We can determine the left and the right polygon because the line segment has a direction: it runs from the "From Node" to the "To Node".
[Figure labels: from node 15, left, right, to node 13]

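A small sketch of an arc table with left and right polygons. Only Line I (node 9 to node 10, XX on the left, QQ on the right) is taken from the slide; the arcs named `X1` and `X2`, the `Arc` type and the helper functions are hypothetical and added for illustration.

```python
from typing import NamedTuple

class Arc(NamedTuple):
    name: str
    from_node: int
    to_node: int
    left: str
    right: str

def reverse(arc):
    """Flipping the direction of an arc swaps its left and right polygons."""
    return Arc(arc.name, arc.to_node, arc.from_node, arc.right, arc.left)

def neighbours(arcs, polygon):
    """Polygons that share at least one arc with the given polygon."""
    found = set()
    for arc in arcs:
        if arc.left == polygon:
            found.add(arc.right)
        elif arc.right == polygon:
            found.add(arc.left)
    return found

arcs = [
    Arc("I", 9, 10, "XX", "QQ"),      # from the slide
    Arc("X1", 10, 20, "ZZ", "QQ"),    # hypothetical
    Arc("X2", 20, 21, "ZZ", "XX"),    # hypothetical
]

print(neighbours(arcs, "QQ"))
print(reverse(arcs[0]))
```

Because each shared boundary is stored once with both polygons attached, adjacency questions can be answered from the table without comparing coordinates.
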
Topology and spatial relationships
- disjoint / near
- meet / adjacent
- overlap / intersect
- covers, covered by
- contains, inside
- equal
Source: Wolfgang Kainz

Resolution (1)
Selecting the appropriate number of points to represent an entity is similar to selecting the raster resolution. The more complex the shape of the line or polygon, the more points are used.

Resolution (2)
The size of the area that a single raster cell represents is called the raster's resolution. If each cell represents an area of 10 by 10 meters, the resolution is 10 x 10 meters.
[Figure labels: cells 1, 2, 3, 4]

Resolution (3)
Resolution refers to the size of the pixel or grid cell used for representation. Objects or surfaces can be represented in great detail if the grid cells are very small.
- small grid cell size = high resolution = large data storage requirements
- large grid cell size = low resolution = small data storage requirements
[Figure labels: grid cell width = 30 meters; grid cell width = 1 km]

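To give a feel for how cell size drives storage, the sketch below counts the cells needed to cover a study area at the two cell widths shown on the slide. The 30 km by 30 km study area and the function name `cell_count` are assumptions.

```python
import math

def cell_count(width_m, height_m, cell_size_m):
    """Number of square cells needed to cover a rectangular study area."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols

# Hypothetical study area of 30 km by 30 km.
fine = cell_count(30_000, 30_000, 30)       # 30 m cells
coarse = cell_count(30_000, 30_000, 1_000)  # 1 km cells
print(fine, coarse)
```

With these assumed dimensions the 30 m raster needs 1,000,000 cells and the 1 km raster needs 900.
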
Resolution (4)
Some convention is needed to state which value prevails on cell boundaries. Here, the lower and left boundaries belong to the cell.
[Figure: grid with cell values 4, 3, 4, 2, 3, 3, 2 and the question "Value?"]

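One way to express the "lower and left boundaries belong to the cell" convention is to round coordinates down when converting them to a cell address. The sketch assumes rows are counted upwards from the lower-left corner of the raster, which is a choice made here for illustration; the function name `cell_of` is hypothetical.

```python
import math

def cell_of(x, y, x_min, y_min, cell_size):
    """Return (row, col), with row 0 at the bottom and column 0 at the left.

    Rounding down means a location exactly on a boundary is assigned to the
    cell for which that boundary is the lower or the left edge.
    """
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y - y_min) / cell_size)
    return row, col

# 10 m cells with the lower-left corner of the raster at (0, 0).
print(cell_of(9.99, 5.0, 0.0, 0.0, 10.0))  # inside the first cell
print(cell_of(10.0, 5.0, 0.0, 0.0, 10.0))  # on the boundary between two columns
print(cell_of(5.0, 20.0, 0.0, 0.0, 10.0))  # on the boundary between two rows
```
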
Resolution (5)
Factors to be considered:
- Resolution of the input data
- Size of the resultant database and disk capacity
- Desired response time
- Application and analysis to be performed
Limits:
- Resolution is different from accuracy.
- The more homogeneous an area, the larger the cell size can be without affecting accuracy.
- A cell size finer than the input resolution will not produce more accurate data than the input data (rasters with different cell sizes can be stored and analyzed together).
- Larger cells may encompass more than one data value (loss of resolution).
- Cost of database storage and processing speed for analysis.

Vector data model – raster model (1)
In vector representations a georeference is explicitly associated with the geographic phenomenon. A georeference is a coordinate pair from some geographic space, also known as a vector.

Vector data model – raster model (2)
Rasters do not explicitly store georeferences of the phenomena. They provide a georeference of the lower-left corner and the resolution. The georeference of all other cells can be derived from this information.

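A minimal sketch of deriving a cell's georeference from the lower-left corner and the resolution. The corner coordinates are made up, rows are assumed to be counted upwards from the bottom, and the function name `cell_center` is hypothetical.

```python
def cell_center(row, col, x_min, y_min, cell_size):
    """Coordinates of a cell centre, with row 0 at the bottom, column 0 at the left."""
    x = x_min + (col + 0.5) * cell_size
    y = y_min + (row + 0.5) * cell_size
    return x, y

# Hypothetical lower-left corner and a 10 m resolution.
X_MIN, Y_MIN, CELL = 500_000.0, 9_700_000.0, 10.0

print(cell_center(0, 0, X_MIN, Y_MIN, CELL))
print(cell_center(2, 3, X_MIN, Y_MIN, CELL))
```

Only three numbers are stored for the whole raster, whereas a vector data set stores coordinates for every feature.
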
Vector data model – raster model (3)
Raster model, advantages:
- Simple data structure
- Easy and efficient overlaying
- Compatible with remote sensing (RS) imagery
- High spatial variability is efficiently represented
- Simple for programming by the user
Raster model, disadvantages:
- Inefficient use of computer storage
- Errors in perimeter and shape
- Difficult to perform network analysis
- Inefficient projection transformations
- Loss of information when using large pixel sizes
- Less accurate and less appealing map output
Vector model, advantages:
- Compact data structure
- Efficient encoding of topology
- Easy to perform network analysis
- Highly accurate map output
Vector model, disadvantages:
- Complex data structure
- Difficult to perform overlaying
- Not compatible with RS imagery
- Inefficient representation of high spatial variability

Modeling surfaces (1)
By modeling surfaces we mean modeling continuous fields such as elevation, pollution and rainfall.
- A surface is a 2.5-dimensional representation: each point has only one "height" value.
- Surfaces can be modeled in both raster and vector form.
- A raster surface is often referred to as a DTM (digital terrain model).
- In vector form there are two ways to model a surface: as a grid and as a TIN (triangulated irregular network).

Modeling surfaces (3)
The abbreviation DTM (digital terrain model) describes a digital data set that is used to model a topographic surface. A related abbreviation is DEM (digital elevation model); the slide shows a DEM of Rwanda. A DEM contains no data other than elevation. The DEM is a continuous raster layer in which each cell value represents an elevation.

Modeling surfaces (4)
A grid is a regularly spaced set of spot heights.

Modeling surfaces (5)
A TIN is built from a set of measurements, for example spot heights. These points can be scattered unevenly over the study area, with areas of more change having more points. Triangles are fitted through three points to form planes.
[Figure: scattered spot heights (30, 810, 350, 1550, 980, 1250, 1100, 1340, 45, 820) joined into triangles]

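Since each triangle forms a plane through its three points, a height anywhere inside the triangle can be read off that plane. The sketch below does this with barycentric weights; the three anchor points and the function name `interpolate_height` are hypothetical and are not taken from the figure.

```python
def interpolate_height(p, a, b, c):
    """Height at p = (x, y) on the plane through anchor points a, b, c = (x, y, z)."""
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    # Barycentric weights: how strongly each anchor point influences p.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1 - w1 - w2
    return w1 * z1 + w2 * z2 + w3 * z3

# Hypothetical anchor points: (x, y, height in meters).
a = (0.0, 0.0, 100.0)
b = (10.0, 0.0, 200.0)
c = (0.0, 10.0, 300.0)

print(interpolate_height((2.0, 2.0), a, b, c))
```

At an anchor point the function returns that point's own stored height, which matches the idea that values live on the points and not on the planes.
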
Modeling surfaces (7)
A TIN is a vector representation and not an irregular tessellation because:
- each anchor point has a stored georeference
- the planes do not have stored values (as raster cells do)
[Figure labels: no value is stored for this plane; a georeference and a value are stored for each anchor point]

Modeling surfaces (8)
[Figure labels: Main campus, GIS center, Ngoma, Rwasave]

Data management in ArcCatalog

Data management in ArcCatalog: File Geodatabase
A file geodatabase is a collection of various types of GIS datasets held in a file system folder. It is the recommended native data format for ArcGIS data stored and managed in a file system folder.
- Each dataset is a separate file on disk.
- A file geodatabase is a file folder that holds its dataset files.

Data management in ArcCatalog: File Geodatabase (continued)
- Adding data: import or export from other files, or create a new feature class.
- Exporting data: to give single feature classes to someone, they have to be exported to shapefiles (.shp).

Data management in ArcMap
Added data is symbolised with default properties:
- points as small circles in a randomly chosen colour
- lines as thin lines in a randomly chosen colour
- polygons as filled areas in a randomly chosen colour
A new layer is always placed on top of the other layers of the same representation type (points, lines, polygons, raster).

Data management in ArcMap (continued)
- Change the layer order by dragging and dropping a layer.
- Change the symbology by clicking on the symbol or by using the Symbology tab of the layer properties.

Exercise 2 overview
- Add a feature dataset to your file geodatabase
- Import data to the feature dataset
- Add new files to your map document
- Change order and symbology
- Perform a query
- Explore differences between raster and vector data