Spatial Data Entry via Digitizing May 5, 2016
Digitizing: The process of collecting digital coordinates, and a common data entry method. Coordinates from a map, image, or other source are converted into a digital format in a GIS. Nearly all geographic information produced before 1960 was recorded in hardcopy form.
Digital Spatial Data: Data in a computer-compatible format, such as text files, lists of coordinates, digital images, and coordinate and attribute data in electronic file formats. It is a common source of digital information.
Digitizing Methods: Manual digitization is human-guided coordinate capture from a map or image source. It takes two forms: on-screen digitizing (heads-up digitizing) and hardcopy map digitizing. On-screen digitizing means digitizing on a computer screen using a digital image as a backdrop.
Characteristics of Manual Digitizing: Data accuracy may be affected by equipment characteristics, errors in the original scanned document, the abilities and attitude of the person digitizing, and map scale. Small errors in map production may cause significant positional errors and so degrade the positional quality of spatial data; the effect is greater for smaller-scale maps. Map scale therefore impacts the spatial accuracy of digitized data.
Larger Error at Smaller Map Scale: Surface error caused by a one-millimeter map error at different scales (Table 4-1 of the textbook)
Map Scale      Error (m)   Error (ft)
1:24,000       24          79
1:50,000       50          164
1:62,500       62.5        205
1:100,000      100         328
1:250,000      250         820
1:1,000,000    1,000       3,281
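The table values follow from multiplying the map error by the scale denominator. The short sketch below reproduces them; the function name `ground_error_m` and the feet conversion constant are illustrative choices, not part of the original slides.

```python
METERS_TO_FEET = 3.28084  # international foot

def ground_error_m(scale_denominator, map_error_mm=1.0):
    """Ground distance in meters represented by a map error at scale 1:scale_denominator."""
    return map_error_mm * scale_denominator / 1000.0

for denom in (24_000, 50_000, 62_500, 100_000, 250_000, 1_000_000):
    err_m = ground_error_m(denom)
    print(f"1:{denom:,}  {err_m:g} m  {err_m * METERS_TO_FEET:.0f} ft")
```

The same function can be called with a different `map_error_mm` to see how a thicker pen line or a larger digitizing slip scales on the ground.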
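Digitizing Process: Display the image on screen and trace the locations of features. Point: features that are viewed as points. Line: linear features; the starting point is called the ‘starting node’, the intermediate points are vertices, and the line finishes at the ending node. Polygon: area features enclosed by a closed boundary.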
Digitizing Errors: Positional errors are inevitable in manual digitization. Undershoots and overshoots are common digitizing errors. Undershoots are nodes that do not quite reach the line or node they were meant to join; they cause unconnected networks and unclosed polygons. Overshoots are lines that cross over existing nodes or lines; they may cause difficulties in network analyses.
Node and Line Snapping: Snapping reduces undershoots and overshoots while digitizing. It is the process of automatically setting nearby points to have the same coordinates, and it relies on a snap tolerance, or snap distance. A new node cannot be placed within the snap distance of an existing node; instead, the new node is joined (snapped) to the existing node, ensuring a connection between digitized lines.
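As a rough illustration of node snapping, the sketch below moves a newly digitized node onto the nearest existing node when it falls within the snap distance. The function name `snap_node` and the sample coordinates are hypothetical; real GIS packages also snap to vertices and edges, which is not shown here.

```python
import math

def snap_node(new_node, existing_nodes, snap_distance):
    """Return the nearest existing node within snap_distance, else new_node unchanged."""
    nearest = None
    nearest_dist = snap_distance
    for node in existing_nodes:
        d = math.dist(new_node, node)
        if d <= nearest_dist:
            nearest, nearest_dist = node, d
    return nearest if nearest is not None else new_node

existing = [(100.0, 200.0), (150.0, 250.0)]
print(snap_node((100.4, 199.7), existing, snap_distance=1.0))  # within tolerance: snapped
print(snap_node((120.0, 220.0), existing, snap_distance=1.0))  # too far: kept as digitized
```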
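Snapping Distance/Tolerance: Careful selection may reduce digitizing errors. Too small a distance leaves undershoots and overshoots uncorrected, while too large a distance snaps together nodes that should remain separate. The snap distance should be smaller than the desired positional accuracy, but it should not be below the capabilities of the system used for digitizing.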
Reshape: Line Smoothing and Thinning. Software may provide tools to smooth, densify, and thin digitized lines. A spline function can be used to smoothly interpolate curves between digitized points.
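Of the tools listed, densifying is the simplest to sketch: extra vertices are inserted along each segment so that no segment exceeds a chosen spacing. The function name `densify` and the `max_spacing` parameter are assumptions for illustration; smoothing, thinning, and spline fitting use more involved algorithms that are not shown.

```python
import math

def densify(line, max_spacing):
    """Insert evenly spaced vertices so no segment of `line` is longer than max_spacing."""
    if max_spacing <= 0:
        raise ValueError("max_spacing must be positive")
    out = [line[0]]
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        # Number of pieces this segment must be split into.
        n = max(1, math.ceil(math.dist((x0, y0), (x1, y1)) / max_spacing))
        for i in range(1, n + 1):
            t = i / n
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

print(densify([(0.0, 0.0), (10.0, 0.0)], max_spacing=2.5))
```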
Coordinate Transformation: Brings spatial data into an Earth-based map coordinate system so that each data layer aligns with every other data layer. It is also referred to as “registration” because it registers the layers to a map coordinate system. Control points are needed.
Control Points: A set of control points is used to estimate the coefficients of the transformation equations. Criteria for selecting control points: (1) A sufficient number should be selected. The minimum depends on the mathematical form of the transformation, and additional points above the minimum are recommended to improve the quality and accuracy of the statistically fit transformation. (2) They should come from a source that provides the highest feasible coordinate accuracy; control point accuracy should be at least as good as the overall positional accuracy required for the data. (3) They should be as evenly distributed as possible throughout the data area.
Control Point Sources ????
Image Transformation: Georeferencing. Georeferencing is the process of defining how raster data are situated in map coordinates by aligning the raster with control points.
Types of Transformation: Polynomial, spline, and adjust.
Polynomial Transformation: Uses a polynomial built on the control points and a least-squares fitting (LSF) algorithm. It is optimized for global accuracy but does not guarantee local accuracy. It yields two formulas: one computes the output x-coordinate for an input (x, y) location, and the other computes the output y-coordinate for the same input. The minimum number of control points is 3 for a first-order, 6 for a second-order, and 10 for a third-order transformation (source: ESRI ArcGIS).
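The minimum counts of 3, 6, and 10 match the number of coefficients in a two-variable polynomial of order n, which is (n + 1)(n + 2) / 2; each control point supplies one equation per output coordinate. The helper below is a small sketch of that count, with `min_control_points` as a made-up name.

```python
def min_control_points(order):
    """Coefficients per output coordinate in a 2-D polynomial of the given order."""
    if order < 1:
        raise ValueError("order must be at least 1")
    return (order + 1) * (order + 2) // 2

for order in (1, 2, 3):
    print(order, min_control_points(order))
```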
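1st Order Polynomial (Affine Transformation): Employs linear equations to calculate map coordinates and provides translation, rotation, and scaling. There are 6 unknowns, so 6 equations are needed; each control point supplies two equations (one for x and one for y), so three control points are the minimum.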
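Affine Transformation: The general form is X = A x + B y + C and Y = D x + E y + F, where (x, y) are the source coordinates and (X, Y) are the map coordinates. C and F are translation changes between coordinate systems, that is, shifts in the origin from one system to the next. The other parameters incorporate the change in scale and the rotation angle. For example, a control point at (103.0, -100.1) with map coordinates (500,083.4, 5,003,683.5) gives 500,083.4 = A (103.0) + B (-100.1) + C and 5,003,683.5 = D (103.0) + E (-100.1) + F. An affine transformation generally maps straight lines on the raster dataset to straight lines in the warped raster dataset.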
Affine Transformation: With 3 GCPs (ground control points), the transformation can map each linked raster point exactly to its target location. Any more than three links introduces errors, or residuals. Even so, more than three GCPs should be used, because with only three, a single positionally wrong link has a much greater impact on the transformation. Although the transformation error may increase with more GCPs, the overall accuracy of the transformation increases as well.
Second- or Third-Order Transformation: In addition to translation, scaling, and rotation, the raster dataset may be bent or curved.
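A minimal sketch of estimating the six affine parameters A through F from control points by least squares, written in plain Python and solved through the normal equations. The function names and sample coordinates are hypothetical. With exactly three non-collinear points the fit is exact; with more, it minimizes the residuals.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are too few or (nearly) collinear")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0] * 3
    for i in (2, 1, 0):
        s = sum(a[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (a[i][3] - s) / a[i][i]
    return x

def fit_affine(src, dst):
    """Least-squares fit of X = A*x + B*y + C and Y = D*x + E*y + F."""
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (R^T R) p = R^T t; the left side is shared by X and Y.
    rtr = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rtx = [sum(r[i] * X for r, (X, _) in zip(rows, dst)) for i in range(3)]
    rty = [sum(r[i] * Y for r, (_, Y) in zip(rows, dst)) for i in range(3)]
    return solve3(rtr, rtx) + solve3(rtr, rty)  # [A, B, C, D, E, F]

def apply_affine(params, x, y):
    """Transform one source point into map coordinates."""
    A, B, C, D, E, F = params
    return A * x + B * y + C, D * x + E * y + F

# Hypothetical control points: source (x, y) and matching map (X, Y).
src = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
dst = [(100.0, 200.0), (120.0, 195.0), (105.0, 220.0), (125.0, 215.0)]
params = fit_affine(src, dst)
print([round(p, 6) for p in params])
print(apply_affine(params, 5.0, 5.0))
```

For real map coordinates with large magnitudes, centering the coordinates first or using a library least-squares routine is likely to be more numerically robust than raw normal equations.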
Root Mean Square Error: A measure of the residual error, that is, the difference between where the from point ended up and its actual location. When the error is particularly large, try removing and adding control points to adjust the error. Typically, the adjust and spline transformations give an RMS error of zero or near zero; however, this does not mean that the image is perfectly georeferenced.
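The RMS error can be sketched as the square root of the mean squared distance between each transformed control point and its known location. The function names and sample points below are illustrative only; the per-point residuals are included because they help identify which control point to remove or re-enter.

```python
import math

def residuals(transformed, actual):
    """Distance between each transformed control point and its true location."""
    return [math.dist(t, a) for t, a in zip(transformed, actual)]

def rms_error(transformed, actual):
    """Root mean square of the control point residuals."""
    res = residuals(transformed, actual)
    return math.sqrt(sum(r * r for r in res) / len(res))

# Hypothetical example: one point fits exactly, the other is off by 5 units.
transformed = [(0.0, 0.0), (3.0, 4.0)]
actual = [(0.0, 0.0), (0.0, 0.0)]
print(residuals(transformed, actual))
print(rms_error(transformed, actual))
```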