Geographical Information Systems and Science Longley P A, Goodchild M F, Maguire D J, Rhind D W (2001) John Wiley and Sons Ltd 7. Generalization, Abstraction,

1 Geographical Information Systems and Science Longley P A, Goodchild M F, Maguire D J, Rhind D W (2001) John Wiley and Sons Ltd 7. Generalization, Abstraction, and Metadata © John Wiley & Sons Ltd

2 Outline: Introduction; Generalization basics; Methods of generalization; Measuring the degree of generalization; Metadata

3 Can a Database be Perfect? The real world is infinitely complex, so a perfect description would have to be infinitely large and complex. A geographic database must always approximate, generalize, abstract, or simplify; we have many ways of doing this in GIS.

4 How Hilly is Iowa? Iowa is relatively flat compared, say, to Colorado, Switzerland, or Nepal, and people often think of it as flat. Suppose the “slope” attribute in a database is given the value 0 for an object representing the state of Iowa. This is a crude approximation, but it is much simpler than recording the slope at 30 m intervals across the state, and it may be good enough for some purposes.

5 GIS Compresses the Real World: Representations are almost always “lossy”. It is important to know how much loss has occurred, by measuring the difference between the data and the real world. We term this uncertainty, or the degree to which data leave us uncertain about the real world.

6 Metadata are the Ultimate in Compression: They describe the entire contents of a data set. Metadata are data about data: the documentation and handling instructions for data. Metadata are what make data useful: without documentation and handling instructions, data would have no value to a user, and without metadata it would be impossible to find data in a library or on the WWW.

7 Generalization and Fields: Many geographic phenomena are conceptualized as fields, in which exactly one value of the phenomenon exists at every point in space; think of elevation and land ownership as convenient examples. In principle a field can take a different value everywhere, creating an infinite amount of information. Tobler’s Law helps by virtually guaranteeing that variation will be smooth and slow over space.

8 Six Ways of Representing a Field: All involve some kind of approximation or generalization. All reduce the variation of the field to a set of objects and attributes that now look similar to phenomena conceptualized as discrete objects, but the conceptualizations are very different.

9 The six approximate representations of a field used in GIS. A. Regularly spaced sample points. B. Irregularly spaced sample points. C. Rectangular cells. D. Irregularly shaped polygons. E. Irregular network of triangles, with linear variation over each triangle (the Triangulated Irregular Network or TIN model; the bounding box is shown dashed in this case because the unshown portions of complete triangles extend outside it). F. Polylines representing contours.
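The "linear variation over each triangle" in representation E can be illustrated with a minimal sketch. It assumes a triangle is given as three (x, y, z) vertices and uses barycentric weights to interpolate; the function name `tin_elevation` and the sample coordinates are hypothetical and do not come from the slides.

```python
def tin_elevation(point, triangle):
    """Linearly interpolate z at `point` inside one TIN triangle.

    `triangle` is three (x, y, z) vertices. Returns None when the
    point falls outside the triangle.
    """
    x, y = point
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    # Barycentric weights: each is 1 at its own vertex and 0 on the opposite edge.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical triangle with elevations 10, 20, and 30 at its vertices.
triangle = [(0, 0, 10), (10, 0, 20), (0, 10, 30)]
midpoint_elevation = tin_elevation((5, 0), triangle)
```

Only the three vertex elevations are stored; every other value inside the triangle is implied by the plane through them, which is where the generalization lies.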

10 Map Specifications: Topographic maps are prepared by mapping agencies using specifications specific to each scale. A scale’s specification sets the rules for representing real-world features on the map, and these rules involve generalization and approximation. If a map meets its specification it can be said to be perfectly accurate, even though its contents do not match the real world perfectly.

11 Methods of generalization: McMaster and Shea (1992) define 10 distinct types of generalization. Generalization can affect a database permanently (database generalization) or can be temporary for the purpose of display (cartographic generalization).

12 Weeding: Simplifying the shape of a line or an area by reducing the number of points in its representation. The Douglas-Poiker algorithm drops points from a polyline or a polygon using a user-defined tolerance distance.

13 [Figure, panels A and B] The first two steps of the Douglas-Poiker algorithm. The endpoints of the polyline are first connected (A), and the point lying furthest from this line is found. If it lies further than the user-supplied tolerance distance, it is selected as a member of the simplified line, along with the two endpoints, and a new cycle of the algorithm is started. In the next cycle (B) Points 2 and 3 lie within the tolerance of the line 1-4-15, but Point 7 does not.

14 In the final step 7 points remain (identified with green disks), including 1, 4, 7, and 15. No points are beyond the user-defined tolerance distance from the line.
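The cycle described in the two figure captions above can be sketched recursively. This is a minimal illustration, not the implementation used in any particular GIS: the function names and the sample polyline are hypothetical, and distance is measured to the infinite line through the two endpoints rather than to the segment.

```python
import math


def distance_to_line(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Endpoints coincide: fall back to plain point-to-point distance.
        return math.hypot(px - ax, py - ay)
    return abs(dx * (py - ay) - dy * (px - ax)) / length


def weed(points, tolerance):
    """Return the subset of `points` kept after weeding with `tolerance`."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point lying furthest from the line first-last.
    index, worst = 0, 0.0
    for i in range(1, len(points) - 1):
        d = distance_to_line(points[i], first, last)
        if d > worst:
            index, worst = i, d
    if worst <= tolerance:
        # Every interior point is within tolerance: keep only the endpoints.
        return [first, last]
    # Otherwise keep the furthest point and start a new cycle on each half.
    left = weed(points[: index + 1], tolerance)
    right = weed(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical polyline; coordinates are made up for illustration.
polyline = [(0, 0), (1, 0.1), (2, 0), (3, 3), (4, 0)]
simplified = weed(polyline, 0.5)
```

With a tolerance of 0.5 the small wiggle at (1, 0.1) is dropped while the spike at (3, 3) is kept; a larger tolerance removes progressively more detail.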

15 Merging: Another common form of generalization, by aggregating adjacent areas. Small areas can be generalized by removing any that fall below a user-defined threshold known as the Minimum Mapping Unit or MMU; such areas are merged with their most similar neighbors.
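A minimal sketch of MMU-based merging follows. The slide does not say how similarity is measured, so this example assumes each area carries a single numeric attribute and treats the neighbor with the closest value as the most similar; it also assumes the merged area keeps the neighbor's value. The data layout and the name `merge_small_areas` are hypothetical.

```python
def merge_small_areas(areas, neighbors, mmu):
    """Merge every area smaller than `mmu` into its most similar neighbor.

    areas:     id -> (area, value)
    neighbors: id -> set of adjacent ids
    Similarity is assumed to be closeness of `value` (an illustration only).
    """
    areas = dict(areas)
    neighbors = {k: set(v) for k, v in neighbors.items()}
    while True:
        small = [k for k, (a, _) in areas.items() if a < mmu and neighbors[k]]
        if not small:
            return areas
        k = min(small, key=lambda i: areas[i][0])  # handle the smallest first
        area_k, value_k = areas[k]
        target = min(sorted(neighbors[k]), key=lambda n: abs(areas[n][1] - value_k))
        area_t, value_t = areas[target]
        areas[target] = (area_t + area_k, value_t)  # assumed: neighbor's value wins
        # Rewire adjacency so the removed area's neighbors now touch the target.
        for n in neighbors[k]:
            neighbors[n].discard(k)
            if n != target:
                neighbors[n].add(target)
                neighbors[target].add(n)
        del areas[k], neighbors[k]


# Hypothetical land-cover patches: (area in hectares, attribute value).
patches = {"a": (100.0, 10.0), "b": (2.0, 11.0), "c": (80.0, 30.0)}
adjacency = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
merged = merge_small_areas(patches, adjacency, mmu=5.0)
```

Here patch "b" falls below the 5 ha threshold and is absorbed by "a", whose value is closest to its own.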

16 Measuring the Degree of Generalization: The Representative Fraction is the ratio of distance on the map to distance on the ground, also known as the scale. E.g., at 1:50,000 every 10 cm on the map corresponds to 5 km on the ground.
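The arithmetic behind the 1:50,000 example can be written as a short helper. The function name is hypothetical; the only facts used are the definition of the representative fraction and the conversion of 100,000 cm to 1 km.

```python
CM_PER_KM = 100_000


def ground_distance_km(map_distance_cm, scale_denominator):
    """Ground distance implied by a map distance at scale 1:scale_denominator."""
    return map_distance_cm * scale_denominator / CM_PER_KM


# The slide's example: 10 cm on a 1:50,000 map.
example_km = ground_distance_km(10, 50_000)
```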

17 Scale for Digital Databases: How can a digital database have a representative fraction if there are no distances to be measured in the database? A system of conventions allows digital databases to have scales, e.g., use the scale of the map that was digitized or scanned to create the database.

18 Minimum Mapping Unit: Area can be a misleading indicator of importance, e.g., a riparian zone along a stream.

19 Spatial Resolution: The smallest distance over which change is recorded. Easily defined for raster data, but not for vector; e.g., if census reporting zones vary greatly in area, what is the spatial resolution of census data? Resampling can create false spatial resolution; e.g., dividing every pixel into 4 does not necessarily give finer spatial resolution. Spatial resolution is defined by the process of observation, not by such transformations as resampling.

20 Example of resampling. The original cells outlined in black have been resampled to the cells outlined in red. New attributes of each cell have been assigned using the largest area rule.

21 Example of resampling an existing DEM to obtain a new DEM with shorter spacing between sample points. The black dots are the new DEM sample points, and the existing DEM provided mean elevations for each red square. The apparent improvement in spatial resolution as a result of resampling may not be justified.
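The point about false spatial resolution can be shown with a minimal sketch that divides every pixel of a small raster into 4. The function name and the sample grid are hypothetical; plain nested lists stand in for a raster.

```python
def split_pixels(grid):
    """Divide every pixel into 4 by repeating each value in a 2 x 2 block."""
    finer = []
    for row in grid:
        doubled = [value for value in row for _ in range(2)]
        finer.append(doubled)
        finer.append(list(doubled))
    return finer


# Hypothetical 2 x 2 raster of observed values.
coarse = [[1, 2], [3, 4]]
fine = split_pixels(coarse)

# The grid now has 4 times as many cells but exactly the same observations.
distinct_before = {v for row in coarse for v in row}
distinct_after = {v for row in fine for v in row}
```

The finer grid has a shorter cell spacing, yet it records no change that was not already in the coarse grid, which is why the resolution is set by the original observation rather than by the resampling.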

22 Metadata: Needed to automate the process of search for data (compare using a library catalog). Needed to determine the fitness of a data set for use, particularly regarding quality. Needed to handle data effectively, e.g., format. Needed to identify notable data contents, e.g., to find images of an interesting hurricane.

23 Metadata can be Expensive to Generate: They represent a high level of abstraction and may need an expert to define. But the benefits are substantial: metadata make it possible to find data sets and use them effectively, and they allow the benefits of investments in data to be realized.

24 The U.S. FGDC Standard: Content Standard for Digital Geospatial Metadata (CSDGM). Defined by a committee of U.S. Federal agencies. Now widely used worldwide. The basis of a new international standard. Potentially several hundred items for one data set, but easily boiled down to a much smaller number.

25 The Dublin Core Standard: Devised by the digital library community. Suitable for any type of data, geospatial included. Easily extended to include essential items for geospatial data, e.g., the latitude and longitude limits of the data set’s coverage.

26 Geolibraries: Repositories of data that can be searched for data covering geographic areas of interest; this was very difficult in a conventional library using a card catalog. Each data set in a geolibrary is identified with a geographical footprint. In a search, footprints are matched to the area of interest defined by the user.

27 The Alexandria Digital Library The user picks an area of interest by interacting with a map, specifying latitude and longitude limits, or giving a place name. The library returns all data sets whose footprints match the query area, and which match other criteria also supplied by the user.
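Footprint matching can be sketched as a bounding-box overlap test. This is an illustration only and not the Alexandria Digital Library's actual interface: it assumes footprints are (west, south, east, north) boxes in degrees that do not cross the antimeridian, and the catalog records, coordinates, and function names are made up.

```python
def footprints_overlap(a, b):
    """True when two (west, south, east, north) boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(catalog, area_of_interest, keyword=None):
    """Return titles whose footprint overlaps the query area.

    `keyword` stands in for the "other criteria" a user might supply.
    """
    hits = []
    for record in catalog:
        if not footprints_overlap(record["footprint"], area_of_interest):
            continue
        if keyword is not None and keyword.lower() not in record["title"].lower():
            continue
        hits.append(record["title"])
    return hits


# Hypothetical catalog; the boxes are rough illustrative values.
catalog = [
    {"title": "Santa Barbara aerial photo", "footprint": (-120.0, 34.3, -119.5, 34.6)},
    {"title": "Iowa elevation model", "footprint": (-96.7, 40.3, -90.1, 43.6)},
    {"title": "California geology map", "footprint": (-124.5, 32.5, -114.1, 42.0)},
]
query_area = (-120.5, 34.0, -119.0, 35.0)
matches = search(catalog, query_area)
```

A place-name query would first be converted to such a box, for example through a gazetteer lookup, before the same overlap test is applied.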

28 Collection-level Metadata: Metadata describe each data set and allow users to search geolibraries such as Alexandria. But how does the user know which geolibrary to search? Collection-level metadata describe the contents of entire collections.

