Monday, October 12, 2015

Errors in GIS

Editing is an important operation during the process of database creation. GIS provides tools for examining coverages for mistakes. Errors may be introduced either from the original data source or during encoding. The process of detecting and removing errors by editing is called cleaning. Errors may be classified into:

  1. Entity errors
  2. Attribute errors and
  3. Entity-Attribute agreement errors
Entity errors may be of the following three types:
  1. Missing entities
  2. Incorrectly placed entities and
  3. Disordered entities
A GIS should build topology after the digitization process. Topology provides explicit information about the relationships between entities in the database, which allows several types of errors in digital maps to be identified. Some of these errors are indicated by 'text-based error flags', while others have to be inferred by visually inspecting the database statistics. Six common types of errors are listed below:
  1. Not all entities that should be present are entered
  2. Extra entities have been digitised
  3. Not all entities are in the right place or are of the correct shape or size
  4. Some entities that are not supposed to be connected to each other are connected
  5. Not all polygons have a single label point to identify them
  6. Some entities lie outside the boundary identified by registration marks
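The last check in the list lends itself to a simple automated test. The sketch below is only a minimal illustration, assuming that the registration marks span an axis-aligned rectangle and that each entity is stored as a list of (x, y) vertices. The function name `entities_outside_boundary` and the sample data are hypothetical and are not taken from any particular GIS package.

```python
def entities_outside_boundary(entities, registration_marks):
    """Return the ids of entities with at least one vertex outside the
    rectangle spanned by the registration marks (assumed axis-aligned)."""
    xs = [x for x, _ in registration_marks]
    ys = [y for _, y in registration_marks]
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)

    outside = []
    for entity_id, vertices in entities.items():
        # Flag the entity as soon as any vertex falls outside the rectangle.
        if any(not (min_x <= x <= max_x and min_y <= y <= max_y)
               for x, y in vertices):
            outside.append(entity_id)
    return outside


# Hypothetical sample data: four registration marks and three digitized lines.
marks = [(0, 0), (100, 0), (100, 80), (0, 80)]
entities = {
    "road": [(10, 10), (50, 40), (90, 70)],
    "stream": [(20, 60), (105, 75)],  # second vertex lies past the east edge
    "fence": [(5, 5), (5, 79)],
}
flagged = entities_outside_boundary(entities, marks)
print(flagged)
```

Entities flagged this way would still need to be compared with the source map, since the fault may lie in the registration marks rather than in the entity itself.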
One procedure for comparing the digitized entities with the original map document is to produce a monitor display or a hard-copy plot of the pre-edited database. A GIS sometimes provides several symbols that indicate errors.

Below is a detailed description of specific types of errors that are related to the six general types.
  • Nodes are often described as 'to nodes' and 'from nodes' and indicate the overall extent of a line feature. They also identify an intersection between two streets or a connection between two features (e.g. a stream and a lake). Nodes should not occur at every line segment along a line or a polygon, and they should not be used to mark changes of direction between line segments. Thus, the first form of error is the false node, or pseudo node, which occurs, for example, when a line connects with itself (island pseudo node). A GIS should flag the existence of pseudo nodes using a graphic symbol. Pseudo nodes are NOT errors in themselves but only flags indicating the presence of potential problems. They often result from pushing the wrong button on the digitizing puck. Incorrect nodes can be corrected either by selecting and deleting them individually or by adding nodes where needed to convert an island into a polygon that is attached to other polygons.
  • A dangling node is a single node that is connected to a single entity. Dangling nodes can result from:
      • failure to close a polygon
      • failure to connect the node to the object (undershoot) or
      • going beyond the entity (overshoot)
    • Sometimes the problem is due to incorrect placement of the digitizing puck or to setting the fuzzy tolerance distance too small.
  • In the case of dangling nodes, it is better to overshoot than to undershoot, as overshoots are easier to find than undershoots.
  • The two types of errors that can occur in the case of label points in polygons are:
      • Missing labels and
      • Too many labels
    • Both of the above errors are caused by failure to keep track of the digitizing process. In most cases, the cause is:
      • Confusion
      • Disruption in digitizing process or
      • Fatigue
    • Such errors are easy to find and are indicated by a graphic device distinguishing them from other error types.
    • Editing is simply adding label points where necessary and deleting label points wherever they are not required.
  • Another common form of digitising error when using the vector data model is treating each polygon as a separate entity, so that the lines shared by adjacent polygons are digitised more than once. Failure to place the digitizing puck at exactly the same location for each point along the line results in a series of tiny graphic polygons called SLIVER POLYGONS. Sliver polygons can also result from overlay operations.
    • The easy way to avoid sliver polygons during input is to use a GIS that does not require digitising the same line twice. If a line is digitized twice, the presence of a dangling node confirms this, and the redundant line can be removed, eliminating the problem.
    • Finding slivers in the absence of dangling nodes is very difficult. One way is to compare the number of polygons in the digital coverage with that of the original input map.
  • Another problem related to polygons is the production of weird polygons. Weird polygons are polygons with missing nodes. Frequently, this error is caused by digitizing a point in the wrong place or in the wrong order.
    • A simple way to avoid this problem is to number the input points or to establish a set pattern for digitising polygons.
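The node flags described in the list above can be derived from how many arc ends meet at each node. The sketch below is only a minimal illustration, assuming that each arc is stored as a pair of from-node and to-node coordinates and that coordinates have already been snapped so that coincident nodes compare equal. The function `classify_nodes` and the sample arcs are hypothetical. As a working assumption, a node touched by one arc end is treated as dangling and a node touched by exactly two arc ends as a pseudo node, which also covers a line that closes on itself.

```python
from collections import Counter


def classify_nodes(arcs):
    """Classify nodes by the number of arc ends meeting at them.

    arcs: list of (from_node, to_node) tuples, each node an (x, y) tuple.
    Returns a dict mapping node -> "dangling", "pseudo", or "normal".
    """
    ends = Counter()
    for from_node, to_node in arcs:
        ends[from_node] += 1
        ends[to_node] += 1  # a closed arc counts its single node twice

    labels = {}
    for node, count in ends.items():
        if count == 1:
            labels[node] = "dangling"  # undershoot, overshoot, or open polygon
        elif count == 2:
            labels[node] = "pseudo"    # a flag, not necessarily an error
        else:
            labels[node] = "normal"    # a genuine junction of three or more arcs
    return labels


# Hypothetical sample: a small branching network plus one island.
arcs = [
    ((0, 0), (5, 0)),
    ((5, 0), (5, 5)),
    ((5, 0), (9, 0)),
    ((9, 0), (12, 0)),
    ((20, 20), (20, 20)),  # island: the arc starts and ends at the same node
]
node_labels = classify_nodes(arcs)
for node, label in node_labels.items():
    print(node, label)
```

A legitimate dead-end street or stream source would also be reported as dangling here, so the output is a list of candidates to inspect rather than a list of confirmed errors.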
After the coverage has been edited to correct the errors, the GIS software has to rebuild the topological structure on the basis of the new entities. Finally, the new coverage must be saved.
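The label-point check discussed earlier (every polygon should contain exactly one label point) can also be automated before the coverage is rebuilt. The sketch below is only a minimal illustration, assuming that polygons are simple rings of (x, y) vertices and that label points are plain coordinate pairs. The helper names `point_in_polygon` and `label_errors` and the sample data are hypothetical, and the containment test is a standard ray-casting routine rather than the method of any specific GIS.

```python
def point_in_polygon(point, ring):
    """Ray-casting test: True if point lies inside the ring of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_errors(polygons, label_points):
    """Return {polygon_id: label_count} for polygons without exactly one label."""
    errors = {}
    for poly_id, ring in polygons.items():
        count = sum(point_in_polygon(p, ring) for p in label_points)
        if count != 1:
            errors[poly_id] = count
    return errors


# Hypothetical sample: three square polygons and three label points.
polygons = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
    "C": [(0, 10), (10, 10), (10, 20), (0, 20)],
}
label_points = [(5, 5), (15, 4), (16, 6)]
result = label_errors(polygons, label_points)
print(result)
```

In this sample, polygon B is reported with two labels and polygon C with none, which correspond to the "too many labels" and "missing labels" cases.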