Saturday, October 10, 2015

Conflation in GIS

Conflation is the process of unifying two distinct datasets into a new dataset. The task can range from relatively easy to extremely difficult, depending on the complexity of representation and the size and quality of the datasets involved.

Conflation terminology:

'Matching' is the activity of identifying features or data elements that represent the same real-world entity.

'Alignment' describes the degree to which two features or data elements have coincident geometry.

'Adjustment' is the alteration of geometry or attributes of matched features to align them.

The 'reference dataset' is the dataset to which the other is conflated. It is of greater spatial accuracy than the 'subject dataset'.

The 'subject dataset' is the dataset to be matched or adjusted.
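The terms above can be tied together in a minimal sketch. It assumes point features stored as `(id, x, y)` tuples in a shared coordinate system, nearest-neighbour matching, and a distance tolerance; the sample data and the function names `match` and `adjust` are hypothetical and only for illustration.

```python
from math import hypot

# Hypothetical sample data: (id, x, y) point features in one coordinate system.
reference = [("r1", 0.0, 0.0), ("r2", 10.0, 0.0)]
subject = [("s1", 0.3, 0.4), ("s2", 10.0, 0.5), ("s3", 50.0, 50.0)]


def match(subject, reference, tolerance):
    """Matching: pair each subject point with its nearest reference point.

    The returned distance is a simple measure of alignment (0 = coincident).
    Subject points farther than `tolerance` from every reference point
    stay unmatched.
    """
    pairs = []
    for sid, sx, sy in subject:
        rid, rx, ry = min(reference, key=lambda r: hypot(r[1] - sx, r[2] - sy))
        dist = hypot(rx - sx, ry - sy)
        if dist <= tolerance:
            pairs.append((sid, rid, dist))
    return pairs


def adjust(subject, reference, pairs):
    """Adjustment: move matched subject points onto their reference partner."""
    ref_xy = {rid: (x, y) for rid, x, y in reference}
    target = {sid: ref_xy[rid] for sid, rid, _ in pairs}
    # Unmatched subject points keep their original coordinates.
    return [(sid, *target.get(sid, (x, y))) for sid, x, y in subject]


pairs = match(subject, reference, tolerance=1.0)
adjusted = adjust(subject, reference, pairs)
```

Here `s1` and `s2` are matched and snapped to the reference, while `s3` has no partner within the tolerance and is left where it is. Real datasets usually need more than distance alone (attributes, shape, topology) to decide a match.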

Conflation problems can be classified into:
-Horizontal
-Vertical and
-Internal

Horizontal conflation is the process of eliminating discrepancies along the common boundary of datasets that are adjacent to one another. This includes datasets containing data from the same feature classes. Examples are aligning the boundaries of adjacent coverages and edge-matching neighbouring networks.
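As a rough illustration of edge-matching, the sketch below takes the boundary nodes of two adjacent tiles and moves each pair of nearby nodes to their midpoint. The function name `edge_match`, the tolerance, and the choice of the midpoint (rather than snapping to a more accurate reference tile) are assumptions made for this example.

```python
from math import hypot


def edge_match(west_nodes, east_nodes, tolerance):
    """Move each pair of nearby boundary nodes to their shared midpoint.

    Each east node is used at most once; nodes with no partner within
    `tolerance` are left unchanged.
    """
    west, east = list(west_nodes), list(east_nodes)
    used = set()
    for i, (wx, wy) in enumerate(west):
        candidates = [
            (hypot(ex - wx, ey - wy), j)
            for j, (ex, ey) in enumerate(east)
            if j not in used
        ]
        if not candidates:
            continue
        dist, j = min(candidates)
        if dist <= tolerance:
            ex, ey = east[j]
            west[i] = east[j] = ((wx + ex) / 2, (wy + ey) / 2)
            used.add(j)
    return west, east


# Hypothetical road endpoints on either side of a tile boundary near x = 100.
west_tile = [(100.0, 10.0), (100.0, 50.0)]
east_tile = [(100.5, 10.0), (100.0, 80.0)]
west_fixed, east_fixed = edge_match(west_tile, east_tile, tolerance=1.0)
```

The first node in each tile ends up at `(100.25, 10.0)`, closing the gap across the boundary; the remaining nodes are too far apart to be treated as the same location.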

Vertical conflation involves matching or eliminating discrepancies between datasets that occupy the same area in space. For example, road network matching between two representations of roads in the same region.
Two important types of vertical conflation are:
-Version matching and
-Feature Alignment

In version matching, the input datasets consist of different versions of the same features. The conflation process helps identify matching features. Attributes are transferred between matched features and unmatched features are transferred completely. For example, matching different versions of road networks for the same geographical area.
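A minimal sketch of version matching follows. It assumes, purely for illustration, that both versions share stable feature IDs so that matching is a key lookup; in practice matching is usually geometric. The feature layout (`geom`, `attrs`) and the function name `conflate_versions` are hypothetical.

```python
from copy import deepcopy


def conflate_versions(reference, subject):
    """Merge two versions of a dataset keyed by feature ID.

    Matched features keep the reference geometry and gain any attributes
    that only the subject carries. Unmatched subject features are copied
    across whole.
    """
    result = deepcopy(reference)
    for fid, feature in subject.items():
        if fid in result:
            for name, value in feature["attrs"].items():
                # Existing reference attributes win over subject values.
                result[fid]["attrs"].setdefault(name, value)
        else:
            result[fid] = deepcopy(feature)
    return result


# Hypothetical road networks for the same area.
roads_v1 = {
    "road-1": {"geom": [(0.0, 0.0), (10.0, 0.0)], "attrs": {"name": "Main St"}},
    "road-2": {"geom": [(0.0, 5.0), (10.0, 5.0)], "attrs": {"name": "Oak Ave"}},
}
roads_v2 = {
    "road-1": {"geom": [(0.0, 0.2), (10.0, 0.1)],
               "attrs": {"name": "Main Street", "lanes": 2}},
    "road-3": {"geom": [(5.0, 0.0), (5.0, 5.0)], "attrs": {"name": "Link Rd"}},
}
merged = conflate_versions(roads_v1, roads_v2)
```

Which dataset wins on conflicting attributes is a policy decision; this sketch simply prefers the reference.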

In feature alignment, the input data consists of features from two or more different feature classes that have some defined relationship to each other. The conflation process is aimed at removing discrepancies between datasets that violate this relationship. Geometric alignment is one example; a specific case is aligning the boundaries of different kinds of feature classes, such as municipal districts.

Internal conflation involves resolving features or elements within a single dataset. For example, coverage cleaning may require the removal of overlaps between polygons within a coverage.
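Before overlaps can be removed they have to be found. The sketch below detects overlapping pairs within a single dataset, using axis-aligned rectangles `(xmin, ymin, xmax, ymax)` as a deliberate simplification of real polygons; the parcel data and function names are hypothetical.

```python
from itertools import combinations


def overlap_area(a, b):
    """Area shared by two axis-aligned rectangles (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


def find_overlaps(polygons):
    """Return (id_a, id_b, area) for every pair that overlaps.

    Polygons that only touch along an edge have zero shared area and
    are not reported.
    """
    overlaps = []
    for (id_a, a), (id_b, b) in combinations(polygons.items(), 2):
        area = overlap_area(a, b)
        if area > 0:
            overlaps.append((id_a, id_b, area))
    return overlaps


# Hypothetical parcels: A and B overlap by a 1 x 10 strip, B and C only touch.
parcels = {
    "A": (0.0, 0.0, 10.0, 10.0),
    "B": (9.0, 0.0, 20.0, 10.0),
    "C": (20.0, 0.0, 30.0, 10.0),
}
overlaps = find_overlaps(parcels)
```

The pairwise check is quadratic in the number of features, so a larger coverage would normally use a spatial index to limit the candidate pairs.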

CONFLATION WORKFLOW:
The process of conflation can be broken down into the following sub tasks:
i. Data pre-processing: This step normalizes the datasets and ensures that they are compatible. This may involve format translation and other basic preparation of the datasets. An example of data pre-processing is ensuring that the datasets use the same coordinate system.
ii. Data quality assurance: In this step, the internal consistency of the datasets is verified and improved if necessary. Some conflation tasks require that the datasets have a certain level of internal consistency. For example, coverage alignment algorithms require that each input dataset is a clean coverage.
iii. Dataset alignment: If the datasets are misaligned, an initial alignment is required before precise conflation can be carried out. This alignment is coarse-grained in nature and does not align individual features.
iv. Feature matching: This step involves matching common features between the datasets. Once this phase is complete, the discrepancies between the datasets have been identified and can be visualised. The match results can also be used to provide statistical summaries of data quality.
v. Geometry alignment: This step removes discrepancies between geometries.
vi. Information transfer: This step involves updating one dataset with information from the other. This information can be either attributes or geometry to be added to an existing feature, or entire features to be added to the dataset.
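Steps iii and iv can be sketched as follows. The example assumes point features as `(x, y)` tuples, a coarse alignment that is a pure translation bringing the two centroids together, and nearest-neighbour matching within a tolerance. These are illustrative choices, not the only way to perform either step, and all function names are hypothetical.

```python
from math import hypot
from statistics import mean


def coarse_align(subject, reference):
    """Step iii: translate the subject so its centroid meets the reference centroid."""
    dx = mean(x for x, _ in reference) - mean(x for x, _ in subject)
    dy = mean(y for _, y in reference) - mean(y for _, y in subject)
    return [(x + dx, y + dy) for x, y in subject]


def match_features(subject, reference, tolerance):
    """Step iv: nearest-neighbour matching as (subject_idx, reference_idx, distance).

    Simplification: several subject points may match the same reference point.
    """
    matches = []
    for i, (sx, sy) in enumerate(subject):
        dist, j = min(
            (hypot(rx - sx, ry - sy), j) for j, (rx, ry) in enumerate(reference)
        )
        if dist <= tolerance:
            matches.append((i, j, dist))
    return matches


def summarize(matches, subject):
    """Statistical summary of the match results."""
    distances = [d for _, _, d in matches]
    return {
        "match_rate": len(matches) / len(subject),
        "mean_residual": mean(distances) if distances else None,
    }


# Hypothetical data: the subject is the reference shifted by (+5, -3).
reference = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
subject = [(5.0, -3.0), (15.0, -3.0), (15.0, 7.0), (5.0, 7.0)]

before = summarize(match_features(subject, reference, tolerance=1.0), subject)
aligned = coarse_align(subject, reference)
after = summarize(match_features(aligned, reference, tolerance=1.0), aligned)
```

With this data nothing matches before the coarse alignment and every feature matches afterwards, which is why the alignment step precedes matching. A centroid shift only makes sense when both datasets cover largely the same features; rotation or scale differences would need a fuller transformation.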

SCHEMATIC REPRESENTATION OF CONFLATION WORKFLOW