Thursday, December 3, 2015

Output formatting

GIS output is in the form of:

  1. Maps
  2. Charts

  1. Charts can be used to display tabular data
  2. Six types of chart are generally available:
    1. Area chart
    2. Bar chart
    3. Line chart
    4. Column chart
    5. Pie chart and
    6. Scatter chart
A map is a spatial model of the real world and is differentiated from it (the real world) due to:
  • Abstraction 
    • Imaginary & physical features
    • Past, Present and future features
  • Focus
    • Selection and classification of features to be included in map
  • Simplification
    • Simplification of complex features 
    • Exaggeration of small features
  • Symbolization
    • Using symbols to represent objects
  • Scale
    • Ratio of distance on map to distance on Earth
  • Projection
    • Representing curved Earth on a flat plane
  • Purpose 
    • To describe, measure, communicate/persuade.
Types of maps 
  1. Planimetric
  2. Topographic
  3. Cadastral
  4. Image
  5. Thematic
A few general principles for creating maps are listed below:
  1. Output maps (GIS output) should be simple to understand. They should not be cluttered with too much information, which often results from relying excessively on software defaults.
  2. Only the area under study should be included, along with a small inset map showing its relative location.
  3. The maps should be created keeping in mind the final publication scale. This implies a size reduction for report or journal publication. It should be borne in mind that point symbols reduce in visibility at twice the rate of line features.
  4. The title should be placed at the top in a thick-line (bold) font; serif or sans-serif may be used and case is not critical.
  5. Map labels should be formatted as per the following guidelines:
    • Labels should be in serif font (first letter uppercase and remaining lowercase)
    • Labels should be placed above and to the right of the feature (2nd choice: above and to the left)
    • Font size hierarchy should indicate relative importance
    • Water features are traditionally labeled in italic font
  6. The map should be created in black and white and the colour should be added later
Colour
Hue refers to the actual colour, saturation refers to the amount of that colour, and value refers to the amount of black used.
For optimum readability, low saturation in the background and high saturation in the foreground is recommended.

Choice of shading
In case of:
  • Quantitative data:
    • Symbols should have visual progression corresponding to data values
    • For polygons, a monochromatic colour ramp should be used: same colour, different saturation
      • The top and bottom classes can be highlighted with a contrasting colour
      • "Visual progression" should be ensured if different fill patterns are used
    • For point symbols, different sizes of the same symbol must be used.
  • Qualitative data:
    • For polygons, different fill colours or different patterns should be used
    • For point symbols, same size of different symbols should be used
  • The legend should be big enough to clearly show different fill patterns
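
The monochromatic ramp recommended above for quantitative polygon data can be sketched with Python's standard colorsys module. This is only an illustration: the function name monochrome_ramp, the hue value and the number of classes are assumptions made for the example and do not come from any particular GIS package.

```python
import colorsys

def monochrome_ramp(hue, n_classes, value=1.0):
    """Return n_classes hex colours sharing one hue, with rising saturation."""
    colours = []
    for i in range(n_classes):
        # Saturation rises linearly from a pale tint to the full colour.
        saturation = (i + 1) / n_classes
        r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
        colours.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colours

# Hypothetical example: four classes of red, lowest data class first.
ramp = monochrome_ramp(hue=0.0, n_classes=4)
```

Low data values receive the palest colour and high values the most saturated one, which gives the visual progression the guidelines ask for.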

Simple analysis in GIS


  1. Spatial measurements: GIS makes spatial measurements easy to perform. Spatial measurements can be
    1. Distance between two points
    2. The area of a polygon
    3. The length of a line or boundary.
    4. Calculations can be 
      1. of a simple nature, such as measuring areas on one map, or 
      2. complex, such as measuring overlapping areas on two or more maps. 
  2. Information Retrieval: A GIS lets the user point at a location, object, or area on the screen and retrieve the recorded information about it from the Database Management System (DBMS), which holds the information about the map’s features. For a GIS to answer the question "what is where?", retrieval must be carried out. Retrieval is the ability of the DBMS or GIS to get back, on demand, data that were previously stored (Clarke, 1997). As Clarke put it, "Geographic search is the secret to GIS data retrieval", so GIS systems either embed a DBMS or link to a commercial one.
  3. Searches by attribute: Most GIS systems include a basic relational database system, and all DBMSs include functions for basic data display. Searches by attribute are therefore controlled by the capabilities of the database manager. Find is the basic attribute search and is intended to retrieve a single record.
  4. Searches by geography: In a map database the records are features. Spatial retrieval in a GIS works by generating maps, which allow the user to search for information visually and highlight the result. Each DBMS operation has a spatial equivalent: the equivalent of generating a report is producing a finished map, and the equivalent of a find is a locate. Spatial equivalents of DBMS queries result in locating sets of features or in building new GIS layers. They include spatial searching, browsing the map and picking features, and spatial sorting to identify features that result from attribute sorting. Combinations of spatial and attribute queries can build complex and powerful GIS operations. Typical GIS searches are point in polygon, line in polygon, and point distance to line.
  5. Spatial overlay: One basic way to create or identify spatial relationships is through the process of spatial overlay. Spatial overlay is accomplished by joining and viewing together separate data sets that share all or part of the same area. The result of this combination is a new data set that identifies the spatial relationships.
  6. Boundary analysis: Boundary analysis, also referred to as districting, helps define regions according to certain criteria. This procedure is used to define areas with specific demographic characteristics. Since districting is normally an iterative process involving the development of numerous scenarios based on various combinations of desired criteria, the computing power of the GIS saves time and effort. A GIS helps to define proposed boundaries interactively and to have the related population totals computed automatically, rapidly and efficiently.
  7. Buffer analysis: Buffer analysis is used to identify areas surrounding geographic features. The process involves generating a buffer around existing geographic features and then identifying or selecting features based on whether they fall inside or outside the boundary of the buffer.
  8. Neighborhood Operations: Neighborhood operations evaluate the characteristics of the area surrounding a specific location. They include the following: Search (Average, Diversity, Majority, Maximum/Minimum, and Total), Topographic, Interpolation (using known cell values to predict the values of intermediate cells), and Contour Generation.
  9. Connectivity Functions: Connectivity functions involve traversing an area and accumulating values: Contiguity measures, Proximity, Network functions, Spread, Seek and Stream functions
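
The three spatial measurements listed in item 1 can be sketched in a few lines of Python. This is a minimal illustration that assumes planar (projected) coordinates in consistent units; the function names are made up for the example, and a real GIS would also handle geographic coordinates and polygons with holes.

```python
import math

def distance(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def line_length(vertices):
    """Length of a line or boundary given as a list of (x, y) vertices."""
    return sum(distance(a, b) for a, b in zip(vertices, vertices[1:]))

def polygon_area(vertices):
    """Area of a simple polygon (shoelace formula); the ring is not repeated."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical 4 x 3 rectangular parcel.
parcel = [(0, 0), (4, 0), (4, 3), (0, 3)]
parcel_area = polygon_area(parcel)
```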

Modelling in GIS

A model is a representation of some part of the real world and hence has certain characteristics in common with the real world. It is therefore possible to study and operate on the model, instead of the real world, under various conditions. This is a very effective method to answer 'what if' questions. The procedure can be repeated by changing the data or altering the parameters of the model.
A map, being a miniature representation of some part of the real world, is a model; databases are also models. Maps and databases are usually static models.
Modelling and GIS are inseparable, as GIS is a tool for modelling the real world.
Models can be static if the input and output both correspond to the same point in time, or dynamic, if the output represents a later point in time than the input. Static models often take the form of indicators combining various inputs to create a useful output. On the other hand, dynamic models represent a process that modifies or transforms some aspect of the Earth's surface through time.
In the context of GIS, modeling is defined as the operations of GIS that emulate processes of the real world at an instant or over a period of time. GIS models can be used to evaluate or predict future landscapes.

An analog GIS model is a representation of the real-world system in which every part of the real world appears in miniature in the model. In an analog model, all aspects of the system must be scaled by the same ratio for the model to be valid.
In a digital model, all operations are conducted using a computer. Data are assembled in a data model, and relevant aspects of the real world are coded as patterns of 0s and 1s. Digital models do not have a representative fraction; instead, the level of geographic detail is captured in the spatial resolution.

Discrete models imitate processes that operate between discrete entities. Continuous models are formed on variables that are continuous functions of location. The discrete-object view and the continuous-field view are the two widely accepted conceptualizations of geographic space and geographic variation.
In the discrete-object view, geographic space is empty except where it is occupied by point, line or area objects, which may overlap, do not exhaust all available space and are countable. For example, the discrete-object view is best suited to describing and representing biological organisms or man-made features like buildings, vehicles or fire hydrants.
In the continuous-field view, there are no gaps in coverage and each variable has exactly one value at every location. Continuous-field models express knowledge of the operation of the physical system in terms of partial differential equations, which relate the values, rates of change through time, spatial gradients and spatial curvatures of the continuously varying quantities.

Individual and aggregate models: It is possible to model any system using a set of rules about the mechanical behavior of the system’s basic objects (individual model). However, if the number of basic objects is far too large for this approach to be practical, the problem is solved by replacing individual objects with continuously varying estimates of abstracted properties such as density. Another approach is to aggregate (aggregate model) individual objects into larger wholes and to model the system through the behavior of these aggregates.

Cellular Automata: In a cellular automaton, spatial variation is represented as a raster of fixed resolution, and each cell is assigned to one of a number of defined states. Such models have been used widely to study processes of urban growth, in which case the set of possible states is limited. At each time step, the next state of each cell is determined by a number of rules based on the properties and states of the cell and its neighbors. The concept of cellular automata originated with Stanislaw Ulam and John von Neumann in the 1940s and was later popularized by John Conway's Game of Life.
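
A single time step of a very simple urban-growth cellular automaton might look like the sketch below. The two states (0 = undeveloped, 1 = urban), the eight-cell neighbourhood and the rule "become urban when at least two neighbours are urban" are assumptions chosen for illustration, not a published growth model.

```python
def step(grid, threshold=2):
    """Advance the raster one time step and return the new grid."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                continue  # urban cells stay urban in this toy rule set
            # Count urban cells among the (up to) eight neighbours.
            urban_neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        urban_neighbours += grid[rr][cc]
            if urban_neighbours >= threshold:
                new_grid[r][c] = 1
    return new_grid

grid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
after = step(grid)
```

All cells are updated from the state of the previous time step (the original grid is never modified in place), which is what makes the update synchronous.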

Agent-based modeling is a computer-based simulation in which a program is written to simulate the real-world situation. An agent-based model consists of an environment or framework that defines the scope and rules of actions, along with a number of agents representing one or more actors whose parameters and behaviors are defined. When the model is run, the characteristics of each agent are tracked through time and space. Agent-based modeling has found many interesting applications to geographic phenomena. Several efforts have been made to apply agent-based modeling to the emergence of land-use and land-cover patterns, with particular emphasis on the processes that lead to greater fragmentation of land cover as a result of development and thus to problems for species that require specialized natural habitat.

Maintenance and analysis of spatial and attribute data

Maintenance and analysis of spatial data
Maintenance and analysis of spatial data can be performed by one or more of the operations described below:

  1. Format transformation - These functions convert between data formats of different systems. For example: reading a DXF file into a GIS
  2. Geometric transformation - These functions help transform device coordinates (coordinates from digitizing tables or screen coordinates) into world coordinates (geographic coordinates, etc)
  3. Map projections - These functions help map spatial geographic coordinates on a flat surface (map) and vice-versa
  4. Edge matching - This is the process of joining two or more map sheets. Feature representations should be matched at map sheet boundaries.
  5. Editing of graphic elements - These functions are used to remove and correct errors in order to prepare a clean dataset for topology building
  6. Coordinate thinning - This function involves removal of redundant coordinates from line representations. 
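
Coordinate thinning (item 6) can be illustrated with the Douglas-Peucker algorithm, one common line-simplification method. The notes above do not say which method a given GIS uses, so this choice, the function names and the tolerance value are assumptions made for the sketch.

```python
import math

def point_to_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the projection of p along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def thin_line(points, tolerance):
    """Drop vertices that lie within tolerance of the simplified line."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the chord joining the end points.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_to_segment_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    left = thin_line(points[:index + 1], tolerance)
    right = thin_line(points[index:], tolerance)
    return left[:-1] + right  # the split vertex appears once

line = [(0, 0), (1, 0.01), (2, 0), (3, 5), (4, 0)]
thinned = thin_line(line, tolerance=0.1)
```

The nearly collinear vertex (1, 0.01) is removed, while the sharp peak at (3, 5) is kept because it deviates from the simplified line by more than the tolerance.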
Maintenance and analysis of attribute data
The functions involving maintenance and analysis of attribute data correspond to functions of conventional database systems and contain attribute editing and attribute query functions.

Monday, November 30, 2015

Text patterns and line styles

Text Patterns


Line styles
Line Styles define how line features are rendered on a map. Line styles offer numerous controls for color, gradient or pattern fills, dash patterns and many other properties.

Line styles have the following properties listed below:

  1. Name - Each line style is given a name for easy identification and use when creating a map
  2. Colour - This property controls the colour and appearance of the line feature. The line can be filled with a colour, gradient, hatch pattern or bit mapped texture.
  3. Thickness - This property sets the thickness of the line (in pixels).

Pattern Type - Selects whether to use a basic (predefined) dash pattern or create a custom pattern. 
Dash Style - Allows a choice of dash style to be used. The following styles are generally used:
  1. Solid: This pattern draws a solid outline without dashing and this is the default setting
  2. DashDot: This pattern draws a dashed outline with a “dash-dot” pattern.
  3. DashDotDot: This pattern draws a dashed outline with a “dash-dot-dot” pattern
  4. Dot: This pattern draws a finely dotted outline
  5. Dash: This pattern draws a dashed outline
A Custom Pattern Type is also available. When it is selected, the Dash Style property is displayed as a text box in which a custom dash pattern is entered. Dash patterns are expressed as a comma-delimited list of pixel lengths. The first number represents the length of the first dash in the pattern, the next number represents the length of the space after that dash, and so on. When the end of the sequence is reached, it repeats.
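
How such a comma-delimited pattern expands along a line can be sketched as follows. The function names are hypothetical and the code is not taken from any mapping product; it assumes that every repeat of the pattern starts with a dash, which is unambiguous when the pattern has an even number of entries.

```python
def parse_dash_pattern(text):
    """Turn a comma-delimited string such as "5,3" into a list of pixel lengths."""
    lengths = [int(part) for part in text.split(",")]
    if any(n <= 0 for n in lengths):
        raise ValueError("dash and gap lengths must be positive")
    return lengths

def dash_segments(pattern, line_length):
    """Return (start, end, is_dash) spans covering a line of line_length pixels."""
    segments = []
    position = 0
    index = 0
    while position < line_length:
        slot = index % len(pattern)  # wrap around when the sequence ends
        end = min(position + pattern[slot], line_length)
        # Even slots are dashes, odd slots are the spaces between them.
        segments.append((position, end, slot % 2 == 0))
        position = end
        index += 1
    return segments

segments = dash_segments(parse_dash_pattern("5,3"), 12)
```

For the pattern "5,3" on a 12-pixel line this yields a 5-pixel dash, a 3-pixel space and a final dash cut off at the end of the line.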

A Line Join determines what type of joint to draw at the point where one line ends and another begins.

Sampling GIS data

Since it is virtually impossible to obtain detailed information regarding every location on the Earth's surface (specifically for GIS), it is best to take samples from a smaller representative subset. GIS deals with explicitly spatial data. The two ways in which spatial data can be sampled are:
-Directed sampling and
-Non-directed sampling
Directed sampling involves making decisions about the objects to be viewed and cataloged. This population occupies an area and both the area and the sample population form the sampling frame.
Non-directed, probability-based sampling is preferable since it eliminates bias.
Probabilistic sampling is of four types:
-Random sampling
-Systematic sampling
-Stratified sampling and
-Homogeneous sampling
The above sampling types can be combined to create a hybrid sampling scheme.
Random spatial sampling is the most basic sampling design. It allows each point, line, area or surface feature to be selected with the same probability as the next.
Systematic sampling uses a repeatable pattern as a basis for selection.
Stratified spatial sampling selects small areas within which individual spots, objects or features are sampled. Stratifying simplifies the process by dividing the task into small regions.
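
The random, systematic and stratified designs described above can be sketched for point samples inside a rectangular study area. The bounds tuple (xmin, ymin, xmax, ymax), the square-grid pattern and the use of grid cells as strata are assumptions made for this illustration; real strata are often based on land cover or administrative units.

```python
import random

def random_sample(bounds, n, rng):
    """n points, each location in the area equally likely."""
    xmin, ymin, xmax, ymax = bounds
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

def systematic_sample(bounds, spacing):
    """Points on a regular square grid, a repeatable pattern."""
    xmin, ymin, xmax, ymax = bounds
    points = []
    y = ymin + spacing / 2
    while y < ymax:
        x = xmin + spacing / 2
        while x < xmax:
            points.append((x, y))
            x += spacing
        y += spacing
    return points

def stratified_sample(bounds, rows, cols, per_stratum, rng):
    """Divide the area into rows x cols strata and sample randomly in each."""
    xmin, ymin, xmax, ymax = bounds
    width = (xmax - xmin) / cols
    height = (ymax - ymin) / rows
    points = []
    for r in range(rows):
        for c in range(cols):
            cell = (xmin + c * width, ymin + r * height,
                    xmin + (c + 1) * width, ymin + (r + 1) * height)
            points.extend(random_sample(cell, per_stratum, rng))
    return points

rng = random.Random(42)  # fixed seed so the sample can be reproduced
area = (0, 0, 10, 10)
grid_points = systematic_sample(area, spacing=5)
strata_points = stratified_sample(area, rows=2, cols=2, per_stratum=3, rng=rng)
```

The stratified function combines two of the designs (strata plus random selection inside each stratum), which is an example of the hybrid schemes mentioned above.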

Data quality refers to the fitness for use of data for intended applications. The components of data quality are listed below:
-The data must be reliable and accurate to be considered as usable. It should be in agreement with the real world being represented.
-The data must be current and up-to-date for the intended application.
-The data must be relevant in terms of content and level of detail
-The data must be timely for the application
-The data must be complete and precise or the degree of uncertainty must be indicated
-The data must be concise and intelligible (comprehensible to the user)
-The data must be stored in a format that can be conveniently handled (data handling involves one or more of the following operations: maintenance, transmission, distribution, classification, re-sampling, retrieval and updating)
-The data must be documented regarding its source to enable users to determine its suitability for a certain application
-The data must be stored in the map projection that best meets the requirements of the application with regard to the preservation of area, shape, distance and direction
-The data should be captured at a scale using a classification scheme suitable to the application for which it will be used
-Cartographic properties (physical condition of mapping media, quality of line work, use of color and symbology, classification of features, map maintenance and revision cycle) play an important role in determining the quality of data. 

Types of errors in GIS data, their elimination and accuracies

Maps created by GIS have deficiencies. These deficiencies occur due to “Errors” that may have taken place at different stages of GIS implementation. Flaws in data are referred to as errors. An error is the physical difference between the real world and the GIS output. Errors may be single, definable departures from reality, or may be persistent, widespread deviations throughout a whole database. Errors reduce the accuracy of the map generated. Errors can be avoided by using well-defined and controlled procedures.
There are two types of errors in GIS:
  I. Source Errors:
    They are the errors that are present in “Source Data” that is given to the GIS. They occur before the actual implementation of GIS:
    a. Instrumental inaccuracies:
      ● Satellite/ air photo/ GPS/ surveying (spatial).
      ● Inaccuracies in attribute measuring instruments.
    b. Human Processing:
      ● Misinterpretation (e.g. photos), spatial and attribute.
      ● Effects of scale change and generalization.
      ● Effects of classification (nominal / ordinal / interval).
    c. Actual Changes:
      ● Gradual 'natural' changes: river courses, glacier recession.
      ● Catastrophic change: fires, floods, landslides.
      ● Seasonal and daily changes: lake/sea/ river levels.
      ● Man-made: urban development, new roads.
      ● Attribute change: forest growth (height etc.), discontinued trail / roads, road surfacing.

  II. Processing Errors:
    They are the errors that occur during the processing of the data i.e. during the implementation of GIS.
    a. Input:
      ● Digitizing: human error, the width of a line, spikes, knots, also entering attribute data.
      ● Dangling nodes (connected to only one arc): permissible in arc themes (river headwaters etc.).
      ● Pseudo-nodes (connected to one or two arcs) - permissible in island arcs, and where attributes change, e.g. road becomes paved from dirt or vice versa.
      ● Projection input error.
    b. Manipulation:
      ● Interpolation of point data into lines and surfaces.
      ● Overlay of layers, digitized separately, e.g. soils and vegetation.
      ● The compounding effects of processing and analysis of multiple layers: for example, if two layers each have correctness of 90%, the accuracy of the resulting overlay is around 81%.
      ● Density of observations.
      ● Inappropriate or inadequate inputs for models.
    c. Output:
      ● Scale changes - detail and scale bars.
      ● Color palettes: intended colors don't match from screen to Printer.

In addition to the above, errors may also be introduced during data transfer and conversion. These types of errors may lead to loss of data.
It is a misconception that data from a GIS is of a higher quality. The quality of information from a GIS depends upon the quality  of the data used as input to the GIS.
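
The compounding effect listed under manipulation errors (two layers at 90% giving about 81%) follows from multiplying the layer accuracies. That multiplication assumes the errors in the layers are independent of one another, which is a simplification. A minimal sketch of the arithmetic, with an illustrative function name:

```python
def overlay_accuracy(layer_accuracies):
    """Expected accuracy of an overlay, assuming independent layer errors."""
    result = 1.0
    for accuracy in layer_accuracies:
        result *= accuracy
    return result

two_layers = overlay_accuracy([0.90, 0.90])          # about 0.81
three_layers = overlay_accuracy([0.90, 0.90, 0.90])  # about 0.73
```

Each additional layer lowers the expected accuracy further, so the quality of the weakest inputs matters more as an analysis grows.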

Precision indicates how closely several positions fall in relation to each other. Precision is the recorded level of detail of the data.

Accuracy is a measure of the closeness of one or more positions to a position that is known and defined in terms of an absolute reference system. Accuracy is the extent to which an estimated data value approaches its true value (Aronoff, 1989).

Bias in GIS data is the systematic variation of data from reality. Bias is a consistent error throughout a dataset. Bias in GIS data may be caused due to human or technical sources.
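
The difference between precision, accuracy and bias can be made concrete with repeated measurements of one coordinate whose true value is known. In this sketch, bias is taken as the mean error, precision as the spread (standard deviation) of the measurements, and accuracy as the root-mean-square error. These are common statistical choices rather than definitions given in the notes above, and the sample values are invented.

```python
import math
import statistics

def describe_measurements(measured, true_value):
    """Return (bias, precision, rmse) for repeated measurements of one value."""
    errors = [m - true_value for m in measured]
    bias = statistics.mean(errors)           # consistent offset from reality
    precision = statistics.pstdev(measured)  # how closely the values cluster
    rmse = math.sqrt(statistics.mean([e * e for e in errors]))  # overall accuracy
    return bias, precision, rmse

# Hypothetical eastings (metres) from a receiver that reads about 2 m too high.
bias, precision, rmse = describe_measurements([102.0, 102.1, 101.9], 100.0)
```

These measurements are precise (they fall within 0.1 m of each other) but not accurate, because a bias of about 2 m shifts all of them away from the true position.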

ERROR, PRECISION, ACCURACY and BIAS affect the quality of individual datasets.

GIS error elimination
The simplest method of checking for data errors in GIS is by visual inspection. Comparison of data in GIS format with the original map reveals major errors. Double digitising is an error checking method used by large companies. This involves digitising the same map twice and comparing the two copies to identify inconsistencies. This is a costly and time consuming method of error checking. Statistical methods can be used to pinpoint potential errors. 

Components of data quality

There are two important components of data quality. They are:
-Terminology used for describing problems and
-Sources, propagation and management of errors.
GIS analysis and results rely heavily on access to reliable, good quality data.
Quality is defined as "degree of excellence". In the case of GIS, data quality indicates how good the data are. This in turn indicates the suitability of the data for a specific purpose. Data sets used for analysis should be COMPLETE, COMPATIBLE, CONSISTENT and APPLICABLE for the analysis being performed.
A COMPLETE dataset covers the entire study area and the time period of interest; that is, the data are complete spatially and temporally and carry a complete set of attribute information.
COMPATIBLE datasets are developed using SIMILAR METHODS of data capture, storage, manipulation and editing.
CONSISTENT data should come from a SINGLE SOURCE DOCUMENT and be digitized by a SINGLE PERSON.
APPLICABILITY refers to SUITABILITY of data for a set of COMMANDS, OPERATIONS or ANALYSIS. 

Graphic symbols

A symbol is a graphic pattern that is used to represent a feature on a map. According to the type of features they represent, symbols are classified into marker symbols (representing point and node features), line symbols (representing arcs, routes, etc), shade symbols (to fill polygons and regions with solid colour or shade patterns) and text symbols (representing text used to label features).
Symbols on a map are either points, lines or areas. Each symbol may differ in size, shape, density, texture, orientation and colour. A poor match between the real world and the symbol used to depict it can confuse the user regarding its true nature. Colour is important in influencing the user's understanding of the map, and the overall impact of the map can be affected by colour and symbolism. Black and white maps, which use differently shaped symbols and shading patterns, are effective for simple patterns and are easier and cheaper to reproduce. Colour maps are used to represent complex spatial patterns effectively. The shape and pattern of the symbolism used should bear some relation to the feature being represented.
Careful choice of the shape and pattern of the symbolism used can influence the user's impression of the map, as can the density and texture of shading. To ensure correct interpretation, the map user is given the necessary information by a key or legend. A key tells the user what the shading patterns, colours, line and point symbols mean.
Intellectual objectives and visual objectives of map design are sometimes in conflict. Both the logical and the aesthetic roles of symbols on the map dictate a consistent approach. Over generations, cartographers have established an exhaustive set of conventions and traditions to deal with such cases. These conventions were the result of trial and error and of testing among map users. They act as guidelines for cartographic representations.


Friday, November 27, 2015

Types of errors in GIS

Maps created by GIS have deficiencies.  These deficiencies occur due to “Errors” that may have taken place at different stages of GIS implementation. These errors reduce the accuracy of the map generated. Using well defined and controlled procedures these errors can be avoided.
There are two types of errors in GIS:
  I. Source Errors:
    They are the errors that are present in “Source Data” that is given to the GIS. They occur before the actual implementation of GIS:
    a. Instrumental inaccuracies:
      ● Satellite/ air photo/ GPS/ surveying (spatial).
      ● Inaccuracies in attribute measuring instruments.
    b. Human Processing:
      ● Misinterpretation (e.g. photos), spatial and attribute.
      ● Effects of scale change and generalization.
      ● Effects of classification (nominal / ordinal / interval).
    c. Actual Changes:
      ● Gradual 'natural' changes: river courses, glacier recession.
      ● Catastrophic change: fires, floods, landslides.
      ● Seasonal and daily changes: lake/sea/ river levels.
      ● Man-made: urban development, new roads.
      ● Attribute change: forest growth (height etc.), discontinued trail / roads, road surfacing.

  II. Processing Errors:
    They are the errors that occur during the processing of the data i.e. during the implementation of GIS.
    a. Input:
      ● Digitizing: human error, the width of a line, spikes, knots, also entering attribute data.
      ● Dangling nodes (connected to only one arc): permissible in arc themes (river headwaters etc.).
      ● Pseudo-nodes (connected to one or two arcs) - permissible in island arcs, and where attributes change, e.g. road becomes paved from dirt or vice versa.
      ● Projection input error.
    b. Manipulation:
      ● Interpolation of point data into lines and surfaces.
      ● Overlay of layers, digitized separately, e.g. soils and vegetation.
      ● The compounding effects of processing and analysis of multiple layers: for example, if two layers each have correctness of 90%, the accuracy of the resulting overlay is around 81%.
      ● Density of observations.
      ● Inappropriate or inadequate inputs for models.
    c. Output:
      ● Scale changes - detail and scale bars.
      ● Color palettes: intended colors don't match from screen to Printer.

Wednesday, November 25, 2015

Neighbourhood operations


Neighborhood operations, also called proximity analyses, consider the characteristics of neighboring areas around a specific location. These functions either modify existing features or create new feature layers, which are influenced, to some degree, by the distance from existing features. All GIS programs provide some neighborhood analyses, which include buffering, interpolation, Thiessen polygons, and various topographic functions.

Buffering
Buffering creates physical zones around features. These “buffers” are usually based on specific straight-line distances from selected features. Buffers, which are common to both raster and vector systems, are created around point, line, or polygon features. The resulting buffers are placed in an output polygon feature layer. Once complete, buffer layers are used to determine which features (in other layers) occur either within or outside the buffers (spatial queries), to perform overlay, or to measure the area of the buffer zone. Buffering is the most used neighborhood operation.
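
The spatial query step (which features fall inside or outside a buffer) can be sketched for the simplest case, a circular buffer around a single point feature. The feature names, coordinates and function name below are invented for the illustration, and planar coordinates are assumed.

```python
import math

def select_by_buffer(features, centre, radius):
    """Split named point features into those inside and outside the buffer."""
    inside, outside = [], []
    for name, (x, y) in features.items():
        # A feature is inside when its straight-line distance is within the radius.
        if math.hypot(x - centre[0], y - centre[1]) <= radius:
            inside.append(name)
        else:
            outside.append(name)
    return inside, outside

# Hypothetical wells around a pollution source at (0, 0), distances in metres.
wells = {"W1": (30, 40), "W2": (300, 400), "W3": (0, 100)}
inside, outside = select_by_buffer(wells, centre=(0, 0), radius=100)
```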

Interpolation
Interpolation is a method of predicting or estimating pixel values at unsampled locations based on the known values of neighboring pixels.  Since it is impractical to take measurements at all locations across the study area due to money, time, legal, and physical constraints, interpolation is required between known pixel values (sampled locations).  With interpolation, a continuous surface like elevation, temperature, and soil characteristics can be created.  Because of its continuous nature, interpolation is only available within raster-based systems.
There are many different types of interpolation:

  1. Linear interpolation, the simplest form, assumes that the value changes uniformly with distance between recorded pixels. This method is not always appropriate, so other methods are also used:
  2. Fixed-radius Local Averaging
  3. Inverse Distance Weighted
  4. Trend Surface
  5. Splines and
  6. Kriging.  
All of these interpolation methods look at the values of the recorded pixels to generate the value of the pixels that fall in between.  The methods differ in how they weigh the recorded attributes and in the number of observations used for each method.  No method is accurate in every situation.
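
As an example of how recorded values are weighted, Inverse Distance Weighted interpolation gives nearer samples more influence than distant ones. The sketch below uses the common weight 1 / distance^power with power = 2; the function name and the sample data are assumptions for the illustration.

```python
import math

def idw(samples, target, power=2.0):
    """Estimate the value at target from (x, y, value) samples."""
    numerator = 0.0
    denominator = 0.0
    for x, y, value in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # the target coincides with a sampled location
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical elevation samples (x, y, elevation).
samples = [(0, 0, 10.0), (2, 0, 20.0)]
midpoint = idw(samples, (1, 0))  # equally distant from both samples
near_first = idw(samples, (0.5, 0))
```

At the midpoint both samples receive the same weight, so the estimate is their average; closer to the first sample the estimate moves towards its value.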


Thiessen polygons (Voronoi or proximal polygons)
Thiessen polygons are boundaries created around points within a point layer. The resultant polygons form around each of the points, and they delineate territories within which any location is closer to the point that generated the polygon than to any other point in the layer. Attributes associated with each point are assigned to the resultant polygon. It is a vector and raster process.
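
In a raster setting, the same idea reduces to labelling every cell with the nearest input point. The sketch below is a brute-force illustration with made-up point names and a tiny grid; vector GIS packages construct the polygon boundaries geometrically instead.

```python
def thiessen_raster(points, rows, cols):
    """Label each cell with the id of the closest point.

    points maps an id to a (row, col) position; ties go to the first id.
    """
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            nearest = min(
                points,
                key=lambda k: (points[k][0] - r) ** 2 + (points[k][1] - c) ** 2,
            )
            row.append(nearest)
        grid.append(row)
    return grid

# Two hypothetical rain gauges at either end of a 1 x 4 raster.
territories = thiessen_raster({"A": (0, 0), "B": (0, 3)}, rows=1, cols=4)
```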

Topographic Functions
Topographic functions use Digital Elevation Models (DEMs) to illustrate the lay of the land.  DEMs are raster layers containing elevation data in each pixel.  From these values, you produce output layers to portray slope (inclination), aspect (direction), and hillshading.  These topographic functions are typical neighborhood processes; each pixel in the resultant layer is a product of its own elevation value as well as those of its surrounding neighbors.

Slope layers exhibit the incline or steepness of the land.  It is the change in elevation over a defined distance.
Aspect is the compass direction in which a slope faces.  From north, it is usually expressed clockwise from 0 to 360 degrees.
Hillshading, which is cartographically called shaded relief, is a lighting effect which mimics the sun to highlight hills and valleys.  Some areas appear to be illuminated while others lie in shadows.
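
A minimal sketch of slope and aspect for one pixel, computed from its four direct neighbours by simple central differences. This is an assumption made for illustration; GIS packages often use larger weighted kernels. The function name slope_aspect and the sample window are hypothetical.

```python
import math

def slope_aspect(window, cell_size):
    """Slope and aspect (degrees) for the centre cell of a 3 x 3 window.

    window[0] is the northern row and window[r][0] the western column.
    """
    dz_dx = (window[1][2] - window[1][0]) / (2.0 * cell_size)  # rise towards east
    dz_dy = (window[0][1] - window[2][1]) / (2.0 * cell_size)  # rise towards north
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None            # flat cell: aspect is undefined
    # Aspect is the downslope direction, measured clockwise from north.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

# Elevation rises towards the east, so the slope faces west (270 degrees).
dem = [[10.0, 20.0, 30.0],
       [10.0, 20.0, 30.0],
       [10.0, 20.0, 30.0]]
slope, aspect = slope_aspect(dem, cell_size=10.0)
```

Here the elevation changes by 20 units over 20 units of ground distance, which gives a slope of 45 degrees.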

These functions are raster processes, but most can be mimicked in a vector environment by Triangulated Irregular Networks (TINs).  In addition, topographic functions can derive vector isolines (contours).

Neighborhood functions operate on the neighboring features of a given feature or set of features.
Search functions allow the retrieval of features that fall within a given search window (rectangle, circle, or polygon).

Line-in-polygon and point-in-polygon functions determine whether a given linear or point feature is located within a given polygon, or they report the polygon(s) in which a given point or line is contained.

Topographic functions compute the slope or aspect from a given digital representation of the terrain (digital terrain model or DTM).
Interpolation functions predict unknown values using the known values at neighboring locations.
Contour generation functions calculate contours as a set of lines that connect points with the same attribute value. Examples are points with the same elevation (contours), depth (bathymetric contours), barometric pressure (isobars), or temperature (isothermal lines).

Record overlay

Overlay is one of the most common and powerful GIS functions.  It investigates the spatial association of features by “vertically stacking” feature layers to investigate geographic patterns and determine locations that meet specific criteria.
Overlay operations are among the most frequently used functions in a GIS application. They allow two different layers to be combined by applying the set-theoretic operations of intersection, union, difference, and complement.
It is the best-known GIS function. Vector and raster models both perform overlay, but their overlay functions differ considerably.

Vector (Logical) Overlay
Vector overlay predominantly overlays polygons in one layer over polygons in another layer, but it can also be used to overlay point or line features over polygon layers.  Sometimes referred to as topological or logical overlay, it is conceptually and mathematically more demanding than raster overlay.  There are three types of vector overlay operations:
  1. Polygon on polygon is where one polygon layer is superimposed over another polygon layer to create a new output polygon layer.  The resultant polygons may contain some or all of the attributes from the polygons in which they were created.  Several types of polygon on polygon overlay exist, including intersection (A and B), union (A or B), and clip (A not B).  These Boolean operators work both on the attribute table and the geography. 
    1. Intersection computes the geometric intersection of all of the polygons in the input layers. Only those features that share a common geography are preserved in the output layer.  Any polygon or portion of a polygon that falls outside of the common area is discarded from the output layer.  The new polygon layer can possess the attribute data of the features in the input layers.
    2. Union combines the features of input polygon layers.  All polygons from the input layers are included in the output polygon layer.  It can also possess the combined attribute data of the input polygon layers.
    3. Clip removes those features (or portions of features) from an input polygon layer that overlay with features from a clip polygon layer.  The clip layer acts as a cookie cutter to remove features (and portions of features) that fall inside the clip layer. 
  2. Point in polygon is where a layer of point features is superimposed over a layer of polygon features.  The two layers produce a point layer that includes attributes from the surrounding input layer polygons. Alternatively, the number of point features falling within each polygon is counted and stored as a new attribute in the polygon layer.  Other point attributes can be aggregated (summed, averaged, etc.) and included as attributes in the polygon’s data file.  The transferring of attributes based on their geographic position is called a spatial join.
  3. Line on polygon is similar to point in polygon, but lines are superimposed on polygons.  This type of spatial join either appends polygon attributes to line features falling within them or counts and aggregates line attribute data to the polygon layer’s data file.
Raster (Arithmetic) Overlay
Raster overlay superimposes at least two input raster layers to produce an output layer.  Each cell in the output layer is calculated from the corresponding pixels in the input layers.  To do this, the layers must line up perfectly; they must have the same pixel resolution and spatial extent.  If they do not align, they can be adjusted to fit by the pre-processing functions. After preprocessing, raster overlay is flexible, efficient, quick, and offers more overlay possibilities than vector overlay. 

Raster overlay, frequently called map algebra, is based on calculations which include arithmetic expressions and set and Boolean algebraic operators to process the input layers to create an output layer.  The most common operators are addition, subtraction, multiplication, and division, but other popular operators include maximum, minimum, average, AND, OR, and NOT.  Raster overlay simply uses arithmetic operators to combine the corresponding cells of two or more input layers, uses Boolean algebra like AND or OR to find the pixels that fit a particular query statement, or executes statistical tests like correlation and regression on the input layers.
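
The cell-by-cell nature of map algebra can be sketched in plain Python, with each raster held as a list of rows. The layer names and values are hypothetical, and the layers are assumed to be already aligned (same resolution and extent).

```python
def local_op(layer_a, layer_b, op):
    """Apply op to each pair of corresponding cells of two aligned rasters."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b)
    )
    if not same_shape:
        raise ValueError("layers must have the same number of rows and columns")
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]

slope_ok = [[1, 0], [1, 1]]   # 1 = gentle slope (hypothetical Boolean layer)
soil_ok = [[1, 1], [0, 1]]    # 1 = suitable soil (hypothetical Boolean layer)

suitable = local_op(slope_ok, soil_ok, lambda a, b: int(a and b))  # Boolean AND
score = local_op(slope_ok, soil_ok, lambda a, b: a + b)            # addition
```

The shape check stands in for the pre-processing step: if the layers do not line up, the overlay is refused rather than computed on mismatched cells.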

Monday, November 9, 2015

IMPORTANT OBJECTIVE QUESTIONS

Geographic coordinate systems contain DATUMS that are based on SPHEROIDS
An AFFINE transformation is commonly used in GIS to transform maps between coordinate systems
The process of converting a newly digitised map into projected coordinates is called MAP TO MAP TRANSFORMATION
Image to map transformation is also called GEOREFERENCING
GIS is differentiated from other information systems due to its SPATIAL ANALYSIS FUNCTIONS
The two fundamental functions of a GIS are: (i) GENERATION OF MAPS and (ii) GENERATION OF TABULAR REPORTS
Spatial analysis builds operational procedures on SPATIAL RELATIONSHIPS BETWEEN MAP FEATURES
If no spatial information is required to ask a question, the query is called an ATTRIBUTE QUERY
The design of analysis depends upon the PURPOSE OF STUDY
The use of GIS to inquire geographic features and retrieve associated attribute information is called IDENTIFICATION
The tools that characterise spatial variability are: SPATIAL AUTOCORRELATION FUNCTION & VARIOGRAM
Spatial analysis can be INDUCTIVE, DEDUCTIVE or NORMATIVE
The ability to CALCULATE & MANIPULATE DISTANCES forms the basis of spatial analysis
Point patterns can be identified as CLUSTERED, DISPERSED or RANDOM
The six categories of spatial analysis are: QUERIES & REASONING, MEASUREMENTS, TRANSFORMATIONS, DESCRIPTIVE SUMMARIES, OPTIMIZATION and HYPOTHESIS TESTING
The four interpolation methods are: INVERSE DISTANCE WEIGHTING (IDW), TRIANGULATED IRREGULAR NETWORKS (TIN), REGULARISED SPLINES WITH TENSION (RST) and KRIGING
Operators associated with connectivity are: NETWORK ANALYSIS, DIFFUSION MODELS, CELLULAR AUTOMATA and AGENT BASED MODELS
The two connectivity functions widely used in GIS are: (i) CONTIGUITY and (ii) SPREADING
The four parameters used to measure proximity are: (i) TARGET LOCATIONS, (ii) UNIT OF MEASUREMENT, (iii) FUNCTION TO CALCULATE PROXIMITY & (iv) AREA TO BE ANALYSED
A common type of proximity analysis is ESTABLISHMENT OF A BUFFER ZONE
The three principal types of GIS analysis performed by networking are: (i) PREDICTION OF LOADING, (ii) ROUTE OPTIMISATION & (iii) RESOURCE ALLOCATION
INTERVISIBILITY FUNCTIONS rely on digital elevation data to define surrounding topography
A GIS application can be classified into: (i) FOUR DIMENSIONAL GIS, (ii) MULTIMEDIA GIS, (iii) WEB GIS & (iv) VIRTUAL REALITY GIS
The four categories under which GIS functions fall are: (i) RETRIEVAL/CLASSIFICATION/MEASUREMENT FUNCTIONS, (ii) OVERLAY FUNCTIONS, (iii) NEIGHBOURHOOD FUNCTIONS & (iv) CONNECTIVITY FUNCTIONS
Vector methods of spatial analysis are good for SPARSE DATA SETS
Raster methods of spatial analysis are good for GRID CALCULATIONS
Examples of neighbourhood functions are: AVERAGE, DIVERSITY, MINIMUM/MAXIMUM & TOTAL
Five steps involved in cartographic modeling are: (i) STATEMENT OF PROBLEM OR OBJECTIVES, (ii) STATEMENT OF CONDITIONS OR ASSUMPTIONS (iii) METHODOLOGY (iv) IMPLEMENTATION & (v) EVALUATION
Raster data analysis is based on CELLS & RASTERS
Reclassification is also called RECODING or TRANSFORMING
The two types of global functions are: (i) EUCLIDEAN DISTANCE & (ii) WEIGHTED DISTANCE
The four groups of mathematical functions provided by raster calculator are: (i) LOGARITHMIC, (ii) ARITHMETIC, (iii) TRIGONOMETRIC and (iv) EXPONENTIAL
The four major reasons requiring reclassification are: (i) TO REPLACE VALUES, (ii) TO GROUP VALUES, (iii) TO RECLASSIFY VALUES & (iv) TO SET SPECIFIC CELLS TO A VALUE
The two sub-systems in a knowledge based system are: (i) KNOWLEDGE BASE & (ii) AN INFERENCE ENGINE
Knowledge based systems were first developed by ARTIFICIAL INTELLIGENCE RESEARCHERS
A GIS model may be descriptive or PRESCRIPTIVE; DETERMINISTIC or stochastic; STATIC or dynamic; deductive or INDUCTIVE
The five types of digital elevation data are: (i) DIGITAL SURFACE MODEL (DSM), (ii) DIGITAL TERRAIN MODEL (DTM), (iii) BATHYMETRIC SURFACE, (iv) MIXED SURFACE and (v) POINT CLOUD ELEVATION FILE
The process of detecting and removing errors by editing is called CLEANING
Errors are classified into (i) ENTITY ERRORS, (ii) ATTRIBUTE ERRORS & (iii) ENTITY-ATTRIBUTE AGREEMENT ERRORS
Undershoots and overshoots are types of DANGLING NODES
Remotely sensed data are used to measure several environmental parameters like: (i) SURFACE & CLOUD TOP REFLECTANCES,(ii) ALBEDO,(iii) AREA AND POTENTIAL YIELD OF CROP TYPES & (iv) HEIGHT AND DENSITY OF FOREST STANDS
Satellite scanners operate in either a SWEEP mode or a PUSHBROOM mode
An overlay operation can be classified as (i) POINT-IN-POLYGON, (ii) LINE-IN-POLYGON, or (iii) POLYGON-IN-POLYGON
An overlay operation is called INTERSECT if it uses the AND connector
An overlay operation is called UNION if it uses the OR connector
An overlay operation is called IDENTITY if it uses a combination of the AND and OR connectors
An overlay operation is called DIFFERENCE or symmetrical difference if it uses the XOR connector
The most common type of output of a GIS is a MAP
The five types of vector data analysis are: (i) BUFFERING, (ii) OVERLAY, (iii) MEASURING DISTANCES, (iv) SPATIAL STATISTICS FOR PATTERN ANALYSIS & (v) TOOLS FOR MAP MANIPULATION

Saturday, November 7, 2015

COST-BENEFIT ANALYSIS OF AN INVESTMENT IN GIS

GIS is sometimes criticised as an expensive solution to a simple problem. This is true when using high-end GIS to solve problems that can be tackled by simple inexpensive desktop mapping packages. It is also difficult to quantify the benefits of using GIS. The organisation paying for the software may find it difficult to realise the financial benefits immediately, whereas outsiders gaining access to the products will be pleased with the higher accuracy. However, the digital output products will have long-term benefits.

Major costs are incurred in the early stages of a project in GIS and the benefits may materialise later into the project cycle. This is due to the fact that an initially large investment results in lower maintenance and updating costs resulting in sustainable benefits in the long run. The long term benefits are significantly higher as the process results in a multi-purpose digital database.

The introduction of GIS requires a change of routine and brings expenses not only for software and hardware but also for data purchase, training, planning and organizational restructuring.

Some of the costs not directly related to GIS but required for data input into GIS are:
-costs involved for data collection
-costs involved in data conversion

The various cost components required for implementation of a project in GIS are listed below:
-Evaluation of available data and development of a data conversion strategy
-Requirement of computers with fast processors, plenty of storage space and other peripherals required by GIS such as digitizers, large-format colour printers, etc.
-Evaluation and selection of GIS mapping software
-Prototype development
-Hardware/Software system configuration/customisation
-Establishing human resource planning to ensure a smooth transition to the new system
-Training is the fourth-largest expense of any GIS activity. It amounts to almost 5 to 10% of the total project cost.
-Costs involved in database design, data modeling and procedural manual development
-Additional costs are incurred if the old and new systems have to be operated in parallel during the transition period.
-Costs are incurred for data acquisition and data purchase or data capture and data conversion
-Since data conversion is labour intensive and error prone, a rigorous procedure for checking the resulting data for positional accuracy and logical consistency should be part of the process
-System maintenance, involving software and hardware upgrades along with the training such upgrades require, costs about 10% of the initial investment per year
-A periodic review of the GIS group’s work should be part of the regular activities
-Development of data distribution strategies

The various benefits of GIS are listed below:
Benefits are of two types:-
Efficiency benefits and
Effectiveness benefits

Efficiency benefits include:
-Cost savings
-Productivity gains

These benefits may be:
-measurable benefits
-indirect effects

Another benefit of GIS is the greater credibility and authority of map products.
Digital techniques help produce maps requested by customers quickly and cheaply.
Digital approach helps produce maps with higher accuracy.
A digital database ensures a high degree of consistency
GIS supports a thriving secondary market in associated mapping services.

Effectiveness benefits include:
-Improved analysis
-Improved policy making
-Improved data sharing and
-Improved outreach

Monday, November 2, 2015

Organizing data for analysis

GIS software organizes spatial data in a thematic approach that categorizes data in vertical layers. The definition of layers is fully dependent on the requirements.
Typical layers used in natural resource management agencies include forest cover, soil classification, elevation, road network (access), ecological areas, hydrology, etc.

Spatial data layers are commonly input one at a time, e.g. forest cover. Accordingly, attribute data is entered one layer at a time. Depending on the attribute data model used by the data storage subsystem, data must be organized in a format that will facilitate the manipulation and analysis tasks that will be required. Most often, the spatial and attribute data are entered at different times and linked together later.


Vector data analysis

Vector data analysis uses the geometric objects of points, lines and polygons. The accuracy of analysis depends on the accuracy of these objects in terms of location and shape.
Vector data analysis includes:
  1. Buffering
  2. Overlay
  3. Measuring distances
  4. Spatial statistics for pattern analysis and
  5. Tools for map manipulation

Types of output data in GIS

Output is the final product of any analysis. Analysis of geographic data using GIS software provides a tool to effectively communicate the results to decision makers.

The most common form of output from a GIS is a map. Other, non-cartographic forms of GIS output are tables and charts containing spatial and non-spatial attribute information.

The other types of outputs are:

  • Cartograms
  • Directions
  • Customer lists
  • 3D diagrams and movies
  • Contour maps
  • Non-contiguous cartograms
  • Vector map
  • Non-contiguous map
  • Pin map
  • Hatch map
  • Density map
  • Gradient map
  • Symbol map
  • Territory map
  • Pie map
  • Bar maps
  • Line graph maps
  • Prism map
  • Flow map
  • Classified choropleth map
  • Graduated circles
GIS provides a graphical way to view attribute values related to a spatial object.

Data retrieval query (SQL)

Data retrieval involves the capability to easily select data for graphic or attribute editing, updating, querying, analysis and/or display.
Retrieval functions allow the selective search and manipulation of data without the need to create new entities.
Querying is the capability to retrieve data, usually a data subset, based on some user defined formula. These data subsets are often referred to as logical views. Often the querying is closely linked to the data manipulation and analysis subsystem. Several GIS software vendors have attempted to standardize their querying capability by use of the Structured Query Language (SQL).

Structured Query Language (SQL) is a data query and manipulation language designed for relational databases. It was developed by IBM in the 1970s. A new development has been to extend SQL to Object-oriented database management systems and to spatial data.

In order to use SQL to access a database, the syntax of the query language must be used. The basic syntax of SQL is:

select <Attribute list> from <Relation> where <Condition>
Here, 'select','from' and 'where' are the keywords and Attribute list is the field from the database, Relation refers to a table and Condition refers to criteria for data query.
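
As a small illustration of this syntax, the sketch below runs a select ... from ... where query against an in-memory SQLite database using Python's standard sqlite3 module. The table name parcels, its fields and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")          # throwaway in-memory database
conn.execute(
    "CREATE TABLE parcels (parcel_id INTEGER, land_use TEXT, area_ha REAL)"
)
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [(1, "forest", 12.5), (2, "urban", 3.0), (3, "forest", 7.25)],
)

# Attribute list: parcel_id, area_ha; Relation: parcels; Condition: the where clause
rows = conn.execute(
    "SELECT parcel_id, area_ha FROM parcels "
    "WHERE land_use = 'forest' AND area_ha > 10"
).fetchall()
conn.close()
```

The result is a data subset (a logical view) of the table; no new entities are created in the database.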



Overlay operations

Overlay is defined as a GIS operation that combines the geometries and attributes of input layers to create the output.
The geometry of the output represents the geometric intersection of features from the input layers.
Each feature on the output contains a combination of attributes from the input layers and this combination differs from its neighbours.
Features to be overlaid must be spatially registered and be based on the same coordinate system. In addition, the layers must be in the same zone and should have the same datum.

Feature type and Overlay
Considering the feature type, there are two groups of overlay operations.

  1. Using two polygon layers as inputs
  2. Using one polygon layer and another layer that contains either lines or points
Hence, overlay operations can be classified as either:
  1. point-in-polygon
  2. line-in-polygon or
  3. polygon-in-polygon
The layer that may be either a point, line or polygon layer is called the input layer and the layer which is a polygon layer is called the overlay layer.

In a point-in-polygon overlay operation, the same point features in the input layer are included in the output and each point is assigned the attributes of the polygon within which it falls.

In a line-in-polygon overlay operation, output contains the same line features as the input but each line feature is dissected by polygon boundaries on the overlay layer. This implies that the output layer has more line segments than the input layer.
Each line segment on the output layer combines attributes from the input layer and the underlying polygon.

The most common overlay operation is the polygon-on-polygon overlay operation. This operation involves two polygon layers and the output combines the polygon boundaries from the input and overlay layers to create a new set of polygons.
Each polygon carries attributes from both layers and these attributes differ from those of adjacent polygons.

Overlay methods are based on boolean connectors (AND, OR and XOR)
An overlay operation is called:
  • intersect if it uses the AND connector
  • union if it uses the OR connector
  • difference or symmetrical difference if it uses the XOR connector and
  • identity or minus if it uses a combination of AND and OR connectors
Union preserves ALL features from the inputs. The area extent of the output combines the area extents of both input layers. The Union overlay method requires that both input layers be polygon layers.

Intersect preserves only those features that fall within the extent common to both input layers. Input layers may contain different feature types. It is a preferred method of overlay as any feature on its output has attribute data from both the input layers.

Symmetrical difference preserves features that fall within the area common to only one of the input layers. It is opposite to Intersect (the AND connector) in terms of output area extent.

Identity preserves only features that fall within the area extent of the layer defined as the input layer.

The overlay operations combine features and attributes from input layers and the output is useful for query and modeling purposes.
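
The area extents produced by the four methods can be sketched with Python sets, modelling each layer as the set of grid cells it covers. This is a simplification for illustration (real overlay also splits geometries and joins attributes); the cell coordinates are hypothetical.

```python
# Each layer is modelled as the set of (row, col) cells it covers.
input_layer = {(0, 0), (0, 1), (1, 0), (1, 1)}
overlay_layer = {(1, 1), (1, 2), (2, 1), (2, 2)}

intersect = input_layer & overlay_layer                  # AND: common extent only
union = input_layer | overlay_layer                      # OR: combined extent
sym_diff = input_layer ^ overlay_layer                   # XOR: extent in only one layer
identity = (input_layer & overlay_layer) | input_layer   # AND plus OR: input extent
```

Union and symmetrical difference differ exactly by the intersect area, and identity always has the same extent as the input layer.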

Analysis of non spatial attribute data

Non-spatial data or attribute data is that data that is independent of all location considerations.

The analysis module usually contains four important functions:
1. Selection is a simple operation, but it is important because all subsequent work is based on the results of the selection process.
2. Manipulation has to do with aggregation, buffering, overlaying and interpolation.
3. Exploration is the first step in discovering any kind of pattern or cluster in a data set. Explorative spatial data analysis (ESDA) uses the data in an inductive way to get new insight about spatial patterns and relations. Spatial statistics such as Moran's I and the G statistics are important tools in explorative spatial data analysis.
4. Confirmation is a set of tools for estimation of process models, simulation and forecasting.

Analysis in GIS is different from other statistical analysis since the attribute data establishes links to maps for visual analysis. Before starting any analysis, the problem should be assessed and an objective should be established. The following steps outline the basic procedure for geographical analysis:

  1. Establish the objectives and criteria for the analysis. Define the problem and then identify a sequence of operations to produce meaningful results. 
  2. Prepare the data for spatial operations. Prepare all map coverages for the proposed data analysis. Add one or more attributes to coverages in the database if necessary. 
  3. Perform the spatial operations and combine the coverages, e.g. creating buffering zones around features, manipulating spatial features and performing polygon overlay. 
  4. Prepare the derived data for tabular analysis and ensure the feature attribute table contains all the items needed to hold the new values to be created. 
  5. Perform the tabular analysis. Calculate and query the relational database using the model defined in the first step.
  6. Evaluate and interpret the results. Examine the results and determine whether the answers are valid. Simple map displays and reports can help in this evaluation
  7. Refine the analysis if needed and repeat the analysis.


Tuesday, October 27, 2015

Transformation in GIS

A geographical transformation is a mathematical operation that converts the coordinates of a point in one geographic coordinate system to the coordinates of the same point in another geographic coordinate system.
Since geographic coordinate systems contain datums that are based on spheroids, a geographic transformation also changes the underlying spheroid. There are several methods, which have different levels of accuracy and ranges, for transforming between datums.
A geographic transformation always converts geographic (latitude–longitude) coordinates. Some methods convert the geographic coordinates to geocentric (X,Y,Z) coordinates, transform the X,Y,Z coordinates, and convert the new values back to geographic coordinates.

Affine transformation is a geographic transformation that scales, rotates, skews, and/or translates images or coordinates between any two Euclidean spaces. It is commonly used in GIS to transform maps between coordinate systems.

In an affine transformation, parallel lines remain parallel, the mid-point of a line segment remains a mid-point and all points on a straight line remain on a straight line.
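
A minimal sketch of the usual six-parameter form of an affine transformation, x' = a*x + b*y + c and y' = d*x + e*y + f, which also checks the mid-point property stated above. The parameter values and points are hypothetical.

```python
def affine(point, a, b, c, d, e, f):
    """Apply x' = a*x + b*y + c and y' = d*x + e*y + f to a point."""
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

params = (2.0, 1.0, 10.0, 0.5, 3.0, -5.0)   # hypothetical a, b, c, d, e, f

p, q = (0.0, 0.0), (4.0, 2.0)
mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

p_new = affine(p, *params)
q_new = affine(q, *params)
mid_new = affine(mid, *params)

# The transformed mid-point is still the mid-point of the transformed segment.
mid_of_new = ((p_new[0] + q_new[0]) / 2, (p_new[1] + q_new[1]) / 2)
```

In practice the six parameters are estimated from control points rather than chosen by hand.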

Geometric transformation is the process of using a set of control points and transformation equations to register a digitised map, satellite image, or an air photo to a projected coordinate system.

Geometric transformation converts a newly digitised map into projected coordinates by a process called map-to-map transformation. A remotely sensed image is converted to projected coordinates using image-to-map transformation. This is also called georeferencing.

Different methods have been proposed for transformation from one coordinate system to another. Each method is differentiated based on the geometric property it preserves and the changes it allows. Transformation results in either:
  • Changes in position and direction
  • Uniform change of scale or
  • Changes in size and shape
Below are listed the various transformations and their effect on a rectangular object:
  1. Equiarea transformation permits rotation of rectangle and preserves its shape and size.
  2. Similarity transformation permits rotation of rectangle and preserves its shape but not the size.
  3. Affine transformation allows angular distortion but preserves parallelism of lines
  4. Projective transformation allows both angular and length distortions and thus allows the rectangle to be transformed into an irregular quadrilateral.
Generally, Affine transformations are used for map-to-map or image-to-map transformations and projective transformation is used for aerial photographs with relief displacement.

Spatial Analysis in GIS

GIS is differentiated from other information systems due to its spatial analysis functions.
Spatial analysis functions are used to answer questions about the real world using  GIS databases as a model of the real world.
Spatial analysis techniques are used to create an image of reality that can be easily understood.
Basic spatial analysis can be performed at various levels:

  1. Sorting data in attribute tables for presentation
  2. Performing arithmetic, boolean and statistical operation on attribute tables
  3. Compiling new data based on original and derived attributes or based on geographic relationships.
Within each level, operations may be logical, arithmetic, geometric, statistical or a combination of any of these four types.

The two fundamental functions of GIS are:
  1. Generation of maps and
  2. Generation of tabular reports 
Spatial analysis requires logical connections between attribute data and map features. Spatial analysis builds operational procedures on spatial relationships between map features.

Attribute query involves selecting information by the use of logical questions. When no spatial information is required to ask a question, the query is considered an attribute query. 

A spatial query involves selecting features based on spatial relationships. The answer to such queries can be obtained from a hard-copy map or by using a GIS.

Basic GIS analysis involves attribute queries and spatial queries. Complicated analyses require a series of GIS operations involving multiple attribute and spatial queries, alteration of original data and generation of new data sets.

An effective spatial analysis uses the best available methods appropriately for different types of attribute queries, spatial queries and data alteration. 

The design of the analysis depends on the purpose of the study.

The use of GIS to inquire about geographic features and retrieve associated attribute information is called identification. This process generates a new set of maps by query and analysis. Spatial analysis helps make the new information clearer. GIS operational procedures and analytical tasks that are suited for spatial analysis are discussed below:
  1. Single layer operations are procedures that correspond to queries and alterations of data that operate on a single data layer. For example, creating a buffer zone (silence zone) around all schools in a city is a single layer operation
  2. Multi layer operations are used for manipulation of spatial data on multiple data layers. For example, the overlay of two input data layers produces a map of combined polygons.
  3. Topological overlays: These are multi layer operations that allow combining features from different layers to form a new map and give new information and features that were not present in the individual maps.
  4. Point pattern analysis deals with examination and evaluation of spatial patterns and processes of point features.
  5. Network analysis is designed specifically for line features organized in connected networks and typically applies to transportation problems. For example: school bus routing, walking distance, bus stop optimization, etc.
  6. Surface analysis deals with the spatial distribution of surface information in a three dimensional structure.
  7. Grid analysis involves processing of spatial data in a regularly spaced form. 
  8. Fuzzy spatial analysis is based on fuzzy set theory. Fuzzy set theory is a generalization of boolean algebra where zones of gradual transition are used to divide classes instead of crisp boundaries. Fuzzy algebra offers various other methods to combine different sets of data for landslide zonation map preparation. Fuzzy logic can also be used to handle mapping errors or uncertainty.
Geostatistical tools for spatial analysis
Geostatistics studies the spatial variability of regionalized values. Tools to characterise spatial variability are:
  1. Spatial auto-correlation function and
  2. Variogram
Spatial auto-correlation examines the correlation of a random process with itself in space. Examples of such phenomena are: 
-Total amount of rainfall
-Toxic element concentration
-Elevation at triangulated points, etc.
The spatial auto-correlation function depicted as a graph is called a spatial auto-correlogram and this gives an insight into the spatial behaviour of the phenomena under study.

A variogram is calculated from the variance of pairs of points at different separations.
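
One common way to estimate a variogram is the empirical semivariance: half the mean squared difference between values separated by a given lag. The sketch below assumes points on a regularly spaced transect; the function name semivariance and the rainfall values are hypothetical.

```python
def semivariance(values, lag):
    """Half the mean squared difference of all pairs separated by lag steps."""
    pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))

rainfall = [10.0, 12.0, 15.0, 19.0, 24.0]   # hypothetical values along a transect
gamma = {lag: semivariance(rainfall, lag) for lag in (1, 2, 3)}
```

Plotting gamma against the lag gives the variogram; values that grow with the lag indicate that nearby points are more alike than distant ones.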

Spatial analysis is a vital part of GIS and can be used for many applications like:
  1. Site suitability
  2. Natural resource management
  3. Environmental Disaster Management
Spatial Analysis is the heart or core of GIS because it includes transformations, manipulations and methods that can be applied to geographic data to support decisions, reveal patterns and anomalies not immediately obvious and add value.

Spatial analysis is a set of methods whose results change when the locations of the objects being analysed or the frame used to analyse them change.

Spatial analysis can be:
  1. Inductive: Examining empirical evidence and searching for patterns that might support new theories or general principles
  2. Deductive: Focussing on testing of known theories against data
  3. Normative: Using spatial analysis to develop new or better designs
Analysis can also be carried out on the attribute tables of a GIS by plotting one variable against another as a scatterplot and examining the dependence of one variable on one or more independent variables.

Regression analysis can be used to find the simplest relationship and multiple regression can be used to understand the effects of multiple independent variables.

Changing relationship between variables with space is called spatial heterogeneity.

One of the most powerful features of a GIS is the ability to join tables based on common geographic location.

The point-in-polygon operation is used to determine if a point lies inside or outside a polygon.
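
One common way to implement this test is ray casting: count how many polygon edges a horizontal ray from the point crosses, and treat an odd count as "inside". The sketch below is a minimal version under that assumption; the function name and the sample square are hypothetical, and points lying exactly on the boundary are not handled specially.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: cast a horizontal ray to the right of (x, y)
    and toggle 'inside' at every polygon edge it crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]   # next vertex, wrapping around
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical square polygon
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
inside_result = point_in_polygon(2, 2, square)
outside_result = point_in_polygon(5, 2, square)
```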

The polygon overlay is similar to the point-in-polygon operation.

Overlay in raster is very simple. The attributes of each cell are combined according to a set of rules. 
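
A minimal sketch of cell-by-cell raster overlay, assuming two grids of identical size stored as lists of rows. The layer names and cell values are made up for illustration; the rule is passed in as a function so the same code covers arithmetic and logical combinations.

```python
def overlay(raster_a, raster_b, rule):
    """Combine two equally sized grids cell by cell using the given rule."""
    return [[rule(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(raster_a, raster_b)]

# Hypothetical layers: 1 = residential land use, 1 = inside flood zone
land_use = [[1, 0],
            [1, 1]]
flood_zone = [[1, 1],
              [0, 1]]

# Logical rule: cells that are residential AND inside the flood zone
at_risk = overlay(land_use, flood_zone, lambda a, b: int(a == 1 and b == 1))

# Arithmetic rule: sum of the two layers
total = overlay(land_use, flood_zone, lambda a, b: a + b)
```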

The ability to calculate and manipulate distances forms the basis of spatial analysis.

Distance along a route (represented by a poly-line) is calculated by adding the lengths of each segment of the poly-line.

Since poly-lines short-cut corners, the length of a poly-line is shorter than the length of the object it represents, leading to a slight discrepancy.
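
The two points above can be sketched in Python, assuming planar (projected) coordinates. The route, the circle radius and the helper names are illustrative assumptions. A circle is used as the "true" object because its length is known exactly, so the short-cutting effect is easy to see.

```python
import math

def polyline_length(vertices):
    """Sum of the straight-line lengths of consecutive segments."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))

def circle_polyline(radius, n_segments):
    """Vertices of a closed poly-line inscribed in a circle."""
    return [(radius * math.cos(2 * math.pi * k / n_segments),
             radius * math.sin(2 * math.pi * k / n_segments))
            for k in range(n_segments + 1)]

# A hypothetical two-segment route with segment lengths 5 and 6
route = [(0, 0), (3, 4), (3, 10)]
route_length = polyline_length(route)

# The poly-line is always shorter than the curve it approximates,
# and the gap shrinks as more vertices are used.
true_length = 2 * math.pi * 100
coarse = polyline_length(circle_polyline(100, 8))
fine = polyline_length(circle_polyline(100, 64))
```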

Buffering builds new objects by identifying all areas that are within certain specified distance of the original object.
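
GIS software normally constructs the buffer as a new polygon. The sketch below takes a simpler route and only tests whether a point falls inside the buffer of a poly-line, which is enough to show the distance rule involved. The function names and coordinates are hypothetical, and planar coordinates are assumed.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:                       # degenerate segment
        return math.dist(p, a)
    # Position of the projection of p on the segment, clamped to [0, 1]
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))

def in_buffer(p, polyline, distance):
    """True if p lies within 'distance' of any segment of the poly-line."""
    return any(distance_to_segment(p, a, b) <= distance
               for a, b in zip(polyline, polyline[1:]))

# Hypothetical road and a 5-unit buffer
road = [(0, 0), (10, 0)]
near = in_buffer((5, 3), road, 5)
far = in_buffer((5, 6), road, 5)
```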

Point patterns can be identified as clustered, dispersed or random.
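
One way to make this classification concrete is the nearest neighbour index: the observed mean distance to the nearest neighbour divided by the mean distance expected for a random pattern of the same density. Values well below 1 suggest clustering, values near 1 a random pattern and values above 1 dispersion. The sketch below assumes this index, ignores edge effects, and uses made-up points in a 10 x 10 study area.

```python
import math

def nearest_neighbour_index(points, area):
    """Observed mean nearest-neighbour distance divided by the mean
    distance expected under a random pattern of the same density."""
    n = len(points)
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

# Hypothetical patterns inside a study area of 100 square units
clustered = [(5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1)]
regular = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]

r_clustered = nearest_neighbour_index(clustered, 100.0)
r_regular = nearest_neighbour_index(regular, 100.0)
```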

There are SIX CATEGORIES of spatial analysis:
  1. Queries and reasoning
  2. Measurements
  3. Transformations
  4. Descriptive summaries
  5. Optimization and
  6. Hypothesis testing
Queries and reasoning are the most basic analysis operations where GIS is used to answer simple questions. No changes occur in the database and no new data are produced.

Measurements involve measuring simple properties of objects, such as length, area or shape, and relationships between pairs of objects, such as distance or direction.

Transformations are simple methods of spatial analysis that change data-sets by combining them or comparing them to obtain new data-sets and finally new ideas. Transformations use simple arithmetic, geometric or logical rules. They include operations that convert raster data to vector data and vice versa. They may create fields from collections of objects or detect collections of objects in fields.

Descriptive summaries attempt to capture the essence of a data-set in one or two numbers. 

Optimization techniques are normative in nature and are designed to select ideal locations for objects given specific constraints. They are widely used in market research, package delivery industry, etc.

Hypothesis testing focusses on reasoning from the results of a limited sample to make generalizations about an entire population. Hypothesis testing is the basis for inferential statistics and forms the core of statistical analysis.

Overlay analysis can, for example, combine land use and flood zone layers to determine the residential parcels inside a flood zone area. Insurance companies can use this data to target their insurance sales.

Farmers can use interpolation to examine soil samples from a farm area.

Shop owners can establish their stores based on location (distance and density) analysis

Data types in spatial analysis
The three types of data used to characterize problems of spatial analysis are:
  1. Events or point patterns: Examples of this type are: crime spots, disease occurrences, localization of vegetal species, etc.
  2. Continuous surfaces: Examples of this type are: geological maps, topographical maps, ecological maps, etc.
  3. Areas with counts: Examples of this type are: population surveys, health statistics, etc that are demarcated by closed polygons (postal zones, municipalities, etc)

Problems of spatial analysis deal with environmental and socioeconomic data. 

Basic concepts of spatial analysis
  1. Spatial dependency is an important concept to understand and analyse spatial phenomena. "Everything is related to everything else, but near things are more related than distant things" -Waldo Tobler (First law of Geography)
  2. The computational expression of the concept of spatial dependence is spatial autocorrelation.
  3. Statistical inference for spatial data: An important consequence of spatial dependence is that statistical inferences on this type of data will not be as efficient as in the case of independent samples of the same size.
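
A widely used measure of spatial autocorrelation is Moran's I. The sketch below assumes a simple binary weights matrix (1 for neighbouring zones, 0 otherwise); the zone values and the matrix are hypothetical. Positive results indicate that neighbouring zones tend to have similar values, negative results that they tend to differ.

```python
def morans_i(values, weights):
    """Moran's I for a list of zone values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]            # deviations from the mean
    w_sum = sum(sum(row) for row in weights)    # total of all weights
    # Weighted cross-products of deviations between every pair of zones
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den

# Four hypothetical zones in a row; each zone neighbours the adjacent ones
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

smooth = morans_i([1, 2, 3, 4], w)        # similar values next to each other
alternating = morans_i([1, 4, 1, 4], w)   # unlike values next to each other
```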
Spatial analysis is the process of manipulating spatial information to extract new information and meaning from the original data. GIS provides spatial analysis tools for calculating feature statistics and carrying out geoprocessing activities such as data interpolation. Spatial interpolation uses points with known values to estimate values at other, unsampled points.

The two widely used interpolation methods are:
  1. Inverse Distance Weighting (IDW) and
  2. Triangulated Irregular Networks (TIN)
Other interpolation methods are:
  • Regularized Splines with Tension (RST)
  • Kriging
  • Trend Surface Interpolation
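
A minimal sketch of Inverse Distance Weighting, assuming the usual form in which each sample is weighted by 1 / distance^power and a power of 2 is the default. The soil pH samples are invented for illustration, in the spirit of the farm example mentioned earlier.

```python
import math

def idw(samples, target, power=2):
    """Estimate the value at 'target' from (location, value) samples,
    weighting each sample by 1 / distance**power."""
    num = 0.0
    den = 0.0
    for location, value in samples:
        d = math.dist(location, target)
        if d == 0:
            return value            # target coincides with a sample point
        weight = 1.0 / d ** power
        num += weight * value
        den += weight
    return num / den

# Hypothetical soil pH samples: ((x, y), value)
soil_ph = [((0, 0), 6.0), ((10, 0), 7.0), ((0, 10), 8.0)]

at_sample = idw(soil_ph, (0, 0))    # exactly on a sample point
between = idw(soil_ph, (5, 0))      # midway between the first two samples
```

The estimate always lies between the smallest and largest sample values, which is one reason IDW surfaces cannot show peaks or pits that were not sampled.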

Friday, October 23, 2015

Connectivity functions

CONNECTIVITY FUNCTIONS
Connectivity functions examine relationships between objects in terms of adjacency and relative-effective-distance. Operators associated with connectivity include:
-Network analysis
-Diffusion models
-Cellular automata and
-Agent based models

The two connectivity functions widely used in GIS are:
-Contiguity and
-Spreading

Contiguity analysis exploits the topological relationships between objects. It helps determine if two objects are adjacent or if they share a node. Contiguity analysis also helps determine a pattern of spread.
Grouping vector data is used to reclassify a map and dissolve polygon boundaries. Grouping is also done on raster data but is less elegant than grouping operation on vector data.

Contiguity is used to measure shortest and longest straight line distances across an area and to identify areas of terrain with specified size and shape characteristics.

Proximity Functions
Proximity functions measure the simple distance between features. The four parameters used to measure proximity are:
1. target locations.
2. unit of measurement.
3. a function to calculate proximity and
4. the area to be analyzed.

A common type of proximity analysis is the buffer zone.

Network Functions
A network is a set of interconnected linear features that form a pattern or framework. City Streets, Power Transmission Lines, and Airline Service Routes are examples.
There are three principal types of GIS analysis performed on networks:
1. Prediction of loading on the network itself (prediction of flood crests),
2. Route optimization (emergency routing of ambulances), and
3. Resource allocation (zones for servicing rescue areas)

Network analysis entails four components:
1. set of resources (goods to be delivered)
2. one or more locations where the resources are located (several warehouses where the goods are located)
3. an objective to deliver the resources to a set of destinations (customer locational data base) and
4. Set of constraints that places limits on how the objective can be met (is it economically feasible to deliver goods from one point to another)

Spread Functions
Spread functions help determine the "BEST" way to go from point A to point B
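
One common reading of "best" is "least accumulated cost". The sketch below assumes a friction raster in which each cell holds the cost of entering it, allows moves to the four edge-sharing neighbours only, and uses Dijkstra's algorithm to accumulate cost outward from the start cell. The grid values and the function name are hypothetical.

```python
import heapq

def least_cost(cost, start, goal):
    """Least accumulated cost from start to goal on a friction grid.
    Entering a cell costs that cell's value; moves are 4-directional."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        acc, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return acc
        if acc > best[(r, c)]:
            continue                      # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new = acc + cost[nr][nc]
                if new < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new
                    heapq.heappush(queue, (new, (nr, nc)))
    return None                           # goal not reachable

# Hypothetical friction surface: the middle column is expensive to cross
friction = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]
cheapest = least_cost(friction, (0, 0), (0, 2))
```

Here the straight route across the expensive cell costs 10, while the detour around the bottom of the grid costs 6, so the detour is the "best" path under this cost rule.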

Seek or Stream Functions
Seek or Stream Functions refer to a function that is directed outward in a step-by-step manner using a specified decision rule. This function can be used to evaluate erosion potential.

Intervisibility functions
This function is a graphic depiction of the area that can be seen from the specified target areas. Intervisibility functions rely on digital elevation data to define the surrounding topography.
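
A simplified sketch of the idea, reduced to a single elevation profile running away from the observer. A cell is treated as visible when the slope of the line of sight to it is at least as steep as the slope to every cell in between. The profile values, the cell spacing and the observer height are assumptions, and earth curvature is ignored.

```python
def visible_along_profile(elevations, spacing, observer_height=1.7):
    """Visibility of each cell of an elevation profile as seen from
    the first cell, using the running maximum line-of-sight slope."""
    eye = elevations[0] + observer_height
    visible = [True]                       # the observer's own cell
    max_slope = float("-inf")
    for i in range(1, len(elevations)):
        slope = (elevations[i] - eye) / (i * spacing)
        visible.append(slope >= max_slope)  # not hidden by nearer terrain
        max_slope = max(max_slope, slope)
    return visible

# Hypothetical profile from a digital elevation model, 10 m cell spacing
profile = [100, 100, 110, 105, 120]
seen = visible_along_profile(profile, spacing=10)
```

The cell at 105 m is hidden behind the 110 m ridge, while the 120 m cell beyond it is high enough to be seen again. Repeating this along many profiles radiating from the observer gives the visible area.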

Wednesday, October 14, 2015

Classification of GIS

A GIS application can be classified into the following types:

  1. Four dimensional GIS
  2. Multimedia or hypermedia GIS
  3. Web GIS and
  4. Virtual Reality GIS
The above types of GIS are briefly discussed below:


  1. Four dimensional GIS are designed to handle three dimensions of space and one dimension of time. The spatio-temporal representations can handle only two dimensions of space and one dimension of time. 
  2. Multimedia/hypermedia GIS allow the user to access a range of georeferenced multimedia data by selecting resources from a georeferenced image map base. A map serving as the primary index to multimedia data in a multimedia geo-representation is called a hypermap. Multimedia and virtual geo-representations can be stored either in extended relational databases, object databases or in application specific data stores.
  3. Web GIS: Widespread access to the internet coupled with the use of web browsers and the explosion of geographic information has made it possible to develop new forms of multimedia geo-representations on the web. Many geomatic solutions are web-based and are rapidly overtaking desktop GIS, with future trends following the same direction.
  4. Virtual Reality GIS: Virtual Reality GIS have been developed to allow the creation, manipulation and exploration of geo-referenced virtual environments. For example, the Virtual Reality Modeling Language (VRML) can be used to experiment with different scenarios. Virtual Reality GIS can also be web-based. An example application of Virtual Reality GIS is 3D simulation for planning in various scenarios.

Tuesday, October 13, 2015

Analysis functions in GIS

GIS ANALYSIS FUNCTIONS
GIS analysis functions fall into four categories:
  1. Retrieval/Classification/Measurement functions
  2. Overlay functions
  3. Neighbourhood functions and
  4. Connectivity functions
  • Retrieval functions basically involve a selective search
Classification/Reclassification functions involve two operations:
  1. Identifying a set of features as belonging to a group and
  2. Defining patterns
Measurement functions measure distances, lengths, perimeters and areas
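
For polygons stored as vertex lists in planar (projected) coordinates, area can be computed with the shoelace formula and perimeter by summing edge lengths. The sketch below assumes a simple, non-self-intersecting polygon; the parcel coordinates are invented.

```python
import math

def polygon_area(vertices):
    """Area of a simple polygon using the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2

def polygon_perimeter(vertices):
    """Sum of edge lengths, including the closing edge."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n])
               for i in range(n))

# Hypothetical 40 m x 30 m parcel
parcel = [(0, 0), (40, 0), (40, 30), (0, 30)]
area = polygon_area(parcel)
perimeter = polygon_perimeter(parcel)
```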

A selective search is an example of a retrieval function. It involves selection of attributes based on graphic selection tools used to select areas in the map displayed.

Overlay functions could be:
  1. Arithmetic 
    1. Addition
    2. Subtraction
    3. Multiplication
    4. Division
  2. Logical
    1. Used to find where specific conditions occur (and, or, >,< etc)
Vector methods are good for sparse data sets while raster methods are easier for grid calculations.

Neighbourhood functions
The basic functions that fall under this domain are:
  1. Average
  2. Diversity
  3. Minimum/Maximum and
  4. Total
The parameters that need to be defined to operate these functions are:
  1. Target locations
  2. Specification of neighbourhood
  3. Function to be performed on neighbourhood elements
Other points to note:
  • The search operation is one of the most common neighbourhood functions.
  • On a vector model, a neighbourhood function is a specialised search function, while on a raster model polygons are on one layer and points and lines are on a separate layer.
  • The Thiessen polygon operation is also a neighbourhood function.
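
The basic neighbourhood functions can be sketched on a raster with a moving 3 x 3 window. This is an assumption about the neighbourhood specification, and the grid values are hypothetical. Every cell is a target location, and the function to apply is passed in. Windows are clipped at the grid edge, so edge cells have fewer neighbours.

```python
def focal(raster, func):
    """Apply func to the 3 x 3 neighbourhood (clipped at the edges)
    of every cell and return the resulting grid."""
    rows, cols = len(raster), len(raster[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [raster[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out_row.append(func(window))
        out.append(out_row)
    return out

# Hypothetical class grid
grid = [[1, 1, 2],
        [1, 3, 2],
        [4, 4, 2]]

average = focal(grid, lambda w: sum(w) / len(w))
diversity = focal(grid, lambda w: len(set(w)))   # number of distinct values
maximum = focal(grid, max)
total = focal(grid, sum)
```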

Cartographic modeling by GIS analysis - procedure with an example

CARTOGRAPHIC MODELING
Cartographic modeling involves the use of basic GIS functions in a logical sequence to solve complex spatial problems.

It was developed to model:

  1. Land-use planning alternatives and
  2. Applications that require integrated analysis of multiple geographically distributed factors
  • The term was coined by Dana Tomlin in 1983.
  • Cartographic modeling lies completely under the raster domain. 
  • The nature of analysis is purely additive or subtractive and this complements the values assigned to the raster format of the data
  • The digitised data is layered and these layers are combined to construct constraint maps that can be analysed with reference to any specific geographic problem to arrive at the best alternative.
A cartographic model has the ability to form a logical sequence. The process of cartographic modeling is characterised by working backward to ensure that all data that will be needed are identified. This helps to avoid collecting data that will not be needed. The process ensures that any judgements to be made are explicitly identified. Hence, subjective judgements are an integral part of cartographic modeling.

Cartographic modeling is a common way of expressing and organising the methods by which spatial variables and spatial operations are selected and used to develop an analytical solution within a GIS.

Cartographic modeling is based on the concept of data layers, operations and procedures. Cartographic modeling capabilities are found in most GIS software. 

Modeling is a logical or mathematical formulation that attempts to simulate some aspect of the real world.

The five steps involved in cartographic modeling are listed below and elaborated subsequently:
  1. Statement of problem or objectives
  2. Statement of conditions or assumptions
  3. Methodology
  4. Implementation and
  5. Evaluation
Statement of problem involves dividing the problem into sub-problems. The objective provides a direction and a clear end to the activity. It helps identify the possible routes to meeting the objectives.

Statement of conditions or assumptions includes the conditions of the problem. For example: current state, background or case history of the problem. Assumptions in the model define the limitations of the analysis. An assumption of most models is that the processes of the past will continue in the future.

Methodology involves:
  1. Assembly of sub-models into a model, which can be sub-divided into:
    1. Identification of sub-problems (analogous to the concept of divide and conquer)
    2. Development of sub-models that address the sub-problems
    3. Development of a strategy for integrating the sub-models
    4. Developing a flowchart that shows the parts in the context of the whole
  2. Identification of:
    1. Data sets needed
    2. Spatial operations
    3. Non-spatial operations and
    4. Interaction of spatial and non-spatial data
Implementation involves applying the model using the analytic tools available in the GIS. It also involves implementing techniques to circumvent the limitations of the GIS.

Evaluation
This involves testing the effectiveness of the model. If the model does not conform to expectations,  its assumptions and components should be re-examined and adjusted where necessary. The above procedures should be performed in an iterative fashion until the objectives are achieved.

EXAMPLE

Problem: The municipal corporation of a city would like to assess environmental equity in the siting of waste transfer stations.
Restating the problem: Is one particular income class bearing the burden of waste transfer stations?

Conditions and assumptions
  1. Impact of waste transfer sites on surrounding communities is negligible beyond 500m
  2. Within 500m the effect is uniform
  3. Income classes are distributed evenly throughout a census tract
Methodology:
  • Data sets needed
    • income by census tract
    • location of waste transfer sites
  • Spatial operations
    • points in polygon
    • buffer
    • overlay
  • Operations on attributes
    • select
    • reclass
    • calculate area estimates and
    • generate statistics
Implementation:
  • Outline for flow of implementation
    • select only waste transfer sites in the city
    • generate a 500m buffer around these sites
    • select income data from census data and reclass into three income classes: low, medium and high
    • add a field to hold the original area of each census tract prior to polygon intersection
    • intersect the buffers with the census tracts (polygon overlay)
    • recalculate the income classes based on the percentage of each census tract left in the intersected polygons, using the original area field that was carried through the intersection
    • calculate the totals for each of three generated income classes for the entire city
    • generate pie charts for the number in each income class for:
      •  the entire city and 
      • affected areas
    • create a map showing output
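
The recalculation step can be sketched as area-weighted apportionment. It relies on assumption 3 above (income classes are distributed evenly within a census tract), so the share of a tract's households inside the buffer equals the share of its area inside the buffer. The field names and the figures are hypothetical.

```python
def affected_counts(pieces):
    """Apportion each tract's households to the buffered area in
    proportion to the fraction of the tract's area inside the buffer."""
    totals = {"low": 0.0, "medium": 0.0, "high": 0.0}
    for piece in pieces:
        # Fraction of the original tract left after the intersection
        share = piece["intersect_area"] / piece["original_area"]
        for income_class in totals:
            totals[income_class] += share * piece[income_class]
    return totals

# Hypothetical intersected polygons: areas and households per income class
pieces = [
    {"original_area": 200.0, "intersect_area": 50.0,
     "low": 400, "medium": 200, "high": 40},
    {"original_area": 100.0, "intersect_area": 100.0,
     "low": 100, "medium": 50, "high": 10},
]
affected = affected_counts(pieces)
```

Comparing these totals with the city-wide totals per income class is what the two pie charts in the outline are meant to show.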
Evaluation
  • The results obtained should be evaluated against the methodology used to test the validity of the model.
  • The model should accurately represent the process being modeled.
  • A statistical analysis that includes both qualitative and quantitative observations should be performed.
  • Based on the above listed criteria, changes to improve the model should be documented and the modeling should be repeated.