Monday, November 30, 2015

Text patterns and line styles

Text Patterns


Line styles
Line Styles define how line features are rendered on a map. Line styles offer numerous controls for color, gradient or pattern fills, dash patterns and many other properties.

Line styles have the properties listed below:

  1. Name - Each line style is given a name for easy identification and use when creating a map.
  2. Colour - This property controls the colour and appearance of the line feature. The line can be filled with a colour, gradient, hatch pattern or bitmapped texture.
  3. Thickness - This property sets the thickness of the line (in pixels).

  4. Pattern Type - Selects whether to use a basic (predefined) dash pattern or to create a custom pattern.
  5. Dash Style - Selects the dash style to be used. The following styles are generally used:
    1. Solid: This pattern draws a solid outline without dashing. This is the default setting.
    2. DashDot: This pattern draws a dashed outline with a “dash-dot” pattern.
    3. DashDotDot: This pattern draws a dashed outline with a “dash-dot-dot” pattern.
    4. Dot: This pattern draws a finely dotted outline.
    5. Dash: This pattern draws a dashed outline.

A Custom Pattern Type is also available. When it is selected, the Dash Style property is displayed as a text box in which a custom dash pattern is entered. Dash patterns are expressed as a comma-delimited list of pixel lengths. The first number represents the length of the first dash in the pattern. The next number represents the length of the first space after that dash, and so on. When the end of the sequence is reached, it repeats.
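
The custom dash pattern described above can be sketched in a few lines of Python. This is only an illustration of the rule (dash, space, dash, space, then repeat), not the code of any particular mapping package. The function names parse_dash_pattern and dash_segments are hypothetical, and the sketch assumes the pattern has an even number of entries.

```python
def parse_dash_pattern(text):
    """Turn a comma-delimited pattern such as "6,2,1,2" into pixel lengths."""
    lengths = [float(part) for part in text.split(",") if part.strip()]
    if not lengths or any(value <= 0 for value in lengths):
        raise ValueError("a dash pattern needs positive pixel lengths")
    return lengths


def dash_segments(pattern, line_length):
    """Return the (start, end) pixel ranges that are drawn as dashes."""
    segments = []
    position = 0.0
    index = 0
    while position < line_length:
        # When the end of the pattern is reached, it repeats.
        step = pattern[index % len(pattern)]
        end = min(position + step, line_length)
        if index % 2 == 0:  # entries 1, 3, 5... are dashes, the others are spaces
            segments.append((position, end))
        position = end
        index += 1
    return segments


pattern = parse_dash_pattern("6,2,1,2")
print(dash_segments(pattern, 15))
```

For a line 15 pixels long, the pattern "6,2,1,2" gives a long dash, a gap, a dot-like dash, a gap, and then the start of the next long dash.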

A Line Join determines what type of joint to draw at the point where one line ends and another begins.

Sampling GIS data

Since it is virtually impossible to obtain detailed information regarding every location on the Earth's surface (specifically for GIS), it is best to take samples from a smaller representative subset. GIS deals with explicitly spatial data. The two ways in which spatial data can be sampled are:
-Directed sampling and
-Non-directed sampling
Directed sampling involves making decisions about the objects to be viewed and cataloged. This population occupies an area and both the area and the sample population form the sampling frame.
Non-directed, probability-based sampling is preferable since it eliminates bias.
Probabilistic sampling is of four types:
-Random sampling
-Systematic sampling
-Stratified sampling and
-Homogeneous sampling
The above sampling types can be combined to create a hybrid sampling scheme.
Random spatial sampling is the most basic sampling design. It allows each point, line, area or surface feature to be selected with the same probability as the next.
Systematic sampling uses a repeatable pattern as a basis for selection.
Stratified spatial sampling selects small areas within which individual spots, objects or features are sampled. Stratifying simplifies the process by dividing the task into small regions.
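
The random, systematic and stratified designs described above can be sketched as follows for point locations inside a rectangular study area. This is a minimal illustration under stated assumptions: the study area is a rectangle, the systematic pattern is a square grid, and the strata are equal rectangular blocks. All function names are hypothetical.

```python
import random


def random_sample(xmin, ymin, xmax, ymax, n, rng):
    """Random sampling: every location has the same chance of selection."""
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]


def systematic_sample(xmin, ymin, xmax, ymax, spacing):
    """Systematic sampling: points on a repeatable square grid."""
    points = []
    y = ymin + spacing / 2
    while y < ymax:
        x = xmin + spacing / 2
        while x < xmax:
            points.append((x, y))
            x += spacing
        y += spacing
    return points


def stratified_sample(xmin, ymin, xmax, ymax, rows, cols, per_stratum, rng):
    """Stratified sampling: divide the area into blocks, then sample each block."""
    width = (xmax - xmin) / cols
    height = (ymax - ymin) / rows
    points = []
    for r in range(rows):
        for c in range(cols):
            x0 = xmin + c * width
            y0 = ymin + r * height
            points.extend(
                random_sample(x0, y0, x0 + width, y0 + height, per_stratum, rng)
            )
    return points


rng = random.Random(1)  # fixed seed so the sketch is repeatable
print(len(random_sample(0, 0, 100, 100, 10, rng)))
print(len(systematic_sample(0, 0, 100, 100, 25)))
print(len(stratified_sample(0, 0, 100, 100, 2, 2, 3, rng)))
```

Calling random_sample inside each block of stratified_sample is one simple way of combining two designs into a hybrid scheme.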

Data quality refers to the fitness for use of data for intended applications. The components of data quality are listed below:
-The data must be reliable and accurate to be considered as usable. It should be in agreement with the real world being represented.
-The data must be current and up-to-date for the intended application.
-The data must be relevant in terms of content and level of detail
-The data must be timely for the application
-The data must be complete and precise or the degree of uncertainty must be indicated
-The data must be concise and intelligible (comprehensible to the user)
-The data must be stored in a format that can be conveniently handled (data handling involves one or more of the following operations: maintenance, transmission, distribution, classification, re-sampling, retrieval and updating)
-The data must be documented regarding its source to enable users to determine its suitability for a certain application
-The data must be stored in a map projection that best meets the requirements of the application with regard to the preservation of area, shape, distance and direction
-The data should be captured at a scale using a classification scheme suitable to the application for which it will be used
-Cartographic properties (physical condition of mapping media, quality of line work, use of color and symbology, classification of features, map maintenance and revision cycle) play an important role in determining the quality of data. 

Types of errors in GIS data, their elimination and accuracies

Maps created by GIS have deficiencies.  These deficiencies occur due to “Errors” that may have taken place at different stages of GIS implementation. Flaws in data are referred to as errors. An error is the physical difference between the real world and the GIS output. Errors may be single, definable departures from reality, or may be persistent, widespread deviations throughout a whole database. Errors reduce the accuracy of the map generated. Using well-defined and controlled procedures, errors can be avoided.
There are two types of errors in GIS:
  I. Source Errors:
     These are the errors that are present in the “Source Data” that is given to the GIS. They occur before the actual implementation of GIS:

     a. Instrumental inaccuracies:
        ● Satellite / air photo / GPS / surveying (spatial).
        ● Inaccuracies in attribute measuring instruments.
     b. Human Processing:
        ● Misinterpretation (e.g. photos), spatial and attribute.
        ● Effects of scale change and generalization.
        ● Effects of classification (nominal / ordinal / interval).
     c. Actual Changes:
        ● Gradual 'natural' changes: river courses, glacier recession.
        ● Catastrophic change: fires, floods, landslides.
        ● Seasonal and daily changes: lake / sea / river levels.
        ● Man-made: urban development, new roads.
        ● Attribute change: forest growth (height etc.), discontinued trails / roads, road surfacing.

  II. Processing Errors:
     These are the errors that occur during the processing of the data, i.e. during the implementation of GIS.

     a. Input:
        ● Digitizing: human error, the width of a line, spikes, knots, also entering attribute data.
        ● Dangling nodes (connected to only one arc): permissible in arc themes (river headwaters etc.).
        ● Pseudo-nodes (connected to one or two arcs): permissible in island arcs, and where attributes change, e.g. road becomes paved from dirt or vice versa.
        ● Projection input error.
     b. Manipulation:
        ● Interpolation of point data into lines and surfaces.
        ● Overlay of layers, digitized separately, e.g. soils and vegetation.
        ● The compounding effects of processing and analysis of multiple layers: for example, if two layers each have an accuracy of 90%, the accuracy of the resulting overlay is around 81%.
        ● Density of observations.
        ● Inappropriate or inadequate inputs for models.
     c. Output:
        ● Scale changes - detail and scale bars.
        ● Color palettes: intended colors don't match from screen to printer.
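
The 81% figure in the list above comes from multiplying the accuracies of the layers, which assumes that the errors in the layers are independent of each other. A minimal sketch of that arithmetic follows; the function name overlay_accuracy is hypothetical.

```python
import math


def overlay_accuracy(layer_accuracies):
    """Accuracy of an overlay, assuming errors in the layers are independent."""
    return math.prod(layer_accuracies)


print(round(overlay_accuracy([0.9, 0.9]), 2))       # two layers at 90%
print(round(overlay_accuracy([0.9, 0.9, 0.9]), 3))  # a third layer lowers it further
```

Each additional layer multiplies in another factor below 1, so the accuracy of the result falls as more layers are combined.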

In addition to the above, errors may also be introduced during data transfer and conversion. These types of errors may lead to loss of data.
It is a misconception that data from a GIS is automatically of a higher quality. The quality of information from a GIS depends upon the quality of the data used as input to the GIS.

Precision indicates how closely several positions fall in relation to each other. Precision is the recorded level of detail of the data.

Accuracy is a measure of the closeness of one or more positions to a position that is known and defined in terms of an absolute reference system. Accuracy is the extent to which an estimated data value approaches its true value (Aronoff, 1989).

Bias in GIS data is the systematic variation of data from reality. Bias is a consistent error throughout a dataset. Bias in GIS data may be caused by human or technical sources.

ERROR, PRECISION, ACCURACY and BIAS affect the quality of individual datasets.
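
The difference between precision, accuracy and bias can be shown with a small calculation on repeated position fixes of a point whose true position is known. The sketch below is one possible way of quantifying the three terms, not a standard prescribed by these notes: bias is taken as the mean offset, precision as the root-mean-square spread of the fixes around their own mean, and accuracy as the root-mean-square error from the true position. The function name and the sample coordinates are made up for illustration.

```python
import math
import statistics


def position_quality(measured, true_xy):
    """Return (bias, precision, accuracy) for repeated fixes of one point."""
    tx, ty = true_xy
    dx = [x - tx for x, _ in measured]
    dy = [y - ty for _, y in measured]

    # Bias: the consistent (mean) offset from the true position.
    bias = (statistics.fmean(dx), statistics.fmean(dy))

    # Precision: how closely the fixes fall in relation to each other.
    mean_x, mean_y = tx + bias[0], ty + bias[1]
    precision = math.sqrt(
        statistics.fmean([(x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in measured])
    )

    # Accuracy: closeness of the fixes to the known true position.
    accuracy = math.sqrt(
        statistics.fmean([ex ** 2 + ey ** 2 for ex, ey in zip(dx, dy)])
    )
    return bias, precision, accuracy


fixes = [(104, 100), (106, 100), (104, 102), (106, 102)]  # hypothetical fixes
print(position_quality(fixes, (100, 100)))
```

In this example the fixes are tightly grouped (good precision) but all lie to one side of the true position (a bias), so the accuracy is poor even though the precision is good.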

GIS error elimination
The simplest method of checking for data errors in GIS is by visual inspection. Comparison of data in GIS format with the original map reveals major errors. Double digitising is an error checking method used by large companies. This involves digitising the same map twice and comparing the two copies to identify inconsistencies. This is a costly and time consuming method of error checking. Statistical methods can be used to pinpoint potential errors. 
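
The comparison step in double digitising can be sketched as below. The sketch assumes that both copies contain the same vertices in the same order, which real digitised data will not always do, and the function name and tolerance value are hypothetical.

```python
import math


def digitising_mismatches(first_pass, second_pass, tolerance):
    """Return the indices of vertices that differ by more than the tolerance."""
    if len(first_pass) != len(second_pass):
        raise ValueError("the two copies must contain the same number of vertices")
    flagged = []
    for index, ((x1, y1), (x2, y2)) in enumerate(zip(first_pass, second_pass)):
        if math.hypot(x2 - x1, y2 - y1) > tolerance:
            flagged.append(index)
    return flagged


copy_1 = [(0, 0), (10, 0), (10, 10)]
copy_2 = [(0.1, 0), (10, 0.2), (12, 10)]
print(digitising_mismatches(copy_1, copy_2, tolerance=0.5))
```

The vertices that are flagged are the ones to check against the original map.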

Components of data quality

There are two important components of data quality. They are:
-Terminology used for describing problems and
-Sources, propagation and management of errors.
GIS analysis and results rely heavily on access to reliable, good quality data.
Quality is defined as "degree of excellence". In the case of GIS, data quality indicates how good the data are. This in turn indicates the suitability of the data for a specific purpose. Data sets used for analysis should be COMPLETE, COMPATIBLE, CONSISTENT and APPLICABLE for the analysis being performed.
A COMPLETE dataset covers the entire study area and the whole time period of interest. That is, the data are complete spatially and temporally and carry a complete set of attribute information.
COMPATIBLE datasets are developed using SIMILAR METHODS of data capture, storage, manipulation and editing.
CONSISTENT data should come from a SINGLE SOURCE DOCUMENT and be digitized by a SINGLE PERSON.
APPLICABILITY refers to SUITABILITY of data for a set of COMMANDS, OPERATIONS or ANALYSIS. 

Graphic symbols

A symbol is a graphic pattern that is used to represent a feature on a map. According to the type of features they represent, symbols are classified into marker symbols (representing point and node features), line symbols (representing arcs, routes, etc.), shade symbols (to fill polygons and regions with solid colour or shade patterns) and text symbols (representing text used to label features).
Symbols on a map are either points, lines or areas. Each symbol might differ in size, shape, density, texture, orientation and colour. A poor match between the real world and the symbol used to depict it can confuse the user regarding its true nature. Colour is important in influencing the user's understanding of the map. The overall impact of the map can be affected by colour and symbolism. Black and white maps, which use differently shaped symbols and shading patterns, are effective for simple patterns while being easier and cheaper to reproduce. Colour maps are used to effectively represent complex spatial patterns. The shape and pattern of symbolism used should bear relation to the feature being represented. 
Careful choice of shape and pattern of symbolism used can influence the user's impression of the map. Density and texture of shading can affect the impression given by the map. To ensure correct interpretation, the map user is given the necessary information by a key or legend. A key tells the user what the shading patterns, colours, line and point symbols mean.
Intellectual objectives and visual objectives of map design are sometimes in conflict. Symbols on the map (logical and aesthetic) dictate a consistent approach. Cartographers have established (over a period of generations) an exhaustive set of conventions and traditions to deal with such cases. These conventions were a result of trial and error and testing among map users. They act as guidelines for cartographic representations.


Friday, November 27, 2015

Types of errors in GIS

Maps created by GIS have deficiencies.  These deficiencies occur due to “Errors” that may have taken place at different stages of GIS implementation. These errors reduce the accuracy of the map generated. Using well defined and controlled procedures these errors can be avoided.
There are two types of errors in GIS:
  I. Source Errors:
     These are the errors that are present in the “Source Data” that is given to the GIS. They occur before the actual implementation of GIS:

     a. Instrumental inaccuracies:
        ● Satellite / air photo / GPS / surveying (spatial).
        ● Inaccuracies in attribute measuring instruments.
     b. Human Processing:
        ● Misinterpretation (e.g. photos), spatial and attribute.
        ● Effects of scale change and generalization.
        ● Effects of classification (nominal / ordinal / interval).
     c. Actual Changes:
        ● Gradual 'natural' changes: river courses, glacier recession.
        ● Catastrophic change: fires, floods, landslides.
        ● Seasonal and daily changes: lake / sea / river levels.
        ● Man-made: urban development, new roads.
        ● Attribute change: forest growth (height etc.), discontinued trails / roads, road surfacing.

  II. Processing Errors:
     These are the errors that occur during the processing of the data, i.e. during the implementation of GIS.

     a. Input:
        ● Digitizing: human error, the width of a line, spikes, knots, also entering attribute data.
        ● Dangling nodes (connected to only one arc): permissible in arc themes (river headwaters etc.).
        ● Pseudo-nodes (connected to one or two arcs): permissible in island arcs, and where attributes change, e.g. road becomes paved from dirt or vice versa.
        ● Projection input error.
     b. Manipulation:
        ● Interpolation of point data into lines and surfaces.
        ● Overlay of layers, digitized separately, e.g. soils and vegetation.
        ● The compounding effects of processing and analysis of multiple layers: for example, if two layers each have an accuracy of 90%, the accuracy of the resulting overlay is around 81%.
        ● Density of observations.
        ● Inappropriate or inadequate inputs for models.
     c. Output:
        ● Scale changes - detail and scale bars.
        ● Color palettes: intended colors don't match from screen to printer.

Wednesday, November 25, 2015

Neighbourhood operations


Neighborhood operations, also called proximity analyses, consider the characteristics of neighboring areas around a specific location.  These functions either modify existing features or create new feature layers, which are influenced, to some degree, by the distance from existing features.  All GIS programs provide some neighborhood analyses, which include buffering, interpolation, Thiessen polygons, and various topographic functions.

Buffering
Buffering creates physical zones around features.  These “buffers” are usually based on specific straight-line distances from selected features.  Buffers are common to both raster and vector systems and are created around point, line, or polygon features.  The resulting buffers are placed in an output polygon feature layer.  Once complete, buffer layers are used to determine which features (in other layers) occur either within or outside the buffers (spatial queries), to perform overlay, or to measure the area of the buffer zone.  Buffering is the most used neighborhood operation.
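
For the simplest case, a circular buffer around a single point, the spatial query and the area measurement described above can be sketched as follows. The feature names and coordinates are made up, and the function names are hypothetical. Buffers around lines and polygons need proper geometry routines and are not covered by this sketch.

```python
import math


def within_buffer(centre, features, distance):
    """Return the names of point features that fall inside a circular buffer."""
    cx, cy = centre
    return [
        name
        for name, (x, y) in features.items()
        if math.hypot(x - cx, y - cy) <= distance
    ]


def buffer_area(distance):
    """Area of a circular buffer around a point."""
    return math.pi * distance ** 2


features = {"well": (3, 4), "school": (6, 8), "clinic": (0, 5)}
print(within_buffer((0, 0), features, distance=5))
print(round(buffer_area(5), 1))
```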

Interpolation
Interpolation is a method of predicting or estimating pixel values at unsampled locations based on the known values of neighboring pixels.  Since it is impractical to take measurements at all locations across the study area due to money, time, legal, and physical constraints, interpolation is required between known pixel values (sampled locations).  With interpolation, a continuous surface like elevation, temperature, and soil characteristics can be created.  Because of its continuous nature, interpolation is only available within raster-based systems.
There are many different types of interpolation. Linear interpolation, the simplest form, assumes that the value changes uniformly with distance from the recorded pixels.  This interpolation method is not always appropriate, so other methods are also used:

  1. Fixed-radius Local Averaging
  2. Inverse Distance Weighted
  3. Trend Surface
  4. Splines
  5. Kriging
All of these interpolation methods look at the values of the recorded pixels to generate the value of the pixels that fall in between.  The methods differ in how they weigh the recorded attributes and in the number of observations used for each method.  No method is accurate in every situation.
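
As an illustration of how one of these methods weighs the recorded values, here is a minimal sketch of Inverse Distance Weighted interpolation. It uses every sample point and a power of 2, both of which are assumptions made for the sketch. GIS packages usually let the user limit the number of neighbours or the search radius.

```python
import math


def idw(samples, x, y, power=2):
    """Estimate the value at (x, y) from samples given as (x, y, value)."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0:
            return value  # the location coincides with a sampled point
        weight = 1 / distance ** power  # nearer samples get larger weights
        numerator += weight * value
        denominator += weight
    return numerator / denominator


samples = [(0, 0, 10.0), (10, 0, 20.0), (0, 10, 30.0)]
print(idw(samples, 2, 2))
```

Because the weights fall off with distance, the estimate at (2, 2) is pulled towards the value of the nearest sample at (0, 0).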


Thiessen polygons (Voronoi or proximal polygons)
Thiessen polygons are boundaries created around points within a point layer.  One polygon forms around each point and delineates the territory in which every location is closer to that point than to any other point in the layer.  Attributes associated with each point are assigned to the resultant polygon.  It is both a vector and a raster process.
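
The raster form of this process can be sketched by giving every cell the name of its nearest point. The sketch assumes square cells of size 1 with their centres at half-cell offsets. The point names and function names are hypothetical.

```python
def nearest_point(points, x, y):
    """Return the name of the point closest to location (x, y)."""
    return min(
        points,
        key=lambda name: (points[name][0] - x) ** 2 + (points[name][1] - y) ** 2,
    )


def thiessen_grid(points, width, height):
    """Label every cell of a width x height grid with its nearest point."""
    return [
        [nearest_point(points, col + 0.5, row + 0.5) for col in range(width)]
        for row in range(height)
    ]


points = {"A": (1, 1), "B": (5, 1)}
for row in thiessen_grid(points, width=6, height=2):
    print("".join(row))
```

The boundary between the A cells and the B cells lies halfway between the two points, which is where the edge of the Thiessen polygon would be drawn.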

Topographic Functions
Topographic functions use Digital Elevation Models (DEMs) to illustrate the lay of the land.  DEMs are raster layers containing elevation data in each pixel.  From these values, you produce output layers to portray slope (inclination), aspect (direction), and hillshading.  These topographic functions are typical neighborhood processes; each pixel in the resultant layer is a product of its own elevation value as well as those of its surrounding neighbors.

Slope layers exhibit the incline or steepness of the land.  Slope is the change in elevation over a defined distance.
Aspect is the compass direction in which a slope faces.  It is usually expressed in degrees, measured clockwise from north, from 0 to 360.
Hillshading, which is cartographically called shaded relief, is a lighting effect which mimics the sun to highlight hills and valleys.  Some areas appear to be illuminated while others lie in shadows.

Although these functions are raster processes, most can be mimicked in a vector environment by Triangulated Irregular Networks (TIN).  In addition, topographic functions can derive vector isolines (contours).
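
The way slope and aspect depend on the neighbouring pixels can be sketched with simple central differences on a small DEM. This is only one of several possible formulas, and GIS packages often use a weighted 3 x 3 window instead. The sketch assumes that row 0 is the northern edge of the raster, that the cell is not on the edge of the DEM, and that aspect is the downslope direction. The function name is hypothetical.

```python
import math


def slope_aspect(dem, row, col, cell_size):
    """Return (slope, aspect) in degrees for an interior DEM cell."""
    west, east = dem[row][col - 1], dem[row][col + 1]
    north, south = dem[row - 1][col], dem[row + 1][col]

    dz_dx = (east - west) / (2 * cell_size)    # rate of change towards the east
    dz_dy = (north - south) / (2 * cell_size)  # rate of change towards the north

    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None  # flat cell: aspect is undefined

    # The slope faces downhill, so reverse the gradient and measure
    # the angle clockwise from north.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360
    return slope, aspect


# Elevation rises towards the north, so the slope faces south.
dem = [
    [20, 20, 20],
    [10, 10, 10],
    [0, 0, 0],
]
print(slope_aspect(dem, 1, 1, cell_size=10))
```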

Neighborhood functions operate on the neighboring features of a given feature or set of features.
Search functions allow the retrieval of features that fall within a given search window (rectangle, circle, or polygon).

Line-in-polygon and point-in-polygon functions determine whether a given linear or point feature is located within a given polygon, or they report the polygon(s) in which a given point or line is contained.
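
A point-in-polygon test is commonly carried out with the ray casting method: a ray is drawn from the point towards the east and the number of polygon edges it crosses is counted. An odd count means the point is inside. The sketch below is a minimal version for a simple polygon without holes, and it does not treat points lying exactly on the boundary in any special way.

```python
def point_in_polygon(x, y, polygon):
    """Return True if (x, y) lies inside the polygon given as a list of (x, y)."""
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # the ray crosses this edge
    return inside


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(point_in_polygon(5, 5, square))
print(point_in_polygon(15, 5, square))
```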

Topographic functions compute the slope or aspect from a given digital representation of the terrain (digital terrain model or DTM).
Interpolation functions predict unknown values using the known values at neighboring locations.
Contour generation functions calculate contours as a set of lines that connect points with the same attribute value. Examples are points with the same elevation (contours), depth (bathymetric contours), barometric pressure (isobars), or temperature (isothermal lines).

Record overlay

Overlay is one of the most common, powerful and best-known GIS functions.  It investigates the spatial association of features by “vertically stacking” feature layers to reveal geographic patterns and determine locations that meet specific criteria.
Overlay operations combine two different layers and apply the set-theoretic operations of intersection, union, difference, and complement.
Vector and raster models both perform overlay, but their overlay functions differ considerably.
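
The set-theoretic operations can be illustrated with Python sets if each layer is simplified to the set of grid cells it covers. This simplification is an assumption made only for the sketch: a real vector overlay computes the geometric intersection of the polygons, and the layer contents below are made up.

```python
def overlay(layer_a, layer_b):
    """Apply set-theoretic overlay operations to two layers of grid cells."""
    return {
        "intersection": layer_a & layer_b,  # A and B
        "union": layer_a | layer_b,         # A or B
        "a_not_b": layer_a - layer_b,       # A not B (difference)
    }


# Each layer is the set of (row, column) cells covered by a feature.
forest = {(0, 0), (0, 1), (1, 0), (1, 1)}
floodplain = {(1, 1), (1, 2), (2, 1), (2, 2)}

result = overlay(forest, floodplain)
print(sorted(result["intersection"]))
print(len(result["union"]))
print(sorted(result["a_not_b"]))
```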

Vector (Logical) Overlay
Vector overlay predominantly overlays polygons in one layer over polygons in another layer, but it can also be used to overlay point or line features over polygon layers.  Sometimes referred to as topological or logical overlay, it is conceptually and mathematically more demanding than raster overlay.  There are three types of vector overlay operations:
  1. Polygon on polygon is where one polygon layer is superimposed over another polygon layer to create a new output polygon layer.  The resultant polygons may contain some or all of the attributes from the polygons in which they were created.  Several types of polygon on polygon overlay exist, including intersection (A and B), union (A or B), and clip (A not B).  These Boolean operators work both on the attribute table and the geography. 
    1. Intersection computes the geometric intersection of all of the polygons in the input layers. Only those features that share a common geography are preserved in the output layer.  Any polygon or portion of a polygon that falls outside of the common area is discarded from the output layer.  The new polygon layer can possess the attribute data of the features in the input layers.
    2. Union combines the features of input polygon layers.  All polygons from the input layers are included in the output polygon layer.  It can also possess the combined attribute data of the input polygon layers.
    3. Clip removes those features (or portions of features) from an input polygon layer that overlap with features from a clip polygon layer.  The clip layer acts as a cookie cutter to remove features (and portions of features) that fall inside the clip layer.  Note that many GIS packages call this “A not B” operation erase or difference, and use the name clip for the opposite operation, which keeps only the features that fall inside the clip layer.
  2. Point in polygon is where a layer of point features is superimposed over a layer of polygon features.  The two layers produce a point layer that includes attributes from the surrounding input layer polygons. Alternatively, the number of point features falling within each polygon is counted and stored as a new attribute in the polygon layer.  Other point attributes can be aggregated (summed, averaged, etc.) and included as attributes in the polygon’s data file.  The transferring of attributes based on their geographic position is called a spatial join.
  3. Line on polygon is similar to point in polygon, but lines are superimposed on polygons.  This type of spatial join either appends polygon attributes to line features falling within them or counts and aggregates line attribute data to the polygon layer’s data file.
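
The counting and aggregating form of the point in polygon spatial join can be sketched as follows. To keep the example short, the polygons are assumed to be axis-aligned rectangles given as (xmin, ymin, xmax, ymax). The zone names, point values and function name are all hypothetical.

```python
def spatial_join_counts(points, zones):
    """Count the points in each zone and sum their attribute values."""
    summary = {name: {"count": 0, "total": 0.0} for name in zones}
    for x, y, value in points:
        for name, (xmin, ymin, xmax, ymax) in zones.items():
            if xmin <= x < xmax and ymin <= y < ymax:
                summary[name]["count"] += 1
                summary[name]["total"] += value
                break  # a point is assigned to at most one zone
    return summary


zones = {"north": (0, 5, 10, 10), "south": (0, 0, 10, 5)}
wells = [(1, 1, 4.0), (2, 7, 6.0), (8, 2, 10.0), (20, 20, 1.0)]  # (x, y, yield)
print(spatial_join_counts(wells, zones))
```

The last well lies outside both zones, so it is not counted in either of them.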
Raster (Arithmetic) Overlay
Raster overlay superimposes at least two input raster layers to produce an output layer.  Each cell in the output layer is calculated from the corresponding pixels in the input layers.  To do this, the layers must line up perfectly; they must have the same pixel resolution and spatial extent.  If they do not align, they can be adjusted to fit by the pre-processing functions. After preprocessing, raster overlay is flexible, efficient, quick, and offers more overlay possibilities than vector overlay. 

Raster overlay, frequently called map algebra, is based on calculations which include arithmetic expressions and set and Boolean algebraic operators to process the input layers to create an output layer.  The most common operators are addition, subtraction, multiplication, and division, but other popular operators include maximum, minimum, average, AND, OR, and NOT.  Raster overlay uses arithmetic operators to combine the corresponding cells of two or more input layers, uses Boolean algebra like AND or OR to find the pixels that fit a particular query statement, or executes statistical tests like correlation and regression on the input layers.
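
A cell-by-cell raster overlay can be sketched with plain Python lists standing in for raster layers. The layer names and values are made up for illustration, and the function name local_overlay is hypothetical. The check at the start reflects the requirement that the layers line up.

```python
def local_overlay(layer_a, layer_b, operator):
    """Combine two aligned raster layers cell by cell with the given operator."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b)
    )
    if not same_shape:
        raise ValueError("layers must have the same number of rows and columns")
    return [
        [operator(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


gentle_slope = [[1, 0], [1, 1]]  # 1 = cell meets the slope criterion
good_soil = [[1, 1], [0, 1]]     # 1 = cell meets the soil criterion

print(local_overlay(gentle_slope, good_soil, lambda a, b: a + b))         # addition
print(local_overlay(gentle_slope, good_soil, lambda a, b: int(a and b)))  # Boolean AND
print(local_overlay(gentle_slope, good_soil, max))                        # maximum
```

The Boolean AND result marks the cells that satisfy both criteria, which is the raster equivalent of a query for suitable locations.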

Monday, November 9, 2015

IMPORTANT OBJECTIVE QUESTIONS

Geographic coordinate systems contain DATUMS that are based on SPHEROIDS
An AFFINE transformation is commonly used in GIS to transform maps between coordinate systems
The process of converting a newly digitised map into projected coordinates is called MAP TO MAP TRANSFORMATION
Image to map transformation is also called GEOREFERENCING
GIS is differentiated from other information systems due to its SPATIAL ANALYSIS FUNCTIONS
The two fundamental functions of a GIS are: (i) GENERATION OF MAPS and (ii) GENERATION OF TABULAR REPORTS
Spatial analysis builds operational procedures on SPATIAL RELATIONSHIPS BETWEEN MAP FEATURES
If no spatial information is required to ask a question, the query is called an ATTRIBUTE QUERY
The design of analysis depends upon the PURPOSE OF STUDY
The use of GIS to query geographic features and retrieve associated attribute information is called IDENTIFICATION
The tools that characterise spatial variability are: SPATIAL AUTOCORRELATION FUNCTION & VARIOGRAM
Spatial analysis can be INDUCTIVE, DEDUCTIVE or NORMATIVE
The ability to CALCULATE & MANIPULATE DISTANCES forms the basis of spatial analysis
Point patterns can be identified as CLUSTERED, DISPERSED or RANDOM
The six categories of spatial analysis are: QUERIES & REASONING, MEASUREMENTS, TRANSFORMATIONS, DESCRIPTIVE SUMMARIES, OPTIMIZATION and HYPOTHESIS TESTING
The four interpolation methods are: INVERSE DISTANCE WEIGHTING (IDW), TRIANGULATED IRREGULAR NETWORKS (TIN), REGULARISED SPLINES WITH TENSION (RST) and KRIGING
Operators associated with connectivity are: NETWORK ANALYSIS, DIFFUSION MODELS, CELLULAR AUTOMATA and AGENT BASED MODELS
The two connectivity functions widely used in GIS are: (i) CONTIGUITY and (ii) SPREADING
The four parameters used to measure proximity are: (i) TARGET LOCATIONS, (ii) UNIT OF MEASUREMENT, (iii) FUNCTION TO CALCULATE PROXIMITY & (iv) AREA TO BE ANALYSED
A common type of proximity analysis is ESTABLISHMENT OF A BUFFER ZONE
The three principal types of GIS analysis performed by networking are: (i) PREDICTION OF LOADING, (ii) ROUTE OPTIMIZATION & (iii) RESOURCE ALLOCATION
INTERVISIBILITY FUNCTIONS rely on digital elevation data to define surrounding topography
A GIS application can be classified into: (i) FOUR DIMENSIONAL GIS, (ii) MULTIMEDIA GIS, (iii) WEB GIS & (iv) VIRTUAL REALITY GIS
The four categories under which GIS functions fall are: (i) RETRIEVAL/CLASSIFICATION/MEASUREMENT FUNCTIONS, (ii) OVERLAY FUNCTIONS, (iii) NEIGHBOURHOOD FUNCTIONS & (iv) CONNECTIVITY FUNCTIONS
Vector methods of spatial analysis are good for SPARSE DATA SETS
Raster methods of spatial analysis are good for GRID CALCULATIONS
Examples of neighbourhood functions are: AVERAGE, DIVERSITY, MINIMUM/MAXIMUM & TOTAL
Five steps involved in cartographic modeling are: (i) STATEMENT OF PROBLEM OR OBJECTIVES, (ii) STATEMENT OF CONDITIONS OR ASSUMPTIONS (iii) METHODOLOGY (iv) IMPLEMENTATION & (v) EVALUATION
Raster data analysis is based on CELLS & RASTERS
Reclassification is also called RECODING or TRANSFORMING
The two types of global functions are: (i) EUCLIDEAN DISTANCE & (ii) WEIGHTED DISTANCE
The four groups of mathematical functions provided by raster calculator are: (i) LOGARITHMIC, (ii) ARITHMETIC, (iii) TRIGONOMETRIC and (iv) EXPONENTIAL
The four major reasons requiring reclassification are: (i) TO REPLACE VALUES, (ii) TO GROUP VALUES, (iii) TO RECLASSIFY VALUES & (iv) TO SET SPECIFIC CELLS TO A VALUE
The two sub-systems in a knowledge based system are: (i) KNOWLEDGE BASE & (ii) AN INFERENCE ENGINE
Knowledge based systems were first developed by ARTIFICIAL INTELLIGENCE RESEARCHERS
A GIS model may be descriptive or PRESCRIPTIVE; DETERMINISTIC or stochastic; STATIC or dynamic; deductive or INDUCTIVE
The five types of digital elevation data are: (i) DIGITAL SURFACE MODEL (DSM), (ii) DIGITAL TERRAIN MODEL (DTM), (iii) BATHYMETRIC SURFACE, (iv) MIXED SURFACE and (v) POINT CLOUD ELEVATION FILE
The process of detecting and removing errors by editing is called CLEANING
Errors are classified into (i) ENTITY ERRORS, (ii) ATTRIBUTE ERRORS & (iii) ENTITY-ATTRIBUTE AGREEMENT ERRORS
Undershoots and overshoots are types of DANGLING NODES
Remotely sensed data are used to measure several environmental parameters like: (i) SURFACE & CLOUD TOP REFLECTANCES,(ii) ALBEDO,(iii) AREA AND POTENTIAL YIELD OF CROP TYPES & (iv) HEIGHT AND DENSITY OF FOREST STANDS
Satellite scanners operate in either a SWEEP mode or a PUSHBROOM mode
An overlay operation can be classified as (i) POINT-IN-POLYGON, (ii) LINE-IN-POLYGON, or (iii) POLYGON-IN-POLYGON
An overlay operation is called INTERSECT if it uses the AND connector
An overlay operation is called UNION if it uses the OR connector
An overlay operation is called IDENTITY if it uses a combination of the AND and OR connectors
An overlay operation is called DIFFERENCE or symmetrical difference if it uses the XOR connector
The most common type of output of a GIS is a MAP
The five types of vector data analysis are: (i) BUFFERING, (ii) OVERLAY, (iii) MEASURING DISTANCES, (iv) SPATIAL STATISTICS FOR PATTERN ANALYSIS & (v) TOOLS FOR MAP MANIPULATION
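
The AFFINE transformation mentioned in the list above is usually written as x' = a*x + b*y + c and y' = d*x + e*y + f, where the six coefficients take care of scaling, rotation, skew and translation. A minimal sketch follows. The coefficient values are made up for illustration. In practice they are estimated from control points.

```python
def affine_transform(point, a, b, c, d, e, f):
    """Apply x' = a*x + b*y + c and y' = d*x + e*y + f to a point."""
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Scale digitiser units by 2 and shift the origin to (100, 200).
print(affine_transform((3, 4), a=2, b=0, c=100, d=0, e=2, f=200))
```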

Saturday, November 7, 2015

COST-BENEFIT ANALYSIS OF AN INVESTMENT IN GIS


GIS is sometimes criticised as an expensive solution to a simple problem. This is true when using high-end GIS to solve problems that can be tackled by simple inexpensive desktop mapping packages. Hence it is difficult to quantify the benefits of using GIS. The organisation paying for the software may find it difficult to realise the financial benefits immediately whereas outsiders gaining access to the products will be pleased with the higher accuracy. However, the digital output products will have ling-term benefits.

Major costs are incurred in the early stages of a project in GIS and the benefits may materialise later into the project cycle. This is due to the fact that an initially large investment results in lower maintenance and updating costs resulting in sustainable benefits in the long run. The long term benefits are significantly higher as the process results in a multi-purpose digital database.

The introduction of GIS requires a change of routine and involves expense not only for software and hardware but also for data purchase, training, planning and organizational restructuring.

Some of the costs not directly related to GIS but required for data input into GIS are:
-costs involved for data collection
-costs involved in data conversion

The various cost components required for implementation of a project in GIS are listed below:
-Evaluation of available data and development of a data conversion strategy
-Computers with fast processors, plenty of storage space and other peripherals required by GIS, such as digitizers and large-format colour printers
-Evaluation and selection of GIS mapping software
-Prototype development
-Hardware/Software system configuration/customisation
-Establishing human resource planning to ensure a smooth transition to the new system
-Training, which is the fourth-largest expense of any GIS activity and amounts to almost 5 to 10% of the total project cost
-Costs involved in database design, data modeling and procedural manual development
-Additional costs are incurred if the old and new systems have to be operated in parallel during the transition period.
-Costs are incurred for data acquisition and data purchase or data capture and data conversion
-Since data conversion is labour intensive and error prone, a rigorous procedure for checking the resulting data for positional accuracy and logical consistency should be part of the process
-System maintenance, involving software and hardware upgrades and the training they require, costs about 10% of the initial investment per year
-A periodic review of the GIS group’s work should be part of the regular activities
-Development of data distribution strategies
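
As a rough illustration of how the maintenance figure in the list above accumulates, the sketch below adds yearly maintenance at 10% of the initial investment to an initial outlay. The function name and the sample amount are assumptions made for illustration only, not figures from a real project.

```python
def cumulative_cost(initial_investment, years, maintenance_rate=0.10):
    """Total spend after `years`, assuming maintenance is a fixed share
    of the initial investment each year (the 10% figure quoted above)."""
    return initial_investment + initial_investment * maintenance_rate * years


# Hypothetical initial investment of 100,000 (any currency) over 5 years.
total = cumulative_cost(100_000, 5)
print(total)
```

Under these assumptions, five years of maintenance add half of the initial investment again, which is why the early outlay dominates the budget while the running costs stay comparatively low.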

The benefits of GIS are of two types:
-Efficiency benefits
-Effectiveness benefits

Efficiency benefits include:
-Cost savings
-Productivity gains

These benefits may be:
-measurable benefits
-indirect effects

Another benefit of GIS is the greater credibility and authority of map products.
Digital techniques help produce maps requested by customers quickly and cheaply.
A digital approach helps produce maps with higher accuracy.
A digital database ensures a high degree of consistency.
GIS supports a thriving secondary market in associated mapping services.

Effectiveness benefits include:
-Improved analysis
-Improved policy making
-Improved data sharing and
-Improved outreach

Monday, November 2, 2015

Organizing data for analysis

GIS software organizes spatial data in a thematic approach that categorizes data in vertical layers. The definition of layers is fully dependent on the requirements.
Typical layers used in natural resource management agencies include forest cover, soil classification, elevation, road network (access), ecological areas, hydrology, etc.

Spatial data layers are commonly input one at a time, e.g. forest cover. Accordingly, attribute data is entered one layer at a time. Depending on the attribute data model used by the data storage subsystem, data must be organized in a format that will facilitate the manipulation and analysis tasks that will be required. Most often, the spatial and attribute data are entered at different times and linked together later.
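
A minimal sketch of linking separately entered spatial and attribute records through a shared feature identifier. The layer contents, the field names and the `stand_id` key are hypothetical and only illustrate the idea.

```python
# Spatial records for a forest cover layer, keyed by a feature ID (hypothetical data).
geometries = {
    101: [(0, 0), (4, 0), (4, 3), (0, 3)],
    102: [(4, 0), (8, 0), (8, 3), (4, 3)],
}

# Attribute records entered later, carrying the same key.
attributes = [
    {"stand_id": 101, "species": "pine", "age": 40},
    {"stand_id": 102, "species": "spruce", "age": 65},
]


def link_layer(geometries, attributes, key="stand_id"):
    """Join attribute rows to the geometries that share the same feature ID."""
    linked = {}
    for row in attributes:
        feature_id = row[key]
        if feature_id in geometries:
            linked[feature_id] = {"geometry": geometries[feature_id], **row}
    return linked


forest_cover = link_layer(geometries, attributes)
print(forest_cover[102]["species"])
```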


Vector data analysis

Vector data analysis uses the geometric objects of points, lines and polygons. The accuracy of the analysis depends on the accuracy of these objects in terms of location and shape.
Vector data analysis includes:
  1. Buffering
  2. Overlay
  3. Measuring distances
  4. Spatial statistics for pattern analysis and
  5. Tools for map manipulation
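
Two of the operations listed above, measuring distances and buffering, can be sketched for point features as follows. This is a simplified illustration that assumes planar coordinates and a circular buffer around a single point; the function names and sample coordinates are hypothetical.

```python
import math


def distance(p, q):
    """Straight-line (Euclidean) distance between two points in planar coordinates."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def within_buffer(feature, candidates, buffer_distance):
    """Return the candidate points that lie inside a circular buffer around `feature`."""
    return [c for c in candidates if distance(feature, c) <= buffer_distance]


# Hypothetical well location and nearby houses.
well = (0.0, 0.0)
houses = [(3.0, 4.0), (6.0, 8.0), (1.0, 1.0)]
print(within_buffer(well, houses, 5.0))
```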

Types of output data in GIS

Output is the final product of any analysis. Analysis of geographic data using GIS software provides a tool to effectively communicate the results to decision makers.

The most common form of output from a GIS is a map. Other, non-cartographic forms of GIS output are tables and charts containing spatial and non-spatial attribute information.

The other types of outputs are:

  • Cartograms
  • Directions
  • Customer lists
  • 3D diagrams and movies
  • Contour maps
  • Non-contiguous cartograms
  • Vector map
  • Non-contiguous map
  • Pin map
  • Hatch map
  • Density map
  • Gradient map
  • Symbol map
  • Territory map
  • Pie map
  • Bar maps
  • Line graph maps
  • Prism map
  • Flow map
  • Classified choropleth map
  • Graduated circles

In all of these forms, GIS provides a graphical way to view attribute values related to a spatial object.

Data retrieval query (SQL)

Data retrieval involves the capability to easily select data for graphic or attribute editing, updating, querying, analysis and/or display.
Retrieval functions allow the selective search and manipulation of data without the need to create new entities.
Querying is the capability to retrieve data, usually a data subset, based on some user-defined formula. These data subsets are often referred to as logical views. Often the querying is closely linked to the data manipulation and analysis subsystem. Several GIS software vendors have attempted to standardize their querying capability by use of the Structured Query Language (SQL).

Structured Query Language (SQL) is a data query and manipulation language designed for relational databases. It was developed by IBM in the 1970s. A new development has been to extend SQL to Object-oriented database management systems and to spatial data.

In order to use SQL to access a database, the syntax of the query language must be used. The basic syntax of SQL is:

select <Attribute list> from <Relation> where <Condition>
Here, 'select', 'from' and 'where' are the keywords; Attribute list names the fields to retrieve from the database, Relation refers to a table and Condition gives the criteria for the data query.
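
The basic syntax can be tried out with Python's built-in sqlite3 module. The sketch below is only an illustration: the `parcels` table, its fields and its rows are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical table of land parcels.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (parcel_id INTEGER, land_use TEXT, area_ha REAL)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [(1, "forest", 12.5), (2, "urban", 3.2), (3, "forest", 48.0)],
)

# select <Attribute list> from <Relation> where <Condition>
rows = conn.execute(
    "SELECT parcel_id, area_ha FROM parcels "
    "WHERE land_use = 'forest' AND area_ha > 20"
).fetchall()
print(rows)
conn.close()
```

Here the attribute list is `parcel_id, area_ha`, the relation is the `parcels` table and the condition selects forest parcels larger than 20 hectares.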



Overlay operations

Overlay is defined as a GIS operation that combines the geometries and attributes of input layers to create the output.
The geometry of the output represents the geometric intersection of features from the input layers.
Each feature on the output contains a combination of attributes from the input layers and this combination differs from its neighbours.
Features to be overlaid must be spatially registered and be based on the same coordinate system. In addition, the layers must be in the same zone and should have the same datum.

Feature type and Overlay
Considering the feature type, there are two groups of overlay operations.

  1. Using two polygon layers as inputs
  2. Using one polygon layer and another layer that contains either lines or points
Hence, overlay operations can be classified as either:
  1. point-in-polygon
  2. line-in-polygon or
  3. polygon-in-polygon
The layer that may be either a point, line or polygon layer is called the input layer and the layer which is a polygon layer is called the overlay layer.

In a point-in-polygon overlay operation, the same point features in the input layer are included in the output and each point is assigned the attributes of the polygon within which it falls.
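
A point-in-polygon overlay can be sketched with the well-known ray-casting test. This is a minimal illustration rather than how any particular GIS package implements it; the polygons, the points and the `soil` attribute are hypothetical, and points lying exactly on a boundary are not treated specially.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: count how many polygon edges a horizontal ray from
    `point` crosses; an odd count means the point is inside."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_in_polygon_overlay(points, polygons):
    """Copy each point to the output with the attributes of the polygon containing it."""
    output = []
    for pt in points:
        match = next(
            (p for p in polygons if point_in_polygon(pt["xy"], p["ring"])), None
        )
        attrs = match["attrs"] if match else {}
        output.append({**pt, **attrs})
    return output


# Overlay layer: two adjacent soil polygons. Input layer: two sample points.
polygons = [
    {"ring": [(0, 0), (4, 0), (4, 4), (0, 4)], "attrs": {"soil": "clay"}},
    {"ring": [(4, 0), (8, 0), (8, 4), (4, 4)], "attrs": {"soil": "sand"}},
]
points = [{"id": 1, "xy": (1, 1)}, {"id": 2, "xy": (6, 2)}]
result = point_in_polygon_overlay(points, polygons)
print(result)
```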

In a line-in-polygon overlay operation, output contains the same line features as the input but each line feature is dissected by polygon boundaries on the overlay layer. This implies that the output layer has more line segments than the input layer.
Each line segment on the output layer combines attributes from the input layer and the underlying polygon.

The most common overlay operation is the polygon-on-polygon overlay operation. This operation involves two polygon layers and the output combines the polygon boundaries from the input and overlay layers to create a new set of polygons.
Each polygon carries attributes from both layers and these attributes differ from those of adjacent polygons.

Overlay methods are based on Boolean connectors (AND, OR and XOR).
An overlay operation is called:
  • intersect if it uses the AND connector
  • union if it uses the OR connector
  • difference or symmetrical difference if it uses the XOR connector
  • identity or minus if it uses a combination of the AND and OR connectors
Union preserves ALL features from the inputs. The area extent of the output combines the area extents of both input layers. The Union overlay method requires that both input layers be polygon layers.

Intersect preserves only those features that fall within the extent common to both input layers. Input layers may contain different feature types. It is a preferred method of overlay as any feature on its output has attribute data from both the input layers.

Symmetrical difference preserves features that fall within the extent covered by only one of the input layers. It is the opposite of intersect (the AND connector) in terms of output area extent.

Identity preserves only features that fall within the area extent of the layer defined as the input layer.
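
The area extents produced by the four methods can be illustrated with Python sets, treating each layer as the set of grid cells it covers. This is a simplification under stated assumptions: a real overlay also combines geometries and attributes, and the cell coordinates here are hypothetical.

```python
# Each layer is reduced to the set of grid cells it covers (hypothetical extents).
input_layer = {(0, 0), (0, 1), (1, 0), (1, 1)}
overlay_layer = {(1, 1), (1, 2), (2, 1), (2, 2)}

# AND: only the extent common to both layers.
intersect = input_layer & overlay_layer

# OR: the combined extent of both layers.
union = input_layer | overlay_layer

# XOR: the extent covered by exactly one of the layers.
symmetrical_difference = input_layer ^ overlay_layer

# Identity: the whole input extent, i.e. (input AND overlay) OR (input AND NOT overlay).
identity = (input_layer & overlay_layer) | (input_layer - overlay_layer)

print(sorted(intersect))
print(sorted(symmetrical_difference))
```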

The overlay operations combine features and attributes from input layers and the output is useful for query and modeling purposes.

Analysis of non spatial attribute data

Non-spatial data, or attribute data, is data that is independent of all location considerations.

The analysis module usually contains four important functions:
1. Selection is a simple operation, but it is important because all subsequent work is based on the results of the selection process.
2. Manipulation has to do with aggregation, buffering, overlaying and interpolation.
3. Exploration is the first step in discovering any kind of pattern or cluster in a data set. Explorative spatial data analysis (ESDA) uses the data in an inductive way to get new insight about spatial patterns and relations. Spatial statistics such as Moran's I and the G statistics are important tools in explorative spatial data analysis.
4. Confirmation is a set of tools for estimation of process models, simulation and forecasting.
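
As an illustration of the exploration step, global Moran's I can be computed from its standard formula, I = (n / S0) * sum(w_ij * d_i * d_j) / sum(d_i^2), where d_i is the deviation of value i from the mean and S0 is the sum of all weights. The sketch below assumes a simple binary neighbour matrix; the four-location example data are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights
    matrix, where weights[i][j] > 0 means locations i and j are neighbours."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    sum_sq = sum(d * d for d in dev)
    return (n / s0) * (cross / sum_sq)


# Four locations in a row; each is a neighbour of the one next to it.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 2, 3, 4], chain)    # similar values sit together
alternating = morans_i([1, 0, 1, 0], chain)  # neighbours always differ
print(clustered, alternating)
```

A positive result suggests that similar values cluster together, while a negative result suggests that neighbouring values tend to differ.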

Analysis in GIS differs from other statistical analysis because the attribute data are linked to maps for visual analysis. Before starting any analysis, the problem should be assessed and an objective should be established. The following steps outline the basic procedure for geographical analysis:

  1. Establish the objectives and criteria for the analysis. Define the problem and then identify a sequence of operations to produce meaningful results. 
  2. Prepare the data for spatial operations. Prepare all map coverages for the proposed data analysis. Add one or more attributes to coverages in the database if necessary. 
  3. Perform the spatial operations and combine the coverages, e.g. creating buffering zones around features, manipulating spatial features and performing polygon overlay. 
  4. Prepare the derived data for tabular analysis and ensure the feature attribute table contains all the items needed to hold the new values to be created. 
  5. Perform the tabular analysis. Calculate and query the relational database using the model defined in the first step.
  6. Evaluate and interpret the results. Examine the results and determine whether the answers are valid. Simple map displays and reports can help in this evaluation.
  7. Refine the analysis if needed and repeat the analysis.