Tuesday, September 29, 2015

Data structure in a GIS

Data Structure
-Information organization is concerned with the internal organization of data.
-It represents the user's view of data (conceptualizing of the real world)
-It is the lowest level of data abstraction, which can be done with or without any intent for computer implementation
-It is expressed in terms of data models as shown below:
 -The vector and raster methods of representing the real world are "data models"
-The relational, network, hierarchical and object-oriented databases are "database models" --- they are the software implementation of data models

-Data structure represents a higher level of data abstraction than information organization as it is concerned with the design and implementation of information organization.
-It represents the human implementation-oriented view of data
-It is expressed in terms of database models
(Data structure is software dependent).
-Data structure forms the basis for the next level of data abstraction in information system: file structure or file format
-File structure is the hardware implementation-oriented view of data
-It reflects the physical storage of the data on some specific computer media such as magnetic tapes or hard disk
-Hence, file structure is hardware-dependent

Descriptive data structures

Descriptive data structures describe the design and implementation of the information organization of non-spatial data.
-As most commercial implementations of information systems today are based on the relational and object-oriented database models, the data structures of these models are explained below:
- Relational data structure: the relational data structure is the table which is formally called a relation (shown below).
-a relation is a collection of tuples that correspond to the rows of the table
-the number of tuples in a relation is called the cardinality
-a tuple is made up of attributes that correspond to the columns of the table
-the number of attributes in a tuple is called the degree
-each relation has a unique identifier called the primary key
-the primary key is a column or combination of columns that at any given time has no identical values in any two rows - this means that the values of each row of the primary key are always unique
-this allows the use of the primary keys to relate data in different tables in data processing (shown below)
-when the primary key of one table appears as a column in another table in order to link the two, it is called a foreign key in that second table
-in order to enforce database integrity, relations are always normalized
-normalization is built on the concept of normal form
-a relation is said to be in a certain normal form if it satisfies a prescribed set of conditions (Date, 1995)
-at minimum, a relation in the relational database has to satisfy the conditions of the first, second and third normal forms
-First Normal Form (1NF) --- a relation is said to be in 1NF if and only if its tuples contain no repeating attributes (i.e. there must not be multiple values for a single entity which might theoretically result from multiple sampling at a particular location)
-Second Normal Form (2NF) --- a relation is said to be in 2NF if it satisfies the condition for 1NF and if every non-key attribute is irreducibly dependent on the primary key
-Third Normal Form (3NF) --- a relation is said to be in 3NF if it satisfies the condition for 2NF and the non-key attributes are mutually independent
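The relational terms above can be illustrated with a minimal Python sketch. The table contents, the column names (well_id, depth_m, aquifer) and the helper is_candidate_key are hypothetical and exist only for this example; a real DBMS enforces key uniqueness itself.

```python
# A relation modelled as a list of tuples (rows); all values are made up.
attributes = ("well_id", "depth_m", "aquifer")   # the columns of the table
relation = [
    (101, 35.0, "alluvial"),
    (102, 48.5, "basalt"),
    (103, 35.0, "alluvial"),
]

cardinality = len(relation)   # number of tuples (rows)
degree = len(attributes)      # number of attributes (columns)


def is_candidate_key(rows, column_indexes):
    """Return True if the chosen columns hold no identical values in any two rows."""
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in column_indexes)
        if key in seen:
            return False
        seen.add(key)
    return True


print(cardinality, degree)                  # 3 3
print(is_candidate_key(relation, [0]))      # True: well_id is unique in every row
print(is_candidate_key(relation, [1, 2]))   # False: rows 101 and 103 share depth and aquifer
```

Only a column (or combination of columns) that passes this uniqueness test can serve as a primary key.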
-Object-oriented data structure

Unlike the relational data structure, there is not a formalized object-oriented data structure; meaning that different object-orientation implementations have different data structures
However, object-oriented data structure can be explained in generic terms using the concepts of object identity, object structure and type constructors
The concept of object identity
-Each object in an object-oriented database is provided a unique system-generated object identifier (OID)
-The OID is for internal reference by the system and is therefore transparent to the user
-The OID is immutable, i.e. its value remains unchanged
-Even when a particular object is removed from the database, its OID will never be assigned to any new object
The concept of object structure
-The concept of object structure allows complex objects to be constructed from simple objects
-Each object is viewed as a triple (i, c, v) where
i = the object's unique identifier (OID) 
c = a constructor (which indicates how the object value is constructed) 
v = object value
Different object-oriented systems use different constructors, including: atom, tuple, set, list and array
An object value 'v' is interpreted on the basis of the value of the constructor c in the triple (i, c, v) that represents the object
-if c = atom, then v is an atomic value (i.e. it is an indivisible value)
-if c = tuple, then v is a tuple containing one or more attributes with their respective OIDs
-if c = set, then v is a set of object identifiers (OIDs) for a set of objects of the same type
-if c = list, then v is an ordered list of OIDs of the same type
-if c = array, then v is an array of OIDs of the same type
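As a rough illustration of the (i, c, v) triple, the sketch below builds a few objects in plain Python. The class DbObject, the store dictionary and the helpers new_object and resolve are assumptions made for this example; they do not reproduce any particular object-oriented DBMS.

```python
from dataclasses import dataclass
from itertools import count
from typing import Any

_next_oid = count(1)   # system-generated OIDs; a counter never hands out the same value twice


@dataclass(frozen=True)
class DbObject:
    oid: int           # i: the object's unique identifier
    constructor: str   # c: atom, tuple, set, list or array
    value: Any         # v: interpreted according to the constructor


store = {}             # hypothetical object store, keyed by OID


def new_object(constructor, value):
    """Create an object, register it in the store and return its OID."""
    obj = DbObject(next(_next_oid), constructor, value)
    store[obj.oid] = obj
    return obj.oid


def resolve(oid):
    """Follow OIDs until only atomic values remain."""
    obj = store[oid]
    if obj.constructor == "atom":
        return obj.value
    if obj.constructor == "tuple":
        return {attr: resolve(ref) for attr, ref in obj.value.items()}
    # set, list and array all hold OIDs of objects of the same type
    return [resolve(ref) for ref in obj.value]


name = new_object("atom", "Main Street")
lanes = new_object("atom", 2)
road = new_object("tuple", {"name": name, "lanes": lanes})
network = new_object("set", frozenset({road}))

print(resolve(network))   # [{'name': 'Main Street', 'lanes': 2}]
```

The complex object (the road network) is built entirely from references to simpler objects, which is the point of the object structure concept.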
The concept of type constructors
A type constructor is used by an object-oriented definition language (OODDL) to define the data structure for an object-oriented database schema as shown in the figure below. 
Graphical data structures

Raster data structure
In the raster data structure space is subdivided into regular grids of square grid cells or other forms of polygonal meshes known as picture elements as shown in the figure below.

Wednesday, September 23, 2015

Data formats in a GIS

The vector formats used in a GIS are listed below:

  1. Postscript is a page definition language that is usually used to export or print a map. It supports graphics in both vector and raster formats.
  2. Drawing Exchange Format (DXF) is an external format used for transferring files between computers or between software packages. It is produced by AutoCAD. It does not have topology, but offers good detail on drawings, line widths, styles, colour and text. DXF is typically constructed in 64 layers and each layer consists of different features allowing the user to separate features. The main disadvantage of this format is that it lacks topology and therefore offers little support for spatial analysis.
  3. Digital Line Graph (DLG). DLGs are distributed by the U.S. Geological Survey (USGS), with features held in separate files that most GIS packages will import. Extra data manipulation is often necessary. DLGs consist of line work with contour lines removed.
  4. TIGER (Topologically Integrated Geographic Encoding and Referencing system) was designed for use with the 1990 U.S. census. In this system, points, lines and areas can be explicitly addressed. It does not rely on adjacency information or graph theory. Real-world features such as meandering streams and irregular coastlines are given a graphic portrayal that is more representative of their true geographic shape. TIGER files are useful in non-census related research.
  5. Shapefile: A shapefile is a vector data format for storing the location, shape and attributes of geographic features. A shapefile is stored in a set of related files and contains one feature class.
  6. Scalable Vector Graphics (SVG): SVG is a vector image format written in XML. Any program that can render SVG, such as a web browser, can display the image. The scalable part of the term emphasizes that it is possible to zoom in on an image without loss of resolution. SVG files are often smaller and quicker to transfer than conventional image files such as PDF, GIF and JPEG.
  7. ArcInfo coverage: This is a data model for storing geographic features using ArcInfo software. A coverage stores a set of thematically associated data considered to be a unit. It usually represents a single layer such as soils, streams, roads or land use. In a coverage, features are stored as both primary features (points, arcs, polygons) and secondary features (tics, links, annotation). Feature attributes are described and stored independently in Feature Attribute Tables.
  8. ArcInfo Interchange file (e00) An ArcInfo interchange file is also known as export file. It is used to enable a coverage, grid or TIN and an associated INFO table to be transferred between different machines. ArcInfo Interchange files have a .e00 extension which increments to .e01, .e02 and so on if the interchange file is composed of several separate files.
  9. Geo-database A geo-database is an object oriented data model that represents geographic features and attributes as objects and the relationship between objects is hosted inside a relational database management system. A geo-database can store objects, such as feature classes, feature data sets, non-spatial tables and relationship classes.
The raster formats used in a GIS are listed below:
  1. Standard raster format: Many of the formats are based on photographic formats. The file structure has a header with a fixed length and a keyword to identify the format. In the header, the length of one record in bits and the number of rows and columns can be found. Frequently the header has a color table that specifies the colours to be projected.
  2. Tagged Image File Format (TIFF): This format is associated with scanners. It can use run length and other image compression schemes. It is not limited to 256 colours like a GIF.
  3. GeoTIFF: An extension of TIFF that stores georeferencing information, such as the latitude/longitude of the pixel edges, as part of the TIFF header.
  4. Graphic Interchange Format (GIF): This format for image files is widely used on the internet. It is well suited for images with sharp edges and relatively few gradations of colour.
  5. Joint Photographic Experts Group (JPEG): JPEG is a common picture format. It uses a variable resolution compression system offering both partial and full resolution recovery.
  6. Digital Elevation Models: DEMs come in two main types: 30-meter elevation data derived from 1:24,000 seven-and-a-half-minute quadrangle maps, and 3-arc-second digital terrain data derived from 1:250,000 maps. DEMs are produced by the National Mapping Division of the United States Geological Survey.
  7. Band Interleaved by Pixel (BIP) and Band Interleaved by Line (BIL): BIP and BIL systems are produced by remote sensing systems. The primary difference between them is the technique used to store brightness values captured simultaneously in each of several colours or spectral bands.
  8. RS Landsat: Landsat satellite imagery and BIL information are used in RS Landsat. In one format, using BIL, pixel values from each band are pulled out and combined. Programs that use this kind of information include IDRISI, GRASS and MapFactory. It is relatively easy to exchange information from within these raster formats.
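The difference between the BIP and BIL layouts mentioned in the list above is easiest to see on a tiny example. The sketch below is only an illustration with made-up brightness values; the function names to_bip and to_bil are hypothetical, and real files also carry header information that is ignored here.

```python
# A tiny 2-row x 3-column image with 2 spectral bands; values are made up.
band1 = [[1, 2, 3],
         [4, 5, 6]]
band2 = [[10, 20, 30],
         [40, 50, 60]]
bands = [band1, band2]
n_rows, n_cols = len(band1), len(band1[0])


def to_bip(bands):
    """Band interleaved by pixel: all band values of one pixel are stored together."""
    return [band[r][c]
            for r in range(n_rows)
            for c in range(n_cols)
            for band in bands]


def to_bil(bands):
    """Band interleaved by line: a full row of band 1, then the same row of band 2, and so on."""
    return [band[r][c]
            for r in range(n_rows)
            for band in bands
            for c in range(n_cols)]


print(to_bip(bands))   # [1, 10, 2, 20, 3, 30, 4, 40, 5, 50, 6, 60]
print(to_bil(bands))   # [1, 2, 3, 10, 20, 30, 4, 5, 6, 40, 50, 60]
```

Both sequences hold the same brightness values; only the storage order differs.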

Cost and Path analysis using GIS

A cost path analysis enables one to calculate the various costs associated with traveling from one location to another. Costs in this sense can have a number of variations such as:

  • actual monetary expenditure
  • time and effort required to complete the movement from one place to another
  • costs of construction and
  • environmental costs, such as damage to habitats and ecosystems.
In cost path analysis, these various types of cost may be combined into one comprehensive grid. The path of least cost from one point to another may then be determined on this "cost surface". Routes with high associated costs are less favorable than routes with lower associated costs.

Map layers or criteria which may be useful in creating a cost surface include:

  1. Elevation
  2. Land cover
  3. Land ownership (private, public, etc.)
  4. Slope
  5. Bodies of water
A cost surface, or cost grid, is a raster grid in which the value in each cell is the cost that a particular activity or object would incur in that cell. It can also be an indexed value based on costliness. Costs could be measured monetarily or in other ways such as amount of time. A cost surface can also record the cost of reaching certain cells from one or more source cells.
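One common way to find a least-cost path on such a grid is Dijkstra's shortest-path algorithm. The sketch below is a simplified illustration, not the method of any particular GIS package: it assumes movement to the four edge-sharing neighbours only, charges the cost of the cell being entered, and uses a made-up 3 x 3 cost surface. The function name least_cost_path is hypothetical.

```python
import heapq


def least_cost_path(cost, start, goal):
    """Dijkstra's algorithm on a grid; each move costs the value of the cell entered."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0}          # cheapest known accumulated cost per cell
    previous = {}              # back-links used to rebuild the route
    queue = [(0, start)]
    while queue:
        total, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if total > best[cell]:
            continue           # stale queue entry
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_total = total + cost[nr][nc]
                if new_total < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_total
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_total, (nr, nc)))
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[goal], path[::-1]


# Made-up cost surface: high values might stand for steep slopes or water bodies.
cost_surface = [
    [1, 1, 5],
    [9, 1, 5],
    [9, 1, 1],
]
total, path = least_cost_path(cost_surface, (0, 0), (2, 2))
print(total)   # 4
print(path)    # [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)]
```

The route threads through the cheap cells in the middle column rather than taking a more direct but costlier line.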

Tuesday, September 22, 2015

Basic data structures used in computers

GIS involves storing entity and attribute data in order to retrieve any combination of these objects. In order to perform these operations the data must be stored in the computer as a file structure. Each graphical entity must be stored explicitly along with its attributes. The following are the basic data structures commonly used in computers:

-Simple lists: the most basic file structure
Data is placed in the order in which it is entered
New data is placed at the end
Lack of structure makes searching very inefficient
This list is simple, unstructured and unordered

-Ordered sequential files:
The normal search strategy used here is a divide-and-conquer approach
With this strategy, finding a record among n ordered records takes at most about log2(n+1) search operations
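The divide-and-conquer search can be sketched as a binary search that counts how many records it examines. The key values below are made up, and binary_search is a hypothetical helper written for this illustration.

```python
import math


def binary_search(sorted_keys, target):
    """Return (index or None, number of records examined)."""
    low, high = 0, len(sorted_keys) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if sorted_keys[mid] == target:
            return mid, probes
        if sorted_keys[mid] < target:
            low = mid + 1      # discard the lower half
        else:
            high = mid - 1     # discard the upper half
    return None, probes


keys = list(range(0, 2046, 2))   # 1023 ordered keys: 0, 2, ..., 2044
worst = max(binary_search(keys, k)[1] for k in keys)
print(worst, math.log2(len(keys) + 1))   # 10 10.0
```

A simple unordered list of the same size could need up to 1023 comparisons to find one record.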

-Indexed files:
In this file structure, each entity will have several descriptive attributes assigned to it
An index is required to find a particular entity
Indexed files begin with a set of records and item keys
Indexed files can be indexed either as direct files or inverted files
Indexed files can be used to ignore all other records that do not satisfy the search criteria 
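An inverted file can be pictured as a lookup from each attribute value to the records that carry it. The following is a minimal sketch with made-up records and attribute names; it is not tied to any particular file format.

```python
from collections import defaultdict

# Hypothetical records: record id -> descriptive attributes
records = {
    1: {"land_use": "forest", "owner": "public"},
    2: {"land_use": "urban", "owner": "private"},
    3: {"land_use": "forest", "owner": "private"},
}

# Build the inverted index: (attribute, value) -> set of record ids
inverted = defaultdict(set)
for record_id, attrs in records.items():
    for field, value in attrs.items():
        inverted[(field, value)].add(record_id)

# Records that fail either criterion are never examined individually.
matches = inverted[("land_use", "forest")] & inverted[("owner", "private")]
print(sorted(matches))   # [3]
```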

DBMS used by GIS and relational database model

A DBMS or Database Management System is a set of computer programs for managing an integrated or shared database.

A DBMS provides tools for:
-Data input
-Manipulation and output

Arc/Info uses Info and ArcView uses dBase to handle local files
AutoCAD Map uses VISION to manage its database
MGE uses database management tools from Oracle/Informix & provides an interface to other DBMS

The relational database model
A database is a collection of interrelated tables in digital format.
The four types of database design are:
-Flat file
-Hierarchical
-Network and
-Relational

A flat file contains ALL data in a LARGE TABLE
Ex:- Extended feature attribute table or a spreadsheet

A hierarchical database organizes data at many levels and uses only ONE-TO-MANY association between levels

A network database builds connections across tables

Network database design and hierarchical database require that linkages between tables must be known in advance. This requirement makes the database complicated and inflexible

Compared to other database designs, a relational database is simple and flexible

Each table in the database can be prepared, maintained and edited separately from other tables

The tables can remain separate until a query or analysis requires attribute data from different tables to be linked together
Since the need for linking tables is temporary, a relational database is efficient for both data management and data processing.
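The idea of keeping tables separate and linking them only when a query needs it can be shown with Python's built-in sqlite3 module. The table and column names (parcel, soil, soil_code, drainage) and their contents are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.executescript("""
CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY, soil_code TEXT);
CREATE TABLE soil   (soil_code TEXT PRIMARY KEY, drainage TEXT);
INSERT INTO parcel VALUES (1, 'A1'), (2, 'B2'), (3, 'A1');
INSERT INTO soil   VALUES ('A1', 'good'), ('B2', 'poor');
""")

# The two tables are linked only for the duration of this query.
rows = conn.execute("""
SELECT parcel.parcel_id, soil.drainage
FROM parcel JOIN soil ON parcel.soil_code = soil.soil_code
ORDER BY parcel.parcel_id
""").fetchall()
conn.close()

print(rows)   # [(1, 'good'), (2, 'poor'), (3, 'good')]
```

Here soil_code in the parcel table plays the role of the key that relates the two tables; either table can be edited on its own without touching the other.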

Georelational data model Vs Object oriented data model

A georelational data model uses a split data system (spatial data and attribute data in different files) and the two are linked by feature ID. Each map feature has a unique label ID. Attribute data are stored in a “Feature Attribute Table”.

In a table:

The object oriented data model stores both spatial and attribute data in a single database to reduce processing overhead.


Manual digitizing is a time-consuming and tedious process. In order to digitize several complex maps, two expensive alternatives currently in use are:
-Scanning and
-Automatic line following

Scanning is the most common method used for automatic digitization. This method is mainly used when raster data are required since it is the automatic output format from most scanning software. A scanned image of a topographic map can be used as a background raster data set over which vector infrastructure data (such as pipelines or cables) are plotted. In such cases, the raster background image is very useful as a context for the data.
A scanner is a piece of hardware for converting an analog source document into digital raster format. All scanners work by sampling the source document using transmitted or reflected light. High-quality and large-format scanners require the source document to be placed on a rotating drum, and a sensor moves along the axis of rotation. The accuracy of scanned output data depends on:
-the quality of the scanner
-the quality of the image-processing software used to process the scanned data and
-the quality and complexity of the source document

The resolution of the scanner used affects the quality and quantity of output data.
High resolution scanners produce large volumes of data. This in turn places a heavy burden on:
-computing power and
-storage space
thereby increasing time spent editing the scanned data to make it usable by a GIS

Current scanning technology can produce data of a higher quality than manual digitizing. Hence, manual digitizers are becoming obsolete.

Another form of automatic digitizer is the automatic line follower. This method is used when digital versions of clear, distinctive lines on a map are required. This method is similar to manual digitizing with the exception that it uses a laser and light-sensitive device to follow lines on a map. Scanners produce raster output while the automatic line follower produces a vector output as (x,y) coordinate strings. The data produced by this method is suitable for vector GIS. Automatic line followers are not as common as scanners due to their complexity. Automatic line followers face difficulties when digitizing features such as dashed or contour lines.


The most common method of encoding spatial features from paper maps is manual digitizing. It is useful when a selection of features are required from paper maps. Manual digitizing is also used for map encoding where it is important to reflect the topology of features, since information about the direction of line features can be included. It is also used for digitizing features of interest from hard-copy aerial photographs.
Manual digitizing requires a table digitizer that is linked to a computer workstation. The table digitizer is a large flat tablet, whose surface is laid over a very fine mesh of wires. Attached to the digitizer via a  cable is a cursor that can be moved freely over the surface of the table. Buttons on the cursor allow the user to send instructions to the computer. The position of the cursor on the table is registered by reference to its position above the wire mesh.
Most manual digitizers are used either in the point mode or stream mode. In the point mode, each line segment is digitized beginning with a start node. Each change in direction of the line is recorded with a digitized point and the line segment ends with an end node. Thus, complex lines require a greater number of points between start and end nodes. In this context, smooth curves are difficult to digitize as they require an infinite number of points to record their true shape. The minimum resolution is governed by the distance between the wires in the digitizing table (typically 0.1mm). Some digitizing packages allow the user to record smooth curves as mathematically defined splines or Bézier curves.
In the stream mode, the digitizer records points according to a stated time interval or on a distance basis. An end node is required to stop the digitizer recording further points. When points are recorded on a time basis, the speed at which the cursor is moved along the line determines the number of points recorded. The choice between point mode and stream mode is a matter of personal preference.
Stream mode digitizing requires more skill than point mode digitizing. Stream mode usually generates more points and hence larger files than point mode.


Keyboard entry is often referred to as keycoding. It involves data entry into a file at a computer terminal. This technique is mainly used for entering attribute data that are only available on paper. For a small amount of data, keyboard entry is a manageable task, although typographical errors are possible. As data volumes increase, alternatives in the form of text scanners and optical character recognition (OCR) software are increasingly being used to read in data automatically. Attribute data, in a digital format, are linked to relevant map features in the spatial database using identification codes. These codes are allocated to each point, line or area feature in the dataset.
The coordinates of spatial entities can be entered by keyboard entry. However, this method can be used only when the coordinates are known and there are not too many to be entered.

Friday, September 18, 2015

Data types used in GIS

Data types: GIS data is of two types.
i) Attribute data and
ii) Spatial data
Attribute data refers to the properties of spatial entities. It is also called aspatial data or non-spatial data or tabular data. Attribute data is defined as data that describe the characteristics of spatial features.
 Spatial data refers to location and is an element of reality. Spatial data is defined as data that describe the geometry of spatial features.

Data structure: The basic data structures for GIS are:
i) Vector
ii) Raster
iii) Triangulated Irregular Network (TIN) and
iv) Tabular information (attribute table)

Vector data structure consists of points, lines and polygons.
Points are a pair of (x,y) coordinates to specify a single location
Lines are defined by a set of coordinate pairs.
Polygons are sets of coordinate pairs that define the boundary of an enclosed area.
In vector data layers, the feature layer is linked to an attribute table. An individual feature is linked to a row (record) in the attribute table.
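A minimal sketch of how vector features and their attribute rows can be tied together by a feature ID is given below. The coordinates, names and the two helper functions are made up for illustration; real GIS formats store considerably more (projection, topology and so on).

```python
import math

# Feature ID -> (geometry type, list of (x, y) coordinate pairs); values are made up.
features = {
    1: ("point", [(2.0, 3.0)]),
    2: ("line", [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]),
    3: ("polygon", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]),
}

# Attribute table: one row (record) per feature, linked by the same ID.
attribute_table = {
    1: {"name": "well"},
    2: {"name": "pipeline"},
    3: {"name": "parcel"},
}


def line_length(coords):
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(coords):
    """Shoelace formula; the boundary is closed back to the first vertex."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(coords, coords[1:] + coords[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(attribute_table[2]["name"], line_length(features[2][1]))    # pipeline 11.0
print(attribute_table[3]["name"], polygon_area(features[3][1]))   # parcel 12.0
```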

In a Raster data structure, the world is represented by an array of gridded cells. In this type of data structure, a point is described by one cell, a line or a polygon by a zone of cells. A raster grid can store values that represent categories. A grid attribute table has a value and a count field. The value field has a number representing information regarding the grid cell. The count field shows how many grid cells have the same value.
Grid cells can also store continuous values like elevation.
The main sources of raster data are digital photographs and satellite imagery.
An example of raster analysis is neighbour cell analysis.
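The value and count fields of a grid attribute table can be derived directly from the cell values. The sketch below uses a made-up 3 x 3 categorical grid in which the codes 1, 2 and 3 stand for arbitrary categories.

```python
from collections import Counter

# Made-up categorical raster; each number is a category code.
grid = [
    [1, 1, 2],
    [1, 1, 2],
    [3, 3, 2],
]

counts = Counter(cell for row in grid for cell in row)
value_count_table = sorted(counts.items())   # (value, count) pairs
print(value_count_table)   # [(1, 4), (2, 3), (3, 2)]
```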

Comparison of Vector and Raster data structures
Vector data structures are used for:
i. Features with discrete shapes and boundaries
ii. Database management, Database query and reporting
iii. Network analysis and
iv. Generating high quality maps
Raster data structures are used for :
i. Continuous surfaces that change gradually over space (Eg: Soil, Land cover, Vegetation, Pollution)
ii. Spatial analysis and modeling (Eg: Agricultural suitability)

Triangulated Irregular Network (TIN): a three-dimensional data structure for representing surfaces.
Examples of vector elevation data are:
i. Contour lines and
ii. Spot height points
Vector elevation data is good for visualization but cannot be used for analysis unless it is converted into a TIN.
TIN data structure is used for modeling small areas with high precision elevation data. It can use multiple data inputs. A TIN data structure provides more efficient storage than DEM or contour lines. It can be used to model:
i. Roads and road-cuts
ii. Dam construction and
iii. Urban flood modeling
A TIN is very effective in case of availability of high precision data.
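Within a single TIN triangle, the elevation of any point is usually taken to vary linearly between the three corner elevations. The sketch below shows that interpolation using barycentric weights; the triangle coordinates are made up and interpolate_in_triangle is a hypothetical helper. Finding which triangle contains the point is assumed to have been done already.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linearly interpolate z at point p = (x, y) inside triangle a, b, c (each (x, y, z))."""
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    # Barycentric weights: how strongly each corner influences the point.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        raise ValueError("point lies outside the triangle")
    return w1 * z1 + w2 * z2 + w3 * z3


# Made-up triangle: corner elevations of 100 m, 110 m and 120 m.
triangle = ((0.0, 0.0, 100.0), (10.0, 0.0, 110.0), (0.0, 10.0, 120.0))
print(interpolate_in_triangle((2.0, 2.0), *triangle))   # about 106.0
```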

Attribute Table is a flat file with columns and rows.
A row consists of ALL the attribute values pertaining to one geographic feature
A column consists of an item of information about a feature
The common attribute field types are:
i. Numeric (integers or decimals)
ii. Text (strings)
iii. Date and
iv. Binary Large OBject (BLOB)

Sunday, September 13, 2015

Map projections and spatial referencing

Map projections transfer the spherical earth onto a two-dimensional surface. This process introduces errors. Depending on the projection method chosen, there might be a distortion in directions while distances are displayed accurately. Sometimes, shape may be preserved at the expense of accurate area estimates. Different map projections have been prepared for different parts of the world for mapping different sized areas and for different applications. There are hundreds of different map projections. The results of analysis will be affected in different ways by different map projections.

Depending on the development surface, map projections can be classified into:
Conic projection, Cylindrical projection and Planar or Azimuthal projection families as shown below.

One of the functions of a GIS is to allow the integration of data from different sources, so the ability to ALTER projections is fundamental to many GIS. There are hundreds of different map projections and some GIS provide the capability to reproject data for most of these.

A referencing system is used to locate a feature on the Earth's surface (3 dimensional) or a map of the Earth (2 dimensional). Several methods of spatial referencing exist and they can be grouped into three categories listed below:
-Geographic Coordinate System (GCS)
-Rectangular Coordinate System (RCS) and
-Non-Coordinate System (NCS)


Lines of longitude are also called MERIDIANS
The arbitrary choice for a central line of longitude is that which runs through the Royal Observatory in Greenwich, England. It is also called PRIME MERIDIAN.
The meridians are widest apart at the Equator and closest together at the Poles
The relative distance between lines of longitude where they intersect lines of latitude is ALWAYS EQUAL.
Lines of latitude lie at right angles to lines of longitude and run parallel to one another
The latitude with the greatest circumference is called the EQUATOR (Central parallel) and it is equidistant from the poles.
At the poles, the lines of latitude are represented by a single point - the pole.
Using the lines of longitude and lines of latitude, ANY POINT ON THE EARTH'S SURFACE can be located in degrees and minutes.

All features on the Earth can be located relative to one another and the distance between them can be calculated.

The shortest distance between TWO points on the Earth's surface is known as the GREAT CIRCLE DISTANCE.
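On a spherical Earth, the great circle distance between two latitude/longitude positions can be computed with the haversine formula. The sketch below assumes a mean Earth radius of 6371 km and treats the Earth as a perfect sphere; the function name great_circle_km is hypothetical.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


# A quarter of the way around the equator: from (0, 0) to (0, 90 E).
print(great_circle_km(0, 0, 0, 90))   # about 10007.5 km, one quarter of the circumference
```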

Examples of RCS are:
-UK Ordnance Survey's National Grid and
-Universal Transverse Mercator (UTM) plane grid system

Non-Coordinate Systems provide spatial references using a descriptive code rather than a coordinate. Examples are postal codes used throughout the world. While some of the codes are fully numeric (U.S. post codes, Indian PIN codes), others are alphanumeric (U.K. post codes).  The advantage of this system is that it is used by the postal department and hence continuously maintained and updated. In the wester

A Geographic Coordinate System (GCS) defines locations on the Earth using a three dimensional spherical surface. A GCS includes:
-An angular unit of measure
-A prime meridian and
-A datum (based on a spheroid)

The longitudes and latitudes encompass the globe forming a gridded network called a GRATICULE

The equator defines the line of ZERO latitude. The prime meridian defines the line of ZERO longitude. The point where the zero latitude and zero longitude intersect defines the origin (0,0)

Circles with the same radius as the spherical Earth are called GREAT CIRCLES. ALL MERIDIANS AND THE EQUATOR ARE GREAT CIRCLES.

Although the Earth is assumed to be roughly spherical for most practical considerations, with all its irregularities due to the various natural landforms, it is now a well accepted fact that the shape of the Earth is an OBLATE SPHEROID.

The Earth is sometimes assumed to be SPHERICAL to make mathematical calculations easier. For small-scale maps (smaller than 1:5,000,000) the difference between a sphere and a spheroid is not detectable. However, for larger-scale maps (1:1,000,000 or larger) a spheroid is necessary to accurately represent the shape of the Earth.

A spheroid is defined by either:
-semi-major axis 'a' and semi-minor axis 'b' or
-semi-major axis 'a' and flattening 'f'

flattening (f) = (a-b)/a
Flattening is a small value; hence 1/f is generally used.

Flattening ranges between 0 and 1. Flattening of the Earth is approximately 0.003353
The square of the eccentricity, like flattening, describes the shape of a spheroid:

e^2 = (a^2-b^2)/a^2
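As a worked example of the two formulas, the sketch below plugs in the semi-major and semi-minor axes of the WGS 84 spheroid (an assumption for this illustration; the post does not name a specific spheroid).

```python
# WGS 84 spheroid axes in metres (assumed here for illustration).
a = 6378137.0        # semi-major axis
b = 6356752.314245   # semi-minor axis

f = (a - b) / a                  # flattening
inverse_f = 1.0 / f              # the form in which flattening is usually quoted
e2 = (a**2 - b**2) / a**2        # square of the eccentricity

print(round(f, 6))           # 0.003353
print(round(inverse_f, 3))   # 298.257
print(round(e2, 6))          # 0.006694
```

The two shape measures are related by e^2 = 2f - f^2, which follows from substituting b = a(1-f) into the eccentricity formula.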

Birth and History of GIS

The primary force behind the development of GIS was thematic cartography: the visualization and generation of application-specific maps. Traditional cartographic processes were human-intensive, inflexible and error-prone. With the arrival of new technologies such as GPS surveying, aerial photography and photogrammetric interpretation of remotely sensed images, the manual cartographic process was replaced by a digital cartographic process, reducing both the dependence on human skill and the errors in the process. The basic concepts of digital cartography can be traced to the work done by the Land Inventory Branch of the Canadian Government and the Harvard Laboratory for Computer Graphics and Spatial Analysis. Commercial GIS such as ARC/INFO and the GeoMedia suite of solutions, considered to be first-generation GIS, were launched by Environmental Systems Research Institute (ESRI) and Intergraph respectively. Other popular GIS software packages are Geographic Resources Analysis Support System (GRASS), Quantum Geographic Information System (QGIS), System for Automated Geoscientific Analyses (SAGA) and many more that are freely available as open-source software.

GIS helps answer questions about:
-Trends and

Before the advent of GIS, the large volumes of spatial data, coupled with time constraints and limited techniques, made this analysis very difficult or impossible. GIS techniques help in quick remodelling for modifications in the geographic context. The end product of a GIS is a map used for public appraisal and feedback. One of the most convenient aspects of a GIS is that it permits repetition of the analysis of a problem for a modified set of conditions on the same base map to evaluate options in a detailed and scientific manner.

GIS can also be used as a tool for policy formulation.
GIS can be used for environmental management problems and is currently being used by environmental managers all over the world.

GIS is a computer system covering three main components:
-Software module and Appropriate procedures
- Proper organizational context (Trained personnel)

GIS uses spatially referenced or geographical data. It performs management and analysis tasks on these data. Output of GIS is mainly in the form of maps, graphs, text and summary statistics. GIS is used to add value to spatial data.

GIS software has been designed to handle spatial data. Spatial data are characterised by
-information about position
-connection with other features and
-details about non spatial characteristics

The method of representing geographic space occupied by spatial data as a series of thematic layers is known as LAYER-BASED APPROACH.

Features in the real world are expressed in the computer as discrete objects. This method is known as OBJECT-ORIENTED APPROACH.

All geographic features are classified into THREE TYPES for entry into a GIS. They are:
-Points
-Lines and
-Areas

Data input and updating are the most important, expensive, complex and time-consuming part of a GIS project.

The ability of GIS to transform spatial data and perform spatial analysis distinguishes it from other types of information systems.

Aronoff (1989) classified GIS analysis procedures into three types:
-Storage and retrieval
-Constrained queries and
-Modelling procedures (what-if scenarios)

Data layers in a GIS contain data of ONLY ONE ENTITY TYPE. (Point or line or area data)

GIS has become an accepted tool for the management and analysis of spatial data.

All GIS are computer representations of some aspect of the real world.

The simplified view of the real world adopted by GIS is often termed as a model.

Data is simply a collection of alphanumeric characters with no meaning. Data to which a context is added becomes meaningful. Hence, Information is data to which meaning and context are added (Hanold, 1972). Data can be PRIMARY or SECONDARY. This data has three dimensions:
-Temporal (WHEN the data was collected)
-Thematic (describing the CHARACTER) and
-Spatial (WHERE the object or phenomena occurs)

In GIS, the thematic data are generally referred to as NON-SPATIAL or ATTRIBUTE DATA

The distinction between thematic and topographic maps is that thematic maps display data related to a particular theme such as soil, geology, geomorphology, land use, population, natural resources, etc., while topographic maps contain a diverse set of data on different themes. For example, land use, relief and cultural features may appear together on the same topographic map. Unwin (1981) argues that "a topographic map is simply a composite of different kinds of maps".

Topology is simply the geometric relationships of objects.
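
A minimal sketch of three such relationships, using the terms adjacency, containment and connectivity that appear in the objective questions further down. The function names, the shared-edge rule for adjacency and the sample coordinates are assumptions made for this example. Points lying exactly on a boundary are not handled specially.

```python
def edges(ring):
    """Undirected edges of a closed ring given as a list of vertices."""
    return {frozenset((ring[i], ring[(i + 1) % len(ring)])) for i in range(len(ring))}

def adjacent(ring_a, ring_b):
    """Two areas are adjacent here if they share at least one whole edge."""
    return bool(edges(ring_a) & edges(ring_b))

def contains(ring, point):
    """Ray-casting test: count boundary crossings to the right of the point."""
    x, y = point
    inside = False
    for i in range(len(ring)):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % len(ring)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def connected(line_a, line_b):
    """Two lines connect here if they share an end point."""
    return bool({line_a[0], line_a[-1]} & {line_b[0], line_b[-1]})

field_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
field_b = [(2, 0), (4, 0), (4, 2), (2, 2)]
road_1 = [(0, 0), (1, 1), (2, 2)]
road_2 = [(2, 2), (4, 2)]

print(adjacent(field_a, field_b))       # the fields share the edge x = 2
print(contains(field_a, (1, 1)), contains(field_a, (3, 1)))
print(connected(road_1, road_2))        # both roads meet at (2, 2)
```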

The method chosen to represent a spatial feature depends on the scale of the map. The relationship between scale and detail is called SCALE RELATED GENERALIZATION.
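
A rough sketch of this idea: the same feature is drawn as an area when it is large enough on the map and as a point when it is not. The function name and the 1 mm threshold are hypothetical choices for illustration, not a cartographic standard.

```python
def representation(ground_width_m, scale_denominator, min_map_mm=1.0):
    """Pick a point or an area symbol from the feature's size on the map.

    min_map_mm is a hypothetical threshold, not a cartographic standard.
    """
    map_width_mm = ground_width_m * 1000.0 / scale_denominator
    return "area" if map_width_mm >= min_map_mm else "point"

city_width_m = 10_000
print(representation(city_width_m, 50_000))      # 200 mm wide on the map
print(representation(city_width_m, 25_000_000))  # 0.4 mm wide on the map
```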

GIS - Basics

Geographic Information System is an interdisciplinary field of knowledge, comprising geography, digital cartography, computer science, mathematics, image processing, pattern recognition, digital photogrammetry and remote sensing.

GIS can be defined as a system of systems for collecting, processing and analysing spatio-temporal information about the Earth's features. It involves the people preparing the data, the system (input and output devices, computing platforms and networking) and the users of the system.

In Operational Information Systems (urban planning, agriculture and land management systems, etc.), spatio-temporal data collected over a period of time must be processed. These systems DO NOT exploit the TIME CRITICALITY associated with spatial data. Processing of spatio-temporal data is done by GIS.

GIS emerged primarily from the needs created by the use of digital maps in civil and military applications. It has left a positive impression on varied fields such as operation planning, situation representation and terrain feature measurement. However, the primary force that led to the development of GIS as an information system was thematic cartography (composition of maps based on a particular theme, collaborative visualization of operation information and application-specific map generation).

A Geographic Information System (GIS) is a set of digital ideas and concepts designed to provide answers to questions based on mapped data. It helps in analysing mapped data to understand distributions and explain these patterns. It can be used to make predictions of future distributions and effects.

GIS is a modern enabling technology. Its applications are used in a wide range of disciplines.

Geographic Information Systems (GIS) are tools that process spatial data into information. This information can be used to make decisions about some portion of the Earth.

GIS is difficult to define as it involves integration of several subjects. There is no absolutely agreed upon definition of a GIS. The definition of a GIS depends on the person or organisation using it.

David Rhind defines GIS as a “computer system for collecting, checking, integrating and analysing information related to the surface of the Earth.”

GIS can be thought of as a series of sub-systems within a larger system. According to this definition, GIS has the following subsystems:

  1. A data input subsystem that collects and preprocesses spatial data from various sources.
  2. A data storage and retrieval subsystem that organises spatial data in a manner that allows retrieval, updating and editing.
  3. A data manipulation and analysis subsystem that performs tasks on the data, aggregates and disaggregates, estimates parameters and constraints, and performs modelling functions.
  4. A reporting subsystem that displays all or part of the database in tabular, graphic or map form.
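
The four subsystems can be sketched as a small pipeline. Every function, class and field name below is hypothetical, and the sample rows are invented; the point is only to show how data flow from input to report.

```python
def input_subsystem(raw_rows):
    """Collect and preprocess: drop rows without coordinates."""
    return [r for r in raw_rows if r.get("x") is not None and r.get("y") is not None]

class Storage:
    """Organise records so they can be retrieved, updated and edited."""
    def __init__(self):
        self._records = {}

    def store(self, rows):
        for r in rows:
            self._records[r["id"]] = r

    def update(self, record_id, **changes):
        self._records[record_id].update(changes)

    def retrieve(self, predicate=lambda r: True):
        return [r for r in self._records.values() if predicate(r)]

def analysis_subsystem(rows, key):
    """Aggregate: count records per value of `key`."""
    counts = {}
    for r in rows:
        counts[r[key]] = counts.get(r[key], 0) + 1
    return counts

def reporting_subsystem(counts):
    """Report in tabular form, one line per class."""
    return "\n".join(f"{k}\t{v}" for k, v in sorted(counts.items()))

raw = [
    {"id": 1, "x": 10, "y": 20, "soil": "clay"},
    {"id": 2, "x": None, "y": 5, "soil": "sand"},   # rejected at input
    {"id": 3, "x": 7, "y": 9, "soil": "clay"},
]
db = Storage()
db.store(input_subsystem(raw))
db.update(3, soil="loam")
report = reporting_subsystem(analysis_subsystem(db.retrieve(), "soil"))
print(report)
```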

Map is the fundamental tool for analysis of spatially related data.
Analysis subsystem is the heart of the GIS.
In GIS, all data are explicitly SPATIAL.
In GIS, every feature on the Earth is considered to be either a 'POINT', 'LINE' or 'AREA'.
POINTS are said to have 'zero dimensionality'.
LINE objects are “one dimensional”.
Objects that occupy both length and width are called “AREAS”.
AREAS are a series of lines that start and end at the same location.
Adding the dimension of height to areas explains the existence of SURFACES.
Point features are discrete while surface features are continuous.
The non-spatial (aspatial) information that helps in the description of features in space is called the feature's ATTRIBUTES.
The measurement framework for geographic data is called “GEOGRAPHIC DATA MEASUREMENT”. One level of measurement is the “NOMINAL SCALE” (named). Another level of measurement is the “ORDINAL SCALE” (ranked). Comparisons of how much one value differs from another can be made ONLY from the “INTERVAL LEVEL OF DATA MEASUREMENT” upward.

The most useful level of data measurement is “RATIO”. The ONLY level of data measurement that allows a direct comparison between two spatial variables is “RATIO”.
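
The four levels can be summarised by the operations each one supports. The sketch below follows the usual textbook reading of nominal, ordinal, interval and ratio data, which is an assumption here since the notes do not spell the operations out; the operation labels are invented for the example.

```python
# Operations that become meaningful at each level; every level also
# inherits the operations of the levels before it.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]
NEW_OPERATIONS = {
    "nominal": {"equal"},            # same class or not
    "ordinal": {"rank"},             # greater than / less than
    "interval": {"difference"},      # how much more
    "ratio": {"ratio"},              # how many times more
}

def allowed_operations(level):
    """Return every operation valid at `level` (cumulative)."""
    ops = set()
    for name in LEVELS[: LEVELS.index(level) + 1]:
        ops |= NEW_OPERATIONS[name]
    return ops

print(sorted(allowed_operations("ordinal")))
print("ratio" in allowed_operations("interval"))
```

For example, temperature in degrees Celsius is interval data: a 10-degree difference is meaningful, but 20 degrees is not "twice as hot" as 10 degrees.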

A primary purpose of GIS is to analyse the relationship of objects in space.
The important terms related to spatial patterns are: “PROXIMITY”, “REGULAR PATTERN”, “RANDOM PATTERN” and “DISPERSED PATTERN”.
The powerful capabilities of modern GIS include the ability to ILLUSTRATE, QUANTIFY and DESCRIBE spatial associations and thus allow examination of mechanisms that cause these associations.
GIS helps identify spatial patterns and explains pattern interactions.
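
One common way to quantify a point pattern is the nearest-neighbour index, which is brought in here as an illustration and is not taken from the notes. It divides the observed mean distance to the nearest neighbour by the distance expected for a random pattern of the same density. Values near 1 suggest a random pattern, values well above 1 a regular, evenly spaced one, and values well below 1 points bunched together. Edge effects are ignored, and the sample points and study area are invented.

```python
import math

def nearest_neighbour_index(points, area):
    """Ratio of observed to expected mean nearest-neighbour distance.

    The expected distance assumes a random pattern of the same density.
    Edge effects are ignored in this sketch.
    """
    n = len(points)
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected

grid = [(x + 0.5, y + 0.5) for x in range(5) for y in range(5)]   # evenly spaced
clump = [(2.0 + 0.01 * i, 2.0) for i in range(25)]                # tightly bunched

r_grid = nearest_neighbour_index(grid, 25.0)
r_clump = nearest_neighbour_index(clump, 25.0)
print(r_grid, r_clump)
```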

GIS - objective questions

  • A map is a/an MODEL/ABSTRACTION of the real world
  • A map legend links ATTRIBUTES to geographic information
  • SPATIAL DATA is always graphical
  • Attributes are also called ASPATIAL data or NON-SPATIAL data
  • EASTINGS refer to X-coordinate
  • NORTHINGS refer to Y-coordinate
  • POINT is a single location in space
  • The manner in which ENTITIES or geographical features are represented on a map is DEPENDENT ON SCALE
  • LABELS distinguish geographic features of the same type
  • Every feature on the Earth is represented on a map using a SYMBOL
  • A small scale map covers a LARGE area
  • A DATABASE is a repository capable of storing large amounts of data
  • CARTOGRAPHY is the art and science of making maps
  • As scale decreases, size of the features may decrease causing:
  • AZIMUTHAL projections are also called planar projections
  • All geographic features are in TWO tangible forms
  • In any projection, the parallel on the development surface that maintains true scale is called STANDARD PARALLEL
  • CONFORMAL projections are used for NAVIGATION purposes requiring accurate directions
  • Conformal projections are also called ORTHOMORPHIC projections
  • DATUM is a set of reference values that serve as a reference for mapping
  • GIS is primarily a DECISION SUPPORT SOFTWARE within a SPATIAL framework
  • A topographic map is a COMPOSITE of different kinds of maps
  • In a projection, distortion is least when the map shows SMALL areas and maximum when the map shows LARGE areas
  • Representative fraction for reference globe is called PRINCIPAL SCALE
  • Principal scale is the ratio of RADIUS OF GLOBE to RADIUS OF EARTH
  • Scale factor is the ratio of ACTUAL SCALE to PRINCIPAL SCALE
  • The two tangible forms in which all geographic features exist are DISCRETE and CONTINUOUS
  • Different kinds of information that are stored in various ways is called a THEME
  • Difference in elevation of an area between tops of hills and bottoms of valleys is known as RELIEF of the terrain
  • A point of known elevation and position is indicated on a map by the letters B.M. meaning BENCH MARK with altitude given to the nearest foot
  • CONTOURS represent the third dimension on a map
  • The difference in elevation represented by adjacent contour lines is termed as CONTOUR INTERVAL
  • The art and science of map making is called CARTOGRAPHY
  • Lines of latitude are called PARALLELS and lines of longitude are called MERIDIANS
  • A cylindrical projection yields a RECTANGULAR graticule whereas a conic projection yields a FAN shaped graticule
  • CONIC projections are suited to map areas having EAST-WEST extents
  • TRANSVERSE MERCATOR projection is suited to map areas with NORTH-SOUTH extents
  • The three important components of GIS are:
  • Three popular GIS software application packages along with their vendors are:
        • MGE/MGA system - INTERGRAPH CORPORATION
        • ARC/INFO - ESRI
        • SPANS - TYDAC
  • GIS uses a database approach because of:
  • Output from a GIS is normally:
        • MAP
        • TABULAR DATA or
        • REPORTS
  • POINT is the basic building block from which all spatial entities are constructed
  • A closed ring of coordinate pairs defines the boundary of a POLYGON/AREA
  • TOPOLOGY is concerned with connectivity between entities
  • The three elements of topology are:
        • ADJACENCY
        • CONTAINMENT and
        • CONNECTIVITY
  • The four prominent applications of remote sensing and GIS in agriculture are:
        • CROP HEALTH
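
The principal scale and scale factor items in the list above can be worked through numerically. This sketch takes the principal scale as a representative fraction, that is, globe radius divided by Earth radius, and uses a rounded mean Earth radius. The globe size and the local scale are hypothetical values chosen for the example.

```python
EARTH_RADIUS_M = 6_371_000.0          # mean Earth radius, rounded

def principal_scale(globe_radius_m):
    """Representative fraction of the reference globe."""
    return globe_radius_m / EARTH_RADIUS_M

def scale_factor(actual_scale, principal):
    """Actual scale at a point divided by the principal scale."""
    return actual_scale / principal

ps = principal_scale(0.6371)          # hypothetical globe of radius 63.71 cm
sf = scale_factor(1 / 9_500_000, ps)  # hypothetical local scale of 1:9,500,000
print(round(1 / ps), round(sf, 4))
```

A scale factor of 1 means the projection keeps true scale at that location, as along a standard parallel; values above or below 1 indicate stretching or compression.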

Thursday, September 3, 2015

GIS-Unit 4-Syllabus-OU

GIS Analysis Functions for Integrated Analysis of Spatial and Attribute Data: Retrieval and classification functions: Overlay operations, neighbourhood operations, connectivity functions, output formatting, map annotations text patterns and line styles, graphic symbols, cartographic modeling by GIS analysis procedure with an example.

Presentation of geo-data and analysis: Types of output data - Types of errors elimination and accuracies-sampling-components of data quality