Workshop on Population
Data Analysis, Storage and Dissemination Technologies
Bangkok, 27-30 March
2001
STAT/WDT/1
21 March 2001
ENGLISH ONLY
ECONOMIC AND SOCIAL COMMISSION FOR ASIA AND THE
PACIFIC
An Alternative Approach
for Presenting Small Area Statistics: A Grid Square
Database*/
(Items 4 and 5 of the provisional agenda)
*/
This paper, prepared by Mr Toshio Shigematsu,
Resource Person, has been reproduced as submitted.
It has been issued without formal editing.
1. Introduction
Cartographic presentation of small area statistics
is a powerful means for visualizing the census
results. For this purpose, most often, statistics
of enumeration districts (ED) or census blocks
are presented using the boundaries of EDs or
census blocks. However, this requires digitizing
the ED boundaries for the whole country, which
is very costly and time-consuming work.
It would be fair to say that most developing
countries cannot afford to undertake this, even
with the rapidly falling cost of acquiring
digital data through such devices
as GPS, digitizers and scanners.
This paper suggests that countries which
do not have adequate resources for
digitizing their ED boundaries consider an
alternative method for presenting small
area statistics: a grid square database
approach. The method is neither new nor
sophisticated, but it is less costly. The approach
involves (a) dividing a whole territory or a region
into grid squares, each grid square being a
rectangle of equal size, from 100 meters (U.K.)
to 1 kilometer (Japan and the Republic of Korea)
on a side, and (b) allocating the population data
of EDs to the grid squares. The paper explains how to
do this allocation. The paper also examines
the advantages and disadvantages of this approach.
2. Costs of Creating and Maintaining GIS
Maps make it easier to present, analyze and
disseminate census results. They facilitate
the identification of local patterns of important
demographic and social indicators. Maps are
thus an integral part of policy analysis in
the public and private sectors.
However, there have been many warnings and
reservations about creating a GIS because of the costs.
The UN handbook on geographic
information systems1/
states that the initial investment and the
subsequent maintenance costs of a GIS should
not be underestimated. Like any new technology
or organizational transformation, the introduction
of GIS involves a change in routine and significant
expense, not only for software and hardware
but also for data purchase, training, planning
and organizational restructuring. In
fact, the significant costs involved are the
main reason why the section on GIS in the
revised Principles and Recommendations
for Population and Housing Censuses (United
Nations, 1998) is worded very carefully.
The indirect costs, in particular, are often
underestimated and may lead to the failure
of a GIS project.
The Draft Guideline for
Dissemination of this Workshop2/
warned that "less advanced national statistical
offices should not contemplate the introduction
of a GIS facility before all other dissemination
options have been fully investigated. The
cost of implementing a GIS facility would
outweigh any benefits unless a sophisticated
user group exists within the country that
could contribute towards the development costs
associated with such a venture."
Although a digitized map
of EDs is at the heart of constructing a GIS,
digitizing the boundaries of census enumeration
districts will be a very costly undertaking, regardless
of the approach chosen: direct data
collection with GPS, or conversion
of hard-copy maps or aerial photographs through
digitizing or scanning.
1/ UN Handbook on geographic information systems
and digital mapping, p.6.
2/ UN ESCAP, Workshop on Application of New Information
Technology to Population Data, Bangkok, 12-20
October 1999: Draft Guideline for Dissemination:
Section 3.6. Innovative and economical approaches
for less advanced NSOs.
3. Grid square database
Grid square databases
have been created in countries such as the U.K.,
Japan and the Republic of Korea for the
purpose of presenting small area data.
Data from the grid square database are widely
utilized, particularly for urban and rural
planning and administration in national and
local governments, for urban and regional
studies in research institutes and for marketing
in commercial fields.
To create
a database on a grid square basis, the whole
territory of a nation or a region is subdivided
into smaller areas of the same size and
the same rectangular shape. The size of the
grid square varies among countries.
The U.K. employs a size of 100 meters by 100 meters,
while Japan and Korea delineate the standard
grid square with a size of about 1000 meters
by 1000 meters.
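
As a minimal sketch of this subdivision, assume that
locations are held as planar map coordinates in meters
measured from an origin at the south-west corner of the
territory. Both the coordinate system and the names used
below (grid_cell, cell_size_m, origin) are assumptions made
for illustration and are not taken from any national system.
Under those assumptions, the grid square containing a point
follows from a floor division:

```python
import math

def grid_cell(x_m, y_m, cell_size_m=1000.0, origin=(0.0, 0.0)):
    """Return the (column, row) index of the grid square containing a point.

    x_m, y_m    -- planar coordinates in meters (assumed projection)
    cell_size_m -- side of a grid square, e.g. 100 or 1000 meters
    origin      -- south-west corner of the grid (hypothetical)
    """
    col = math.floor((x_m - origin[0]) / cell_size_m)
    row = math.floor((y_m - origin[1]) / cell_size_m)
    return col, row

# A point 2.5 km east and just under 1 km north of the origin, 1 km grid
cell_1km = grid_cell(2500.0, 999.9)
# A point on a 100-meter grid
cell_100m = grid_cell(250.0, 1050.0, cell_size_m=100.0)
```

The same pair of indices then serves as the key under which
data for that grid square are stored.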
The grid square data in
the U.K. were compiled by aggregating the data
for all households located within each grid
square, while those in Japan and Korea were
compiled by aggregating the data for all enumeration
districts falling within each grid square.
The methodologies of the aggregation are explained
below.
4. Methods of allocating households or enumeration
districts to a grid square3/4/
1) Methods of allocating
households or housing units to a grid square
The most precise results will be achieved
if individual households or housing units are
allocated to grid cells (Figure a). However,
in order to allocate each household or housing
unit, it is necessary to have a large-scale map
on which the geographical location of
each household is indicated precisely. Without
such a detailed map, it would be difficult to
employ this method for creating the grid square
database, although with falling prices of GPS
units one can imagine that more countries will
produce such data by equipping each enumerator
with a GPS during the census. However, for now,
this is not a practical idea for most
developing countries.
2) Methods of allocating
enumeration districts to a grid square
Allocation of enumeration districts to a
grid square can follow several approaches.
One is simply to allocate an
enumeration district to a grid square if
more than half of its area falls into that cell
(Figure b).
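
The area rule can be sketched in code once ED boundaries are
available as coordinate lists. The example below is only an
illustration under stated assumptions: the threshold is taken
to be a simple majority (more than half of the ED's area), ED
boundaries are simple polygons in planar meters, and every
function name is hypothetical. The overlap is computed by
clipping the polygon against the four sides of a grid square
(a Sutherland-Hodgman style clip) and measuring the result
with the shoelace formula.

```python
import math

def polygon_area(pts):
    """Unsigned area of a simple polygon (shoelace formula)."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def clip(pts, axis, bound, keep_greater):
    """Clip a polygon against one axis-aligned half-plane."""
    def inside(p):
        return p[axis] >= bound if keep_greater else p[axis] <= bound
    out = []
    for i in range(len(pts)):
        cur, nxt = pts[i], pts[(i + 1) % len(pts)]
        if inside(cur):
            out.append(cur)
        if inside(cur) != inside(nxt):
            # The edge crosses the clip line: add the intersection point
            t = (bound - cur[axis]) / (nxt[axis] - cur[axis])
            out.append((cur[0] + t * (nxt[0] - cur[0]),
                        cur[1] + t * (nxt[1] - cur[1])))
    return out

def overlap_area(ed_polygon, col, row, cell):
    """Area of the ED polygon lying inside grid square (col, row)."""
    x0, y0 = col * cell, row * cell
    pts = list(ed_polygon)
    for axis, bound, keep_greater in ((0, x0, True), (0, x0 + cell, False),
                                      (1, y0, True), (1, y0 + cell, False)):
        pts = clip(pts, axis, bound, keep_greater)
    return polygon_area(pts)

def majority_cell(ed_polygon, cell=1000.0):
    """Grid square holding more than half of the ED's area, or None."""
    total = polygon_area(ed_polygon)
    xs = [p[0] for p in ed_polygon]
    ys = [p[1] for p in ed_polygon]
    for col in range(math.floor(min(xs) / cell), math.floor(max(xs) / cell) + 1):
        for row in range(math.floor(min(ys) / cell), math.floor(max(ys) / cell) + 1):
            if overlap_area(ed_polygon, col, row, cell) > total / 2.0:
                return col, row
    return None

# Hypothetical ED lying mostly in grid square (0, 0)
ed_mostly_west = [(200, 200), (1300, 200), (1300, 800), (200, 800)]
# Hypothetical ED split exactly in half between two grid squares
ed_split = [(500, 0), (1500, 0), (1500, 1000), (500, 1000)]
```

An ED that no single grid square covers by a majority returns
None here, and would need one of the other approaches
described in this section.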
On the other hand, a large
enumeration district may include several much
smaller grid squares (Figure c). In
this case, the ED data could be assigned in
total to the grid cell that contains the population
centroid of the ED. The population centroid
defines a representative point in the
ED that should coincide with the largest population
concentration in the area. Alternatively,
the data can be distributed evenly across
all grid cells that fall into the enumeration
area.
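
The even-distribution alternative amounts to a simple
division. The sketch below assumes that the list of grid
squares falling into the ED has already been determined, for
example by inspecting the map; the function name and the
figures are hypothetical.

```python
def distribute_evenly(ed_population, cells):
    """Split an ED total equally across the grid squares it covers."""
    if not cells:
        raise ValueError("the ED must cover at least one grid square")
    share = ed_population / len(cells)
    return {cell: share for cell in cells}

# Hypothetical ED of 900 persons covering three 1-km grid squares
shares = distribute_evenly(900, [(0, 0), (1, 0), (0, 1)])
```

Each grid square receives a fractional share, so the
resulting figures are estimates rather than counts, and the
shares from several EDs would be summed per grid square.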
The centroid or representative
points can also be used directly to allocate
ED data to grid squares. In this method, the
centroid point of each enumeration district
has to be defined at the location of the center
of population or at the principal population
cluster within the enumeration district. Then,
the data of those enumeration districts whose
centroid point falls within a grid square
are added to the data of that grid square.
Using a digitizer or GPS, the locations of
the centroids of enumeration districts can
be digitized and recorded in the computer file
of the enumeration districts and, then, the
allocation process can be performed easily
by computer.
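
The computer step described above can be sketched as follows.
This is a minimal illustration, assuming that each ED record
carries the planar coordinates of its centroid in meters and
a population count; the record layout, the identifiers and
the figures are all hypothetical.

```python
import math
from collections import defaultdict

def allocate_by_centroid(eds, cell_size_m=1000.0):
    """Add each ED's population to the grid square containing its centroid."""
    totals = defaultdict(int)
    for ed in eds:
        col = math.floor(ed["x"] / cell_size_m)
        row = math.floor(ed["y"] / cell_size_m)
        totals[(col, row)] += ed["population"]
    return dict(totals)

# Hypothetical ED file: centroid coordinates (meters) and population
eds = [
    {"id": "ED-001", "x": 350.0, "y": 420.0, "population": 812},
    {"id": "ED-002", "x": 910.0, "y": 130.0, "population": 640},
    {"id": "ED-003", "x": 1480.0, "y": 760.0, "population": 455},
]
grid_totals = allocate_by_centroid(eds)
```

In this example the first two EDs fall in the same grid
square, so their populations are added together, while the
third ED is assigned to the neighbouring grid square.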
The centroid, or the location
of the central point within a census ED, may
be determined by finding the site of the chief
settlement within the census ED on a large-scale
topographical map, such as 1:25,000.
3/ UN Handbook _____: p.113.
4/ Ohtomo, A (1991). Small area statistical databases.
Second Interregional Workshop on Population
Databases and Related Topics. Jakarta, 14-19 January.
New York. p. 114.
5. Advantages and disadvantages of the grid square
database approach5/
1) Advantages of grid square
database
The grid square data can
be easily aggregated into a larger area delineated
arbitrarily, whereas data for villages, localities,
or enumeration districts cannot, since they
have different sizes and irregular shapes.
Data for an arbitrarily delineated area can
be obtained simply by adding up the data for
the grid squares enclosed by the boundary
of the area. On the other hand, when
using the data of villages, localities,
etc., which have different sizes and irregular
shapes, an extensive adjustment of the aggregated
data may often be needed because of the possibility
of a sizable discrepancy between the aggregated
area and the area actually delineated.
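
The aggregation just described can be sketched as a sum over
the selected grid squares. The example assumes, purely for
illustration, that the arbitrary area is a circle around a
point of interest and that a grid square belongs to the area
when its center falls inside the circle; the function name,
the inclusion rule and the figures are hypothetical.

```python
import math

def population_within_radius(grid_totals, centre_xy, radius_m, cell_size_m=1000.0):
    """Sum the data of grid squares whose centers lie inside a circle."""
    cx, cy = centre_xy
    total = 0
    for (col, row), population in grid_totals.items():
        # Center point of the grid square in planar meters
        mx = (col + 0.5) * cell_size_m
        my = (row + 0.5) * cell_size_m
        if math.hypot(mx - cx, my - cy) <= radius_m:
            total += population
    return total

# Hypothetical grid square totals keyed by (column, row)
grid = {(0, 0): 100, (1, 0): 200, (0, 1): 300, (5, 5): 999}
nearby = population_within_radius(grid, (1000.0, 1000.0), 1000.0)
```

Any other boundary can be handled in the same way by
replacing the distance test with a test of whether the grid
square lies inside that boundary.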
The grid square data
usually carry attributes indicating
the location of each grid square,
that is, its longitude and latitude, and, therefore,
the distance between two grid squares can be
calculated without an actual measurement on
the map.
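
One common way to compute such a distance is the haversine
formula, sketched below. It assumes that each grid square is
represented by the latitude and longitude of a reference
point in decimal degrees and that the Earth is a sphere with
a mean radius of 6371 km; the formula is offered as an
illustration and is not attributed to any of the national
systems mentioned in this paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation, assumed)

def grid_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two grid square reference points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Two reference points one degree of latitude apart on the same meridian
one_degree = grid_distance_km(0.0, 0.0, 1.0, 0.0)
```

For grid squares that are close together, a planar
approximation based on the grid indices and the grid size
would usually be sufficient.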
These location attributes of the grid
squares enable an efficient
application of computer mapping technology.
For example, data for grid squares can be
printed in the form of a statistical map without
entering additional geographical information
about these areas into the computer.
The location attributes mentioned
above could also facilitate integrating population
data with those from other sources, which are
often presented in different units of statistical
area. For instance, enumeration districts
might have been delineated independently for the
population census and for other censuses such
as the establishment census, agriculture census,
industry census and commerce census.
In such cases, if the data of the other censuses were
compiled to grid squares, directly or based
on data for their enumeration districts, data
from the population census could be utilized in
an integrated manner with data from the other
censuses on a small area basis.
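
Once both sources have been compiled to the same grid
squares, integration reduces to matching records on the grid
square key. The sketch below assumes two tables keyed by
(column, row); the variable names and figures are
hypothetical.

```python
def combine_by_cell(population, establishments):
    """Join two grid square tables on their common (column, row) key."""
    cells = sorted(set(population) | set(establishments))
    return {
        cell: {
            "population": population.get(cell, 0),
            "establishments": establishments.get(cell, 0),
        }
        for cell in cells
    }

# Hypothetical grid square tables from two different censuses
population = {(0, 0): 1452, (1, 0): 455}
establishments = {(0, 0): 37, (2, 0): 5}
combined = combine_by_cell(population, establishments)
```

Grid squares present in only one source are kept with a zero
for the missing item, which is a design choice of this
sketch; a real compilation would need to distinguish a true
zero from missing data.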
The grid square data normally
enable a much easier statistical comparison
on the map than data based on areas
having different sizes and irregular shapes.
They therefore provide an area-standardized
comparison of statistical characteristics
among the areas concerned on the grid square map.
2) Disadvantages of grid
square database
Against the advantages of the grid square database
mentioned above, there are disadvantages, most
of which arise from the characteristics of grid
squares, as described below:
A great deal of manual
work may be needed for allocating each household
or enumeration district to a grid square.
Normally, the allocation of households or enumeration
districts to grid squares is made manually
on the map by using a digitizer or other tools.
Consequently, when the number of households
or enumeration districts is large, a significant
amount of manual work would be needed.
There are often inaccuracies
in the data aggregated for the grid square,
particularly when an aggregation is made on
the basis of the data of enumeration districts.
In this case, the level of inaccuracy would
rise as the size of grid square is made smaller.
Data may not be obtainable
for an area whose boundary is perfectly identical
to that of the area for which the data are
needed, because the data are aggregated from
rectangular areas. Therefore, in order to
obtain data for an area as close as possible
to the original area, the grid squares should
be delineated as small as possible.
Needless to say, data are
not available for an area smaller in size
than a grid square.