Workshop on Population Data Analysis, Storage
and Dissemination Technologies
Bangkok, 27-30 March 2001
ECONOMIC AND SOCIAL COMMISSION FOR ASIA AND THE PACIFIC
STAT/WDT/Republic of Korea
21 March 2001
Population Data Analysis, Storage and
Dissemination Technologies in Korea
Prepared by Sei-young
and Youn-young Park 3/
I> Census Data Collection
II> Census Data Dissemination
This paper has been reproduced as submitted.
It has been issued without formal editing.
Deputy Director, Statistical Information Division,
National Statistical Office, Republic of Korea.
Statistician, Population Census Division, National
Statistical Office, Republic of Korea.
1. Census 2000 in the Republic of Korea was
conducted successfully from 1 to 10 November
2000. The budget for Census 2000 amounted to
83.4 billion won in fiscal year 2000. About
132,600 enumerators were recruited temporarily
for this census, which was unprecedented
in size.
2. The population and housing census had been
in preparation for 3 years. During the preparation
period, we carried out 6 pilot surveys and
delineated enumeration districts (EDs) nationwide.
The main characteristics of the census are as follows:
- Including a variety of
items for new policy demands;
- Introducing self-enumeration;
- Using digital maps.
3. Information technology has developed very
rapidly, and censuses in many countries have made
remarkable progress thanks to newly applied
technologies. However, because of social and
cultural differences, each country has applied
technologies specific to its own circumstances.
We hope that this workshop will be a good opportunity
to share new technologies that can readily be
applied in Korea.
I> Census Data Collection
application of digital maps to delineation of
4. The Korea National Statistical
Office (KNSO) launched a map-making project
in 1998, in which each ED was assigned a regional
code and a map database was constructed. Digital mapping
replaced the older raster image process used by
5. EDs were delineated so as to give enumerators
a reasonable workload while avoiding overlaps
and omissions in the count. Two types of EDs were
defined in the 2000 Census, general EDs and special
EDs, and each type was divided into sub-groups.
General EDs, of about 60 households each, covered
all areas freely accessible to enumerators.
Special EDs covered areas inaccessible to
enumerators, such as military camps, naval
fleets, police and maritime-police camps and stations,
prisons, juvenile detention houses, and Korean
diplomatic and consular offices abroad.
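
The workload rule for general EDs can be illustrated
with a small sketch. It is not KNSO's procedure: the
function name `delineate_eds`, the block-level input
and the greedy grouping rule are assumptions made for
illustration. Only the target of about 60 households
per ED comes from the text.

```python
from typing import List, Tuple

TARGET_HOUSEHOLDS = 60  # approximate general-ED size given in the paper


def delineate_eds(blocks: List[Tuple[str, int]],
                  target: int = TARGET_HOUSEHOLDS) -> List[List[str]]:
    """Greedily group consecutive blocks into EDs of roughly `target` households.

    `blocks` is a hypothetical list of (block_id, household_count) pairs in
    geographic order. Each block lands in exactly one ED, so there is no
    overlap and no omission.
    """
    eds: List[List[str]] = []
    current: List[str] = []
    count = 0
    for block_id, households in blocks:
        # Close the current ED if adding this block would overshoot the target.
        if current and count + households > target:
            eds.append(current)
            current, count = [], 0
        current.append(block_id)
        count += households
    if current:
        eds.append(current)
    return eds


blocks = [("A", 25), ("B", 30), ("C", 20), ("D", 45), ("E", 10)]
print(delineate_eds(blocks))
```

Because every block is assigned to exactly one group,
the sketch satisfies the two stated constraints
(no overlapping, no omission) by construction. A real
delineation would also have to respect administrative
boundaries, which this sketch ignores.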
6. It took almost 4 years to
finish the work of delineating EDs.
Table 1. Timetable for
Stage of obtaining GIS-data
- 1/1000 - metropolitan and urban areas
- 1/5000 - rural
- 1/25,000 - mountains,
Stage of basic map ascertaining
98.2~99.12
precincts' ascertaining and editing
and renewing of Main foundation and
and renewing of Buildings and quarters
- Acquisition of apartments' basic information
- Making and diffusion of basic maps
of quarters and households
- Delineating EDs
- The renewing and editing of changed EDs
- Output and providing of enumeration maps
7. Effective application of Geographic Information
System (GIS) technology was required to make the
delineation successful. In particular, the Map
Searching and Output System for Statistical Survey
(MSSS) was applied to meet this need. The system
makes it possible to run various searches even on
small districts and to print all types of digital
maps. Its main characteristics are as follows:
- Various searches on small
districts are possible, and the districts
can be managed easily;
- Regions selected by
the user can be conveniently edited;
- Maps of regions can
be printed in various types and sizes.
8. The clients are the census-taking staff,
namely the members of the work group. They work
on personal computers equipped with Windows 95/98,
AutoCAD and MSSS. The server is equipped with
Oracle and the Spatial Database Engine (SDE) and
handles data construction, editing and renewal.
Through these processes, GIS data can be printed
on paper using a plotter.
Table 2. GIS System diagram
9. With the help of this system,
we will be able to print out ED maps automatically
for many different types of census,
such as the population census, agricultural census,
industry census and so on.
PC data entry>
10. In Census 2000, the main strategy in the data
processing phase is to satisfy users' various
demands for census data. The results should be
tabulated and released as soon as possible. For
this purpose, we introduced a regionally distributed
system for entering questionnaires into electronic
media. Thanks to this system, the preliminary count
of Census 2000 could be released in late December.
11. Several methods of collecting data were
considered, and their pros and cons were compared
and analyzed. We concluded that PC outsourcing
was the most economical and time-saving method
in the Korean context. To shorten the time needed
to release census results, regionally distributed
processing was selected for data input and data
checking instead of centralized processing. The
application program for data input and data
checking was developed in the Clipper programming
language and distributed to 12 local offices.
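
To make the data-check step concrete, here is a
minimal sketch of a field-level validity check on
keyed records. It is written in Python rather than
Clipper and does not reproduce the KNSO program: the
field names and the permitted ranges in `RULES` are
hypothetical, since the paper does not list the
actual edit rules.

```python
from typing import Dict, List

# Hypothetical validity rules; the actual census edit rules are not given
# in the paper.
RULES = {
    "age": lambda v: 0 <= v <= 120,
    "sex": lambda v: v in (1, 2),
    "household_size": lambda v: 1 <= v <= 30,
}


def check_record(record: Dict[str, int]) -> List[str]:
    """Return the names of fields that are missing or fail their rule."""
    errors = []
    for field, is_valid in RULES.items():
        value = record.get(field)
        if value is None or not is_valid(value):
            errors.append(field)
    return errors


keyed = [
    {"age": 34, "sex": 1, "household_size": 4},
    {"age": 340, "sex": 1, "household_size": 4},  # keying error in age
    {"age": 12, "sex": 3},                         # bad sex code, missing size
]
for number, rec in enumerate(keyed):
    print(number, check_record(rec))
```

Running such checks in each local office at entry
time, rather than centrally afterwards, is what lets
errors be corrected while the questionnaires are
still at hand.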
12. In addition, an imputation method was
introduced. It will contribute greatly to
shortening the editing period and improving the
accuracy of the data. With the help of these new
methods, the complete results of Census 2000 will
come out at the end of this year.
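
The paper does not say which imputation method was
used. As an illustration only, the sketch below shows
a sequential hot-deck, one technique commonly used in
census editing: a missing value is filled from the
most recent record in the same group. The function
name, the field names and the grouping variable are
all assumptions.

```python
from typing import Dict, List, Optional


def hot_deck_impute(records: List[Dict[str, Optional[int]]],
                    field: str,
                    group_by: str) -> List[Dict[str, Optional[int]]]:
    """Fill missing `field` values with the last value seen in the same group.

    Records are processed in file order; a missing value stays None if no
    donor from the same group has been seen yet.
    """
    last_donor: Dict[Optional[int], int] = {}
    result = []
    for rec in records:
        rec = dict(rec)  # copy so the input is left untouched
        group = rec.get(group_by)
        if rec.get(field) is None:
            rec[field] = last_donor.get(group)
        else:
            last_donor[group] = rec[field]
        result.append(rec)
    return result


people = [
    {"sex": 1, "education": 3},
    {"sex": 2, "education": 5},
    {"sex": 1, "education": None},  # takes 3 from the previous sex == 1 record
    {"sex": 2, "education": None},  # takes 5 from the previous sex == 2 record
]
imputed = hot_deck_impute(people, field="education", group_by="sex")
print([p["education"] for p in imputed])
```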
coding of industry and occupation items>
13. In Census 2000, an automatic coding system
for the industry and occupation classifications
was introduced. Industry and occupation items
are generally considered the most difficult to
process accurately. KNSO has therefore constructed
an indexed database of these classifications
to retrieve automatically the classification codes
that match industry and occupation responses,
which are composed of several words.
14. In previous censuses, these two items were
coded by hand. This led to inconsistencies and
inaccuracies owing to the differing abilities
of the coders, and tabulating the items required
a great deal of manpower and time. By matching
responses against the indexed DB, the codes for
these items will be produced automatically.
This system will improve quality
and save time.
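
The idea of matching multi-word responses against an
indexed database can be sketched as follows. This is
not the KNSO system, which was written in C and shell
script: the codes and descriptions in `CLASSIFICATION`
are invented for illustration, and the word-overlap
scoring rule is an assumption.

```python
from collections import defaultdict
from typing import Dict, Optional, Set

# Hypothetical index entries; real classification codes are not reproduced.
CLASSIFICATION = {
    "A01": "rice farming",
    "C10": "bread bakery manufacturing",
    "P85": "primary school education",
}


def build_index(classification: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each word to the set of codes whose description contains it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for code, description in classification.items():
        for word in description.lower().split():
            index[word].add(code)
    return index


def auto_code(response: str, index: Dict[str, Set[str]]) -> Optional[str]:
    """Return the code sharing the most words with the response.

    Returns None when nothing matches or the best score is tied, so that
    such responses can be referred to a human coder.
    """
    scores: Dict[str, int] = defaultdict(int)
    for word in set(response.lower().split()):
        for code in index.get(word, ()):
            scores[code] += 1
    if not scores:
        return None
    best = max(scores.values())
    winners = [code for code, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None


index = build_index(CLASSIFICATION)
print(auto_code("primary school teacher", index))
print(auto_code("astronaut", index))
```

Returning no code for unmatched or ambiguous
responses keeps the automatic step conservative;
those cases would still need manual coding.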
15. This system runs on the UNIX operating system
with the Gnu Database System (GDBMS) and was
developed in C and shell script. KNSO will apply
manual coding and automatic coding in parallel
in order to compare them.
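
The paper does not describe how the two coding
methods will be compared. One simple possibility,
shown here purely as an assumption, is to compute how
many responses the automatic coder could code at all
(coverage) and how often its code matches the manual
one (agreement). The function name and the sample
codes are hypothetical.

```python
from typing import List, Optional, Tuple


def compare_coding(manual: List[str],
                   automatic: List[Optional[str]]) -> Tuple[float, float]:
    """Return (coverage, agreement) for two parallel lists of codes.

    coverage  = share of responses the automatic coder could code at all
    agreement = share of automatically coded responses matching the manual code
    """
    if len(manual) != len(automatic):
        raise ValueError("code lists must have the same length")
    coded = [(m, a) for m, a in zip(manual, automatic) if a is not None]
    coverage = len(coded) / len(manual) if manual else 0.0
    agreement = (sum(m == a for m, a in coded) / len(coded)) if coded else 0.0
    return coverage, agreement


manual_codes = ["P85", "A01", "C10", "A01"]
auto_codes = ["P85", "A01", None, "C10"]
print(compare_coding(manual_codes, auto_codes))
```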
2000 on the Internet web site>
16. The Census web site has been set up to make
people aware of the census and to persuade them
to participate more actively in it. The web site
carries metadata such as census briefs, recruitment
of enumerators, survey methods, a glossary
and the like.
17. In particular, a Q&A bulletin board has been
kept up to date in order to respond to the numerous
inquiries from respondents and enumerators. This
service can deliver information on census
preparations to participants instantly, ahead
of the census-taking. As on-line tools for
instant responses were provided, census staff
could answer efficiently the numerous questions
raised during the census-taking period.
Furthermore, this service made it possible for
the same rules to be applied simultaneously over
the entire country.
II> Census Data Dissemination
strategy of census data>
18. In terms of utilization, the census data
should be easily available through various media
and freely accessible to users. In the first place,
providing the whole of the short-form and long-form
data as a kind of raw data has been seriously
considered. They will be provided as web documents
and on CD-R. In the past census, only a 2% sample
of the data was available to users.
19. Secondly, the results of the short-forms and
long-forms will be published as printed publications
and on CD-R. The short-form reports will be
published in 3 series: the whole country,
provinces, and family names. The long-form reports
consist of several series of reports on
special topics.
20. Thirdly, reports in which various data are
comprehensively analyzed will come out soon. They
will deal with various topics, researching how
census data are put to practical use and how new
census technologies can be developed. To complete
this work successfully, many special experts should
Table 3. Timetable for the release calendar of 2000 Census
Preliminary count of population, household and housing
Basic information on the size, structure, and distribution
of Population and Housing
Various social and economic characteristics
of Population and Housing
In-depth analysis of the social and economic characteristics
concerning population, place of birth,
of integrated census database>
21. A Census database loaded with all the results
of past censuses has been constructed. The first
Census of the Republic of Korea was conducted in
1925. The census reports from 1925 to 1995,
119 volumes totalling 37,000 pages, will be loaded
into the database.
22. Each page has been converted into a computer
file using OCR and then inspected. As you may
know, the KNSO has provided the statistical DB,
the Korean Statistical Information System (KOSIS),
for years. The error-free files will be loaded
into KOSIS using an application program developed
by the KNSO itself. The newly constructed census
database may be ready for use this summer. As
a matter of course, they will be added to
the census DB after the reports of this census
dissemination through STAT-KOREA web site>
23. The KNSO has adopted a kind of decentralized
statistical system. Because of this, many users
find it difficult to search for statistical
information. To overcome this problem, the KNSO
constructed STAT-KOREA, a portal site where
users can easily retrieve statistical information.
It provides a one-stop service through which users
can conveniently find any statistical data, even
though the data are dispersed across 130
statistical institutes. The latest census data
will be provided to users through STAT-KOREA in
the form of web documents, databases, etc.
the Internet Shopping Mall>
24. As the Internet spreads rapidly, the number
of cyber shopping malls is increasing. KNSO
opened a shopping mall on the Internet in 1999,
and it has helped users purchase statistical
products conveniently on-line. The Internet
shopping mall deals with various kinds of statistical
media such as books, CD-Rs, etc. This will make
a great contribution to the distribution of census
25. At this moment, we are in the middle of
conducting Census 2000. Both successful
census-taking and efficient data dissemination
will play a vital role in securing solid ground
for future censuses. As we all believe,
improvements in the speed and precision of
census-taking and data dissemination will not
be possible without accepting new technologies.
The KNSO is always ready to learn good
technologies if they are pertinent and applicable
to Korea. Hopefully, we will
have a successful outcome as has been seen in