| Workshop on Population Data Analysis, Storage and
Dissemination Technologies |
| Bangkok, 27-30 March 2001 |
STAT/WDT/Thailand
26 March 2001
ENGLISH ONLY
ECONOMIC AND SOCIAL COMMISSION FOR ASIA AND THE PACIFIC |
| Intelligent Character Recognition in Capturing the Thailand 2000 Population and Housing Census Questionnaire 1/
|
1/ This paper has been reproduced as submitted. It has been issued without formal editing.
|
| The 2000 Population and Housing Census |
| The 2000 Population and Housing Census is the tenth population
census and the fourth housing census of Thailand.
The National Statistical Office (NSO) carries
out the Population and Housing Census every ten
years in accordance with the international standard
to collect basic data on the number and characteristics
of population and housing throughout the country.
The data are used for planning and setting policies
of the government and private sectors. The census
data are also used for population projections.
Field work took place in April 2000, with 1 April as the census date. |
| Methodology |
| Data were collected by interview. In all provinces except Bangkok, school teachers served as supervisors and interviewers in both municipal and non-municipal areas, while in Bangkok temporary employees were hired for both roles. There were 40,000 enumeration districts. A total of 6,000 supervisors and 40,000 enumerators took part in the field enumeration. |
| A sample enumeration technique was used to reduce cost. In every province, including Bangkok, all persons and households were listed and enumerated at the same time: a 20% sample of households was enumerated with the long form questionnaire, and all other households with the short form questionnaire. |
| Preparatory activities |
| Preparatory activities for the 2000 Census began in 1998. The major activities were drawing up a calendar of operations, requesting permission from the cabinet to carry out the census, and requesting the budget for the whole project over the period 1998-2001. The budget covered questionnaires, instruction manuals, materials and maps of enumeration areas, a publicity program, data processing, data dissemination, a post-enumeration survey, and the census field personnel. |
| Budget |
| The NSO received only 581.4 million Baht, or $15.3 million, for the entire census operation over 1998-2001. About half of the total budget was for the field enumeration. The distribution of expenditure is as follows: |
| Item | Share of budget |
| Field enumeration | 49.0% |
| Data processing | 12.5% |
| Mapping | 5.1% |
| Questionnaire | 4.5% |
| Publicity | 4.3% |
| Publications | 3.1% |
| Others (e.g. training) | 21.5% |
| Total | 100.0% |
|
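The percentage shares can be turned into approximate amounts by applying them to the stated total of 581.4 million Baht. The following is a minimal sketch of that arithmetic; the derived amounts are illustrative and do not appear in the paper, and the names `allocate`, `SHARES_PERCENT` and `TOTAL_BUDGET_MILLION_BAHT` are hypothetical.

```python
# Total census budget in million Baht, as stated in the text.
TOTAL_BUDGET_MILLION_BAHT = 581.4

# Percentage shares of expenditure, as listed in the budget table.
SHARES_PERCENT = {
    "Field enumeration": 49.0,
    "Data processing": 12.5,
    "Mapping": 5.1,
    "Questionnaire": 4.5,
    "Publicity": 4.3,
    "Publications": 3.1,
    "Others (e.g. training)": 21.5,
}


def allocate(total, shares_percent):
    """Return the amount for each item, given its percentage share of the total."""
    return {item: total * pct / 100.0 for item, pct in shares_percent.items()}


# Derived amounts (an illustration only; not figures reported by the NSO).
amounts = allocate(TOTAL_BUDGET_MILLION_BAHT, SHARES_PERCENT)
for item, amount in amounts.items():
    print(f"{item:<24} {amount:8.1f} million Baht")
```

Under these figures, field enumeration alone accounts for roughly 285 million Baht, which is consistent with the statement that about half of the budget went to field work.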
| Publications |
| The publications comprise the preliminary report, the advance report, the final reports for all provinces (76 provinces), regions (4 regions) and the whole kingdom, and analytical reports, based on a 1 percent sample, on special topics such as economic activities, migration and fertility. |
| At present the NSO has released the preliminary report and the advance report. Both are available as publications and on the internet. The census counted a total population of 60.6 million in about 16 million households. |
| ICR technique in capturing the census questionnaire |
| The final reports for all provinces (76 provinces), regions (4 regions) and the whole kingdom are planned for publication by September 2001, about 17 months after the end of the interview period. To meet this schedule, the questionnaire forms (16 million forms) need to be captured and translated into ASCII files as soon as possible for further editing, tabulation and publication. The NSO therefore decided to use a new technology, Intelligent Character Recognition (ICR), rather than a data keying system. An ICR scanner reads the questionnaires at very high speed, and its software translates handwritten numbers and letters into ASCII files for further processing. The reasons for using ICR are as follows: |
- Volume: about 16 million questionnaire forms need to be processed in a very limited time. With ICR technology, the scanner reads the questionnaire forms at very high speed and the software translates handwritten numbers and letters into ASCII files for further processing by the mainframe computer.
- Accuracy: ICR can read the information more accurately than data keying.
- Reduced cost and space: data keying would need 700 microcomputers and 700 data keying personnel to finish keying the census questionnaires within 15 months, compared with an ICR system of 6 scanners and 82 microcomputers, operated by 80 personnel, that can complete the census questionnaires within 8 months.
- Reuse: after the census data have been processed, the ICR system can improve the timeliness of processing the NSO's other census and survey questionnaires.
|
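The staffing comparison in the list above can be made concrete by converting both options into person-months. This is a rough sketch only: person-months is a metric introduced here for illustration, not one used by the NSO, and it ignores equipment, space and software costs. The function name `person_months` is hypothetical.

```python
def person_months(staff: int, months: int) -> int:
    """Total labour input: number of staff multiplied by months worked."""
    return staff * months


# Figures quoted in the text: 700 keying staff for 15 months,
# versus 80 ICR staff for 8 months.
keying = person_months(staff=700, months=15)
icr = person_months(staff=80, months=8)

print(f"Data keying: {keying:,} person-months")
print(f"ICR system:  {icr:,} person-months")
print(f"Ratio:       {keying / icr:.1f} to 1")
```

On these figures, data keying would need 10,500 person-months against 640 for ICR, a ratio of about 16 to 1 in labour alone.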
| Development |
| In the early stage of planning, the NSO planned to decentralize the data processing within the very limited budget. Both short and long form questionnaires would be edited and coded manually at the local provincial statistical offices and then sent to a data capture center. There would be 15 data capture centers: 3 in the Northern Region, 5 in the Northeastern Region, 3 in the Central Region, 3 in the Southern Region and one in Bangkok. Each regional center would consist of 3 data capture devices connected to microcomputers, all operating stand-alone. For Bangkok the system would be a LAN consisting of a more advanced data capture device, one server and 6 microcomputers. After the census questionnaires had been processed, each data capture device, together with its microcomputer, would be distributed to a major province, with the alternative plan that every province would be equipped with at least one data capture device. |
| After a series of tests we found that a scanner with an ordinary commercial software package could not handle the large volume of census questionnaires: reading was slow, recognition of letters was inaccurate, and the quality of verification could not be controlled. Because of the low quality of the questionnaire paper, the scanner screen needed to be cleaned very often and the rollers needed to be changed more often than expected. Because the distributor was located in Bangkok, maintenance was inconvenient and time-consuming. |
| We found that more sophisticated software could solve these problems. With the very limited budget, the NSO could afford only one highly efficient software package. For that reason, and to solve the maintenance problem, the NSO decided to have a single ICR system stationed in Bangkok. |
| Following the bidding, Loxley Business Information Technology Co., Ltd. passed the qualification test. The system consists of: |
Hardware
1. 1 NT server for the TELEform server.
2. 1 NT server for the Pervasive database server.
3. 21 workstations for Reader modules.
4. 55 workstations for Verifier modules.
5. 6 workstations for scanner control.
6. 6 Fujitsu 4099 scanners. |
Software
TELEform software from Cardiff Software Inc., U.S.A. |
| System capacity, calculated from the official 7-hour working day |
Scanners: each scanner can read both sides of an A3-size questionnaire at a speed of 50 sheets per minute, so 6 scanners can scan 126,000 questionnaires per day.
Reader process: there are 21 workstations for the reader process, with operator and supervisor. Each can process 12.5 questionnaires per minute, so 21 workstations can process 110,250 questionnaires per day.
Verifier process: there are 55 microcomputers used for verifying, with one person at each. Each person can verify an average of 4 questionnaires per minute, so 55 workstations can verify 92,400 questionnaires per day. |
| The NSO plans to process the census questionnaires within 8 months with 80 personnel. |
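The planned daily capacities follow from a simple product: rate per minute, times number of units, times 420 minutes in a 7-hour day. The sketch below reproduces that arithmetic and identifies the slowest stage. It assumes that the three stages run in parallel as a pipeline, that one scanned sheet corresponds to one questionnaire, and that about 16 million forms are to be processed; the working-day estimate is an illustration, not a figure from the paper. All names in the code are hypothetical.

```python
import math

MINUTES_PER_DAY = 7 * 60  # official 7-hour working day

# Planned figures from the text: stage -> (rate per minute per unit, number of units).
PLANNED = {
    "scanner": (50.0, 6),    # sheets per minute per scanner, 6 scanners
    "reader": (12.5, 21),    # questionnaires per minute per workstation
    "verifier": (4.0, 55),   # questionnaires per minute per person
}

TOTAL_FORMS = 16_000_000  # about 16 million forms (assumed exact here)


def daily_capacity(rate_per_minute, units, minutes_per_day=MINUTES_PER_DAY):
    """Questionnaires one stage can handle in a working day."""
    return rate_per_minute * units * minutes_per_day


capacity = {stage: daily_capacity(rate, units) for stage, (rate, units) in PLANNED.items()}

# In a pipeline, overall throughput is limited by the slowest stage.
bottleneck = min(capacity, key=capacity.get)
working_days = math.ceil(TOTAL_FORMS / capacity[bottleneck])

for stage, per_day in capacity.items():
    print(f"{stage:<9} {per_day:>9,.0f} questionnaires per day")
print(f"Slowest stage: {bottleneck}; about {working_days} working days for all forms")
```

With the planned figures, verification is the slowest stage at 92,400 questionnaires per day, which under the stated assumptions gives about 174 working days for 16 million forms.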
| In practice, once scanning of the census questionnaires began, the following problems occurred: |
| 1. From the questionnaire |
Pencil writing that was not dark enough, because some interviewers did not complete the questionnaire form with at least a 3B pencil.
Poor handwriting.
Wrinkled questionnaire forms. |
| 2. From the system |
The scanner screen needed to be cleaned regularly because of the poor quality of the questionnaire paper and the eraser dust that remained on the questionnaires.
The scanner rollers needed to be cleaned and replaced more often because of the wrinkled questionnaires and the quality of the questionnaire paper.
Bad images: the company had to add software to correct them, which caused the scanners to scan more slowly. |
| In actual operation the capacity of the system is as follows: |
Scanners: each scanner can read at a rate of 40 pages per minute, so 6 scanners can read 100,800 questionnaires per day.
Reader process: each system can process 10 questionnaires per minute, so 21 workstations can process 88,200 questionnaires per day.
Verification: with more experience, one verifier can verify 5 questionnaires per minute, so 55 microcomputers can verify 115,500 questionnaires per day. |
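The planned and actual rates quoted in the paper can be compared stage by stage. The sketch below computes both sets of daily capacities over a 7-hour (420-minute) day, the percentage change for each stage, and the slowest stage in actual operation. It assumes that the stages run in parallel as a pipeline and that about 16 million forms are to be processed; the working-day estimate is an illustration, not a figure from the paper. All names in the code are hypothetical.

```python
import math

MINUTES_PER_DAY = 7 * 60  # official 7-hour working day
TOTAL_FORMS = 16_000_000  # about 16 million forms (assumed exact here)

# stage -> (number of units, planned rate per minute, actual rate per minute)
STAGES = {
    "scanner": (6, 50.0, 40.0),
    "reader": (21, 12.5, 10.0),
    "verifier": (55, 4.0, 5.0),
}


def daily(units, rate_per_minute):
    """Questionnaires handled per working day by one stage."""
    return units * rate_per_minute * MINUTES_PER_DAY


planned = {s: daily(units, p) for s, (units, p, _) in STAGES.items()}
actual = {s: daily(units, a) for s, (units, _, a) in STAGES.items()}

# Percentage change of actual capacity relative to planned capacity.
change = {s: (actual[s] - planned[s]) / planned[s] * 100 for s in STAGES}

# In a pipeline, overall throughput is limited by the slowest stage.
bottleneck = min(actual, key=actual.get)
working_days = math.ceil(TOTAL_FORMS / actual[bottleneck])

for s in STAGES:
    print(f"{s:<9} planned {planned[s]:>9,.0f}  actual {actual[s]:>9,.0f}  ({change[s]:+.0f}%)")
print(f"Slowest stage in practice: {bottleneck}; about {working_days} working days")
```

On these figures, scanning and reading each fell by 20% while verification rose by 25%, so the slowest stage shifted from verification to the reader process at 88,200 questionnaires per day.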
| Data dissemination |
| At present, the NSO has released the preliminary report and the advance report. The final reports by province will be available in printed form shortly. |
| The preliminary and advance reports are now available on CD-ROM. The preliminary report and some important tables of the advance report are also available on the internet. |
| At present, the data from both short and long form questionnaires are recorded and saved on cartridge tape and CD-ROM. We also plan to create a database system in the future. |
| Conclusion |
| The 2000 census has been carried out under conditions of financial stringency. The most serious difficulty has been maintaining a satisfactory level of enumeration quality in the face of low pay and poor conditions for enumerators, especially in urban areas, where there are other serious enumeration problems. For the data processing stage the NSO adopted the ICR technique for capturing the census questionnaires, so that the 2000 census can be processed and published within 1.5 years, compared with 2.5 years for the 1990 census. |
| [Figure: ICR System Configuration for NSO] |
|