Annex-A: Short Questionnaire
for Population Census 2001
Annex-B: Long Questionnaire
for Population Census 2001
Bangladesh started application of OMR in
1981 for processing the Population Census documents
with OMR IBM 3881. The same machines were used
in 1983-84 for processing agriculture census
documents. Bangladesh used OPScan 21 to process
the census documents of population census 1991.
In all these censuses it successfully captured
the numbers and the codes. A short single sheet
OMR questionnaire was used all along to collect basic
information on each household and its individual
members. The detailed
socio-economic information was collected and
processed using data entry technology. The
main census and the detailed sample survey used
to be conducted after a period of about 9 months.
As a result, the time frame could not be maintained
for the census activities as a whole.
Bangladesh is at the doorstep of the 2000
round of censuses. On the basis of past experience
and the demand for timely census results, the
Government of Bangladesh has decided in principle
to adopt the 3:2 years timeframe for the 2000
round of censuses. To share resources,
it has planned to conduct three national censuses
in succession, i.e. Economic Census in 2000,
Population and Housing Census in 2001 and Agricultural
Census in 2002. Bangladesh is also contemplating
adopting a civil registration system through
local government in the near future. In view
of users' demand, it is felt essential to search
for technology which will ensure rapid capture
of numbers as well as characters from multi-sheet
questionnaires. With these in view, questionnaires
have been designed in both the OMR and ICR format,
in single as well as in multiple sheets. To
make the operation a success, 3 requirements are important:
availability of resource persons to manage printing of questionnaire, data collection and data capture from both single sheet and multiple sheet questionnaires;
availability of maintenance
distributive data capture with back-up system.
Incidentally, we have been blessed with the
advancement of technology. OMR is now available
with both digit and character recognition
capability at high speed. Bangladesh
has ordered 5 OMRs of model
DRS 800 with a throughput of 8000 A4 sheets per hour,
with large host computers. As back-up,
it has ordered 5 ICRs with a throughput of
42 ppm. Software for the OMR and ICR
will be configured in such a way that both
machines can be used interchangeably and
can produce output in ASCII format, which can
be processed with our own application program.
Experience with OMR IBM 3881 in Population Census
Operation of the OMR started on 20th October,
1981 on an experimental basis and actual production
work was taken up from 3rd November, 1981. During
the first week it was run in one shift of 7
hours duration. The next week, work was done in
2 shifts. From the third week 3 shifts were
introduced. This decision to run 3 shifts was
extremely risky. But considering the need for
quick data entry, and the fact that
8 months had already elapsed after the census,
3 shifts were necessary to make up time. The
decision was, however, taken in consultation with
IBM Engineers, who agreed to help in maintenance
and offered service every day without any extra
charge. It was also expected that the other
OMR would be operational after the receipt of
necessary spare parts and that it would then be sufficient
to run the two OMRs in 2 shifts only.
Only 3 to 4 thousand sheets (without splitting
a Union) were taken at a time on one tape. This
was done to avoid large scale re-runs due to
(i) frequent power failure, (ii) machine breakdown
and (iii) eventual data check on tapes. The
tapes of one Thana were later merged
into one tape containing all questionnaires of that Thana.
Every sheet was assigned a 7-digit serial number
generated by the OMR and recorded on the tape.
The same serial was also printed on the sheet.
This serial, read with the Geo-code, could identify
an individual household for later reference. The
leftmost 2 digits indicated the Geo-code of the Thana
and the remaining 5 digits indicated the running
serial within the Thana. After a successful run,
the tape number, serial numbers of the sheets, date
and other particulars were written on the external
label of the tape and also recorded in a Register.
These tapes were then taken to a mainframe
computer for further processing.
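The serial number layout described above can be sketched in a few lines of code. This is only an illustration, not the census software: the names `SheetSerial` and `split_serial` are hypothetical, and the example serial is made up. Only the split into 2 Thana digits and 5 running digits comes from the text.

```python
from typing import NamedTuple


class SheetSerial(NamedTuple):
    thana_code: str      # leftmost 2 digits: Geo-code of the Thana
    running_serial: int  # remaining 5 digits: running serial within the Thana


def split_serial(serial: str) -> SheetSerial:
    """Split a 7-digit OMR sheet serial into Thana code and running serial."""
    if len(serial) != 7 or not (serial.isascii() and serial.isdigit()):
        raise ValueError(f"expected a 7-digit serial, got {serial!r}")
    return SheetSerial(thana_code=serial[:2], running_serial=int(serial[2:]))


# Example with a made-up serial number
print(split_serial("4200317"))
```

Keeping the Thana code as a string preserves any leading zero in the Geo-code.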
Reasons for lower avera
Some of the difficulties faced at the time
of operation were:
Sometimes recording on tape was found to be erratic. This necessitated re-run of the entire batch.
Sudden power failure necessitated re-run of the batch. Such failure was almost a daily affair. On average about 2 hours were needed daily for this.
Two to three hundred sheets were kept on the hopper at a time. When these were processed the machine stopped automatically. The operator had to bring down the hopper, feed the next batch of 2-3 hundred sheets and start again. These sheets needed proper alignment at the edges. These actions took 2 to 3 minutes.
After processing 5 to 6 thousand sheets the hopper area, carriage area, mark sensing area, etc. had to be cleaned manually with clean dusters and a vacuum cleaner. Similarly, the read-write head of the tape drive had to be cleaned 2 or 3 times daily.
The FCS was designed in such a way as to reject the Tally sheet in case there was any blank or double mark on a row. If a Tally sheet was rejected the operator had to stop the machine, look for the error and correct it before re-feeding.
The OMR machine itself broke down several times during the period of census processing. While the IBM Engineers were available all the time, it was very difficult to get the supply of necessary spare parts from abroad. Initially the other OMR was cannibalised, but later the whole processing remained suspended for a considerable period of time.
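The rejection rule for Tally sheets (a blank or a double mark on any row) can be illustrated with a short sketch. The representation of a sheet as a list of rows of 0/1 mark flags, and the function names, are assumptions made for illustration; the actual FCS logic is not documented here.

```python
from typing import List


def tally_sheet_errors(rows: List[List[int]]) -> List[str]:
    """Return one message per row that is blank or has more than one mark.

    Each row is a list of 0/1 flags, one flag per mark position (assumed layout).
    """
    errors = []
    for number, row in enumerate(rows, start=1):
        marks = sum(1 for flag in row if flag)
        if marks == 0:
            errors.append(f"row {number}: blank")
        elif marks > 1:
            errors.append(f"row {number}: double mark")
    return errors


def is_rejected(rows: List[List[int]]) -> bool:
    """A sheet is rejected if any row is blank or carries more than one mark."""
    return bool(tally_sheet_errors(rows))


# Made-up sheet: row 2 is blank and row 3 carries two marks
sheet = [[0, 1, 0], [0, 0, 0], [1, 1, 0]]
print(tally_sheet_errors(sheet))
```

Reporting the offending rows, rather than only a pass/fail flag, mirrors the operator's task of looking for the error before re-feeding.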
As stated earlier, we had to settle
for only one OMR. The other OMR could not be
made operational till the end of census processing,
mainly because of non-availability of spare parts.
One OMR machine was put into operation from
3rd November, 1981 and preparation of data tapes
for the Census of Population, 1981 continued
till 17th August, 1983. Though the
time taken for completion of the work was about
21 1/2 months (655 days), actual operation was
much less than that. Out of 655 days, 197 days
were lost to machine breakdown and 12 days
were observed as holidays. The machine
was operated on a test basis in a single shift from
3rd November to 8th November, 1981 and in 2
shifts from 9th to 12th November, 1981.
Thereafter operation continued in 3 shifts per
day till August, 1982 and for the remaining year
the machine was operated in 2 shifts per day.
Generally 2 operators worked in one shift. For
the available 446 actual working days, a total
of 1,116 shifts were planned. But 169 shifts
could not be utilized, either due to minor mechanical
troubles or power failure. About 15 million
documents were read through OMR in 997 shifts
each of 7 hours duration. Thus, the average
number of sheets read per shift stands at 15,000.
But 145 tapes containing about 4,000 households
each had to be re-run because they could
not be read in the computer. As such, the average
number of sheets per shift was actually slightly
higher and stood at roughly 16,000. However,
experience showed that a maximum of 20-22 thousand
sheets can be read in a single shift of 7 hours.
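The working-day and per-shift figures in this paragraph can be cross-checked with a little arithmetic. The sketch below uses only numbers quoted above; treating every re-run tape as 4,000 extra sheets (one sheet per household) is an approximation, and the variable names are illustrative.

```python
# Figures quoted in the text
total_days = 655
breakdown_days = 197
holidays = 12
shifts_worked = 997
documents_read = 15_000_000      # "about 15 million"
rerun_tapes = 145
sheets_per_rerun_tape = 4_000    # "about 4,000 households", one sheet each (assumed)

working_days = total_days - breakdown_days - holidays

# Average based on documents that ended up on usable tapes
net_per_shift = documents_read / shifts_worked

# Average once the sheets that had to be read a second time are included
rerun_sheets = rerun_tapes * sheets_per_rerun_tape
gross_per_shift = (documents_read + rerun_sheets) / shifts_worked

print(working_days)             # 446
print(round(net_per_shift))     # 15045
print(round(gross_per_shift))   # 15627
```

The gross figure of about 15,600 sheets per shift is in line with the rounded value of roughly 16,000 quoted in the text.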
Details are given below:
OMR Operation by Month, Nov. 3, 1981 to Aug. 17, 1983 (655 days)
(Table values not recovered; surviving column headings: Number of shifts worked per day, No. of Tapes, Down time (days).)
Experience with OMR OPScan 21 for Processing Population
Two OMR OPSCAN 21/75 machines were run in two shifts. In each
shift 16 men worked as --
Tearing of questionnaire
The morning shift started at 7.30 a.m. and
continued up to 2.00 p.m., and the evening shift
started at 2.00 p.m. and continued up to 8.30
p.m. In addition, one maintenance engineer remained
on standby from 9.00 a.m. to 5.00 p.m. every day
to hunt down problems and to make the machine
operational immediately whenever there was any
disturbance in the machine.
The OMR room was relatively cool. Dehumidifiers were
used to control humidity. Blowers and vacuum
cleaners were used for dust control. In addition,
Jugglers were used to remove dust from the
documents before these were run in the OMR. Before
data capture the questionnaires were kept
in the controlled air for 48 hours. The procedures
followed in OMR operation were:
received by batches of thanas and entered into control register;
Census books were taken out of the envelope, the geo-code was checked, and the filled-in questionnaires were torn off and kept above the envelope for seasoning;
After 48 hours these documents were taken to the OMR room and Juggled before running in the OMR;
After data capture the checker inserted the questionnaires in their respective envelopes. When all the documents of a thana were complete, those were returned
Once the hard disk attached to the OMR was full, the data image of a thana (batch) was transferred to the Micro-processing system through the LAPLINK procedure.
Some observations of OMR operation are given below:
OMR stops running if
there are mistakes in
rate was 50% but ultimately it varied from
1 to 3%.
Rejection rate observed in running the documents of one thana without editing was 20%.
A batch is rejected if the error rate is more than 2%.
OMR converts one sheet into one big record of 223 characters. This record consists of some internal information, a sheet number, the household. Special continuation character indicates overflow. The product of OMR is a file consisting of an EA tally record followed by many household and personal
HARDDISK capacity of the OMR host computer (Everax) is 40 Mbyte. It can easily accommodate the records of two thanas at a time. When the hard disk of the OMR was full, the data of a thana was transferred to the micro computer system in either of two ways:
(1) by Laplink
or (2) by diskette.
The initial plan for transfer of data through LAN was abandoned. Instead both the Brooklyn Bridge and Laplink software were tried for transmission of data. Laplink worked well and has been installed. Currently it is experiencing a cable problem. The cipher tape drives that came with the OMR were disconnected and moved to computer rooms. They now serve to off-load intermediate and final data sets.
MACHINE SPEED is 7000 sheets per hour but achievement is 4000-5000 sheets per hour;
DAILY OUTPUT varies from 80,000 to 100,000 sheets.
MACHINE stops, loses its life time and the output is reduced because of the following problems:-
For any stoppage of the OMR, the running time of 100 to 150 documents is lost.
The following messages were shown on the monitor for the above causes:
Input hopper problem;
Probable solutions to the above problems:
Maintenance of dust free environment and cleaning by blower and vacuum
Seasoning of documents for 48 hours;
Running the dehumidifier for 24 hours during rainy seasons and 12 hours
Regular maintenance of
Adequate stock of spare parts;
Better editing and no marking over timing mark and skunk mark area;
Regular use of juggler;
Incentive for operators.
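Some of the observations above can be put into numbers. The sketch below is a rough illustration under stated assumptions: 1 Mbyte is taken as 1,000,000 bytes, each sheet is taken to produce exactly one 223-character record with no overflow records, the error rate is taken as errors divided by sheets, and the daily output is assumed to cover both machines running two 6.5-hour shifts (from the shift timings given earlier). The function names are illustrative.

```python
RECORD_CHARS = 223           # one sheet = one record of 223 characters
DISK_BYTES = 40 * 1_000_000  # 40 Mbyte hard disk (decimal megabytes assumed)
RATED_SPEED = 7000           # rated sheets per hour


def records_per_disk(disk_bytes: int = DISK_BYTES, record_chars: int = RECORD_CHARS) -> int:
    """Upper bound on sheet records that fit on the disk, ignoring other files."""
    return disk_bytes // record_chars


def utilisation(achieved_per_hour: float, rated_per_hour: float = RATED_SPEED) -> float:
    """Achieved speed as a fraction of rated speed."""
    return achieved_per_hour / rated_per_hour


def batch_rejected(errors: int, sheets: int, limit: float = 0.02) -> bool:
    """A batch is rejected if its error rate (errors / sheets, assumed) exceeds 2%."""
    return errors / sheets > limit


print(records_per_disk())                                      # 179372
print(f"{utilisation(4000):.0%} to {utilisation(5000):.0%}")   # 57% to 71%

# Daily output spread over 2 machines x 2 shifts x 6.5 hours (assumption)
machine_hours_per_day = 2 * 2 * 6.5
print(80_000 / machine_hours_per_day, 100_000 / machine_hours_per_day)
```

On these assumptions the daily output works out to roughly 3,100 to 3,800 sheets per machine-hour, somewhat below the 4000-5000 sheets per hour achieved while running, which is what repeated stoppages would produce.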
OMR operation was started in May, 1991. Data
capture from tally sheets was completed in June,
1991. Data capture from about 30 million main
census questionnaires was started in July, 1991
and was completed on 15th October 1992. The monthly
progress report of OMR is given in table T1.
T1 Progress Report of OMR
(Table values not recovered; surviving column headings: Type of Document, Output per machine per hour, # of sheets run.)
Error Statistics of OMR Operation are given
in tables T2, T3 and T4.
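T2 Sample Statistics of
(Table values not recovered; surviving row labels: Timing mark deviation, Deskew station jam.)
T3 Sample Statistics of
(Table values not recovered; surviving label fragment: "code wron or".)
T4 Error Statistics (Summary)
(Table values not recovered; surviving headings: Type of Error, Deskew station jam.)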
Integrated Approach to OMR and ICR for data capture
in 2000 round of Population Census
Five pragmatic steps have been recommended
by the National Statistical Council and are being
Both the short and the long questionnaires have to be designed in OMR/ICR format so that they can be read in both the
Data capture has to be made from 4 Divisional offices and high speed communication has to be established with the headquarters with essential technical and maintenance personnel
Data capture must be completed within 5 to 6 months without fail so that the report can be produced within 2 years of conducting the census. Thus necessary arrangements for backup machines, reinforcement of manpower and high speed data communication will be made.
Most of the data to be collected will be numeric codes and numbers, and only one or two fields will be in short form with block English characters, so that both ICR and dual recording OMR can capture the characters easily.
Sufficient fast moving spares will be made available to keep the machines running till the end of data capture.
To ensure timeliness, quality paper of 95 GSM
and a web press have been procured. For the ICR machine,
OCR for Forms software, and for the OMR
machine, SOSkit and SOSRes, have been procured
so that the captured data / image data can be
stored in ASCII format. Host computers with
512 MB RAM, 9 GB hard disk and 400 MHz processing
speed have been ordered, with sufficient fast
moving spares essential for data capture from
30 million documents. We have hired maintenance
engineers to be stationed in all the four Divisional offices.
There will be four regional data capture centres.
Each centre will have one OMR and one ICR machine.
One set of OMR and ICR will be used as a mobile unit.
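As a rough plausibility check on the 5 to 6 month target, the rated throughputs quoted earlier can be turned into machine-hours. This is a back-of-envelope sketch, not a plan: it assumes all 30 million documents are single sheets shared equally among the five OMRs, uses 7-hour shifts as in the 1981 operation, and the 60% efficiency factor is an assumption loosely based on the 4000-5000 out of 7000 sheets per hour achieved in 1991. The function names are illustrative.

```python
import math

DOCUMENTS = 30_000_000       # documents to capture
OMR_RATED_PER_HOUR = 8000    # DRS 800 rated throughput, A4 sheets per hour
ICR_PAGES_PER_MINUTE = 42    # ICR rated throughput
OMR_MACHINES = 5


def hours_per_machine(documents: int, rated_per_hour: float,
                      machines: int, efficiency: float = 1.0) -> float:
    """Running hours each machine needs if the load is shared equally."""
    return documents / (rated_per_hour * efficiency * machines)


def shifts_needed(hours: float, shift_hours: float = 7.0) -> int:
    """Whole shifts needed to cover the given running hours."""
    return math.ceil(hours / shift_hours)


at_rated = hours_per_machine(DOCUMENTS, OMR_RATED_PER_HOUR, OMR_MACHINES)
at_60_percent = hours_per_machine(DOCUMENTS, OMR_RATED_PER_HOUR, OMR_MACHINES,
                                  efficiency=0.6)

print(at_rated, shifts_needed(at_rated))            # 750 hours, 108 shifts
print(at_60_percent, shifts_needed(at_60_percent))  # about 1250 hours, 179 shifts
print(ICR_PAGES_PER_MINUTE * 60)                    # 2520 pages per hour per ICR
```

Under these assumptions each OMR would need on the order of 110 to 180 seven-hour shifts, so the target depends heavily on keeping all machines running, which is where the backup ICRs and the stock of spares come in.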
Population and Housing Census 2001 will have
a short questionnaire (Annex-A) of a single
sheet, which will be canvassed to all households
and population. It contains basic characteristics
of each household and individual members of
household. The long questionnaire (Annex-B)
will contain detailed questions on different
socio-economic characteristics of household
and individual members of household.
The long questionnaire comprises 6 sheets.
The short questionnaire has 28 questions. Out
of them 27 questions will be numerically coded
or numbered. Only the name of individual members
will be written in a maximum of 10 English capital
characters. Occupation and economic activity
and names and codes may be captured from the
long OMR questionnaire.
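The field constraints described above lend themselves to a simple validity check after recognition. The sketch below is illustrative only: the function names are hypothetical, and treating a name as 1 to 10 letters A-Z with no spaces or punctuation is an assumption, since the text only specifies a maximum of 10 English capital characters.

```python
import re

MAX_NAME_LENGTH = 10  # names are written in at most 10 English capital characters
NAME_PATTERN = re.compile(rf"[A-Z]{{1,{MAX_NAME_LENGTH}}}")


def is_valid_name(value: str) -> bool:
    """True if the recognised name is 1-10 capital letters A-Z (assumed rule)."""
    return NAME_PATTERN.fullmatch(value) is not None


def is_valid_numeric_code(value: str) -> bool:
    """True if the recognised field contains only ASCII digits."""
    return value.isascii() and value.isdigit()


print(is_valid_name("RAHIM"))          # True
print(is_valid_name("abdul"))          # False: not capitals
print(is_valid_name("MOHAMMADALI"))    # False: 11 characters
print(is_valid_numeric_code("042"))    # True
```

A check of this kind could flag sheets for manual review when the ICR output for the single character field falls outside the expected form.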
Four types of benefit can be ensured from
the integration of OMR and ICR technology:
It will develop a master identification database with unique name and numerical address of each individual of the country. This database can be used for demographic analysis, better sampling frame and identification
OMR speed will ensure timeliness and ICR will ensure completeness without
Substantial savings can be made in cost if it can be used for issuance of identification card, civil registration
Savings of manpower will
All types of reports can be published within 2 years from the date of completion of enumeration.