Workshop on Population Data Analysis, Storage and Dissemination Technologies
Bangkok, 27-30 March 2001

STAT/WDT/Pakistan
21 March 2001
ENGLISH ONLY

ECONOMIC AND SOCIAL COMMISSION FOR ASIA AND THE PACIFIC

Methods/technologies of Collection, Analysis, Storage and Dissemination of 1998 Census Data of Pakistan 1/
By: Muhammad Aslam Chaudhry & Rana Insaf Alikhan
Contents
  1. Introduction
  2. Data Collection Methodology
  3. Analysis Techniques
  4. Storage of data
  5. Dissemination
  6. Some findings
  7. Bottlenecks/problems
  8. Suggestions/Recommendations
  9. Annex-A: Information about data processing of 1998 Census

1/ This paper has been reproduced as submitted.  It has been issued without formal editing.
I. INTRODUCTION
1. The census history of Pakistan dates back to 1881, when the first scientific census was undertaken by the British regime in the area now comprising Pakistan.  From that time a regular decennial census was conducted.  Since the inception of Pakistan, the first census was conducted in 1951, followed by censuses in 1961, 1972 (due in 1971), 1981 and 1998 (due in 1991).
2. The planning of the 5th Census was initiated in 1987 with the consultation of users and the delimitation of the country into census areas for enumeration purposes.  At the initial stage many users, including federal and provincial governments, universities, research organizations and some NGOs, were contacted through correspondence.  Individual users were also given an opportunity for face-to-face discussion, to apprise them of the scope of the population and housing census and of the inability of the Population Census Organization (PCO) to meet some demands beyond the census scope.  They were, however, guided to contact the concerned agencies to meet such demands.  More than 100 users were also called to the Census Advisory Committee (CAC) after the questionnaires had been designed and tested in the field, and the questionnaires were discussed in the meeting along with the field observations.  In the light of the recommendations of the CAC, the questionnaires (forms) were finalized and submitted to the Cabinet for approval.
3. There were 4 questionnaires: house listing operation, big count, sample count and collection of information on housing and population from FATA.  The house listing questionnaire contained 10 questions, the housing census questionnaire 12, the big count questionnaire also 12 and the sample count questionnaire 38 questions.  The objectives of the house listing operation were to prepare an inventory of housing units/households for carrying out the subsequent census operation without omission or duplication of any building/housing unit, to provide guidelines to the field staff for identifying housing units in their assigned areas, to supervise the enumeration operation and to ensure adequate logistic supplies.
4. The main or short count questionnaire contained core questions, viz. name, relation to the head of household, residential status, sex, age, marital status, nationality, religion, mother tongue, literacy level, educational attainment and issue of National Identity Cards.  It also contained questions on individual households, such as number of rooms, type of tenure, period since construction, construction material of outer walls and roofs, and source of potable water, light and cooking fuel; on housing facilities, such as presence and type of kitchen, bathroom and latrine; and on the source of information (TV, radio or newspapers) used to keep aware of day-to-day developments.  The sample or long questionnaire, in addition to the above information, contained sensitive and difficult questions demanding detailed probing, such as district of birth, previous district of residence, period since living in the district, reasons for migration, field of education, occupation, industry of work, employment status, unemployment and the reasons for it, disability, children ever born, children still living, children born during the one year preceding the census date and children still living, by sex, and inoculation of children against six fatal diseases, namely tuberculosis, diphtheria, pertussis, tetanus, polio and measles.
II. DATA COLLECTION METHODOLOGY
5. For carrying out the census operation the entire country was divided into small areas, called census blocks, each comprising 200-250 households or 1,000 to 1,500 persons, without combining the area of a Mauza/Deh/Village with the area of another Mauza/Deh (in the case of revenue settled areas) or village (in the case of unsettled areas).  Any Mauza/Deh/Village smaller than the above size was declared an independent census area.  On average 5 to 7 blocks were merged to form a census circle, but within the limit of a Patwar Halqa.  Similarly 5 to 7 census circles were combined within a Qanungo Halqa, depending upon its size, to form a census charge.  In turn census charges were combined within an administrative district (excluding big urban areas and cantonments) to form separate census districts.  All big urban areas and all cantonments were notified as independent census districts.  The delimitation work was completed by mid 1990, followed by preparation of maps of census areas, their tracing and reproduction.  Because of the 7 years' delay in the census operation, the delimitation work was updated from time to time to accommodate likely population expansion.  All field use material and logistics were raised accordingly.
6. In the entire country (including FATA and Islamabad), Azad Kashmir and the Northern Areas, over 103 thousand blocks were delimited, falling in over 16,600 census circles, nearly 2,500 census charges and 180 census districts.  For coverage of these areas over 114 thousand enumerators, comprising low paid staff of the provincial revenue and education departments, around 18,600 circle supervisors, nearly 2,700 charge superintendents and 180 census district officers were appointed with the assistance of the concerned provincial governments.  Circle Supervisors were mostly senior Patwaries and teachers, Charge Superintendents were senior Qanungos and senior high school teachers, while Census District Officers were the Deputy Commissioners of the concerned census districts, Municipal Chiefs of the Municipal Committees/Corporations, Mayors of the Metropolitans and Cantonment Executive Officers of the Cantonment Boards.  They were given complete count training for three days in batches; the training continued for one month and ended a week before the census operation.  Those who could not attend the regular sessions were called for training during the last week preceding the census date.  Before the training of field staff, trainers were given a week-long training by the Master Trainers, including one day of field visits and one day for review of field experience and discussion of the problems faced.  Master Trainers were trained by the subject-matter specialists of the PCO.  In all, 27 master trainers and 824 trainers were trained, who imparted training to the field staff at 824 venues at 129 places throughout Pakistan, including Azad Kashmir and the Northern Areas.  Besides, the trainers were sent to the field to fill in questionnaires themselves.  These questionnaires were evaluated, and the errors committed were discussed and rectified on the last day of their session.
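As a rough cross-check, the delimitation totals above can be compared with the "5 to 7" merging rule described in paragraph 5.  The sketch below uses the rounded figures quoted in the text (the paper gives them only as "over" or "nearly"), so the ratios are approximate; the function and variable names are illustrative and not taken from the paper.

```python
# Rounded totals quoted in paragraph 6 (approximate: "over"/"nearly" in the text).
blocks = 103_000
circles = 16_600
charges = 2_500


def average_children(children: int, parents: int) -> float:
    """Average number of lower-level units per higher-level unit."""
    return children / parents


blocks_per_circle = average_children(blocks, circles)
circles_per_charge = average_children(circles, charges)

for label, ratio in [("blocks per circle", blocks_per_circle),
                     ("circles per charge", circles_per_charge)]:
    within_rule = 5 <= ratio <= 7
    print(f"{label}: {ratio:.1f} (within 5 to 7: {within_rule})")
```

Both averages fall inside the stated range, which suggests the reported totals are consistent with the delimitation rule.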
These 824 trainers were selected from 1,525 persons on the basis of their performance in the field and a quiz test given to them.  Of the 824 trainers, the top 83 were allowed to give training for the sample count.  This training was also given in batches of 30 persons each, with a duration of six days including field visits etc.  At each venue audio and video aids were provided to assist in training and to maintain uniformity in clarifying the concepts.  Because the census was postponed several times, the field staff was trained several times.  So before they were engaged in the census operation about 95 per cent of them had been trained at least seven times.  Obviously the cost per head had gone up, but with better census returns.
7. The assistance of the Armed Forces was sought for the smooth running of the census operation; they helped the census field staff with logistic supplies and the safe carrying of documents.  No major complaint was lodged from any corner, except for delay in snow bound and inaccessible areas and failure to carry out census work in some pockets of Balochistan, which were minor in nature.  The exception was a part of Quetta city, the population of which was estimated later for releasing the provisional results.
8. Besides Armed Forces personnel, staff of the Federal Bureau of Statistics, local administration, Registration Department and judiciary were engaged to supervise/check the field operation.  Census control rooms were established in census offices, and Monitoring Cells were created at the provincial capitals by the respective provincial governments for this purpose.  These control rooms functioned 24 hours a day during the census operation.  During this operation Census headquarters maintained close liaison with the concerned Armed Forces offices and the administrative hierarchy for attending to complaints.  The list of control rooms was published twice in leading newspapers and also in the Census journal 'Hum Loag', copies of which were widely distributed throughout the country.  The man-to-man pairing of Armed Forces personnel with census field staff also ensured door-to-door visits by enumerators for the collection of census information.
9. A foolproof arrangement was ensured through the sealing of census documents at the level of enumerators and their supervisors in the presence of Armed Forces representatives, minimising, if not eliminating, the chances of tampering with census information in the field.  Another stage where errors are likely to creep into the census information is data entry.  Such chances were altogether eliminated this time through direct scanning of filled-in questionnaires by Optical Mark Reader (OMR), which also reduced data entry time.  This was the first time the PCO used OMR for data entry in Pakistan; the questionnaire was therefore designed for OMR with the help of UNFPA advisors, who visited the PCO from time to time.  Thirty million forms for the big count were printed in the U.S.A. with the financial assistance of UNFPA.  However, with growing knowledge and experience, the sample count and FATA questionnaires were designed by the staff of the PCO and printed locally.  Around 15 per cent of the filled-in forms were rejected and transcribed onto blank forms because of mishandling in the field, multiple marking, blank entries, poor quality paper and carbonated ink used for printing, improper cutting of the critical edge of the forms, etc.
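Two of the rejection causes named above, multiple marking and blank entries, lend themselves to a simple automated edit check.  The paper does not describe how the OMR software implemented its checks, so the following is only a minimal sketch under stated assumptions: every question is treated as single-response, and the question names, option codes and the `edit_check` function are hypothetical.

```python
from typing import Dict, List


def edit_check(marks: Dict[str, List[int]]) -> List[str]:
    """Return reasons to reject a scanned form; an empty list means it passes.

    `marks` maps a question name to the option ovals detected as filled.
    Assumption: each question allows exactly one filled oval.
    """
    problems = []
    for question, filled in marks.items():
        if len(filled) == 0:
            problems.append(f"{question}: blank entry")
        elif len(filled) > 1:
            problems.append(f"{question}: multiple marking")
    return problems


# Hypothetical scanned forms (question names and option codes are invented).
good_form = {"sex": [1], "marital_status": [2], "literacy": [1]}
bad_form = {"sex": [1, 2], "marital_status": [], "literacy": [1]}

print(edit_check(good_form))
print(edit_check(bad_form))
```

A form failing such a check would be routed to manual transcription onto a blank form, as the paragraph describes.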
III. ANALYSIS TECHNIQUES
10. Our analysis ranges from simple techniques, such as working out rates, ratios and percentages, to the application of sophisticated and complex techniques.  For regular reports like the District Census Reports and the Provincial and National Census Reports, the entire presentation of data, in condensed format, mostly depends upon simple analysis.  Detailed analysis of data, however, is planned for presentation in a series of census monographs, hand books, a Children's Profile and a Women's Profile.  For projection of population, model life tables and the logistic curve will be used.  For working out drop out, wastage etc., economically active life and nuptiality, decrement life tables will also be prepared.
11. For the computation of age at marriage, the singulate mean age at marriage (SMAM) will be estimated by applying the U.N. method and the Hajnal method.
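The paper does not spell out the computation.  As an illustration only, the sketch below follows the standard form of Hajnal's formula, which derives SMAM from the proportions never married in five-year age groups; the U.N. method mentioned above is not shown.  The function name and the example proportions are hypothetical and are not census results.

```python
from typing import Sequence


def smam_hajnal(prop_single: Sequence[float]) -> float:
    """Singulate mean age at marriage from proportions never married.

    `prop_single` holds eight proportions (0 to 1) for the five-year age
    groups 15-19, 20-24, ..., 45-49, 50-54, in that order.
    """
    if len(prop_single) != 8:
        raise ValueError("expected 8 age groups, 15-19 through 50-54")
    # Person-years lived single before age 50: the first 15 years of life
    # plus 5 years for each group's share still single (15-19 .. 45-49).
    years_single = 15 + 5 * sum(prop_single[:7])
    # Proportion still single at exact age 50, averaged from 45-49 and 50-54.
    single_at_50 = (prop_single[6] + prop_single[7]) / 2
    # Remove those who never marry, then average over those who do.
    return (years_single - 50 * single_at_50) / (1 - single_at_50)


# Hypothetical proportions of never married women (not census results).
example = [0.80, 0.40, 0.15, 0.06, 0.03, 0.02, 0.02, 0.02]
print(round(smam_hajnal(example), 1))
```

As a sanity check, a population in which everyone is single through ages 15-19 and married from age 20 onwards yields a SMAM of exactly 20.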
12. For estimation of migration statistics our main reliance will be on simple rates, ratios etc.  However, net migration will also be estimated indirectly, by applying the national growth rate, as a counter check.
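One common reading of "applying the national growth rate" is the national growth rate method, in which an area's net migration is taken as the difference between its enumerated population and the population it would have had if it had grown at the national rate.  The sketch below assumes that reading, and further assumes that natural increase in the area equals the national average and that international migration is negligible.  All figures and names are hypothetical.

```python
def net_migration_national_growth(area_pop_start: float,
                                  area_pop_end: float,
                                  national_pop_start: float,
                                  national_pop_end: float) -> float:
    """Indirect net migration estimate for one area.

    Positive values suggest net in-migration, negative values net
    out-migration, relative to growth at the national rate.
    """
    national_growth_factor = national_pop_end / national_pop_start
    expected_end = area_pop_start * national_growth_factor
    return area_pop_end - expected_end


# Hypothetical figures in thousands: the nation grows by 50 per cent,
# while the district grows from 1,000 to 1,600.
estimate = net_migration_national_growth(1_000, 1_600, 100_000, 150_000)
print(estimate)
```

Because the estimate absorbs any difference in natural increase or coverage between the area and the nation, it is best used, as the paragraph says, only as a counter check on the direct migration tabulations.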
13. Fertility will be estimated by the Brass method, the reverse survival method and the Relational Gompertz Model.  Regression analysis and path analysis techniques will also be applied to fertility and some of its linked variables, to study the relationship of such variables with, and their contribution to, fertility behaviour.  For estimation of mortality two methods are likely to be used: Trussell's method for measuring child mortality and the widowhood method for the estimation of adult mortality.
IV. STORAGE OF DATA
14. Since the data was processed at two different places, i.e. the Population Census Organization Head Office and the Computer Centre of the Federal Bureau of Statistics, it was stored at these two places.  After scanning of the filled-in census forms at the Population Census Organization Headquarters, the data was stored on magnetic tapes, on the hard disks of 4 PCs (4.0 GB each, see Annex-A) and on 20 Zip disks of 100 MB each.  To save these storage devices from dust and humidity, a protected environment was created at both places.  Further safety measures were adopted, such as heat blowers for reducing the humidity level, fumigation for preventing pest attack, and an automatic alarm for indication of smoke and gases injurious to both documents and workers.  (For further details please see Annex-A.)
15. These storage media flowed continuously to the Computer Centre of the Federal Bureau of Statistics, which, after further processing, tabulation and analysis, stored the data on magnetic tapes, Zip disks and CDs.  For this purpose 250 tapes of 2400 ft length (7 MB capacity) each, 65 Zip disks of 10 MB each and 77 CDs of 650 MB each were used.
16. We are also using the Census Library for storage of data in book form, and supply one published copy of each report to the National Archive Department, Islamabad, which is responsible for storing all important national documents on various storage media, including microfiche films.
V. DISSEMINATION
a. Dissemination Programme
17. As planned, the last census report was scheduled to be published before December 2000.  However, transcription problems, late installation of some OMR machines, frequent electricity fluctuation, etc. caused a slight delay, and the last publication of the census report is now expected to be released a few months after its scheduled time.
b. Dissemination Methods
18. Census results are widely disseminated to all kinds of data users through print media.  Census bulletins/reports, such as District Census Reports, Provincial Census Reports and the National Census Report of Pakistan, are published and sent to Federal Ministries, Provincial Departments, universities, main libraries and NGOs for their use.  The census data is also provided on electronic media such as CDs, tapes etc., if required by any government department, private agency, international organization or member of the public.
VI. SOME FINDINGS
19. A census is a gigantic operation, so it is difficult to claim achievement of 100 per cent accuracy.  However, experience, improved methodologies and control of anticipated problems, especially field related ones, are the continuing endeavours on which a claim of improvement in data through successive censuses rests.  The ensuing paragraphs present some main findings in this regard.
20. The total area of Pakistan is 796,095 square kilometres.  The total population as enumerated in 1951 was 33,816,555 persons, which grew at an average annual growth rate of 2.4 per cent from 1951-61, 3.7 per cent from 1961-72, 3.1 per cent from 1972-81 and 2.69 per cent from 1981-98.  The population as enumerated in 1998 was over 132 million, with 56.1 per cent living in Punjab, 22.6 per cent in Sindh, 13.1 per cent in NWFP, 5.1 per cent in Balochistan, 2.6 per cent in FATA and 0.4 per cent in Islamabad.
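The quoted growth rates can be chained to see whether they lead from the 1951 count to the 1998 figure.  The sketch below is an approximation under two assumptions not stated in the paper: growth is compounded annually, and each intercensal interval is taken as the difference between calendar years (the actual census dates may differ by some months).

```python
# Figures quoted in paragraph 20.
POP_1951 = 33_816_555
# (start year, end year, average annual growth rate in per cent)
PERIODS = [
    (1951, 1961, 2.4),
    (1961, 1972, 3.7),
    (1972, 1981, 3.1),
    (1981, 1998, 2.69),
]


def project(pop: float, periods) -> float:
    """Compound the population forward through each intercensal period."""
    for start, end, rate_pct in periods:
        pop *= (1 + rate_pct / 100) ** (end - start)
    return pop


pop_1998 = project(POP_1951, PERIODS)
print(f"Implied 1998 population: {pop_1998 / 1e6:.1f} million")
```

Under these assumptions the implied figure lands close to the "over 132 million" enumerated in 1998, so the quoted rates and counts are mutually consistent.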
21. The population is unevenly distributed: it is sparsely settled in Balochistan, with 19 persons per square kilometre, and thickly settled in Punjab and Islamabad, with 231 and 376 persons per square kilometre respectively.  The population pressure per square kilometre of land has increased from 42 persons in 1951 to 166 persons in 1998.
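The national densities quoted above follow directly from the area and population totals in paragraph 20.  In the sketch below the 1998 population is entered as a round 132 million, since the paper says only "over 132 million"; that rounding is an assumption, and the constant names are illustrative.

```python
AREA_SQ_KM = 796_095            # total area quoted in paragraph 20
POP_1951 = 33_816_555           # enumerated population, 1951
POP_1998_APPROX = 132_000_000   # "over 132 million": rounded, so approximate


def density(population: float, area_sq_km: float) -> float:
    """Persons per square kilometre."""
    return population / area_sq_km


density_1951 = density(POP_1951, AREA_SQ_KM)
density_1998 = density(POP_1998_APPROX, AREA_SQ_KM)
print(f"1951: {density_1951:.1f} persons per sq km")
print(f"1998: {density_1998:.1f} persons per sq km")
```

Rounded to whole persons, the two results agree with the 42 and 166 persons per square kilometre reported in the text.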
22. The urban proportion has also increased, from 17.8 per cent in 1951 to 32.5 per cent in 1998.  The sex ratio has decreased during this period, from 116 males for every 100 females in 1951 to 108 males in 1998, which is evidence of better coverage of females in the latest census compared with the earlier ones.
23. The proportion of infants (under one) has also decreased, from 2.6 per cent in 1972 to 2.3 per cent in 1998.  The proportion of never married females seems to be continuously increasing: 14.8 per cent in 1972, 17.4 per cent in 1981 and 25.3 per cent in 1998.  The infant mortality rate is also declining rapidly, from over 140 per thousand in the early 1950s to 80 or below at the time of the last census.  The proportions of children below 15 and of the aged population (65 and over) are also decreasing.  These trends in the proportion of children, particularly infants, the percentage of never married females and the infant mortality rate are an indication of fertility decline, though it started a bit later than the decline in mortality.  The decline in proportion of children and aged population has increased the age dependency ratio from 84.8 per cent in 1972 to above 90 per cent in 1998.
24. The decline in the infant mortality rate has, in fact, resulted from the launching of the immunization programme against six fatal diseases.  Though health facilities have increased considerably, the incidence of some diseases, like cardiovascular and respiratory diseases, cancer and gastroenteritis, has increased over time, which might have affected the aged population (65 and over); alternatively, the change could be due to differences in content errors in the successive censuses, or both.
25. The literacy ratio has also increased, from a very low level of 16.7 per cent in 1961 to 45.0 per cent in 1998.  But we are still far below the desired level of literacy, even amongst the less developed countries of the world.  The gap would perhaps have been much wider if no conceptual changes had been introduced in the census over the period.  There is a sharp difference between the literacy ratios of the male and female populations.  The female ratio has increased from a very low level of 6.7 per cent in 1961 to 32.6 per cent in 1998, while the male ratio has doubled during this period.  Similarly there is a wide gap between the literacy ratios of urban and rural areas: 64.7 per cent in urban against 34.4 per cent in rural areas.  Punjab has the highest level of literacy among the provinces, followed by Sindh, NWFP and Balochistan.  However, the proportion of literate population is the highest in Islamabad, as registered in the last census.
VII. BOTTLENECKS/PROBLEMS
26. Though a lot of improvements have been made in the collection, processing and dissemination of data over the past censuses, we have yet to achieve the desired accuracy in data collection.  We used OMR technology for the first time in the 1998 Census, with almost no exposure of any staff member of the Population Census Organization to the designing of OMR questionnaires or to scanning through the machine, and we also faced problems due to frequent electricity failures.  Printing of questionnaires in the local market, to meet additional demand over and above the 30 million forms printed in the USA, created scanning problems and pushed up the rejection rate.
27. Another problem was misreporting of ages, especially of children, young unmarried females and older people.  This was mainly due to ignorance of actual age, compounded by illiteracy.  Even now 55 per cent of the population aged 10 years and over is not literate.
28. Accurate head counting will remain a problem as long as the constitutional linkage of the census population to the determination of National/Provincial Assembly seats, the allocation of national funds to Provinces and the civil servant recruitment quota remains intact.  Political gains and ethnic rifts will be additional problems affecting the accuracy of head counting.  Seasonal migrants will also remain a problem at district, and probably at provincial, level in the production of accurate census results.  Counting of the mobile population, especially at or near coastal areas, is also a problem in the enumeration of individuals.
VIII. SUGGESTIONS/RECOMMENDATIONS
29. For electricity control, an independent transformer needs to be installed for the Population Census Organization's own use, properly hooked up with each OMR system for a controlled supply to the system.
30. The Population Census Organization may install its own roller printing machine for printing OMR questionnaires and other documents meant for scanning.  Purchase of paper in the required quantity and to the required specifications may be ensured ahead of schedule to avoid any unforeseen problem.
31. Though eradication of illiteracy is the main solution for better age reporting, until a 100 per cent literacy level is achieved the Population Census Organization may explore and depend upon other efforts to create awareness amongst the masses of the benefits of accurate age reporting.
32. The three incentives affecting census results may be delinked through necessary amendments in the legislation, which is perhaps a very difficult job.  Alternatively, these incentives may be frozen for a certain period, as India did.
33. The problem of seasonal migrants can be controlled to some extent by asking migration questions on a 100 per cent basis, permitting readjustment of seasonal migrants at their present as well as their usual place of residence.
Annex-A
INFORMATION ABOUT DATA PROCESSING OF 1998 CENSUS
A) STORAGE OF CENSUS DOCUMENTS:
The registers & books of filled-in Census Forms retrieved from the field were stored in 554 steel racks, pre-labeled for each Admn./census district and properly arranged in 6 stores.
B) DATA ENTRY AND STORAGE OF DATA:
Four OMRs, OPSCAN-21 Model 75/100 of National Computer System (NCS), USA, were used for capturing the data from the filled-in Census Forms.  The data was stored on the hard disks (HDs) of the Pentium PCs attached to the OMR machines.  The storage capacity of each HD is 4.0 gigabytes (GB).
C) TOTAL RECORDS:
The total number of records fed for data production is about 24 million, equal to the total number of filled-in forms (Long & Short).  The record length of the Short Form is 204 characters and that of the Long Form is 347 characters.  The details of separate Population & Housing records will be provided once the tabulation for Pakistan is available from the Federal Bureau of Statistics.
D) REJECTION RATE:
The rejection rate of data scanning remained about 10 to 15 per cent for the forms printed by NCS, USA and 15 to 30 per cent for locally printed forms. The reasons are as under:
  1. General Reasons:
    1. Improper manual editing of the Forms.
    2. Seasonal impacts, i.e. humidity, extreme cold/heat.
    3. Dust particles.
    4. Voltage problems between Phase, Neutral and Ground.
  2. Reasons for American Printed Forms:
    1. Failure of Edit Checks due to multiple marking and blankness.
  3. Reasons for Locally Printed Forms:
    1. System errors i.e. mis-alignment of timing marks & ovals and faulty cutting. 
    2. Printing errors i.e. light/dark printing and use of improper ink. 
    3. Cutting errors 
    4. Paper errors i.e. low/high grammage. 
    5. Failure of edit checks
E) TRANSFER OF DATA:
Most of the data was transferred on 767 magnetic tapes of length 2400 feet and 3600 feet, which accommodated data files of approx. 7 MB and 11 MB respectively.  However, the data files of some districts were also transferred on 6 Zip disks of 100 MB each.  Moreover, 13 Zip disks of the same capacity were used for back-up when the hard disks of the PCs overflowed.  Also, 264 floppy diskettes (1.44 MB each) were used for supplementary data files.
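The record counts in section C and the tape figures above can be compared as a rough plausibility check.  The paper gives neither the split between Short and Long Forms nor the split between 2400 ft and 3600 ft tapes, so the sketch below computes only lower and upper bounds.  It assumes one byte per character and 1 MB = 1,000,000 bytes; both are assumptions, and the constant names are illustrative.

```python
RECORDS = 24_000_000             # "about 24 million" filled-in forms (section C)
SHORT_LEN, LONG_LEN = 204, 347   # record lengths in characters (section C)
TAPES = 767                      # magnetic tapes used for transfer
TAPE_MB_2400FT, TAPE_MB_3600FT = 7, 11
MB = 1_000_000                   # assumption: 1 MB = 10**6 bytes, 1 byte per character

# Bounds on raw data volume: all records short vs. all records long.
data_low_mb = RECORDS * SHORT_LEN / MB
data_high_mb = RECORDS * LONG_LEN / MB

# Bounds on tape capacity: all tapes 2400 ft vs. all tapes 3600 ft.
tape_low_mb = TAPES * TAPE_MB_2400FT
tape_high_mb = TAPES * TAPE_MB_3600FT

print(f"Raw data volume: {data_low_mb:,.0f} to {data_high_mb:,.0f} MB")
print(f"Tape capacity:   {tape_low_mb:,} to {tape_high_mb:,} MB")
```

The two ranges overlap, which is in line with the statement that most, but not all, of the data went onto tape, with Zip disks and floppy diskettes taking the remainder.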
F) SAFETY/SECURITY MEASURES:
The following safety measures were taken for storage of census documents:
  1. Census Forms:
    1. The sensors of the Smoke/Fire Alarm System were installed in each store and OMR room.
    2. Smoking was strictly prohibited in the store & OMR rooms.
    3. The issuing of documents from the store was under the direct control of a responsible officer.
    4. No outsider was allowed to enter the store and OMR rooms without prior permission.
    5. The documents were fully protected from dust and humidity for an easy scanning.
  2. Input/Output Storage Devices:
    1. The input/output storage devices i.e, tapes, Zip Disks, floppy diskettes, were under the safe custody of responsible officers.
    2. These devices were stored under the required/proper environments.
    3. The data files were created with the unique names and every data tape was allotted a unique serial number to avoid any loss of data.
    4. The data file names were anonymous and the files were stored in an undisclosed directory to prevent access by unauthorized persons.
    5. The backup of all software & applications pre-loaded on the hard disk of the PCs. attached with the OMR machines has been preserved on 2 Zip disks, each of 100 MB.

NOTE: The quoted figures are tentative and are likely to change slightly.
 