The application of computer technology in the field of population statistics has a long and rich history. After several decades of dependence on mainframe computers and centralized electronic data processing, the national statistical offices (NSOs) were offered a number of new technological options during the 1980s. Significant among these developments were the advent of microcomputers and the availability of an array of packaged software, in both the commercial and public domains. During the past decade the application of information technology to the production and dissemination of population statistics has enabled the NSOs to realize tremendous gains in the quality and timely availability of data. Those who capitalized on the emerging opportunities and adopted new and innovative approaches accomplished remarkable improvements, not only in processing census and survey data but also in designing and managing surveys and in disseminating and analyzing their data.
Nevertheless, a large number of developing countries were left behind in terms of effective utilization of information technology (IT). For instance, in processing the censuses of the 1990 decade, many of them utilized resource-demanding but not necessarily efficient combinations of tailor-made in-house programs and mainframe computers. The networked computing environment has also not yet reached many NSOs in developing countries. Until the late 1980s the cost of hardware and software might have been one of the important factors which left many NSOs in developing countries lagging behind in IT utilization. Invariably, however, the primary factors were the absence of an IT policy and the lack of a pool of adequately trained and experienced staff for designing and developing modern data processing systems. This is becoming evident now that computer hardware and software have become significantly more affordable and more widely available.
Apart from technology, the design and implementation of data processing depend heavily on trained and skilled human resources, appropriate systems, and the approach adopted. In the past, many developing countries were able to receive international financial and technical assistance that met some of their human resources and hardware needs for processing census data. However, with dwindling prospects for the future availability of such assistance to countries in the Asia-Pacific region, it is becoming increasingly important for the NSOs to concentrate on developing appropriately skilled human resources and policies which will enable effective utilization of IT.
In the field of population statistics, data collection, compilation, processing and dissemination have not undergone drastic changes, but the systems and approaches utilized to accomplish those steps vary from country to country. This variation in approaches to systems development has a direct bearing on the quality, timely availability and usefulness of the statistical outputs and services of the NSOs. In cases where the staff had little exposure to contemporary techniques of data processing and electronic communication, the effective utilization of IT lagged behind. On the other hand, in the wake of technological advances, the escalating demand for information and the growing sophistication of data users are compelling the data producing agencies to improve their products and services. The project on "Application of New Technology in Population Data Collection, Processing, Dissemination and Presentation" has identified the sharing of country experiences in the effective utilization of modern technology as one of the useful approaches for promoting the improvement of population statistics in the ESCAP region.
The Working Party is invited to discuss and identify IT applications to population data -- such as data collection, processing, analysis and dissemination -- which could be implemented in three pilot countries selected by it. The aim is to develop suitable applications in the selected pilot countries based primarily on existing commercial and public domain software. Often the mere availability of software and hardware does not meet all the requirements of population statisticians, and a lot of preparatory work is needed to develop a suitable and effective system. For a selected application, the work in the pilot countries would in the first place involve the assessment of various hardware/software options and system designs. For example, a data capture system may be fully automatic or a combination of manual and automatic methods. Similarly, for making large quantities of data available, there are various options for on-line access or dissemination through optical media. There are a number of software packages that can be used to develop a system of dissemination and analysis. The work in pilot countries in this case may involve the review and testing of a number of factors, including the quantity of data, the need for compression, the nature of the indexing and loading component, the database query features, and the extent of the tabulation and analysis features. In other words, the focus is on systematic planning and development of a suitable system.
The applications may involve a national data collection or a subset of it in the pilot country, for which the designated local institution will undertake activities, under the guidance of the Working Party, for their adaptation and refinement. The project has a provision of up to US$20,000 for each pilot country to use as seed money. However, to realize an effective implementation of this component, the willingness of the pilot countries to participate and their commitment of the required personnel and other resources are essential. The Working Party can play an active role in facilitating the transfer of technology and knowledge from the advanced countries and in arranging the provision of technical assistance. It can identify countries from which the applications could be borrowed, as well as the ones which could be approached for assistance. The work in the pilot countries will also be facilitated by direct assistance provided by the members of the Working Party and their parent organizations.
A brief discussion of various IT applications to population data follows, focusing on selected applications and issues.
II. IT Applications to Population Data
In developing countries, the direct key-in of data from questionnaires remains the dominant mode of data entry. The use of microcomputers with special data entry software has furnished several options. A clear understanding of the various options, their merits and limitations is very important before the NSO adopts a system of data entry. Data entry software provides the opportunity for interactive error checking, automatic duplication of fields and the introduction of other factors. However, there is a serious danger of over-programming, that is, of building so many checks into the entry program that keying is slowed down.
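The interactive error checking mentioned above can be illustrated with a minimal sketch. It is not taken from any particular data entry package: the field names, the code values and the minimum age for an ever-married person are all assumptions made for the example, and a real application would take its rules from the questionnaire's code book.

```python
# Hypothetical code lists and limits, for illustration only.
VALID_SEX = {1, 2}                 # assumed coding: 1 = male, 2 = female
VALID_MARITAL = {1, 2, 3, 4}       # assumed coding: 1 = never married
MAX_AGE = 120                      # assumed upper limit for a keyed age
MIN_AGE_EVER_MARRIED = 12          # assumed consistency threshold


def check_record(record):
    """Return a list of error messages for one keyed person record.

    An empty list means the record passed every check.
    """
    errors = []

    # Range check on age.
    age = record.get("age")
    age_ok = isinstance(age, int) and 0 <= age <= MAX_AGE
    if not age_ok:
        errors.append("age out of range")

    # Code-list check on sex.
    if record.get("sex") not in VALID_SEX:
        errors.append("invalid sex code")

    # Code-list check on marital status, then a cross-field consistency check.
    marital = record.get("marital_status")
    if marital not in VALID_MARITAL:
        errors.append("invalid marital status code")
    elif age_ok and marital != 1 and age < MIN_AGE_EVER_MARRIED:
        errors.append("ever-married person below minimum age")

    return errors


if __name__ == "__main__":
    print(check_record({"age": 8, "sex": 3, "marital_status": 2}))
```

The sketch deliberately holds only three kinds of check (range, code list, one cross-field rule). Keeping the rule set small at the keying stage is one way to limit the risk of over-programming; more elaborate edits can be left to a later batch editing step.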
Optical mark reading (OMR) has been used by several developing and developed countries of the ESCAP region for capturing data from censuses, surveys and civil registration records. However, its effective application depends on the availability of paper of special quality, a specially designed questionnaire, the climate in the country, and the transport infrastructure. The development of an OMR-based data capture system requires a number of other considerations -- such as whether the questionnaires should be marked by the respondent or by the enumerator, the colours that can be used to do the marking, and how the rejection rate may be kept at acceptable levels. Also important are the issues of the availability of equipment, repairs and support from the vendor. For a country contemplating the use of OMR for the first time, answers to many of these issues can be found in others' experiences, but their applicability and many other factors have to be worked out according to local circumstances, and the scheme has to be tested before the NSO adopts it as a viable system.
Optical character reading (OCR) technology has also been in existence for more than two decades now. Its initial problems with the recognition of certain numerals and letters, which depend on the writing style of the enumerators and respondents, still linger. However, compared with OMR, this technology has received greater attention from developers and probably holds better prospects. Some countries in the ESCAP region have experimented with mixed approaches involving combinations of computer-assisted keyboard data entry, OMR and OCR. The Working Party may wish to examine whether the transfer of that knowledge and experience could be a useful exercise for one of the pilot countries. There are further developments involving imaging techniques and scanner devices, together with OCR software, which have been used for data capture. In view of their reported cost effectiveness and low error rates, it might also be explored whether there is experience with them in the region.
For the data collection stage, some national statistical offices have already experimented with and utilized computer-assisted personal interviewing (CAPI) and computer-assisted telephone interviewing (CATI) for surveys, or electronic data interchange (EDI) for such purposes as the compilation of vital statistics.
Data scrutinizing, coding and editing
The operations of scrutinizing the questionnaires and coding the responses have an important bearing on the quality and timely availability of the data. The use of precoded responses is on the rise, but there are certain variables for which textual responses contribute to delays in data processing, such as those concerning occupation and industry. Some countries in the ESCAP region have experimented with computer-assisted and automatic coding, and that rich experience may be shared with others under the guidance of the Working Party.
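As a rough illustration of what computer-assisted coding of textual responses can involve, the sketch below matches a normalized occupation response against a small dictionary and refers everything else to a human coder. The dictionary entries and the codes are invented for the example and do not come from any real classification; exact matching after normalization is only one simple technique and is not a description of any country's system.

```python
import re
from typing import Optional

# Hypothetical code list, for illustration only. A real system would load
# the national occupation classification and its index of job titles.
OCCUPATION_CODES = {
    "primary school teacher": "T01",
    "rice farmer": "F02",
    "taxi driver": "D03",
}


def normalize(text: str) -> str:
    """Lower-case the response, replace punctuation by spaces and
    collapse repeated whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def code_occupation(response: str) -> Optional[str]:
    """Return the code for an exact dictionary match after normalization.

    None means no match was found and the response should be referred
    to a human coder.
    """
    return OCCUPATION_CODES.get(normalize(response))


if __name__ == "__main__":
    for answer in ["  Rice  Farmer. ", "Taxi-driver", "shopkeeper"]:
        print(repr(answer), "->", code_occupation(answer))
```

In this arrangement the program codes only the responses it can match with certainty, so the share of responses left for manual coding becomes a measurable quantity that can be tracked as the dictionary grows.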
There are some software packages which allow editing together with interactive data entry. However, the implication of utilizing such a system is that highly skilled data entry operators and supervisors are required. Thus training becomes an important issue. The information from country experiences on addressing these issues, as well as on the gains in terms of quality and timely processing of data, needs to be shared in the ESCAP region.
The establishment of a geographic information system (GIS) for population data cannot be realized by the mere acquisition of software. Rather, it is a system requiring base maps, hardware, software, procedures and skilled personnel. It has great potential for improving the capture, management, manipulation, analysis and display of spatially referenced data. Apart from the adequacy of skills, the application of GIS technology also depends on the resources available to the NSO and the cooperation it receives from other agencies which maintain base maps. It is therefore important that the NSO adopts a cautious approach in developing GIS applications suitable for its needs. Cooperation should be sought from other NSOs which have experience in this area to address issues such as the roles of the NSO and the other interested agencies, the selection of hardware and software, and skills development. A start can be made with a simple and robust design using a suitable GIS package with which experience already exists in other NSO(s). A country seriously considering GIS use and willing to devote resources can benefit as a pilot country in the project. With advice and support from the Working Party and the transfer of technology and knowledge from experienced countries, a pilot system may be developed and tested.
For several decades NSOs have been providing data through electronic media, such as magnetic tapes, to supplement the published results of censuses and surveys. Today the popular choices for data dissemination are diskettes, online access and optical media. Each of these media options has its advantages and limitations. No matter which medium is used and how the data are organized, the information should be useful, easily retrievable and easy to manipulate. Disseminated electronic content can be as simple as a copy of a census/survey publication in a suitable character or image format. It can be supplemented by time series data or data from other sources. Compact disks are preferred over diskettes for large volumes of data; they also have room to carry software for data retrieval, display and manipulation. A more advanced mode of dissemination may consist of disseminating data together with maps, so that thematic maps and those needed for census/survey field work can be produced. The use of the Internet and bulletin boards for on-line dissemination has been attempted by some countries in the ESCAP region.
Since dissemination technologies are evolving rapidly, countries should make an effort to learn from experiences gained elsewhere in testing and applying them. The Working Party may wish to discuss possible projects and identify countries which would benefit as pilot countries.
Large volumes of data are best handled in normalized and indexed databases. Besides the performance aspects, that is, the ability to retrieve data fast, the database design must take into account the needs of users. With the improved availability of database management software and interfacing tools, the development of more user-friendly systems has become easier.
A well-designed database system improves the utilization of data by facilitating easy online queries and the production of traditional outputs in new formats. Several factors must be considered and a number of issues must be resolved in designing population databases. Information on the target audience, their requirements, their level of computer skills, their hardware facilities, and the like must be collected and understood in the first place. Decisions are then to be made concerning the preferred medium of storage, the handling of the metadata, the data to be included and their structure, the query features, and the database software, commercial or otherwise, to be used.
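To make the idea of a normalized and indexed population database more concrete, the following minimal sketch uses the SQLite engine bundled with Python. The table layout, the column names, the area names and the figures are all invented for illustration; they are assumptions of the example and do not represent any NSO's design or data.

```python
import sqlite3


def build_demo_database():
    """Create an in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Area names are stored once and referenced by key (normalization).
        CREATE TABLE area (
            area_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE population (
            area_id   INTEGER NOT NULL REFERENCES area(area_id),
            sex       TEXT    NOT NULL,
            age_group TEXT    NOT NULL,
            persons   INTEGER NOT NULL,
            PRIMARY KEY (area_id, sex, age_group)
        );
        -- Index to speed up retrieval by age group.
        CREATE INDEX idx_population_age ON population(age_group);
        """
    )
    conn.executemany(
        "INSERT INTO area VALUES (?, ?)",
        [(1, "District A"), (2, "District B")],
    )
    conn.executemany(
        "INSERT INTO population VALUES (?, ?, ?, ?)",
        [
            (1, "F", "0-14", 120), (1, "M", "0-14", 130),
            (2, "F", "0-14", 80), (2, "M", "0-14", 90),
            (1, "F", "15-64", 300), (2, "M", "15-64", 310),
        ],
    )
    return conn


def population_by_area(conn, age_group):
    """Return (area name, persons) totals for one age group, by area."""
    cursor = conn.execute(
        """
        SELECT a.name, SUM(p.persons)
        FROM population AS p
        JOIN area AS a ON a.area_id = p.area_id
        WHERE p.age_group = ?
        GROUP BY a.name
        ORDER BY a.name
        """,
        (age_group,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    demo = build_demo_database()
    print(population_by_area(demo, "0-14"))
```

A query function of this kind is the sort of building block on which a user-friendly retrieval interface could be layered; which queries to offer would follow from the assessment of the target audience described above.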