|Training Manual on Disability Statistics
|Learning Objectives: Chapter 9
|Analysing and disseminating disability statistics
After reading this chapter, the reader should be able to:
- Describe methods for presenting disability results
- Outline procedures for analysing disability data
- Describe dissemination strategies for disability information
|9.1 Planning the outputs
Many developing countries do not put enough money into data compilation, analysis and dissemination. If the data collected is not analysed and disseminated, the resources used in planning, development and data collection are wasted. This is a problem of overall data collection planning, the key to which is effectively allocating resources to the design, collection, processing, analysis, and dissemination phases.
Tabulation and analysis of data must be carefully planned during the early stages, not decided upon after the data has already been collected. The scope of analysis depends on the statistical variables (or data items) that the analyst can correlate. The output data items must be chosen at the planning stage of a collection. A well-planned data collection activity securely links the input of questions asked and data items derived with the output of statistical tables and usable information.
As always, decisions about which outputs are appropriate, how the individual data outputs are to be correlated, and how this material should be presented, all centre on the needs of the ultimate user of the data.
9.1.1 Output items to meet user needs
What information is usable depends on the questions the data users want answered. In general, for disability policy, the questions range from basic prevalence to particular questions about the impact of disability in specific areas of life. Here are some examples, depending on the variables included in the data collection instrument:
- How many people are there with disabilities in the population?
- How does the prevalence of disability vary by age, sex and rural or urban residency?
- What is the prevalence of severe disability in the population? How does it vary by age, sex and area of residence?
- What proportion of households has a member with a disability?
- How do persons with disabilities compare with others in terms of major socio-economic indicators?
- To what extent are persons with disabilities receiving effective assistance services?
- To what extent do persons with disabilities experience participation restrictions, e.g. unemployment, exclusion from school, or inability to use public transport?
The kinds of data items that these and other policy questions require range from the most basic identification of populations of persons with disabilities, and their prevalence, distribution and characteristics, to highly detailed data sets which are possible mostly in a survey and not in a census. Some of the data items include:
- prevalence of specific activity limitations in seeing, hearing, walking about, moving from one room to another, speaking, communicating, learning, and so on;
- underlying cause of the impairments - congenital, disease or infection, injury or trauma;
- severity of the disability;
- age of onset of disability;
- need for and use of medical and rehabilitative services, and personal assistance;
- need for and use of assistive devices;
- quality of life (or socio-economic profile) of persons with disabilities, compared with persons without disabilities; and
- barriers to full and equal participation in society in areas such as education, work, housing, transportation and the political sphere.
9.1.2 Output tabulations
Once the questions and kinds of information are settled on, it is possible to begin the task of identifying relevant cross-tabulations. Cross-tabulations can be specified in terms of a population, one or several output data items and, where appropriate, the counting unit to be used. For example, for the total population, one can identify the basic or minimum cross-tabulations based on age, sex and disability status.
A set of disability statistical tables for censuses was suggested in the United Nations' Principles and Recommendations for Population and Housing Censuses (Revision I). The suggested tables, with age, sex and urban/rural residence as the main variables, are the following:
- Total population, by type of disability, geographical division, urban/rural residence, whether living in household or institution, age and sex.
- Households with one or more persons with disability, by type, size of household and urban/rural area.
- Total population 15 years of age and over, by type of disability, marital status, urban/rural area, age and sex.
- Population with disability, by cause and type of disability, urban/rural area, age and sex.
- Population 5 to 29 years of age, by school attendance, type of disability, urban/rural area, age and sex.
- Population 5 years of age and over, by educational attainment, type of disability, urban/rural area, age and sex.
- Population 15 years of age and over, by activity status, type of disability, urban/rural area, age and sex.
For data on persons with disabilities, cross-tabulations can be based on:
- Age, sex, cause and type of disability,
- Age, sex, and severity of disability,
- Unmet need for health services in the last 12 months, severity of disability, and so on.
Similar data items can be applied to more narrowly defined populations, for example populations of people with specific impairments. However, highly detailed cross-tabulations should be limited for surveys, especially if the sample size is small or the sampling design is not appropriate for generating statistics at lower geographical divisions. It is always advisable to examine each cell in the table. Some categories may need to be collapsed so that cells with zeroes or small frequencies are avoided.
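Collapsing categories can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the age groups, the counts and the threshold of 5 are hypothetical, and the merging rule (fold an under-sized category into its neighbour) is one possible approach, not a prescribed standard.

```python
def collapse_adjacent(categories, min_count):
    """Merge under-sized categories into their neighbours.

    categories: ordered list of (label, count) pairs.
    min_count:  hypothetical minimum acceptable cell frequency.
    """
    merged = []
    for label, count in categories:
        if merged and merged[-1][1] < min_count:
            # The previous cell is still too small: fold this one into it.
            prev_label, prev_count = merged.pop()
            merged.append((f"{prev_label}, {label}", prev_count + count))
        else:
            merged.append((label, count))
    # A small final category is folded back into the one before it.
    if len(merged) > 1 and merged[-1][1] < min_count:
        last_label, last_count = merged.pop()
        prev_label, prev_count = merged.pop()
        merged.append((f"{prev_label}, {last_label}", prev_count + last_count))
    return merged


# Hypothetical sample counts of persons with a specific impairment, by age group.
age_cells = [("0-4", 0), ("5-14", 3), ("15-29", 12), ("30-59", 40), ("60+", 2)]
collapsed = collapse_adjacent(age_cells, min_count=5)
for label, count in collapsed:
    print(label, count)
```

The total count is unchanged by the merging; only the level of detail is reduced.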
For surveys, it is always advisable to include a table on sampling error, which is usually expressed in terms of standard errors and confidence intervals. This information tells users the range within which the true value for the population is likely to fall.
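As a rough illustration of how a standard error and confidence interval might be derived for a prevalence estimate, the sketch below uses the usual formula for a sample proportion. It assumes simple random sampling, which rarely holds for household surveys; a complex design would need design-based variance estimation. The counts (150 persons with disabilities in a sample of 2,000) are hypothetical.

```python
import math


def prevalence_with_ci(n_with_disability, n_sample, z=1.96):
    """Return (estimate, standard error, lower, upper) for a sample proportion.

    Assumes simple random sampling; z=1.96 gives an approximate 95% interval.
    """
    p = n_with_disability / n_sample
    se = math.sqrt(p * (1 - p) / n_sample)
    lower = max(0.0, p - z * se)
    upper = min(1.0, p + z * se)
    return p, se, lower, upper


# Hypothetical figures for illustration only.
estimate, se, lower, upper = prevalence_with_ci(150, 2000)
print(f"prevalence {estimate:.1%}, standard error {se:.2%}, "
      f"95% CI {lower:.1%} to {upper:.1%}")
```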
The counting unit is the unit used to quantify the cross-tabulation data - most commonly in terms of appropriate population numbers (ones, hundreds, thousands, millions), or percentages and rates. For surveys, since the generated statistics are only estimates based on samples, percentages or rates are preferred. If numbers are to be included, they should be rounded off to hundreds or thousands.
The important thing to remember when designing cross-tabulations of data is that it is not an exercise in arithmetic. If a statistical collection uses six variables, then the challenge is not to present a series of tables in which all possible combinations of these six variables are displayed as two-by-three tables. Rather, the challenge is to produce tabulations that have a purpose and enable the end user to address issues of interest. Each tabulation should be there for a reason: to provide data relevant to a purpose or issue. To ensure this, it is recommended to state, under each tabulation, the purpose and underlying issue that it has been designed to address.
9.1.3 Graphical representation of disability data
The final step is the presentation of the outputs for the ultimate data user. Graphical methods of presentation can provide the user with a clear picture of the significance of the data, highlighting aspects of the data that might otherwise be invisible. An example of a graphical output summarising disability data is presented in Box 9.1.
Box 9.1: Summary of Disability Data in Australia, 2003
Source: Disability, Ageing and Carers: Summary of findings, Australia, 2003 (Cat. No. 4430.0).
The tree diagram in Box 9.1 shows how the two populations (those with and those without disabilities) are compared, which can be useful when investigating issues of equalization of opportunities. The further subdivisions create increasingly refined subpopulations in terms of kind of restriction, severity, and, finally, whether the person is living in a household or in cared accommodation. All these data are presented clearly and quickly in graphical format.
There is often more than one way to present disability data. It is useful to think carefully about the aspect of the data that is important. For example, both diagrams in Box 9.2 present an age profile of persons with disabilities, but emphasize different aspects of the data.
Box 9.2: Age Profile of Persons with disabilities: Two Diagrams
2001 Census of Population and Housing, Sri Lanka
Survey of Disability, Ageing and Carers, Australia, 2003
The first diagram in Box 9.2 compares the age distribution of the general population with that of persons with disabilities. It shows that the distributions are very different and that, understandably, a large proportion of persons with disabilities are in the higher age groups. The second diagram presents a graph of disability prevalence rates by age and sex. This diagram shows clearly that the disability prevalence rate for both males and females aged 85 years and over is greater than 80 per cent. The same data is presented, but the impact is very different.
9.1.4 Map representation of disability data
When comparing statistics by geographical subdivision (e.g. regions, provinces or towns), one very effective way of presenting the results is through statistical maps. Such maps use different colours to represent different values, so that areas with the highest and lowest values can be identified easily. For instance, the statistical map in Box 9.3, which shows disability rates in the different regions of the Philippines, uses several colours, with dark orange representing the highest disability rate and the lightest colour the lowest. The regions with the highest disability rates are Eastern Visayas, Western Visayas, Mimaropa, and the Bicol Region (1.46 to 1.74 percent).
Box 9.3: Disability Rates by Region, Philippines: 2000
|9.2 Analysis - Turning data into information
Statistical agencies are now taking a greater role in analysing the data their instruments collect. Rather than solely producing tabulations and handing over raw data to other agencies for analysis, statistical agencies are now more in the business of "adding value" to their data by engaging in preliminary analyses. This can be as simple as converting the number of persons with disabilities into a percentage of the overall population, or as complex as employing sophisticated mathematical modelling techniques to interpret the data.
How do we turn disability data into disability information?
9.2.1 High quality data
The value and usefulness of output information depends on the quality of the input data. Previous chapters in this manual have outlined the phases in the data collection process - from consultation with clients, to sampling design, testing, and derivation of tabulations - that are designed to yield good data.
Disability statistics, perhaps more than any other area of social statistics, is vulnerable to distortions of data. Worldwide prevalence rates vary remarkably. Although in part this reflects real differences in chronic and infectious disease patterns, differential life expectancy, age structures, nutritional status, exposure rates to environmental hazards, war, and other public health problems, most of the differences can be traced to the quality of the data.
Differences in the operationalization of disability, screening procedures, collection methods, and different methods of calculating disability rates, produce different prevalence estimates. Even within countries, different studies have produced widely different estimates of disability prevalence because they have used different definitions, instruments and procedures to collect the data. For instance, as earlier discussed, estimates of the percentage of persons with disabilities are lower when impairment questions rather than disability questions are used to identify persons with disabilities. This explains in part why the reported disability prevalence rates of Africa and Asia are lower than those of Europe and North America. In addition, when impairment questions are used for screening purposes, the resulting disability rates for men are higher than those of women. In contrast, when activity and participation screening questions are used, rates are similar for women and men, and in some cases disability rates for women are higher.
It is widely believed, both among statisticians and among organizations of persons with disabilities, that the prevalence rates derived from impairment-based data collections substantially under-report actual disability prevalence, thus compromising the quality and usefulness of the statistics.
9.2.2 Pitfalls in analysing disability data
Even with the highest quality disability data, the analyst must be aware of potential pitfalls in the analysis of this data. Many of the traps described below are standard and apply to the analysis of statistical data about any subject matter. Some of the traps, however, are particular to disability data analysis.
A common analytical mistake is to define the wrong population for the issue under analysis.
A good example is the analysis of disability across the population in terms of data collected for people who access rehabilitation services. As mentioned in the earlier chapter, this population is likely to be composed of people with severe disabilities, and only those who receive services. Not included are those with severe disabilities who do not receive rehabilitation services as well as those with mild or moderate disabilities who do not require the specific types of support provided by these services.
Additionally, as explained in the previous chapter, a survey of people living in private households (not institutions) understates disability prevalence, since older people and people with severe disabilities are more likely to be living in institutions.
More generally, a common source of error in analysis comes from ignoring the effect of sampling variability. No sample is perfectly representative; there is bound to be some difference between the estimate of disability prevalence in a sample and the prevalence in the whole population. In a survey in New Zealand in 1996, for example, the disability rate was 19.1 percent in urban areas and 18.6 percent in rural areas. Was this a genuine difference, or merely a fluctuation caused by sampling variability? In this case, the difference was not found to be statistically significant; but even if it had been, it would still be relevant to ask whether it reflects reality.
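One common way to check such a difference is a two-proportion z-test. The sketch below is illustrative only: the urban and rural rates are those quoted above, but the sample sizes (5,000 and 2,000) are hypothetical, since the actual sizes are not given here, and the test assumes simple random sampling rather than the design actually used in the survey.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Two-sided z-test for the difference between two sample proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Rates from the text; sample sizes are assumed for illustration.
z, p_value = two_proportion_z(0.191, 5000, 0.186, 2000)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.2f}")
if p_value >= 0.05:
    print("Difference is not statistically significant at the 5% level")
```

With smaller samples the same half-point difference would be even further from significance; with very large samples it could become significant without being of practical importance.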
Drawing unsupportable conclusions
The task of a data analyst is to take valid, reliable, high-quality data and then draw conclusions about what the data mean and what they tell us about disability. The process of drawing conclusions can go awry in many different ways, but these errors all share the same underlying problem: the data does not truly support the conclusion. The most common example is drawing conclusions that are plausible (because they appeal to unquestioned beliefs we hold) but which, when scrutinised, really do not have much data to support them.
There are other cases which are less obvious and so potentially more dangerous.
Ignoring the impact of other variables
A standard problem in an analysis is assuming that one or more variables are responsible for an observed phenomenon, when in fact it is yet another, independent variable that accounts for all the data.
For example, in New Zealand, the disability rate for the Maori population is lower than that for the non-Maori population. Should we conclude that if you are a Maori person you have a lower probability of having a disability than if you are a non-Maori person? No, we should not, since the higher disability rate for the non-Maori population is probably largely due to age: the Maori population is younger than the non-Maori population. After standardising for age, the two rates are roughly equivalent.
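The effect of age standardisation can be illustrated with a small sketch of direct standardisation, in which each group's age-specific rates are applied to a common standard population. All figures below are hypothetical and are not the New Zealand data; "group A" simply stands for a younger population and "group B" for an older one, and the combined population is used as the standard.

```python
def weighted_rate(age_specific_rates, population_by_age):
    """Average the age-specific rates, weighted by a population's age structure."""
    total = sum(population_by_age.values())
    return sum(age_specific_rates[age] * n
               for age, n in population_by_age.items()) / total


# Hypothetical age-specific disability rates and age structures.
rates_a = {"0-14": 0.05, "15-64": 0.15, "65+": 0.50}
rates_b = {"0-14": 0.05, "15-64": 0.16, "65+": 0.48}
pop_a = {"0-14": 400, "15-64": 550, "65+": 50}    # younger population
pop_b = {"0-14": 200, "15-64": 650, "65+": 150}   # older population

# Standard population: here, simply the two groups combined (an assumption).
standard = {age: pop_a[age] + pop_b[age] for age in pop_a}

crude_a = weighted_rate(rates_a, pop_a)
crude_b = weighted_rate(rates_b, pop_b)
std_a = weighted_rate(rates_a, standard)
std_b = weighted_rate(rates_b, standard)

print(f"crude rates:        A {crude_a:.1%}  B {crude_b:.1%}")
print(f"standardised rates: A {std_a:.1%}  B {std_b:.1%}")
```

In this made-up case the crude rates differ by several percentage points, while the standardised rates are close, because most of the crude difference comes from the age structures rather than from the age-specific rates.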
Due to the nature of disability, age is almost always a relevant variable in analysis. In some social contexts, gender also matters. For example, men tend to work in jobs that have a high accident rate. Relevant cross tabulations are perhaps the best way to discover whether there are independent variables affecting disability prevalence rates.
A similar error can occur if the independent variable is a general social phenomenon that could easily go unnoticed, and would not normally be included in the analysis. These are sometimes called endogenous factors.
For example, between 1986 and 1991 the proportion of Canadians reporting some degree of disability increased from 13.2 per cent to 15.5 per cent. While the increase could be partially attributed to an ageing population and a change in the survey methodology, analysts suspected that these factors alone did not account for the increase. It was suggested that the increase in awareness of disability in Canadian society between the two survey dates made people more willing to respond affirmatively to questions about limitations in their activities and barriers they encounter in their everyday lives.
Other potential endogenous factors might include promises of increases in welfare assistance and other programmes to those who identify themselves as having a disability, or outright payments to people who participate in the collection activity if they report their disability.
If an analysis is conducted on highly aggregated data, trends of magnitude or direction may be masked unless the data is disaggregated by region, population group, or some other parameter.
For example, the total disability rate in a country may not show any significant change over time, even though the rate may well have increased dramatically in a particular region because of a rapidly ageing population, natural disaster, or other factors. It is therefore a potential source of error not to consider conducting analyses on disaggregated data to confirm the validity of trends at the aggregated level.
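A minimal sketch of this check follows. The regions, counts and years are entirely hypothetical; the point is only that a national rate can stay flat while a regional rate moves sharply.

```python
def national_rate(regions):
    """Overall rate from a mapping of region -> (persons with disability, population)."""
    cases = sum(c for c, _ in regions.values())
    population = sum(p for _, p in regions.values())
    return cases / population


def regional_changes(before, after):
    """Change in the disability rate for each region between two collections."""
    return {
        region: after[region][0] / after[region][1]
        - before[region][0] / before[region][1]
        for region in before
    }


# Hypothetical counts: (persons with disabilities, total population).
year_1 = {"North": (500, 10000), "South": (500, 10000), "East": (200, 5000)}
year_2 = {"North": (450, 10000), "South": (450, 10000), "East": (300, 5000)}

print(f"national rate: {national_rate(year_1):.1%} -> {national_rate(year_2):.1%}")
changes = regional_changes(year_1, year_2)
for region, change in changes.items():
    print(f"{region}: change of {change:+.1%}")
```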
Analytical conclusions that claim that the data supports a causal link between variables are subject to many pitfalls. The most obvious error is to claim that there is a causal relationship between variables on the basis of data that merely shows a correlation (that may well be coincidental).
Causal errors are common in the analysis of disability data. For example, looking at the data for unemployment among people with intellectual disabilities, one might be strongly tempted to say that intellectual impairment is causally responsible for the low employment rates. Yet, although these variables are undoubtedly correlated, the impairment may not be the cause at all. Recalling the mistake of ignoring independent variables described above, it may well be that the causes of unemployment are employers' attitudes and behaviours, based on stereotypes and misunderstanding of the true work capacity of people with intellectual disabilities.
We have noted on several occasions that disability statistics present special problems because the notion of 'disability' has been variously defined in surveys and censuses. The primary virtue of the ICF approach to disability statistics is that it makes absolutely clear that 'disability' is a complex term with three distinct dimensions, each of which can be precisely classified and measured. ICF makes it clear that the variability of definitions in statistical data collection has primarily been the result of data collection designers not clearly identifying which dimension, or which dimensions of disability, their collection is about.
The problem for the analyst, of course, is that for data that is not grounded in the ICF model, it is very difficult to determine what the answers to disability questions actually mean. Data with different definitions will not be comparable, and conclusions drawn will be unsupportable.
Even if ICF terminology is used, some definitional issues still remain. For example, what do we mean by the disability rate for children? Do we mean the number of children aged 0-14 years with a disability as a percentage of all children aged 0-14 years, or the number of children aged 5-14 years with a disability as a percentage of all children at these ages? Some disability surveys do not collect data about children under 5 years because of difficulties in identifying disability amongst children at these ages.
The unit of analysis also needs to be defined. Is the analysis based on the number of persons with disabilities or the number of disabilities? If the latter is chosen, then the same individual may be counted more than once. This data cannot be used to determine the number of persons with disabilities, given the significant number of individuals with multiple disabilities.
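The difference between the two counting units can be seen in a short sketch. The records below are hypothetical: each row is one reported disability, identified by a made-up person identifier and a disability type.

```python
from collections import Counter

# Hypothetical records: one row per reported disability (person_id, type).
records = [
    (1, "seeing"), (1, "hearing"),
    (2, "walking"),
    (3, "seeing"), (3, "walking"), (3, "learning"),
]

# Counting unit = disabilities: every row is counted.
n_disabilities = len(records)

# Counting unit = persons: each person is counted once, however many rows they have.
n_persons = len({person_id for person_id, _ in records})

# Counts by type add up to the number of disabilities, not the number of persons.
by_type = Counter(disability_type for _, disability_type in records)

print("disabilities:", n_disabilities)
print("persons with disabilities:", n_persons)
print("by type:", dict(by_type))
```

Summing the counts by type would overstate the number of persons with disabilities whenever some individuals report more than one disability.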
|9.3 Forms of output and dissemination strategies
Data must be disseminated in a form that is both relevant and accessible to users. This requires an understanding of who the users are, and their needs, as well as a strategy for promoting the availability of data to maximise the use of the information.
There are various audiences for disability data: the general public; the media; persons with disabilities and their advocacy and support organizations; policy makers, in both public and private sectors; universities and research institutions; and other statistical organizations, local and international. Each audience has different information needs, and the form of dissemination should take these into account.
9.3.1 Presentation of data collection details with the results
Generally, statistical tabulations and analyses should be accompanied by sufficient technical detail to satisfy the needs of the data users. The general public will likely require less technical detail than researchers or other statistical agencies. But, in any case, survey results should always be accompanied by a description of the survey limitations - such as sampling errors, response rates, and others. This is especially important for disability data given the history of wide differences in definitions of disability, screening procedures, and collection methods.
Box 9.4 provides a guide to what should be included in explanatory materials that will accompany survey results and analyses.
9.3.2 Accessibility of data to persons with disabilities
An important consideration in the dissemination of statistical reports is their accessibility to persons with disabilities. The relevant modes of presentation include large type, Braille, audio formats, electronic tables on disk, and computer programmes and interfaces designed for people with intellectual disabilities. It is best to consult with disabled people's organizations to develop further strategies for ensuring accessibility.
9.3.3 Dissemination strategies
In addition to hard copy publications, increasingly, there is a demand for electronic publication of statistical information, in CD-ROM format or via the Internet. Many statistical agencies have websites where they post reports and statistical information and this is becoming an important medium for disseminating statistical tabulations and analyses.
Box 9.4 : Explanatory Materials for Survey Analysis Presentation: A Guide
The following information should be included:
A statement of the objectives of the survey or data collection, including a definition of the target population.
A description of its coverage in terms of inclusion or exclusion of geographical regions, particular social groups or age groups, and any other categories of the population covered.
Collection procedures, such as
sample frame used
sample selection procedure
expected sample size
achieved sample size, including sub-groups
response rates, and how they are calculated
non-response methodology and suggested reasons for non-response
collection procedure including type of interviews
date and duration of the fieldwork
quality control (e.g. efforts to reduce non-sampling errors, interviewer training, imputation procedures).
For each estimate reported there should be an associated measure of the sampling error (and method used to calculate the error).
Interpretation of the reasons for the results, and recommendations for future action, such as further research or policy implementation.
Who commissioned the survey, undertook the work, wrote the report.
A dissemination strategy should be worked out during the initial planning phase (and certainly before data is available), and should be considered and agreed on by advisory groups. The strategy should respond to the information needs of the users, and address the following issues:
Timing: when should the data and analyses be released?
Type and range of output: what format should the output come in - paper, electronic or both? Should it be put in publicly accessible form or only in specialised and restricted formats (e.g., confidential unit record files)?
Type of analysis: should the analysis be set out in tables only, or tables with summary commentary and projections, or modelled analysis and analytical commentary, or some other forms?
Shells of tables: should blank tables showing proposed table contents including populations, output data items and cross-tabulations and counting units be included?
Methods of access to data: should the data and analyses be given free of charge or sold, and if sold should it be available for purchase on-line as well as hard copy form in public libraries or other outlets? Should access to data be made confidential and restricted to a few users?
Promotional activities: What types of media releases with summary statistics, fact sheets showcasing important findings, or brochures and user guides will be produced and distributed? Can advocacy groups and other associations of persons with disabilities be included in promotion and dissemination of the data?
Confidentiality of respondent data is an issue of great importance. Care must be taken to ensure that published data cannot be linked to particular individuals, either directly or by inference. The risk of releasing identifiable data is greatest when the data is very detailed or disaggregated, but even in these cases, procedures exist for guaranteeing anonymity. Furthermore, survey, census and registry respondents should always be made aware of confidentiality assurance policies as part of questionnaire introductions so that fear of disclosure of personal information does not affect results.
9.3.5 Standard forms of output
Paper and electronic publications are the conventional media of data dissemination. These may take the form of statistical compendia containing large numbers of statistical tabulations or reports containing descriptive commentary and graphics.
Reports are effective means of communicating statistical information in an accessible form to people who do not have the skills to extract the key trends and patterns from statistical tables. They can either give a general descriptive overview of the results or present focused results on specific areas of interest and concern.
Reports, however, require more resources to produce than statistical compendia, which can interfere with the timeliness of data dissemination. It is therefore advisable to choose cheaper and faster media - such as the Internet - to first release the data, with more detailed, and accessible, reports following later.
Customised data service
Many statistical agencies provide a service for clients that enables them to request their own datasets or tables derived from the survey data. This service can be extremely useful to technically-adept data users since all combinations of useful data could not possibly be provided in a single publication. Highly specialised combinations of data may be of interest to only a few users, but of great importance to them. A customised service makes it possible for all users to request tables of data that meet their specific needs. Usually, statistical agencies charge for this service.
More and more data users are demanding access to unit record or raw data so they can carry out their own manipulations and analyses of data. Published tabulations of data may not allow some users to undertake sophisticated analyses using multivariate and other statistical modelling techniques. Where raw data is released care should be taken to ensure that the files do not have personal identifiers that might undermine confidentiality.
When microdata files are provided, statistical agencies usually charge a fee. The fee may be quite substantial - for example, the cost of de-identified Unit Record Files for most Australian social surveys is $AUD 8,000. The revenue from the fee can be used to offset the cost of developing these files.
To illustrate the diversity of dissemination approaches, Box 9.5 gives some examples of recent disability data releases.
|9.4 Ensuring secure storage of data and completed questionnaires
Once collected, the data or completed questionnaires need to be managed and stored. At this stage, security and privacy issues become crucial, since data from survey, census or administrative-based collections contain personal information such as age, sex or address which could be used to identify an individual.
For paper-based data holdings, identifiable information should be kept securely locked away when not in use, and access should be limited to a small number of people directly involved in the data collection. Whenever possible, the questionnaires should be disposed of immediately after being encoded into the computer. Some countries, however, are required to wait a certain number of years before they dispose of the completed questionnaires. In the case of the Philippines, for example, the retention period for census questionnaires is five years.
For data collected electronically, or where the data capture process includes the name, address and some detailed identification of individuals, a measure of security should be guaranteed by providing individual user accounts with password protection, and automatic screen shutdown or automatic log-off.
Documentation is the process of recording all the events that transpired during the data collection process. It enumerates and describes the different procedures employed and reports all problems encountered and solutions adopted.
This documentation informs the public of how the operation was conducted, allowing them to analyse and interpret the results fully. Additionally, the information contained in the documentation report can serve as a guide in planning the next survey, census or administrative-based operation of the same type. Furthermore, it allows for international comparison, since it provides the basis for an exchange of information on content and procedures.
The following should be included when preparing a documentation report:
- Description of the methods used
- Production schedules and size of staff
- Budget estimates
- Calendar of activities
- Forms and manuals used
- Organization of statistical agency
- Definition of geographic areas
- List and description of equipment and facilities used
- Quality control instituted
- Memoranda and other additional instructions not included in the manuals
- Other relevant information
Documentation is one of the important aspects in any data collection but much of the time it is neglected. In many cases, statistical agencies find that the persons who are supposed to document the data collection process become tied up with new data collection operations. Documentation should not be passed on to whoever is available, but should be completed by those who were actually involved in the data collection.
Box 9.5: Disability Data Statistical Releases: Some recent examples
1. Two stage data release from the Sri Lankan Disability Census of Population and Housing
The Sri Lankan Department of Census and Statistics first issued a bulletin giving summary results of disability by age, sex and district, together with a short descriptive review. The bulletin was also made available on the Department’s website.
This was followed by a detailed report containing 118 statistical tables covering six groups of disabilities. In addition to prevalence information for each type of disability – by age, sex and urban and rural sector – the report contains tabulations cross-classified by age of onset, cause of disability, whether the person was living in an institution, his or her educational attainment and employment status. The report contained a commentary and graphics highlighting major patterns in the data.
2. Release of New Zealand disability survey results
Statistics New Zealand issued a media release as soon as preliminary results from its disability survey became available. The release was issued four months after the fieldwork was completed. A second media release was issued when final results from the survey were available. In addition, a report containing a selection of tabulations from the survey was prepared for the survey's sponsors, who specified the tables.
The report contained a selection of 50 tables from the survey and ten pages of commentary and graphics highlighting major trends and patterns in the data. Documentation of the survey methods, survey population, sample design, standard errors, and disability definition were included. A further publication is planned, aimed at a general audience and containing additional analyses.
3. Release of Australian disability survey products
The Australian Bureau of Statistics released the results of its 2003 Survey of Disability, Ageing and Carers in several publications and formats:
- A published summary of findings was released less than a year after the end of the survey enumeration, and contained a broad selection of national estimates of disability, ageing and caring, including detailed estimates of the number of persons with disabilities and their demographic and socio-economic characteristics.
- Simultaneously, separate sets of tables for each State and Territory were released, in paper and electronic forms.
- This was followed within six months by two special topic reports and electronic table sets.
- A de-identified Unit Record File was released for detailed analysis by government agencies and researchers.
- Special data tabulations were made available on request.
- Short analytical articles based on survey data were published in the statistical compendium Australian Social Trends, and made available on the website.
- A Statistical User Guide was produced describing the objectives and content of the survey, the concepts, methods and procedures used in the collection of data and the derivation of estimates.