What are the advantages and disadvantages of area data and microdata?
What are the advantages of census area data?
- Census Area Statistics (CAS) for the 1991 Census were mainly 100% counts of population and households. A few tables were based on a 10% sample. All tables relating to the 2001 Census will be 100% counts.
- Tables are organised by topics and, following wide consultation prior to each census, the cross-tabulations of variables available in each table frequently fulfil a user's data requirements
- Area data are available for a large number of geographical scales ranging from national down to small (electoral wards) and very small geographical areas (e.g. Enumeration Districts, Output Areas). Data are also available for particular types of areas such as Travel-to-Work-Areas (relating to commuting patterns) and National Parks
What are the disadvantages of census area data?
- The predefined cross-tabulations in the area tables do not always meet a particular application's data needs
- Analyses of area data risk the "ecological fallacy", whereby relationships apparent at area level are spuriously assumed to operate at individual level. For example, if an area has a high percentage of lone parents as well as a low percentage of car ownership, it would be wrong to assume that lone parents do not have cars
- Only limited statistical tests and modelling techniques can be carried out using area data
- The three issues above can be alleviated by using microdata
- The geographical areas for which the census data are released are not necessarily those in which the user is interested
- The boundaries of census data collection and dissemination can change during the 10 years between censuses. This is likely to occur at small area scale in locations experiencing the most population and housing change. Time-series analysis of area data can therefore be difficult
- Users interested in these geographical issues should see Using Data from Different Sources: Consistent Geographical Units, Simpson (2002) and Norman et al. (2003)
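The ecological fallacy described above can be illustrated with a small sketch. The two areas, the record layout and every count below are invented for illustration; they are not census figures. The sketch compares what an area table would show with what the underlying individual records show.

```python
# Hypothetical individual-level records: (area, is_lone_parent, has_car).
# All counts below are invented for illustration only.

def make_records(area, lp_car, lp_nocar, other_car, other_nocar):
    """Expand four cell counts into one tuple per person."""
    return (
        [(area, True, True)] * lp_car
        + [(area, True, False)] * lp_nocar
        + [(area, False, True)] * other_car
        + [(area, False, False)] * other_nocar
    )

records = (
    make_records("A", lp_car=27, lp_nocar=3, other_car=13, other_nocar=57)
    + make_records("B", lp_car=9, lp_nocar=1, other_car=71, other_nocar=19)
)

def area_summary(records):
    """Area-level view: the percentages an area table would report."""
    summary = {}
    for area in sorted({r[0] for r in records}):
        rows = [r for r in records if r[0] == area]
        summary[area] = {
            "pct_lone_parent": 100 * sum(r[1] for r in rows) / len(rows),
            "pct_with_car": 100 * sum(r[2] for r in rows) / len(rows),
        }
    return summary

def car_rate(records, lone_parent):
    """Individual-level view: % with a car among lone parents or among others."""
    rows = [r for r in records if r[1] == lone_parent]
    return 100 * sum(r[2] for r in rows) / len(rows)

for area, stats in area_summary(records).items():
    print(area, stats)
print("lone parents with a car (%):", car_rate(records, True))
print("other persons with a car (%):", car_rate(records, False))
```

In this invented example, area A has more lone parents (30%) and lower car ownership (40%) than area B (10% and 80%), yet at individual level 90% of lone parents have a car compared with 52.5% of other persons. The area-level pattern says nothing reliable about the individuals.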
What are the advantages of microdata?
- Census microdata are very versatile compared to area data because:
- The variable groupings used in published area tables may not provide the detail required for a particular application (e.g. age group, ethnic group or occupational classification)
- The cross-tabulations of variables available in area tables may not be those needed for a study (e.g. counts of individuals by age and ethnic group and occupation)
- Microdata do not have these restrictions
- Microdata avoid the risk of the "ecological fallacy", whereby relationships apparent at area level are spuriously assumed to operate at individual level. For example, if an area has a high percentage of lone parents as well as a low percentage of car ownership, microdata can be used to determine whether lone parents are more or less likely to have cars in comparison with other persons.
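As a rough sketch of the flexibility described above, the following shows how a user might define their own age bands and build a three-way cross-tabulation from individual records. The field names, category codes and records are hypothetical and do not reflect the layout of any released microdata file.

```python
from collections import Counter

# Hypothetical microdata records; field names and values are invented.
people = [
    {"age": 23, "ethnic_group": "A", "occupation": "clerical"},
    {"age": 27, "ethnic_group": "A", "occupation": "clerical"},
    {"age": 34, "ethnic_group": "B", "occupation": "manual"},
    {"age": 61, "ethnic_group": "A", "occupation": "professional"},
    {"age": 45, "ethnic_group": "B", "occupation": "manual"},
]

def age_band(age, breaks=(30, 50)):
    """Assign an age to a user-defined band, e.g. 0-29, 30-49, 50+."""
    lower = 0
    for upper in breaks:
        if age < upper:
            return f"{lower}-{upper - 1}"
        lower = upper
    return f"{lower}+"

def crosstab(records, *keys):
    """Count records for each combination of the supplied key functions."""
    return Counter(tuple(key(r) for key in keys) for r in records)

# Age band by ethnic group by occupation, using groupings chosen by the user.
table = crosstab(
    people,
    lambda r: age_band(r["age"]),
    lambda r: r["ethnic_group"],
    lambda r: r["occupation"],
)

for cell, count in sorted(table.items()):
    print(cell, count)
```

Changing the `breaks` argument or swapping in different key functions yields a different table from the same records, which is not possible with a fixed published area table.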
What are the disadvantages of microdata?
- The census microdata that are released are a sample and are thus subject to sampling error. Detailed cross-tabulations may produce small cell counts and therefore wide confidence intervals
- Census microdata contain very detailed socio-demographic information, but each record about an individual or household carries only coarse geographic information (i.e. location is given for relatively large areas)
- The sample size restrictions and lack of geographic detail are imposed on census microdata to reduce the risk of breaching the confidentiality of the data
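To give a sense of how sampling error grows as cell counts shrink, here is a minimal sketch of a 95% confidence interval for a proportion estimated from a simple random sample. It uses the normal (Wald) approximation, which is an assumption made here for simplicity and is known to be poor for very small cells; the counts are invented.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate (Wald) confidence interval for a sample proportion.

    Assumes a simple random sample; z=1.96 gives roughly 95% coverage.
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return max(0.0, p - z * se), min(1.0, p + z * se)

# Same observed proportion (50%), very different cell sizes.
for successes, n in [(10, 20), (1000, 2000)]:
    low, high = proportion_ci(successes, n)
    print(f"n={n}: {low:.3f} to {high:.3f}")
```

With 20 sampled cases the interval runs from about 0.28 to 0.72, whereas with 2,000 cases it narrows to about 0.48 to 0.52, which is why finely divided cross-tabulations of sample microdata need to be interpreted with care.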