The role of data integration in helping to produce an effective official statistical system is becoming increasingly apparent. Bringing together information from different sources allows a broader range of questions to be answered. Through integration it becomes possible to examine underlying relationships between various cross-sections of society, thus improving our knowledge and understanding of a particular subject.
Data integration can occur in many different ways and at different levels of data. Here, data integration means taking two or more different data sources and identifying the records in each that relate to the same individual or unit. It relies on common data fields being present on the different files.
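The idea of linking on common data fields can be illustrated with a minimal sketch. It assumes exact agreement on every common field after simple normalisation; it is not a description of the method Statistics NZ uses, and linkage in practice often has to allow for missing or inconsistent values. The function names, field names and records below are all hypothetical and made up for illustration.

```python
def normalise(value):
    """Lower-case and collapse whitespace so trivial spelling differences compare equal."""
    return " ".join(str(value).lower().split())


def link_records(file_a, file_b, common_fields):
    """Pair records from two sources whose common fields all agree exactly.

    This is a simple deterministic match, used here only as an illustration.
    """
    # Index the second file by its link key (the tuple of common field values).
    index = {}
    for rec in file_b:
        key = tuple(normalise(rec[field]) for field in common_fields)
        index.setdefault(key, []).append(rec)

    # Look up each record of the first file in that index.
    links = []
    for rec in file_a:
        key = tuple(normalise(rec[field]) for field in common_fields)
        for match in index.get(key, []):
            links.append((rec, match))
    return links


# Hypothetical records: an administrative file and a survey file
# that share the fields "name" and "birth_date".
admin_file = [
    {"name": "Aroha Smith", "birth_date": "1980-02-14", "income": 52000},
    {"name": "Tom Brown", "birth_date": "1975-07-01", "income": 61000},
]
survey_file = [
    {"name": "aroha  smith", "birth_date": "1980-02-14", "region": "Wellington"},
    {"name": "Mere Jones", "birth_date": "1990-11-30", "region": "Otago"},
]

links = link_records(admin_file, survey_file, ["name", "birth_date"])
for admin_rec, survey_rec in links:
    # Each linked pair combines information that neither source holds alone.
    print(admin_rec["name"], admin_rec["income"], survey_rec["region"])
```

Only one pair is linked here, because only one person appears on both files with matching values in the common fields. The linked pair carries both income (from the administrative file) and region (from the survey file), which is the kind of combined information neither source could provide separately.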
Linking information from different sources allows the examination of relationships that could not previously be considered. Data integration offers a less time-consuming and less costly alternative to other investigative methods such as surveys. It also reduces respondent burden by making more effective use of existing data sources.
Data integration raises privacy and confidentiality concerns. Integrating data uses information beyond the purposes for which it was originally obtained. Privacy implications exist because the range of information available about an individual will be greater than would have been considered in any of the original data collections. Furthermore, data intended for statistical use does not require identifying information, but to enable linking to proceed, it is necessary to use identifiers.
In recognition of genuine privacy concerns and sensitivity from the public regarding the linking of records, the Government has directed that:
Where datasets are integrated across agencies from information collected for unrelated purposes, Statistics New Zealand should be custodian of these datasets in order to ensure public confidence in the protection of individual records.
For data integration projects undertaken by Statistics NZ, privacy issues are managed, while still achieving the benefits of linking data, by ensuring that:
the linked data is used only for the production of official statistics and approved statistical research;
the project is authorised by the custodians of the data sources being brought together and maintains the integrity of the source data collections;
linking of records is carried out by staff at Statistics NZ in accordance with the Statistics Act 1975, the Privacy Act 1993, other relevant legislation and Statistics NZ's data integration policy;
the project does not put at risk public trust in the methods used by Statistics NZ;
there is an assessment of privacy risks through a Privacy Impact Assessment.
Approval for any data linking rests with the Government Statistician and is not delegated.
A variety of projects has been approved to date.
In addition to these projects, Statistics NZ has been using tax-sourced administrative data to produce a range of outputs for many years.
Statistics NZ has been involved in data integration for over 10 years, and has developed a number of projects. Now, the focus is on making more extensive use of available data, in particular integrating administrative data with information collected in social surveys.
Current topics of consideration are:
Statistics NZ will also have a leading role in the application of data integration across all government departments.