New Zealand’s Integrated Data Infrastructure: Linking data for better science and policy


Written by Terrie Moffitt and Anna McDowell 16 October 2019

Statistics New Zealand is performing an unprecedented amount of data linking in its Integrated Data Infrastructure (IDI), a database of de-identified data on the people of New Zealand. The IDI contains administrative data from a number of sectors, including education, social welfare, migration and movements, justice, and health and safety, as well as from Statistics NZ surveys. The IDI is used by researchers and analysts from within and outside of government, and has grown beyond expectations in both datasets and users since its creation in 2011.


The creation of the IDI

Several factors enabled the creation of the IDI. The first was the legislation already in place that allowed the sharing of data. The Statistics Act 1975 and Privacy Act 1993 form a framework to protect information about New Zealanders, while the former also allows data to be used for bona fide research or statistical purposes in the public interest. This enabled the IDI to be created without the need for further legislation. New Zealand has a Privacy Commissioner, who has been supportive and has described the design of the IDI as a ‘grand bargain’, balancing privacy and usage.

Second, thanks to its impartial role as the national statistics office of New Zealand, Statistics NZ is seen as a ‘safe pair of hands’ to collect and disseminate data. Because Statistics NZ had already undertaken several data integration projects before the creation of the IDI, it had shown that it could perform the necessary data integration successfully and without major risks to privacy or security. Statistics NZ also has a high level of public trust, a key issue when individual-level data is being integrated.

Third, support from the New Zealand Government grew as the country’s data needs were better understood and the need for an evidence base to support policy was recognised. The New Zealand Government’s Better Public Services advisory group, along with the then Minister of Finance, encouraged an increase in cross-agency co-operation, particularly on complex issues that span agency boundaries.


The IDI spine

New Zealand does not have a population registry or universal personal identification number. This is by legislation; the Privacy Act 1993 ensures that no identifier used by one agency will be used by another agency as its own identifier. Therefore, the data from various sources must be largely probabilistically linked on an individual level.
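To give a feel for what probabilistic linking on an individual level involves, here is a minimal sketch of one common textbook approach, Fellegi-Sunter style match weights. This is an illustration only: the article does not say which method Statistics NZ uses, and the field names, the m and u probabilities, and the sample records are all hypothetical.

```python
import math

# Hypothetical per-field probabilities (not Statistics NZ values):
#   m = chance the field agrees when two records are the same person
#   u = chance the field agrees when they are different people
FIELDS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
}


def match_weight(rec_a, rec_b):
    """Sum log2 likelihood ratios over fields; higher means more likely a match."""
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


# Made-up records from two hypothetical sources.
tax_record = {"surname": "ngata", "birth_date": "1980-02-14", "sex": "F"}
birth_record = {"surname": "ngata", "birth_date": "1980-02-14", "sex": "F"}
other_record = {"surname": "smith", "birth_date": "1975-11-30", "sex": "M"}

same_person_score = match_weight(tax_record, birth_record)
different_person_score = match_weight(tax_record, other_record)
```

In a scheme like this, pairs scoring above a chosen threshold are accepted as links and pairs below it are rejected; where the threshold sits determines the trade-off between missed links and false links.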

The ‘spine’, containing more than nine million people, is the central dataset that all other datasets are linked to. It is created through probabilistic linkage, linking tax data to births data, births to visa data, and visa to tax data; these links are then combined to create the spine dataset. It is estimated that fewer than 1% of links in the spine are incorrect.
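The step of combining pairwise links into a single spine can be pictured as grouping records that are connected, directly or indirectly, by accepted links. The sketch below uses a union-find structure to do this. It is an assumption made for illustration, not a description of how Statistics NZ builds the spine, and the source names and record identifiers are invented.

```python
from collections import defaultdict


def build_spine(links):
    """Group records joined by pairwise links into one entity per person."""
    parent = {}

    def find(record):
        parent.setdefault(record, record)
        while parent[record] != record:
            parent[record] = parent[parent[record]]  # shorten the path as we go
            record = parent[record]
        return record

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for record in list(parent):
        groups[find(record)].add(record)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical accepted links, each record written as (source, record_id).
links = [
    (("tax", "T1"), ("births", "B1")),
    (("births", "B1"), ("visa", "V9")),
    (("tax", "T2"), ("visa", "V3")),
]

spine = build_spine(links)
```

Here the first two links share a births record, so their three records collapse into one spine entry, while the third link forms a separate entry. One consequence of this design is that a single incorrect link can merge two different people, which is one reason the error rate of the underlying links matters.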

The New Zealand Government Statistician makes the final decision as to whether data can be integrated in the IDI, taking into account government priorities and systematic evaluations of the risks and benefits. Security measures ensure the IDI can only be used to conduct analysis and research about populations of interest, not to attempt to identify individuals. Access to the IDI is not limited to government; any researcher may access the IDI as long as their research has been approved.

The privacy and security of the data in the IDI are critical to public trust. Statistics NZ ensures access to data is provided only if the ‘Five Safes’ are met: safe people; safe projects; safe settings; safe data; and safe outputs.


Social licence and the IDI

Because the IDI is linking and using data to an extent previously unseen in New Zealand, the question of social licence is crucial. Social licence is societal acceptance that a practice that lies outside general norms may be performed by a certain agent, on certain terms.

Research commissioned by Statistics NZ showed the New Zealand public has an expectation that their data are used, but used wisely and for the public good. However, research also indicates that the public is generally not well informed about the existence of the IDI, how it works, and the benefits that can be gained from it. Statistics NZ is continuing to engage in an informed debate about social licence, intended to move the public to a more informed level of trust.

Data providers are generally willing to make their data available to the IDI, as long as doing so does not impose substantial agency costs. For government agencies, the IDI is more useful and easier to manage than attempting to integrate data on their own. The IDI contains more data than a single agency could practically attempt to assemble, and presents a single point of access to this data, rather than requiring the negotiation of agreements with each agency individually. In addition, technical support and resources are available from both Statistics NZ and the user community.


Lessons learned

A key lesson learned from the development of the IDI is the importance of flexibility. While some aspects of the IDI, such as privacy and security, were set from its beginning, the structure of the IDI and the procedures around its growth and use have been updated and refined based on the experience of administrators and users. The IDI has continued to grow beyond what had been planned; there is a continued demand for expansion, both from government and within the wider research community.


You can find out more about the New Zealand Integrated Data Infrastructure on the Stats NZ website.
