Statement from Government Statistician Liz MacPherson
Here is Statistics NZ’s response to technology commentator Andy Linton, who discussed Statistics NZ’s disaster recovery after the 14 November earthquake during an interview on Radio NZ’s Nine to Noon programme at 11.05am on 15 December 2016.
Statistics NZ is rapidly moving towards a future-proof and value-for-money approach to disaster recovery.
Work on updating our disaster recovery plans, based on distributing risk through an ‘as a service’ model and geographic diversity, was well in hand after the Christchurch and Seddon quakes, but the magnitude 7.8 Kaikoura quake last month beat us to the punch by six months. If the quake had struck after May 2017, there would have been no issue at all.
Technology commentator Andy Linton suggested on Radio NZ’s Nine to Noon programme on 15 December that Statistics NZ should have had a full replica of all computer systems based in Auckland to take over in seconds after the quake forced the shut-down of computer servers in Statistics House in Wellington.
What any organisation must do is consider the costs of buying what is akin to house insurance. To create a fully mirrored computer system would be enormously expensive for the taxpayers of New Zealand. Such a fail-proof system is vital for hospitals and emergency services like the police who must respond in a crisis.
While statistical data is vital for decision-making, lives were not at risk because our website was out of action for three days and data releases resumed five days after the quake left Statistics House ‘munted’ (as in uninhabitable for the foreseeable future).
Our disaster recovery planning took a value-for-money approach commensurate with our risk profile. Our data is fully backed up in Auckland and Trentham ready to be deployed.
We did not lose any information, and privacy and security were never compromised. There are lessons for us and we will ensure that our website service is not interrupted in future. Our view remains that investing in fully mirrored systems is not value for money given that we are not an essential emergency service. The Government Chief Information Officer Colin MacDonald, the government’s leader for IT, has supported Statistics NZ’s approach to disaster recovery.
“The process of shifting from a status quo in-house data centre is complex and requires considerable time and planning. Statistics NZ is forward thinking in its roadmap towards a more distributed and resilient model of data and infrastructure services,” Colin MacDonald said.
See the more detailed technical explanation of the organisation’s IT disaster recovery plans below.
Statistics NZ’s IT disaster recovery – technical detail
From Statistics NZ Chief Digital Officer, Chris Buxton.
The operating model in which organisations maintain a mirror of their primary datacentre as part of their disaster recovery plan is not cost effective and does not take advantage of the opportunities that innovative capabilities, including cloud services, can provide.
Organisations that retain such a ‘standby’ site incur all the cost without any real benefit, until there is a disaster. This is consistent with the legacy model of information technology (IT) as a cost centre, rather than a business partner that adds value to an organisation. Using modern technology we can build systems so that they can operate in a geographically distributed cluster arrangement, so rather than a sunk cost, the disaster recovery capabilities can be a performance advantage for the organisation.
We defined our disaster recovery capability based on requirements determined in 2013. This capability included resilient capability for some of our key services required to manage a crisis. Such resilient services were configured as distributed clusters, enabling us to take advantage of that investment in our day-to-day operation at no additional cost.
Our business is cyclical, with peaks and troughs of work, and the information we process is largely handled in batches. For example this could be data from a whole survey, or a whole set of data from an external party, so it is not processed in real time. Because of this, real-time mirroring of systems between centres would provide virtually no business benefit at a considerable cost – many millions of dollars per year.
The systems that we need to use at any given point depend entirely on the day of the year on which an event occurs and what processing work is being done at the time.
Our past model was therefore to have resilient capability to manage the event, with standby hardware available at our recovery site that could be used to recover a variety of key systems depending on what was needed most at the time. This was consistent with the legacy site mirroring model of the time.
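The selection described above can be illustrated with a minimal sketch. It assumes a simple calendar of processing windows and a fixed amount of standby hardware. Every system name, date, priority and capacity figure in it is hypothetical and invented for illustration; none of it describes Statistics NZ’s actual systems or recovery procedures.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SystemWindow:
    """A hypothetical system and the period in which it is in active use."""
    name: str
    start: date      # first day of the processing window (inclusive)
    end: date        # last day of the processing window (inclusive)
    priority: int    # lower number means recover sooner
    capacity: int    # standby hardware units the system needs


def systems_to_recover(windows, event_day, standby_capacity):
    """Choose active systems in priority order until standby capacity runs out."""
    active = [w for w in windows if w.start <= event_day <= w.end]
    chosen, remaining = [], standby_capacity
    for w in sorted(active, key=lambda w: w.priority):
        if w.capacity <= remaining:
            chosen.append(w.name)
            remaining -= w.capacity
    return chosen


# Illustrative calendar only: names, dates and sizes are assumptions.
CALENDAR = [
    SystemWindow("survey_batch", date(2016, 11, 1), date(2016, 11, 30), 2, 3),
    SystemWindow("website", date(2016, 1, 1), date(2016, 12, 31), 1, 2),
    SystemWindow("external_data_load", date(2016, 6, 1), date(2016, 6, 30), 2, 3),
    SystemWindow("archive_reporting", date(2016, 11, 10), date(2016, 11, 20), 3, 4),
]

print(systems_to_recover(CALENDAR, date(2016, 11, 14), standby_capacity=6))
```

With this invented calendar, the same standby hardware recovers a different set of systems depending on the date of the event, which is the trade-off the past model accepted.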
Our future model does not look like that.
Our future operating environment will not be built on any single data centre but will be enabled through various modern concepts. It will have a Software Defined or Virtual Data Centre, able to support a hybrid delivery model incorporating Public (Microsoft Azure, AWS) and Private cloud (Government Common Capability) based services, as well as Software (SaaS) and Platform as a Service (PaaS). These will be connected through a Software Defined Network.
Software-defined networking is the process of merging hardware and software resources and networking functionality into a software-based virtual network. This will remove the dependency on any single network, enabling us to incorporate new or changing systems as well as operating locations, with minimal investment.
Underpinning this will be Software Defined Storage (SDS), which includes storage virtualisation and manages data storage, including its performance and durability. The ‘as-a-service’ nature of these capabilities means that we will incur costs based on consumption and can grow or shrink our capability based on real need.
This means we do not need to invest large sums on simple duplication of infrastructure.
This combination of modern technologies will enable us to integrate a range of services no matter where they are located, thus reducing the impact of any single event.
These individual services will have inbuilt resilience including geographic diversity as part of the service cost. Examples of this are Salesforce or Office 365, which provide distributed services and geographical diversity at no additional cost to the customer. This delivers improved service resilience with value for money for the taxpayer.
Through this mechanism we can better manage our risks, removing the ‘eggs in one basket’ issue that is associated with legacy IT delivery models. This significantly reduces the requirement for a high-cost replicated mirror site, which can itself be susceptible to other impacts. New Zealand’s geology means we have a particular risk from multi-location disasters. The traditional primary site, with a mirrored fallback site, offers some protection, but modern technology and services now provide better mitigations to protect organisations.
This will underpin our move to being a more agile and digitally responsive organisation. We will be able to use the best available solutions, while removing the risk of large government IT investment projects, which have not always been the best investment of taxpayers’ money.
For media enquiries contact: James Weir, Wellington, 021 2859 191, firstname.lastname@example.org
For technical enquiries contact: Chris Buxton, Chief Digital Officer, +64 21 615 9581
Authorised by Liz MacPherson, Government Statistician, 16 December 2016