
About SURF for Schools

The SURF for Schools data source: New Zealand Income Survey

The New Zealand Income Survey (NZIS) is run annually as a supplement to the Household Labour Force Survey (HLFS) during the June quarter (April to June). It was first run in the June 1997 quarter. Results for the June 2004 quarter were published in the Hot Off The Press release New Zealand Income Survey: June 2004 quarter on 28 September 2004.

All respondents in the HLFS were asked to participate in the NZIS. Proxy responses were accepted only when a respondent was unable to answer the survey on health or language grounds. The questions referred to the respondent's most recent pay period, except for questions on annual income, self-employment income, and investment income, which covered the 12-month period prior to the interview.

Statistics NZ was unable to collect valid data from all eligible respondents. The most common reasons were that a respondent could not be contacted, or could not provide the relevant information about their income when asked.

Of eligible households, 90.2 percent responded to the HLFS. Within those households, over 84 percent of eligible individuals gave a valid response to the NZIS.

 

About the SURF for Schools dataset

The following variables were used:

  • Age
  • Sex
  • Usual gross wages and/or salary for a week
  • Hours worked per week in the first job
  • Highest qualification gained
  • Marital status (from census data)
  • Ethnicity (from census data).

The dataset contains a small sample of 200 records, compared with the 28,000 respondents in the NZIS. It includes only a small number of variables from that survey, and some categorical variables were generated from census data.


These records are not information about real people, but were generated using statistical techniques to have the same characteristics as respondents to the survey. The data in SURF for Schools shows many of the same patterns as the original dataset, and analysis will give results that are close to the results from the original survey. 
 

Methods used to create the dataset

The SURF dataset was based on a confidentialised unit record file (CURF). Variables were generated from this file using probability and regression techniques so that no person's actual data is on the SURF file. However, the data on the file gives a good representation of the characteristics of people on the original file. A sample of 200 records was taken from the synthesised dataset. This sample was based on weights so that it represented the population distribution rather than the sample distribution.
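The weighted sampling step can be illustrated with a short sketch. This is not the method Statistics NZ used, which is not documented here. It assumes each synthesised record carries a hypothetical `weight` field, and it uses the Efraimidis-Spirakis key method to draw distinct records with probability related to their weights. The `synthetic` records and their weights are made up for illustration.

```python
import random


def weighted_sample(records, k, weight_key="weight", seed=None):
    """Draw k distinct records, favouring records with larger weights.

    Each record gets the key u ** (1 / w), where u is uniform on (0, 1]
    and w is the record's weight. The k records with the largest keys
    are returned (Efraimidis-Spirakis weighted sampling).
    """
    rng = random.Random(seed)
    keyed = []
    for rec in records:
        w = rec[weight_key]
        if w <= 0:
            continue  # a record with no weight can never be selected
        u = 1.0 - rng.random()  # shift [0, 1) to (0, 1] to avoid u == 0
        keyed.append((u ** (1.0 / w), rec))
    if k > len(keyed):
        raise ValueError("not enough positively weighted records")
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in keyed[:k]]


# Made-up synthesised file: 1,000 records with hypothetical weights 1 to 5.
synthetic = [{"personid": i, "weight": 1 + (i % 5)} for i in range(1000)]
sample = weighted_sample(synthetic, 200, seed=1)
```

Sampling without replacement keeps every record in the 200-record file distinct, while heavier records are still more likely to appear.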


No records with missing values were included in the dataset. In addition, only respondents with income between zero and $2,000 and with 0 to 80 hours worked were used. Age was limited to those 45 years or younger. The focus of the dataset is the general employed population rather than unemployed or unusual respondents.
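The inclusion rules above can be written as a simple filter. This is a minimal sketch: the field names (`age`, `hours`, `income`) are hypothetical, missing values are assumed to be stored as `None`, and the stated limits are assumed to be inclusive.

```python
def is_eligible(record):
    """Return True if a record passes the inclusion rules described above."""
    # Rule 1: no missing values in any variable (assumed to be None).
    if any(value is None for value in record.values()):
        return False
    # Rules 2 to 4: income 0 to 2,000, hours 0 to 80, age 45 or younger.
    # Treating the limits as inclusive is an assumption.
    return (
        0 <= record["income"] <= 2000
        and 0 <= record["hours"] <= 80
        and record["age"] <= 45
    )


# Made-up records for illustration only.
records = [
    {"age": 30, "hours": 40, "income": 850},    # kept
    {"age": 52, "hours": 40, "income": 900},    # dropped: older than 45
    {"age": 28, "hours": 95, "income": 1200},   # dropped: more than 80 hours
    {"age": 41, "hours": 38, "income": None},   # dropped: missing income
    {"age": 35, "hours": 45, "income": 2600},   # dropped: income above 2,000
]
kept = [r for r in records if is_eligible(r)]
```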

 

Release of data

According to the Statistics Act 1975, the Government Statistician may publicly release the SURF for Schools under the following condition:

"All statistical information published is to be arranged in such a manner as to prevent any particulars published from being identifiable by any person as particulars relating to any particular person or undertaking...." [Section 37(4)]

 

List of dataset variables

Variable name | Variable description | Codes | Notes
Personid | Unique person identifier | Random number up to 3 digits in length |
Age | Age of individual | Single year ages up to 45 years | Continuous data
Qualification | Highest qualification: | | Categorical data
  | No qualification | none |
  | School qualification | school |
  | Vocational or trade qualification | vocational |
  | Bachelor or higher degree | degree |
Ethnicity | European | European | Categorical data
  | Māori | Māori |
  | Pacific people | Pacific |
  | Other | Other |
Marital | Never married | never | Categorical data
  | Married | married |
  | Separated, widowed or divorced | previously |
  | Other | other |
Gender | Male | male | Categorical data
  | Female | female |
Hours | Usual total hours worked - weekly, from all wage and salary jobs | Numeric - rounded to nearest whole number | Continuous data
Income | Weekly income from all sources, excluding investment income | Numeric - rounded to nearest whole number | Continuous data
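
When loading the dataset, it can help to check each record against the variable list above. The sketch below is one way to do this, not an official tool. It assumes each record is a dictionary keyed by the variable names, that numeric values are stored as Python integers, and that Personid is non-negative. The `example` record is made up.

```python
# Codes for the categorical variables, taken from the variable list.
CATEGORY_CODES = {
    "Qualification": {"none", "school", "vocational", "degree"},
    "Ethnicity": {"European", "Māori", "Pacific", "Other"},
    "Marital": {"never", "married", "previously", "other"},
    "Gender": {"male", "female"},
}
WHOLE_NUMBER_FIELDS = ("Personid", "Age", "Hours", "Income")


def check_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for field, allowed in CATEGORY_CODES.items():
        if record.get(field) not in allowed:
            problems.append(f"{field}: unexpected code {record.get(field)!r}")
    for field in WHOLE_NUMBER_FIELDS:
        value = record.get(field)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{field}: expected a whole number, got {value!r}")
    age = record.get("Age")
    if isinstance(age, int) and age > 45:
        problems.append("Age: above 45")
    pid = record.get("Personid")
    if isinstance(pid, int) and not 0 <= pid <= 999:
        problems.append("Personid: more than 3 digits")
    return problems


# Made-up record for illustration only.
example = {
    "Personid": 17, "Age": 32, "Qualification": "degree",
    "Ethnicity": "Māori", "Marital": "married", "Gender": "female",
    "Hours": 40, "Income": 950,
}
print(check_record(example))
```

An empty list means the record matches the codes and ranges listed above.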