The SURF for Schools data source: New Zealand Income Survey
The New Zealand Income Survey (NZIS) is run annually, as a supplement to the Household Labour Force Survey (HLFS) during the June quarter (April to June). It was run for the first time in the June 1997 quarter. June 2004 quarter results were published in the New Zealand Income Survey: June 2004 quarter Hot Off The Press on 28 September 2004.
All respondents in the HLFS were asked to participate in the NZIS. Proxy responses were accepted only when a person was unable to answer the survey on health or language grounds. Questions related to the respondent's most recent pay period, except for questions on annual income, self-employment income, and investment income, which covered the 12-month period prior to the interview.
Statistics NZ was unable to collect valid data from all eligible respondents. The most common reasons were that a respondent could not be contacted or could not provide the relevant information about their income when asked.
Of eligible households, 90.2 percent responded to the HLFS. Within those households, over 84 percent of eligible individuals gave a valid response to the NZIS.
About the SURF for Schools dataset
The following variables were used:
- Age
- Sex
- Usual gross wages and/or salary for a week
- Hours worked per week in the first job
- Highest qualification gained
- Marital status (from census data)
- Ethnicity (from census data).
The dataset contains a small sample of 200 respondents from the 28,000 respondents in the NZIS. It has only a small number of variables from that survey, with some categorical variables being generated from census data.
These records are not information about real people, but were generated using statistical techniques to have the same characteristics as respondents to the survey. The data in SURF for Schools shows many of the same patterns as the original dataset, and analysis will give results that are close to the results from the original survey.
Methods used to create the dataset
The SURF dataset was based on a confidentialised unit record file (CURF). Variables were generated from this file using probability and regression techniques so that no person's actual data is on the SURF file. However, the data on the file gives a good representation of the characteristics of people on the original file. A sample of 200 records was taken from the synthesised dataset. This sample was based on weights so that it represented the population distribution rather than the sample distribution.
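The weight-based sampling step described above can be illustrated with a short sketch. The source does not say which sampling algorithm was used, so this example swaps in one common choice, Efraimidis-Spirakis weighted sampling without replacement. The function name `weighted_sample` and its parameters are hypothetical and chosen only for illustration.

```python
import random


def weighted_sample(records, weights, k, seed=None):
    """Draw k distinct records, favouring records with larger weights.

    Assumption: Efraimidis-Spirakis keys are used here for illustration;
    the source does not name the method actually applied. Each record
    gets the key u ** (1 / w), with u uniform on [0, 1), and the k
    records with the largest keys are kept.
    """
    if len(records) != len(weights):
        raise ValueError("records and weights must have the same length")

    rng = random.Random(seed)
    keyed = []
    for record, weight in zip(records, weights):
        if weight <= 0:
            continue  # a record with no weight is never selected
        keyed.append((rng.random() ** (1.0 / weight), record))

    if k > len(keyed):
        raise ValueError("not enough positively weighted records to sample")

    # Largest keys first, then keep the first k records.
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in keyed[:k]]
```

Calling `weighted_sample(synthetic_records, survey_weights, 200)` would return 200 distinct records, where `synthetic_records` and `survey_weights` stand in for the synthesised file and its weights.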
No records with missing values were included in the dataset. In addition, only respondents with weekly income between zero and $2,000 and with 0 to 80 hours worked per week were used. Age was limited to those 45 years or younger. The focus of the dataset is the general employed population rather than unemployed or unusual respondents.
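The inclusion rules above can be written as a simple filter. This is a minimal sketch, not the procedure Statistics NZ used: it assumes each record is a dictionary keyed by the variable names in the list of dataset variables, that a missing value is stored as `None`, and that the stated limits are inclusive. The function name `is_eligible` is hypothetical.

```python
# Variable names follow the list of dataset variables in this document.
REQUIRED_FIELDS = (
    "Age", "Qualification", "Ethnicity", "Marital", "Gender", "Hours", "Income",
)


def is_eligible(record):
    """Return True if a record meets the stated inclusion rules."""
    # Assumption: a missing value is an absent key or a value of None.
    if any(record.get(field) is None for field in REQUIRED_FIELDS):
        return False
    return (
        0 <= record["Income"] <= 2000   # weekly income in dollars (bounds assumed inclusive)
        and 0 <= record["Hours"] <= 80  # weekly hours worked
        and record["Age"] <= 45         # 45 years or younger
    )


example = {
    "Age": 30, "Qualification": "school", "Ethnicity": "European",
    "Marital": "married", "Gender": "female", "Hours": 40, "Income": 800,
}
print(is_eligible(example))
```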
Release of data
According to the Statistics Act 1975, the Government Statistician may publicly release the SURF for Schools under the following condition:
List of dataset variables
| Variable name | Variable description | Codes | Notes |
|---|---|---|---|
| Personid | Unique Person Identifier | Random number up to 3 digits in length | |
| Age | Age of individual | Single year ages up to 45 years | Continuous data |
| Qualification | Highest qualification: No qualification; School qualification; Vocational or trade qualification; Bachelor or higher degree | none; school; vocational; degree | Categorical data |
| Ethnicity | European; Māori; Pacific people; Other | European; Māori; Pacific; Other | Categorical data |
| Marital | Never married; Married; Separated, widowed or divorced; Other | never; married; previously; other | Categorical data |
| Gender | Male; Female | male; female | Categorical data |
| Hours | Usual total hours worked weekly, from all wage and salary jobs | Numeric - rounded to nearest whole number | Continuous data |
| Income | Weekly income from all sources, excluding investment income | Numeric - rounded to nearest whole number | Continuous data |
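As a quick check when loading the data, the codes in the variable list can be turned into a small validator. This is a sketch under assumptions: records are taken to be dictionaries keyed by variable name, numeric variables are taken to be stored as integers, and the name `validate_record` is hypothetical. `Personid` is not checked here.

```python
# Allowed codes, copied from the Codes column of the variable list.
ALLOWED_CODES = {
    "Qualification": {"none", "school", "vocational", "degree"},
    "Ethnicity": {"European", "Māori", "Pacific", "Other"},
    "Marital": {"never", "married", "previously", "other"},
    "Gender": {"male", "female"},
}

# Variables described as single-year ages or rounded whole numbers.
NUMERIC_FIELDS = ("Age", "Hours", "Income")


def validate_record(record):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    for field, allowed in ALLOWED_CODES.items():
        value = record.get(field)
        if value not in allowed:
            problems.append(f"{field}: unexpected code {value!r}")
    for field in NUMERIC_FIELDS:
        value = record.get(field)
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{field}: expected a whole number, got {value!r}")
    return problems
```

A record that uses the long description instead of the code, for example `"Pacific people"` for `Ethnicity`, would be reported as a problem, which is often the first mismatch to surface when the data is typed in by hand.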