New Zealand Income Survey 2011 CART SURF

The New Zealand Income Survey 2011 CART SURF contains unit record data that teachers can use for teaching and learning purposes, including developing analytical methods or statistical processes. The dataset can also be treated as a simple random sample. The CART SURF contains 29,471 records with eight variables. The sub-SURF, with 500 observations and categorical variables as text in short form, is easier to use with the Visual Inference Tools in iNZight.

Download the CART SURF datasets and data dictionary from ‘Available files’ above. You can open the CSV files in Excel or any statistical software package. If you have problems viewing the files, see Opening files and PDFs.

About the NZIS 2011 CART SURF

The New Zealand Income Survey (NZIS) 2011 CART SURF is a new synthetic unit-record file (SURF) dataset based on the NZIS 2011 data. Classification and regression trees (CART) were used to create the fully synthetic dataset. Synthetic data is artificial data created to be statistically similar to the data from real respondents, but it does not contain the particulars of any actual people.

The NZIS is an annual supplement (June quarter) to the Household Labour Force Survey (HLFS). All respondents in the HLFS were asked to participate in the NZIS, which provides a snapshot of income levels for people and households. NZIS data gives average weekly income for the June quarter from most sources, including wages and salaries, government transfers, self-employment, and investments.

See New Zealand Income Survey: June 2011 quarter for the June 2011 quarter results.

See New Zealand Income Survey resource for more information, including questionnaires and technical notes.

The CART SURF is a realistic representation of New Zealanders aged 15 years and over. However, this synthetic data is designed for educational purposes and should not be used as a source of accurate statistical information.

New features in this SURF

Here’s a list of new features not available in previous NZIS SURFs:

  • Coverage of the general New Zealand population aged 15 and over, not just subsets such as wage/salary earners or certain age groups.
  • Region and occupation (ANZSCO) variables have been added.
  • There are up to two responses for ethnicity.

The SURF also has a large sample size (n = 29,471), replicating the actual size of the real survey. The NZIS 2004 and Household Savings Survey SURFs contained only a small subset of the surveyed sample.

Properties of the NZIS 2011 CART SURF

Please note the following properties of the NZIS 2011 CART SURF.

The NZIS 2011 CART SURF contains 29,471 synthetic unit records of New Zealanders aged 15 years and over. Each row is a synthetic person, and the dataset as a whole represents people who could have been selected for the NZIS 2011.

The SURF contains the following eight variables:

  • age: 15 years to 65 plus in 5-year age bands
  • sex: male and female
  • ethnicity: up to two responses in six categories
  • region: one of 12 regions (LGR)
  • highest qualification: five categories
  • occupation: nine categories (ANZSCO Level 1)
  • weekly hours: weekly hours worked in all wage and salary jobs, excluding self-employment
  • weekly income: gross weekly income from all sources.

Meanings of variables

The categorical variables in the NZIS 2011 CART SURF file are numeric codes, so please refer to the data dictionary (available from ‘Available files’ above) to fully interpret the data. Take special care with interpreting the ethnicity, income, and hours variables.

The ethnicity variable follows level 1 of the Ethnicity New Zealand Standard Classification 2005. There are up to two responses. For example, ‘1’ means someone identifies themselves as European only, ‘2’ as Māori only, and ‘12’ as both Māori and European. Responses like ‘New Zealander’ are coded as ‘6’, or ‘other’.
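The coding described above can be sketched in Python. This is a minimal illustration, not part of the SURF files: the function name decode_ethnicity is hypothetical, and it assumes a two-response code is simply the two single-digit codes written side by side (as in ‘12’). Only codes 1, 2, and 6 are described on this page; the labels for 3, 4, and 5 are assumed from level 1 of the 2005 classification and should be checked against the data dictionary.

```python
# Level 1 labels. Codes 1, 2 and 6 are described in the text above;
# 3, 4 and 5 are ASSUMED from the 2005 standard classification and
# should be confirmed against the data dictionary.
LEVEL1_LABELS = {
    "1": "European",
    "2": "Māori",
    "3": "Pacific Peoples",
    "4": "Asian",
    "5": "Middle Eastern/Latin American/African",
    "6": "Other",
}


def decode_ethnicity(code):
    """Split a one- or two-digit ethnicity code into its level 1 labels.

    Hypothetical helper: assumes each digit of the code is one response.
    """
    digits = str(int(code))  # accepts 12, "12" or 12.0
    if len(digits) > 2:
        raise ValueError(f"expected at most two responses, got {code!r}")
    labels = []
    for digit in digits:
        if digit not in LEVEL1_LABELS:
            raise ValueError(f"unknown ethnicity digit {digit!r} in {code!r}")
        labels.append(LEVEL1_LABELS[digit])
    return labels


single = decode_ethnicity(1)    # one response
double = decode_ethnicity(12)   # two responses
other = decode_ethnicity("6")   # e.g. 'New Zealander'
```

Returning a list rather than a single label keeps both responses available, so a person coded ‘12’ can be counted in both the European and the Māori groups if the analysis calls for total-response counts.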

The income variable is gross weekly income from all sources, not just wages and salary. This means that if you divide income by hours you do not get hourly rates for jobs. Losses are recorded as negative values.

The hours variable excludes self-employment. This means you cannot assume that everyone who is recorded as working zero hours is either unemployed or not in the labour force.
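The two cautions above can be illustrated with a short Python sketch. The rows below are invented for illustration and are not taken from the SURF, and the function name income_per_hour is hypothetical. The sketch shows why dividing income by hours needs care: people with zero wage and salary hours can still have income (or losses), and even where hours are positive the ratio is not an hourly wage.

```python
# Invented example rows (NOT from the SURF): (weekly hours, weekly income).
records = [
    (40.0, 1000.0),  # wage and salary job
    (0.0, 850.0),    # no wage/salary hours, income from other sources
    (0.0, -120.0),   # no wage/salary hours, net loss recorded as negative
    (0.0, 0.0),      # no wage/salary hours, no income
]


def income_per_hour(hours, income):
    """Return income divided by hours, or None when hours is zero.

    The result is NOT an hourly wage: income covers all sources,
    while hours covers wage and salary jobs only.
    """
    if hours <= 0:
        return None
    return income / hours


ratios = [income_per_hour(h, i) for h, i in records]

# People with zero recorded hours but non-zero income: these may be
# self-employed or have investment or transfer income, so zero hours
# alone does not show they are unemployed or out of the labour force.
zero_hours_with_income = [(h, i) for h, i in records if h == 0 and i != 0]
```

In this made-up example only the first row yields a ratio at all, and two of the three zero-hours rows still carry non-zero income, which is why hours and income are best analysed separately unless the analysis is restricted to a clearly defined group.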

Contact

If you require further information, please contact us toll-free on 0508 525 525 or email us at educationservices@stats.govt.nz.

Published 22 August 2013
