This page is a learning resource for the New Zealand Income Survey (NZIS). It outlines the purpose of the survey and its main uses, and provides a few practical exercises relating to the survey.
The New Zealand Income Survey (NZIS) is a supplement to the June quarter Household Labour Force Survey (HLFS) and provides a snapshot of income levels for people and households. The NZIS gives average weekly income for the June quarter from most sources including government transfers, investments, self-employment and wages and salaries. For those receiving income from wages and salaries, statistics on average and median hourly earnings are also available. (Statistics NZ, 2008a, p6)
The NZIS includes “income for everyone over the age of 15 years and over, including those not in paid employment” (Statistics NZ, 2008a, p2).
This means that, unlike the Quarterly Employment Survey (QES), the Labour Cost Index (LCI) and the Linked Employer-Employee Dataset (LEED) job-level datasets, the NZIS includes people who are not engaged in paid employment (for example beneficiaries, superannuitants, stay-at-home parents and students) as well as those receiving both wages and salaries and other forms of income such as government transfers. As a consequence, the NZIS includes income from a greater number of sources, such as income from government transfers (for example benefits and tax credits) and investments as well as from wages and salaries and from self-employment. (Statistics NZ, 2008a, p6)
NZIS data is collected from a sample survey, which is “… a sample of the population as representative of the whole population” (Statistics NZ, 2008a, p11).
The NZIS dataset is collected from households:
Due to its large sample size, the NZIS is particularly good for making income comparisons across different population groups. For example, using the NZIS it is possible to compare incomes across demographic variables such as age, ethnicity, sex, qualifications and household type. It is also possible to compare the average weekly income of those in paid employment with those not in paid employment as well as the hourly earnings of those in full-time employment with those in part-time employment. This data is useful for exploring the implications of full, partial or non-participation in paid employment upon individual income levels. (Statistics NZ, 2008a, p6)
The NZIS is also very useful when looking at the distribution of income in New Zealand. In particular, the NZIS data on personal income distribution, which is presented in quintiles, shows which demographic characteristics are most strongly associated with a lower income and, in turn, a higher income. Quintiles divide the population into five groups by ranking people in order by the amount of income they receive. The bottom quintile (quintile 1) is the lowest 20 percent of the population in terms of income, while the top quintile (quintile 5) is the highest 20 percent of the population. Quintiles are particularly useful for illustrating how incomes are distributed amongst population groupings. (Statistics NZ, 2008a, pp6–7)
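The ranking step behind quintiles can be sketched in a few lines of Python. This is a minimal illustration only: the function name `assign_quintiles` and the ten weekly incomes are hypothetical, not NZIS data, and the sketch ignores any survey weighting that published statistics would apply.

```python
def assign_quintiles(incomes):
    """Return a quintile number (1 to 5) for each income, in input order."""
    n = len(incomes)
    # Rank people from lowest to highest income.
    order = sorted(range(n), key=lambda i: incomes[i])
    quintiles = [0] * n
    for rank, i in enumerate(order):
        # Split the ranks 0..n-1 into five equal-sized bands.
        quintiles[i] = rank * 5 // n + 1
    return quintiles


# Hypothetical weekly incomes for ten people (illustrative only).
weekly_incomes = [0, 150, 280, 340, 420, 610, 760, 890, 1200, 2500]
quintiles = assign_quintiles(weekly_incomes)

for income, q in zip(weekly_incomes, quintiles):
    print(f"${income:>5} -> quintile {q}")
```

With ten people, each quintile holds two of them: the two lowest incomes fall in quintile 1 and the two highest in quintile 5.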
A couple of points about the NZIS can be confusing for those unfamiliar with the dataset. Three tables on the website under ‘Tables’ relate to employment status: ‘Average Weekly Income for People in Paid Employment’, ‘Average Weekly Income for People Not in Paid Employment’ and ‘Average Weekly Income by Labour Force Status’.
The first table, ‘Average Weekly Income for People in Paid Employment’, looks at the breakdown of income received only by those in paid employment across several sources including wages and salaries, self-employment, government transfers and investments. Put simply, the table takes the amount of money earned by people in paid employment from the various sources listed above and divides each source's total by the number of people in paid employment. The second table provides a similar breakdown, but for those not in paid employment. Those in paid employment are those receiving income from wages and salaries and/or self-employment. The third table, ‘Average Weekly Income by Labour Force Status’, shows income received from all sources combined, according to an individual's labour force status. This table also includes people not in the labour force. Note that to be categorised as ‘in the labour force’ a person must be either currently employed or actively seeking employment.
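The "total for each source divided by the number of people in the group" calculation described for the first two tables can be sketched as follows. The four records, the field names and the function `average_by_source` are hypothetical and chosen only to show the arithmetic; they are not NZIS figures.

```python
def average_by_source(records):
    """Average weekly income per source across ALL people in the group."""
    n_people = len(records)
    sources = records[0].keys()
    # Each source total is divided by everyone in the group, including
    # people who receive nothing from that source.
    return {
        source: sum(person[source] for person in records) / n_people
        for source in sources
    }


# Hypothetical weekly incomes for four people in paid employment.
paid_employment = [
    {"wages_salaries": 900, "self_employment": 0, "government_transfers": 0, "investments": 20},
    {"wages_salaries": 0, "self_employment": 800, "government_transfers": 0, "investments": 40},
    {"wages_salaries": 600, "self_employment": 0, "government_transfers": 100, "investments": 0},
    {"wages_salaries": 500, "self_employment": 0, "government_transfers": 60, "investments": 20},
]

averages = average_by_source(paid_employment)
total_average = sum(averages.values())

for source, value in averages.items():
    print(f"{source}: ${value:.2f}")
print(f"all sources: ${total_average:.2f}")
```

Because every source is averaged over the whole group, the per-source averages add up to the group's average income from all sources combined.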
Finally, one of the more commonly used features of the NZIS is the occupational income data. This data can be accessed by clicking on Table Builder on the home page and then by clicking on income. NZIS occupational income data is given as average hourly earnings, average weekly income, median hourly earnings and median weekly earnings. These earnings variables can then be broken down by sex and by age groupings. The table also shows these occupational earnings as a time series. That is, the table shows occupational earnings yearly from 1998. It is important to note, however, that the occupational groupings provided by the NZIS are very broad. “The NZIS is not an occupational wage measure but can provide estimates of earnings for broad groups of people (so long as there is enough people in the sample to provide a good estimate). For example, we could not give an estimate for a surgeon but we could for Health professionals (excluding Nursing), and, we could not provide a code for a primary school teacher but we could for Primary and Early Childhood teaching professionals.” Therefore, if you want specific occupational data, such as how much a plumber typically gets paid, the NZIS cannot give this.
“The NZIS reports on 'weekly income' and relates specifically to an average week during the June quarter; that is a snapshot in time. Conversion of this weekly income into an annual equivalent is not recommended as an individual's circumstances can change significantly during a year (that is change of job, a period out of work, etc). The Household Economic Survey and the LEED person-level data are better sources of annual income” (Statistics NZ, 2008a, p7).
Here is a definition and an example of the mean. This has been taken directly from Robert Niles' webpage.
This is one of the more common statistics you will see. And it's easy to compute. All you have to do is add up all the values in a set of data and then divide that sum by the number of values in the dataset. Here's an example:
Let's say you are writing about the World Wide Widget Co. and the salaries of its nine employees.
So you add $100,000 + $50,000 + $50,000 + $15,000 + $15,000 + $15,000 + $15,000 + $9,000 + $9,000 (all the values in the set of data), which gives you $278,000. Then divide that total by 9 (the number of values in the set of data).
That gives you the mean, which is $30,889.
Not a bad average salary. But be careful when using this number. After all, only three of the nine workers at WWW Co. make that much money. And the other six workers don't even make half the average salary.
This is why the median is usually the better measure when describing typical income or earnings. A single outlier (one unusually high or low value) can shift the mean substantially, so the mean may not represent the dataset accurately.
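The mean calculation for the World Wide Widget Co. can be checked with a short Python sketch. The salaries are the nine values listed in the example; the variable names are illustrative.

```python
# The nine World Wide Widget Co. salaries from the example.
salaries = [100_000, 50_000, 50_000, 15_000, 15_000, 15_000, 15_000, 9_000, 9_000]

total = sum(salaries)               # add up all the values
mean_salary = total / len(salaries) # divide by the number of values

# How many workers earn at least the mean, and how many earn
# less than half of it?
at_or_above = sum(1 for s in salaries if s >= mean_salary)
below_half = sum(1 for s in salaries if s < mean_salary / 2)

print(f"total = ${total:,}")
print(f"mean  = ${mean_salary:,.0f}")
print(f"{at_or_above} of {len(salaries)} earn at least the mean")
print(f"{below_half} of {len(salaries)} earn less than half the mean")
```

The counts confirm the point made in the example: three workers earn the mean or more, and the other six earn less than half of it.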
Here is a definition and an example of the median. This has been taken directly from Robert Niles' webpage.
Whenever you find yourself writing the words, "the average worker" this, or "the average household" that, you don't want to use the mean to describe those situations. You want a statistic that tells you something about the worker or the household in the middle. That's the median.
Again, this statistic is easy to determine because the median literally is the value in the middle. Just line up the values in your set of data, from largest to smallest or smallest to largest. The one in the dead-center is your median.
For the World Wide Widget Co., here are the workers' salaries:
$100,000, $50,000, $50,000, $15,000, $15,000, $15,000, $15,000, $9,000, $9,000
That's 9 employees. So the one halfway down the list, the fifth value, is $15,000. That's the median. (If halfway lies between two numbers, split ‘em.)
Comparing the mean to the median for a set of data can give you an idea how widely the values in your dataset are spread apart. In this case, there's a somewhat substantial gap between the CEO at WWW Co. and the rank and file. (Of course, in the real world, a set of just nine numbers won't be enough to tell you very much about anything. But we're using a small dataset here to help keep these concepts clear.)
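The "line up the values and take the middle one" rule can be written out as a small Python function. This is a sketch for illustration; the function name `median` is our own, and Python's standard `statistics.median` gives the same results.

```python
def median(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)          # line the values up in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the value in the middle
    # Even count: halfway lies between two numbers, so split them.
    return (ordered[mid - 1] + ordered[mid]) / 2


# The nine World Wide Widget Co. salaries from the example.
salaries = [100_000, 50_000, 50_000, 15_000, 15_000, 15_000, 15_000, 9_000, 9_000]

median_salary = median(salaries)
mean_salary = sum(salaries) / len(salaries)
gap = mean_salary - median_salary

print(f"median = ${median_salary:,}")
print(f"mean   = ${mean_salary:,.0f}")
print(f"mean exceeds median by ${gap:,.0f}")
```

The gap between the two figures is a quick indication of how far the highest salaries pull the mean away from the middle of the list.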
Here's another illustration of this: Ten people are riding on a bus in Redmond, Washington. The mean income of those riders is $50,000 a year. The median income of those riders is also $50,000 a year.
Joe Blow gets off the bus. Bill Gates gets on.
The median income of those riders remains $50,000 a year. But the mean income is now somewhere in the neighbourhood of $50 million or so. A source now could say that the average income of those bus riders is 50 million bucks. But those other nine riders didn't become millionaires just because Bill Gates got on their bus. A reporter who writes that the "average rider" on that bus earns $50,000 a year, using the median, provides a far more accurate picture of those bus riders' place in the economy.
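The bus illustration can be reproduced numerically. Two assumptions are made here that are not in the original: each of the ten riders earns exactly $50,000 (the simplest case in which the mean and median are both $50,000), and the new rider's income is a hypothetical $500 million a year, chosen so that the new mean lands near $50 million.

```python
from statistics import mean, median

# Assumption: every original rider earns exactly $50,000 a year.
riders_before = [50_000] * 10

# One rider gets off and a very high earner gets on.
# Assumption: a hypothetical income of $500 million a year.
riders_after = [50_000] * 9 + [500_000_000]

mean_before, median_before = mean(riders_before), median(riders_before)
mean_after, median_after = mean(riders_after), median(riders_after)

print(f"before: mean ${mean_before:,.0f}, median ${median_before:,.0f}")
print(f"after:  mean ${mean_after:,.0f}, median ${median_after:,.0f}")
```

Under these assumptions the median stays at $50,000 while the mean rises to roughly $50 million, even though nine of the ten incomes are unchanged.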
The mode is the most common number in a dataset. Using the example from the median section, see how the mode is found.
For the World Wide Widget Co., here are the workers' salaries:
$100,000, $50,000, $50,000, $15,000, $15,000, $15,000, $15,000, $9,000, $9,000
The most common number here is $15,000. However, for those who earn a salary of $50,000, or even $100,000, the mode does not accurately represent the different salaries.
Like the mean, the mode is often not very representative of the population when discussing income and earnings.
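Finding the mode amounts to counting how often each value occurs. A minimal Python sketch using the same nine salaries (variable names are illustrative):

```python
from collections import Counter

# The nine World Wide Widget Co. salaries from the example.
salaries = [100_000, 50_000, 50_000, 15_000, 15_000, 15_000, 15_000, 9_000, 9_000]

counts = Counter(salaries)                   # how many workers earn each salary
mode_salary, mode_count = counts.most_common(1)[0]

# Workers whose salary differs from the mode.
not_at_mode = len(salaries) - mode_count

print(f"mode = ${mode_salary:,} ({mode_count} workers)")
print(f"{not_at_mode} of {len(salaries)} workers earn something else")
```

More than half of the workers earn a salary other than the mode, which is why the mode alone says little about the spread of salaries.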
1) What is the median weekly income of the following group of people?
Median is $_____.
2) Calculate the mean, median, and mode for the following company’s weekly incomes for staff. Which measure is the most appropriate to use to reflect the average? Why?
Mean is $_____.
Median is $_____.
Mode is $_____.
The most appropriate measure is the __________. This is because
Statistics New Zealand (2008a). User Guide for Statistics New Zealand’s Wage and Income Measures.
Statistics New Zealand (2008b). New Zealand Income Survey: June 2008 quarter.
Robert Niles’ webpages on mean and median.
2) Mean = (1100 + (4 x 800) + (3 x 500) + 350) / 9 = $683.33
The mean is not appropriate here because it does not reflect the weekly incomes actually earned. Four of the nine staff members earn well below $683.33 per week, whereas the other five earn more than the mean. The CEO’s high weekly income pulls the mean upwards, so the mean is not very representative of weekly incomes across the whole company.
Median = the middle (fifth) value when the nine weekly incomes are ranked in order = $800.
Mode = the most common number = $800 (four staff earn this amount).
Although this is the same answer as the median, this is not the best answer to use. Imagine if there were four receptionists (earning $350) and three marketing analysts (earning $800). This would mean the mode is $350, which does not paint an accurate picture of the rest of the staff’s weekly incomes.
The most appropriate measure is the median as it gives an idea of the middle of the range or distribution of the different weekly incomes. The mean has a tendency to be skewed by extreme numbers. The mode can be affected if the most common number does not represent the overall range of numbers (for example, if it is the highest or lowest number).
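The answers to question 2 can be checked with a short Python sketch. The nine weekly incomes are taken from the mean calculation in the answer; the alternative staff list used for the mode comparison is our own reading of the "four receptionists and three marketing analysts" scenario and should be treated as an assumption.

```python
from collections import Counter
from statistics import mean, median

# Weekly incomes from the answer: 1100, four at 800, three at 500, one at 350.
weekly_incomes = [1100] + [800] * 4 + [500] * 3 + [350]

mean_income = mean(weekly_incomes)
median_income = median(weekly_incomes)
mode_income, mode_count = Counter(weekly_incomes).most_common(1)[0]

print(f"mean   = ${mean_income:.2f}")
print(f"median = ${median_income}")
print(f"mode   = ${mode_income} ({mode_count} staff)")

# Assumed alternative scenario: four staff on $350 and three on $800,
# with the other incomes unchanged.
alternative = [1100] + [800] * 3 + [500] * 3 + [350] * 4
alt_mode, alt_count = Counter(alternative).most_common(1)[0]
print(f"alternative mode = ${alt_mode} ({alt_count} staff)")
```

In the alternative scenario the mode drops to the lowest income in the company, which illustrates why the mode can be a poor summary of the overall range.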