




3. The 0.58 shows that the population is declining (because it is less than 1). For every 15 to 19 year old in the area unit in 1991, there is only 0.58 of a 25 to 29 year old in 2001. More people have moved away or died than have moved into, or been born in, the area.
4. The residuals show the two clusters in the data mentioned before. There is no other obvious pattern in the data. The outliers in the data are also shown in the plot.
5. The outlier for Gore is quite close to the model and should not be removed. It shows the same decline in population as the other areas do. Te Anau could be removed as it is difficult to be sure how many of the population shown are permanent residents. It is in a different category from the other area units because it is a popular tourist destination. Numbers may include overseas visitors as well as casual workers. If the population count were replaced by the number of permanent residents, it would give a clearer picture, and it would then be hard to justify removing Te Anau. Some of the other area units which show increases are also tourist areas (e.g. Milford).
6. The coefficient of determination (R²) is 0.4876. This shows that only about 49 percent of the variation in the area unit population in 2001 is explained by the regression model. The correlation coefficient is 0.70. This shows there is a reasonable linear relationship between the numbers in 1991 and the numbers in 2001.
7. The removal of the outlier improves the model (see below). It now explains 75 percent of the variation. This means that predictions are likely to be more accurate. 
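As an aside on answers 3 and 6, the arithmetic behind the quoted figures can be checked with a short sketch. It assumes that 0.58 is the slope of the fitted line and that the correlation is positive; the cohort size of 100 is a made-up number used only for illustration.

```python
import math

slope = 0.58        # value quoted in answer 3 (assumed to be the slope)
r_squared = 0.4876  # coefficient of determination quoted in answer 6

# Hypothetical difference in cohort size between two area units,
# chosen only to illustrate what the slope means.
extra_15_to_19_in_1991 = 100
extra_25_to_29_in_2001 = slope * extra_15_to_19_in_1991

# In simple linear regression the correlation coefficient is the square
# root of R squared, taking the sign of the slope (positive here).
r = math.sqrt(r_squared)

print(f"Extra 25 to 29 year olds expected in 2001: {extra_25_to_29_in_2001:.0f}")
print(f"Correlation coefficient: {r:.2f}")
```

A slope below 1 means the 2001 count grows more slowly than the 1991 count, which is why the answer reads it as a decline.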




Part 2: Changes in the north





1. The data shown in a scatter plot with a linear model (above).
2. Both datasets show that the linear model is a reasonable one which explains over 70 percent of the variation. There is very good correlation (more than 0.85) between the data for 1991 and for 2001 in both datasets. The slope of the regression line for Waitakere shows a slight increase in population over the 10 years as the slope is slightly more than 1. For Southland and Gore a large decrease is shown.
4. The numbers in the area units are larger overall in Waitakere.
(a) Because we are looking at movements in the population, it makes sense to try to use the same cohort to see if those people are still there after the 10 years. However, there are lots of other things which make this less valid. Many people leave school between age 16 and 19 and they often go elsewhere to study. Many will not go back afterwards. In addition, it is unlikely that the 15 to 19 year olds will still be living in the same area unit by the next census, even if they stayed in the area. The 25 to 29 year olds are likely to be largely a different group who have moved into the area. So the changes may be just for that age group not the population in general. We are comparing two completely different things and this makes our analysis somewhat meaningless.
(b) If we used the same age range in each census they would be different people and we would have to assume that the percentage of that age group in the population stayed the same. However, this would probably show more realistic results for this age group.
5. In Southland there are very few areas above the y = x line. This means that few places increased in population over the 10 years. In Waitakere there are almost equal numbers of increases and decreases.
6. The high correlation shows that there is a strong linear relationship in both datasets. Most area units with a small population in 1991 still had a small one in 2001, and units with a large population in 1991 were still large in 2001. So this is likely to contribute to the number in the later census. However, the reason for the number of people in the area unit is likely to be related to its location. For example, city locations are likely to have more people than rural locations, where the population is more spread out.
7. More 15 to 19 year olds died or left Southland and Gore than came in. There was a small increase in Waitakere. However, there is no evidence that the people who left Southland went to Waitakere. Possibly more relevant is that we are comparing a rural area with a city area. Many young people go to the city after they leave school, and this is likely to matter more than whether the city lies to the north or the south.
It would make more sense to compare like with like: a city in the south with one in the north, for example, or two rural areas. Choosing suburbs with the same characteristics could also be useful, for example two that are close to a university.
Choosing a different age group might be more useful. People with families are often less likely to move around. The same cohort is then more likely to have more of the same people in it.
Looking at the total population rather than a subset would also be useful.
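The comparisons made in answers 2 and 5 (slope, proportion of variation explained, and the number of area units above the y = x line) can be sketched in a few lines. This is only an illustration: the counts below are hypothetical and are not taken from the census data, and the function names are invented for this example.

```python
def fit_line(xs, ys):
    """Least-squares line through (xs, ys): slope, intercept, R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def count_above_y_equals_x(xs, ys):
    """Number of area units whose 2001 count exceeds their 1991 count."""
    return sum(1 for x, y in zip(xs, ys) if y > x)


# Hypothetical counts for five area units (NOT the real data).
aged_15_to_19_in_1991 = [100, 200, 300, 400, 500]
aged_25_to_29_in_2001 = [110, 120, 170, 240, 290]

slope, intercept, r_squared = fit_line(aged_15_to_19_in_1991, aged_25_to_29_in_2001)
increases = count_above_y_equals_x(aged_15_to_19_in_1991, aged_25_to_29_in_2001)

print(f"slope = {slope:.2f}, intercept = {intercept:.1f}, R squared = {r_squared:.2f}")
print(f"area units above y = x: {increases} of {len(aged_15_to_19_in_1991)}")
```

With the real spreadsheet data, the same two functions could be run once for Southland and Gore and once for Waitakere to compare the two regions side by side.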
Related links
Similar data can be obtained from Table Builder using this link.
A copy of the raw data used in this activity is available here: Drift to the north.xls (37 KB)