
Time series for marriages – level 3 teachers page

 

Secondary activities

Time series for marriages activity

Curriculum links

NCEA Mathematics Achievement Standard AS90641

  • Determine the trend for time series data

Mathematics: Statistics strand – level 8

  • Use graphs, moving averages, separation into smooth and rough (with awareness of additive and multiplicative models) to explore time series.


Background

This activity is designed as a practice exercise for AS90641. Its purpose is to help students think more critically about their results and forecasts, and to illustrate a case where a single linear trend is obviously unsuitable as a model. While it gives students the opportunity to practise the basic Achieve and Merit level skills, its main focus is on practising an Excellence level analysis.


In particular the focus is on the following explanatory notes:


Merit
Forecast will be based on:

  - estimates of the trend for smoothed data

The report will include justified comments on some of the following:

  - appropriateness of the model
  - relevance and usefulness of forecast
  - discussion of features of the time series data
  - improvements to the model.


The data is the number of marriages registered each quarter from 1951 to 1985. It was collected directly from the Registrar of Births, Deaths and Marriages. Marriages in the last 10 days of a quarter may not all be included, as registrars have 10 days to send in the paperwork. Quarterly data is no longer available. Further information on the current statistics is available here:





Suggested answers


There is a copy of the answers at the end of the Excel spreadsheet, with the calculations and graphs on which the answers were based.


Part A: Trends and forecasting

Question 1.1
The equation is y = 30.66x + 3783.2 showing an increase of approximately 31 marriages per quarter.
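As a quick check on this interpretation, the slope can be read as the change in the trend from one quarter to the next. The sketch below uses the equation from the answer; how x is numbered (which quarter is x = 1) is not stated here, so the quarter numbers used are assumptions for illustration only.

```python
# Linear trend from Question 1.1: y = 30.66x + 3783.2
SLOPE = 30.66        # change in the trend per quarter
INTERCEPT = 3783.2   # trend value at x = 0


def linear_trend(x):
    """Trend value for quarter number x (the numbering of x is an assumption)."""
    return SLOPE * x + INTERCEPT


# The difference between any two consecutive quarters is the slope.
per_quarter = linear_trend(11) - linear_trend(10)
# Over four quarters the trend rises by four times the slope.
per_year = linear_trend(14) - linear_trend(10)

print(round(per_quarter, 2))  # 30.66
print(round(per_year, 2))     # 122.64
```

So an increase of about 31 marriages per quarter corresponds to roughly 123 more marriages per quarter one year later.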

Question 1.2

The moving average line is above the linear model of the trend at the ends and below it in the middle. This shows that a curve may be more useful as a model.
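For readers who want to reproduce the moving average line outside Excel, here is a minimal sketch of a centred moving average for quarterly data. It assumes the spreadsheet uses a centred four-quarter average, which is a common choice but is not confirmed here; the data values are hypothetical.

```python
def centred_moving_average(values, period=4):
    """Centred moving average for an even period (4 for quarterly data).

    Returns a list the same length as values, with None at the ends
    where the average cannot be calculated.
    """
    half = period // 2
    result = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]  # period + 1 values
        # The two end values get half weight so every quarter counts equally.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        result[i] = total / period
    return result


# Hypothetical series: a straight-line trend plus a repeating seasonal pattern.
seasonal = [5, -3, -10, 8]  # sums to zero over a year
values = [100 + 2 * i + seasonal[i % 4] for i in range(12)]
smoothed = centred_moving_average(values)
print(smoothed[:4])  # [None, None, 104.0, 106.0]
```

Because the seasonal pattern in this made-up series sums to zero over a year, the moving average recovers the underlying straight line exactly. With real data the smoothed line will bend, which is what allows it to be compared with the linear model.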


Question 1.3

1972    Forecast    Actual
Mar     7,167       7,261
Jun     6,575       6,849
Sept    5,255       5,460
Dec     6,668       7,298


The actual numbers will probably be higher, as the moving average was above the model at this point and seemed to be curving up. The last two values of the moving average are almost equal, so a change in the trend may be coming.
Note: The seasonal component was calculated using a moving average. If an overall average was used then answers may differ.
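One way to read the note above is that each quarter's seasonal effect is the average gap between the raw data and the moving average for that quarter, and the forecast is the trend value plus that effect (an additive model). The sketch below illustrates this under that assumption; the function names and all numbers are hypothetical and are not taken from the spreadsheet.

```python
def seasonal_effects(values, smoothed, period=4):
    """Average of (raw - smoothed) for each position in the yearly cycle."""
    sums = [0.0] * period
    counts = [0] * period
    for i, (raw, smooth) in enumerate(zip(values, smoothed)):
        if smooth is None:  # no moving average at the ends of the series
            continue
        sums[i % period] += raw - smooth
        counts[i % period] += 1
    return [s / c for s, c in zip(sums, counts)]


def additive_forecast(trend_value, effects, quarter_index, period=4):
    """Forecast = trend value + seasonal effect for that quarter."""
    return trend_value + effects[quarter_index % period]


# Hypothetical raw data and moving average (None where it is undefined).
values = [105, 99, 94, 114, 113, 107, 102, 122, 121, 115]
smoothed = [None, None, 104, 106, 108, 110, 112, 114, None, None]

effects = seasonal_effects(values, smoothed)
print(effects)  # [5.0, -3.0, -10.0, 8.0]

# Forecast for the next quarter (index 10), assuming the trend reaches 120.
print(additive_forecast(120, effects, 10))  # 110.0
```

Averaging the gaps over the whole series, or using a different smoothing method, would give slightly different seasonal effects and therefore slightly different forecasts.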

Question 2.1

Students may use higher-order polynomials, but the reasons given below suggest they are a worse model in the long term.

Question 2.2

1972    Forecast    Actual
Mar     7,750       7,261
Jun     7,198       6,849
Sept    5,918       5,460
Dec     7,372       7,298

These predictions should be closer to the actual values because the polynomial model fits the moving average much more closely. However, the problem with polynomials is that they increase very rapidly and so the forecasts are too high.


Question 3

1972    Actual    Forecast (linear)    Forecast (polynomial)
Mar     7,261     7,167                7,750
Jun     6,849     6,575                7,198
Sept    5,460     5,255                5,918
Dec     7,298     6,668                7,372

For the linear model (question 1), the actual value for 1972 was higher than the forecast in every quarter. The moving average trend was above the linear model at the end, showing that the number of marriages was increasing more quickly than the model showed. The seasonal variation is also increasing compared with the forecasts. The gap between the forecast and actual values varied, as shown by the comparison graph, indicating a possible change in the seasonal pattern from the previous quarter. Looking at the seasonal pattern in the graph of the actual data, the highs and lows do seem to be becoming more extreme over the years.

The forecasts from the polynomial model were all too high. This model fitted the moving average more closely than the linear model, but one problem with polynomials is that they increase very quickly at the end and tend to overestimate real data. There also seems to have been a change in the pattern in the next year. Looking closely at the graph, the last value in the moving average did not show an increase. Clearly the number of marriages is no longer increasing at the same rate.
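The comparison can be made concrete by calculating the forecast errors from the Question 3 table. The sketch below uses the 1972 values from that table; the mean absolute error is one possible summary measure chosen here for illustration and is not part of the achievement standard.

```python
# 1972 values from the Question 3 table.
actual = {"Mar": 7261, "Jun": 6849, "Sept": 5460, "Dec": 7298}
linear = {"Mar": 7167, "Jun": 6575, "Sept": 5255, "Dec": 6668}
polynomial = {"Mar": 7750, "Jun": 7198, "Sept": 5918, "Dec": 7372}


def errors(forecast):
    """Actual minus forecast for each quarter (positive = forecast too low)."""
    return {quarter: actual[quarter] - forecast[quarter] for quarter in actual}


def mean_absolute_error(forecast):
    """Average size of the errors, ignoring their sign."""
    errs = errors(forecast)
    return sum(abs(e) for e in errs.values()) / len(errs)


print(errors(linear))      # {'Mar': 94, 'Jun': 274, 'Sept': 205, 'Dec': 630}
print(errors(polynomial))  # {'Mar': -489, 'Jun': -349, 'Sept': -458, 'Dec': -74}
print(mean_absolute_error(linear))      # 300.75
print(mean_absolute_error(polynomial))  # 342.5
```

Every linear error is positive (forecast too low) and every polynomial error is negative (forecast too high). On this measure the linear forecasts are slightly closer on average, even though the polynomial followed the moving average more closely.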

[This becomes more obvious in part B where the entire file is analysed. It becomes clear then that the number of marriages declines from September 1971.]



Part B: Using other models for the trend 

This part is designed to give students the opportunity to be critical of a model and justify using different models in different parts of the data.  These are evaluated using forecasts as well as by looking at the patterns in the graphs.

Question 1.1

Between 1965 and 1975 the moving average line curves above the linear model. After 1975 it lies under the model. The formula of the line is y = 17.453x + 4282.6. This implies that the number of marriages is increasing by approximately 17 per quarter. However, the number is declining in some parts of the series and increasing quite quickly in others.


Question 1.2

None of the other available models seems to fit the data particularly well. The hump between 1965 and 1975 remains a problem. A polynomial of order 5 or 6 looked the best, but the endpoints were heading in the wrong direction. Polynomials seldom make good models for forecasting as they increase too quickly.


Question 2

The split shown in the answer file (below) is one possibility. The trend changes direction during 1971, which makes it a sensible place to split the data.


Question 2.1

The linear model of the moving average correctly identifies that the number of registered marriages is increasing by about 29 marriages per quarter in the earlier data, and decreasing by about 10 per quarter in the later data. The curve in the moving average line seems to indicate that a polynomial would be a better fit. But polynomials generally don’t give a good forecast, as their forecasts tend to be too high.
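The idea of fitting a separate straight line to each part of the series can be sketched as follows. This is a minimal least-squares illustration, not the spreadsheet's method; the function names and the data are hypothetical, constructed so that the first part rises by 29 per quarter and the second part falls by 10 per quarter.

```python
def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_split(xs, ys, split_index):
    """Fit one line before the split and another from the split onwards."""
    early = fit_line(xs[:split_index], ys[:split_index])
    late = fit_line(xs[split_index:], ys[split_index:])
    return early, late


# Hypothetical smoothed data: rising for 10 quarters, then falling.
xs = list(range(20))
ys = [4000 + 29 * x for x in range(10)] + [4261 - 10 * (x - 9) for x in range(10, 20)]

early, late = fit_split(xs, ys, 10)
print(round(early[0], 2))  # 29.0
print(round(late[0], 2))   # -10.0
```

A single line fitted to all 20 points would report a slope somewhere between the two, which hides the change of direction. Only the later line should be used for forecasting beyond the end of the data.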

 

Question 2.2 

For a split at 1978 this line gives the best fit. During 1982 the moving average values increased slightly, rising above the trend line. By 1984 the moving average line had dipped below the trend line. This could be the start of another trend, with the number of marriages dropping again, or it could be random variation. A forecast based on this line is likely to be slightly too high. In addition, the seasonal variation is much higher here than earlier in the data. A multiplicative model is likely to give a better estimation of the trend in this case.
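In a multiplicative model the seasonal component is a ratio rather than a difference, so the size of the seasonal swing changes in proportion to the trend. The sketch below shows one way this could be calculated, assuming seasonal indices are taken as the average of raw value divided by smoothed value; the function names and all numbers are hypothetical.

```python
def seasonal_indices(values, smoothed, period=4):
    """Average of (raw / smoothed) for each position in the yearly cycle."""
    sums = [0.0] * period
    counts = [0] * period
    for i, (raw, smooth) in enumerate(zip(values, smoothed)):
        if smooth is None:  # skip quarters with no smoothed value
            continue
        sums[i % period] += raw / smooth
        counts[i % period] += 1
    return [s / c for s, c in zip(sums, counts)]


def multiplicative_forecast(trend_value, indices, quarter_index, period=4):
    """Forecast = trend value x seasonal index for that quarter."""
    return trend_value * indices[quarter_index % period]


# Hypothetical data in which the seasonal swing scales with the trend.
smoothed = [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700]
values = [1100, 1100, 960, 1430, 1540, 1500, 1280, 1870]

indices = seasonal_indices(values, smoothed)
print([round(ix, 2) for ix in indices])  # [1.1, 1.0, 0.8, 1.1]

# Forecast for the next quarter (index 8), assuming the trend reaches 1800.
print(round(multiplicative_forecast(1800, indices, 8), 1))  # 1980.0
```

In this made-up series the third-quarter dip is 240 below the trend in the first year and 320 below it in the second, yet the index stays at 0.8. An additive model would need a different seasonal effect for each year to describe the same pattern.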


Question 2.4

As expected, the forecast values are too high. 


Marriage activity answers.xls