311: Cross-validated

The cross-validated calculation option fits a CCA, PCR, or MLR model using all data within the training period. It also works with the GCM option. The option produces cross-validated forecasts for each year in the training period. These cross-validated forecasts, and therefore the available validation statistics, are purely deterministic: no uncertainty estimates are generated. If a set of past probabilistic forecasts is required, the retroactive or the double cross-validation option should be selected.

At each cross-validation step, k consecutive years (by default, k = 5) are omitted from the training period, where k is the length of the cross-validation window, and the middle year of the omitted years is forecast. This process is repeated with another set of k years omitted until a cross-validated prediction has been made for each year in the training period. At each step, the model is completely reconstructed, including, if appropriate, recalculating the principal components and redefining the category thresholds.
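The procedure can be illustrated with a minimal sketch. This is not CPT's own code: it assumes a single predictor and an ordinary least-squares line in place of a CCA, PCR, or MLR model, an odd window length k (so that the middle year is unambiguous), and a window that wraps around the ends of the training period. The function names fit_line and cross_validated_forecasts are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a stand-in for the real model)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cross_validated_forecasts(xs, ys, k=5):
    """Forecast every year from a model refitted without the k-year window
    centred on that year. Assumes k is odd and len(xs) > k + 1."""
    n = len(xs)
    half = k // 2
    forecasts = []
    for i in range(n):
        # Years omitted at this step (0-based indices, wrapped at the ends).
        omitted = {(i + d) % n for d in range(-half, half + 1)}
        train = [j for j in range(n) if j not in omitted]
        # The model is rebuilt from scratch using only the retained years.
        a, b = fit_line([xs[j] for j in train], [ys[j] for j in train])
        forecasts.append(a + b * xs[i])
    return forecasts
```

In a real application, any step that depends on the training sample (for example, computing principal components or category thresholds) would be repeated inside the loop in the same way as the fit.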

Towards the beginning and end of the training period the cross-validation window is looped to ensure that exactly k years are always omitted. For example, if k = 5, when forecasting the first year, the first three years are omitted together with the last two; when forecasting the second year, the first four years are omitted together with the last one.
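The looping of the window can be checked with a short sketch. The function name omitted_years and the 30-year training period are assumptions made for illustration only; years are numbered from 1, and k is assumed to be odd.

```python
def omitted_years(target, n_years, k=5):
    """Return the years (numbered from 1) omitted when forecasting `target`,
    wrapping the window around the ends of the training period."""
    half = k // 2
    return sorted(((target - 1 + d) % n_years) + 1
                  for d in range(-half, half + 1))


print(omitted_years(1, 30))   # [1, 2, 3, 29, 30]
print(omitted_years(2, 30))   # [1, 2, 3, 4, 30]
print(omitted_years(15, 30))  # [13, 14, 15, 16, 17]
```

The first two calls reproduce the example above: the first three years plus the last two, then the first four years plus the last one.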

The cross-validated forecasts are made available for output to a file, and for performance analyses within CPT. Note that all the information regarding the definition of the model used to make a forecast, such as the principal components and CCA modes or the regression model, is based on the results using all the data within the training period, but all the results regarding the performance of the model (validation) are based on the cross-validated forecasts.