Transforming the Data

If the Generalised Linear Model (GLM) options are unsuitable but the data are not normally distributed, CPT can transform the predictand data to a normal distribution. One transformation option is based on the empirical distribution, and so will work for both positively and negatively skewed data, but it may not be very effective if there are numerous ties in the data (for example, many cases of zero precipitation). The other option is to transform the data by fitting a gamma distribution, which may be suitable for positively skewed data, but again is unlikely to work well if there are numerous ties. (In most cases the GLM for gamma data is likely to be preferable to the gamma transformation.)

For both transformation options, each predictand value is converted to a quantile, which is then converted to a normal deviate. The inverse process is applied to convert predictions back to the original distribution. Because the predictions are transformed back, the transformation is effectively hidden from the user, although results such as principal component loadings and regression coefficients apply to the transformed data. The transformation is often effective in eliminating or minimising forecasts in which the probability of the normal category is smaller than the probabilities of the outer categories (in cases when the categories are climatologically equiprobable).
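The quantile-to-normal-deviate step for the empirical option can be illustrated with a short Python sketch. This is not CPT's code: the function names, the rank / (n + 1) plotting position, and the linear interpolation used for the inverse are all assumptions chosen for illustration, and CPT may handle these details differently.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def to_normal_deviates(values):
    """Map each value to a normal deviate via its empirical quantile.

    Assumption: the quantile is rank / (n + 1), which keeps it strictly
    between 0 and 1. Tied values receive different ranks here, which is
    one reason a rank-based transformation copes poorly with many ties.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    deviates = [0.0] * n
    for rank, i in enumerate(order, start=1):
        quantile = rank / (n + 1)
        deviates[i] = _STD_NORMAL.inv_cdf(quantile)
    return deviates


def from_normal_deviate(z, training_values):
    """Map a predicted normal deviate back to the original scale.

    Assumption: linear interpolation between the sorted training values,
    clamped to the observed range.
    """
    ordered = sorted(training_values)
    n = len(ordered)
    position = _STD_NORMAL.cdf(z) * (n + 1) - 1  # fractional index
    position = min(max(position, 0.0), n - 1.0)
    lower = int(position)
    upper = min(lower + 1, n - 1)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Hypothetical seasonal rainfall totals, for illustration only.
rainfall = [0.0, 12.5, 3.1, 40.2, 7.8]
deviates = to_normal_deviates(rainfall)
recovered = [from_normal_deviate(z, rainfall) for z in deviates]
```

In this sketch the median value maps to a deviate of zero, and applying the inverse to the deviates recovers the original values to within rounding error.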

To activate the transformation, use the Options ~ Data ~ Transform Y Data menu item, which will toggle the option. A tick is shown next to the item if the transformation is activated. Note that the transformation can slow the computation noticeably.

For predictands such as precipitation, it may also be desirable to set an absolute lower limit of zero so that negative values are not predicted. The Options ~ Data ~ Zero-Bound menu item will toggle an option to reset all negative predictions to zero. A tick is shown next to the item if the zero-bound is activated.
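The effect of the zero-bound on a set of predictions can be sketched in Python as follows. The function name and the sample values are hypothetical; the sketch only mirrors the behaviour described above, in which negative predictions are reset to zero and all other values are left unchanged.

```python
def apply_zero_bound(predictions):
    """Reset negative predictions to zero; leave other values unchanged."""
    return [max(0.0, p) for p in predictions]


# Hypothetical predicted precipitation amounts, for illustration only.
bounded = apply_zero_bound([-1.2, 0.0, 3.4])
```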
