Unreferenced Datasets

Unreferenced or index data files organise data in the same way as station datasets, i.e. the rows represent time, and the columns the indices. The only difference is that the latitude and longitude lines are omitted. The tags are very similar to those for station data. Compulsory tags are:

  • cpt:nrow= number of time steps
  • cpt:ncol= number of indices
  • cpt:row=T (representing time)
  • cpt:col=index

The number of time steps is usually the number of years of data available, but if there are lagged fields, the number of time steps will increase. For example, if data are available for January and February 1971 to 2000, there are two months (one lagged field) and thirty years of data, making a total of 60 time steps. If there are multiple fields, then only the number of time steps for the current field should be listed. If any of the months are missing from the file, cpt:nrow should specify the number of time steps that are actually present. For example, if the data for February 1980 are missing, then cpt:nrow should indicate 59 time steps. Alternatively, if the month is included in the file but all the indices have missing values listed, then cpt:nrow should indicate that 60 time steps are available. See the section on missing values for further details.
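
The arithmetic above can be sketched in a few lines of Python. This is only an illustration; the function name count_time_steps and its parameters are hypothetical and are not part of CPT.

```python
def count_time_steps(n_years: int, n_months: int, n_absent_rows: int = 0) -> int:
    """Return a value for cpt:nrow (hypothetical helper, not a CPT function).

    n_months is the number of months (lags) stored per year.
    n_absent_rows counts rows that are left out of the file entirely.
    Rows that are present but filled with the missing value flag still
    count as time steps, so they must not be included in n_absent_rows.
    """
    if n_years < 0 or n_months < 1 or n_absent_rows < 0:
        raise ValueError("counts must be non-negative, with at least one month")
    total = n_years * n_months
    if n_absent_rows > total:
        raise ValueError("cannot omit more rows than exist")
    return total - n_absent_rows


# January and February, 1971 to 2000: 30 years x 2 months
print(count_time_steps(30, 2))                   # 60
# Same period, but the February 1980 row is left out of the file
print(count_time_steps(30, 2, n_absent_rows=1))  # 59
```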

There is little use for multiple fields in unreferenced files, but they are permitted primarily to allow for large numbers of indices if it is impractical to list them all in one line. If multiple fields are used for this reason, it is more efficient to divide the indices approximately equally between the fields. All fields must have the same number of lagged fields, but the lags do not have to be identical.
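
As a rough sketch of dividing indices approximately equally between fields, the following Python splits a list of index names into groups whose sizes differ by at most one. The function name split_indices and the sample names are assumptions made for illustration; CPT itself does not provide such a helper.

```python
def split_indices(names, n_fields):
    """Split names into n_fields contiguous groups of nearly equal size.

    Hypothetical helper: group sizes differ by at most one, with the
    larger groups placed first.
    """
    if n_fields < 1 or n_fields > len(names):
        raise ValueError("n_fields must be between 1 and the number of indices")
    base, extra = divmod(len(names), n_fields)
    groups = []
    start = 0
    for i in range(n_fields):
        size = base + (1 if i < extra else 0)
        groups.append(names[start:start + size])
        start += size
    return groups


# Ten made-up index names divided between three fields
names = [f"IDX{i:03d}" for i in range(1, 11)]
groups = split_indices(names, 3)
print([len(g) for g in groups])  # [4, 3, 3]
```

Each group would then be written as its own field, with cpt:ncol set to the size of that group.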

Optional tags are:

  • cpt:field= abbreviated generic name for the indices
  • cpt:units= units in which the data are stored
  • cpt:missing= missing value flag

Any additional tags are ignored. For further details, see the section on CPT tags.

Immediately beneath the tag-line should be a line containing the names, or some kind of identifying reference, for each of the ncol indices. The name for each index must not be longer than 16 characters, and should contain no spaces (for example, New York would need to be written as New_York or NewYork). An example is shown below.

xmlns:cpt=http://iri.columbia.edu/CPT/v10/
cpt:nfields=1
cpt:field=prcp, cpt:nrow=8, cpt:ncol=5, cpt:row=T, cpt:col=index, cpt:missing=-2.
                  A       B       C       D       E
1960-01        -2.0   159.4   306.9    35.1   167.4
1960-02       590.5    53.4   237.0   113.4   411.3
1961-01       118.5    10.5    37.4     0.0    -2.0
1961-02        61.0    15.2    69.0     0.0    34.3
1962-01        -2.0     2.3   194.5     3.3    98.6
1962-02       130.9     3.6    41.2     0.0    53.3
1963-01       241.9     2.8   213.7     9.6   141.1
1963-02        86.1     2.2   108.5     4.3   237.0
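
A file with this layout could be generated along the following lines. This is a minimal sketch, not a CPT utility: the functions clean_name and format_index_file are hypothetical, the column widths are arbitrary, and the sketch assumes that whitespace-separated columns are acceptable, as in the example above.

```python
import re

MAX_NAME_LEN = 16  # limit on index names stated in the text


def clean_name(name: str) -> str:
    """Replace runs of whitespace with underscores and enforce the length limit."""
    cleaned = re.sub(r"\s+", "_", name.strip())
    if not cleaned:
        raise ValueError("index name is empty")
    if len(cleaned) > MAX_NAME_LEN:
        raise ValueError(f"index name {cleaned!r} is longer than {MAX_NAME_LEN} characters")
    return cleaned


def format_index_file(field, names, rows, missing=-2.0):
    """Build the text of a single-field unreferenced file.

    rows is a list of (date_label, values) pairs; a value of None is
    written as the missing value flag.
    """
    names = [clean_name(n) for n in names]
    for date, values in rows:
        if len(values) != len(names):
            raise ValueError(f"row {date} has {len(values)} values, expected {len(names)}")
    lines = [
        "xmlns:cpt=http://iri.columbia.edu/CPT/v10/",
        "cpt:nfields=1",
        f"cpt:field={field}, cpt:nrow={len(rows)}, cpt:ncol={len(names)}, "
        f"cpt:row=T, cpt:col=index, cpt:missing={missing}",
    ]
    width = max(8, max(len(n) for n in names) + 1)
    # Name line, indented so that names sit above their data columns
    lines.append(" " * 10 + "".join(f"{n:>{width}}" for n in names))
    for date, values in rows:
        cells = "".join(f"{(missing if v is None else v):>{width}.1f}" for v in values)
        lines.append(f"{date:<10}{cells}")
    return "\n".join(lines) + "\n"


text = format_index_file(
    "prcp",
    ["New York", "B"],
    [("1960-01", [None, 159.4]), ("1960-02", [590.5, 53.4])],
)
print(text)
```

Note that cpt:nrow and cpt:ncol are derived from the data rather than typed by hand, which avoids a mismatch between the tag-line and the rows that follow.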

It is possible for some of the dates to be missing, although there are some restrictions. For further details see the section on missing values.

To verify index forecasts in CPT using the Probabilistic Forecast Verification (PFV) option, the X file needs to contain the tag cpt:ncats=3, indicating that the file contains probabilities for three categories. The first block of data should then contain the probabilities for category 1, which is the below-normal category. The category number should be indicated using the cpt:C tag and, optionally, the climatological probability for the category using the cpt:clim_prob tag. The forecasts for categories 2 and 3 should then follow immediately, as if they were unstacked fields. For example, an index probabilistic forecast input file might look something like the following:

xmlns:cpt=http://iri.columbia.edu/CPT/v10/
cpt:ncats=3
cpt:field=prcp, cpt:C=1, cpt:clim_prob=0.333333333333, cpt:nrow=3, cpt:ncol=4, cpt:row=T, cpt:col=index, cpt:units=%, cpt:missing=-9999
                A     B     C     D
2000-01/03  -9999 -9999 -9999 -9999
2001-01/03     50    50    45    40
2002-01/03     35    25    25    15
cpt:C=2, cpt:clim_prob=0.333333333334
                A     B     C     D
2000-01/03  -9999 -9999 -9999 -9999
2001-01/03     35    40    45    35
2002-01/03     40    35    30    25
cpt:C=3, cpt:clim_prob=0.333333333333
                A     B     C     D
2000-01/03  -9999 -9999 -9999 -9999
2001-01/03     25    25    20    40
2002-01/03     25    40    45    60
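
To make the block structure concrete, the sketch below reads a file of this kind into one table per category. It is an illustration only and not CPT's own reader: the function names parse_tags and read_category_blocks are hypothetical, the sample is a trimmed copy of the example above (two indices, two dates), and the tag spelling cpt:ncats follows the example file. It assumes that later blocks reuse the cpt:nrow and cpt:missing values given on the first tag-line, as the example suggests.

```python
def parse_tags(line):
    """Return the cpt: tags on one line as a dict, e.g. {'C': '1', 'nrow': '2'}."""
    tags = {}
    for part in line.split(","):
        part = part.strip()
        if part.startswith("cpt:") and "=" in part:
            key, value = part[len("cpt:"):].split("=", 1)
            tags[key.strip()] = value.strip()
    return tags


def read_category_blocks(text):
    """Return (ncats, {category: {date: {index_name: value or None}}})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    tags = {}      # tags seen so far; later lines override earlier ones
    blocks = {}
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        if line.startswith("cpt:"):
            line_tags = parse_tags(line)
            tags.update(line_tags)
            if "C" in line_tags:  # a tag-line that opens a category block
                names = lines[i + 1].split()
                nrow = int(tags["nrow"])
                missing = float(tags["missing"]) if "missing" in tags else None
                block = {}
                for row in lines[i + 2:i + 2 + nrow]:
                    date, *cells = row.split()
                    values = [float(c) for c in cells]
                    block[date] = {
                        n: (None if v == missing else v)
                        for n, v in zip(names, values)
                    }
                blocks[int(line_tags["C"])] = block
                i += 2 + nrow
                continue
        i += 1
    return int(tags["ncats"]), blocks


sample = """xmlns:cpt=http://iri.columbia.edu/CPT/v10/
cpt:ncats=3
cpt:field=prcp, cpt:C=1, cpt:clim_prob=0.333333333333, cpt:nrow=2, cpt:ncol=2, cpt:row=T, cpt:col=index, cpt:units=%, cpt:missing=-9999
                A     B
2000-01/03  -9999 -9999
2002-01/03     35    25
cpt:C=2, cpt:clim_prob=0.333333333334
                A     B
2000-01/03  -9999 -9999
2002-01/03     40    35
cpt:C=3, cpt:clim_prob=0.333333333333
                A     B
2000-01/03  -9999 -9999
2002-01/03     25    40
"""

ncats, blocks = read_category_blocks(sample)
print(ncats, sorted(blocks))
print(blocks[3]["2002-01/03"])
```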
