The ARM standards require datastreams to store QC checks in auxiliary fields that are parallel to data fields and cover the same dimensions. For each value data[x0][x1][...][xn] we therefore have a companion value qc_data[x0][x1][...][xn]. These QC values are stored as integers, with each bit representing a particular state or condition. Depending on the nature of the test, failure (represented by setting that bit to 1) may indicate that the data is bad and should not be used, or it may simply indicate an unusual or noteworthy condition (ARM Standards Committee 2015).
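The bit-packed layout described above can be sketched in a few lines of Python. This is only an illustration: the test names and bit positions in `InputQC` are hypothetical (the actual meaning of each bit is defined per datastream), and `set_bits` is a helper introduced here, not part of any ARM library.

```python
from enum import IntFlag


class InputQC(IntFlag):
    """Hypothetical QC tests; real bit meanings are defined per datastream."""
    MISSING = 1 << 0      # bit 1
    BELOW_MIN = 1 << 1    # bit 2
    ABOVE_MAX = 1 << 2    # bit 3


def set_bits(qc_value):
    """Return the 1-based numbers of the bits set in an integer QC value."""
    return [bit + 1 for bit in range(qc_value.bit_length())
            if (qc_value >> bit) & 1]


# data[t][z] and its companion qc_data[t][z] share the same shape.
data = [[10.2, -9999.0], [55.1, 12.3]]
qc_data = [[0, int(InputQC.MISSING | InputQC.BELOW_MIN)],
           [int(InputQC.ABOVE_MAX), 0]]

for row, qc_row in zip(data, qc_data):
    for value, qc in zip(row, qc_row):
        # A QC value of 0 means no test failed for that data value.
        print(value, set_bits(qc))
```

Because each test owns one bit, several failures can be recorded in a single integer, and a value of 0 means that no test failed.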
Output QC
The output of the ADI transformation process is meant to be a new ARM-standard datastream. Therefore, the transformations will also generate parallel output QC fields to describe the various states and conditions that occur during the transformation itself. When a transformation fails, it is important to document both the occurrence and the reason why it failed: for example, because all the inputs were missing or because a required input had a value outside its valid range. Storing this information allows us to generate statistics or compute correlations on when such conditions occur, and can be an important part of analyzing and improving a given datastream.
It is especially important to flag non-standard “indeterminate” conditions because they do not necessarily mean the data is flawed, just that the transformation occurred in special circumstances. An example of this would be when some but not all of the input values in an average were flagged as bad; we can still calculate a meaningful average value, but we are not using all the points we were expecting to.
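The averaging example above might look like the following minimal sketch. The bit positions are hypothetical, using `None` as the failed output value is an assumption, and so is the choice to set QC_BAD together with QC_ALL_BAD_INPUTS when every input is bad; the function `average_bin` is illustrative and is not the ADI library's API.

```python
# Hypothetical bit positions for illustration only.
QC_BAD = 1 << 0
QC_SOME_BAD_INPUTS = 1 << 1
QC_ALL_BAD_INPUTS = 1 << 2


def average_bin(values, input_is_bad):
    """Average the good inputs of one output bin and report an output QC value."""
    good = [v for v, bad in zip(values, input_is_bad) if not bad]
    if not good:
        # Nothing usable: the transformation fails for this output point.
        return None, QC_BAD | QC_ALL_BAD_INPUTS
    # Some inputs were dropped: the average is still meaningful, but it
    # uses fewer points than expected, so mark it as indeterminate.
    qc = QC_SOME_BAD_INPUTS if len(good) < len(values) else 0
    return sum(good) / len(good), qc


value, qc = average_bin([1.0, 100.0, 3.0], [False, True, False])
print(value, qc)
```

Note that the output QC records only that some input was excluded, not which one or why.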
Because the transformation library is designed to be consistent across all applications, the possible QC states that come out of a transformation are fixed, and all transformed data will have only these QC bits set. In this way the transformation process necessarily “washes out” any detailed QC information provided by the input datastream. We can no longer tell exactly which input point was bad, nor can we tell which test the data might have failed. All we can do is set the appropriate QC flags that declare that some (or all) of the points used to generate a given output data point were bad.
Under the Serial 1D method, the output data and QC fields generated by transforming the first dimension will be used as input to the transformation of the second dimension, and so on until all the dimensions have been transformed. Each intermediate QC field is therefore used to filter data for the next dimension’s transformation. The final output QC fields will hold the QC states generated while transforming just the final dimension. For example, in our 2D case where data_0[t][z] is dimensioned by time and height, after transforming the time coordinate we will have the new arrays data_1[t’][z] and qc_data_1[t’][z]. When transforming z, we use the QC values given by qc_data_1[t’][z] to filter bad values of data_1[t’][z], in exactly the way we used the original input QC fields to filter bad data while transforming t. Figure 1 illustrates the same process in the general case.
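The two-pass flow for the 2D time-height case can be sketched as follows. This is a simplified illustration, not the ADI implementation: it uses plain block averaging for both dimensions, hypothetical bit positions, NaN as the failed output value, and helper names (`average_1d`, `transform_axis0`, `serial_1d`) invented for this example.

```python
# Hypothetical bit positions for illustration only.
QC_BAD = 1 << 0
QC_SOME_BAD_INPUTS = 1 << 1


def average_1d(values, qc, block):
    """Average consecutive blocks, skipping inputs whose QC has QC_BAD set."""
    out, out_qc = [], []
    for start in range(0, len(values), block):
        chunk = list(zip(values[start:start + block], qc[start:start + block]))
        good = [v for v, q in chunk if not q & QC_BAD]
        if not good:
            out.append(float("nan"))
            out_qc.append(QC_BAD)
        else:
            out.append(sum(good) / len(good))
            out_qc.append(QC_SOME_BAD_INPUTS if len(good) < len(chunk) else 0)
    return out, out_qc


def transpose(rows):
    return [list(col) for col in zip(*rows)]


def transform_axis0(data, qc, block):
    """Apply average_1d along the first axis of a 2D list."""
    cols = [average_1d(d, q, block)
            for d, q in zip(transpose(data), transpose(qc))]
    return transpose([c[0] for c in cols]), transpose([c[1] for c in cols])


def serial_1d(data0, qc0, t_block, z_block):
    # Pass 1: transform time, giving data1[t'][z] and qc1[t'][z].
    data1, qc1 = transform_axis0(data0, qc0, t_block)
    # Pass 2: transform height, filtering with the intermediate qc1.
    data2_zt, qc2_zt = transform_axis0(transpose(data1), transpose(qc1), z_block)
    return transpose(data2_zt), transpose(qc2_zt)


data0 = [[1.0, 10.0], [3.0, 20.0], [5.0, 30.0], [7.0, 40.0]]  # [t][z]
qc0 = [[QC_BAD, 0], [QC_BAD, 0], [0, 0], [0, 0]]
data2, qc2 = serial_1d(data0, qc0, t_block=2, z_block=2)
print(data2, qc2)
```

In this example the first time block at the lowest height has no good inputs, so the intermediate point is marked bad; the height pass then excludes it, and the only trace left in the final output is QC_SOME_BAD_INPUTS on the first output point.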
This means that some intermediate QC information has been lost by the time we have transformed all dimensions. We cannot determine the value of qc_data_1[t’][z] at the end, because we do not save or store anything on this intermediate [t’][z] grid. But because qc_data_1[t’][z] has been used as input to a later transformation, its impact is propagated through to the final output. Many of the QC flags described in Table 1 reflect some qualitative QC information about the intermediate transformations. For example, QC_SOME_BAD_INPUTS upon output implies that the penultimate transformation generated some bad data, and provides a starting point for further investigation if desired.
Table 1. QC states by transform method
QC state | Average | Interpolate | Subsample (nearest neighbor) | Assessment |
QC_BAD | X | X | X | Bad |
QC_INDETERMINATE | X | X | X | Indeterminate |
QC_INTERPOLATE | | X | | Indeterminate |
QC_EXTRAPOLATE | | X | | Indeterminate |
QC_NOT_USING_CLOSEST | | | X | Indeterminate |
QC_SOME_BAD_INPUTS | X | | | Indeterminate |
QC_ZERO_WEIGHT | X | | | Indeterminate |
QC_OUTSIDE_RANGE | X | X | X | Bad |
QC_ALL_BAD_INPUTS | X | X | X | Bad |
QC_ESTIMATED_INPUT_BIN | X | X | X | Indeterminate |
QC_ESTIMATED_OUTPUT_BIN | X | X | X | Indeterminate |
Table 1 lists the possible QC states generated during transformation, and the initial three transformation methods to which each state applies.
About half of the quality states are general in that they apply to all transformation methods. These include a flag denoting that the transformation was unsuccessful (QC_BAD) and a flag denoting that the transformation included one or more input values with an indeterminate assessment (QC_INDETERMINATE).
The QC_ESTIMATED_INPUT_BIN and QC_ESTIMATED_OUTPUT_BIN states indicate whether the transformation parameters “width” and “alignment” were set externally by the user or whether default values were calculated. Details of the parameters that can be set externally will be discussed in a later section.
The QC_OUTSIDE_RANGE state is the only QC state assigned a bad assessment other than the test documenting whether all inputs were bad (QC_ALL_BAD_INPUTS) and the test noting that the transformation failed (QC_BAD). When averaging data, it is set if none of the input bins overlaps any part of the output bin, or if an input dimension’s values are more limited than the output’s (e.g., if the input height dimension goes up to only 20 km but the output maximum height is 60 km, then all output values above 20 km will have this flag set). For subsample and interpolation transformations where we use two input points to calculate every output point, if one of our inputs has been flagged, we scan up or down the input grid until we find the nearest good point in that direction that is still within our defined range transform parameter. If no good point is found within the range, QC_OUTSIDE_RANGE is set.
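The scan for the nearest good point can be sketched as below. The function name, its arguments, and the bit position are all hypothetical, and the sketch assumes that the coordinate values are monotonic and that "within range" means the absolute coordinate distance from the target does not exceed the range parameter.

```python
QC_OUTSIDE_RANGE = 1 << 0  # hypothetical bit position


def scan_for_good_point(coords, is_bad, start, step, target, max_range):
    """Scan from index `start` in direction `step` (+1 or -1).

    Returns (index, qc): the index of the first good point that lies
    within `max_range` of `target`, or (None, QC_OUTSIDE_RANGE) if the
    scan leaves the grid or the range before finding one.
    """
    i = start
    while 0 <= i < len(coords) and abs(coords[i] - target) <= max_range:
        if not is_bad[i]:
            return i, 0
        i += step
    return None, QC_OUTSIDE_RANGE


coords = [0.0, 1.0, 2.0, 3.0, 4.0]
is_bad = [False, True, True, False, False]

# Scanning down from index 2 toward a good point for target 2.5:
print(scan_for_good_point(coords, is_bad, 2, -1, 2.5, max_range=2.0))
print(scan_for_good_point(coords, is_bad, 2, -1, 2.5, max_range=3.0))
```

With a range of 2.0 the only good point below the target (at coordinate 0.0) is too far away, so the flag is set; widening the range to 3.0 lets the scan reach it.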
Quality control states unique to the averaging method document whether some, but not all, of the inputs in the averaging window were flagged as bad and thus excluded from the transform (QC_SOME_BAD_INPUTS), and whether all the inputs to be averaged for this output bin had zero weight (QC_ZERO_WEIGHT). For nearest neighbor, if the nearest good point is not the nearest absolute point (i.e., the nearest point was flagged as bad), we flag that “indeterminate” status in the QC field (QC_NOT_USING_CLOSEST). For linear interpolation, if no good point is found in one direction, we scan in the other direction until we find a good point to use; in that case, the transform actually becomes an extrapolation (which is mathematically identical to an interpolation; the only difference is that the two points we use lie on the same side of the target index instead of bracketing it). If we do not use the two closest bracketing points to interpolate, we set a QC flag to indicate that a non-standard interpolation took place (QC_INTERPOLATE). We also set a flag to indicate if one of the bracketing points had been flagged as indeterminate (QC_INDETERMINATE).
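A rough sketch of the interpolation logic follows. Several things here are assumptions made for illustration: the bit positions, the use of QC_EXTRAPOLATE for the same-side case (inferred from its name in Table 1), the use of `None` with QC_BAD when fewer than two good points exist, and the omission of the range limit. The function names are hypothetical.

```python
# Hypothetical bit positions for illustration only.
QC_BAD = 1 << 0
QC_INTERPOLATE = 1 << 1
QC_EXTRAPOLATE = 1 << 2


def linear_estimate(x, x0, y0, x1, y1):
    """Same formula whether x lies between x0 and x1 or outside them."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def interpolate_at(x, xs, ys, is_bad):
    """Estimate y at x from the sorted grid xs, skipping flagged points."""
    below = [i for i in range(len(xs)) if xs[i] <= x]
    above = [i for i in range(len(xs)) if xs[i] > x]
    good_below = [i for i in below if not is_bad[i]]
    good_above = [i for i in above if not is_bad[i]]

    if good_below and good_above:
        i0, i1 = good_below[-1], good_above[0]
        # Standard case only if these are the two closest bracketing points.
        closest = (i0 == below[-1]) and (i1 == above[0])
        qc = 0 if closest else QC_INTERPOLATE
    elif len(good_below) >= 2:
        i0, i1 = good_below[-2], good_below[-1]   # both points below x
        qc = QC_EXTRAPOLATE
    elif len(good_above) >= 2:
        i0, i1 = good_above[0], good_above[1]     # both points above x
        qc = QC_EXTRAPOLATE
    else:
        return None, QC_BAD
    return linear_estimate(x, xs[i0], ys[i0], xs[i1], ys[i1]), qc


xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 10.0, 20.0, 30.0]
print(interpolate_at(1.5, xs, ys, [False, False, False, False]))
print(interpolate_at(1.5, xs, ys, [False, False, True, False]))
print(interpolate_at(1.5, xs, ys, [False, False, True, True]))
```

Because the sample data is exactly linear, all three calls return the same value; only the QC flag distinguishes the standard interpolation from the non-standard and extrapolated cases.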