NOTES FROM RFCs
If you have
notes from your RFC that you would like included in the next
issue of HAS NOTES, please e-mail them as a WordPerfect attachment
or ASCII text file to jay.breidenbach@noaa.gov
NOTES FROM HRL
There have been
several important changes to Stage II and Stage III which should
help with the underestimation problem that most sites have been
experiencing. They include:
NEW STAGE III
MOSAICKING OPTION
Under the mosaic
button of Stage III, the user now has the option of using the maximum
value in areas where two or more radar umbrellas overlap. The default
will still use the mean value in overlapping areas; however, in several
cases examined by HRL, we were able to significantly improve the
Stage III estimate by choosing the maximum value method. When using
the new method, you should be extremely careful that bright band
contamination is not present as the maximum value method will cause
bright band contamination to be even more severe in the final Stage
III estimate. Try it and let us know what you think.
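To make the difference between the two mosaicking options concrete, here is a minimal sketch that combines overlapping radar estimates cell by cell. The grid layout, the use of None for "no coverage," and the function name mosaic are assumptions made for illustration only; this is not the Stage III code.

```python
def mosaic(fields, method="mean"):
    """Combine per-radar rainfall grids (lists of rows) cell by cell.

    None marks a cell outside a radar's umbrella (an assumed convention).
    """
    rows, cols = len(fields[0]), len(fields[0][0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            values = [f[i][j] for f in fields if f[i][j] is not None]
            if not values:
                continue  # no radar covers this cell
            if method == "max":
                out[i][j] = max(values)  # maximum value option
            else:
                out[i][j] = sum(values) / len(values)  # default: mean value
    return out


# Two hypothetical one-row grids that overlap in the middle cell.
radar_a = [[2.0, 4.0, None]]
radar_b = [[None, 10.0, 6.0]]
print(mosaic([radar_a, radar_b], "mean"))
print(mosaic([radar_a, radar_b], "max"))
```

Only the overlap cell differs between the two calls, which is also why a bright band seen by either radar carries straight through to the mosaic under the maximum value method.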
MANUAL EDITING
OF STAGE II SINGLE SITE BIAS
Under the Single
Site Display button of Stage III, the user now has the option to
manually edit the Stage II mean field bias. When Stage II is re-run
the new bias value will be applied to the entire radar field for
that individual site. The new bias adjustment algorithm described
below should provide better bias estimates; however, you may still
find cases where you have more confidence in your manually estimated
bias than you do in the bias automatically determined by Stage II.
If you can confidently estimate how far off the Stage I estimate
is for a given radar, you can simply go in and specify your own
bias.
NEW BIAS ADJUSTMENT
ALGORITHM
The new bias
adjustment algorithm for Stage II is now available from the anonymous
ftp site at HRL: you should have received an email from Paul announcing
this by now. Other than the list of adaptable parameters provided
below, no documentation is available yet. The full technical description
of the algorithm will be provided in early July as a part of the
OH's Final Report to the NEXRAD Program and OSF, which is due June
30, 1997.
How it works
Conceptually,
the algorithm is extremely simple (no matter how complicated the
technical description may appear in the aforementioned soon-to-be-available
Final Report). In a nutshell, the bias at the current hour is given
by the ratio of the sum of all positive gage rainfall data over
the radar umbrella (this is the spatial window of sampling) from
the previous x number of hours (this is the temporal window of sampling)
to the sum of all positive HDP rainfall data at the same gage locations
over the same spatio-temporal window of sampling.
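The ratio just described can be sketched in a few lines. The function name sample_bias and the (gage, radar) tuple layout are assumptions for illustration, and requiring both values of a pair to be positive is our reading of "positive" data at the same gage locations; the operational code may differ.

```python
def sample_bias(pairs):
    """Ratio of summed gage rainfall to summed radar (HDP) rainfall.

    pairs: iterable of (gage_mm, radar_mm) tuples collected over the radar
    umbrella and the chosen time window. Only pairs in which both values
    are positive are used (an assumption).
    """
    gage_sum = 0.0
    radar_sum = 0.0
    for gage, radar in pairs:
        if gage > 0.0 and radar > 0.0:
            gage_sum += gage
            radar_sum += radar
    if radar_sum == 0.0:
        return None  # no usable pairs, so no bias can be computed
    return gage_sum / radar_sum


# Hypothetical pairs: the last two are dropped because one value is zero.
pairs = [(5.0, 4.0), (10.0, 6.0), (0.0, 1.0), (3.0, 0.0)]
print(sample_bias(pairs))
```

A result above 1 means the radar is underestimating relative to the gages over the window.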
In actuality,
the temporal window is not as clear-cut as described above (one
side of the window is actually kind of fuzzy) in order to accommodate
recursive estimation (let's just say that this results in tremendous
savings in CPU and RAM).
Sizing the temporal
window
The size of
the temporal window, x, is specified by the adaptable parameter
'mem_span' (in hours). If you set 'mem_span' to 1, you are using
radar-gage pairs only from the current hour (i.e., you are calculating
the sample bias). If you set it to 720, you are using radar-gage pairs
from the most recent 30 days (i.e., you are calculating the monthly
bias). If you set 'mem_span' to 24*365, you are calculating the
yearly bias. If you set it to an extremely large number, you are
calculating the climatological bias. The optimal choice for 'mem_span'
depends largely on the gage network density under the radar umbrella.
Qualitatively speaking, the denser the network is, the smaller 'mem_span'
should be to capture the temporal variability of the bias. Very
often, however, one is much better off gaining reliability in the
bias estimate by setting 'mem_span' to a larger value, thereby collecting
more data (some initial guidelines are given below) than being defeated
by sampling errors due to lack of data while futilely trying to
capture the temporal variability of the bias. In other words, there
is a trade-off between reliability and timeliness (they are mutually
exclusive in bias estimation).
The initially
recommended value for 'mem_span' is 100 or larger for most sites.
For those enviable gage-rich sites in OHRFC and ABRFC, 'mem_span'
may be set to 50 or even less. We strongly recommend that you tune
'mem_span' until you reach the setting to your liking. It is important
to reiterate, however, that 'mem_span' is not some dimensionless
free parameter that must be estimated through some fancy optimization
process, but has a physical meaning which allows you to make an
educated guess at what you may be getting in your bias. For example,
suppose a WSR-88D umbrella has 50 gages. Suppose that it has been
raining for 10 hours over 20 percent of the radar umbrella. If your
'mem_span' is set to 10, then at the end of this event what you
are getting is essentially the 'storm' bias based on roughly 50*10*0.2=100
radar-gage pairs. This bias is then used as the initial bias for
the next storm.
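The back-of-the-envelope count in the example above can be written as a small helper. The function name pairs_in_window is hypothetical, and the calculation treats the temporal window as a hard cutoff and the gages as evenly scattered, both of which are simplifications.

```python
def pairs_in_window(n_gages, rain_hours, rain_fraction, mem_span):
    """Rough number of radar-gage pairs feeding the bias estimate.

    Assumes gages are scattered evenly, so the fraction of gages seeing rain
    equals the fraction of the umbrella where it is raining, and treats
    mem_span (hours) as a hard cutoff rather than a fuzzy window.
    """
    hours_used = min(rain_hours, mem_span)
    return n_gages * hours_used * rain_fraction


# The example from the text: 50 gages, 10 hours of rain over 20 percent
# of the umbrella, mem_span set to 10.
print(pairs_in_window(50, 10, 0.2, mem_span=10))
```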
Sizing the spatial
window
The size of
the spatial window is specified by adaptable parameters 'min_gage_rad_dist'
and 'max_gage_rad_dist' (both in kilometers). For example, if you
set 'min_gage_rad_dist' and 'max_gage_rad_dist' to 0 and 230, respectively,
you are collecting radar-gage pairs for bias calculation from the
entire radar umbrella. Initially, we do not recommend any changes
to these parameters.
Because of range-dependent
biases in HDP products, it is expected that bias-adjusted HDP estimates
based on 'min_gage_rad_dist' and 'max_gage_rad_dist' settings of
0 and 230, respectively, may be overestimates over mid-ranges. If
this becomes a problem, one may set 'max_gage_rad_dist' to a smaller
value (say, 200 km) to ignore radar rainfall data suffering from
far-range degradation. The downside of this is of course that you
will be losing one-fourth of the data (yes, the thin annulus between
the range rings 200 and 230 km actually amounts to almost a quarter
of the whole radar umbrella).
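For anyone who doubts the annulus figure, the area arithmetic is easy to check. The helper name annulus_fraction is ours, and the calculation assumes a circular umbrella with gages spread evenly over it.

```python
def annulus_fraction(inner_km, outer_km):
    """Fraction of a circle of radius outer_km that lies beyond inner_km."""
    return (outer_km ** 2 - inner_km ** 2) / outer_km ** 2


# Share of a 230 km umbrella lost by cutting off at 200 km.
print(round(annulus_fraction(200.0, 230.0), 3))  # 0.244
```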
A few more comments
Off-line studies
using long-term data (to be available in the Final Report) indicate
that the algorithm does a better job at large-scale mass-balancing.
In other words, if you sum up a year's worth of gage rainfall amounts
collected over the radar umbrella and corresponding bias-adjusted
radar rainfall amounts, they will be approximately the same (say,
within + or - 10 percent). This implies that bias-adjusted radar
rainfall amounts are equally likely to be overestimates as they
are underestimates.
Another consequence
of mass-balancing bias adjustment is that small-scale accuracy of
bias-adjusted radar rainfall may be worse than that of raw HDP rainfall.
In other words, if you spot a small area of intense rainfall, you
may find that raw HDP estimates are more accurate than the bias-adjusted
estimates. This catch-22 problem (in that you cannot always reduce
mean error and root mean square error at the same time) is due to
rainfall amount-dependent errors in HDP products stemming largely
from range degradation and inaccurate Z-R parameters. At any rate,
beware that bias-adjusted radar rainfall may be overestimates in
cores of heavy rainfall!
As usual, if
you have any questions or problems, just give any one of us a holler
(Jay, Paul, or myself): the sooner we learn what the problems are,
the more quickly we can get to them.
Adaptable
Parameters - Bias Adjustment
Parameter Default Range (Followed by description and notes)
rng_min 0 (0,230)
Denotes 'minimum
range.' It specifies the minimum range (i.e., distance from the
radar, in km) for pairing rain gage data with collocated radar rainfall
data.
If one opts
to ignore radar rainfall estimates at close ranges, say, below 50
km (e.g., to calculate mean field biases that are not subject to
the close-range bias in WSR-88D precipitation products), he/she
may set rng_min to 50., and interpret the calculated bias
accordingly: it is then an estimate of the ratio of the sum of gage
rainfall amounts over the area identified as raining by both gages
and the radar beyond the range of 50 km to the sum of bias-adjusted
radar rainfall amounts over the same area.
rng_max 230. (rng_min, 230)
Denotes 'maximum range.' It specifies the maximum
range (in km) for pairing rain gage data with collocated radar rainfall
data.
If one opts to ignore radar rainfall estimates at
far ranges, say, beyond 180 km (e.g., to calculate mean field biases
that are not subject to the far-range bias in WSR-88D precipitation
products), he/she may set rng_max to 180., and interpret
the calculated bias accordingly (see rng_min).
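As an illustration of how rng_min and rng_max screen gages, here is a minimal sketch. The function name in_range, the flat x/y coordinates in km, and the inclusive comparisons are all assumptions; the actual code may treat the end points differently.

```python
import math


def in_range(gage_xy_km, radar_xy_km, rng_min=0.0, rng_max=230.0):
    """True if the gage lies between rng_min and rng_max km from the radar.

    Coordinates are hypothetical flat x/y positions in km; inclusive bounds
    are an assumption.
    """
    dx = gage_xy_km[0] - radar_xy_km[0]
    dy = gage_xy_km[1] - radar_xy_km[1]
    distance = math.hypot(dx, dy)
    return rng_min <= distance <= rng_max


radar = (0.0, 0.0)
print(in_range((30.0, 40.0), radar, rng_min=60.0))   # 50 km out, inside cutoff
print(in_range((0.0, 200.0), radar, rng_max=180.0))  # 200 km out, beyond cutoff
```

Both calls return False, so neither gage would contribute a pair under those settings.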
nmin 7 (2,infinity)
Denotes 'minimum
number.' It specifies the minimum number of valid radar-gage pairs
to initiate bias calculation, i.e., there has to be at least nmin
positive gage data and collocated positive radar rainfall data to
attempt bias calculation.
This implies
that, loosely speaking, if there are, e.g., 35 hourly gages randomly
scattered under the radar umbrella, it must rain over 20 percent
of the radar umbrella to attempt bias calculation (i.e., 35x0.2=7).
If the gage network is so sparse that bias calculation will be a
rarity, one may experiment with a smaller nmin (say, 4 to
5). Be aware, however, that the algorithm can then no longer be expected to
behave rationally every time. For example, a sample bias estimate
(i.e., the ratio of the sum of positive gage rainfall amounts to
the sum of collocated positive radar rainfall amounts) based on
only 4 data points may be wrongly interpreted by the algorithm as
very reliable.
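The same loose reasoning can be turned around to ask how widespread the rain must be before nmin is met in a single hour. The helper min_rain_coverage is hypothetical and assumes evenly scattered gages.

```python
def min_rain_coverage(nmin, n_gages):
    """Smallest fraction of the umbrella that must see rain in one hour
    for nmin gages to report rain, assuming evenly scattered gages.
    Capped at 1.0 (the whole umbrella)."""
    return min(1.0, nmin / n_gages)


print(min_rain_coverage(7, 35))  # the 35-gage example from the text
print(min_rain_coverage(7, 10))  # a sparse network with the default nmin
print(min_rain_coverage(4, 10))  # the same network with nmin lowered to 4
```

With only 10 gages, the default nmin of 7 would require rain over most of the umbrella, which is the situation in which lowering nmin becomes tempting.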
eps 10^-6
Denotes 'machine
epsilon.' The default value is much larger than the actual machine
epsilon, i.e., the smallest floating point number recognized by
the computer on which the algorithm is run. The purpose is to guard
against zero-divide associated with numerically singular matrices.
There should be no need to change this.
std_cut 2.5 (0,1)
Denotes 'standard
deviation cutoff.' It is a quality control parameter (dimensionless)
in pairing positive rain gage data with collocated positive radar
rainfall data. It specifies the confidence interval (in units of
standard deviation of the standardized error of radar rainfall data)
around the linear regression through matched radar-gage pairs.
One can make
a rough guess at what percentage of radar-gage pairs might get thrown
out, by looking up the cumulative probability distribution table for
the standard normal variate: at levels of std_cut=0.5, 1.,
1.5, 2., and 2.5, one is throwing away, loosely speaking, 31, 16,
7, 2, and 1 percent of the data. Hence, the larger std_cut,
the more willing one is to accept radar rainfall data in bias calculation
even though they may deviate significantly from the collocated rain
gage data.
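The percentages quoted above correspond to the upper tail of the standard normal distribution, which can be reproduced without a lookup table. The function name upper_tail_percent is ours, and the sketch only checks the table values; it says nothing about how the algorithm itself applies std_cut around the regression.

```python
import math


def upper_tail_percent(z):
    """Percent of a standard normal distribution lying above z."""
    return 100.0 * 0.5 * math.erfc(z / math.sqrt(2.0))


for std_cut in (0.5, 1.0, 1.5, 2.0, 2.5):
    print(std_cut, round(upper_tail_percent(std_cut), 1))
```

Rounded to whole percents, the printed values agree with the 31, 16, 7, 2, and 1 percent given in the text.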
The quality
control step associated with std_cut is intended to catch
apparent gross outliers, such as egregious gage data or AP-contaminated
radar data, while allowing differences between radar and gage data
due to natural variability of rainfall and to 'usual' errors associated
with radar observation of rainfall. Setting std_cut to a
smaller value (e.g., 1.5) will produce biases that are less variable,
but will almost certainly deteriorate mass balancing between bias-adjusted
radar rainfall data and rain gage data.
alpha 100. (eps,infinity)
Denotes the memory span in the updating of state
variables in the algorithm. Loosely speaking, it specifies the length
of the time window (in hours) in blocking out radar-gage pairs to
be used in bias calculation. Loosely speaking again, if alpha
is set to 10000./1000./100./10./1., the calculated bias is based
on radar-gage pairs collected from the most recent 10,000/1,000/100/10/1
hours (13.5 months/1.3 months/4 days/10 hours/1 hour). If alpha
is less than 1., bias is calculated using essentially radar-gage
pairs from the current hour only.
In actuality, the algorithm employs a time window
which is not as clear-cut as the one described above, but which
fades out as the age (i.e., the time of bias calculation minus the observation
time) of the radar-gage pair increases. The purpose of this fuzzy
time window is to accommodate recursive estimation, in which case
there is no need to actually store hundreds or thousands of radar-gage
pairs: all the algorithm needs is several state variables from the
most recent bias calculation.
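To give a feel for how a fading time window allows recursive updating, here is a minimal sketch. It is not the Stage II algorithm: the exponential weight exp(-1/alpha), the class name FadingBias, and the choice of two running sums as the state variables are all assumptions made for illustration, since the actual weighting has not been documented yet.

```python
import math


class FadingBias:
    """Running bias from two exponentially fading sums (an assumed scheme)."""

    def __init__(self, alpha):
        # Assumed weight applied to older data each hour; close to 1 for a
        # large alpha, close to 0 for alpha well below 1.
        self.decay = math.exp(-1.0 / alpha)
        self.gage_sum = 0.0   # state variable: faded sum of gage rainfall
        self.radar_sum = 0.0  # state variable: faded sum of radar rainfall

    def update(self, gage_total, radar_total):
        """Fold in one hour's summed gage and radar rainfall; return bias."""
        self.gage_sum = self.decay * self.gage_sum + gage_total
        self.radar_sum = self.decay * self.radar_sum + radar_total
        if self.radar_sum <= 0.0:
            return None
        return self.gage_sum / self.radar_sum


long_memory = FadingBias(alpha=100.0)
long_memory.update(10.0, 10.0)
print(long_memory.update(20.0, 10.0))   # blends both hours

short_memory = FadingBias(alpha=0.1)
short_memory.update(10.0, 10.0)
print(short_memory.update(20.0, 10.0))  # essentially the current hour only
```

Only the two sums are carried from hour to hour, so no individual radar-gage pairs need to be stored.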
One is strongly urged to play with this parameter
to best tune the algorithm to the local gage network density (and,
to a lesser extent, to the local rainfall climatology). A large
alpha reduces timeliness of the bias estimate, but increases
the chance of getting a bias calculated as well as stability in
the calculated bias (i.e., no very big or small biases).
z_cut 0.01 (0,infinity)
Denotes 'observed
rainfall cutoff.' It specifies the smallest gage or radar rainfall
amount to be included in the bias calculation (in mm).
If, for whatever
reason, one is interested only in estimating bias for heavy rainfall
situations, he/she may increase the z_cut setting to throw
out small rainfall amounts: the consequence is that 1) it will greatly
deteriorate radar umbrella-wide mass balancing between bias-adjusted
radar rainfall and rain gage rainfall and 2) bias calculation will
occur far less frequently.
cv_cut 0.33 (0,1)
Denotes 'coefficient of variation cutoff.' It is a quality control
parameter (dimensionless) to check if the calculated bias is
acceptable.
To measure how
certain/uncertain the calculated bias is, the algorithm calculates
the coefficient of variation of the bias calculated (error standard
deviation of the calculated bias/calculated bias itself). Too large
a coefficient of variation is an indication that, even though a
bias is calculated, it is not trustworthy.
If, for example,
the calculated bias and error standard deviation are 1.5 and 0.2,
respectively, it means that there is approximately a 68 and 95 percent
chance that the true (unknown) bias lies within 1.5±0.2 and
1.5±0.4, respectively (this interpretation is an extremely
loose and liberal one because bias is not distributed normally,
but lognormally). Hence, a larger/smaller value of cv_cut
implies that one is more/less willing to accept an uncertain bias
estimate.
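The acceptance test implied by cv_cut can be sketched as follows. The function name bias_is_acceptable and the use of "less than or equal to" at the cutoff are assumptions for illustration.

```python
def bias_is_acceptable(bias, error_std, cv_cut=0.33):
    """Accept the bias if its coefficient of variation does not exceed cv_cut.

    The coefficient of variation is the error standard deviation of the
    calculated bias divided by the calculated bias itself.
    """
    cv = error_std / bias
    return cv <= cv_cut


print(bias_is_acceptable(1.5, 0.2))  # cv is about 0.13
print(bias_is_acceptable(1.5, 0.6))  # cv is 0.4
```

With the default cv_cut of 0.33, the bias in the example above (1.5 with an error standard deviation of 0.2) would be accepted, while the same bias with an error standard deviation of 0.6 would not.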
aver_cut 0.5 (0,infinity)
Denotes 'average
radar rainfall cutoff.' It is a quality control parameter (in mm)
to avoid bias calculation in light rainfall situations.
Because of nonlinear
errors in radar rainfall data (due, e.g., to inaccurate Z-R relationships,
range degradation, etc.), bias is rainfall magnitude-dependent:
if there are lots of radar-gage pairs so that one could calculate
biases over various ranges of rainfall amounts, the biases will
not be the same, but vary greatly depending on the range of rainfall
amount (typically, biases are larger/smaller for small/large rainfall
amounts).
If there is
only light rainfall in the radar umbrella, it is very likely that
the sample bias (sum of gage rainfall/sum of collocated radar rainfall)
will be very high simply because the denominator is very small.
To prevent this from adversely affecting the bias calculation, mean
radar rainfall, conditional on occurrence of rainfall, is calculated
and compared with aver_cut before bias calculation is initiated.
If aver_cut is greater, it is interpreted that it is raining
too lightly for the sample bias to mean much (i.e., the signal-to-noise
ratio is too small), and no bias calculation is attempted.
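The light-rain check described above might look like the sketch below. The function name rain_is_heavy_enough is hypothetical, and treating any radar value above zero as "raining" is an assumption.

```python
def rain_is_heavy_enough(radar_values_mm, aver_cut=0.5):
    """Compare the conditional mean radar rainfall with aver_cut.

    The mean is taken only over values where the radar reports rain
    (here, anything above zero, which is an assumption).
    """
    raining = [r for r in radar_values_mm if r > 0.0]
    if not raining:
        return False  # no rain at all, nothing to calculate
    conditional_mean = sum(raining) / len(raining)
    # If aver_cut is greater than the mean, no bias calculation is attempted.
    return conditional_mean >= aver_cut


print(rain_is_heavy_enough([0.0, 0.2, 0.4]))  # conditional mean 0.3 mm
print(rain_is_heavy_enough([0.0, 1.0, 2.0]))  # conditional mean 1.5 mm
```

The first case is skipped and the second goes ahead, using the default aver_cut of 0.5 mm.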
Applying the
above observation, one can actually foresee how the calculated bias
might behave over the course of the life cycle of a storm (assuming,
for the sake of argument, that the storm remains stationary from
birth to death): the cycle of birth-development-maturing-dissipation-death
would roughly translate into the bias cycle of high-medium/small-small-small/medium-high.
Apparently unrealistically high biases are most likely an indication
that it is raining very lightly (and hence not much rain water to
hydrologically worry about).