OBSERVATIONAL AND UPPER-AIR DATA USED IN ALGORITHM DEVELOPMENT
CREATION OF A STATISTICAL DEVELOPMENT SAMPLE
RADAR AND ENVIRONMENTAL INDICES AS PREDICTORS OF SEVERE WEATHER
REGIONAL APPLICABILITY OF THE ALGORITHMS
GRAPHICAL PRESENTATION OF THE ALGORITHM OUTPUT
SKILL OF THE ALGORITHMS IN TERMS OF CATEGORICAL FORECASTS
DIFFERENCES BETWEEN SCAN ALGORITHMS AND WSR-88D POH AND POSH PRODUCTS
IMPLICATIONS FOR OPERATIONAL USE
The System for Convection Analysis and Nowcasting (SCAN), a program for
operational implementation of new or updated automatic radar interpretation
techniques, has incorporated several algorithms within the Advanced Weather
Interactive Processing System (AWIPS). The algorithms described below utilize
data from existing radar products and from numerical weather prediction models
to generate probabilities that individual thunderstorms will produce severe
weather. The SCAN Severe Weather Detection algorithm (SSWD) produces the
probability of general severe weather (strong winds and/or large hail). The
SCAN Severe Hail Detection algorithm (SSHD) produces the probability of large
hail (2 cm or greater in diameter). The probabilities are valid within a square region
44 km on a side, centered on a convective storm cell, for 30 minutes after the
radar observation. Storm cell locations are taken from the Storm Track Index
(STI) product.
These products are intended primarily to provide guidance to forecasters on
which storms warrant closer examination for other severe weather signatures,
and to monitor storm development over all of the forecast office's area of
responsibility when the forecasters might be concentrating on a small subregion.
Forecasters have long known that critical values of many radar indices for
severe storms change with the storm environment. In particular, shallow
storms are much more likely to produce severe weather in the spring than in
the summer. The new SCAN algorithms automatically incorporate information on
upper-air temperature and wind vectors as well as radar data. In AWIPS these
algorithms were implemented by using radar data from the vertically-integrated
liquid (VIL) and composite reflectivity graphic products, and upper-air
information from one or more numerical weather prediction models. Comparative
verification tests indicate that the new radar-environmental severe weather
algorithms can produce 10-15% fewer false alarms than does the operational
WSR-88D Severe Weather Potential (SWP) algorithm, which incorporates only VIL
data.
This note contains a brief summary of the development methodology behind
these algorithms and their operational performance characteristics.
The methods used to develop SSWD and SSHD were fully documented by
Kitzmiller and Breidenbach (1993, 1995). The approach was similar to that
employed in the development of the currently-operational SWP algorithm
(Kitzmiller et al. 1995, hereafter referred to as KMS95). In brief, a large
sample of radar observations of convective storms and coincident storm
environment observations was collected and then collated with nearby severe
local storm reports. Equations relating severe storm occurrence (the predictand) to radar and environmental severe weather indices (the predictors) were
then developed.
Because severe weather climatology varies significantly across the United
States, two sets of equations were developed, one for the Central Plains
states, another for the mid-Atlantic states. At the time of the development
effort, these were the only regions with adequate archives of radar data.
Radar data
The radar data for the Central Plains development sample were taken from
Radar Data Processor Version II (RADAP II) archives collected at Amarillo,
Texas (AMA), Wichita, Kansas (ICT), and Oklahoma City, Oklahoma (OKC), between
1985 and 1991. Typically, new volumetric scans were available every 10 or
12 minutes. The RADAP II archive has been described in detail by McDonald and
Saffle (1994) and by KMS95. The Plains sample contains data on over 6000 individual thunderstorms.
Radar data for the Northeast development sample were from the RADAP II unit
at Binghamton, New York, (BGM) between 1988 and 1992, and from the WSR-88D
unit at Sterling, Virginia, (KLWX) during 1992 and 1993. We obtained only VIL
graphic images from WSR-88D; these were manually interpreted for cell locations, peak VIL values, and VIL horizontal coverage. This data sample
contains data on nearly 700 storms.
As noted in KMS95, it appears that there were no systematic biases between
VIL as estimated by the WSR-57 and WSR-88D networks, though there are likely
to be random calibration differences among radars within either network. The
use of data from multiple radars was intended to mitigate the effects of such
calibration differences.
Environmental data
Upper-air conditions were derived from analyses and 6-h and 12-h forecasts
of the Nested Grid Model (NGM) (Hoke et al. 1989) archived by the Techniques
Development Laboratory. These data were chosen in preference to radiosonde
observations because they are readily available in gridded form and represent
a reasonable approximation of atmospheric conditions between rawinsonde
observation times.
To objectively assign environmental conditions to individual storm cells,
the environment was assumed to be constant over each 230-km radius radar
umbrella, with values corresponding to those at the center. The NGM data were
available from Techniques Development Laboratory archives at 6-h intervals:
initial-time analyses at 0000 and 1200 UTC, and 6-h forecasts at 0600 and
1800 UTC. For radar data within one hour of an analysis or forecast valid
time, the values were taken at that time. For data outside these 2-h windows,
the conditions most favorable for strong convection (higher instability,
humidity, wind speeds) at the bracketing valid times were used. Temperature,
humidity, stability, and wind data were derived from 0000 or 1200 UTC analyses
or 6-h forecasts; vertical velocity and divergence predictors were derived
from 6- and 12-h forecasts (the initial fields are quasi-nondivergent).
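The time-matching rule described above can be sketched as follows. This is a minimal illustration, not the original processing code: the function names, the dictionary layout, and the convention that every variable is stored so that a larger value is more favorable for strong convection are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def bracketing_valid_times(obs_time):
    """Return the 6-hourly valid times (00, 06, 12, 18 UTC) around obs_time."""
    base = obs_time.replace(minute=0, second=0, microsecond=0)
    earlier = base.replace(hour=(base.hour // 6) * 6)
    return earlier, earlier + timedelta(hours=6)


def environment_for_scan(obs_time, fields):
    """Choose environmental values for one radar observation time.

    fields maps a valid time to {variable name: value}. Every variable is
    assumed to be expressed so that larger means more favorable for strong
    convection (hypothetical convention for this sketch).
    """
    earlier, later = bracketing_valid_times(obs_time)
    one_hour = timedelta(hours=1)
    # Within one hour of a valid time: use the values at that time.
    if obs_time - earlier <= one_hour:
        return dict(fields[earlier])
    if later - obs_time <= one_hour:
        return dict(fields[later])
    # Outside the 2-h windows: take the more favorable of the two values.
    return {name: max(fields[earlier][name], fields[later][name])
            for name in fields[earlier]}
```

A radar scan at 0300 UTC, for example, falls outside both windows and receives, variable by variable, the more favorable of the 0000 and 0600 UTC values.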
Severe local storm reports
Reports of strong convective wind gusts (those causing damage or measured in
excess of 50 kt), large hail (2 cm or greater in diameter), and tornadoes are
logged by the National Severe Storms Forecast Center (NSSFC). We collated
these reports with storm cell data by mapping the reports to the VIL analysis
grid. For each cell, the number of all severe reports, the number of large
hail reports, and the largest reported hail diameter were noted.
Only storm cells with at least two grid boxes with a VIL of 10 kg m-2 or
more were considered for inclusion. Any two cells in the final dataset were
separated from each other by at least 28 km, or by 45 minutes in time. Where
spatial or temporal overlaps were found, only the larger cell (in terms of
maximum VIL) was included. The final development dataset contains storm cell
characteristics including maximum VIL, the number of map grid boxes with VIL in
excess of 10 and 20 kg m-2, and the operational SWP value.
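The cell selection criteria above can be expressed compactly. The sketch below assumes a simple greedy pass that visits cells in order of decreasing maximum VIL; the text does not say how overlaps were resolved procedurally, so that ordering, the function name, and the dictionary keys are hypothetical.

```python
import math

MIN_BOXES = 2              # grid boxes with VIL of 10 kg m-2 or more
MIN_SEPARATION_KM = 28.0
MIN_SEPARATION_MIN = 45.0


def select_cells(cells):
    """Filter candidate storm cells.

    Each cell is a dict with hypothetical keys x_km, y_km (position),
    t_min (observation time in minutes), max_vil, and boxes_ge10.
    """
    eligible = [c for c in cells if c["boxes_ge10"] >= MIN_BOXES]
    kept = []
    # Strongest cells first, so an overlap always retains the larger maximum VIL.
    for cell in sorted(eligible, key=lambda c: c["max_vil"], reverse=True):
        overlaps = any(
            math.hypot(cell["x_km"] - k["x_km"], cell["y_km"] - k["y_km"]) < MIN_SEPARATION_KM
            and abs(cell["t_min"] - k["t_min"]) < MIN_SEPARATION_MIN
            for k in kept
        )
        if not overlaps:
            kept.append(cell)
    return kept
```

Two cells count as overlapping only when they are closer than 28 km and also closer than 45 minutes, which matches the requirement that retained cells be separated in space or in time.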
The severe storm report log was then examined to locate any severe weather
or hail events reported near the storms, and the numbers of associated severe
storm and hail reports were recorded. A storm was considered to be severe (or
a large hail producer) if at least one report (or one large hail report)
occurred from 10 minutes before to 30 minutes after the nominal radar observation time. This convention should account for both events in progress and
those about to develop.
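The labeling convention can be illustrated with a short function. The time window (10 minutes before to 30 minutes after the radar observation) is taken from the text. The spatial test is an assumption: the sketch borrows the 44-km square stated as the valid region for the probabilities, and the record layout and names are hypothetical.

```python
HALF_WIDTH_KM = 22.0        # half of the 44-km box side (assumed matching region)
WINDOW_BEFORE_MIN = 10.0
WINDOW_AFTER_MIN = 30.0


def label_cell(cell, reports):
    """Return (is_severe, is_large_hail, n_reports, n_hail_reports) for a cell.

    cell: dict with x_km, y_km, t_min.
    reports: dicts with x_km, y_km, t_min, and kind, where kind is
    "wind", "hail", or "tornado" (hypothetical layout).
    """
    matched = [
        r for r in reports
        if abs(r["x_km"] - cell["x_km"]) <= HALF_WIDTH_KM
        and abs(r["y_km"] - cell["y_km"]) <= HALF_WIDTH_KM
        and -WINDOW_BEFORE_MIN <= r["t_min"] - cell["t_min"] <= WINDOW_AFTER_MIN
    ]
    n_hail = sum(1 for r in matched if r["kind"] == "hail")
    return len(matched) > 0, n_hail > 0, len(matched), n_hail
```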
These collection procedures yielded a Plains data sample of 6068 cells, of
which 8% were severe and 5.5% featured large hail. The Northeast sample
consisted of 668 cells, of which 20% were severe and 6% featured large hail.
Thus most severe events in the Plains involved large hail, while over the
Northeast, damaging wind events were predominant. These features of severe
weather climatology might be due partly to physics and partly to land development patterns. Many reported wind events in the mid-Atlantic region are
associated with falling trees and damage to structures. The scarcity of wind
events over the Plains could be due in part to a relatively low density of
construction even near population centers and to sparse forestation.
Two-predictor histograms clearly illustrate that independent information on
severe weather potential is available from both radar and environmental data,
as noted earlier by Breidenbach et al. (1993). The histograms in Figs. 1-3
show the percentage of storm cells that had associated severe weather, as a
function of a radar-derived storm characteristic and an environmental severe
weather index. The predictor combinations shown were the optimum ones in
terms of the statistical correlation between the particular predictand and the
predictors.
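A two-predictor histogram of this kind is simply the percentage of severe cells within each joint bin of the two predictors. The following sketch shows one way to tabulate it; the function name, the bin-edge arguments, and the input layout are assumptions, and any bin edges passed in are the caller's choice rather than those used in the figures.

```python
from collections import defaultdict


def severe_fraction_table(cases, x_edges, y_edges):
    """Percentage of severe cells in each (x bin, y bin).

    cases: iterable of (x_value, y_value, is_severe).
    x_edges, y_edges: ascending bin edges; bin i covers [edges[i], edges[i+1]).
    Returns {(i, j): (percent_severe, n_cells)} for non-empty bins only.
    """
    def bin_index(value, edges):
        for i in range(len(edges) - 1):
            if edges[i] <= value < edges[i + 1]:
                return i
        return None  # value lies outside the table

    counts = defaultdict(lambda: [0, 0])  # (i, j) -> [n_severe, n_total]
    for x, y, severe in cases:
        i, j = bin_index(x, x_edges), bin_index(y, y_edges)
        if i is None or j is None:
            continue
        counts[(i, j)][0] += int(bool(severe))
        counts[(i, j)][1] += 1
    return {key: (100.0 * s / n, n) for key, (s, n) in counts.items()}
```

Returning the cell count alongside each percentage makes it easy to see which bins rest on too few cases to be meaningful.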
General severe storm probability (winds and/or hail) for the Plains region
is shown as a function of cell-maximum VIL (kg m-2) and freezing level height
(m MSL) in Fig. 1. The dominant severe weather event in this region is large
hail; thus the optimum predictor
combination reflects both the intensity of the individual storm (through VIL)
and the atmospheric mean temperature and temperature lapse rate (through
freezing level height). Severe weather is most likely (> 60% probability) for
the highest VIL values and the lowest freezing level values.
General severe storm probability for the Mid-Atlantic region is shown as a
function of cell horizontal area (number of 4-km grid boxes with
VIL of at least 20 kg m-2) and 700-mb wind speed (m s-1) in Fig. 2. In this
region, where wind events predominate among severe weather reports, it is
possible that the optimum predictor
combination indicates storms and general conditions likely to result in
momentum transfer from middle levels of the troposphere to the ground. Larger
storms (and those with higher VIL) are the ones with the most intense updrafts
and downdrafts, while the environmental wind speed is a direct measure of
horizontal momentum above the surface. Widespread convective wind damage is
often associated with higher wind speeds aloft.
For large hail potential alone (Fig. 3), the dependence on both VIL and
freezing level height is readily apparent. While this sample is from the
Plains region, a similar relationship was evident in the Mid-Atlantic sample.
Applying forward-selection linear screening regression procedures to the data sample yielded the following algebraic relationships between the available predictors and event probability:
SSWD = -16.37 + 2.33 SVG20 + 1.02 WSPD700 + 0.646 MAXVIL    (1)
for the Mid-Atlantic region and:
SSWD = -16.49 + 0.025 MAXVIL^2 - 0.00206 (MAXVIL x FRZLVL)
+ 0.365 U-WIND500 + 0.341 (SFC TOTAL TOTALS)    (2)
for the Plains. Here, MAXVIL is the cell-maximum VIL in kg m-2, SVG20 is the
number of 4-km grid boxes with VIL of at least 20 kg m-2, WSPD700 is the
700-mb wind speed in m s-1, FRZLVL is the freezing level height in dm MSL,
and U-WIND500 is the west-east 500-mb wind component in m s-1. The Total
Totals index predictor is in °C. The expression in (1) explains 18.6% of the
predictand variance, and that in (2)
explains 25.7%. In (2), most of the predictand variance is explained by the
MAXVIL^2 and MAXVIL x FRZLVL terms. The forward-selection procedure itself
selected the 700-mb windspeed predictor in (1), indicating the importance of
strong mid-tropospheric winds in causing strong convective wind events over
the eastern United States.
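Equations (1) and (2) translate directly into code. In the sketch below the coefficients and units are those given in the text; the function and argument names are hypothetical, the result is assumed to be a probability in percent, and the clipping of the regression estimate to the 0-100 range is an assumption of this example rather than something the text specifies.

```python
def _clip_percent(value):
    """Limit a regression estimate to 0-100% (assumed, not stated in the text)."""
    return max(0.0, min(100.0, value))


def sswd_mid_atlantic(svg20, wspd700, maxvil):
    """Equation (1): general severe weather probability, Mid-Atlantic.

    svg20: number of 4-km grid boxes with VIL of at least 20 kg m-2
    wspd700: 700-mb wind speed (m s-1)
    maxvil: cell-maximum VIL (kg m-2)
    """
    return _clip_percent(-16.37 + 2.33 * svg20 + 1.02 * wspd700 + 0.646 * maxvil)


def sswd_plains(maxvil, frzlvl_dm, u_wind500, sfc_total_totals):
    """Equation (2): general severe weather probability, Plains.

    frzlvl_dm: freezing level height (dm MSL)
    u_wind500: west-east 500-mb wind component (m s-1)
    sfc_total_totals: Total Totals index (deg C)
    """
    return _clip_percent(
        -16.49
        + 0.025 * maxvil ** 2
        - 0.00206 * maxvil * frzlvl_dm
        + 0.365 * u_wind500
        + 0.341 * sfc_total_totals
    )
```

With illustrative inputs of MAXVIL = 60 kg m-2, FRZLVL = 300 dm, U-WIND500 = 20 m s-1, and Total Totals = 50, equation (2) evaluates to about 61.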
The following equations were derived for probability of 2-cm hail:
SSHD = 14.22 + 0.03 MAXVIL^2 - 0.0031 (MAXVIL x FRZLVL)    (3)
for the Mid-Atlantic and:
SSHD = -375.43 + 0.019 MAXVIL^2 - 0.00619 (MAXVIL x FRZLVL)
+ 2.057 MAXVIL + 0.066 THICK1000-500    (4)
for the Plains. Here, THICK1000-500 is the 1000-500 mb thickness in m. The
expression in (3) explains 22.4% of the predictand variance, and that in (4)
explains 24.9%. The similarity between (2) and (4) reflects the dominance of
hail events within the Plains sample of severe weather reports.
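The large-hail equations (3) and (4) can be coded in the same way. The coefficients are those given above; the function and argument names are hypothetical, FRZLVL is assumed to use the same units (dm MSL) as in equation (2), the output is assumed to be in percent, and limiting it to 0-100 is an assumption of this sketch.

```python
def sshd_mid_atlantic(maxvil, frzlvl_dm):
    """Equation (3): large-hail probability, Mid-Atlantic.

    maxvil: cell-maximum VIL (kg m-2); frzlvl_dm: freezing level (dm MSL).
    """
    estimate = 14.22 + 0.03 * maxvil ** 2 - 0.0031 * maxvil * frzlvl_dm
    return max(0.0, min(100.0, estimate))  # clipping is assumed


def sshd_plains(maxvil, frzlvl_dm, thick_1000_500_m):
    """Equation (4): large-hail probability, Plains.

    thick_1000_500_m: 1000-500 mb thickness (m).
    """
    estimate = (
        -375.43
        + 0.019 * maxvil ** 2
        - 0.00619 * maxvil * frzlvl_dm
        + 2.057 * maxvil
        + 0.066 * thick_1000_500_m
    )
    return max(0.0, min(100.0, estimate))  # clipping is assumed
```

For illustrative inputs of MAXVIL = 60 kg m-2, FRZLVL = 300 dm, and a thickness of 5600 m, equation (3) gives about 66 and equation (4) about 75.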
We have often noted that during the summer months the SSHD is significantly
lower than the SSWD, as would be expected in very warm, humid environments.
The SSHD and SSWD values are generally closer to each other in the spring,
when storm environments are rather cool.
Our experience in developing the large-hail probability algorithm suggests
that either the Plains or Mid-Atlantic equations would serve in most of the
conterminous United States.
The choice of predictors in a general severe weather probability equation
depends on the dominant modes of severe weather (hail or wind), which in turn
depend on regional storm climatology, surface characteristics, forestation,
and extent of land development. As noted, the prime environmental predictors
for the Plains equation (2) appear to be associated with hail potential, while
the equation for the Mid-Atlantic (1) reflects mainly the potential for wind
events. An examination of severe local storm reports between 1973 and 1994
indicated that wind events predominate east of the 85th meridian, with hail
events predominating farther west. We therefore have constructed the algorithm so that the 'Mid-Atlantic' equations are used for all radar sites east
of that meridian, and the 'Plains' equations are used at sites to the west of
it.
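The regional switch amounts to a single longitude test. The sketch below is illustrative only: the function name and return labels are hypothetical, longitudes are taken as degrees east (negative in the western hemisphere), and the treatment of a site lying exactly on the meridian is an assumption because the text does not address it.

```python
REGION_BOUNDARY_LON = -85.0  # 85th meridian west, in degrees east


def equation_region(radar_lon_deg_east):
    """Return which equation set applies at a radar site.

    Sites east of the boundary use the Mid-Atlantic equations; all others,
    including a site exactly on the boundary (assumed), use the Plains set.
    """
    if radar_lon_deg_east > REGION_BOUNDARY_LON:
        return "mid_atlantic"
    return "plains"
```

A site near 77.5 W would therefore use equations (1) and (3), and a site near 97.5 W would use equations (2) and (4).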
In the southeastern and south-central U.S., it is possible that many wind
events during the late spring and summer are driven more by local instability
effects (e.g. wet microbursts) than by vertical momentum transfer. In that
case, the prime environmental forcing mechanism would be low-level temperature
lapse rate or theta-e lapse rate, rather than 700-mb wind speed. We plan to
investigate this possibility following the collection of an adequate sample of
radar observations.
Probability values are listed along with other information in a text
popup box in the AWIPS Thunderstorm Product display. This box is brought up
by locating a cursor over a cell identified in the SCAN Thunderstorm Product
display (Smith et al. 1998). Information on storm location, velocity, VIL,
reflectivity, and lightning activity then appears (Fig. 4). The
SSWD value for the cell appears under "SWP" and the SSHD value under "HAIL."
To obtain the box for any identified storm cell in the Thunderstorm Product
display:
1) Select Storm Cells/Site Storm Threat from the local radar menu;
2) Position the cursor over the "Thunderstorm Popup" text identifier at the
lower right part of the screen, and middle click. This makes the product
"editable";
3) Position the cursor within a cell circle and right-click to bring up the
box for that cell.
Though SSWD and SSHD provide probabilistic guidance, their performance is
most easily evaluated by examining categorical (severe/nonsevere) forecasts
based on the probabilities. Categorical forecasts are generally derived by
setting some fixed threshold probability value, and forecasting all storm
cells with probabilities at or above the threshold to be severe. All other
cells are assumed to be nonsevere. This verification exercise is most useful
when a range of possible thresholds, from low to fairly high, is examined.
The performance of these forecasts may be described by four commonly used measures: the probability of detection (POD), false alarm ratio (FAR), bias, and critical success index (CSI) (Donaldson et al. 1975; Schaefer 1990). Let x be the number of severe events correctly forecasted to be severe, w be the number of nonsevere events correctly forecasted, z the number of nonsevere events incorrectly forecasted to be severe, and y the number of severe events incorrectly forecasted to be nonsevere. Then the following definitions apply:
POD = x / (x + y) (5)
FAR = z / (x + z) (6)
CSI = x / (x + y + z) (7)
BIAS = (x + z) / (x + y) (8)
The POD, FAR, and bias tend to decrease as the severe/nonsevere probability
threshold is raised. For rare events such as severe local storms, the CSI
reaches a peak value near thresholds that yield neither too low a POD nor too
high an FAR.
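Definitions (5)-(8) and the thresholding step can be sketched as follows. The symbols x, y, z, and w follow the definitions above; the function names, the percent units for the probabilities, and the choice to return None when a denominator is zero are assumptions of this example.

```python
def contingency_counts(probabilities, observed, threshold):
    """Count (x, y, z, w) for categorical forecasts at one threshold.

    A cell is forecast severe when its probability is at or above threshold.
    x: severe, forecast severe       y: severe, forecast nonsevere
    z: nonsevere, forecast severe    w: nonsevere, forecast nonsevere
    """
    x = y = z = w = 0
    for prob, severe in zip(probabilities, observed):
        forecast_yes = prob >= threshold
        if severe and forecast_yes:
            x += 1
        elif severe:
            y += 1
        elif forecast_yes:
            z += 1
        else:
            w += 1
    return x, y, z, w


def scores(x, y, z):
    """POD, FAR, CSI, and bias from equations (5)-(8)."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "POD": ratio(x, x + y),
        "FAR": ratio(z, x + z),
        "CSI": ratio(x, x + y + z),
        "BIAS": ratio(x + z, x + y),
    }
```

Calling `contingency_counts` over a range of thresholds and passing the first three counts to `scores` yields the kind of score-versus-threshold curves used for verification. As an arithmetic illustration with invented counts, x = 30, y = 10, and z = 50 give POD = 0.75, FAR = 0.625, CSI = 0.33, and bias = 2.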
The performance of the severe weather and large hail algorithms in terms of
these scores is shown in Figs. 5-8. The chart for the Mid-Atlantic SSWD
algorithm (Fig. 5) can be interpreted as follows: If the
yes/no threshold is set at 20%, then over many cases about 75% of the severe
cells will be detected (POD = 0.75); 62% of the "yes" forecasts will be false
alarms (FAR = 0.62); there will be about twice as many "yes" forecasts as
there are severe cells (Bias = 2). The CSI at the 20% threshold is 0.32.
Similar interpretations can be made for the Plains version of SSWD (Fig. 6),
and for the regional versions of SSHD (Fig. 7 and Fig. 8).
Note that these scores are based on the dependent data sample. We expect
that skill will be lower within a sample of new cases; that is, for any
given probability threshold, the POD will be lower
and the FAR higher in the new sample than in the dependent one. However,
experience has shown that the scores shown here are reasonable estimates of
the values achievable for independent cases.
In order for SSWD and SSHD to be generated in real time, AWIPS must ingest
the following products from the WSR-88D Radar Product Generator (RPG):
VIL (product 57)
STI (Storm Track Index, product 58)
These must be included in the Routine Product Set (RPS) list.
The system also ingests numerical model input data for the upper-air
conditions. These data could be from the Eta or the RUC models. When model
data are unavailable, the algorithms revert to VIL-only estimates of event
probability, and a warning message appears in the popup box.
The POH and POSH (Witt et al. 1998) are generated within the RPG. The POH
is the probability that a storm cell is producing hail of any size at any
level; the POSH is the probability of large hail at the surface. The POSH
algorithm is similar to that in SCAN, in that both rely on measures of upper-level reflectivity within the storm and the height of the freezing level.
However, environmental data for POSH must be entered manually through the
radar Unit Control Position, while SSWD and SSHD obtain environmental data
automatically from within the AWIPS database. To date, no large-scale
comparison has been made between POSH and SSHD.
These techniques do not possess high absolute accuracy in identifying
severe storms; that is, a high probability of detection is associated with a
high false alarm ratio. Thus the algorithms are intended primarily to alert
forecasters to sudden or unexpected severe storm development. Other considerations, such as three-dimensional storm structure, storm motion, and real-time
spotter reports must be used to decide which storms actually warrant warnings,
and where the warnings should be valid. At the same time, forecasters can be
confident that storms with very high SSWD values (70% or more) are very likely
to be severe. Meanwhile, the vast majority of storms are assigned very low
probabilities (< 5%), and these are very unlikely to be severe within the next
30 minutes.
For specification of large hail in severe weather warnings, absolute skill
is again rather low. Forecasters can be confident that storms with SSHD in
excess of 50% will generally produce hail shortly, and may wish to specifically mention hail as a threat in statements to the public.
We have used NGM data as a robust source of upper-air information in the
development of these equations. It should be noted that the algorithms
include only upper-air predictors that have a fairly broad spatial structure
function, and thus do not change quickly with time.
We are indebted to Robert Saffle and Wayne McGovern, both formerly employed
in the Techniques Development Laboratory, for their expertise and their
support of this work. Melvina McDonald developed the RADAP II data archive
used here. Manual reduction of the WSR-88D VIL graphic images was carried out
expertly by Bryon Lawrence and Mary Scarzello.
Breidenbach, J. P., D. H. Kitzmiller, and R. E. Saffle, 1993: Joint relationships between severe local storm occurrence and radar-derived and environmental variables. Preprints 13th Conference on Weather Analysis and
Forecasting, Vienna, Virginia, Amer. Meteor. Soc., 588-591.
Donaldson, R. J. Jr., R. M. Dyer, and J. J. Kraus, 1975: An objective evaluator of techniques for predicting severe weather events. Preprints
Ninth Conference on Severe Local Storms, Norman, Amer. Meteor. Soc.,
321-326.
Hoke, J. E., N. A. Phillips, G. J. DiMego, J. J. Tuccillo, and J. G. Sela,
1989: The regional analysis and forecast system of the National Meteorological Center. Wea. Forecasting, 4, 323-334.
Kitzmiller, D. H., W. E. McGovern, and R. E. Saffle, 1995: The WSR-88D Severe
Weather Potential Algorithm. Wea. Forecasting, 10, 141-159.
_____, and J. P. Breidenbach, 1993: Probabilistic nowcasts of large hail
based on volumetric reflectivity and storm environment characteristics.
Preprints 26th International Conference on Radar Meteorology, Norman,
Amer. Meteor. Soc., 157-159.
_____, and _____, 1995: Detection of severe local storm phenomena by automated interpretation of radar and storm environment data. NOAA Technical
Memorandum NWS TDL 82, National Weather Service, NOAA, U.S. Department of
Commerce, 33 pp. [Available from Techniques Development Laboratory,
W/OSD2, National Weather Service, 1325 East West Highway, Silver Spring,
Md.]
McDonald, M., and R. E. Saffle, 1994: Revised RADAP II archive data user's
guide. TDL Office Note 94-2, National Weather Service, NOAA, U.S.
Department of Commerce, 18 pp. [Available from Techniques Development
Laboratory, W/OSD2, National Weather Service, 1325 East West Highway,
Silver Spring, Md.]
Schaefer, J. T., 1990: The critical success index as an indicator of warning
skill. Wea. Forecasting, 5, 570-575.
Witt, A., M. D. Eilts, G. J. Stumpf, J. T. Johnson, E. D. Mitchell, and K. W.
Thomas, 1998: An enhanced hail detection algorithm for the WSR-88D.
Wea. Forecasting, 13, 286-303.