Comparative Verification of GFS MOS Guidance, Eta MOS Guidance, and Guidance Generated from the WRF

To assist in evaluating the impact of replacing the 12-km Eta model used in the NWS North American Mesoscale (NAM) system with the 12-km Weather Research and Forecasting (WRF) version of the Non-Hydrostatic Mesoscale Model (NMM), the Meteorological Development Laboratory (MDL) verified Model Output Statistics (MOS) guidance generated from the Global Forecast System (GFS) and the NAM system for the period of March 1, 2006, through May 31, 2006.  A third system was verified alongside the GFS- and Eta-based MOS packages.  The guidance for this third system was generated by applying Eta-based MOS equations to the output from the WRF-NMM.  This approach is not a proper application of the MOS technique, but it indicates what happens to the MOS guidance if WRF-NMM forecast variables are used in lieu of the appropriate Eta variables.  In the verifications that follow, we use the nomenclature “WRF-MOS” to denote this third approach, though the reader must be aware that an actual MOS system was not developed from WRF output.
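
The idea behind the third system can be illustrated with a minimal sketch.  MOS equations are multiple linear regression equations, so "WRF-MOS" amounts to evaluating an equation developed on Eta output with WRF-NMM values substituted for the Eta predictors.  The predictor names, coefficients, and forecast values below are made up for illustration and are not taken from any operational equation.

```python
def apply_mos_equation(intercept, coefficients, predictors):
    """Evaluate one linear regression equation: intercept + sum(coef * predictor)."""
    missing = set(coefficients) - set(predictors)
    if missing:
        raise KeyError(f"predictors missing from model output: {sorted(missing)}")
    return intercept + sum(coefficients[name] * predictors[name] for name in coefficients)


# Hypothetical equation, nominally developed on Eta output (all values made up).
ETA_INTERCEPT = 2.0
ETA_COEFFICIENTS = {"t2m": 0.9, "rh_850": -0.05}

# Hypothetical model output for the same station and projection.
eta_predictors = {"t2m": 60.0, "rh_850": 40.0}
wrf_predictors = {"t2m": 62.0, "rh_850": 50.0}

# Proper use: Eta equation applied to Eta predictors.
eta_mos = apply_mos_equation(ETA_INTERCEPT, ETA_COEFFICIENTS, eta_predictors)
# "WRF-MOS": the same Eta equation applied to WRF-NMM predictors.
wrf_mos = apply_mos_equation(ETA_INTERCEPT, ETA_COEFFICIENTS, wrf_predictors)
```

Because the coefficients were fit to the Eta model's error characteristics, any systematic difference between the two models' predictors passes directly into the "WRF-MOS" value.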


For the verifications, we used a sample of 335 stations in the contiguous United States (CONUS), Alaska, Hawaii, and Puerto Rico.  In the verifications that follow, we present values for all stations combined, for the 300 sites in the CONUS, for the 30 sites in Alaska, and for the 5 sites in Hawaii and Puerto Rico combined.  All verifications for a specific forecast projection are based on a matched sample of GFS MOS, Eta MOS, and WRF-MOS guidance.  Only guidance for the 0000 UTC cycle was verified, and only forecast projections out to 84 hours after 0000 UTC were included in the study.  For purposes of this evaluation, we verified the guidance for maximum/minimum temperature (max/min); 6- and 12-h probability of precipitation (PoP); and temperature, dewpoint, total sky cover, wind direction, wind speed, ceiling height, and visibility valid at 3-h intervals from 6 through 84 hours after 0000 UTC.
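
A matched sample can be sketched as the intersection of the cases available from every system.  The sketch below assumes each system's forecasts are stored in a dictionary keyed by (station, date, projection) and also requires a verifying observation for each case; the data layout, station identifiers, and values are hypothetical.

```python
def matched_sample(forecasts_by_system, observations):
    """Return the sorted case keys present in every system and in the observations."""
    common = set(observations)
    for cases in forecasts_by_system.values():
        common &= set(cases)
    return sorted(common)


# Hypothetical cases keyed by (station, date, projection in hours).
gfs = {("STN1", "2006-03-01", 24): 55, ("STN1", "2006-03-01", 48): 57,
       ("STN2", "2006-03-01", 24): 30}
eta = {("STN1", "2006-03-01", 24): 54, ("STN2", "2006-03-01", 24): 31}
wrf = {("STN1", "2006-03-01", 24): 56, ("STN1", "2006-03-01", 48): 58,
       ("STN2", "2006-03-01", 24): 29}
obs = {("STN1", "2006-03-01", 24): 55, ("STN1", "2006-03-01", 48): 59}

cases = matched_sample({"GFS": gfs, "Eta": eta, "WRF": wrf}, obs)
```

Only the 24-h STN1 case survives here: the 48-h case lacks an Eta forecast and the STN2 case lacks an observation.  Scoring every system on the same cases keeps the comparison from being skewed by differing data availability.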


For verification of max/min temperature, temperature, and dewpoint, we calculated the mean algebraic error (or bias) and the mean absolute error.  For the PoPs, we calculated the mean square error of the probabilities, termed the Brier score.  For wind speed, we calculated the mean absolute error as well as the Heidke skill score.  For wind direction, we calculated the mean absolute error as well as the percentage of all forecasts with a direction error of ≤ 30 degrees, conditional on the observed wind speed being 10 knots or greater.  For sky cover, we used the Heidke skill score as a verification measure.  For ceiling height and visibility, we calculated the Heidke skill score for five categories of ceiling height and visibility, as well as the threat score (or critical success index) for Limited Instrument Flight Rules (LIFR) conditions.  The table below indicates the categories used for the Heidke skill score and threat score computations.  Note that LIFR conditions for ceiling height combine the lowest two categories, that is, a ceiling height of 400 feet or less constitutes LIFR.  For visibility, the lowest two categories, that is, a prevailing visibility of less than 1 mile, constitute LIFR conditions.
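
The continuous and probabilistic measures named above can be sketched as follows.  This is an illustration only, not the code used in the study; the function names and sample values are hypothetical, and the sketch assumes the 10-knot condition applies to both wind direction measures.

```python
def mean_algebraic_error(forecasts, observations):
    """Bias: mean of (forecast - observation)."""
    return sum(f - o for f, o in zip(forecasts, observations)) / len(forecasts)


def mean_absolute_error(forecasts, observations):
    """Mean of |forecast - observation|."""
    return sum(abs(f - o) for f, o in zip(forecasts, observations)) / len(forecasts)


def brier_score(probabilities, outcomes):
    """Mean square error of probabilities (0 to 1) against outcomes (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(probabilities)


def direction_error(fcst_deg, obs_deg):
    """Smallest angle between two directions, in degrees (0 to 180)."""
    diff = abs(fcst_deg - obs_deg) % 360
    return min(diff, 360 - diff)


def wind_direction_scores(fcst_dirs, obs_dirs, obs_speeds, min_speed=10.0, tolerance=30.0):
    """Return (MAE, percent of errors <= tolerance) for cases with observed speed >= min_speed."""
    errors = [direction_error(f, o)
              for f, o, s in zip(fcst_dirs, obs_dirs, obs_speeds) if s >= min_speed]
    if not errors:
        return None, None
    mae = sum(errors) / len(errors)
    pct = 100.0 * sum(e <= tolerance for e in errors) / len(errors)
    return mae, pct


# Hypothetical sample values.
bias = mean_algebraic_error([70, 65, 58], [68, 66, 58])
mae = mean_absolute_error([70, 65, 58], [68, 66, 58])
bs = brier_score([0.9, 0.2, 0.5], [1, 0, 1])
# The third wind case is excluded because the observed speed is below 10 knots.
dir_mae, dir_pct = wind_direction_scores([350, 90, 180], [10, 200, 170], [12, 15, 5])
```

Note that the direction error wraps around north, so a forecast of 350 degrees against an observation of 10 degrees counts as a 20-degree error rather than 340.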


Wind Speed (kts)    Sky Cover    Ceiling Height (ft)    Visibility (mi)

0 – 12                           < 200                  < 1/2
13 – 17                          200 – 400              1/2 – < 1
18 – 22                          500 – 900              1 – < 3
23 – 27                          1000 – 3000            3 – 5
28 – 32                          > 3000                 > 5
> 32
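
The categorical scores can be sketched with the ceiling height categories in the table.  The sketch uses the standard multi-category form of the Heidke skill score (proportion correct relative to the proportion expected by chance from the marginal frequencies) and the usual threat score (hits divided by hits plus misses plus false alarms).  It assumes ceiling heights are reported in hundreds of feet, so that "200 – 400" covers everything below 500 feet; the function names and sample heights are hypothetical.

```python
from collections import Counter


def ceiling_category(height_ft):
    """Map a ceiling height in feet to categories 0 (lowest) through 4 (highest)."""
    if height_ft < 200:
        return 0
    if height_ft < 500:      # table lists 200-400; assumes reporting in hundreds of feet
        return 1
    if height_ft < 1000:
        return 2
    if height_ft <= 3000:
        return 3
    return 4


def heidke_skill_score(fcst_cats, obs_cats):
    """(proportion correct - chance) / (1 - chance) for multi-category forecasts."""
    n = len(fcst_cats)
    correct = sum(f == o for f, o in zip(fcst_cats, obs_cats)) / n
    f_counts, o_counts = Counter(fcst_cats), Counter(obs_cats)
    chance = sum(f_counts[k] * o_counts[k] for k in f_counts) / n ** 2
    if chance == 1.0:        # every forecast and observation in one category
        return float("nan")
    return (correct - chance) / (1.0 - chance)


def threat_score(fcst_cats, obs_cats, is_event=lambda c: c <= 1):
    """hits / (hits + misses + false alarms); the default event is LIFR (lowest two categories)."""
    pairs = list(zip(fcst_cats, obs_cats))
    hits = sum(is_event(f) and is_event(o) for f, o in pairs)
    misses = sum((not is_event(f)) and is_event(o) for f, o in pairs)
    false_alarms = sum(is_event(f) and (not is_event(o)) for f, o in pairs)
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")


# Hypothetical forecast and observed ceiling heights (ft).
fcst = [ceiling_category(h) for h in [100, 300, 800, 2000, 5000, 300]]
obs = [ceiling_category(h) for h in [100, 600, 800, 2500, 5000, 400]]

hss = heidke_skill_score(fcst, obs)   # 23/29 for this sample
ts = threat_score(fcst, obs)          # 2/3 for this sample
```

The same two functions apply to visibility, wind speed, and sky cover once those elements are mapped to their own category indices.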









To interpret the verification results, the reader should note that a mean algebraic error near 0, a low mean absolute error, and a low Brier score (all measures of accuracy) are desirable.  Conversely, higher values of the Heidke skill score, the threat score, and the percentage of wind direction errors of ≤ 30 degrees indicate greater skill.