Part V: What is the role of EPS post-processing?

In reality, the model underlying an ensemble is biased, an EPS cannot fully and accurately describe all the sources of uncertainty in a forecasting system, and model spatial resolution has to be compromised because of the huge computing cost. The result is a suboptimal ensemble system. The defects of such a system include an ensemble mean that is no better than the control or other individual perturbed members, a suboptimal spread-skill relation (under- or over-dispersive spread), excessive outliers, unreliable probabilities and a lack of spatially detailed structure. Even with perfect IC perturbations, the ensemble-based PDF can be woeful at so-called “unpredictable spots” as long as the model has even a very small error (Du, 2005). Post-processing is therefore a necessary and important step for calibrating raw ensemble forecasts. For example, removing the model's systematic bias (1st moment) makes the ensemble mean forecast more likely to be close to the best solution, significantly reduces outliers and makes probabilities more reliable. For a multi-model ensemble system, removing the bias of each model before the sub-ensembles are combined into one grand ensemble also ensures that no spurious spread is introduced. Removing bias is also very important for ensemble-based data assimilation techniques. Calibrating the 2nd moment (forecast variance) can improve the spread-skill relation and remedy under- or over-dispersion. To further improve the reliability of a probabilistic forecast, higher moments (such as the shape of the PDF) also need to be calibrated. In many applications such as hydrology and fire weather, downscaling of a lower-resolution ensemble is necessary to resolve local-scale features.
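As a rough illustration of 1st- and 2nd-moment calibration, the sketch below estimates a mean bias from a short training sample, subtracts it from every member, and then rescales the members' deviations about the ensemble mean. The function names, the sample values and the inflation factor of 2.0 are all hypothetical and chosen only for illustration; an operational scheme would estimate the factor from training data and would work on full fields rather than single values.

```python
from statistics import mean

def mean_bias(past_forecasts, past_observations):
    """1st-moment error: average of (forecast - observation) over the training days."""
    errors = [f - o for f, o in zip(past_forecasts, past_observations)]
    return mean(errors)

def calibrate_members(members, bias, spread_factor):
    """Subtract the bias from every member, then rescale deviations about the mean."""
    debiased = [m - bias for m in members]
    center = mean(debiased)
    return [center + spread_factor * (m - center) for m in debiased]

# Hypothetical training sample: ensemble-mean temperature forecasts and observations (deg C).
past_forecasts = [21.0, 19.5, 23.0, 18.0]
past_observations = [20.0, 18.0, 22.5, 17.0]
bias = mean_bias(past_forecasts, past_observations)  # positive value = warm bias

# Hypothetical inflation factor (assumption). One possible estimate is the ratio of
# ensemble-mean RMSE to average ensemble spread over the training period.
today = [24.0, 25.0, 26.0]
calibrated = calibrate_members(today, bias, spread_factor=2.0)
print(bias, calibrated)
```

A factor greater than 1 widens an under-dispersive ensemble and a factor less than 1 narrows an over-dispersive one; the ensemble mean is unchanged by the rescaling step.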
Since no ensemble is perfect in the real world, the equal-likelihood property of the members might be violated (see Part 6); that is, members can differ in quality under different weather conditions, especially in multi-model or multi-physics systems. Under these circumstances, some kind of performance-based weighting of the members may be useful in practice before all members are combined into an ensemble product, which is another kind of post-processing. Removing bias is also believed to be important in searching for the “best member”. We always hope to know in advance which member will verify best, although this is almost impossible since all members should be equally likely in theory (see Part 6). However, if systematic bias can be completely removed from an ensemble, the ensemble median or mean (if the distribution is not too skewed) should verify best on average. Thus, for an individual event, we might be able to identify which member is likely to be the best, since the member closest to the median or mean should also verify close to the best. One might expect different members to serve as the “best member” at different forecast lead times, over different regions, or for different weather systems or parameters. In contrast, in a biased EPS the ensemble median or mean should consistently perform worse than the best member, so there is no easy reference for identifying the best member. All of this indicates the importance of ensemble post-processing.
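The idea of using the median as a reference for the likely best member can be sketched as follows. This is only a minimal illustration, assuming the members have already been bias-corrected and that each member is represented by a single value; the member names and numbers are hypothetical. For a gridded field, the absolute difference would be replaced by some distance measure over the region of interest.

```python
from statistics import median

def closest_to_median(members):
    """Return the name of the member whose value is nearest the ensemble median."""
    center = median(members.values())
    return min(members, key=lambda name: abs(members[name] - center))

# Hypothetical bias-corrected 24-h precipitation forecasts (mm) at one location.
members = {"ctl": 12.0, "p1": 15.5, "n1": 9.0, "p2": 13.0, "n2": 12.5}
print(closest_to_median(members))
```

Repeating the selection separately for each lead time, region or variable would allow a different member to be picked in each case, as suggested above.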

There are, in general, two kinds of post-processing approaches: statistical and dynamical. Because a statistical approach is based on past error information, it should work well when the bias is large and relatively constant from day to day, but poorly when the bias is small and varies with the flow, or when the weather regime changes. Statistical approaches come in many versions, including the commonly used running mean, which is an equally weighted average over a past period (e.g., Stensrud and Yussouf, 2003 and 2007; Yussouf and Stensrud, 2006); the simple decaying average, which uses a Kalman-filter-type adaptive algorithm to emphasize the most recent data, with weights that in effect decrease with data age (Cui et al., 2005); the regime-dependent analog approach, where the weighting depends on the flow pattern (Du and DiMego, 2008); linear regression (Krishnamurti et al., 1999 and 2000; Yuan et al., 2007a); Artificial Neural Networks (Yuan et al., 2007a and 2007b); and Bayesian Model Averaging (BMA). Some of them are more sophisticated than others. The basic idea of BMA is to create a probability distribution for each ensemble member, assign a weight to each distribution based on that member's past performance, and finally use the weights to combine all the distributions into one “master” probability distribution. The BMA approach is gaining popularity nowadays (Raftery et al., 2005; Sloughter et al., 2007; Wilson et al., 2007). For short-range forecasts (1-3 days), a short training period such as 14-30 days might be enough, while for longer-range forecasts (beyond a week) a much longer training period might be needed. The length of the training period also depends on the variable: shorter for temperature, for example, and longer for precipitation. For situations requiring a long training period, special datasets such as hindcasts or reforecasts (Hamill et al., 2004b and 2006) might be generated specifically for post-processing. Using reforecast data has been reported to calibrate probabilistic forecasts effectively, making them more reliable (compare Fig. 11 and 12; Hamill and Whitaker, 2007; Hagedorn et al., 2007; Hamill et al., 2007). Post-processing can be applied to the 1st moment (mean), the 2nd moment (variance) and higher moments (such as the shape of the PDF; Eckel and Walters, 1998). For example, statistical dressing and shadowing are ways to increase the spread of an under-dispersive ensemble (Roulston and Smith, 2003; Berrocal et al., 2007; Gilmour and Smith, 1997). Statistical approaches can also be applied to downscaling, where topography and other information might be taken into account at the same time. A dense observation network is certainly critical for statistical downscaling and other post-processing.
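The three BMA steps described above (one distribution per member, one weight per member, one combined distribution) can be sketched as a weighted mixture. The sketch below assumes a normal kernel with a common standard deviation centered on each bias-corrected member forecast, which is a common choice for temperature, and it takes the weights as given. In a real BMA system the weights and the kernel width are fitted to training data; all names and numbers here are hypothetical.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def bma_exceedance(forecasts, weights, sigma, threshold):
    """P(y > threshold) under a weighted mixture of normal kernels, one per member."""
    total = sum(weights)  # normalize so the weights sum to 1
    return sum(
        (w / total) * (1.0 - normal_cdf(threshold, f, sigma))
        for f, w in zip(forecasts, weights)
    )

# Hypothetical bias-corrected temperature forecasts (deg C) from three members,
# with assumed performance-based weights and an assumed kernel width.
forecasts = [28.0, 30.0, 32.0]
weights = [0.2, 0.5, 0.3]
probability = bma_exceedance(forecasts, weights, sigma=1.5, threshold=30.0)
print(f"P(T > 30 C) = {probability:.2f}")
```

Because each member contributes a full distribution rather than a single vote, the combined probability varies smoothly with the threshold, and members with better past performance pull the “master” distribution toward their own forecasts.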




Figure 11



Figure 12

In many real-world situations, the bias varies with the flow, and the systematic and random error components are hard to separate (such a separation may not even be physically meaningful, even if it can be done mathematically), so statistical methods do not work well and a flow-dependent dynamical approach is desired. There are no widely accepted dynamical methods yet, and this direction remains a wilderness to be explored. Based on the author's personal experience, candidate methods include multi-model approaches, dual-resolution approaches such as Hybrid Ensembling (Du, 2004), the spread-error relation and stochastic physics. Dual resolution is also common in dynamical downscaling. For very high-resolution dynamical downscaling, however, running a full-physics downscaling model might be too expensive. An alternative might then be to run the very high-resolution downscaling model with no or reduced physics, which might be a reasonable simplification for the short range. Stochastic physics has the potential to reduce bias by simulating various bias effects in the model equations. Since a favorable large-scale environment is necessary for an event such as heavy precipitation to occur, careful diagnosis of the related dynamical conditions, such as moisture convergence, vertical motion and instability, might also help to calibrate a forecast as a form of post-processing (Gao, 2007).

Contact  Jun Du