Initialization Strategies for Climate Forecast System

Ben Kirtman

George Mason University and the Center for Ocean-Land-Atmosphere Studies

I.  Improving Coupled Seasonal Predictions

It is now widely recognized that the predictability of the climate system on seasonal to interannual time scales is the result of complex coupled interactions among the ocean, the atmosphere and the land surface. Indeed, the El Niño/Southern Oscillation (ENSO) phenomenon, which is largely (but not completely) due to air-sea interactions in the tropical Pacific, is acknowledged as the most predictable element of the climate system on these time scales. Despite this recognition, there are profound gaps in our ability to predict the climate system, and ENSO in particular. Much of this limitation stems from large systematic errors in the models, especially coupled models, and these errors ultimately limit our ability to predict ENSO. Two specific errors are of interest here: (i) the coupled model mean state does not agree with the observed mean state with sufficient fidelity and (ii) the space-time evolution of the simulated climate anomalies is not sufficiently realistic.

The purpose of this discussion is, first, to document the implications of these two errors for forecast evolution and, second, to propose strategies for improving the coupled forecasts in the face of these errors. Here we focus on what I consider to be one of the best currently available coupled models, namely, the recently released NOAA Climate Forecast System (CFS; see Saha et al. 2006; J. of Climate to appear).

II.  Mean State Errors – Initialization Shock 

To understand the implications of the errors in the mean state for forecast skill (i.e., the first error noted above), we need to note how the coupled forecasts are initialized. In broad terms, the separate atmosphere and ocean initial states are model-based estimates that are intended to be best estimates of the observed climate. In other words, they are constructed without recognition of the errors in the coupled model. In fact, the sub-surface ocean thermal climate associated with the ocean initial conditions, i.e., the Global Ocean Data Assimilation System (GODAS), is significantly different from the climate of the free-running CFS. As a consequence, at forecast initialization, the CFS model rapidly adjusts away from the GODAS climate towards the CFS climate. This adjustment is primarily accomplished via Kelvin waves (in the tropics), which ultimately lead to an erroneous SST response 2-4 months into the forecast evolution. This is often referred to as an initialization shock. The question is how this mismatch between the observed (GODAS) and model (CFS) climates, i.e., the initialization shock, limits forecast skill. It should be noted that the same issue applies to the atmosphere, but since the “memory” of the coupled system primarily resides in the ocean we have not addressed this issue here. Nevertheless, it may turn out to be an important problem.

To address the initialization shock problem we have repeated several coupled forecasts using a procedure known as anomaly initialization. The procedure is specifically designed to reduce the impact of the errors in the mean state of the coupled model. The initial conditions in the experimental forecast are modified so that the initial state has a mean climate that is consistent with the coupled model climate and the initial anomalies are in agreement with the observed anomalies. Several experimental forecasts were made using this approach and there was little discernible impact on the forecast skill. On the one hand, this is a positive result: it suggests that the mean state of the CFS is sufficiently close to the observed mean state. On the other hand, as we show below, the second error noted above is seriously limiting the forecast skill.
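The anomaly initialization step described above can be sketched as follows. This is only a minimal illustration in plain Python, assuming each field is a flat list of values on a common grid; the function name, variable names and the numbers are hypothetical and are not taken from the CFS or GODAS systems.

```python
def anomaly_initialize(obs_state, obs_climatology, model_climatology):
    """Build an initial state whose mean is the model climate and whose
    anomaly equals the observed anomaly (assumed layout: flat lists)."""
    if not (len(obs_state) == len(obs_climatology) == len(model_climatology)):
        raise ValueError("all fields must have the same number of grid points")
    return [
        model_clim + (obs - obs_clim)  # model mean + observed anomaly
        for obs, obs_clim, model_clim in zip(
            obs_state, obs_climatology, model_climatology
        )
    ]


# Hypothetical thermocline depths (m) at three grid points.
godas_state = [120.0, 95.0, 60.0]   # observed (analysis) initial state
godas_clim = [110.0, 100.0, 55.0]   # climatology of the analysis
cfs_clim = [130.0, 90.0, 70.0]      # climatology of the free-running model

initial_state = anomaly_initialize(godas_state, godas_clim, cfs_clim)
print(initial_state)
```

The observed anomalies (+10, -5 and +5 m in this made-up case) are preserved, while the mean state is the model's own, which is the property intended to reduce the adjustment at initialization.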

III. Errors in the Coupled Modes or Slow Manifold

The implications of the second error noted above are more subtle and more difficult to quantify. The basic idea is that the coupled “modes” (think ENSO) of the coupled model are different from the coupled modes of nature. Often we think of these coupled modes as a slow manifold. The consequence of this is that at initialization time the model’s coupled modes are not correctly initialized so that their evolution does not agree with the observed evolution.

To identify the coupled modes of the model and the coupled modes of nature (or in this case GODAS), we examine the evolution of the thermocline depth anomaly in the tropical Pacific. The analysis is based on simple lag-lead linear regression. Because thermocline depth anomalies are precursors for warm and cold ENSO events, this approach is justifiable in theory, although it remains ad hoc. This calculation is performed for the long simulation of CFS, the CFS forecasts and GODAS. In order to implement this procedure we need a mechanism for interpreting the long CFS simulation as a sequence of idealized forecasts so that lead time makes sense. For example, first consider each 1 January to 30 September period in the simulation as a 9-month “forecast.” Then the sub-surface ocean or thermocline initial condition for this forecast is the state of the ocean at the end of the preceding December. The issue then is how to relate the forecast evolution to the “initial” state at the end of the preceding December. In order to do this we apply lag-lead regression between the forecasted Nino3.4 evolution and the preceding December thermocline depth anomaly. The same approach can be applied to the GODAS data, and, in terms of the control CFS forecasts, the initial condition question is straightforward. (Additional details can be found in the Appendix.)
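As a rough illustration of the lag-lead regression, the sketch below regresses the Nino3.4 anomaly at each lead onto the preceding December thermocline depth anomaly across a set of idealized forecasts. It assumes the thermocline anomaly has been reduced to a single scalar index per forecast (a hypothetical simplification); repeating the calculation at every grid point would give a regression map for each lead. All names and numbers are illustrative only.

```python
def regression_slope(x, y):
    """Least-squares slope of y regressed onto x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_sq_x = sum((xi - mean_x) ** 2 for xi in x)
    if sum_sq_x == 0.0:
        raise ValueError("predictor has zero variance")
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cov_xy / sum_sq_x


def lag_lead_regression(dec_thermocline, nino34_by_forecast):
    """Regression coefficient of Nino3.4 onto the December index per lead.

    dec_thermocline[k]       : December thermocline index before forecast k
    nino34_by_forecast[k][j] : Nino3.4 anomaly of forecast k at lead j
    """
    n_leads = len(nino34_by_forecast[0])
    return [
        regression_slope(
            dec_thermocline, [row[lead] for row in nino34_by_forecast]
        )
        for lead in range(n_leads)
    ]


# Five hypothetical "forecasts" with two leads each (made-up values).
dec_index = [-2.0, -1.0, 0.0, 1.0, 2.0]
nino34 = [[-1.0, -2.0], [-0.5, -1.0], [0.0, 0.0], [0.5, 1.0], [1.0, 2.0]]

slopes = lag_lead_regression(dec_index, nino34)
print(slopes)
```

Comparing such coefficients, lead by lead, between the long simulation, the forecasts and the analysis is one way to see where their evolutions diverge.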

The result is that at short lead times (3 months) the simulation, forecasts and GODAS are in agreement. However, at longer lead times, say nine months, the forecasts and GODAS are similar, but the simulation is significantly different. The implication is that the CFS has a different ENSO from the observations and, as a consequence, if we are careful in how we initialize the forecast ENSO we may be able to improve the forecast skill.

IV.  An Analogue Approach to Identifying and Initializing the Slow Manifold

To identify the slow manifold of the coupled model that best fits the observed evolution of ENSO we look for analogues in the long simulation. In other words, are there ENSO events in the long simulation of the coupled model that closely match the observed evolution? The search for analogues is done entirely in terms of the Nino3.4 SSTA evolution for the target 9-month period. The analogues are identified by minimizing the root mean squared difference between the simulation Nino3.4 SSTA and the observed Nino3.4 SSTA calculated for the 9-month target period. For example, suppose we wish to identify an analogue for January 1997 to September 1997. We first calculate the observed Nino3.4 SSTA for this 9-month period and then search the long CFS simulation for a 9-month period (starting in January) that minimizes the root mean squared difference in the Nino3.4 SSTA. Once we find these analogues, we can then easily identify initial thermocline depth anomalies that are consistent with this evolution. In essence, we have identified the slow manifold of the coupled model that best reproduces the observed ENSO evolution. This approach is applied to several different observed ENSO events so that robust statistics of the slow manifold can be constructed. Based on these statistics of the model slow manifold, any observed initial state can be remapped onto the coupled modes of the coupled model (see the Appendix for additional details). Care must be taken so that this approach is applied in a cross-validated way.
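The analogue search itself can be sketched in a few lines. The example below is a minimal illustration, assuming the long simulation has already been cut into January to September segments of monthly Nino3.4 SSTA keyed by simulation year; the data layout, function names and values are hypothetical. The remapping of initial states and the cross-validation are not shown.

```python
import math


def rms_difference(a, b):
    """Root mean squared difference between two equal-length series."""
    if len(a) != len(b) or not a:
        raise ValueError("series must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def find_analogues(observed, simulated_by_year, n_best=1):
    """Return the n_best (year, rms) pairs with the smallest RMS difference.

    observed          : 9 monthly Nino3.4 SSTA values for the target period
    simulated_by_year : dict mapping simulation year -> 9 monthly values
    """
    scored = sorted(
        (rms_difference(observed, series), year)
        for year, series in simulated_by_year.items()
    )
    return [(year, score) for score, year in scored[:n_best]]


# Made-up target evolution (K), January to September.
observed = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]

# Three hypothetical segments from a long simulation.
simulation = {
    12: [0.0] * 9,                     # neutral year
    40: [v + 0.5 for v in observed],   # similar warm event, offset by 0.5 K
    77: [-v for v in observed],        # cold event
}

best = find_analogues(observed, simulation, n_best=1)
print(best)
```

The initial thermocline depth anomalies of the selected simulation years (here, the December preceding year 40) would then be the candidates for an initial state consistent with the observed evolution.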

Examples from 1988 and 1997

The above figure shows an example from the 1988 cold event. This case was chosen because the control forecast (shown in red) did a particularly poor job of reproducing the observed cold event. The control and experimental forecasts (the latter shown in green) were all initialized in January 1988. The observational estimate from GODAS is shown in blue. The control forecast is an ensemble of fifteen individual cases, whereas the experimental ensemble consists of only five members. The results are encouraging in the sense that the control forecast ensemble members are well outside the range of the observational estimate, but the experimental forecasts are quite close. In the experimental forecasts there is a discernible shift in the probability of a stronger cold event.

The second example above is from the 1997 warm event. In this case, we note that the experimental forecasts are an improvement only when we consider the complete evolution over the full nine months. The forecast is worse during the first few months, so we have to be careful about this approach. The control run is shown in red, GODAS is again in blue, and the green curves are my five forecasts. In terms of the RMS error of the whole evolution, the green curves are better than the red curves, but there is a cost in this particular case.

V.  Caveats

The approach is very ad hoc. I have arbitrarily chosen to optimize the 9-month Nino3.4 SST evolution. I could just as easily have chosen to optimize a different variable or a different period, say three months. In either case, the results would be different. Since this procedure narrowly focuses on eastern Pacific SST, we must ask how much damage we have done to the forecast in other regions. In these specific cases, the other basins are only minimally affected, but this needs further consideration and many more cases.

This approach also takes a very deterministic view of the forecast problem: if I have the right initial condition, I can get a perfect forecast, or at least a better forecast, for 9 months. I think there is some risk there. There could be westerly wind events or various kinds of stochastic forcing during the evolution of the forecast that are the key to capturing the amplitude in 1997. This approach simply cannot capture them, unless they are somehow built into the initial condition.

Finally, I note that this approach is made obsolete once the coupled model is sufficiently improved. Indeed, this obsolescence is my hope. Once this happens, the big breakthrough in coupled forecasting will come from coupled data assimilation. The grand challenges facing the climate prediction community involve improving the coupled models and developing the theory and technology for coupled data assimilation.

Appendix

Presentation at the 30th NOAA Annual Climate Diagnostics and Prediction Workshop in State College, PA, on 24 October 2005 (ppt file and voice record)

(Contact Ben Kirtman)