An empirical stochastic model of sea-surface temperature and surface wind over the Southern Ocean

Abstract. This study employs NASA's recent satellite measurements of sea-surface temperatures (SSTs) and sea-level winds (SLWs) with missing data filled-in by Singular Spectrum Analysis (SSA), to construct empirical models that capture both intrinsic and SST-dependent aspects of SLW variability. The model construction methodology uses a number of algorithmic innovations that are essential in providing stable estimates of the model's propagator. The best model tested herein is able to faithfully represent the time scales and spatial patterns of anomalies associated with a number of distinct processes. These processes range from the daily synoptic variability to interannual signals presumably associated with oceanic or coupled dynamics. Comparing the simulations of an SLW model forced by the observed SST anomalies with the simulations of an SLW-only model provides preliminary evidence for the ocean driving the atmosphere in the Southern Ocean region.


Motivation
This study addresses aspects of ocean-atmosphere interaction over the Southern Ocean using measurements provided by satellite sensors.Our objective is to quantitatively describe and analyze co-variability of sea-surface temperature (SST) and sea-level wind (SLW) in this region, by developing inverse stochastic models that are derived directly from the remotely sensed data.Empirical models of potentially Correspondence to: S. Kravtsov (kravtsov@uwm.edu)SST-dependent SLW variability can help analyze the coupled climate dynamics of the Southern Ocean, especially when combined with oceanic General Circulation Models (GCMs).
Climate variability over the Southern Ocean is likely to be of global significance, due to this ocean's special role in linking the Atlantic, Pacific, and Indian basins.However, progress in understanding the dynamics of large-scale airsea coupling over the Southern Ocean has been slow, largely due to the very low density of in situ measurements in this region.Recently launched NASA satellites provide accurate high-resolution global measurements of important climatic variables such as SST and SLW.These global fields now permit the construction of empirical air-sea interaction models for the Southern Ocean.Despite improved data coverage in the region, estimating the propagator of the above mentioned statistical models remains an ambitious and challenging task, since (1) there are still missing data due to the presence of strong winds or heavy rains, and (2) such a model has to have an unprecedentedly large number of degrees of freedom, due to high-dimensional nature of global-scale air-sea interaction.The model construction thus requires major algorithmic revisions and gap-free datasets, which we develop and describe in detail below.

Background
The Southern Ocean is the region south of roughly 30 • S that includes the Antarctic Circumpolar Current (ACC), along with the branches of circulations that link it to the Atlantic, Pacific, and Indian Oceans (Schmitz, 1996).
Published by Copernicus Publications on behalf of the European Geosciences Union.

Satellite data over the Southern Ocean
Poor spatial coverage by in situ measurements in the Southern Ocean prohibits direct comprehensive description of climate variability there.The data from NASA satellites launched over the past decade thus provide a unique source of precise measurements of climatically important quantities such as SST and SLW.Global coverage and fine resolution make them extremely valuable for studying air-sea interaction in the Southern Ocean.In particular, the microwavebased sensor AMSR on the AQUA satellite launched in 2002 samples SST field under clouds -an opportunity that was previously unavailable for infrared-based SST records over the typically cloudy Southern Ocean.
Microwave-based SST products (Kummerow et al., 2000;Wentz et al., 2000) have been utilized before to explore tropical SST variations (Hashizume et al., 2000;Chelton et al., 2001;Harrison and Vecchi, 2001;Vecchi and Harrison, 2002;Vecchi et al., 2003).We will use this type of measurements in the present paper to address air-sea interaction over the Southern Ocean.

Southern Ocean climate
The Southern Ocean is characterized by intense climatological westerlies that induce strong meridional Ekman transports and drive the ACC.The modes of climate variability in the Southern Ocean differ by their time scales and spatial signatures, as well as by specific dynamical mechanisms.Synoptic variability, with a time scale of a few days, is comprised of extremely powerful atmospheric storms associated with baroclinic Rossby waves passing over the region.These synoptic eddies cause large-amplitude SST responses mainly through enhanced vertical turbulent mixing in the oceanic boundary layer and through changes in the air-sea heat flux.An example of such coherent patterns of wind and SST anomalies associated with synoptic variability is shown in Fig. 1, which displays snapshots of these fields' anomalies on 1 December 2002; the anomalies were computed relative to the base 16-day period of 1-16 December 2002.The pattern correlation between the SST and SLW fields over the region shown in Fig. 1 is of about r = −0.73,which allows to reject the null hypothesis of zero correlation in favor of the alternative of negative correlation at 0.1 % level, according to the one-sided t-test with ν = 16 degrees of freedom (the value of t statistic is t = r ν (1 − r 2 ) ≈ −4.3).The number of spatial degrees of freedom within the region of interest in the above test was estimated based on the decorrelation scale of about 1000 km.

Air-sea coupling over the Southern Ocean
Synoptic eddies and the lower-frequency SAM that dominate climate variability in the Southern Ocean on weeklyto-intraseasonal time scales are due primarily to intrinsic atmospheric dynamics.Surface manifestations of these modes induce significant oceanic response.The SST variability associated with this response affects, in turn, the atmospheric flow.Additionally, the SLW variability may be modified, on an intraseasonal-to-interannual and longer time scale, by SST anomalies associated with intrinsic oceanic or inherently coupled processes, such as the PSA and ACW.Surface wind influences SST directly by modifying vertical turbulent heat exchange between the two fluids (Gill, 1982;Arya, 1988) and inducing strong horizontal Ekman transports in the oceanic mixed layer.High-frequency wind forcing also leads to significant long-term oceanic changes by affecting, among other things, the seasonal-mean subsurface temperatures and mixed-layer depths (Kamenkovich, 2005).In addition, surface-wind fluctuations can energize intrinsic oceanic modes, which may play an important role in the dynamics of the Southern Ocean (Wunsch, 1999;Weisse et al., 1999;Karsten et al., 2002;Gille, 2003).The SST signatures of these modes have structures that are different from that of a local SST response to wind forcing.These oceanic phenomena have long intrinsic time scales and may thus lead to partial predictability of the Southern Ocean climate.
The way SLW may respond to SST anomalies is via changes in stability of the marine atmospheric boundary layer.Air passing over a positive SST anomaly becomes more unstable; this leads to anomalous turbulent momentum flux and amplification of the surface wind (Arya, 1988).This effect was shown to be at work over the Eastern Tropical Pacific (Wallace et al., 1989;Liu et al., 2000;Chelton et al., 2001;Hashizume et al., 2001) and over the Southern Ocean (O'Neill et al., 2003) on seasonal-to-interannual time scales.Other dynamical factors may also contribute to this response at all time scales (Hsu, 1984;Lindzen and Nigham, 1987;Mitchell and Wallace, 1992).
The SST-induced modifications of the atmospheric boundary layer may cause changes in the free atmosphere's circulation.The linear response is expected to be weak, but nonlinear modes of atmospheric variability, such as SAM, may produce a stronger effect (Koo et al., 2002;Kravtsov et al., 2006a, b).Feliks et al. (2004Feliks et al. ( , 2007) ) have shown, in particular, how an oceanic thermal front may induce intraseasonal variability in the overlying atmosphere, including surfacewind evolution.
To summarize, the Southern Ocean is characterized by vigorous variability on a wide range of time scales.Air-sea interaction in the region is complex and difficult to repre-sent in dynamical models, as it involves a wide variety of boundary layer processes, as well as their coupling to intrinsic dynamics of the fluids on both sides of the oceanatmosphere interface.Statistically, however, these interactions may well be described by joint variability of SLW and SST.A purely empirical model of this co-variability, based on recent high-quality satellite observations, could provide an accurate quantitative description of air-sea interaction without having to resolve explicitly the complex chain of participating dynamical processes.

Empirical stochastic models of SST and SLW
Data-based inverse stochastic models used in climate dynamics generally belong to one of the two major groups: (i) multivariate parametric models with additive, state-independent noise, the simplest of which is the so-called linear inverse model (LIM) (Penland, 1989(Penland, , 1996;;Penland and Sardeshmukh, 1995;Penland and Matrosova, 1998;Winkler et al., 2001); and (ii) nonparametric, univariate or bivariate models involving state-dependent, multiplicative noise (Sura, 2003;Sura and Gille, 2003;Sura et al., 2006;Sura and Newman, 2008;Sura and Sardeshmukh, 2008).Both types of models can be useful in addressing various aspects of climate variability, but are very different in terms of how they are constructed, as well as in their potential applications.
In particular, the models with multiplicative noise consider the time series of a variable of interest (for example, u−component of surface wind, or SST) at a single spatial location, and estimate state-dependent drift and diffusion parameters of the stochastic differential equation (SDE), which presumably governs the evolution of this variable.In order to get reliable estimates of model parameters given relatively sparse observations, as is the case for Southern Ocean winds, one may concatenate data sets from multiple locations, which are situated far enough so that their respective time series may be assumed to be uncorrelated (Sura, 2003).The scalar SDEs so obtained describe local features of interactions between processes evolving on different time scales.They are particularly successful in interpreting some of the nongaussian aspects of both SLW (Sura, 2003;Monahan, 2004Monahan, , 2006a,b) ,b) and SST (Sura et al., 2006;Sura and Newman, 2008;Sura and Sardeshmukh, 2008) variability.The multiplicative noise in the above studies is attributed to random fluctuations of the drag coefficient or air-sea heat exchange coefficient.
On the other hand, multivariate parametric models driven by additive noise are usually constructed in the phase space of the leading empirical orthogonal functions (EOFs) (Preisendorfer, 1988) of the field(s) of interest, thus addressing non-local aspects of the variability under consideration.This non-locality comes at the expense of a fairly restrictive parametric dependency of the system's tendency on its state.In LIMs, for example, this dependency is assumed to be linear, while the model coefficients and noise parameters are www.ocean-sci.net/7/755/2011/Ocean Sci., 7, 755-770, 2011 found by multiple linear regression (MLR).LIMs driven by Gaussian stochastic forcing cannot model the nongaussian aspects of the observed statistics, but more general, nonlinear empirical parametric models can.Kravtsov et al. (2005b) developed a methodology for constructing such nonlinear empirical models, which also addresses some other weaknesses of LIMs.This methodology showed excellent results when applied to the problems of mid-latitude variability of geopotential heights (Kondrashov et al., 2006), as well as to describing tropical SST evolution (Kondrashov et al., 2005).

This paper
The purpose of the present paper is to construct an empirical model of SLW variability over the Southern Ocean by using concurrent high-quality satellite measurements of SLW and SST.Doing so requires the use of recent microwave-sensed SST fields available after the launch of AQUA in June 2002.
Since only about 5 years of such data are available, we do not attempt to develop a closed model that would simulate by itself long-term aspects of SLW-SST co-variability, such as ACW; this would require a much longer data set with enough degrees of freedom to capture interannual SST signals.Instead, the quantities involving SST observations will serve as predictors in the stochastic model of SLW evolution; the time-dependent SST anomalies themselves will be treated as given.We will show that this model is capable of reproducing the statistics of daily-to-intraseasonal SLW anomalies.As a brief introductory illustration of one of many potential uses of the empirical model constructed, we will present some evidence for large-scale oceanic imprint onto the atmospheric variability in the Southern Ocean by comparing the statistics of an SLW-only empirical model with that of a model forced by the daily history of SST anomalies.
Our statistical SST-dependent SLW model will also be able to capture some aspects of air-sea interaction and longer-term variability when coupled to a dynamical oceanic component.Experiments with such a coupled dynamicalstatistical model will be studied in a future paper.The application of our statistical model as a component of a hybrid coupled GCM requires that both local and non-local aspects of SLW variability and its coupling with SST variability be comprehensively represented in the empirical model.We will therefore build upon the methodology of Kravtsov et al. (2005b) to construct this model, but emphasize here that substantial modifications to that model construction technique are necessary, as detailed below.
As we have mentioned at the end of section 1.1, the high-dimensional nature of basin-scale air-sea coupling in the Southern Ocean region prohibits direct application of Kravtsov et al. (2005b) method and requires major modifications to the model construction algorithm; these changes, when applied to gap-free satellite datasets with missing data filled-in by M-SSA (Kondrashov and Ghil, 2006), are essen-tial in obtaining stable estimates of the empirical model propagator.
The rest of the paper is organized as follows.Section 2 describes the data sources, pre-processing and gap-filling methodology, as well as the data set's basic statistics.Section 3 outlines general, as well as novel technical aspects of the empirical stochastic model construction, with methodological details given in the appendices.The performance of our empirical models is evaluated in Sect.4, while Sect. 5 summarizes our results and elaborates on their significance.
2 Data, pre-processing methodology, and basic statistics

Data sources
The gridded data products used in this analysis are obtained from the Remote Sensing Sytems website (http://www.ssmi.com).The SST data are taken from the AMSR-E ocean data product (Version-5) for the time interval from June 2002 to February 2007 (Kawanishi et al., 2003).Missing data are due to sun glint, heavy rain, proximity of ice edge, and winds greater than 20 m s −1 .The wind speed and direction at 10 m a.s.l. are obtained from the QuikSCAT scatterometer dataset (Liu, 2002).The geophysical data record began on July 1999; for the analysis in this paper, we use data for the time interval that overlaps with that of the AMSR-E dataset.Although the scatterometer data tend to be less accurate in the presence of rain, we do not remove such data entries, since our statistical technique is based on analyzing spatial covariances within the fields considered; the small-scale random errors associated with rain occurrences will thus be effectively filtered out.
Both gridded data sets are available on a 0.25 • ×0.25 • grid twice a day, on ascending and descending paths.For our subsequent analyses, the data were averaged in space and time to produce daily values on a 2 • × 2 • grid.

Filling the missing data
While recent satellite observations over the Southern Ocean do have a previously unprecedented quality, there are still gaps in data coverage in the presence of heavy rains or strong winds; specifically, about 40 % of the points in the SST data set and 20 % in the SLW data set were missing.In order to fill these gaps in the data, we used the methodology of Kondrashov and Ghil (2006).Their algorithm is based on multi-channel singular spectrum analysis (M-SSA) (Ghil et al., 2002) and takes advantage of both spatial and temporal correlations in the existing data to iteratively produce estimates of missing data points, which are then used to compute a self-consistent spatiotemporal lag-covariance matrix; cross-validation is applied to find the optimal window width and number of dominant M-SSA modes to fill the gaps.The missing data have been filled-in for SLW and SST fields separately; that is, cross-correlations between these two fields were not exploited.Since the total number of spatial grid points exceeds the temporal length of the data set (in days) for both SLW and SST, we utilized the "reducedcovariance" approach (Ghil et al., 2002) to compute the spatio-temporal lag-covariance matrix.Based on the results of cross-validation experiments, we chose the lag of 1 day and 300 M-SSA modes for filling SLW components and the lag of 5 days and 160 M-SSA modes for SST.The domainaveraged root-mean-square (rms) error for filled-in values is estimated to be 0.44 • C for SSTs and 1.7 m s −1 for SLW components.

Filtering
We considered continuous, filled-in by M-SSA data sets of daily SST scalars and SLW vectors on a 2 × 2 • grid (65 • − 30 • S), for the period of 1 June 2002-13 February 2007, for a total of 1719 days.We first removed the seasonal cycle by retaining, at each grid point, only the residual of the multiple linear regression of the original, unfiltered time series onto a ten-variable "seasonal cycle" time series.The latter time series had the form (sin(2πnt/365), cos(2πnt/365)) , where time tis measured in days and changes from t = 0 to t = 1718, while n = 1,2,3,4,5.The filtered versions of the original SST and SLW time series were then also linearly detrended to get rid of secular variability, since our statistical models are assumed to be stationary.

Low-order moments
Figure 2 shows a few of low-order moments of the filtered anomalies so obtained.The time-mean wind is plotted in panel (a), with the wind speed given by color shading, and the direction of the wind by arrows.The winds are predominantly westerly, as expected, and their spatial pattern represents a mid-latitude jet, whose axis is located at about 50 • S between South America and Australia, and at about 55 • S elsewhere; the strength of the jet in the latter region is somewhat weaker, with the exception of even weaker time-mean winds just east of South America.The standard deviation of the wind speed shown in panel (b) is fairly uniform throughout the Southern Ocean, with the most intense variability south of the stronger portion of the jet, and the weakest variance at the northern edge of the Southern Ocean.Modern data sets thus indicate that, at the 2 • ×2 • resolution used here, the "furious fifties" are much more intense than the "roaring forties" of sailing days.Moreover, at this resolution and on a 5-yr average, winds off Cape Horn or the Cape of Good Hope are not particularly strong, although their standard deviation is maximal off Cape Horn and above the Agulhas Current, east and south of the Cape of Good Hope.
Color shading in panel (c) shows the distribution of the skewness of the zonal component of the surface wind, which is found to be negative in the majority of the basin.Monahan (2004) explains this property of the zonal-wind anomalies in terms of a nonlinear surface drag law: according to this law, positive anomalies in u in the region with positive timemean zonal winds will be subjected to stronger friction than negative anomalies, so that the resulting u-wind distribution will be negatively skewed.
The time-mean SST field (Fig. 2d) is consistent with the climatological wind (Fig. 2a) in that the strongest SST front is co-located with the strongest zonal jet, south of Africa and further eastward, at 40 • S.This is presumably the region of the strongest ACC as well.The north-south SST gradients elsewhere are weaker.The SST variance (not shown) is largely uniform throughout the Southern ocean, with values around 1-2 • C.

Principal component (PC) analysis
Prior to computing the EOFs and PCs of SLW and SST, we multiplied the time series of these quantities at each grid point by the square root of the cosine of its latitude, to account for the meridian convergence and get area-weighted grid-point contributions to the total variance of each field.The EOFs of SST and SLW were computed separately, and we used the combined (u, v) field to compute the latter.The percentages of variance accounted for by the first 100 EOFs of SLW are shown in Figs.3a, b, while the analogous plots for SST EOF are in Figs.3c, d.
Two leading SLW EOF pairs are somewhat separated from each other and from the rest of the modes (Fig. 3a) and together account for about 20 % of the total SLW variance (Fig. 3c).These two pairs are associated with the leading synoptic disturbances in the "weaker jet" (160 • W-80 • W) and "stronger jet" region (40 • W-130 • E), respectively; compare Fig. 2a   shows that these two modes are characterized by zonal wave numbers 8 and 9, respectively.For the SST, only two leading EOFs stand out from the rest (Fig. 3c), and account for about 12 % of the total SST variance (Fig. 3d).Both of these EOFs have a wavelike pattern with dominant zonal wavenumbers 3-4 (Figs.5a, b) and pronounced interannual variability (Fig. 5c) suggesting their possible association with the ACW.In particular, if the time scale T of the ACW is set up by advection processes (Weisse et al., 1999), then T = L/U , where L and U are length and velocity scales, respectively.For wavenumber-3 patterns (Figs.5a, b) L∼6000 km; given typical advective velocities of U =10 cm s −1 , one then ends up Figure 6 shows integral correlation time scales T int of the leading 100 EOFs of SLW in panel (a) and SST in panel (b).The quantity T int was defined as T int = 100 τ =1 |c(τ )| τ , where c(τ ) is the autocorrelation of a given PC at the lag τ (in days), and τ = 1 day.In general, the integral correlation time scale of the trailing modes is shorter than that of the leading modes, for both the SLW and SST PCs, although the dependence of T int on the mode number is not monotonic.The leading EOF pairs of SLW have time scales of about 5.5 and 4 days, respectively, while the leading EOF pair of SST is characterized by T int ≈60 days.The latter estimate is an order of magnitude longer than the maximum T int of the SLW EOFs, which is of about 7 days.Hasselmann (1976) introduced a null hypothesis for low-frequency variability SST anomalies, which involved integration of fast and essentially random air-sea heat fluxes by an ocean mixed layer.The fast random heat flux forcing was associated with SLW variability, while the longer time scale of SST anomalies arose due to ocean mixed layer's thermal inertia.We argue that leading SST modes are not consistent with this null hypothesis for two reasons: (i) the Hasselmann mechanism is local, implying positive spatial correlations between SLW and SST patterns, whereas the patterns in Figs. 4 and 5a, b are not so correlated; and (ii) the interannual time scales of the leading SST modes (Fig. 5c) are longer than those associated with mixed layer thermal inertia.We thus conjecture that the leading SST modes arise from intrinsic ocean dynamics.
In fact, the arguments of the latter paragraph apply to most of the SST EOFs, more so for leading modes, and to a somewhat smaller degree for the trailing modes.The empirical stochastic models of SLW constructed in the next section will include the dependence on SST anomalies that span the subspace of their leading K EOFs, with K = 50 and K = 75.Therefore, any sensitivity of the SLW variability produced by empirical stochastic models to these SST anomalies should be interpreted as that caused by SST variability, rather than vice-versa.

General methodology
We construct empirical stochastic models in the phase space of M leading EOFs of SLW, for various values of M(10-100), following the general methodology of Kravtsov et al. (2005b).In order to do so, we first form daily tendencies of N leading PCs of SLW: the tendency at day n, for example, is approximated as the difference between the value of a given PC at day n+1 minus the value of this PC at day n.The M time series of tendencies so obtained represent our response variables.
We will consider several versions of the empirical models.In the simplest, linear case, the main level of the empirical model is obtained by multiple linear regression (MLR) of each response variable onto M leading PCs of SLW, resulting in an equation of the form where x is an M-component vector of the leading PCs, B is an M × M matrix of the regression coefficients, while r is the vector of M residual time series uncorrelated with each of the predictor variables; as before, nis the time index (in days).If we model r as vector-noise dw that is white in time, but spatially correlated, then the formulation of Eq. ( 1) is a so-called linear inverse model (LIM) of SLW variability.
The spatial correlation refers to that between the different components of the residual time series in r (and dw).Given random realizations of the forcing dw, the LIM (1) can be integrated to produce surrogate time series of the M leading PCs of SLW, which can then be translated into variable SLW patterns in physical space by summing the SLW EOFs multiplied by the value of the corresponding surrogate PC at a given time.The statistics of such surrogate SLW realizations can then be compared to that of the observed anomalies to judge the performance of the LIM.Kravtsov et al. (2005b) introduced several improvements to the LIM (1).In particular, it often happens that the autocorrelation of the residuals at nonzero lags is not negligible.In order to address this problem, Kravtsov et al. (2005b) proposed to construct an additional level of the inverse model; at this level, the tendencies of the main-level residuals are modeled as a linear function of the extended state vector [x n ; r n ], consisting of the M original PCs, plus M first-level residuals: This exercise produces M second-level residuals: if the latter are not white in time, their tendencies can in turn be modeled as a linear function of the extended state vector consisting www.ocean-sci.net/7/755/2011/Ocean Sci., 7, 755-770, 2011  now of the M original PCs, M first-level residuals, as well as M second-level residuals.Additional levels can be added in the same way until the residual time series becomes white in time.Note that this procedure is different from merely modeling the main-level residual as colored noise, since it also takes into account any hidden dependency of the residual tendencies on the main-level PC predictors.
Another modification, which proved useful in modeling tropical SST evolution in Kondrashov et al. (2005), consisted of the inclusion of an explicit seasonal cycle.Despite our having removed the explicit seasonal cycle from the SLW the empirical model).Left panels: the optimal number of PLS components; right panels: the 6 percentage of variance unaccounted for by the regression; x-symbols show the results of PLS 7 regression using the optimal number of latent variables, while the circles display the results of 8 standard MQR, with all predictors considered.9 Fig. 10.PLS cross-validation results for the three-level empirical model with quadratic nonlinearity in the main level.The error bar plots show the mean and standard deviations of each quantity displayed computed using individual values of this quantity for each of the model equations (the number of equations is equal to the number of original PCs simulated by the empirical model).Left panels: the optimal number of PLS components; right panels: the percentage of variance unaccounted for by the regression; x-symbols show the results of PLS regression using the optimal number of latent variables, while the circles display the results of standard MQR, with all predictors considered.fields prior to constructing our empirical stochastic model, the parameters of this model may still have some seasonal dependence.Kondrashov et al. (2005) found that the optimal way to incorporate such dependence is two include, at the main level of the model, two additional predictors, namely sin(2π t/365) and cos(2π t/365).The remainder of the procedure is unaltered, and the construction of the additional levels of the empirical stochastic models proceeds as described above.
Finally, the most significant modification of LIM methodology in Kravtsov et al. (2005b) was to consider nonlinear combinations of basic predictors.For example, one can include, in addition to M predictor variables (PC-1-PC-M), all possible quadratic combinations of PCs: the product of PC-1 with all of PC-1-PC-M, plus the product of PC-2 with PC-2-PC-M, and so on.Using index notations for vectors, matrices, and tensors, and assuming implicit summation over repeating indices, the modified main-level equation can be written as The coefficients of such a regression model are also found by the MLR procedure; however, since this procedure now employs an extended vector of predictor variables and their quadratic combinations, it is called multiple quadratic regression (MQR).Kravtsov et al. (2005b) argued that it is best to restrict the nonlinear modifications to the main level of the Ocean Sci., 7, 755-770, 2011 www.ocean-sci.net/7/755/2011/empirical model, while the construction of additional levels proceeds as before.The main advantage of a nonlinear empirical model is that it can address nongaussian aspects of the observed variability.Its main disadvantage is a potentially much larger number of predictors: in a quadratic model based on M PCs and including two periodic seasonal cycle variables and a constant forcing term, the number of predictors is M × (M + 1)/2 + M + 3, and so is the number of coefficients that need to be determined by the regression procedure for each of the M response variables.Kravtsov et al. (2005b) argued that this problem may be efficiently addressed by a variety of regularization procedures that allow one to avoid overfitting and construct nonlinear multi-level stochastic models with optimal predictive capabilities.We built on this approach here to develop a novel regularization algorithm for robust estimation of empirical model coefficients (see the appendices).

Stochastic model versions
Using the regularization methods described in appendix A, we have constructed several empirical model versions, which differed by the number of PCs considered, the order of nonlinearity at the main level of the model, and the presence or absence of the dependence on SST.All models had three levels, with levels 2 and 3 being linear.The SLW-only models with linear and quadratic main level were obtained in the subspace of M = 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, and 100 PCs, and the cubic models for all of the above M ≤ 40.
We found that the performance of all these model versions is very similar, for a given M.This result argues for using the linear SLW-only model, because it has the simplest form and the smallest number of coefficients; hence, it is more reliable and easily implemented than nonlinear models.The SLW model with SST dependence was based on M = 100 leading PCs of the SLW fieldx, and L = 50 or L = 75 leading PCs of the SST fieldy The main level of the SST-dependent SLW model included linear dependence on SLW and SST PCs, the cross-product of each SLW PC with each SST PC, constant forcing term, and two seasonal cycle variables; no quadratic combinations of SLW PCs or SST PCs were used as predictors: 4 Performance of empirical stochastic models

Simulation procedure
Empirical models constructed using the methodology described in the previous section and the appendices were used to produce 100 surrogate simulations of SLW variability, each of these simulations being 1719-day long.Despite the regularization applied when constructing regression models, a few of the simulations using nonlinear models exhibited instability.In order to avoid such situations altogether, we have used the following procedure (Kravtsov et al., 2005b).
The models were integrated in ten-day chunks.The first chunk was started from random initial states.If at any time during this ten-day period the absolute value of any of the variables ended up outside their range (the latter ranges determined based on the values obtained for the training, modelconstruction period), then this ten-day simulation was discarded and restarted from another random state.The procedure was repeated as many times as necessary until a ten-day simulation with the values of all variables within the specified range was obtained.The final state from this simulation was then used to initialize the next ten-day simulation, for which the ranges of the variables were in turn monitored as before, and so on.
The threshold value for the PCs was computed as the observed maximum of the absolute value, over all the PCs and during the whole observational interval; the threshold values for the second and third-level variables were computed in the same way using "observed values" of these quantities.We kept track of the number of times the threshold condition above was violated, during each of 1719-day surrogate simulations.Table 1 lists the average values of this number for the simulations using linear, quadratic, and cubic SLW-only models.The average is computed over 100 available realizations of each empirical model.
In general, the number of initial-state resets is small, on the order of 10-20 resets during 1719-day-long simulation.Note that the resets do not always reflect instability -our linear model is, for example, always stable, in agreement with LIM theory (Penland, 1989(Penland, , 1996;;Penland and Ghil, 1993), but it still produces the values that exceed chosen thresholds from time to time.On the other hand, the cubic model for M = 35 and 40 does run out of control and may produce unbounded realizations of the simulated fields.Finally, the models that include SST forcing are not listed in Table 1, because they never produce realizations that exceed the threshold values.This fact suggests that coupling with SST is important for properly modeling SLW variability (see also Sect The empirical model of 10 leading PC components of SLW (upper row) produces a time series with a substantially smaller variance of the wind at the given location, while the time scale of SLW anomalies there is overestimated.Both of these results are to be expected, since the leading SLW EOFs account for a limited fraction of total variance (Fig. 3) and are generally characterized by the longest time scales (Fig. 6).Including progressively more components into the empirical model achieves continuous improvement of these two characteristics of SLW variability, with the 100-component model capturing quite well both the variance and the time scale of SLW anomalies.None of the model versions, however, captures the observed negative skewness of the zonalvelocity distribution.In fact, the quadratic model PDFs are essentially Gaussian and very similar to the ones obtained using simulations of the cubic and linear models (not shown).
We have tried a number of ways to better capture the skewness of the zonal-wind anomalies in our empirical models.These attempts included choosing a different EOF basis, which arranged the SLW patterns so that each of them would capture a significant fraction of variance, while having maximally skewed distribution, as well as blending our multi-level model methodology with the multiplicative-noise techniques of Sura and collaborators (see Sect. 1.2), but still failed to reproduce the negative skewness of the zonal-wind anomalies.We think that the reason for this failure is that the dynamics behind this negative skewness is essentially local, as it involves the effectively larger surface drag for positive u-wind anomalies in the region of the positive time-mean uwind (Monahan, 2004).Considering the anomalies in the EOF basis does not optimally represent such local dynamics: Each of the PCs turns out to possess skewness values smaller than the typical skewness of the zonal wind at a certain grid point, and this skewness is identified by the regression procedure as negligible; hence, this non-Gaussian aspect of zonalwind behavior is not properly represented in our empirical models.Non-local dynamics, though, are well represented in our statistical models of SLW evolution, as we will see in Sect. 4.3.The correspondence between the observed and simulated statistics for meridional SLW components is similar or better than that in Fig. 7 (not shown), since the meridional wind distribution is generally more gaussian.Similar results are also obtained for other locations in the Southern Ocean (not shown).These local results are essentially indistinguishable between all versions of the empirical models including the SST-dependent version, given the number of SLW PCs considered.

SST effects on SLW evolution
We show here some preliminary evidence for the substantial oceanic imprint onto Southern Ocean's SLW variability; this oceanic effect is a necessary condition for the existence of active ocean-atmosphere coupling there.In order to do so, we have computed ensemble-averaged evolution of the SLW anomalies for a 100-member ensemble using the empirical stochastic model forced by the history of the observed SST anomalies, as well as this evolution for the SLW-only stochastic model.We then computed the standard deviation of the ensemble-averaged wind speed for both cases, at each grid point: the results of this computation for the SSTdependent SLW model are shown in Fig. 8.The standard deviation in the SST-dependent case is much larger (by a factor of 5-10), at all grid points, than that in the SLW-only case (not shown), and exhibits a distinctive large-scale spatial pattern, suggesting this SLW variability is forced by long-term, ocean-induced SST anomalies.We plan to address this intriguing behavior in a future paper (see Sect. 5).
In summary, the model constructed in the phase space of 100 leading EOFs of SLW and including, in addition, linear and bilinear interactions with SST anomalies restricted to the subspace of 75 leading EOFs of SST, as well as the seasonal effects, is stationary and captures several local and non-local aspects of SLW evolution, on all time scales.We plan to use this model as the atmospheric component of a hybrid coupled model in which the oceanic component will be a state-of-theart GCM (see Sect. 5).

Summary and discussion
We have analyzed five years of remotely sensed data sets of sea-surface temperature (SST) and sea-level wind (SLW) over the Southern Ocean; the microwave sensors installed on recently launched NASA satellites provide an unprecedented quantity and quality of observations in the region.The missing data due to heavy rains or cloud coverage has been filled-in by singular spectrum analysis (SSA).The main technical outcome of this investigation is the construction of a statistical, stochastically forced model of SLW over the Southern Ocean; the model construction algorithm uses a number of essential innovations required to obtained robust estimates of the model's propagator.This model captures detailed features of SLW variability on a wide range of time scales, from daily to interannual, and spatial scales spanning the range from the atmospheric Rossby radius to the basin scale.The model also accounts for ocean-atmosphere coupling via dependence of SLW equations on the SST anomalies.
The model's potential in helping to interpret observed evolution of Southern Ocean's climatic variables is briefly illustrated by identifying substantial oceanic imprint onto SLW variability, which may be indicative of possible coupled ocean-atmosphere effects in the Southern Ocean: ensemble averaging over 100 simulation of the statistical model forced by the observed SST anomalies reveals variability of a large magnitude and distinctive spatial pattern.The analogous ensemble average based on simulations of the SLW-only model is characterized by a very small magnitude and a lack of spatial coherence.
The construction of the above statistical models is rooted in the empirical methodology of Kravtsov et al. (2005b) and Kondrashov et al. (2005Kondrashov et al. ( , 2006)); however, the model construction algorithm is substantially modified and improved here in a number of ways that help choose the optimal model structure (see the appendices).These modifications make Kravtsov et al. (2005b) technique, previously used to identify low-dimensional behavior within high-dimensional noisy data, applicable to the analysis of the phenomena involving intermediate number of degrees of freedom.In particular, the most comprehensive statistical model operates in the subspace spanned by 100 leading empirical orthogonal functions (EOFs) of the daily SLW over the Southern Ocean, thus modeling the evolution of 100 corresponding principal components (PCs); the seasonal cycle was removed from all fields prior to performing the principal component analysis.
The model equations relate the time derivative of each PC to the right-hand side consisting of three parts: the part that depends on SLW only, the SST-dependent part, and the variable forcing term.The first part is approximated as a linear function of all PCs of the SLW field.The dependence on SSTs is modeled as the linear function of the leading 75 PCs of the SST, plus bilinear terms involving the cross-product of SLW and SST PCs; since this part is nonlinear, the seasonally dependent forcing term is also included.The variable forcing that drives the variability in the model is simulated in a separate set of equations that relate the time derivative of each component of the forcing vector to the linear function of SLW and SST PCs, as well as the forcing vector itself, and also include the second-level variable forcing.The second-level forcing's tendency is in turn modeled linearly in a way analogous to the main-level forcing, while the variable forcing at this last, third level of the model is approximated as spatially coherent noise that is white in time.The construction of this statistical model involved a novel multistep regression algorithm to compute the coefficients of the model's propagator, as well as to determine the parameters of the noise.We plan to use the statistical model constructed in the present study to further investigate the dynamics of oceanatmosphere interaction over the Southern Ocean.In particular, our current results may suggest the presence of active coupling in the region by identifying a nontrivial SLW response to the observed SST anomalies, although Bretherton and Battisti (2000) proposed alternative explanations to such findings.Goodman and Marshall (1999), on the other hand, formulated a theory of interannual-to-decadal coupled variability that is potentially applicable to the Southern Ocean.This theory predicts the existence of coupled modes, given a certain spatial phase relationship between SST patterns and SST-induced SLW anomalies; this phase relationship gives rise to Ekman pumping anomalies that force and modify the oceanic circulation and the associated SST field.It would be interesting to check whether we can detect such a phase relationship in our statistical model.Another very promising way to apply our empirical SLW model is to couple it to an oceanic GCM.We plan to achieve this coupling by blending the SST-dependent SLW model with atmospheric boundary layer model of Seager et al. (1995).The latter model needs the specification of boundary-layer winds to compute ocean-atmosphere heat fluxes.These winds will be supplied by the statistical model, and will also be used to compute the atmosphere-ocean momentum flux.The ocean model forced by heat, moisture, and momentum fluxes will predict the evolution of the SST field, which will, in turn, affect the future SLW anomalies.The experiments with such a hybrid coupled GCM of the Southern Ocean regions may provide invaluable insights into the dynamics of climate variability there.
S. Kravtsov et al.: A model of air-sea interaction over the Southern Ocean Appendix A

PCR and PLS regression
The main regularization tool is cross-validation, in which one chooses randomly a subset of the vector time series (in the analyses below, we typically consider 80 % of original data points), applies a given regression technique, and then uses the regression model to reconstruct the segments of the time series that were omitted in the model identification step.The performance of the regression technique may then be assessed according, for example, to the smallness of the differences between the regression-based prediction and the actual values of the time series.We will use cross-validation in a number of different ways when constructing the empirical models below.
A major problem in applying MQR or MLR based on a large number of predictors is multi-collinearity (Press et al., 1994).This problem can be avoided by finding linear combinations of original predictors in such a way that their time series are uncorrelated, while each linear combination accounts for the maximum possible amount of the total variance.A natural way to determine this modified set of predictors is to apply principal component analysis to the original vector of predictors, and then use cross-validation for finding the optimal number of PCs to retain in the regression; this procedure is called the principal component regression (PCR).Note that since we construct our empirical models in the phase space of the data set's EOFs, the predictor variables in an LIM are already uncorrelated.On the other hand, the MQR predictors are the original set of PCs augmented by their quadratic combinations.Therefore, applying principal component analysis to this new multivariate data set generally produces a different set of predictors.
The PCR results for MQR based on several numbers of PCs, M=10,15,20,25,30,35,and 40,are displayed in Fig. 9; the values of M are shown on the abscissa of this graph.We computed the optimal number of PCR predictors for each of the M equations of the quadratic regression model.We thus obtained, for the model describing the evolution of M leading SLW PCs, M estimates of the optimal number of PCR components.The error bar plot in Fig. 9 shows the average value of this number over the Mavailable estimates, along with its standard deviation.The dependence of the optimal number of PCR components on the number of original PCs is very well approximated by a linear fit (heavy solid line); this number is much smaller, for large M, than the maximum possible number of variables, which is equal, for MQR, to M × (M + 1)/2 + M + 3.
PCR does a fairly good job in picking the smallest set of uncorrelated predictors that capture most of the variance.However, the choice of the PCR predictors does not involve at all the information about how well these predictors are correlated with the response variable.The procedure that does take into account this additional information is called partial least-squares regression (PLS); see Abdi (2003) for a brief, but comprehensive review.We apply PLS to the set of optimal predictors determined via PCR cross-validation (Fig. 9), rather than to the original, much larger set of predictors.
Similarly to the PCR procedure, the leading PLS predictor is defined as a linear combination of the original predictor time series, but in this case the quantity being maximized is the correlation between this time series and the predictor time series.We found that applying PLS to each response variable individually produces better results than the matrix formulation of the PLS algorithm, which also considers linear combinations of all response variables and finds two sets of coefficients that define the mode of response and the mode of predictor variables that are maximally correlated (Abdi, 2003).In the general multivariate case, the weights of the leading PLS mode are found using Singular Value Decomposition (SVD) as the first right singular vector of the matrix X T Y, where X and Y are the matrices whose columns are the time series of the predictor and response variables, respectively.The right singular vectors of X T Y define the weights for the response variables; in the univariate case, the single such weight is naturally equal to 1.
The time series of the leading PLS mode is obtained by summing the original time series of the predictor variables with the weights obtained as above.The signal associated with the leading PLS mode is then regressed out of both the response variable(s) time series, and all the predictor time series; this is done, once again, by only retaining the residual of the linear regression of each of these time series onto the time series associated with the leading PLS mode.The above procedure is then applied to the "reduced" response and predictor time series to obtain the next PLS mode, and so on to obtain all the PLS modes.The optimal number of modes to retain in this procedure is also determined by crossvalidation.
The PLS cross-validation results for the main level of the quadratic models based on M = 10, 15, 20, 25, 30, 35, and 40 PCs are shown in the upper row of Fig. 10.The error bar plot in the left panel is analogous to that in Fig. 9, and shows, in this case, the optimal number of PLS components, which is found to be less than 10 for all M. The error bar plot with x-symbols (solid lines) in the right panel shows the residual variance as the percentage of the total response-variable variance; the expectation value and the standard deviation for a given M are, once again, based on the results of the PLS procedure applied to each of the M response variables (and, of course, the same set of original predictors).The additional error bar plot in the same panel (dashed line with circles) shows the same quantity based on the full MQR, which uses all of the original response variables.Note that for M = 10, 15, and 20, only a few (definitely less than 10) effective predictor variables found by consecutive application of the PCR and PLS methodologies capture essentially the same amount of variance in the response variables as the MQR based on 63, 138, and 223 variables, respectively.For M = 40, the residual variances differ by a factor of 2, which indicates that the additional variance "captured" by the original MQR procedure is associated with a substantial overfitting.
The additional panels in Fig. 10 show analogous results for the second (2M original predictors) and third (3M original predictors) level of the wind-only empirical stochastic model.The PCR pre-processing has not been applied to these levels, so that the PLS regularization acted directly on the original PCs and residuals.In each case, about a dozen optimal predictors are identified, which capture essentially the same amount of the response variance as the full MLR model for this level.Note that the residual variances become increasingly close to 50 % for the second and third level.Since our response variables have the form r n+1 -r n and the predictors include the term r n , the case with no prediction skill (that is, r being pure white noise) will identify the regression coefficient multiplying r n to be equal to -1, and all other coefficients to be zero.In this case, the residual will be exactly equal to r n+1 , and therefore the residual variance will be exactly equal to the 50 % of the response-variable variance.The deviations of the residual variance from 50 % in the fourth level of the wind-only regression model are negligible (not shown), thus identifying the three-level empirical model to be optimal.

Appendix B Selection of predictor variables
A few regression coefficients found by the application of PCR-and-PLS regularization, as described in appendix A, can be translated by trivial matrix manipulation into the coefficients of the empirical model in the original predictorvariable basis.Many of these coefficients are fairly small and do not contribute much to the predictive capability of a given empirical model.We therefore fine-tuned and enhanced our regression technique by the following procedure for the selection of the predictor variables.
This procedure was also based on subsampling of original predictor and response variables.For a model mimicking the evolution of M original PCs of SLW (M = 10 − 100), we first obtained 100 sets of regression coefficients by randomly applying PCR-and-PLS regularization to 100 randomly sampled subsets of the full original time series, each of which included 80 % of the original data points.The optimal number of PCR components in the quadratic model was estimated according to the linear approximation shown in Fig. 9.The general cubic model was also constructed for M = 10 − 40; for this model, we determined the optimal number of PCR components in a way analogous to that for the quadratic model, prior to applying the PLS regularization step.No PCR step was applied to the linear models.At the PLS step, we have used a fixed number of 25 latent variables to define the optimal subspace for regression.This number exceeded the optimal one in Fig. 9 by at least a factor of two and thus could not result in underfitting.The regression coefficients so obtained were then translated into the original predictorvariable space.
If the interval between the 2nd and 97th percentile of a given regression coefficient obtained as described above contained the value zero, we excluded the corresponding predictor variable from consideration, thus forming a new, smaller subset of predictor variables.This subset was in turn subsampled 100 times and subjected to PCR-and-PLS regression to identify coefficients not significantly different from zero, and so on, until all coefficients of the final set of predictors were found to be significant.The same procedure was applied to the second and third level of each version of the inverse model.The final regression coefficients in each case were found by applying the PCR-and-PLS regularization to the fully sampled set of optimal predictors.
Table 2 lists the number of statistically significant nonzero coefficients of the three-level inverse model of M leading PCs of SLW; the main level includes quadratic nonlinearities and a seasonal cycle.The total number of coefficients at the main level is (M × (M + 1)/2 + M + 3) × M, at the second level -2M 2 , and at the third level -3M 2 .Note that the statistically significant coefficients are but a small fraction of the total number of coefficients.For example, for M = 75, the main level of the quadratic model has only 4849 nonzero coefficients, out of a maximum possible of 219600.This means that our regression procedure identified, on average, 4849/75≈65 nonzero coefficients in each of the 75 main-level equations; this number is an order of magnitude smaller than the number of degrees of freedom N DOF in the time series of the length of 1719.If one estimates the decorrelation time scale of SLW anomalies to be 5 days, then N DOF ≈ 1719/5 = 344 >> 65.Recall also, that the number of independent regression coefficients we have actually computed at each level is 25, which makes the number of coefficients/DOF comparison even more favorable.

Figure 7 .Fig. 7 .
Figure 7. Probability density function (PDF; left panels) and autocorrelation function (ACF; 2 right panels) of the observed and simulated zonal velocity anomalies at 120ºW and 55ºS.3 Solid lines: the observed functions; dashed lines: 95% spread based on SLW-only model with 4 quadratic main level.The four rows show the results, from top to bottom, for the models 5 constructed in the subspace of 10, 30, 50, and 100 PCs of SLW, respectively.6 Fig. 7. Probability density function (PDF; left panels) and autocorrelation function (ACF; right panels) of the observed and simulated zonal velocity anomalies at 120 • W and 55 • S. Solid lines: the observed functions; dashed lines: 95 % spread based on SLW-only model with quadratic main level.The four rows show the results, from top to bottom, for the models constructed in the subspace of 10, 30, 50, and 100 PCs of SLW, respectively.

Figure 8 .
Figure 8.The standard deviation of the wind speed time series obtained by taking the ensemble average of 100 simulations of SST-dependent SLW model forced by the observed history of SST anomalies..The model was constructed in the phase space of 100 leading EOFs of SLW.A typical (maximum) standard deviation of analogous SLW-only model's ensemble-mean time series (not shown) is 0.25 (0.55) -both values are smaller than the standard deviations shown here.

Fig. 8 . 1 Figure 9 .Fig. 9 .
Fig. 8.The standard deviation of the wind speed time series obtained by taking the ensemble average of 100 simulations of SSTdependent SLW model forced by the observed history of SST anomalies.The model was constructed in the phase space of 100 leading EOFs of SLW.A typical (maximum) standard deviation of analogous SLW-only model's ensemble-mean time series (not shown) is 0.25 (0.55) -both values are smaller than the standard deviations shown here.
Figure 10.PLS cross-validation results for the three-level empirical model with quadratic 2 nonlinearity in the main level.The error bar plots show the mean and standard deviations of 3 each quantity displayed computed using individual values of this quantity for each of the 4 model equations (the number of equations is equal to the number of original PCs simulated by 5 . 4.3).www.ocean-sci.net/7/755/2011/Ocean Sci., 7, 755-770, 2011 S. Kravtsov et al.: A model of air-sea interaction over the Southern Ocean 4.2 Daily-to-monthly aspects of SLW variability We illustrate the performance of the empirical models by examining first local aspects of the simulated SLW variability, based on the output of the quadratic, SLW-only model.Figure 7 shows probability density function (PDF; left panels) and autocorrelation function (ACF; right panels) of the observed and simulated zonal velocity anomalies at 120 • W and 55 • S -in the middle of an intense-jet region (see Fig. 2); the correspondence at other locations is qualitatively and quantitatively analogous.The heavy solid line in all the plots shows the observed PDF or ACF, while the dashed lines mark the 95 % spread in these quantities obtained from 100 realizations of the quadratic SLW model.The four top-to-bottom rows of Fig. 7 display the results from the empirical model based on M = 10, 30, 50, and 100 SLW PCs, respectively.

Table 1 .
Average number of initial-state resets for different empirical SLW-only models computed based on 100 surrogate simulations of a given model (see Sect. 4.1).

Table 2 .
The number of statistically significant coefficients of a three-level quadratic inverse model based on M leading PCs of SLW (see Appendix B for further details).