Ocean Science Dynamically constrained ensemble perturbations – application to tides on the West Florida Shelf

A method is presented to create an ensemble of perturbations that satisfies linear dynamical constraints. A cost function is formulated defining the probability of each perturbation. It is shown that the perturbations created with this approach take the land-sea mask into account in a similar way as variational analysis techniques. The impact of the land-sea mask is illustrated with an idealized configuration of a barrier island. Perturbations with a spatially variable correlation length can be also created by this approach. The method is applied to a realistic configuration of the West Florida Shelf to create perturbations of the M2 tidal parameters for elevation and depth-averaged currents. The perturbations are weakly constrained to satisfy the linear shallowwater equations. Despite that the constraint is derived from an idealized assumption, it is shown that this approach is applicable to a non-linear and baroclinic model. The amplitude of spurious transient motions created by constrained perturbations of initial and boundary conditions is significantly lower compared to perturbing the variables independently or to using only the momentum equation to compute the velocity perturbations from the elevation.


Introduction
In numerous modelling applications, the uncertainty of the model results needs to be estimated.An estimation of this uncertainty is often obtained by realizing a stochastic ensemble forecast: inputs to the model are perturbed within the Correspondence to: A. Barth (a.barth@ulg.ac.be) bounds of their respective uncertainty and for each of those perturbations the model is integrated forward.For ocean models, the model uncertainty stems, among others, from the forcing fields and initial conditions (e.g.Lermusiaux et al., 2006).The spread of an ensemble forecast using perturbed forcings fields and initial conditions gives the estimation of the model uncertainty.
This uncertainty (often formally expressed as model error covariance) is also required in various data assimilation techniques such as the Ensemble Kalman Filter (Evensen, 2003), the Error Subspace Statistical Estimation (Lermusiaux and Robinson, 1999) and the Singular Evolutive Interpolated Kalman Filter (Pham, 2001).The success of the data assimilation depends crucially on the realism of the model error covariance which in turns depends on the realism of the applied perturbations.
Ensemble simulations are also very resource-intensive.There is the need, thus, to restrict exploration of the state space only to model states which are physically reasonable and which respect certain dynamical equilibria.Otherwise resources might be wasted to compute ensemble members using unlikely initial or boundary conditions.
Physically unbalanced initial and boundary conditions produce transient motions during the initialization, often in form of spurious barotropic waves.Several techniques have been proposed to reduce those waves in the context of data assimilation (e.g.Vallis, 1992;Dobricic et al., 2007;Barth et al., 2007b).These methods, which basically damp barotropic waves, are difficult to apply to models which include tides, since tidal waves are also barotropic waves.
The objective of this study is to present a method to create smooth, Gaussian-distributed, monovariate or multivariate Published by Copernicus Publications on behalf of the European Geosciences Union.
A. Barth et al.: Dynamically constrained ensemble perturbations perturbations that have to satisfy a given linear balance.In particular, a method to produce balanced perturbations for tidal models is presented here.As the method relies on the definition of a cost function, the land-sea mask and a possibly variable correlation-length can be taken into account.
The motivation for applying this technique to tidal boundary conditions is to assimilate a data set resolving tides.Since part of the model error is associated with erroneous tidal boundary conditions and initial conditions, those errors have to be taken into account.In an ensemble approach, the uncertainty of those fields is represented by an ensemble of likely and physically balanced initial and boundary conditions.
In Sect.2, the general method is introduced to generate smooth perturbations satisfying certain constraints.Section 3 shows how the structure of the perturbations depends on the land-sea mask in an idealized configuration.A case with a spatially variable correlation length is also presented.As an example, the constraint from a shallow water model is introduced and implemented in a realistic configuration of the West Florida Shelf (WFS) in Sect. 4. This method is compared to other simpler techniques (Sect.5) and the results are discussed in Sect.6.In Sect.7, the numerical cost of this method is studied for different domain sizes.Section 8 summarizes our findings and provides the conclusions.

Methods
All model variables at all grid points that need to be perturbed are grouped into the perturbation vector x which is composed by n elements.We will seek a method that provides smooth stochastic perturbations that satisfy approximately m-constraints given by the m×n matrix M: We consider only linear constraints here.Non-linear constraints have to be linearized around the unperturbed state.This constraint can be any relationship between different elements of the vector x known a priori, such as the geostrophic equilibrium, zero horizontal divergence of surface winds, a climatological TS diagram, stationary solution to the advection diffusion equation or the linear shallow water equations.The constraint can also be the requirement that the perturbations have to belong to a subspace defined by e.g.empirical orthogonal functions (EOFs).In this case the columns of M would be all vectors orthogonal to the set of EOFs.
Only homogeneous constraints (i.e. the right-hand side of Eq. ( 1) is zero) are considered here because otherwise the perturbations would have a non-zero mean.The forcing fields to be perturbed are generally assumed to be unbiased, as it is also a requirement for most assimilation schemes.
To describe our a priori knowledge of what a realistic perturbation is, we introduce a cost function J , similar to the cost function in variational analysis techniques: where W M , W D and W E are, for simplicity, diagonal weighting matrices.The first term penalizes the deviations from the linear constraint Eq. (1).It is not enforced strictly (strong constraint), but it has to be satisfied approximately (weak constraint).The diagonal elements of W −1/2 M define the magnitude of an acceptable deviation from the linear constraint.The matrix D is a diffusion operator that ensures the smoothness of the perturbation (Brasseur, 1991;Weaver and Courtier, 2001).The last term controls the amplitude (or total energy) of the perturbations.Later, this cost function will be related to the likelihood of a given perturbation.It is necessary that all constraints added in the cost function are compatible to obtain useful perturbations.
The cost function is a quadratic function in x and can thus be written as: where the matrix B −1 (the Hessian matrix of J (x)) is defined as: The dynamically constrained covariance matrix B could be used directly in assimilation schemes that are based on the inverse of the background covariance matrix (such as variational assimilation) without the need to create an ensemble.For ensemble-based assimilation schemes, the cost function can be used to define the probability of a perturbation x (e.g.Kalnay, 2002): Perturbations resulting in a large value of the cost function J , meaning that the constraints are violated, have thus a low probability.The perturbations satisfying the weak constraint (1) are drawn from this pdf.
To generate an ensemble of perturbations that follows the previous pdf, the matrix B −1 is decomposed in eigenvectors (rows of U) and eigenvalues (diagonal elements of ) : The larger an eigenvalue is, the stronger the corresponding eigenvector violates the dynamical and smoothness constraint.Indeed, if the perturbation is the eigenvector u i associated with eigenvalue λ i , the cost function takes the value  k) where the subscript k is the ensemble member, is created following a normal distribution.
An ensemble of perturbations x (k) following Eq.( 6) can be obtained by: In practice, only the smallest eigenvalues (and their corresponding eigenvectors) need to be calculated.This enables the use of efficient software packages to compute selected eigenvalues and eigenvectors.In this work, we use the GNU Octave interface to the ARPACK package (Lehoucq et al., 1997).This approach for generating perturbations will be used in the subsequent experiments.
As a variant of the previous equation, perturbations can also be obtained by 2nd order re-sampling method used in the SEIK (Singular Evolutive Interpolated Kalman) Filter (Pham, 2001): where N is the number of eigenvectors retained, H is a N +1×N matrix whose column vectors form an orthonormal basis perpendicular to the vector 1 N +1×1 and ( ) k is the kth column of a N×N random orthogonal matrix .This ensemble will have exactly a zero mean and a covariance equal to B reduced to its N largest eigenvalues.Instead of using an eigenvector decomposition, perturbations can also be created by a Cholesky decomposition of B −1 into square root matrices (Fukumori, 2002): The perturbations can then be obtained by: Since R is a triangular matrix, the product of its inverse and a vector can be efficiently calculated by back substitution.

Impact of the land-sea mask and correlation length
First we will show how the effect of land boundaries such as islands and peninsulas are taken into account to compute the perturbations.The effect induced by those barriers is difficult to include in methods which derive perturbations based on a given covariance matrix.For example, the method described in Evensen (2003) is widely used to generate ensemble perturbations.The perturbations are generated in Fourier space.Each Fourier mode is perturbed independently with an expected amplitude proportional to a Gaussian function of the wave number, producing a smooth field in physical space.This method is computationally very efficient since it can be implemented using the Fast Fourier Transform.Indeed, it can be shown that for any translation-invariant covariance matrix, its eigenfunctions are the Fourier modes (Barth et al., 2007a).In some circumstances, the approach might be appropriate for atmospheric fields and also for oceanographic fields far from the coastline.But in oceanographic applications, problems can arise near the coast.This problem will be illustrated with a narrow elongated island.
The effect of land boundaries is included in the smoothness constraint and is independent of any dynamic constraint.
To show this we can think of the diffusion operator (on which the smoothness constraint is based) as a way to transfer information, such as tracer concentration, from one place to an other.The method using constraints to generate perturbations can then easily take into account a land-sea mask in the same way as diffusive fluxes of tracers.In continuous form, the operator D applied to a perturbation field φ can be written as: where the vector field F is given by: The derivatives in Eqs. ( 13) and ( 14) are approximated by centered finite differences, yielding a sparse matrix representation of D. To facilitate the formulation of this sparse matrix, the operator is separated into sub-steps (calculating the vector field F , application of boundary conditions, calculating the divergence of the vector field).Each of the individual steps can be easily expressed as a sparse matrix.The matrix D is then simply the product of all matrices corresponding to the sub-steps in the appropriate order.The same approach will also be used to create a sparse matrix representing the dynamical constraint.
To highlight the impact of this smoothness operator, we will not take into account a linear constraint in Eq. ( 2).Also, we chose the weighting matrices proportional to the identity matrix.
The vector x represents the values of the field φ on a set of grid points and the matrix D is the discretization of the diffusion operator D. The exponent of L has been chosen such that L represents a length-scale.Indeed, this parameter is the horizontal correlation length of the perturbations.At this length scale both terms of the cost function have a similar magnitude (Brasseur et al., 1996).In the examples that will follow, the parameter has been adjusted such that the perturbations based on Eq. ( 15) have the same correlation length than the perturbations created using the Fourier www.ocean-sci.net/5/259/2009/Ocean Sci., 5, 259-270, 2009 transform without any land points.It may be worthwhile to note here that in this case the horizontal structure of the covariance is a Gaussian for the method using the Fourier transform; whereas the covariance of the perturbations based on the cost function is related to a modified Bessel function of order 1 in the continuous case, here discretized on a regular grid (Brasseur et al., 1996).An ensemble of 10 000 perturbations has been created with a horizontal correlation length of 20 km for a square domain with a narrow island.The ensemble covariance has been computed for a grid point near the end of the barrier (the black dot in Fig. 1).As expected, the structure of the ensemble covariance using the Fourier transform is isotropic.On the contrary, the covariance using the cost function is deformed by the presence of land points.Only grid points which can be connected with a short path (relative to the correlation length-scale) not crossing land points have a significant correlation.In general, this is the expected physical behaviour.
This effect of land points is well known in the context of variational analysis where the covariance functions are constructed with differential operators (e.g.Brasseur, 1994).
Here we use this desirable property for the generation of perturbations which can be used in sequential Kalman filters using ensembles.
It should be noted here that in the context of Ensemble Kalman filtering, covariance functions similar to panel a of Fig. 1 are not directly used for assimilating observations.Perturbations with these covariance functions would be applied to the initial condition for example, and the covariance function will be inevitably transformed by the non-linear model (e.g.advected by currents).However, a correlation across the barrier will still remain; only the shape of the correlation function is modified.Thus an observation made on one side of the barrier will also impact the other side.
The method based on the cost function can be applied to any model grid where the discrete diffusion operator can be formulated.This is an advantage compared to the Fourier method which can be only applied to rectangular grids and spherical grids (using spherical harmonics).For more general cases, the latter method requires an intermediate grid since the symmetry requirement (translation invariance) does not only apply to the domain but also to the model grid itself.
Another advantage of the presented method is that perturbations with a spatially varying correlation length can be created.Spatial structures of baroclinic flows in the ocean are related to the internal radius of deformation.This lengthscale is often much smaller on the weakly stratified shelf than in the deep ocean.A correlation length proportional to the internal radius of deformation is thus a reasonable choice.The effect of a spatially varying correlation length can be easily taken into account in the variational formulation Eq. (15).This is not possible with the Fourier method since a spatially varying correlation length is not translation-invariant.Figure 2 shows a random perturbation obtained for a correlation length varying linearly from 3 to 10 km in the zonal direction.As expected, the size of the structure gradually increases as the correlation length increases.
This aspect is also useful for a system of nested grid models and for models with unstructured grids.The length scale of the perturbations has to be properly resolved on the model grid.If only a unique length scale is used then it must be several times larger than the coarsest resolution.With a spatially varying correlation scale this limitation is lifted and areas with locally refined resolution can have smaller-scale perturbations.In a two-way nesting system, the model fields on different grids can also be regrouped in a single perturbation vector and the nesting feedback can be introduced as an additional constraint to ensure a coherent transition between model grids.

Application to tidal boundary conditions
This method is now applied to create multivariate perturbations of sea surface height and depth-averaged currents which have to be a harmonic solution of the shallow water equations.This example has a practical relevance since tides are governed by those equations.For simplicity, the shallow water equations are expressed here on a Cartesian grid, but they are implemented on a curvilinear grid as it will be applied to a curvilinear model grid.
where h is the depth, f is the Coriolis parameter and g is the acceleration due to gravity.For simplicity we consider only one tidal constituent.The tidal component M2 is chosen because it is generally among the largest constituents.We require that the time dependency of the perturbations has the following form: u(x, y, t) = e iωt u (x, y) v(x, y, t) = e iωt v (x, y) where ω is the angular frequency of the M2 tides.The perturbation vector x is composed by the elevation ζ and the depth-averaged currents u and v .It follows that, The discrete operator M is obtained by discretizing the spatial derivatives Eq. (22-24) on an Arakawa C grid (Arakawa and Lamb, 1981) using finite volumes.Only wet points are included in the state vector x.In this sense, boundary effects are already included in M. The normal velocity at the land-sea boundary is prescribed to be zero and is not part of the state vector x.
Open ocean boundary values are not constrained by the shallow water equations.If no smoothness constraint would be present, the resulting ensemble members would be discontinuous at the boundary in the direction parallel to the boundary.The explicit smoothness constraint (Laplacian with a The method is tested using the WFS ROMS model (Barth et al., 2008c) which is nested in the Atlantic HYCOM model (Chassignet et al., 2007).The nesting procedure is explained in Barth et al. (2008a).The model uses a curvilinear grid with a resolution of about 3.5 km near the coast and 10 km near the open boundary.The model is initialized on the 1st January 2005 using the elevation, velocity, temperature and salinity from HYCOM.The boundary conditions from HY-COM mainly impose the path of the Gulf of Mexico Loop Current generating mesoscale eddies and filaments inside the model domain (Barth et al., 2008b).The tidal boundary conditions produce tidal waves propagating inside the model domain and increase their amplitude near the coast as predicted by linear tidal wave theory for wide continental shelves (Clarke, 1991) and by a numerical ocean model of the WFS (He and Weisberg, 2002).

Experiments
Three methods, of increasing complexity, for the generation of perturbations of the tidal motions are examined: -The elevation and the two horizontal velocities are perturbed independently.Instead of perturbing the amplitude and phase directly, it is preferable to work here with the complex representation of these tidal parameters.The real and imaginary parts are perturbed independently.The horizontal correlation length of the perturbations is 300 km which is the typical length-scale of the tidal maps.The complex perturbations are added to the TPXO6.2(Egbert et al., 1994 -The elevation is perturbed as described in the previous experiment, but now the corresponding velocity perturbations are diagnosed using the momentum balance: Those equations become the geostrophic equilibrium for ω=0.The geostrophic equilibrium is often used to model error covariances in data assimilation (e.g.Dee, 1991;Brankart et al., 2003) and also to compute perturbations of the velocity based on the perturbation of the density field (Barth et al., 2007a).Equations ( 25)-( 26) are more complete than the geostrophic equilibrium since the former do not assume small accelerations.
-In the third method, perturbations are weakly constrained by the Eqs.(22-24) following the approach described in Sect. 2. The weighting matrix W is defined by the energy E of shallow water waves: where on the right-hand side, the discretized version defines the weighting matrix (which ensures also proper dimensional relationships).Up to a constant factor ρ 0 , E corresponds to the total energy of a system governed by the shallow water equations.
The weighting matrices for the dynamical constraint W M , the smoothness constraint W D and the total norm W E are proportional to W. Since the perturbations are scaled afterwards, one proportionality coefficient can be arbitrarily fixed without loss of generality.Here the matrices are defined by: where L (dimension of a length scale) and α (adimensional) are parameters to be chosen.The smaller these values are, the stronger the dynamical constraint will be enforced.However, we do not require that the constraint will be exactly satisfied since it is derived using a series of assumptions.
After several values of those parameters where tested, α was set to 0.001 and L to 10 km.The preferential length-scale taking only the smoothness and total energy constraint into account is thus α −1/4 L=56 km which is small compared to the length-scale of the tidal structures.This indicates that the structure of the perturbations is mainly determined by the dynamical constraint and not by the two other constraints.
When this method for generating ensemble perturbations is applied to data assimilation, one can use crossvalidation (Wahba and Wendelberger, 1980) to estimate the parameters L and α more objectively in a similar way than parameters are optimized in e.g.DIVA (Data-Interpolating Variational Analysis) Brankart and Brasseur, 1996) and DINEOF (Data Interpolating Empirical Orthogonal Functions; Beckers and Rixen, 2003;Alvera-Azcárate et al., 2007): a small subset of observations are not used during the data assimilation and reserved for the validation of the results.The ensemble run and assimilation steps are repeated for different values of the parameters L and α.The best set of parameters is the one that minimizes the root mean squared error compared to the validation data set.This approach is not used in this present work but might be explored in further data assimilation studies.
The amplitude of the perturbations is chosen such that, in all the experiments, the expected value of the energy norm x T Wx is a given constant.Here we use 0.01 m 3 s −1 times the total surface of the model domain.This produces perturbations of the elevation of the order of cm which is comparable to the error of tidal models of this region (He and Weisberg, 2002).All methods produce thus perturbations which are equally energetic.
The M2 tidal parameters for the elevation and the depthaveraged velocity are obtained from the TPXO6.2global inverse tide model.Those tidal parameters are perturbed using the three different methods.For each method, an ensemble of 32 members has been created.The tidal surface elevation and depth-averaged velocity are added to the initial conditions and to the HYCOM boundary conditions.All ensemble members are integrated for 10 days with the full, non-linear ROMS model.
The first method produces isotropic covariances, but this is not the case for the constrained perturbations since they have to satisfy (approximately) the shallow water equations.The horizontal covariance of the constrained perturbations for a point near the open boundary is shown in Fig. 3.This covariance represents how a hypothetical error (or an observation if this covariance is used for data assimilation) near the boundary would affect the solution within the model domain.In the vicinity of this point the covariance decreases monotonically with a preferred direction to the North-East inward to the model domain.However the covariance increases in Florida's Big Bend and near the Mississippi Delta.Thus the impact of an error in the boundary is not only local but it can be seen remotely due to the propagation of tidal waves.To include such aspects in error covariance models, dynamical constraints have to be taken into account.

Results
To compare the three different examined methods, it is necessary to quantify the amount of transient motions created due to the perturbations.The surface elevation of an unperturbed (central) run was subtracted from each ensemble member to isolate impact of the initial and boundary perturbations.
A tidal analysis was performed to subtract the tidal variations.This residual is then averaged over the 10-day time period which is shown in Fig. 4. To compute the total amount of transient motions those maps are averaged in space and the results are given in Table 1.If this residual would be zero, then the perturbations would have only created tidal waves at the M2 frequency (unless the amplitude of the perturbations is so large that it creates harmonics).
The largest transient motions are created by perturbing elevation and velocity independently (panel a on Fig. 4).The structure of the standard deviation of the elevation residual is quite similar to the amplitude of the M2 tide in the WFS.In cases 1 and 2, the highest values are observed in Florida's Big Bend and in the southern part of the domain.This suggest that the amplitude of the transient motions increases near the coast in the same way as the barotropic tidal waves increase its amplitude when the depth decreases and when the wave is reflected from the coastline.The fact that the transient motions are amplified near the coast is particularly problematic since this is often the region of interest for regional models.
To quantify the overall magnitude of the transient motions, the variance of the residual is averaged over the model domain.The square root of the averaged variance (or average standard deviation) of all three experiments is shown in Table 1.The experiment with independent perturbations has indeed the largest average variance of transient motions.Using the momentum equation to compute the velocity perturbation based on the elevation perturbation does not significantly improve the dynamical balance of the perturbed model simulation since the average standard deviation is only slightly reduced.This result is surprising since the geostrophic equilibrium, which is related to Eqs. (25-26), is used successfully in data assimilation.The difference here is that the (pure) geostrophic equilibrium does satisfy the horizontal continuity equation (on an f-plane) and does not require vertical motions.But this is not the case for Eqs.(25-26) which can thus entrain a movement of the free surface generating spurious gravity waves.
The smallest transient motions are obtained by using the constrained perturbations.The variance of the residual is reduced by 60% compared to the previous two approaches.While the standard deviation of the residual is still the largest near the coast compared to the open ocean, it is now significantly reduced.This shows that although the perturbations are derived using several assumptions (in particular, no bottom drag, purely barotropic and linear dynamics), they are applicable to an implementation of a baroclinic and nonlinear model with realistic friction.

Impact of bottom drag
In the previous section, only linear terms were examined for methods 2 and 3.The bottom drag was thus ignored.However, it is well known that it plays an important role for modelling tides in shelf seas.Including the bottom friction, the governing equations are: The WFS ROMS uses a quadratic parametrization for the bottom drag τ : where the drag coefficient r is 10 −3 .This bottom drag is linearized around the velocity vector ( ū, v): The velocities ū and v are set to the half of the maximum tidal velocity in the x and y directions respectively as this corresponds to the average of the tidal velocity squared.The dynamical constraint with bottom drag becomes: where c u and c v are given by: Methods 2 and 3 are repeated by including a linearized drag term.With the drag term, the Eqs.( 25) and ( 26) become: where A is given by, From this denominator, it is evident that the drag reduces in average the size of the velocity perturbations (for a given perturbation of the elevation) and alters also the phase.
Figure 5 shows the standard deviation of the residual using perturbations taking the bottom drag into account.In panel a, the perturbations we created similar to experiment 2 but using the Eqs.( 42) and ( 43).The results in panel b are obtained by perturbations constrained by the spatial discretization Eq. (37-39) similar to experiment 3.
The spatial structure of the residual is comparable to the residual in the previous experiments with an increase of the amplitude of transient motions near the coast.However, the amplitude is indeed reduced in both cases, which confirms our expectation that the more complete our dynamical constraint is, the more balanced our perturbations are.The amplitude is actually reduced by approximately the amount relative to the corresponding experiment without bottom drag.
The smallest transient motions are generated by using the perturbations constrained by the shallow water equations including the bottom drag.

Motion induced by the constrained perturbations
The main purpose of the proposed procedure is the generation of balanced perturbations.In the previous section the magnitude of the transition motions were used to quantify how close the model state is to an equilibrium state.To further study the tidal perturbations we can compare the amplitude and phase of the perturbations with the corresponding parameters diagnosed from a perturbed run.Since the model is derived from the non-linear baroclinic equations, we will examine how relevant the simplified shallow water equations are compared to a realistic model.Also, the constraints still use continuous time while the model is discretized in time.
Panel a of Fig. 6 shows the elevation of a realization of the constrained perturbations.The colors represent the amplitude and the isolines are the phase.This perturbation shares indeed common characteristics known from tidal propagation.The tidal amplitude is amplified in some regions on the shelf.It also contains an amphidromic point in the south eastern part of the model domain.
This perturbation (and the corresponding velocity perturbation) is applied to initial conditions and boundary conditions of the WFS ROMS model.The model is then integrated for 10 days using realistic forcings.The amplitude and phase of the difference between the elevation of the perturbed run and the unperturbed run is computed and shown in panel b of Fig. 6.
The location and value of the amplitude maxima in the perturbed model run are comparable to the applied perturbation.The model results contain also the amphidromic point as it can be seen from the phase.However, some differences in phase can be seen especially in Florida's Big Bend, where the phase difference is about 40 • .The agreement between both fields is reasonable given the number of assumptions that were necessary for the shallow water equations that were not made in the WFS ROMS.
In the context of an ensemble forecast, if one ensemble member is particularly close to the observations (closer for example than the central, unperturbed simulation), we can therefore assume that the perturbations produce indeed initial and boundary conditions which are more realistic than the unperturbed initial and boundary conditions.Formally, this can be done by including the perturbation in the data assimilation state vector.The analysis will then provide an improved estimation of those boundary conditions.

Numerical cost
To assess the numerical cost of this scheme and to test its feasibility with a high resolution ocean model, the method was applied to a square domain of 200 km length with different resolutions.In a first series of experiments only 50 eigenvectors are retained during the eigenvector decomposition.All other parameters where the same than in the perturbation for the WFS case.
The code is tested on a single core of an Intel Xeon E5420 CPU.The code is run in Octave 3.0.5 compiled among others with SuiteSparse 3.4.0(Davis, 2004a,b), GotoBLAS 1.26 and ARPACK 96 (Lehoucq et al., 1997).Those libraries are used in the eigenvector decomposition.The time in seconds of different steps in the algorithm are shown in Table 2 for different grid sizes.A domain size of 512×512 was tested but the method required more than the available 16 GB of RAM.As expected, the creation of the matrix B −1 , and the ensemble creation increases essentially linearly with the number of grid points while the eigenvector decomposition increases faster than linearly with the number of grid points.The slope of a linear regression in log-log space is 1.3.This progression is still quite similar to a linear increase because the matrix B −1 is sparse and because a fixed number of eigenvectors are retained.
In a second series of tests, the domain size is fixed (128×128 grid points) and the number of eigenvectors are increased from 50 to 200 (Table 3).The numerical cost of the eigenvector decomposition shows a linear progression relative to the number of eigenvectors.The numerical cost of the ensemble creation increases only slowly and is marginal in the overall cost.
By increasing the resolution of a domain, one can argue that the number of eigenvectors and ensemble members should also increase proportional to the number of grid points.In this case, the progression rate of both series have to be combined and the numerical cost scales approximately by the number of grid points elevated to 2.3.
The program code has been written such that it can run unmodified also on MATLAB.The benchmark was repeated on the same machine with MATLAB R2008a (64-bit version).The numerical cost in function of the number of grid points and the number of eigenvectors retained varied in a similar way than with Octave.Overall, Octave was 13% faster than MATLAB in completing the two series of tests.In summary, the CPU time of this method is acceptable since it is very small compared to the CPU time needed for the ensemble run.More limiting than the CPU time, can be the required amount of RAM memory for large model configurations.

Conclusions
A new method to produce ensemble perturbations is presented.It allows to take linear constraints into account.The formalism similar to weak constraints in variational analysis can be used in an ensemble scheme for generating ensemble perturbations.The constraints can be chosen such that the perturbations are dynamically balanced.This is useful for ensemble forecasts and in particular for data assimilation where the ensemble spread is supposed to reflect the uncertainty in the model forecast and should not include the variability of transient motions generated by an unbalanced state.
The method has been tested to create elevation and velocity perturbations of tidal motions.The shallow water equations constitute a constraint that relate elevation and velocity.In a first test, those variables are perturbed independently.In a second simulation of intermediate complexity, only the momentum equation is enforced.This procedure allows to derive the velocity perturbation from an elevation perturbation.The third tests uses the linear shallow water equations (momentum equations and continuity equations) to create the perturbations.For each of those simulations, the amount of transient motions created by the tidal analysis is computed by applying a tidal analysis.In all cases the amplitude of the transient motions were highest near the coast.The transient motions created using the linear shallow water equations were significantly smaller than those generated with the two other methods showing that the constrained ensemble was dynamically more balanced.A further reduction of the transient motions could be obtained by including the effect of bottom drag into the dynamic constraint.The importance of drag on tidal dynamics is well known, but in this context it highlights the fact that important non-linear terms cannot be simply neglected but must be linearized around an appropriate mean-state.
The challenge of this approach will be to formalize an appropriate dynamical constraint to quantify if a perturbation is realistic or not.It will require a good knowledge of the dynamical behavior of the studied system and of its error sources.Some constraints (involving for example density) are non-linear.But for small errors, a useful local linearization can be generally obtained.Even if no linear constraint or balance can be formulated, the presented procedure can still be useful to create perturbations that are aware of the land-sea boundary.Spurious correlation across land points are thus avoided.Also, a spatially varying correlation length can be used with this method.For example, one can specify the correlation length as a multiple of the radius of deformation.Those two aspects are not possible in the Fourier-based method to generate perturbations.Despite the numerical cost is higher than for the Fourierbased method, its cost is still acceptable for most model setups.
Altought only two-dimensional perturbations have been shown, the ensemble generation code is written in a way that it can generate n-dimensional perturbations given a userspecified dynamical constraint.The source code which runs on MATLAB and GNU Octave is freely available at http: //modb.oce.ulg.ac.be/mediawiki/index.php/WCE.
In future works, the method will be used for assimilating surface current observations.By including the boundary perturbations into the model state vector, the assimilation can provide also an improved estimation of the boundary values.

Fig. 1 .
Fig. 1.Ensemble covariance using Fourier modes (a) and constrained perturbations based on the land-sea mask (b).

Fig. 2 .
Fig. 2. Illustration of random field with a variable correlation length for a square domain without land points.

Fig. 3 .
Fig. 3. Horizontal covariance of the constrained perturbations between the point near the open boundary marked by a black dot and all other grid points.

Fig. 4 .
Fig. 4. Standard deviation of the elevation residual.(a) using independent perturbations, (b) using the momentum equation and (c) using constrained perturbations.

Fig. 5 .
Fig. 5. Standard deviation of the elevation residual (a) using the momentum equation and (b) using constrained perturbations.Bottom drag is taken into account.

Fig. 6 .
Fig. 6.Left panel: a realization of a constrained perturbation.Only the elevation amplitude and phase are shown.Right panel: tidal analysis of the difference between the elevation of the perturbed and the unperturbed simulation.
) complex tidal parameters and converted into amplitude and phase, as tides are usually characterized.If the amplitudes would be directly perturbed then one could not ensure that they remain positive.

Table 1 .
The average standard deviation (in m) of the elevation residual.This quantity indicates the magnitude of the transient motions created by different perturbations.

Table 2 .
Time in seconds as a function of the number of grid points of the different steps involved in the creation of the perturbations.

Table 3 .
Time in seconds as a function of number of eigenvector retained for a domain of 128×128 grid points.