Technical Note: Tail behaviour of the statistical distribution of extreme storm surges
- Met Office, FitzRoy Road, Exeter, EX1 3PB, UK
Abstract. The tail behaviour of the statistical distribution of extreme storm surges is conveniently described by a return level plot, consisting of water level (Y-axis) against average recurrence interval on a logarithmic scale (X-axis). An average recurrence interval is often referred to as a “return period”.
Hunter’s allowance for sea-level rise gives a suggested amount by which to raise coastal defences in order to maintain the current level of flood risk, given an uncertain projection of future mean sea level rise. The allowance is most readily evaluated by assuming that sea-level annual maxima follow a Gumbel distribution, and the evaluation is awkward if we use a generalised extreme value (GEV) fit. When we use a Gumbel fit, we are effectively assuming that the return level plot is a straight line. In other words, the shape parameter, which describes the curvature of the return level plot, is zero.
On the other hand, coastal asset managers may need an estimate of the return period of unprecedented events even under current mean sea levels. For this purpose, curvature of the return level plot is usually accommodated by allowing a non-zero shape parameter whilst extrapolating the return level plot beyond the observations, using some kind of fit to observed extreme values (for example, a GEV fit to annual maxima).
This might seem like a conflict: which approach is “correct”?
Here I present evidence that the shape parameter varies around the coast of the UK, and is consequently not zero.
Despite this, I argue that there is no conflict: a suitably-constrained non-zero-shape fit is appropriate for extrapolation and a Gumbel fit is appropriate for evaluation of Hunter’s allowance.
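The link between the shape parameter and the curvature of the return level plot can be made concrete with a small sketch. This is an illustration and not code from the note. The names `mu`, `scale` and `shape`, and all numerical values, are assumptions chosen for the example. The sign convention assumed here is the one in which a negative shape gives a curve that flattens towards an upper bound. With zero shape the return level is linear in the Gumbel reduced variate y = -ln(-ln(1 - 1/T)), which is close to ln T for long return periods T.

```python
import math


def return_level(T, mu, scale, shape=0.0):
    """Level exceeded on average once every T years (T > 1), for annual maxima.

    mu, scale, shape are the location, scale and shape parameters.
    shape == 0 is the Gumbel case.
    """
    # Gumbel reduced variate; approximately ln(T) when T is large.
    y = -math.log(-math.log(1.0 - 1.0 / T))
    if abs(shape) < 1e-12:
        # Gumbel: a straight line in y.
        return mu + scale * y
    # GEV with non-zero shape: curved in y.
    return mu + (scale / shape) * (math.exp(shape * y) - 1.0)


if __name__ == "__main__":
    # Hypothetical parameters in metres, for illustration only.
    mu, scale, shape = 3.0, 0.2, -0.1
    for T in (10, 100, 1000, 10000):
        gumbel = return_level(T, mu, scale)
        gev = return_level(T, mu, scale, shape)
        print(f"T={T:>6} yr  Gumbel={gumbel:.3f} m  GEV={gev:.3f} m")
```

With a negative shape the GEV levels fall progressively below the Gumbel line as T increases and never exceed `mu - scale / shape`, which is the sense in which the return level plot turns down.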
Tom Howard
Status: closed
-
RC1: 'Comment on os-2022-14', Philip Woodworth, 22 Mar 2022
22 March 2022
Comments on "Technical Note: Tail behaviour of the statistical distribution of extreme storm surges" by Tom Howard (OSD)
This is a short technical note which attempts to make 3 points: (a) the shape parameter of extreme sea level curves at most UK sites is not zero (and usually negative) and so any parameterisation of the extreme level curve should accommodate its curvature, (b) in spite of that, an assumption of zero shape for a Gumbel distribution is reasonable for Hunter's allowance calculation, and (c) the shape parameters derived from short records are imprecise. These things were known already (or suspected anyway) but it does no harm to restate them in the same place.
I have no objections to the note's publication if the small things below can be attended to. The text is clearly written although the document itself is a little rough (hence some of the trivial comments below).
line 6 - mean sea level rise here and mean-sea-level rise at line 40 (I said these were trivial comments but they suggest some lack of attention)
15 - you don't present evidence that the shape parameter varies around the UK coast. You have a scatter plot in Figure 1 that shows there are clearly different values at different places but, unless you know where the UK place names refer to, you have no insight on how the shape varies around the actual coastline. A map is needed or at least a couple of sentences to say how it varies.
49 - not incompatible ==> compatible!
50-52 - these lines would be better following on at line 39
65 - I know this is a short technical note and there are many details in HW21, but it does no harm to give some essential minimum information. For example, presumably the surge extrema used for Figure 1 are from exactly the same years as the tide gauge extrema, or comparisons are not exact. So say so. Also say what the minimum record length of tide gauge record is employed.
After Figure 1 there should be a sentence to tell the reader that most of the UK shape parameters are negative. And that this observation is not new. For example, see Figure 9 of Marcos and Woodworth (JGR, 2017) which shows consistent negative shape parameters for both North Atlantic coasts. And Wahl et al. (2017) claim that 85% of records worldwide have negative shape parameters. As for the UK, I am sure the negative shapes will have been pointed out in older papers by Blackman, Horsburgh, Tawn etc. (although I have not checked which).
69 - give a reference. For example chapter 7 of Pugh and Woodworth (2014). As well as the physics of wind stress etc., there is a general point that there is only so much water in the ocean, so one would imagine any extreme level curve to turn down at some point.
73 .. in [shape parameter] (reference needed. HW21 again?)
I don't understand why in practice you know there is spatial correlation in the location parameters. That can only be in model runs where the datum at every point is MSL. But if you are using real tide gauge data the location parameters will depend on the datums used at each site. (I hope you see what I mean.)
79 - why 'vector'? It seems an odd word to use here.
82 - say 'For the model data at each tide gauge site'. To make it clear you are using just the short model data sets here and not the 484 year set mentioned later.
90 - from any other site. (?)
104 - the long run of 484 years. And this is for the 44 (?) tide gauge sites?
112 - .. not Gumbel-distributed as was known previously.
Figure 2 (a) and (b) should have (m) on each axis
line 4 of caption - .. the site of the 44 (?) tide gauges on ..
section 2.4 - I got the idea of this section although you have to read it a few times. It would help to fully explain things. For example, what does 'standard-uniform' (line 121) mean?
126 - ... from a given site conforms to a precise GEV distribution.
130 - .. depends on the three GEV parameters.
133-134 - standard-uniform (as above)
151 - an average (?) optimum .. They prefered
155 - simulation as represented in Figure 1 (presumably)
Figure 3 - I don't understand why there are 4 plots here. Shouldn't there be 8? You have tide gauge data (shown here) and line 155 says you use model data also, so you need another 4 for the model data?
title caption should be VdBK and not VdB&K to be consistent with the text. But I would remove that anyway and just have QQ plot to be consistent with PP plot on the right. Presumably the dots are ordered so as to be monotonic. Define in the caption the delta symbol on y-axis for QQ (differences at the outliers). Finally I don't understand why you call them 'theory'.
caption line 1 - this should be reworded as you say above for both that QQ and PP derive from VdBK
caption line 2 - at the 44 (?) sites of UK ...
172 - in general zero, as known already (refs).
Figure 4 - I thought you were using 44 sites (see caption figure 5). This should be mentioned at the places in the text I pointed out above. However here in Figure 4 there are 46 locations given.
188 - why did CFB2018 take +0.0119 as its prior shape parameter when all the evidence from previous publications and your Figure 1 has it negative? And at line 192 why did you use a prior of +0.0119 ?
Could you explain Figure 5 a bit better? If the data really has a non-zero shape, and the choice of prior is reasonable, then wouldn't you expect the right-hand side to be tighter than the left for Gumbels?
198 - .. is negative in common with most UK sites (Figure 1) and worldwide (Marcos and Woodworth, 2017; Wahl et al., 2017).
Figure 6 left - the Hinkley plot is described in some detail in Batstone et al. (2013)
Acknowledgements - define BEIS and Defra
221 - Climate, 231 - Research Letters, 235 - Communications
- AC1: 'Reply on RC1', Tom Howard, 03 Apr 2022
-
CC1: 'Comment on os-2022-14', John Hunter, 11 Apr 2022
The paper is generally well written, the problem is described clearly in the Introduction and the results are summarised clearly (although perhaps too briefly for some) in the Conclusions. However, the middle part (the bulk of the manuscript) does contain some quite complex concepts and methods, which could, I feel, be helped with a little more explanation and perhaps a few equations.
The following line-by-line comments are reasonably minor - if they are attended to, I would have no objection to the manuscript being published.
Page 2, lines 27-29: given that the previous sentence mentions both the annual-maxima and peak-over-threshold methods, this sentence should mention the peak-over-threshold equivalent of the GEV, which is the Generalised Pareto Distribution (GPD).
Page 2, line 37: I generally dislike unnecessary abbreviations; the reference "Howard and Williams, 2021" (here abbreviated to "HW21") only occurs five times in the manuscript, but I leave this decision to the editor. Likewise, "Van den Brink and Können (2008) (henceforth VdBK08)" on Page 6, line 114.
Page 3, line 53: it would probably be good to define the "skew surge" at this stage.
Page 3, line 59: the author uses the terms "geographical variations" and "spatial correlation" throughout Section 2. I don't think this is an appropriate terminology, because there is no actual "geographical" or "spatial" coordinate used in the analysis (for example, there is no analysis of the correlation of the shape parameter with latitude). It just happens that the modelled and observed data come from the same locations - the actual locations are just "labels" which relate modelled data points to their equivalent observational data points. I don't think this, in any way, affects the results but the author may want to clarify his terminology.
Page 5, lines 104-112: I got a bit lost here - it would be good if the author showed what he did by using a few equations. It also seems strange that he says that "the RMS difference based on the real data is more than 6 standard deviations away from the mean of this random distribution", when Figure 2 (a) seems to show that most points are within the 95-percent (approximately 2 standard deviation) uncertainty range of the red line. I am not saying the author is wrong - only that he needs to provide enough equations to convince me that he is right.
Page 6, line 120: I found the "PIT" at the start of the line very confusing, until I read the whole paragraph and realised that it stood for "probability integral transform". I can't see the point of this initial "PIT" - I'd omit it altogether.
Page 6, lines 128 and 134: it is a bit confusing that the author uses U_i to represent the PIT-transformed Y_i and U_j to represent the PIT-transformed M_j - it would be clearer if a letter other than "U" was used in the second instance.
Page 6, line 129: it would be good to expand on "standard-uniform" (e.g. f(x) = 1 for 0 <= x <=1; f(x) = 0 for x < 0 or x > 1).
Page 8, Figure 3: I may be being pedantic here, but I feel that it would be easier to understand if the PP plots (which were introduced first, on line 142) were on the left and the QQ plots (which were introduced second, on line 144) on the right.
Page 8, Figure 3: the labelling of the vertical axis of the left-hand panels is obscure - it is presumably the Gumbel variate of the samples, and so should be labelled something like "Gumbel variate (samples)". For consistency with the right-hand panels, the horizontal axis should be labelled something like "Gumbel variate (theory)". It would also clarify the panels if the labelling of the ticks was consistent for both vertical and horizontal axes.
Page 10, line 185: this is the first occurrence of "CFB2018", which is not defined anywhere. It is presumably "Environment Agency: Coastal Flood Boundary Conditions for the UK: update 2018" - this should be defined here.
Page 12, lines 203-207: I love the brevity and clarity of the Conclusions, although some may find it too terse.
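As a minimal sketch of the "standard-uniform" point raised in the comments above (an illustration, not code from the manuscript): if annual maxima Y really follow a distribution with CDF F, then the probability integral transform U = F(Y) has density 1 on [0, 1] and 0 elsewhere. Applying a mis-specified CDF gives values that are visibly not uniform. The Gumbel parameters, the sample size and the helper names below are hypothetical choices for the example.

```python
import math
import random


def gumbel_cdf(x, mu, scale):
    """Gumbel cumulative distribution function."""
    return math.exp(-math.exp(-(x - mu) / scale))


def gumbel_sample(rng, mu, scale):
    """Draw one Gumbel variate by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() never returns 1.0
        u = rng.random()
    return mu - scale * math.log(-math.log(u))


def ks_distance_from_uniform(u_values):
    """Largest gap between the empirical CDF of u_values and the uniform CDF."""
    u = sorted(u_values)
    n = len(u)
    return max(max((i + 1) / n - v, v - i / n) for i, v in enumerate(u))


rng = random.Random(0)
mu, scale = 3.0, 0.2  # hypothetical "true" parameters
maxima = [gumbel_sample(rng, mu, scale) for _ in range(2000)]

# PIT with the correct distribution: close to standard-uniform.
d_correct = ks_distance_from_uniform([gumbel_cdf(y, mu, scale) for y in maxima])
# PIT with a wrong scale parameter: clearly not uniform.
d_wrong = ks_distance_from_uniform([gumbel_cdf(y, mu, 0.3) for y in maxima])

print(f"KS distance, correct CDF: {d_correct:.3f}")
print(f"KS distance, wrong scale: {d_wrong:.3f}")
```

The distance from uniformity is small when the assumed distribution matches the data and noticeably larger when it does not, which is the property a PP or QQ diagnostic relies on.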
-
AC6: 'Reply on CC1', Tom Howard, 28 Apr 2022
Hi John, Thank you for the review. My responses are uploaded as a supplement to the third (anonymous) review. This includes tabulated responses to all 3 reviews to date, because there is some overlap. I have also uploaded the proposed appendix as author comment AC4. I accidentally uploaded it twice: AC5 is identical.
-
EC1: 'Comment on os-2022-14', John M. Huthnance, 20 Apr 2022
This is to clarify that the comment by John Hunter was intended to be a referee comment as invited by me as Editor. Its appearance in another guise is accidental.
- AC2: 'Reply on EC1', Tom Howard, 20 Apr 2022
-
RC2: 'Comment on os-2022-14', Anonymous Referee #2, 21 Apr 2022
Thank-you for this technical note, which I believe will be helpful in analysing return periods of UK gauges and is worth publishing in OS. A little work is required to make it easier to understand, especially for readers considering how it applies to sites outside of the UK network. Some suggestions:
The paper could be made easier to follow independently. It currently relies on too much cross-referencing to other papers particularly H&W21.
Please provide equations for GEV/Gumbel, and GPD that is mentioned later.
Fig 1 only makes sense to someone very familiar with the names and locations of the UK TG network, and even then it is hard to determine whether there is a spatial relationship or whether the relationship is due for example to tidal range. Perhaps using the colouring to group neighbouring gauges in clusters would be useful? And certainly a map.
But I would also like to see some evidence of the spatial correlation of mu and lambda, perhaps some maps indicating all three parameters, as fitted to the data as it stands, and then under the experimental conditions? Or plotted with the coastal position as an axis - I see from panel 1c that you have already ordered the sites clockwise around the coast.
line 73: artefact since we're in British spelling
line 113: You show that the fitted scale parameter lambda, assuming a Gumbel distribution, is slightly higher than a Gumbel should allow? What does this imply, physically? How does assuming a Gumbel therefore bias the extrapolation? Perhaps work through an example? ... ah this comes in figure 6, thanks. It might be easier to understand the general argument if you brought fig 6 forward.
Figure 3: I'm afraid I don't really follow what is going on here, this plot is not well explained.
Fig 6: It is quite concerning that Hinkley shows such a large uncertainty at very long return periods depending on the method, considering the reason for the gauge! If this is very atypical, a more typical example would be illustrative. Probably not Bournemouth, which has its own unusual challenges.
line 202: Can you give any guidance on what constraints should be applied to the GEV shape parameter in practice? Is it the same at any site? If not, what varies?
Conclusions: I agree with the other reviewer that it is refreshing to see such succinct conclusions!
References: There are several missing DOIs.
Data: data should be open and links provided.
- AC3: 'Reply on RC2', Tom Howard, 28 Apr 2022
- AC4: 'Comment on os-2022-14', Tom Howard, 28 Apr 2022
-
AC5: 'Comment on os-2022-14', Tom Howard, 28 Apr 2022
Here is the proposed appendix with details of the statistical test associated with Fig. 2.
- AC7: 'Reply on AC5', Tom Howard, 28 Apr 2022