Towards a fair comparison of statistical and dynamical downscaling in the framework of the EURO-CORDEX initiative

Both statistical and dynamical downscaling methods are well-established techniques to bridge the gap between the coarse information produced by global circulation models and the regional-to-local scales required by the climate change Impacts, Adaptation, and Vulnerability (IAV) communities. A number of studies have analyzed the relative merits of each technique by inter-comparing their performance in reproducing the observed climate, as given by a number of climatic indices (e.g. mean values, percentiles, spells). However, in this paper we stress that fair comparisons should be based on indices that are not affected by the calibration towards the observed climate used for some of the methods. We focus on precipitation (over continental Spain) and consider the output of eight Regional Climate Models (RCMs) from the EURO-CORDEX initiative at 0.44° resolution and five Statistical Downscaling Methods (SDMs), covering analog resampling, weather typing and generalized linear models, trained using the Spain044 observational gridded dataset on exactly the same RCM grid. The performance of these models is inter-compared in terms of several standard indices (mean precipitation, 90th percentile on wet days, maximum precipitation amount and maximum number of consecutive dry days), taking into account the parameters involved in the SDM training phase. It is shown that not only the directly affected indices should be carefully analyzed, but also those indirectly influenced (e.g. percentile-based indices for precipitation), which are more difficult to identify. We also analyze how simple transformations (e.g. linear scaling) could be applied to the outputs of the uncalibrated methods in order to put SDMs and RCMs on an equal footing, and thus perform a fairer comparison.

In the above-mentioned studies, SDMs and RCMs were compared without bringing into question whether the indicators considered in the comparison were influenced by the calibration or tuning of the downscaling methods. As far as we know, there is no previous comprehensive comparison study taking this factor into account. In this paper we shed light on this problem and describe an inter-comparison experiment for precipitation over Spain considering eight EURO-CORDEX RCMs at a 0.44° resolution and five PP SDMs trained using the Spain044 gridded observation data in a cross-validation form. The methods considered include an analog resampling technique and four methods based on a Bernoulli distribution (for occurrence) and a Gamma distribution (for amount), fitted to the data conditioned to circulation in different forms. Therefore, the training process of the SDMs used in this study only directly affects the mean and distribution shape of the precipitation amount, except for the analog method, which affects various aspects of the distribution due to its resampling nature. By doing this, we keep the number of parameters affected in the training phase as small as possible, unlike other methods that calibrate the whole distribution. Moreover, in order to analyze the potential impact of the adjustment of these statistics, the comparison is also performed after the application of two basic bias correction methods to both statistical and dynamical downscaling for precipitation frequency and intensity.

This paper is structured as follows. In Section 2 we present the data and methods [...]

In this work, daily precipitation values from the freely-available RCM simulations within the EURO-CORDEX initiative at 0.44° resolution were downloaded from the [...] considered the simulations driven by the ERA-Interim reanalysis (Dee et al, 2011) covering the common period 1990-2008. Notice that this ensemble contains two versions of the WRF model, with different microphysics and radiation schemes but the same convection parameterization. We refer the reader to Table 1 [...]

The SDMs used in this work (see Table 2 [...] and is extended here to SDMs.

The indices 90pWET, RX1day and RX5day were corrected using a multiplicative local scaling (LS) factor obtained as the quotient of the observed and simulated wet-day precipitation:

$$RR_{LS} = RR_{DS} \cdot \frac{RR^{wet}_{OBS}}{RR^{wet}_{DS}}$$

where $RR_{DS}$ represents daily downscaled precipitation, and $RR^{wet}_{OBS}$ and $RR^{wet}_{DS}$ denote the observed and simulated wet-day precipitation, respectively. The correction factor changed from season to season for each grid box. The precipitation indices were computed from the resulting $RR_{LS}$ series.
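As a minimal sketch of the multiplicative local scaling described above, the following Python function rescales a downscaled daily series for a single grid box, season by season. Several details are assumptions made for illustration and are not taken from the paper: the factor is computed from the mean precipitation on wet days, wet days are those with at least 1 mm, and the names `local_scaling`, `rr_ds`, `rr_obs` and `seasons` are hypothetical.

```python
import numpy as np

WET_DAY_MM = 1.0  # assumed wet-day threshold (mm)


def local_scaling(rr_ds, rr_obs, seasons, wet_threshold=WET_DAY_MM):
    """Rescale downscaled daily precipitation for one grid box.

    rr_ds, rr_obs : daily downscaled and observed precipitation (mm)
    seasons       : season label of each day (same length as the series)
    The factor is the ratio of observed to downscaled mean wet-day
    precipitation (an assumed reading of "wet-day precipitation").
    """
    rr_ds = np.asarray(rr_ds, dtype=float)
    rr_obs = np.asarray(rr_obs, dtype=float)
    seasons = np.asarray(seasons)
    rr_ls = rr_ds.copy()
    for season in np.unique(seasons):
        in_season = seasons == season
        wet_ds = rr_ds[in_season & (rr_ds >= wet_threshold)]
        wet_obs = rr_obs[in_season & (rr_obs >= wet_threshold)]
        if wet_ds.size == 0 or wet_obs.size == 0:
            continue  # no wet days in this season: leave it unscaled
        factor = wet_obs.mean() / wet_ds.mean()
        rr_ls[in_season] = rr_ds[in_season] * factor
    return rr_ls
```

Indices such as 90pWET or RX1day would then be computed from the returned series instead of the raw downscaled one. Because the factor is multiplicative, dry days with zero precipitation stay at zero, although values close to the threshold can move across it.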

Other precipitation indicators, such as CDD, are more related to precipitation occurrence and the autocorrelation of the precipitation series. This indicator changes as the wet-day threshold (typically 1 mm) changes, so it is sensitive to changes in the wet-day frequency. The frequency adjustment was applied to the precipitation series by obtaining the adjusted wet-day threshold $P^*$ that matches the simulated and observed wet-day frequencies (i.e. the percentage of wet days is the same for observations and simulation). For this purpose, $P^*$ was estimated by selecting the value of the downscaled precipitation matching the observed wet-day frequency computed with a 1 mm threshold ($RR1_{OBS} = F_{OBS}(1\,\mathrm{mm})$) for each grid box:

$$P^* = F_{DS}^{-1}\left(F_{OBS}(1\,\mathrm{mm})\right)$$

where $F$ is the empirical cumulative distribution function (CDF), so $F_{DS}$ and $F_{OBS}$ refer to the downscaled and observed CDFs, respectively. Thus, the correction of CDD consists in using $P^*$ (instead of 1 mm) as the wet-day threshold in the index calculation.

Note that this correction adjusts the precipitation occurrence, but does not af- [...]

Therefore, a direct comparison of results from both techniques is also unfair in this case. After local scaling, the results of the RCMs become comparable to the SDMs.
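The wet-day frequency adjustment used for CDD can be illustrated with a short Python sketch for a single grid box. It is written under stated assumptions rather than as the authors' implementation: the inverse empirical CDF is approximated with `numpy.quantile` (default linear interpolation), a day counts as dry when its precipitation is strictly below the threshold, and the function names `adjusted_wet_threshold` and `cdd` are hypothetical.

```python
import numpy as np


def adjusted_wet_threshold(rr_ds, rr_obs, obs_threshold=1.0):
    """Estimate P*: the downscaled precipitation value whose empirical
    CDF equals the observed CDF evaluated at obs_threshold (1 mm)."""
    rr_ds = np.asarray(rr_ds, dtype=float)
    rr_obs = np.asarray(rr_obs, dtype=float)
    # Empirical F_OBS(1 mm): fraction of observed days below the threshold
    f_obs = np.mean(rr_obs < obs_threshold)
    # Approximate inverse of the downscaled CDF at that probability
    return float(np.quantile(rr_ds, f_obs))


def cdd(rr, wet_threshold=1.0):
    """Maximum number of consecutive days below wet_threshold."""
    longest = current = 0
    for value in rr:
        current = current + 1 if value < wet_threshold else 0
        longest = max(longest, current)
    return longest
```

In this sketch the corrected index would be obtained as `cdd(rr_ds, adjusted_wet_threshold(rr_ds, rr_obs))`, while the observed index keeps the 1 mm threshold. Only the threshold changes, so the ordering of days in the series, and hence its autocorrelation, is left untouched.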

Similar results were also found for RX5day (not shown).

SDMs show very small changes after the frequency adjustment (mainly a reduction in the spatial variability). This suggests that they present inherent deficiencies in representing dry spells, which cannot be solved by means of a bias correction. Note that the correction does not alter the series autocorrelation, but the wet-day frequency.

In particular, S5 shows a completely different behaviour as compared to the other SDMs, whereas the analog method (S1) is the best-performing SDM. Bear in mind that the analog method is an algorithmic method that is based on a resampling of the observations. Therefore, it does not explicitly calibrate the mean or the temporal correlation but, according to the results, they are indirectly quite well captured. This is one advantage of this method, but it also presents some limitations such as the [...]

Fig. 3 Biases for CDD before the correction (first and second columns), wet-day adjusted thresholds P* (third and fourth columns, see Section 2.5) and CDD biases after the correction (fifth and sixth columns) for the SDMs (S1-5) and some representative RCMs, in winter. The numbers inside the figures are the spatially averaged MAEs. For a better contrast of spatial differences in P*, values are presented using a non-linear scale.