Introduction

The Need for Downscaling in Climate

Global climate models (GCMs) constitute the primary and most comprehensive tools to study the processes in current climate and to investigate the characteristics of potential future climates. GCMs contain components describing atmospheric, oceanic and land-surface processes, including sea ice and other components of the Earth’s system that influence the climate on time scales ranging from hours to several hundred and even thousands of years. Due to the complexity of the physical processes described (e.g. carbon cycle, atmospheric chemistry, aerosols), the variety of external forcings considered (e.g. greenhouse gas and aerosol emission scenarios), the need for ensembles of very long simulations and the limitations in computing power, modellers are forced to use rather coarse computing meshes for GCMs compared to numerical weather prediction models, which precludes a detailed representation of fine-scale processes in the climate system. The current typical horizontal grid spacing of the atmospheric component of state-of-the-art GCMs participating in the fifth Coupled Model Intercomparison Project (CMIP5) on centennial scale is about 3 (e.g., [29]).

Several improvements can be expected from increased resolution in climate model simulations (e.g. [65, 80]). Numerical truncation error in the discretization of the field equations is automatically reduced with the use of finer computational grids, which, in principle, should lead to improved simulations [43]. Furthermore, the use of finer computational meshes permits the explicit representation of small-scale processes that are precluded in low-resolution simulations. These include, for example, mesoscale circulations, improved nonlinear interactions between large and small scales, and the development of specific hydrodynamic instabilities. The concomitant increase in the resolution of surface fields such as topography, coastlines, inland water bodies and islands also allows for the representation of a wide range of surface-forced processes that exert important local climate effects, some of which may even have synoptic- and planetary-scale influences. For example, in regions of complex topography, the use of a finer grid mesh allows resolving smaller-scale near-surface temperature gradients due to the better-described orography and the general variation of air temperature with altitude [20, 60]. Another example is given by the interaction between middle-latitude synoptic weather systems and a higher-resolution topography; the improved representation of higher mountains may enhance the rain-shadow effect on the lee side of the mountains, influencing the precipitation locally but also far from the mountains (e.g. Fig. 6 of [45]).

A distinct challenge associated with increasing resolution, however, concerns the adaptation of the parameterization algorithms used to represent the ensemble effect of subgrid-scale processes (e.g. [5, 37, 43, 58]). The absence of suitable recalibration can partially offset or even counteract the benefits arising from the higher resolution. It is generally expected, however, that after suitable recalibration to adapt the parameterizations to the increased resolution, improvements in the simulated climate would ensue.

The motivation behind producing higher resolution climate simulations is not only related to scientific arguments of the type described above. Future changes in the climate, in its mean, variability and extremes, may have significant societal effects due to their impact on human activities such as agricultural food production, water management, energy consumption and production, as well as on infrastructures, health and ecosystems. The study of potential impacts and the development of adaptation strategies to mitigate climate changes require trustworthy climate information at different spatio-temporal scales, including knowledge of changes within countries, provinces and even cities (e.g. [4]). The need for climate information at very fine scales hence constitutes a strong incentive to perform higher-resolution climate projections.

Development of the Nested RCM Approach and Expected Outcomes

In order to circumvent the limitations of current low-resolution GCMs and the practical difficulties of making operational high-resolution global climate simulations, several so-called downscaling techniques have been developed (e.g. [29, 32]). Downscaling techniques attempt to translate large-scale, low-resolution atmospheric fields into local, high-resolution information on the climate variables of interest. They can be classified into two main categories: dynamical downscaling and statistical downscaling; see the reviews by Giorgi and Mearns [32] and Rummukainen [64] for the former and by Maraun et al. [51] for the latter.

In this article, attention will be focused on dynamical downscaling using one-way nested, limited-area, regional climate models (RCMs) [78]; the alternative dynamical downscaling technique using global variable-resolution grid models (e.g. [2, 16, 30]) will not be discussed here. The dynamical downscaling approach using RCMs was initiated by the seminal work of Giorgi and Bates [31]. It consists of using time-varying large-scale atmospheric variables (usually winds, temperature, water vapour and surface pressure) and sea surface conditions (temperature and sea-ice cover) obtained from interpolation of coarse resolution GCM-simulated data or historical analyses (reanalyses) to drive a higher resolution model over a limited region of the globe. The key challenge behind the application of RCMs is to reap the benefits of higher resolution while avoiding the possible drawbacks arising from the use of nesting over a limited domain (e.g. [47]). Years of investigations have led the regional climate modelling community to believe that RCMs can improve small- and medium-scale features that are either absent or poorly represented in the coarser boundary conditions (e.g. [18, 28, 37, 44]). Whether RCMs can and should improve some of the larger-scale fields provided in the boundary conditions remains debated within the community [21, 76]. The large-scale conditions used to drive the RCM are generally referred to as the lateral boundary conditions even if, in some cases, interior nudging of the large scales is also applied (e.g. [36, 55, 79]). To avoid potential confusion, the RCM driving fields will henceforth be referred to as global driving data (GDD).

Given that dynamical downscaling is a relatively new technique, a good part of the work since the inception of RCMs was devoted to issues related to their development, to ensure their technical feasibility and to optimise the results. Once these aspects had been positively answered, practical interest in their application grew at a fast pace, as did the expectations of the climate change impact community. As a result, a number of projects have produced regional climate projections using RCM ensembles including, but not limited to, PRUDENCE [10], ENSEMBLES [75], NARCCAP [54] and NARCliM [26]. A more global (multiple regions around the world) and more comprehensive (broader range of downscaling approaches) project is currently underway (CORDEX; [33]). In parallel with these projects, investigations regarding the actual benefits of using RCMs were taking place; this quest came to be called “the added value issue” and today is the focus of much attention in the RCM modelling community, as can be seen in the large number of papers published on it (see recent reviews by Hong and Kanamitsu [37] and Xue et al. [84]) and of presentations in recent conferences and workshops (e.g. [1]).

The aim of this article is to carry out a conceptual review of several issues associated with the added value (AV) afforded by RCMs in dynamically downscaling low-resolution climate analyses or GCM simulations and projections. The paper is organised as follows. The next section briefly describes various ways in which RCMs can be evaluated and discusses alternative meanings of “added value”. The section Factors Influencing the Added Value discusses the main factors that exert an influence on added value. Finally, the last section summarises and concludes.

What Added Value Means

General Evaluation of RCMs

A key element in the development of any approach trying to describe or predict some aspect of the natural world is its evaluation. That is, in order to quantify how reliable a numerical model is and how confident we can be about its simulations and predictions, model results should be compared either with known solutions in idealized frameworks or with observations in the real world (e.g. [62]).

A usual way to evaluate simulations from a climate model (CM), whether global or regional, is to test its ability to simulate some climate statistics of the recent past ($X_{\text{CM}}$) by comparing with the corresponding climate statistics obtained from observations ($X_{\text{OBS}}$). In this case, the climate model performance can be quantified using some statistical metric $d$ (e.g. the mean square error) to measure the “distance” between the simulated and observed climate as

$$ O = d(X_{\text{CM}},X_{\text{OBS}}) \tag{1} $$

When the statistic is multivariate, specific weights are assigned to individual variables in the calculation of the metric. Obvious as it may sound, this kind of evaluation is challenging as it requires observations at a similar resolution and spatial coverage as the simulated fields. It is worth mentioning that by “similar resolution” we do not mean a nominal resolution such as that claimed by many databases in areas of sparse observations. For example, the database CANGRID [77] comes on a 50-km grid, but in some of the covered regions stations can be more than 500 km apart. While for GCMs the evaluation is made difficult by large regions with poor coverage (e.g. over the oceans), for RCMs the difficulty comes from the scarce availability of high-density observations nearly everywhere, which strongly limits the observational evaluation to a few variables over specific regions around the world (e.g. [17, 25, 38, 67]).
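
As a minimal sketch of Eq. 1, assuming the metric d is a weighted mean square error, the distance could be computed as below. The function name `distance`, the per-element weights (for instance one per variable or per grid cell) and the sample values are illustrative assumptions, not taken from any of the cited studies.

```python
import numpy as np

def distance(x_cm, x_obs, weights=None):
    """Weighted mean square error between simulated and observed statistics.

    Assumed form of the metric d in Eq. 1; other metrics are equally valid.
    """
    x_cm = np.asarray(x_cm, dtype=float)
    x_obs = np.asarray(x_obs, dtype=float)
    if weights is None:
        weights = np.ones_like(x_obs)  # equal weights by default
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * (x_cm - x_obs) ** 2) / np.sum(weights))

# Hypothetical climate statistics (e.g. seasonal-mean temperature in K).
x_obs = np.array([280.0, 282.0, 285.0, 290.0])
x_cm = np.array([281.0, 282.0, 283.0, 290.0])

o_equal = distance(x_cm, x_obs)                                # 1.25
o_weighted = distance(x_cm, x_obs, weights=[1.0, 1.0, 3.0, 1.0])
```

Giving more weight to the third element, where the error is largest, increases the score, which illustrates how the choice of weights alone can change the apparent performance.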

Another approach to evaluate an RCM is to verify its ability to reproduce the climate simulated by a global model at the same resolution and with similar formulation. This type of evaluation is denoted as a “perfect model” approach and, following Eq. 1, it can be quantified as follows:

$$ \text{PM} = d(X_{\text{RCM}},X_{\text{HRGCM}}) \tag{2} $$

where $X_{\text{HRGCM}}$ represents the climate statistics of the high-resolution GCM simulation and PM denotes a perfect model-type evaluation. This perfect model approach has been extensively discussed in the context of the Big Brother Experiment (e.g. [3, 11, 15, 21, 24, 48, 66]). An advantage of the perfect model approach is that it does not require observations, and hence, it can be used to evaluate in detail the possible drawbacks of running over a limited area. Furthermore, it can be used to evaluate the ability of RCMs not only to simulate the present-day climate but also the future climate and climate changes, although so far we are not aware of any study of the latter type.

Another way to evaluate the skill of RCMs consists in assessing the comparative performance of the RCM simulation and its GDD in representing the present climate over a given region. This kind of evaluation arises naturally from the fact that the RCM is not a self-contained tool and needs boundary conditions to simulate the climate of a given region. As a consequence, an essential requirement for an RCM to be useful is that its simulation improves some aspect of the climate compared to that of its GDD. The comparison of the relative performance of the RCM and its GDD is generally referred to as the study of the added value (AV) of RCMs. The AV can be quantified by comparing the distance metric between the GDD and the observations on one hand, and between the RCM simulation and the observations on the other:

$$ \text{AV} = d(X_{\text{GDD}},X_{\text{OBS}}) - d(X_{\text{RCM}},X_{\text{OBS}}) \tag{3} $$

Defined this way, the AV is positive whenever the distance (i.e. error) between the RCM and observations is smaller than the one between the GDD and observations. This AV type of evaluation could also be applied in other contexts such as evaluating whether a new version of a model improves upon an earlier version. As with the observational approach (see Eq. 1), the AV evaluation requires high-resolution observations and hence, can only be performed for a limited number of climate variables over some regions of the world. However, as will be discussed later, Eq. 3 does not constitute the only way to evaluate the AV and other ways should be contemplated to reflect possible improvements in broader aspects of the simulation.
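
The sign convention of Eq. 3 can be illustrated with a short sketch, assuming the mean square error as the metric d. The names `mse` and `added_value` and the sample fields are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def mse(x, x_ref):
    """Mean square error, used here as one possible choice for the metric d."""
    x = np.asarray(x, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    return float(np.mean((x - x_ref) ** 2))

def added_value(x_gdd, x_rcm, x_obs, d=mse):
    """Eq. 3: positive when the RCM is closer to observations than the GDD."""
    return d(x_gdd, x_obs) - d(x_rcm, x_obs)

# Hypothetical statistics at four locations.
x_obs = np.array([2.0, 4.0, 6.0, 8.0])
x_gdd = np.array([5.0, 5.0, 5.0, 5.0])  # coarse field: regional mean only
x_rcm = np.array([3.0, 4.0, 6.0, 7.0])  # finer field with small errors

av = added_value(x_gdd, x_rcm, x_obs)   # 5.0 - 0.5 = 4.5
```

Swapping the roles of the two simulations flips the sign, so a negative value indicates that the RCM has degraded the statistic relative to its driving data.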

It is important to make some general remarks about the various RCM evaluation approaches described above. All of them rely on the use of some metric d—and its associated weights in the case of multivariate statistics—to measure the difference between the climate of a simulation and some reference. The metric is usually selected depending on the application of interest, although there is often a lack of information about its relevance to characterise the predictive skill of climate models. As discussed by Gleckler et al. [34], the choice of metric is arbitrary and hence there is no objective way to evaluate the overall performance of climate models and to identify the “best” climate model. Furthermore, there is no way to ensure that the apparent best performance is not due to compensation of errors in the model, hence giving a false sense of confidence.

In addition, due to the internal variability of both the driving model and RCMs, it is difficult to establish statistically robust figures for these metrics (see [47] and [19]). For experiments to be effective, it is then necessary to work with either rather long simulations or several ensemble members, which is uncommon in multi-institutional RCM ensembles [54]. What makes this situation even more difficult is that internal variability is higher at smaller spatial and temporal scales as well as in extreme values, precisely where the added value of RCMs is expected to thrive [12, 50].

It is also important to highlight here that, although the different types of evaluation are related to each other, no simple inferential relations can be drawn between them. For example, if an RCM scores well in the context of a perfect model experiment, this does not imply that there is some AV nor that the RCM performs well when compared with observations. Similarly, the climate of an RCM can be fairly close to the observed climate and even show improvements compared to the GDD, and still fare badly in a perfect model-type evaluation. This tells us that all evaluations discussed above have something to teach us, and this explains why they represent important research activities of the RCM community.

On the Various Meanings of Added Value

For the reasons discussed in the last section, the determination of the skill of a climate model or the value added by an RCM simulation is ultimately dependent on some arbitrary choices made by individual researchers. Moreover, the fact that there is no specific definition of what AV means implies that there is no universally accepted formalism or systematic approach. A review of the literature reveals that AV may take a variety of meanings. While some authors perform the assessment of AV using quantitative metrics (e.g. [13, 22, 40, 59]), others may use qualitative assessments such as visual comparison of maps or of the complexity of some phenomenon (e.g. [49]). Also, while in many studies the mere fact that an RCM produces small-scale features that are absent in the GDD has been considered as AV (e.g. [7, 72]), other studies considered this as “added variability” that can potentially lead to some AV, conditional on other factors (e.g. [6, 18, 20]). In what follows, we will describe and argue about a number of categories of what we believe can be considered as AV. The list is probably incomplete, but we hope that some of the principles behind the categories are meaningful.

A simple and straightforward way in which an RCM can add value is by improving the score of some metric compared to the GDD. This type of AV, hereafter denoted by “observational AV”, has the advantage of being easily quantified using some equation of the type presented in Eq. 3. Furthermore, the AV as presented in Eq. 3 can be normalized in such ways that a direct comparison can be performed for the AV arising from different variables (e.g. precipitation vs. temperature), statistical properties (time average vs. 95th percentile) and/or distance metrics (mean square error vs. spatial correlation). Hence, these types of comparisons are particularly useful when trying to compare the influence of various factors on the total AV (e.g. flat vs. complex topography regions). The normalization processes may be also useful when the overall AV of the RCM technique needs to be evaluated in a similar way as done when evaluating the overall performance of GCMs (e.g. CMIP3 vs. CMIP5 GCMs) [34, 80].
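
The text does not prescribe a particular normalization, so the sketch below assumes one simple possibility: dividing the AV of Eq. 3 by the sum of the two distances, which bounds the result between -1 and 1 and removes the units of the underlying variable. The function name and the error values are hypothetical.

```python
def normalized_av(d_gdd, d_rcm):
    """Assumed normalization: (d_GDD - d_RCM) / (d_GDD + d_RCM).

    Returns a unitless value in [-1, 1]; 1 means the RCM error vanishes,
    -1 means the GDD error vanishes, 0 means both errors are equal.
    """
    total = d_gdd + d_rcm
    if total == 0.0:
        return 0.0  # both fields match the reference exactly
    return (d_gdd - d_rcm) / total

# Hypothetical distances for two variables with different units.
av_precip = normalized_av(d_gdd=4.0, d_rcm=1.0)  # errors in (mm/day)^2
av_temp = normalized_av(d_gdd=0.5, d_rcm=0.5)    # errors in K^2
```

With this choice the precipitation and temperature scores can be compared directly even though the raw distances are expressed in different units.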

There is, however, an important drawback associated with the observational AV approach. As discussed before, its evaluation is limited to climate statistics for which reliable and consistent observations are available for comparison. As a consequence, there may be several aspects of the RCM simulation that are systematically improved (deteriorated) compared to the GDD but for which no assessment can be performed. One example may help to illustrate this point. It has been shown that RCMs systematically improve the simulation of the spatial patterns of 2-m temperature compared with the GDD in regions of complex orography [20, 27, 60]. As discussed by Di Luca et al. [20] and Prein et al. [59], this improvement is largely related to the general relation between temperature and terrain elevation: the more detailed representation of terrain elevation gradients in the RCM creates stationary temperature gradients due to the general decrease of temperature with height. The fact that this improvement appears to be systematic and that it is based on a relatively simple process suggests that it may be expected over most regions where complex topographic features exist, although the observational data may not be available to confirm it. It is worth noting at this point, although it will be discussed later in more detail, that when added value has such a straightforward origin, it is also probably easily attainable with simpler methods, such as rudimentary post-processing.
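
One example of such rudimentary post-processing is an elevation adjustment of the coarse temperature field with a fixed lapse rate. The sketch below assumes the standard-atmosphere value of 6.5 K per km and hypothetical elevations; it is meant only as an illustration of the idea, not as the method used in the cited studies.

```python
import numpy as np

LAPSE_RATE = 6.5e-3  # K per metre; standard-atmosphere value, assumed here

def lapse_rate_adjust(t_coarse, z_coarse, z_fine):
    """Shift coarse temperatures to fine-grid elevations with a fixed lapse rate.

    t_coarse and z_coarse are the coarse temperature (K) and coarse model
    elevation (m) already interpolated to the fine grid; z_fine is the
    high-resolution elevation (m).
    """
    t_coarse = np.asarray(t_coarse, dtype=float)
    dz = np.asarray(z_fine, dtype=float) - np.asarray(z_coarse, dtype=float)
    return t_coarse - LAPSE_RATE * dz

# One coarse cell (288 K at a mean elevation of 1000 m) covering four fine cells.
t_coarse = np.full(4, 288.0)
z_coarse = np.full(4, 1000.0)
z_fine = np.array([500.0, 800.0, 1200.0, 1500.0])

t_fine = lapse_rate_adjust(t_coarse, z_coarse, z_fine)
# Valleys become warmer and summits colder than the coarse-cell value.
```

If a correction of this kind recovers most of the improvement seen in an RCM, the corresponding AV can hardly be attributed to the dynamical downscaling itself.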

Another example may be related with the fact that RCMs can extend the variance spectra to higher frequencies and wave numbers by adding fine-scale variability that was absent in the GDD. Even if no direct observations may be available to verify the quality of these small scales, a theoretical relation such as the well-known spatial-scale dependence of the atmospheric kinetic energy spectra in the free troposphere [68, 72] can be used to assess the plausibility of the downscaled fields.
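
A plausibility check of this kind could be sketched as a log-log slope fit of the power spectrum along a transect of the downscaled field. The code below assumes a periodic 1-D transect and uses the -5/3 slope often quoted for the mesoscale range purely as the reference value; the synthetic signal and the function name `spectral_slope` are illustrative.

```python
import numpy as np

def spectral_slope(signal, k_min, k_max):
    """Least-squares log-log slope of the 1-D power spectrum over [k_min, k_max]."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    k = np.arange(power.size)
    band = (k >= k_min) & (k <= k_max)
    slope, _ = np.polyfit(np.log(k[band]), np.log(power[band]), 1)
    return float(slope)

# Synthetic periodic transect whose spectrum follows k^(-5/3) by construction:
# amplitude k^(-5/6) gives power proportional to k^(-5/3).
n = 256
x = np.arange(n)
rng = np.random.default_rng(0)
wavenumbers = np.arange(1, n // 2, dtype=float)
phases = rng.uniform(0.0, 2.0 * np.pi, wavenumbers.size)
amplitudes = wavenumbers ** (-5.0 / 6.0)
signal = np.sum(
    amplitudes[:, None]
    * np.cos(2.0 * np.pi * wavenumbers[:, None] * x[None, :] / n + phases[:, None]),
    axis=0,
)

slope = spectral_slope(signal, k_min=2, k_max=100)
```

For real limited-area fields, which are not periodic, a detrending or windowing step would be needed before the transform; that step is omitted here for brevity.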

An additional negative effect of the lack of observations available for comparison is that it precludes a comprehensive evaluation of the AV that could help to understand the physical causes of the improvements or deteriorations. Commonly, the climate statistics that can be evaluated (such as biases of time-mean precipitation or 2-m temperatures) give very few insights into which physical processes act in concert, and how, to give a particular AV. As a consequence, a given score may be deteriorated/improved by the RCM compared with the GDD, while the formulation of some of the underlying processes may be improved/deteriorated.

Based on such considerations, it may be convenient to define a second category of AV, here denoted as “conjectural AV”, to characterise the AV in those climate statistics for which no observational estimations are available but that are either firmly supported by theoretical considerations or by a number of studies realised under similar circumstances. Clearly, the determination of the conjectural AV is more subjective than that of the observational AV because one has to decide whether or not evidence supporting a particular assertion about the AV is strong enough to extrapolate the result. This conjectural AV can also be used to argue that, for example, representing hail in a physically sophisticated manner as done in RCM simulations performed at convection-permitting scales (with grid spacing of a few kilometers) is an AV even if opportunities for its verification are scarce. An interesting aspect of the conjectural AV is that it can be used to assess the AV of RCMs in future climate simulations; that is, the extension of the AV to the future climate can be done through the conjectural AV.

Another interesting way to characterise the possibility of an RCM-produced AV is through the potential AV (PAV) concept [6]. As argued in [18, 19], a necessary condition for RCMs to add some value at fine scales is that the field of interest contains some structural small-scale richness. For example, geopotential height is strongly dominated by large scales, hence giving little chance to add any value in smaller scales. While the absence of fine-scale information (i.e. lack of PAV) is proof of a lack of AV in the small scales, the presence of PAV does not imply that there is some AV. This is because the error of the RCM-simulated small-scale variability may be larger than the variability itself (see [19] for a more detailed explanation). An important advantage of the PAV framework is that it does not need observations and so it can be applied equally to present and future climate simulations. That is, as with the conjectural AV, the PAV framework can be used to assess necessary conditions for RCMs to add value in future climate and climate change statistics.
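
The notion of small-scale richness can be made concrete with a simple diagnostic: the share of a field's spatial variance that disappears when the field is averaged onto a coarser grid. The sketch below is a hypothetical illustration of that idea and is not the PAV definition used in [6] or [18, 19]; the function name and test fields are assumptions.

```python
import numpy as np

def small_scale_variance_fraction(field, block):
    """Share of spatial variance lost when averaging over block x block cells.

    Assumes both dimensions of the field are exact multiples of `block`.
    """
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    # Block-average onto the coarse grid, then copy back to the fine grid.
    coarse = field.reshape(ny // block, block, nx // block, block).mean(axis=(1, 3))
    large_scale = np.repeat(np.repeat(coarse, block, axis=0), block, axis=1)
    total_var = field.var()
    if total_var == 0.0:
        return 0.0
    return float((field - large_scale).var() / total_var)

y, x = np.mgrid[0:8, 0:8]
smooth = (x // 4 + y // 4).astype(float)  # varies only between 4x4 blocks
checker = ((x + y) % 2).astype(float)     # varies only within 4x4 blocks

frac_smooth = small_scale_variance_fraction(smooth, block=4)    # 0.0
frac_checker = small_scale_variance_fraction(checker, block=4)  # 1.0
```

A fraction near zero, as for the smooth field, signals that little can be gained at fine scales, whereas a large fraction is only a necessary condition for AV and says nothing about whether the fine-scale detail is correct.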

In addition to these alternative ways of looking at AV, it is also important to consider the AV from the users' perspective. For example, if the estimate of the uncertainty of climate changes is important to users, they may choose to use GCMs rather than RCMs independently of any AV present, because the larger number of GCM ensemble members may allow for a better estimate of the model uncertainty through projection spread. It is worth noting that this choice is not without some risk of losing valuable information, such as when the variable of interest is dependent on strong local processes. On the other hand, other users aiming at performing impact studies may prefer RCM simulations, even without tangible proof of any kind of AV, due to the greater level of detail and the more realistic temporal variability compared to GCM data. What these examples show is that, from the user standpoint, several other elements in addition to explicitly documented AV come into the choice between global and regional models [8]. As emphasized by Barsugli et al. [4], many users will take weighty decisions based on the information provided by climate change projections. Hence, the search for AV and its role in helping practitioners and decision makers becomes far from academic. Despite its importance, we have chosen not to pursue the topic from this perspective in this conceptual review, for reasons of space and because of the authors' lack of experience with it. Progress from this point of view is taking place, however, particularly in the validation of downscaling methods to produce information that is both actionable and defensible (see activities of the VALUE network in [52]).

Factors Influencing the Added Value

In this section, we discuss a number of factors that may strongly influence the amount of AV and we separate them into two categories: those depending on the numerical experimental design and those depending on choices related to the particular climate statistics to be evaluated. Several aspects of the first category were discussed at length in a recent review paper by Xue et al. [84], who found that in a number of the studies they reviewed, RCMs add value only under an adequate choice of parameters such as the size and location of the domain, and of model components such as convective, land-surface or numerical schemes. What constitutes an adequate choice is difficult to assess “a priori”, however. Here, we focus our attention on more formal arguments to discuss how some particular choices of the experimental design can influence the expected AV.

Figure 1 shows three of the choices to be made when designing an RCM experimental set-up that can have important consequences on the amount, kind and meaning of value added by an RCM. As highlighted by Castro et al. [7], one important choice is given by the type of GDD used to drive the RCM that, in a climate studies context, can be reduced to either reanalyses or GCM-simulated fields. When using reanalyses to drive the RCM, the upper limit we can aim for is to generate a proxy for a high-resolution reanalysis, using dynamical downscaling as a much simpler method than a full-fledged data assimilation system, i.e. “poor-man reanalysis” (e.g. [39, 70, 74, 79]).

Fig. 1 Choices in the design of the experimental setup that will influence the AV to be found in a subsequent analysis

Given that reanalysis data constitute the best guess of the large-scale conditions, the RCM is not expected to improve upon those large-scale variables that are used to drive the RCM itself. It may happen, however, that the RCM simulation improves some non-driven variables at large scales (such as rainfall), although the main AV is expected in the small-scale features (small is here relative to the effective resolution of the reanalyses). The AV of reanalysis-driven RCM simulations can be explored using both the observational and the conjectural approach.

It is important to keep in mind, however, that the grid size of a reanalysis is not a good proxy for the scale of a phenomenon to be captured, unlike the case with GCMs. For example, let us take the case of large topographic forcing, such as that of the Rocky Mountains. Since reanalyses are fed by observations of the real world, which are affected by forcings at all scales, using an RCM with a resolution higher than that of the reanalysis will not necessarily materialize in measurable gains. Improvements may appear as a result of an RCM being more apt locally than the reanalysis and being able to produce a more realistic spatio-temporal spectrum.

When driven by GCMs in present climate, dynamical downscaling should make the simulated weather more realistic and statistically closer to observations due to the development of small-scale and topography-related phenomena that are poorly developed or unresolved in GCMs. Conceptually, we aim at reaching the standard of what would be obtained by running a GCM at the same resolution as the RCM (i.e. a high-resolution GCM), except for improvements in the very large-scale circulation that cannot be achieved by one-way dynamical downscaling. Again, the AV of GCM-driven RCM simulations can be studied using both the observational and the conjectural approach, but the results should be analyzed with caution due to the propagation of errors from the GDD. For example, a bias in the driving model may in some cases be amplified by improving the description of certain phenomena. It has been suggested by some (e.g. [61]) that a useful test for RCMs would be to study whether they can improve upon the GCMs' skill at reproducing the slowly evolving component of the observed climate, such as decadal fluctuations. It has been argued, however (e.g. [46]), that in addition to the issue of the propagation of errors from the GDD, such a test is hampered by the difficulty of separating the component of climate variations that results from changes in forcing (whether natural or anthropogenic) from the natural variability due to the chaotic nature of the climate system.

Finally, the last case is when RCMs are driven by GCMs in future climate projections. Clearly, because we do not yet know the future and there does not exist a proxy for similar changes, only the conjectural and potential AV can be evaluated, as discussed in the section On the Various Meanings of Added Value. Pielke et al. [57] have argued that RCM AV in reanalysis-driven simulations does not necessarily imply AV in GCM-driven projections, due to the propagation of errors from the GDD. A counter argument could be made that processes strongly influenced by local forcings (such as snow-albedo feedback, complex orography or coastlines) clearly benefit from increased resolution, independently of the GDD and perhaps their biases.

Another important choice to be made during the experimental set-up design that may influence the amount of AV is related to the horizontal resolution jump between the RCM simulation and the GDD. In general, the use of a single nested grid allows decreasing the horizontal spacing by about a factor of five compared with the boundary conditions (e.g. from 250 to 50 km), although larger jumps have also been realised successfully [71]. Larger resolution jumps can also be performed by using multiple nested grid simulations, with larger-domain coarse-mesh simulations successively driving smaller-domain simulations with finer meshes [9, 26]. Although the multiple nesting approach may in principle allow the grid spacing to be decreased as much as desired, errors propagating across the boundary may limit its application to some extent. It is worth mentioning that the resolution jump is not always a simple matter of choice: sometimes, it is imposed by resolution needs. By this, it is meant that some topographic features such as a valley, or weather phenomena such as convective systems [41], cannot be well simulated before reaching a certain resolution (e.g. [9]). It is clear that any dynamical downscaling approach that does not reach the critical resolution proper to each phenomenon will underperform with respect to its potential.
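
The arithmetic of the resolution jump can be sketched as follows, assuming each nest is limited to the factor-of-five refinement mentioned above. The function `nesting_chain` and the 2-km target are hypothetical and ignore the error propagation across boundaries that limits such chains in practice.

```python
def nesting_chain(gdd_spacing_km, target_spacing_km, max_jump=5.0):
    """Grid spacing (km) of each successive nest, refining by at most max_jump.

    max_jump=5.0 is an assumed rule of thumb, not a hard limit.
    """
    spacings = []
    spacing = float(gdd_spacing_km)
    while spacing > target_spacing_km:
        # Refine as much as allowed, but never beyond the target spacing.
        spacing = max(spacing / max_jump, float(target_spacing_km))
        spacings.append(spacing)
    return spacings

single = nesting_chain(250, 50)  # [50.0]: one nest suffices
chain = nesting_chain(250, 2)    # [50.0, 10.0, 2.0]: three successive nests
```

Under these assumptions, reaching a convection-permitting spacing from a 250-km GDD requires three nested simulations, each inheriting the errors of its parent.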

An additional aspect of the experimental set-up that may affect the amount and robustness of AV is related to the ensemble size. As is well known, it is generally difficult to determine whether the AV arises due to the higher resolution of the RCM or whether the particular RCM being used performs better than the lower-resolution GCM. As a consequence, it seems that the only way to identify improvements that are related to some intrinsic characteristic of the dynamical downscaling technique and independent of a specific RCM is through the use of a large ensemble of RCM-GCM pairs of simulations. As shown by Watterson et al. [80], increases in the horizontal resolution of GCMs do not lead to unequivocal improvements in the simulated climate when considering individual GCMs, but they show a consistent overall improvement when considering results from sub-ensembles of GCMs grouped according to their resolution. Limitations in computational resources, however, constrain the actual number of GCMs that can be downscaled, and so objective approaches are needed to efficiently select a sample of GCMs as GDD [26, 53].

An important aspect of the experimental set-up when conducting very high resolution, convection-permitting RCM simulations (horizontal grid meshes smaller than about 4 km) is related to the use of explicit convection instead of the parametrized representation. Using an ensemble of RCM simulations, Prein et al. [59] showed that convection-permitting simulations improve upon the parent simulations with an implicit representation of convection (with grid spacing of 10 km), mainly by improving the diurnal cycle of summer precipitation and the intensities of very extreme precipitation events. A deeper insight into the potential benefits and problems of convection-permitting simulations is given in a recent review article by Westra et al. [81] and in the study by Kendon et al. [41].

Figure 2 shows various factors that define a given climate statistic and that can strongly influence the amount of AV for a given experimental set-up. Given that the main difference between the climatic information provided by the GDD and by the RCM is related to their horizontal resolution, it seems natural that the AV will depend on the spatial scale of analysis. Various studies have assessed the AV generated by RCMs using some spatial scale-decomposition method [6, 18, 20, 27]. For example, [18] separated the AV into two components, denoted by AV1 and AV2, with AV1 representing the AV related to the small scales permitted by the higher resolution (i.e. scales only represented in the RCM) and AV2 representing the AV arising at relatively large scales (i.e. spatial scales common to both the GDD and the RCM) that results from the upscaling of fine scales into large scales.
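The idea of separating scales can be illustrated with a very simple filter. The sketch below is not the decomposition used in [18] or in the other cited studies: it assumes that block averaging onto the coarse grid is an acceptable large-scale filter, it works on a toy one-dimensional transect, and all names and numbers are hypothetical. It only splits the fields by scale; quantifying AV would additionally require comparing each part against observations.

```python
def block_mean(field, r):
    """Average consecutive blocks of r fine-grid values (upscaling to the coarse grid)."""
    if len(field) % r != 0:
        raise ValueError("field length must be a multiple of r")
    return [sum(field[i:i + r]) / r for i in range(0, len(field), r)]

def split_scales(rcm_field, r):
    """Split a fine-grid field into its coarse-grid means, the large-scale part
    (block means repeated on the fine grid) and the small-scale residual."""
    coarse = block_mean(rcm_field, r)
    large = [c for c in coarse for _ in range(r)]
    small = [x - l for x, l in zip(rcm_field, large)]
    return coarse, large, small

# Toy transect: 2 coarse cells, each containing 3 fine cells (illustrative values).
rcm = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
gdd = [2.5, 5.0]  # hypothetical driving-data values on the coarse grid

coarse, large, small = split_scales(rcm, r=3)

# Scales present only in the RCM: the part of the signal relevant to AV1.
small_scale_part = small
# Scales shared by RCM and GDD: differences here are relevant to AV2.
large_scale_diff = [c - g for c, g in zip(coarse, gdd)]

print(small_scale_part)
print(large_scale_diff)
```

By construction the small-scale residual averages to zero within each coarse cell, so the two parts do not overlap.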

Fig. 2. The choice of the climate statistics will also have an influence on the amount of AV. The various factors that may change the amount of AV are detailed in this chart

The finer resolution afforded by RCMs also implies that shorter temporal scale processes can be explicitly simulated. As a consequence, AV may be expected to depend on the time scale of the phenomenon studied. For example, [18] have shown that the PAV of RCMs when looking at the 95th percentile of precipitation was much higher for 3-h data than for daily or 15-day averaged data. Variables with spatio-temporal spectra dominated by large scales—such as temperature in the middle troposphere—are less prone to provide AV, while in variables with flat spectra—such as precipitation or water vapour—small-scale components contribute more importantly to the total variance and to local extremes where AV may be found.
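Why high percentiles are sensitive to the time scale of analysis can be seen with a synthetic series. The sketch below does not compute PAV and does not use data from [18]. It only shows, for a purely illustrative intermittent precipitation series, how temporal averaging damps the 95th percentile. The series, the helper names and the nearest-rank percentile definition are all assumptions made for this example.

```python
def percentile_nearest_rank(values, q):
    """q-th percentile (q an integer from 0 to 100), nearest-rank definition."""
    ordered = sorted(values)
    rank = max(1, -(-q * len(ordered) // 100))  # ceil(q * n / 100) in integer arithmetic
    return ordered[rank - 1]

def block_average(series, n):
    """Non-overlapping means over n consecutive time steps."""
    return [sum(series[i:i + n]) / n for i in range(0, len(series), n)]

STEPS_PER_DAY = 8  # 3-hourly data
N_DAYS = 60

# Synthetic precipitation rate (mm/h): dry except for one 3-h burst
# of 8 mm/h every second day.
rate_3h = [8.0 if i % 16 == 5 else 0.0 for i in range(N_DAYS * STEPS_PER_DAY)]

rate_daily = block_average(rate_3h, STEPS_PER_DAY)
rate_15day = block_average(rate_3h, 15 * STEPS_PER_DAY)

p95_3h = percentile_nearest_rank(rate_3h, 95)
p95_daily = percentile_nearest_rank(rate_daily, 95)
p95_15day = percentile_nearest_rank(rate_15day, 95)

print(p95_3h, p95_daily, p95_15day)
```

In this toy case the 95th percentile drops from 8 mm/h for 3-h data to 1 mm/h for daily means and to about 0.5 mm/h for 15-day means, which suggests why short-lived, fine-scale events leave most of their signature at short time scales.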

RCM climate statistics and their associated AV can also be strongly influenced by the nature and relative importance of forcings (e.g. stationary or transient) that characterise the particular region/season of interest (see “Forcing source” category in Fig. 2). Extensive research [14, 18, 20, 27, 60, 63, 69, 83] has shown that, no matter the statistical property considered (e.g. long-term mean, 99th percentile), a certain amount of AV can be expected in those regions characterized by important fine-scale surface forcings, thanks to the enhanced richness of surface characteristics at higher spatial resolution. However, the AV arising from stationary forcings is generally not obvious in spatially homogeneous regions where fine-scale surface forcings are weak (see section 9.6.4 of [29]); a list of several studies where added value was detected in mesoscale phenomena and extremes is also given in the same publication. In homogeneous regions, fine-scale AV can only arise from fine-scale transient features such as convective processes, mesoscale phenomena (e.g. low-level jets) and perhaps some better-resolved synoptic patterns (e.g. frontal precipitation).

In close relation with the type of forcing influencing the climate statistic of interest is the particular statistical property that is considered. Temporal and spatial averaging tends to filter out the variability induced by transient forcings and therefore, to highlight the transient effects, it is necessary to make use of statistical properties quantifying the variability and the extremes of the distribution (e.g. standard deviations, high percentiles, etc.).

The role of the Earth’s surface in exchanging momentum, heat and humidity with the atmosphere—through topographic features such as mountains, coastlines, open bodies of water or any spatial inhomogeneity on the Earth’s surface—makes the lower troposphere a particularly rich region for finding AV. Leduc and Laprise [48] have shown that the potential of AV on atmospheric variables is largest near the surface. This is due to the fact that low-resolution features entering the domain have two main reasons to be less affected at higher levels: (1) the proximity of the surface facilitates the creation of small-scale features by surface forcing and (2) high-level winds are stronger, making the spatial spin-up of small scales longer. Hong and Kanamitsu [37] stressed that most of the AV is expected in simulated variables that either depend strongly on the physical processes (e.g. precipitation) or are heavily influenced by high-resolution surface forcings through the planetary boundary layer physical representation. As a consequence, the AV is strongly dependent on how well physical processes are parameterized and how adapted they are to the high resolution.

When looking at future climate projections, an important aspect of a climate statistic is whether it quantifies some aspect of the future climate or of future climate changes [19, 56, 73]. As discussed before, looking for AV in the future climate can be done by investigating either the PAV or the conjectural AV, which are probably not very different from those in present climate. Although the study of the AV in the climate change signal has the same restrictions as the study of future climate, less AV may be expected in the climate change signal, because the difference between the future and present climate statistics filters out, to a large extent, the AV induced by stationary forcings such as topographic details or land-sea contrasts. For higher statistical moments such as extremes, particularly for variables whose extremes result from small-scale phenomena (such as precipitation), differences could be considerable and there seems to be potential for AV in the climate change signal. However, as higher statistical moments are considered, natural variability and model structural uncertainties associated with the fine-scale information may be large enough to prevent the results from being useful for impact and adaptation studies.
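The filtering effect of taking a future-minus-present difference can be made explicit with a small arithmetic example. It rests on a strong simplifying assumption that is ours, not the source's: the RCM field is written as the GDD value plus a time-invariant fine-scale term plus a fine-scale term that responds to the forcing. All variable names and numbers are hypothetical.

```python
# Three fine-grid cells inside one coarse cell (illustrative values, degrees C).
gdd_present = [10.0, 10.0, 10.0]
gdd_future = [12.0, 12.0, 12.0]

# Hypothetical time-invariant fine-scale effect (e.g. elevation differences).
stationary = [-3.0, 0.0, 1.5]
# Hypothetical fine-scale part of the change itself (e.g. a local feedback).
response = [0.5, 0.0, 0.0]

rcm_present = [g + s for g, s in zip(gdd_present, stationary)]
rcm_future = [g + s + r for g, s, r in zip(gdd_future, stationary, response)]

# Fine-scale detail in the present-climate statistic.
detail_present = [r - g for r, g in zip(rcm_present, gdd_present)]

# Fine-scale detail in the climate change signal.
delta_gdd = [f - p for f, p in zip(gdd_future, gdd_present)]
delta_rcm = [f - p for f, p in zip(rcm_future, rcm_present)]
detail_change = [dr - dg for dr, dg in zip(delta_rcm, delta_gdd)]

print(detail_present)
print(detail_change)
```

The stationary term dominates the fine-scale detail of the present climate but cancels exactly in the change signal, where only the forcing-dependent term survives.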

It is well known that GCM-projected climate changes associated with thermodynamical processes (such as anthropogenic warming and the associated acceleration of the hydrological cycle) are more trustworthy than those associated with dynamical processes (such as storm track shifts). It is hence to be expected that downscaled climate signals arising from warming are more credible than those arising from circulation changes. Hall [35] gives two such examples, the lake-effect snow along the fringes of the Great Lakes and the snow-albedo feedback effects in the high-elevation headwaters of the Ganges River, which are not adequately simulated by the GCMs because their coarse mesh prevents adequate resolution of inland lakes and mountain height variations.

Discussion and Conclusions

In this paper, we have reviewed some important issues related to the capability of high-resolution RCMs to add value to that of the low-resolution global driving data. Our initial goal in writing this paper was to develop a conceptual framework integrating recent advances in the field, clarifying terminology and systematizing current knowledge about the added value issue. During the writing process, the difficulty of reaching consensus between the individual ideas of the three coauthors, despite being long-term colleagues, led us to realise that our initial ambitions were somewhat too high. This also indicates that this specific field of research is, in some sense, not yet well grounded, and issues, apparently trivial, such as what is meant by added value, are still poorly defined, a conclusion supported by several presentations at a recent RCM Workshop [1]. We hope that this paper will stimulate further discussions in this direction, as we strongly believe this is an important issue.

While at first sight it seems obvious that the purpose of searching for added value is to show that RCMs have a role to play in the large concert of tools devised to help users adapt to climate change, it is unclear to us whether this objective, as it has usually been understood, is a wise one. To see this, it is instructive to take an example from the global modelling community.

Flato et al. [29] and Watterson et al. [80] compared simulations of CMIP5 models with those of the earlier CMIP3 models and found quite limited improvement; where improvement was found, it was more a property of the ensemble than of individual models. Such disappointing findings have, however, had little negative impact in the global modelling community, because it holds the deeply entrenched belief that the appropriate development pathway is to make models more comprehensive, including more complex processes and increasing their resolution, without expecting ground-breaking improvements in scores between model generations. The situation with RCMs is quite different, and finding that RCM simulations do not improve upon the driving GCM data in some aspect is often interpreted quite negatively (e.g. [42]). Hence, the stakes for regional models to clearly display AV appear much higher than for global models. The assessment of GCM simulations by Flato et al. [29] should, however, hold a lesson for those exploring the AV in RCMs: it is unrealistic to expect a vast amount of AV, since models already perform rather decently and we may have entered a period of diminishing returns, where investments pay off less now than when models were far less sophisticated and of coarser resolution. It is hence perhaps unwise for the RCM community to attach too many expectations to visible signs of AV instead of, as for GCMs, to progress in the formulation of models and of the processes they represent.

It is for this reason that we favour the idea that added value should be defined in the least restrictive way possible, keeping in mind the complexities that may make one dataset richer than another, and that we should not succumb to the quantitative temptation of turning this richness into insipid numbers with little meaning. At the end of the day, we probably need to be quantitative, but a lot of effort has to be devoted to the search for a meaningful process that is worth quantifying. That being said, we believe that it is also very important to be as rigorous and precise as possible when evaluating the AV of RCMs. Several discussions in this article suggest that the expression “added value” should be used carefully, and that it requires specifying the point of view and the particular aspect of the climate that is being improved. Phrases such as “RCMs add value to GCMs” are too vague to be of use, since this can be true or false depending on the point of view, the variable or the time scale considered.

As has become clear in recent years, RCMs will have to show their worth not only against the driving data, but also against less expensive alternatives such as statistical downscaling techniques [23]. One of the risks of remaining within a simplistic quantitative approach to the evaluation of AV is that RCMs will face a tougher fight on this territory. It is easy to see, for example, how surface temperature can be reasonably downscaled by simply following what is done to correct observational datasets for height variations [82]. The advantages and the eventual superiority of dynamical downscaling approaches will become more evident when exploiting their power to reproduce processes by means of a physically meaningful causal chain. It is from local interactions (defined by differential equations) that physics sets the conditions for the emergence of weather and climate variability at all spatial and temporal scales. The great story behind climate models is that they reproduce this causal chain with surprising precision. We believe that more meaningful added value may be found by exploring conditions conducive to particular weather and climate events than by focusing on simple statistics.
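To make concrete how inexpensive such a height correction is, the sketch below adjusts a coarse-grid surface temperature to the elevation of a finer grid with a constant lapse rate. It is not necessarily the procedure of [82]: the function name is hypothetical, and the lapse rate of 6.5 K per km is the standard-atmosphere value, assumed here for illustration only.

```python
# Assumed constant lapse rate (standard-atmosphere value), in K per metre.
LAPSE_RATE_K_PER_M = 0.0065

def height_corrected_temperature(t_coarse_c, z_coarse_m, z_fine_m,
                                 lapse_rate=LAPSE_RATE_K_PER_M):
    """Adjust a coarse-grid temperature (deg C) from the coarse-grid elevation
    to a fine-grid elevation, cooling with height at a constant lapse rate."""
    return t_coarse_c - lapse_rate * (z_fine_m - z_coarse_m)

# One coarse cell at 500 m mean elevation containing three fine cells.
t_coarse = 10.0
z_coarse = 500.0
z_fine = [200.0, 500.0, 1500.0]

t_fine = [height_corrected_temperature(t_coarse, z_coarse, z) for z in z_fine]
print([round(t, 2) for t in t_fine])
```

A correction of this kind captures the stationary, elevation-driven part of the temperature field at negligible cost, which is why the case for dynamical downscaling rests more on process representation than on simple statistics of this type.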

As discussed throughout this manuscript, an important part of the AV generated by RCMs, particularly when looking at future climate projections, is related to the conjectural AV. The conjectural AV depends on a large body of theoretical and/or empirical evidence that supports a generalization of the AV. This means that research showing systematic improvements in some metrics or phenomena can be very helpful in indicating particular processes and metrics for which we can expect AV.