WRF4G: WRF experiment management made simple


Introduction
Numerical simulation experiments for weather research and operational purposes are among the most computationally demanding applications due to both the huge amount of data involved and the enormous requirements of computing power. Numerical climate models, in addition, require very long simulation times. The management of a weather or climate numerical simulation experiment is a challenging task. This task is often carried out in an ad hoc manner, which cannot be easily adapted to other types of experiments or even to a different computer system (Redler et al., 2012). This work presents the implementation and use of a framework, WRF4G, to manage the experiment workflow of a particular numerical weather and climate model, the Weather Research and Forecasting (WRF) modelling system. WRF4G provides flexible design, execution and monitoring for a general class of scientific experiments. It has been designed with the aim of facilitating the management and reproducibility of complex experiments. The use of parametrized templates for the experiment definition makes WRF4G well suited to all kinds of parametrized studies. Furthermore, the concepts behind the design of this framework can be straightforwardly extended to other models (Fernández-Quiruelas et al., 2011).
The WRF modelling system (Skamarock et al., 2008) is one of the most popular mesoscale atmospheric models, widely used by the weather and climate communities. WRF is developed by the National Center for Atmospheric Research (NCAR) in collaboration with several agencies and universities in the US. Furthermore, WRF is a community model in the public domain and, therefore, benefits from the contributions of an active community (currently, there are more than 25 000 registered users worldwide). Unlike other application-oriented models, WRF provides a flexible and computationally efficient framework which allows solving a variety of atmosphere simulation problems across different time scales, from weather forecast to climate change projection. Section 2 provides a brief description of the components of this modelling system, which are essentially shared by other mesoscale atmospheric models.
From a computational point of view, climate models are complex applications formed by several components (including pre- and post-processing modules) which are executed sequentially, exchanging information through configuration files and data sets. An example of this "model workflow" is represented in Fig. 1a for the WRF modelling system. The execution of this workflow is a time-consuming and error-prone task, since several configuration files need to be adjusted for the particular simulation to be carried out. Therefore, WRF users typically develop ad hoc in-house scripts to execute their simulations in their computing infrastructure (workstation, cluster, ...).
Weather and climate simulation experiments often require more than a single simulation. Additionally, computational constraints, such as file size or computing wall-time restrictions, may force single-simulation experiments to be split into dependent pieces. Therefore, the model workflow usually needs to be executed several (sometimes many) times and the simulations involved in the experiment may depend on each other, giving rise to an "experiment workflow". Section 3 provides several common examples of simulation experiments found in the literature. The management of this experiment workflow is, again, a time-consuming task which requires an execution/monitoring system to abstract and automate the process. This kind of system is usually referred to as a workflow execution framework.
There are only a few examples of these workflow execution frameworks in the literature. Since the needs are common to all models, the approach taken by these frameworks is essentially the same. They provide mechanisms to easily configure, run and monitor different kinds of experiments with a given model, usually on HPC environments. The IC3 autosubmit and WRFPortal allow users to run the EC-Earth and WRF models, respectively.
Other experiment workflows incorporate data pre- and post-processing as well. However, these frameworks are devised to work with specific resources (in most cases those used at the developers' institution) and cannot be exported to others. For instance, the LEAD (now XSEDE) Portal (Christie and Marru, 2007), which runs the WRF, ARPS and DAS models, is a science gateway that uses TeraGrid resources; FRE (Redler et al., 2012), which manages the EMS model, offers support for NOAA HPC resources; and the ecFlow framework is available only at the European Center (ECMWF model).
The efficient management of the simulations of a numerical climate modelling experiment requires a change not only in the application workflow, but also in the user's point of view (Fernández-Quiruelas et al., 2011). The design of different kinds of experiments using the same framework needs a way to describe the experiment in terms of a few parameters. The dependencies among simulations introduce new concepts, which are described in Sect. 4. These concepts guide the configuration of the experiment, which is implemented in two separate configuration files: one for the computer resources and the other for the scientific experiment to be carried out. Some configuration examples are provided in Sect. 6, covering those experiment workflows introduced in Sect. 3. As the number of simulations increases, users may need to use additional computer resources and there is an additional effort in configuring the model scripts to execute in the new infrastructures. Furthermore, if the user wants to use several computer infrastructures at the same time, new services are needed to schedule and dispatch the jobs according to the status of the resources. Section 5 shows how new computing and data resources can be easily added to WRF4G. In a recent work, Fernández-Quiruelas et al. (2011) describe the challenges and limitations posed by climate application workflows on distributed computing infrastructures. Fernandez-Quiruelas et al. (2015) provide specific details of the implementation of WRF4G. Therefore, in this work, we focus on the user interface and the new concepts required to design different experiments in a flexible way. The reader interested in a detailed description of the computational implementation, including the ability of WRF4G to distribute the load across distributed computer infrastructures, is referred to Fernandez-Quiruelas et al. (2015). WRF4G is open-source, publicly available software that can be downloaded from the Santander Meteorology Group website.

WRF model workflow
The WRF model workflow (Fig. 1a) consists of a sequence of steps which take initial and boundary conditions from coarse global models and high-resolution static data (orography, land use, ...) and produce high-resolution meteorological fields consistent with these forcings.
The sequence is as follows. Two programs need to be run initially, in any order. Using geogrid, the user defines the model spatial domain, choosing a horizontal resolution and a grid projection, size and location. This program interpolates high-resolution static data to this domain and stores the information in a NetCDF-formatted file (geo_em).
Using ungrib, the coarse global meteorological data in GRIB format are converted to a binary WRF intermediate format. This step can be replaced by a home-made program as long as it produces the intermediate format files. The ungrib program is just an example to get these files out of GRIB-formatted files.
Intermediate files are still in the original global meteorological data resolution. The next step requires both the intermediate and geo_em files. Using metgrid, the information of these two kinds of files is merged and the coarse global meteorological data are interpolated to the model domain.
WRF discretizes the vertical dimension into hybrid terrain-following eta levels. The interpolation in the vertical from the native levels of the meteorological data to the eta levels is performed by a program called real, which produces the initial and boundary conditions in the format (NetCDF with WRF metadata) required by the numerical core of the model. This numerical core (wrf) is the last step in the model workflow and solves the dynamics and physics of the atmosphere, producing high-resolution meteorological data. The programs geogrid, ungrib and metgrid are known as the WRF Preprocessing System (WPS) and are configured using a common file (namelist.wps). Likewise, the real and wrf programs are configured using a different, but common, file (namelist.input). This model workflow does not only apply to WRF, but also to many other mesoscale or regional climate models. For instance, in MM5 (Grell et al., 1995) the steps are terrain (similar to geogrid), regrid (ungrib), interpf (metgrid + real) and mm5 (wrf). In RegCM4 (Giorgi et al., 2012), there is also an equivalence with the programs terrain (geogrid), icbc (metgrid + real) and regcm (wrf).
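The ordering constraints of the model workflow described above can be written down as a small dependency graph. The following is a minimal sketch, not part of WRF or WRF4G: the dictionary `WORKFLOW` and the function `execution_order` are illustrative names, and only the step names and their dependencies are taken from the text.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step maps to the set of steps whose output it needs.
# geogrid and ungrib have no prerequisites, so they may run in any order.
WORKFLOW = {
    "geogrid": set(),
    "ungrib": set(),
    "metgrid": {"geogrid", "ungrib"},
    "real": {"metgrid"},
    "wrf": {"real"},
}


def execution_order(graph):
    """Return one valid sequential order of the workflow steps."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(WORKFLOW)
print(order)
```

Any order returned this way runs metgrid only after both geogrid and ungrib, and ends with real followed by wrf.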

The simplest atmospheric modeling experiment consists of a single execution of the model workflow (Sect. 2). However, current research studies or operational procedures are seldom based on a single simulation. Instead, several simulations with different dependencies among them are run. This section illustrates three general classes of experiments which are commonly carried out with WRF and other models.

Parametric
In a general sense, a parametric experiment consists of a set of simulations where a given configuration parameter is changed. In this kind of experiment, the sensitivity of the model output to the variation of the parameter is studied for different purposes: process understanding, model improvement, uncertainty assessment, etc.
A particular example is the multi-physics ensemble, where the same dynamical core is run with different physical parameterizations. These parameterizations handle the effects of important small-scale phenomena on the model variables. Processes such as radiative heating/cooling, water condensation, turbulence or soil-atmosphere exchanges are handled by physical parameterizations, which are not unique. The performance of different parameterizations for the same physical process depends on many variables: geographical location, season, meteorological variable of interest, or even the statistic of interest, e.g. mean, variability, extremes, etc. (Fernández et al., 2007; Jerez et al., 2012). Therefore, it is common to run a set of simulations with different schemes, a multi-physics ensemble, before applying the model to a particular region.
There are many recent examples of WRF multi-physics ensembles in the literature (Awan et al., 2011; Evans et al., 2011; Mooney et al., 2013; García-Díez et al., 2012). Physical parameterization changes are not the only parametric experiments with WRF in the literature. Other options such as changes in the number of vertical levels, the model top of the atmosphere, damping options, etc. are also commonly used (Awan et al., 2011).

Hindcast
Short to medium range weather forecasts are routinely produced at different institutions. A few of them produce global forecasts and many others use these global forecasts as boundary conditions for higher-resolution regional forecasts. In order to assess forecast errors, a long sample of previous forecasts with the same system is required. For this purpose, it is common to produce an ensemble of past historical forecasts, which are referred to as hindcasts. The same methodology is applied in seasonal forecasting (Díez et al., 2011).
Hindcasts are also a means of producing past reconstructions of the atmospheric state. Jiménez and Dudhia (2012), García-Díez et al. (2012) and Menéndez et al. (2014) are recent examples of WRF experiments run for this purpose. The frequent initialization of the model using observed data prevents it from drifting away from the observed reality and constrains the internal variability.
In any case, this type of experiment requires the initialization of the model at regular time intervals and it usually involves many simulations. For example, a 20-year hindcast of daily simulations requires the management of more than 7000 simulations.

Climate simulation
Climate simulation requires long records to infer the average behaviour of the atmosphere. The models used are essentially the same as for short-term forecasting, but they are usually coupled to other models dealing with slower components of the climate system (land surface, oceans, etc.). In the case of limited area models, most of these slow components are usually prescribed through the boundary conditions and, therefore, regional climate simulations consist of long, continuous simulations using the same model as in the previous experiment types.
There are also many recent studies using WRF to produce regional climate simulations over different parts of the world (Nikulin et al., 2012; Mooney et al., 2013; Argüeso et al., 2012a, b; Cardoso et al., 2012; Jiménez-Guerrero et al., 2013; Vautard et al., 2013). Most of them deal with the model's ability to represent the observed climate and to project future changes at regional scale, in response to increased concentrations of greenhouse gases. Regional climate simulations can occupy a computer resource for months to produce several decades of simulated atmospheric states. This results in huge output data sets.
Furthermore, since regional models require not only initial, but also boundary conditions along the whole simulated period, the input files for this type of experiment are also very large. Since a computer resource can hardly be available uninterruptedly for several months, WRF (and all models) can be restarted from special files, which are created according to the user's needs. Therefore, the climate simulation can be divided into parts but, unlike the previous experiment type, each part depends on the previous one, which produces the restart file required to run the next.

Understanding WRF4G: new concepts
The execution of a single simulation with WRF is vulnerable to many errors and problems, such as missing or corrupt input data, errors editing the namelist files, etc. These problems can make the model crash in any step of the workflow or, even worse, let the model work with an improper configuration or input data that might affect the scientific results. In the case of ensembles or multi-parameter experiments, if the simulations are handled separately, differences in the configuration of the members can easily creep in, which leads to errors in the output.
As the number of simulations of an experiment increases, users may need to use more computer resources. The effort of configuring new infrastructures to run WRF experiments is very high. It involves developing scripts that fit the new infrastructure characteristics (batch system, parallel environment, data repository, ...). Furthermore, if the user wants to use several computing infrastructures at the same time, new challenges have to be faced (Fernandez-Quiruelas et al., 2015).
WRF4G provides easy handling and monitoring of WRF experiments and hides the complexity of leveraging several computing resources at the same time. To understand how WRF4G works and how to run experiments with it, it is necessary to get familiar with the concepts it has been developed on.

Experiment components
In WRF4G, a set of individual simulations which try to answer one or more specific scientific questions is called an "experiment". We call each of these individual simulations a "realization". From the point of view of an automatic workflow, most of the interesting experiments can be defined by a parametrized namelist file. Thus, experiment.wrf4g contains the parameter ranges of the individual namelist files corresponding to the different realizations forming the experiment.
WRF is able to produce restarts, which are large NetCDF files containing all the data that WRF needs to start running from a given model timestep as it was during the simulation (as a kind of memory dump). When a simulation stops, WRF can continue the simulation from the last restart file, so there is no need to start from the beginning.
Even the WPS and real steps are skipped when starting from a restart file. This capability is essential to perform long, continuous simulations, with weeks of walltime, and enables the use of WRF as a Regional Climate Model.
When a realization covers a very long period of time, which is the case of climate change projections, dealing with restarts is not enough. The boundary files with decades of data are huge and may cause storage space problems. Also, some file systems might have problems handling very large files. To solve this, the concept of "chunk" is defined in WRF4G. Realizations can be divided into chunks, which are consecutive, dependent WRF simulations, each reading the last restart file generated by the previous simulation (chunk) in the chain.
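The chunk concept can be illustrated with a short sketch that splits the time span of one realization into consecutive pieces, each depending on the one before it. This is only an illustration under stated assumptions: the `Chunk` class, the `split_into_chunks` function and the example dates are hypothetical and do not reproduce the WRF4G implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Chunk:
    index: int
    start: datetime
    end: datetime
    depends_on: Optional[int]  # index of the chunk that writes the restart file


def split_into_chunks(start: datetime, end: datetime, chunk_size_h: int) -> List[Chunk]:
    """Split [start, end) into consecutive chunks of at most chunk_size_h hours."""
    step = timedelta(hours=chunk_size_h)
    chunks: List[Chunk] = []
    current = start
    while current < end:
        chunk_end = min(current + step, end)  # the last chunk may be shorter
        previous = chunks[-1].index if chunks else None
        chunks.append(Chunk(len(chunks) + 1, current, chunk_end, previous))
        current = chunk_end
    return chunks


# Hypothetical 30-day realization split into 14-day chunks.
chunks = split_into_chunks(datetime(2000, 1, 1), datetime(2000, 1, 31), 14 * 24)
for c in chunks:
    print(c.index, c.start.date(), c.end.date(), c.depends_on)
```

Only the first chunk has no predecessor; every other chunk can start only once the previous one has produced its restart file.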


WRF execution workflow
When a single computing resource is used to run an experiment, data is usually stored locally and data transfer time is negligible compared to the overall job execution time.
When more than a single computing resource is used, data has to be accessible from all computing resources. If data repositories and computing resources are not connected through a fast network, data transfers become a bottleneck. To minimize data transfers, WRF4G integrates the postprocessing step (WRF output is filtered to reduce its size) after the simulation execution in the computing node. For the same reason, the geogrid program is not run inside the WRF workflow of WRF4G (see Fig. 1b).
The output generated by geogrid can be recycled in different model runs, as the geographical data are independent of the driving model, time range covered and model configuration (except the resolution). These data must be produced by the user and provided to WRF4G to specify the model spatial domain.
A preprocessor step has also been included in the WRF4G workflow to convert the input data used as initial and boundary conditions to a format readable by ungrib.
The WRF workflow execution in the computing node is driven by a wrapper script in charge of preparing the environment and orchestrating the run. The wrapper contains a monitor that tracks the events that occur during the model execution and records them in a central database. This wrapper also uploads the restart and post-processed output files to the data repositories as they are being produced.

Computing and storage resources
Today, most researchers have access to several clusters and grid infrastructures and can rent on-demand cloud resources to temporarily solve peak workloads. The distributed nature of these infrastructures complicates tasks such as the monitoring and debugging of applications, and the interoperability among the computing resources is also an issue.
WRF4G hides this complexity and facilitates the use of several computing infrastructures at the same time. Resource management is performed transparently and in a centralized manner from the computer where WRF4G is installed (see Fig. 2). This computer will be referred to as the WRF4G User Interface (WUI) in the following. Users can configure the WRF4G computing resources (WCR) where they want to run their experiments by listing them in a WUI configuration file. Each resource is defined by the protocol used to access it (ssh, gsissh or local) and by its resource manager (none, sge, pbs, lsf, slurm, loadleveler, globus or cream). Section 5.2.2 describes in detail how computing resources are configured in WRF4G.
The experiment chunks can be run in any of the configured resources or just in a subset of them. Filters and scheduling policies based on the resources' characteristics (operating system, CPU speed, disk quotas, etc.) can be established. The only software requirement to run WRF4G in a WCR is the Python programming language, which is commonly available. Nothing else needs to be installed or configured.
The location of the executables and libraries required to run WRF can be customized in the experiment configuration file for each computing resource. The serial, MPI, OpenMP and hybrid (MPI-OpenMP) execution environments of WRF are supported.
If the location of the executables and libraries required to run WRF is not supplied, a precompiled binary is transferred to the computing nodes during the environment preparation. The precompiled binary comes along with OpenMPI.
The domains, input and boundary data and the precompiled WRF binaries can be stored in different storage resources and accessed through different protocols. These resources will be referred to as WRF4G Data Resources (WDR). Section 5.2.1 shows how the data management is performed with WRF4G.

WRF4G framework
WRF4G is a tool that simplifies the execution of atmospheric numerical simulation experiments with WRF. It provides full control of the configuration of the simulations and [...]
WRF4G is open-source, publicly available software that can be downloaded from the Santander Meteorology Group website and deployed in any supported Linux system. Again, the only requirement for the WUI is Python. In order to simplify the learning process, WRF4G provides some examples ready to run in the WUI. In the default WRF4G configuration, the WUI also acts as a WCR and as a WDR. A small repository with the input and boundary datasets needed to run the examples, together with a precompiled MPI version of WRF, is included in the installation. Thus, users can get familiar with WRF4G by running small experiments on their PC.
WRF4G runs a database that persists all the experiment information and status (namelist files, toolchain used to create the executables, preprocessor, ...), thus facilitating the reproducibility of the results. Although the database used by default is the MySQL instance installed with WRF4G, other instances of MySQL or any other relational database could be used. With the information stored in the database, the status of the experiment execution can be monitored in real time.
WRF4G is layered to separate the experiment design from the execution environment. An atmospheric simulation experiment is defined through two configuration files: one contains the scientific configuration of the experiment (start and end dates, model configuration, experiment setup, input data, postprocessing to apply, etc.) and the other, the execution environment (number of MPI processes to run, memory required, data repositories, etc.). Below we describe how to configure different kinds of experiments with WRF4G. More information about the WRF4G usage and components can be found on the WRF4G website.

Experiments definition
In WRF4G, the characteristics of an experiment are defined in the file experiment.wrf4g. The main fields to be defined are:
- experiment_name and experiment_description.
- domain_name is the folder containing the information of the domain (namelist.wps and geo_em files).
- extdata_vtable, extdata_preprocessor, extdata_path, extdata_interval: these variables describe the external data to feed the model (variable table to decode the input, the preprocessor to make it readable for ungrib, the location and the time interval between input records).
- postprocessor is the label identifying the postprocessor to be used to filter and reformat the model output.
- start_date and end_date define the time span of the experiment.
- chunk_size_h is the length of the chunks in hours.
- multiple_dates activates the hindcast mode. Many realizations are set up with different starting dates. These starting dates must lie within the interval defined by the previous variables start_date and end_date. Another two variables (simulation_interval_h and simulation_length_h) define the interval between the starting dates and the length of each realization, respectively (see examples in Sect. 6).
- multiple_parameters activates (if set to 1) parametric experiments. The realizations defined by the previous settings will be run with different parameters. The parameters to vary are defined using multiparams_variables, and they can be any parameter of the original namelist.input file. The different parameter combinations are defined with the multiparams_combinations variable, and they can be labeled using multiparams_labels.
Apart from these specifications, any parameter of the namelist can be changed from its default value in a simple way. Section 6 shows a few examples of how the experiment types described in Sect. 3 would be configured.

Resources definition
In the default configuration of WRF4G, the only computer resource available to run experiments is the computer where WRF4G is installed. This computer is also used as a storage resource. All the example experiments use the "/repository" folder, under the default installation location, as storage repository.
Below, we describe how other computing and data resources can be added.

Storage resources
The execution of each experiment requires several files: the input and boundary condition data, the domains, the preprocessor and postprocessor, etc. WRF4G distinguishes four kinds of data sources:
- WRF4G_APPS: location where all the applications required to run the WRF workflow in the computing resources can be stored. The data repository installed by default contains some pre- and post-processors, the NetCDF library and a precompiled MPI version of WRF.
- WRF4G_DOMAINPATH: location of the geo_em NetCDF files of the domains for this experiment. As shown in Sect. 4.2, before running an experiment, users have to run geogrid manually and upload the resulting files to this repository.
This information is configured for each experiment in the resources.wrf4g file. Each of these data sources can be located in a different server and exposed through different protocols (see Fig. 2). To date, the following URL protocols are supported: file, rsync, sftp, http and gridftp (Allcock et al., 2005). WRF4G performs the data transfers between the data repositories and the computing resources in a way that is transparent to the users. Note that local copies (file) should only be considered when the simulations are run in the same node where the repository resides.
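As an illustration of the protocol rules above, the following minimal sketch checks whether a data source URL uses one of the listed schemes and applies the caveat about local copies. The function `check_data_url`, its `same_node` argument and the example host names are assumptions made for this example; they are not part of WRF4G.

```python
from urllib.parse import urlparse

# URL schemes listed in the text as supported to date.
SUPPORTED_SCHEMES = {"file", "rsync", "sftp", "http", "gridftp"}


def check_data_url(url, same_node=False):
    """Validate a data source URL and return (scheme, host, path).

    same_node tells whether the simulation runs on the node that
    holds the repository (hypothetical flag for this sketch).
    """
    parts = urlparse(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported protocol: {parts.scheme!r}")
    if parts.scheme == "file" and not same_node:
        raise ValueError("file:// is only suitable when running on the repository node")
    return parts.scheme, parts.netloc, parts.path


# Hypothetical repository locations.
remote = check_data_url("sftp://data.example.org/wrf4g/domains")
local = check_data_url("file:///home/user/repository/domains", same_node=True)
print(remote)
print(local)
```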

Computing resources
WRF4G is not designed with the purpose of submitting jobs to a specific type of WCR. Instead, researchers can define a wide range of resources. WRF4G is a general-purpose framework that manages different types of CRs. Thus, an experiment may be performed on laptops, desktops, workstations, clusters, supercomputers, clouds and grids, all of which are configured in a file called framework4g.conf. This resource configuration file consists of sections, each led by a [section] header, followed by key=value entries. This is illustrated in Fig. 3. This example shows the framework4g.conf content of a WUI installed on a workstation. Specifically, it defines three CRs: my_workstation, where the WUI has been installed, and two clusters accessed through the SSH protocol (pbs_cluster and slurm_cluster). CRs in WRF4G are characterized by the protocol used to access resources (communicator) and the resource manager used to handle jobs (lrms). These keys are always common for each resource. Note that in the case of an ssh communicator, a username and a frontend for the connection must be provided. Usually, this information is not enough to define a WCR. Thus, a resource can contain other keys, such as queue, max_jobs_running and max_jobs_in_queue. The combination of them is used by the job scheduler to sort the resources. Several other attributes are allowed, which are not present in this example. More information about them can be found on the WRF4G wiki.
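Since the resource file is described as [section] headers followed by key=value entries, its structure can be sketched with Python's standard configparser module. The sample below is not the content of Fig. 3: the section and key names come from the text, but every value (user name, host, queue, limits) and the `load_resources` helper are hypothetical.

```python
import configparser

# Hypothetical file content; only the key names are taken from the text.
SAMPLE = """
[my_workstation]
communicator = local
lrms = none

[pbs_cluster]
communicator = ssh
username = user
frontend = pbs.example.org
lrms = pbs
queue = short
max_jobs_running = 10
max_jobs_in_queue = 20
"""


def load_resources(text):
    """Parse the sections and check the keys the text says are required."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    resources = {}
    for name in parser.sections():
        entry = dict(parser[name])
        required = ["communicator", "lrms"]
        if entry.get("communicator") == "ssh":
            required += ["username", "frontend"]  # needed for ssh access
        missing = [key for key in required if key not in entry]
        if missing:
            raise ValueError(f"resource {name!r} lacks keys: {missing}")
        resources[name] = entry
    return resources


resources = load_resources(SAMPLE)
for name, entry in resources.items():
    print(name, entry["communicator"], entry["lrms"])
```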
In order to list CRs, their features (e.g. operating system, architecture, queues, etc.) and their status (number of running jobs, number of jobs in queue, etc.), WRF4G provides a command named wrf4g_resources (a sample output from this command is shown in Fig. 4).
Additionally, when an experiment is submitted, a scheduler selects the best-fitted resource according to a default scheduling policy. This default scheduling policy can be modified for each experiment by configuring the variables REQUIREMENTS and RANK in the resources.wrf4g file. Both are mathematical expressions that are evaluated for each resource. REQUIREMENTS determines which resources are considered for submitting an experiment, whereas RANK sorts the candidates filtered by REQUIREMENTS. Figure 5 shows an example of both expressions.
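The two-stage logic of REQUIREMENTS (filter) and RANK (sort) can be sketched as follows. This is not the WRF4G expression syntax, which is shown in Fig. 5: here both expressions are plain Python callables, the resource attributes (`arch`, `cpu_mhz`, `free_slots`) are invented for the example, and the sketch assumes that a higher rank value means a preferred resource.

```python
def select_resources(resources, requirements, rank):
    """Keep the resources that satisfy requirements, best rank first."""
    candidates = [res for res in resources if requirements(res)]
    # Assumption: larger rank values are preferred.
    return sorted(candidates, key=rank, reverse=True)


# Hypothetical resource descriptions.
resources = [
    {"name": "my_workstation", "arch": "x86_64", "cpu_mhz": 2400, "free_slots": 4},
    {"name": "pbs_cluster", "arch": "x86_64", "cpu_mhz": 2000, "free_slots": 64},
    {"name": "old_cluster", "arch": "i686", "cpu_mhz": 3000, "free_slots": 128},
]

ranked = select_resources(
    resources,
    requirements=lambda res: res["arch"] == "x86_64" and res["free_slots"] >= 4,
    rank=lambda res: res["cpu_mhz"],
)
names = [res["name"] for res in ranked]
print(names)
```

In this example old_cluster is discarded by the filter even though it would rank first, which shows why the two expressions are evaluated in sequence.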
Finally, researchers can also define the number of MPI processes used to run WRF (NP). If NP is set to 1, the serial version of WRF will be run; otherwise, MPI will be used. The default WRF executables provided by WRF4G have been compiled against the OpenMPI libraries.

Managing experiments with WRF4G
Once the configuration files defining the experiments have been created (experiment.wrf4g and resources.wrf4g), the user is ready to prepare the WRF4G environment to run the experiment using the wrf4g_prepare command. In this phase, the experiment.wrf4g configuration is analysed and the details about the resulting realizations and chunks are recorded in the database. WRF configuration files for each of the chunks are also created and transferred to the data repositories pointed to by the "WRF4G_BASEPATH" variable in the resources.wrf4g file.
Then, users can submit the whole experiment (by default) or just some realizations by running the wrf4g_submit command. In this phase, the database is queried to obtain a list of the chunks that need to be run. These chunks are efficiently scheduled to the computing resources taking into consideration the dependencies among them. The experiment (or just some realizations) can be killed at any moment using wrf4g_kill and its status can be monitored with wrf4g_status.
Figure 6 shows the output of wrf4g_status when the second chunk of an experiment with a single realization is downloading the boundary conditions.
The granularity of the wrf4g commands allows realizations and chunks to be handled separately. In this way, if, for example, one output file generated by a chunk is corrupt, we can rerun just this chunk. This can be done because the restart file associated with the previous chunk is stored in the repository.
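The dependency rule that governs submission (a chunk can run only after the previous chunk of the same realization has finished) can be sketched as a small selection function. The state names, the `status` dictionary layout and the function `runnable_chunks` are hypothetical; the actual scheduling in WRF4G relies on its database and is described in Fernandez-Quiruelas et al. (2015).

```python
def runnable_chunks(status):
    """Return the chunks that can be submitted now.

    status maps (realization, chunk_index) to one of the hypothetical
    states 'pending', 'running', 'done' or 'failed'; chunk indices start at 1.
    """
    ready = []
    for (realization, index), state in sorted(status.items()):
        if state not in ("pending", "failed"):
            continue  # already running or finished
        previous = status.get((realization, index - 1))
        # The first chunk has no predecessor; the others need its restart file.
        if index == 1 or previous == "done":
            ready.append((realization, index))
    return ready


status = {
    ("rea_a", 1): "done",
    ("rea_a", 2): "failed",   # e.g. corrupt output, to be rerun
    ("rea_a", 3): "pending",
    ("rea_b", 1): "pending",
    ("rea_b", 2): "pending",
}
ready = runnable_chunks(status)
print(ready)
```

Chunks of different realizations are independent, so the failed chunk of rea_a and the first chunk of rea_b can be submitted at the same time.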
6 Use cases

Parametric
As previously explained, a parametric experiment is a set of simulations identical except for a parameter, or a parameterization, that is changed inside the model. García-Díez et al. (2014) is an example of this kind of experiment. These authors simulated the Euro-CORDEX domain with 6 different combinations of parameterizations, and compared the results with observations. The novelty of the study is that, apart from temperature and precipitation, they evaluated seldom-studied variables such as radiative fluxes and soil moisture. As shown in Fig. 7, the ensemble was easily designed by using the variable "multiparams_combinations" to choose the values of the namelist variables defined in "multiparams_variables". In this case, the cumulus, radiation and microphysics parameterizations are varied, but any parameter in the WRF namelist can be changed.
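The way a list of varied variables and a list of value combinations define one realization per combination can be sketched as follows. This does not reproduce the configuration of Fig. 7 or the syntax of experiment.wrf4g: the function `expand_combinations`, the labels and the option values are illustrative, and only three combinations are shown.

```python
def expand_combinations(variables, combinations, labels=None):
    """Map each label to the namelist overrides of one realization."""
    if labels is None:
        # Default label built from the values themselves (assumption).
        labels = ["_".join(str(value) for value in combo) for combo in combinations]
    if len(labels) != len(combinations):
        raise ValueError("one label per combination is required")
    realizations = {}
    for label, combo in zip(labels, combinations):
        if len(combo) != len(variables):
            raise ValueError(f"combination {combo} does not match {variables}")
        realizations[label] = dict(zip(variables, combo))
    return realizations


# Illustrative cumulus, longwave radiation and microphysics options.
variables = ["cu_physics", "ra_lw_physics", "mp_physics"]
combinations = [(1, 1, 4), (2, 1, 4), (1, 4, 6)]
labels = ["phys_a", "phys_b", "phys_c"]

realizations = expand_combinations(variables, combinations, labels)
for label, overrides in realizations.items():
    print(label, overrides)
```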


Hindcast
In a hindcast experiment, the regional model is restarted frequently from the global model, running retrospective forecasts or re-forecasts. Menéndez et al. (2014) compared this running scheme with two other techniques, known as spectral nudging and grid nudging. They found that the re-forecast scheme was the most accurate, as well as the computationally cheapest. Thus, this scheme was used to produce a high-resolution regional reanalysis over the Mediterranean Sea, covering 20 years. This experiment was configured in WRF4G following Fig. 8. After setting the variable "multiple_dates" to one, "simulation_interval_h = 24" and "simulation_length = 42" tell WRF4G to start the model daily and to run it for 42 h. Thus, in this case, we configured WRF4G to run 7305 individual simulations, one per day of the 20-year period. These are independent and can be run in parallel.
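The count of 7305 simulations follows directly from starting one run per day over 20 years. The Python sketch below reproduces this arithmetic; the period boundaries (1 January 1989 to 1 January 2009) are hypothetical and chosen only because they span 20 years, and hindcast_runs is an illustrative helper, not a WRF4G function.

```python
from datetime import datetime, timedelta


def hindcast_runs(start, end, interval_h, length_h):
    """List the (start, end) times of each independent hindcast run.

    A new run starts every interval_h hours and lasts length_h hours.
    """
    runs = []
    run_start = start
    while run_start < end:
        runs.append((run_start, run_start + timedelta(hours=length_h)))
        run_start += timedelta(hours=interval_h)
    return runs


# Hypothetical 20-year period, daily starts, 42 h per run.
runs = hindcast_runs(datetime(1989, 1, 1), datetime(2009, 1, 1), 24, 42)
print(len(runs))            # number of independent simulations
print(runs[0][1], runs[1][0])  # consecutive runs overlap by 42 - 24 = 18 h
```

With these assumed dates the period contains 7305 days (20 x 365 plus 5 leap days), so 7305 runs are generated, and each run overlaps the next one by 18 h.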

Climate simulation
Finally, an example of a climate simulation, that is, a continuous simulation with no restarts from the global model and no multi-parameter settings, is the one used in Nikulin et al. (2012) (labelled UC-WRF31). These authors evaluated, for the first time, the precipitation climatology of a regional multi-model ensemble over Africa. Figure 9 shows how WRF4G was configured in this case. As "multiple_dates" is set to zero, WRF4G never restarts WRF from the global model and simply runs continuously for twenty years. In this case, "chunk_size_h" is set to 14 days (expressed in hours). This means that the time interval is divided into 522 individual simulations, each one starting from the last restart file written by the previous one. Thus, in contrast with the hindcast, these are dependent and must be run sequentially.
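The number of chunks can be checked with a short Python sketch that splits a continuous period into consecutive pieces of "chunk_size_h" hours. The dates used (1 January 1989 to 1 January 2009) are a hypothetical 20-year period, and split_into_chunks is an illustrative helper rather than the WRF4G code.

```python
from datetime import datetime, timedelta


def split_into_chunks(start, end, chunk_size_h):
    """Split [start, end) into consecutive chunks of chunk_size_h hours.

    The last chunk is shortened so that it ends exactly at `end`.
    Each chunk starts where the previous one stopped, which is why the
    chunks must be run one after another.
    """
    chunks = []
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + timedelta(hours=chunk_size_h), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end
    return chunks


# Hypothetical 20-year period with 14-day chunks (14 * 24 = 336 h).
chunks = split_into_chunks(datetime(1989, 1, 1), datetime(2009, 1, 1), 14 * 24)
print(len(chunks))
```

With these assumed dates, 7305 days divided into 14-day chunks give 521 full chunks plus a shorter final one, that is, 522 chunks in total.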


Conclusions
In this paper we present a public domain WRF workflow execution framework (WRF4G) that automatically manages the above-mentioned problems by means of execution, monitoring and data management frameworks, which provide transparent access to heterogeneous resources (local clusters, grids, clouds, HPC, etc.). Thus, the definition and execution of the typical experiments commonly undertaken with WRF is greatly simplified. This saves a considerable amount of time and facilitates access to new infrastructures with no additional overhead. The WRF4G framework is based on layers that separate the experiment design from the execution environment. It includes a monitoring and management system to easily restart broken simulations until the experiment is completed, and it can run these experiments concurrently on heterogeneous distributed computing resources in a transparent way. Thus, the capability of WRF4G to harness powerful computing resources, such as grids and supercomputers, allows the WRF community to undertake more ambitious problems with higher impact on society.

means for restarting part or the whole experiment in case of failure. It also provides the ability to reproduce the experiment fully or partially.
- WRF4G_BASEPATH: Location where the files generated by the experiment are saved. These include output files along with log, restart and WPS files.

Figure 1. Schematic representation of the WRF workflow (for a real case experiment). The ellipses represent the executable binaries, and the rectangles the intermediate files produced by them. (a) represents the WRF workflow alone, while (b) represents the WRF workflow as it is embedded in WRF4G. The gray background area is the part of the workflow executed by WRF4G. The part on the white background must be run by the user, who provides its output to WRF4G.

Figure 6. wrf4g_status output when the second chunk of exp1 is downloading the boundary conditions. The columns indicate the realization name (Realization), a unique job identification (JOB_ID), the chunk status (Stat: P = prepared, W = waiting, R = running, D = done), the chunk being run (Chunks: R = running, T = total), the computing resource running the chunk (Comp.Res), the worker node within the WCR (WN), the running stage of the WRF workflow (Run.Sta.), the exit code (ext) and the percentage of the realization completed (%).

Figure 7. A fragment of the experiment.wrf4g file used to produce the simulations in García-Díez et al. (2014).