Comparison of neuron-based, kernel-based, tree-based and curve-based machine learning models for predicting daily reference evapotranspiration

Lifeng Wu; Junliang Fan

doi:10.1371/journal.pone.0217520

Abstract

Accurately predicting reference evapotranspiration (ET₀) with limited climatic data is crucial for irrigation scheduling and agricultural water management. This study evaluated eight machine learning models in four categories, i.e., neuron-based (MLP, GRNN and ANFIS), kernel-based (SVM and KNEA), tree-based (M5Tree and XGBoost) and curve-based (MARS) models, for predicting daily ET₀ with maximum/minimum temperature and precipitation data during 2001–2015 from 14 stations in various climatic regions of China, i.e., arid desert of northwest China (NWC), semi-arid steppe of Inner Mongolia (IM), Qinghai-Tibetan Plateau (QTP), (semi-)humid cold-temperate northeast China (NEC), semi-humid warm-temperate north China (NC), humid subtropical central China (CC) and humid tropical south China (SC). The results showed that machine learning models using only temperature data obtained satisfactory daily ET₀ estimates (on average R² = 0.829, RMSE = 0.718 mm day⁻¹, NRMSE = 0.250 and MAE = 0.508 mm day⁻¹). The prediction accuracy, measured as the average decrease in testing RMSE, improved by 7.6% across China when precipitation information was further considered, particularly in (sub)tropical humid regions (by 9.7% in CC and 12.4% in SC). The kernel-based SVM and KNEA models and the curve-based MARS model generally outperformed the others in terms of prediction accuracy, with the best performance by KNEA in NWC and IM, by SVM in QTP, CC and SC, and very similar performance by the two in NEC and NC. SVM (1.9%), MLP (2.0%), MARS (2.6%) and KNEA (6.4%) showed relatively small average increases in RMSE during testing compared with training. SVM is highly recommended for predicting daily ET₀ across China in light of its accuracy and stability, while KNEA and MARS are also promising models.

Citation: Wu L, Fan J (2019) Comparison of neuron-based, kernel-based, tree-based and curve-based machine learning models for predicting daily reference evapotranspiration. PLoS ONE 14(5): e0217520. https://doi.org/10.1371/journal.pone.0217520

Editor: Paweł Pławiak, Politechnika Krakowska im Tadeusza Kosciuszki, POLAND

Received: March 11, 2019; Accepted: May 13, 2019; Published: May 31, 2019

Copyright: © 2019 Wu, Fan. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The statistical results are within the manuscript and its Supporting Information file. However, raw data underlying the results presented in the study are available from the National Meteorological Information Center (NMIC) of China Meteorological Administration (CMA) (http://data.cma.cn/).

Funding: This study was jointly supported by the Central Public-interest Scientific Institution Basal Research Fund (Farmland Irrigation Research Institute, CAAS, FIRI2018-01) and the National Natural Science Foundation of China (No. 51709144). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Accurate prediction of reference evapotranspiration (ET₀) is important for irrigation scheduling, crop growth modeling and agricultural water management [1–5]. Various mathematical models have been proposed to estimate ET₀ from meteorological variables, among which the FAO-56 Penman–Monteith (FAO-56 PM) equation is recommended by the Food and Agriculture Organization of the United Nations as a reference model in various regions and climates [6], because it considers both the thermodynamic and aerodynamic terms. However, the FAO-56 PM model needs a variety of climatic parameters as inputs, e.g., maximum and minimum ambient temperatures, wind speed, relative humidity and net radiation [7–11], which significantly restricts its application in many regions of the world. Therefore, simplified empirical models with fewer climatic variables are becoming increasingly popular in the absence of complete data [12–15], such as temperature-based models [16], mass transfer-based models [17] and radiation-based models [18]. However, evapotranspiration is a complex and highly nonlinear phenomenon that depends on several climatic parameters, so it is difficult to establish empirical models that account for all of these processes. In recent years, much attention has been paid to alternative techniques such as machine learning models for ET₀ prediction, because of their excellent performance in handling the nonlinear relationship between the model inputs and output [19–21].

The neuron-based machine learning models, i.e., artificial neural networks (ANNs), are the earliest and most widely used models for ET₀ prediction. [22] compared the multi-layer perceptron (MLP) model and (semi)empirical models to estimate daily ET₀ in the Basque Country of Spain with different input combinations, and revealed that the MLP model obtained better ET₀ estimates than the locally calibrated empirical models. [23] also investigated the potential of three ANN models, namely the MLP model, radial basis function neural networks (RBNN) and generalized regression neural networks (GRNN), for estimating ET₀ at two weather stations in the USA using ambient temperatures, relative humidity, wind speed and solar radiation. It was concluded that the MLP and RBNN models attained satisfactory ET₀ estimates. [14] evaluated the potential of four ANN models, including the generalized feedforward (GFF), linear regression (LR), probabilistic neural network (PNN) and MLP models, for short-term ET₀ prediction using forecasted weather data (minimum/maximum temperatures and net solar radiation). It was also found that the MLP model was generally superior to the other three models. [24] assessed the ANFIS and ANN models for predicting daily ET₀ in two locations in South Korea based on temperature, sunshine hours, wind speed and relative humidity. They found that the proposed machine learning models performed well for ET₀ estimation. [25] compared the capability of three ANN models (MLP, RBNN and GRNN) with the adaptive neuro-fuzzy inference system (ANFIS) and gene expression programming (GEP) for monthly ET₀ estimation in two locations in the Mediterranean Region of Turkey with only temperature data. They found that the GRNN and GEP models performed better than the other models at one station, while the performance of the RBNN and ANFIS models was best at the other station.

The kernel-based machine learning models have become increasingly popular for ET₀ prediction. [26] compared the capability of the kernel-based support vector machine (SVM) with the multiple linear regression (MLR), multiple nonlinear regression (MNLR), ANFIS and empirical models for predicting ET₀ in the semi-arid regions of Iran. The results indicated that the SVM and ANFIS models performed better than the other models. [13] predicted daily ET₀ using the SVM model with few climatic variables in arid China. It was found that the SVM model outperformed the ANN and empirical models. [27] evaluated the capabilities of the least-squares support vector machine (LS-SVM), multivariate adaptive regression spline (MARS) and M5 model tree (M5Tree) for the estimation of ET₀ in Mediterranean Turkey, and found that the kernel-based LS-SVM model was superior to the other two models. [28] assessed the potential of the kernel-based extreme learning machine (ELM) to predict daily ET₀ in Iraq, and found that it exhibited good efficiency and generalization performance. [29] estimated daily ET₀ with the ELM and GRNN models based on air temperatures alone in southwestern China. They found that the ELM model was superior to the GRNN model. The kernel-based SVM, LS-SVM and ELM models have also been coupled with pre-treatment or optimization algorithms such as the wavelet transform (WT) [30] and the genetic algorithm (GA) [31] to improve the prediction accuracy. Recently, an improved kernel-based machine learning model, i.e., the kernel-based nonlinear extension of Arps decline model (KNEA), has been developed and successfully applied in various fields [32–33]. However, this new model has not yet been tested in evapotranspiration studies.

The tree-based machine learning models have recently started to draw researchers’ attention due to their relative simplicity and powerful capability in time-series prediction [34]. [35] evaluated the performance of the M5Tree model for estimating daily ET₀ in California, USA, and found that the M5Tree model gave satisfactory ET₀ estimates. [36] evaluated the M5Tree and feedforward ANN models to predict ET₀ in arid regions. It was concluded that ET₀ values predicted by the M5Tree and ANN models agreed well with the FAO-56 PM values. [37] compared the M5Tree and ANN models to predict ET₀ at two sites in the USA. They found that the M5Tree model outperformed the ANN models for estimation of ET₀ when the input and output data at the target station were not available. [38] compared the random forest (RF) with the GRNN model for prediction of daily ET₀ in southwestern China, and concluded that the RF model gave better daily ET₀ estimates than the GRNN model. [39] further used a hybrid RF model with the wavelet algorithm for daily ET₀ estimation in southern Iran. They indicated that the new coupled RF model outperformed the classic RF model. [40] explored the performance of two kernel-based models and four tree-based models for daily ET₀ estimation with limited meteorological data across China. They found that the extreme gradient boosting (XGBoost) and gradient boosting decision tree (GBDT) models exhibited similar accuracy and stability compared with the kernel-based SVM and ELM models.

Other machine learning models, such as the curve-based MARS model, have also been applied to predict ET₀. [27] modeled monthly ET₀ in Mediterranean Turkey using the MARS, LS-SVM and M5Tree models. It was found that the MARS model outperformed the least-squares support vector regression (LSSVR) and M5Tree models. [41] predicted monthly ET₀ in Iran using the MARS, SVM, GEP and empirical models. The results showed that the MARS and SVM-RBF models were generally superior to the GEP and SVM-Poly models. [42] also evaluated the performance of the GEP model and the (semi)empirical models for predicting daily ET₀ in the hyper-arid regions of Iran, and revealed the superiority of the GEP model over the (semi)empirical models for ET₀ estimation.

Although the neuron-based, kernel-based, tree-based and curve-based machine learning models have been widely used to predict ET₀ around the world, their performance is inconsistent across ET₀ studies. In particular, there is still a lack of direct and comprehensive comparison of the various categories of machine learning models for prediction of ET₀ in a specific region or country such as China, which has a vast territory and diverse climates. Because complete climatic variables are often unavailable, the applicability of these machine learning models for estimation of ET₀ with more cheaply and reliably measured meteorological variables (e.g., temperature and precipitation) should be explored. In addition, although most attention is usually paid to prediction accuracy when applying machine learning models, model stability is another major factor to consider because unstable models may produce inaccurate ET₀ estimates when new data are included [34]. Thus, the objectives of this study are to: (1) determine the effects of temperature and precipitation (a variable representing relative humidity to some extent) on the prediction accuracy of daily ET₀ in different climatic zones of China, and (2) compare both the prediction accuracy and model stability of eight machine learning models in four categories (MLP, GRNN, ANFIS, SVM, KNEA, M5Tree, XGBoost and MARS) for predicting daily ET₀ across China as a case study.

Materials and methods

Case study and site description

According to multiple-year mean temperatures, precipitation and altitude (Table 1), China is classified into seven climatic regions (Fig 1), i.e., the arid desert of northwest China (NWC), the semi-arid steppe of Inner Mongolia (IM), the Qinghai-Tibetan Plateau (QTP), the (semi-)humid cold-temperate northeast China (NEC), the semi-humid warm-temperate north China (NC), the humid subtropical central China (CC), and the humid tropical south China (SC) [43–44]. The mean daily air temperature in NWC, IM, QTP, NEC, NC, CC and SC varied between -13.2°C and 26.2°C, between -10.5°C and 21.1°C, between -9.6°C and 19.8°C, between -19.3°C and 25.7°C, between -4.0°C and 28.4°C, between 1.5°C and 30.7°C and between 3.5°C and 35.6°C, respectively. The average annual precipitation was 269 mm in NWC, 302 mm in IM, 382 mm in QTP, 637 mm in NEC, 658 mm in NC, 1538 mm in CC and 1964 mm in SC.

Fig 1. The distribution of different climatic regions across China and the geographical locations of 14 weather stations used in this study.

NWC: arid desert of northwest China, IM: semi-arid steppe of Inner Mongolia, QTP: Qinghai-Tibetan Plateau, NEC: (semi-)humid cold-temperate northeast China, NC: semi-humid warm-temperate north China, CC: humid subtropical central China, and SC: humid tropical south China; the South China Sea Islands are presented in the bottom right-hand corner; the same below.

https://doi.org/10.1371/journal.pone.0217520.g001

Table 1. The geographical locations and annual mean values (± standard deviation) of meteorological data during 2001–2015 for each of the 14 weather stations used in the present study.

https://doi.org/10.1371/journal.pone.0217520.t001

Data collection and analysis

Long-term daily maximum (T_max) and minimum (T_min) temperatures, relative humidity (H_r), wind speed (U) and horizontal global solar radiation (R_s) from 2001–2015 were obtained at 14 representative stations across various climatic regions of China (Fig 1). The geographical locations and meteorological values of the 14 stations are presented in Table 1. These meteorological data are provided by the National Meteorological Information Center (NMIC) of China Meteorological Administration (CMA), which has rigorously examined the data quality. Days on which any of the above meteorological variables were missing were excluded. Overall, missing data accounted for only 0.08% of the database, ranging from 0 to 0.53% at the individual weather stations.

FAO-56 Penman–Monteith equation

The FAO-56 Penman–Monteith model was used to calculate daily ET₀ (mm day⁻¹) and to provide the reference data for training and testing the machine learning models in this paper:

$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T_{mean} + 273}\,U_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,U_2)} \qquad (1)$$

where R_n: net radiation (MJ m⁻² day⁻¹); G: soil heat flux (MJ m⁻² day⁻¹); T_mean: average ambient temperature (°C); U₂: wind speed at 2 m height (m s⁻¹); e_s: saturation vapor pressure (kPa); e_a: actual vapor pressure (kPa); Δ: slope of the vapor pressure curve (kPa °C⁻¹); γ: psychrometric constant (kPa °C⁻¹). Detailed calculation procedures can be found in [6].
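As a quick illustration of how the reference values could be computed once the intermediate quantities are available, the following minimal Python sketch evaluates the standard FAO-56 Penman–Monteith expression with the symbols defined above. The function name and the input values are illustrative assumptions and are not taken from the stations in this study; the intermediate quantities (R_n, e_s, e_a, Δ, γ) are assumed to have been computed beforehand following [6].

```python
def fao56_pm_et0(rn, g, t_mean, u2, es, ea, delta, gamma):
    """Daily reference evapotranspiration (mm/day), FAO-56 Penman-Monteith form.

    rn, g   : net radiation and soil heat flux (MJ m-2 day-1)
    t_mean  : mean air temperature (deg C)
    u2      : wind speed at 2 m height (m s-1)
    es, ea  : saturation and actual vapor pressure (kPa)
    delta   : slope of the vapor pressure curve (kPa degC-1)
    gamma   : psychrometric constant (kPa degC-1)
    """
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (hypothetical day, not data from this study).
et0 = fao56_pm_et0(rn=13.28, g=0.0, t_mean=16.9, u2=2.078,
                   es=1.997, ea=1.409, delta=0.122, gamma=0.0666)
print(f"ET0 = {et0:.2f} mm/day")
```

For daily time steps the soil heat flux G is commonly taken as zero, which is why it is set to 0.0 in the example call.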

Machine learning models for predicting daily reference evapotranspiration

Multilayer perceptron neural networks (MLP).

The MLP model is one of the most widely used ANN models and is a feed-forward neural network for nonlinear function approximation. The MLP model consists of three kinds of layers: the input, hidden and output layers. Usually a single hidden layer is used, and its number of neurons has to be determined by trial and error. In the present study, a three-layer neural network was developed, in which the number of neurons in the input layer was equal to the number of inputs and the output layer had one neuron. The MLP model was trained by the Levenberg–Marquardt algorithm, which interpolates between the Gauss–Newton algorithm (GNA) and the gradient descent algorithm. It is more robust than the GNA but can still become trapped in local, rather than global, minima. Further details on the MLP model can be found in [4].

Generalized regression neural network (GRNN).

The GRNN model was proposed by [45] and is one of the radial basis function (RBF) neural network models. This model can approximate a nonlinear function between the input and output vectors with a function estimate obtained directly from the training dataset. It has a parallel structure and, unlike the back-propagation method, requires no iterative training procedure to learn the relationship between the inputs and outputs. Further details on the GRNN model are given by [46].
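The non-iterative character of the GRNN can be seen in a minimal sketch. The estimate for a query point is a Gaussian-weighted average of the training targets, so "training" amounts to storing the data. The function name, the spread parameter `sigma` and the toy data below are illustrative assumptions, not the configuration used in this study.

```python
import math


def grnn_predict(x_train, y_train, x_query, sigma=0.5):
    """Predict one target as a Gaussian-weighted average of training targets.

    x_train : list of input tuples; y_train : list of targets;
    x_query : one input tuple; sigma : kernel spread (hypothetical default).
    """
    weights = []
    for x in x_train:
        dist2 = sum((a - b) ** 2 for a, b in zip(x, x_query))
        weights.append(math.exp(-dist2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all training points for this sigma")
    return sum(w * y for w, y in zip(weights, y_train)) / total


# Toy one-dimensional data (hypothetical).
x_train = [(0.0,), (1.0,), (2.0,)]
y_train = [0.0, 1.0, 2.0]
print(grnn_predict(x_train, y_train, (0.5,), sigma=0.5))
```

The only free parameter in this sketch is the spread `sigma`: small values make the estimate follow the nearest training points closely, while large values smooth it toward the overall mean of the targets.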

Adaptive neuro-fuzzy inference system (ANFIS).

The ANFIS model was proposed by [47] and is a multi-layer adaptive network that couples neural networks with a fuzzy inference system. The first-order Sugeno fuzzy model with two fuzzy if-then rules was applied in the ANFIS model to approximate the nonlinear function in this study. The ANFIS model consists of five layers: the fuzzification, product, normalization, de-fuzzification and output layers. The model uses different node functions to learn and adjust the parameters of the fuzzy inference system, where forward and backward passes are applied to decrease the computed errors. More details about the ANFIS model are given by [47].

M5 model tree (M5Tree).

The M5Tree model was first established by [48] and is a powerful learning method for estimating continuous values in a large dataset. It has a series of linear regression functions at the terminal nodes, which describe the relationships between the independent variables and the dependent variable. The model first constructs a regression tree by recursively splitting the instance space; at each node it evaluates all the potential splits and selects the one that maximizes the expected error reduction. The over-grown tree is then pruned, and the pruned sub-trees are replaced by linear regression functions. Further details of the M5Tree model can be found in [49].
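The split selection step can be illustrated with a short sketch. It assumes the standard deviation reduction (SDR) criterion that is commonly associated with M5 model trees, where the standard deviation of the target values at a node is treated as the error measure. The function names and the toy data are hypothetical, and the sketch handles a single input variable only.

```python
from statistics import pstdev


def sd_reduction(targets, left, right):
    """Standard deviation reduction achieved by splitting targets into left/right."""
    n = len(targets)
    weighted = (len(left) / n) * pstdev(left) + (len(right) / n) * pstdev(right)
    return pstdev(targets) - weighted


def best_split(xs, ys):
    """Return (threshold, sdr) of the best binary split on one input variable."""
    pairs = sorted(zip(xs, ys))
    all_targets = [y for _, y in pairs]
    best = (None, float("-inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical input values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = sd_reduction(all_targets, left, right)
        if gain > best[1]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best = (threshold, gain)
    return best


# Toy data (hypothetical): the target jumps between x = 2 and x = 3.
threshold, gain = best_split([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0])
print(threshold, gain)
```

In a full model tree this search would be repeated over every input variable and applied recursively to the resulting subsets before pruning.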

Extreme gradient boosting (XGBoost).

The XGBoost model was proposed by [50] and originates from the idea of “boosting”. The XGBoost model integrates the predictions of a series of “weak” learners to develop a “strong” learner via an additive training process. It is designed to limit over-fitting and reduce the computational time. This is achieved by simplifying the objective function, which combines the predictive and regularization terms, while maintaining high computational efficiency. Parallel calculation is also automatically implemented during the training period. More details about the XGBoost model can be found in [50].

Support vector machine (SVM).

The SVM model was developed by [51] and is widely used for classification, pattern recognition and regression analysis. The SVM model estimates the regression on the basis of a set of kernel functions, which implicitly map the original, lower-dimensional input data to a higher-dimensional feature space. The SVM model has been successfully applied in predicting ET₀ [31, 41]. The radial basis function (RBF) nonlinear kernel was used in the present study because of its outstanding performance for predicting ET₀ relative to other kernel functions, such as the linear, polynomial and sigmoid functions [19]. Further information about the SVM model is given by [51].

Kernel-based nonlinear extension of Arps decline model (KNEA).

The KNEA model is a new nonlinear model initially proposed by [32] based on the Arps decline model and the kernel method. Compared with the non-parametric, “Black-Box” kernel-based models such as least-squares SVM, the KNEA model is based on the idea of a “Grey-Box” and uses a semi-parametric formulation to build the nonlinear model [52]. The kernel-based grey system models are more efficient with small samples [53–54], while the KNEA model performs better with larger samples since samples are not accumulated in the model.

The KNEA model can be described as: (2) where f(x) is the output at the current time step and f(x−1) is the output at the previous time step; u(x) represents the factors affecting the output; g(u(x)) can be interpreted as the relationship between u(x) and f(x); and μ is the bias. In this model, the output at the current time step results from the joint action of the output at the previous time step and the influencing factors at the current time step.

The nonlinear function g(u(x)) is difficult to determine and can be expressed as: (3)

This means mapping the original influencing factors into a new feature space. Therefore, Eq (2) can be written as: (4)

Although Eq (4) still cannot be solved exactly, the parameters can be chosen so that the difference between the left and right sides of the equation is as small as possible: (5) (6) (7) where γ is the regularization parameter, which controls the model smoothness. As with SVM, this optimization problem can be solved by the Lagrangian multiplier method: (8) where λ_x is the Lagrangian multiplier. The Karush–Kuhn–Tucker (KKT) conditions for optimality of the Lagrangian multiplier method are as follows: (9) (10) where

x_2|n = [x(2),x(3),⋯,x(n)]^T,

Ω_ij = φ(u(i))×φ(u(j)) = K(u(i),u(j)), and I_n-1 is the (n−1)-dimensional identity matrix, with all diagonal elements equal to 1 and all others equal to 0. λ, μ and a can be obtained from Eq (9). The elements Ω_ij can be computed with a kernel function K(⋅,⋅) that satisfies Mercer’s theorem; a Gauss-type kernel function was selected in the present study: (11)

Further information about the KNEA model is given by [32].
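To make the role of the kernel matrix Ω more concrete, the sketch below builds it for a small set of input vectors with a Gaussian kernel. The parametrization exp(−‖u − v‖² / (2σ²)) is one common convention and is an assumption here; the exact form and the value of σ used in the study are not reproduced. The function names and the sample inputs are hypothetical.

```python
import math


def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian kernel between two input vectors (assumed parametrization)."""
    dist2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-dist2 / (2.0 * sigma ** 2))


def kernel_matrix(inputs, sigma=1.0):
    """Matrix with entries K(u(i), u(j)) for all pairs of input vectors."""
    return [[gaussian_kernel(u, v, sigma) for v in inputs] for u in inputs]


# Hypothetical normalized inputs, e.g. (T_max, T_min, R_a) scaled to [0, 1].
inputs = [(0.2, 0.1, 0.5), (0.4, 0.3, 0.6), (0.9, 0.8, 0.7)]
omega = kernel_matrix(inputs, sigma=0.5)
for row in omega:
    print([round(value, 3) for value in row])
```

The resulting matrix is symmetric with ones on the diagonal, and entries decay toward zero as two input vectors move apart, which is what lets the kernel stand in for the inner product of the mapped features φ(u(i)) and φ(u(j)).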

Multivariate adaptive regression spline (MARS).

The MARS model is a non-parametric regression approach proposed by [55], which needs no assumption about the relationships between the independent and dependent variables. In the MARS model, a series of coefficients and functions defined as basis functions are used for modeling. Each basis function of the MARS model is a truncated spline function or a product of several spline functions. The number and form of the basis functions are determined automatically from the data. Meanwhile, the MARS model integrates the merits of the recursive auto-fractional regression method in dividing spatial regions, the projection tracking method in processing high-dimensional data and the advantages of accumulative regression node self-adaptation. Further details of the MARS model can be found in [10].
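A truncated spline basis function of the kind described above can be written in a few lines. The sketch below evaluates a one-variable MARS-style model as an intercept plus weighted hinge functions; the coefficients, the knot and the function names are hypothetical and chosen only to show the piecewise-linear shape, not fitted to any data from this study.

```python
def hinge(x, knot, direction=1):
    """Truncated linear spline: max(0, x - knot) if direction is +1,
    max(0, knot - x) if direction is -1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate intercept + sum of coefficient * hinge(x, knot, direction)."""
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model: slope 0.2 above a knot at 10, slope 0.05 below it.
terms = [(0.2, 10.0, 1), (-0.05, 10.0, -1)]
for x in (0.0, 10.0, 20.0):
    print(x, mars_predict(x, intercept=1.0, terms=terms))
```

In an actual MARS fit, the knots, the directions and the number of terms are selected from the data by a forward pass followed by backward pruning, and products of hinge functions can be added to represent interactions between inputs.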

Input combinations and K-fold cross-validation

Precipitation is not directly correlated with ET₀, but it reflects relative humidity to some extent and may correct the temperature-based ET₀ models. However, the actual amount of precipitation may understate or exaggerate its effect on the reduction of daily ET₀, because it varies widely from 0 mm to even hundreds of mm in humid regions. Therefore, a simple transformed precipitation (P_t, 1 for precipitation > 0 and 0 for precipitation = 0) was applied here to represent the general effect of precipitation on ET₀ prediction. Two input combinations of meteorological variables were thus used in the present study to assess the temperature and precipitation effects on daily ET₀ prediction, i.e., C1: T_max, T_min and R_a; C2: T_max, T_min, P_t and R_a. The K-fold cross-validation method was applied, where the temperature and precipitation data during 2001–2015 were equally partitioned into five periods. In each stage, four periods were used for model training and the remaining one was used to test the models, and this was repeated over the five stages so that each period served once as the testing set (Table 2). The main parameters of the eight machine learning models were optimized by using the grid-search method. Fig 2 presents a simple flowchart of the proposed methodology in the present study.
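The precipitation transformation and the period-based cross-validation can be sketched as follows. The sketch assumes that the five periods are consecutive three-year blocks (2001–2003, ..., 2013–2015), which is consistent with the S4 (2010–2012) and S5 (2013–2015) periods mentioned in the results. The function names and the sample values are hypothetical.

```python
def transform_precipitation(precip_mm):
    """P_t: 1 for a day with precipitation > 0 mm, 0 for a dry day."""
    return [1 if p > 0 else 0 for p in precip_mm]


def period_folds(years, first_year=2001, years_per_period=3, n_periods=5):
    """Yield (train_indices, test_indices), holding out one period at a time."""
    period = [(y - first_year) // years_per_period for y in years]
    for k in range(n_periods):
        test_idx = [i for i, p in enumerate(period) if p == k]
        train_idx = [i for i, p in enumerate(period) if p != k]
        yield train_idx, test_idx


# Hypothetical daily precipitation amounts (mm).
print(transform_precipitation([0.0, 0.1, 12.5, 0.0, 210.0]))

# One record per year for brevity; real data would have one record per day.
years = list(range(2001, 2016))
folds = list(period_folds(years))
for stage, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"S{stage}: test years {[years[i] for i in test_idx]}")
```

Splitting by contiguous periods rather than by random days keeps each testing set separated in time from most of the training data, which suits daily weather series that are strongly autocorrelated.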

Fig 2. Simple flowchart of the proposed methodology in the present study.

https://doi.org/10.1371/journal.pone.0217520.g002

Table 2. The five cross-validation stages involved in the present study.

https://doi.org/10.1371/journal.pone.0217520.t002

Statistical evaluation

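Four common statistical indicators were used in this study to evaluate the models, i.e., RMSE, R², MAE and NRMSE, which can be expressed as [56–57]:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_{i,m} - Y_{i,e}\right)^2} \qquad (12)$$

$$R^2 = \frac{\left[\sum_{i=1}^{n}\left(Y_{i,m} - \bar{Y}_m\right)\left(Y_{i,e} - \bar{Y}_e\right)\right]^2}{\sum_{i=1}^{n}\left(Y_{i,m} - \bar{Y}_m\right)^2\,\sum_{i=1}^{n}\left(Y_{i,e} - \bar{Y}_e\right)^2} \qquad (13)$$

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|Y_{i,m} - Y_{i,e}\right| \qquad (14)$$

$$NRMSE = \frac{RMSE}{\bar{Y}_m} \qquad (15)$$

where Y_{i,m}, Y_{i,e}, \bar{Y}_m and \bar{Y}_e are the measured, estimated, mean of measured and mean of estimated reference evapotranspiration, respectively; n is the number of observed data. Higher R² values indicate higher prediction accuracy, whereas lower values of RMSE, MAE and NRMSE suggest better model performance. Considering the requirements of the MLP and KNEA models, the raw climatic data were normalized between 0 and 1 as follows:

$$z_n = \frac{z_i - z_{min}}{z_{max} - z_{min}} \qquad (16)$$

where z_n and z_i are the normalized and raw data; z_max and z_min are the maximum and minimum raw data.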
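The four indicators and the normalization step are straightforward to compute. The sketch below is a minimal pure-Python version under two stated assumptions: R² is taken as the squared Pearson correlation between measured and estimated values, and NRMSE is taken as RMSE divided by the mean of the measured values. The function names and the sample values are hypothetical.

```python
import math


def rmse(measured, estimated):
    n = len(measured)
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, estimated)) / n)


def mae(measured, estimated):
    return sum(abs(m - e) for m, e in zip(measured, estimated)) / len(measured)


def nrmse(measured, estimated):
    # Assumption: normalized by the mean of the measured values.
    return rmse(measured, estimated) / (sum(measured) / len(measured))


def r2(measured, estimated):
    # Assumption: squared Pearson correlation coefficient.
    mean_m = sum(measured) / len(measured)
    mean_e = sum(estimated) / len(estimated)
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov ** 2 / (var_m * var_e)


def min_max_normalize(values):
    """Scale raw values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical ET0 values (mm/day).
measured = [1.0, 2.0, 3.0, 4.0]
estimated = [1.5, 2.0, 2.5, 4.5]
print(rmse(measured, estimated), mae(measured, estimated),
      nrmse(measured, estimated), r2(measured, estimated))
print(min_max_normalize([2.0, 4.0, 6.0]))
```

Because R² in this form measures only linear association, a model with a constant bias can still score close to 1, which is one reason to report it alongside RMSE, NRMSE and MAE.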

Results and discussion

Comparison of prediction accuracy of eight machine learning models across China

The statistical results of the eight machine learning models (MLP, GRNN, ANFIS, SVM, KNEA, M5Tree, XGBoost and MARS) for estimating daily ET₀ in NWC (Urumqi and Dunhuang), IM (Yinchuan and Erenhot), NEC (Harbin and Shenyang), NC (Beijing and Zhengzhou), QTP (Geermu and Lasa), CC (Wuhan and Guilin) and SC (Guangzhou and Haikou) under the two input combinations of climatic variables (C1: T_max, T_min and R_a; C2: T_max, T_min, P_t and R_a) during both the training and testing periods are presented in Tables 3–9, respectively.

Table 3. Statistical values of the eight machine learning models with different input parameters during training and testing at Urumqi and Dunhuang in the arid desert of northwest China.

https://doi.org/10.1371/journal.pone.0217520.t003

Table 4. Statistical values of the eight machine learning models with different input parameters during training and testing at Yinchuan and Erenhot in the semi-arid steppe of Inner Mongolia.

https://doi.org/10.1371/journal.pone.0217520.t004

Table 5. Statistical values of the eight machine learning models with different input parameters during training and testing at Harbin and Shenyang in the (semi-)humid cold-temperate northeast China.

https://doi.org/10.1371/journal.pone.0217520.t005

Table 6. Statistical values of the eight machine learning models with different input parameters during training and testing at Beijing and Zhengzhou in the semi-humid warm-temperate north China.

https://doi.org/10.1371/journal.pone.0217520.t006

Table 7. Statistical values of the eight machine learning models with different input parameters during training and testing at Geermu and Lasa in the Qinghai-Tibetan Plateau.

https://doi.org/10.1371/journal.pone.0217520.t007

Table 8. Statistical values of the eight machine learning models with different input parameters during training and testing at Wuhan and Guilin in the humid subtropical central China.

https://doi.org/10.1371/journal.pone.0217520.t008

Table 9. Statistical values of the eight machine learning models with different input parameters during training and testing at Guangzhou and Haikou in the humid tropical south China.

https://doi.org/10.1371/journal.pone.0217520.t009

In terms of R², RMSE, NRMSE and MAE averaged over the fourteen weather stations across China, the GRNN model (on average 0.888, 0.578 mm day⁻¹, 0.201 and 0.394 mm day⁻¹, respectively), XGBoost model (on average 0.882, 0.612 mm day⁻¹, 0.207 and 0.424 mm day⁻¹, respectively) and M5Tree model (on average 0.873, 0.626 mm day⁻¹, 0.217 and 0.432 mm day⁻¹, respectively) generally achieved higher accuracy in predicting daily ET₀ than the other machine learning models during training across China. The MLP model (on average R² = 0.850, RMSE = 0.685 mm day⁻¹, NRMSE = 0.257 and MAE = 0.476 mm day⁻¹) produced the worst training performance in all the climatic zones, followed by the MARS model (on average R² = 0.855, RMSE = 0.668 mm day⁻¹, NRMSE = 0.232 and MAE = 0.469 mm day⁻¹) and SVM model (on average R² = 0.857, RMSE = 0.662 mm day⁻¹, NRMSE = 0.232 and MAE = 0.456 mm day⁻¹). In contrast, the SVM model (on average R² = 0.860, RMSE = 0.674 mm day⁻¹, NRMSE = 0.234 and MAE = 0.470 mm day⁻¹), KNEA model (on average R² = 0.857, RMSE = 0.676 mm day⁻¹, NRMSE = 0.236 and MAE = 0.474 mm day⁻¹) and MARS model (on average R² = 0.855, RMSE = 0.685 mm day⁻¹, NRMSE = 0.237 and MAE = 0.483 mm day⁻¹) generally performed better than the other models for daily ET₀ estimation during the testing period across China. The M5Tree model (on average R² = 0.840, RMSE = 0.716 mm day⁻¹, NRMSE = 0.249 and MAE = 0.504 mm day⁻¹), MLP model (on average R² = 0.851, RMSE = 0.698 mm day⁻¹, NRMSE = 0.259 and MAE = 0.487 mm day⁻¹) and GRNN model (on average R² = 0.845, RMSE = 0.696 mm day⁻¹, NRMSE = 0.245 and MAE = 0.492 mm day⁻¹) performed worst among the machine learning models during testing. These results are in good agreement with other machine learning-based ET₀ studies. For instance, [58] found that the LS-SVM model yielded accurate ET₀ estimation in Changwu County, China. [13] indicated that the SVM model outperformed the ANN model for the estimation of daily ET₀ in an extremely arid region of China. [27] found that the LS-SVM model outperformed the MARS and M5Tree models, while the MARS model was superior to the LS-SVM and M5Tree models in cross-station applications. [59] suggested that the SVM model gave better daily ET₀ estimation than the tree-based ensemble models (RF, M5Tree, GBDT and XGBoost) under various input combinations across China.

Specifically, the GRNN model outperformed all the other machine learning models in daily ET₀ modeling in the seven climatic zones of China during the training period. However, during the testing period, the KNEA model had the lowest average RMSE, NRMSE and MAE values and the highest R² values in NWC (0.757 mm day⁻¹, 0.241, 0.476 mm day⁻¹ and 0.899, respectively) and IM (0.732 mm day⁻¹, 0.229, 0.491 mm day⁻¹ and 0.902, respectively). The lowest average RMSE, NRMSE and MAE values and the highest R² values in QTP (0.477 mm day⁻¹, 0.145, 0.339 mm day⁻¹ and 0.914, respectively), CC (0.757 mm day⁻¹, 0.241, 0.476 mm day⁻¹ and 0.899, respectively) and SC (0.757 mm day⁻¹, 0.241, 0.476 mm day⁻¹ and 0.899, respectively) were obtained by the SVM model. The SVM and KNEA models exhibited very close prediction accuracy in NEC (on average R² = 0.871, RMSE = 0.659 mm day⁻¹, NRMSE = 0.287 and MAE = 0.436 mm day⁻¹) and NC (on average R² = 0.806, RMSE = 0.788 mm day⁻¹, NRMSE = 0.273 and MAE = 0.576 mm day⁻¹). Much lower statistical errors were observed in QTP (on average R² = 0.907, RMSE = 0.489 mm day⁻¹, NRMSE = 0.154 and MAE = 0.351 mm day⁻¹) compared with those obtained in the other climatic zones (on average R² = 0.850, RMSE = 0.685 mm day⁻¹, NRMSE = 0.257 and MAE = 0.476 mm day⁻¹). The results suggested that these machine learning models gave higher accuracy on the Qinghai-Tibetan Plateau. The RMSE values obtained by the best-performing models in each climatic zone were generally smaller than or close to those obtained in the corresponding regions by previous studies using only T_max and T_min data, e.g., by the SVM (0.539 mm day⁻¹) and ANN (0.561 mm day⁻¹) models in Ejina City of China [13], by the ELM (0.444–0.498 mm day⁻¹), GANN (0.445–0.499 mm day⁻¹) and WNN (0.443–0.641 mm day⁻¹) models in the humid region of southwest China [2], and by the SVM (0.530–0.868 mm day⁻¹), M5Tree (0.637–0.953 mm day⁻¹) and XGBoost (0.532–0.817 mm day⁻¹) models in different climatic zones of China [40].

The scatter plots of daily FAO-56 PM ET₀ values and those predicted by the eight machine learning models for the capital city of China (Beijing) over the five cross-validation periods under the two input combinations during testing are presented in Figs 3 and 4, respectively. It can be seen that the prediction accuracy of the selected machine learning models varied over the five periods. Overall, higher statistical errors were obtained during the S4 period (2010–2012), whereas the S5 period (2013–2015) produced higher prediction accuracy. These differences largely resulted from the time-series changes in climatic variables among the five periods. This confirms the need to apply the K-fold cross-validation method for accurately estimating daily ET₀ in various climates [27, 59]. Nevertheless, the performance of these machine learning models showed the same tendency at the various cross-validation stages, with better daily ET₀ estimates by the SVM, KNEA and MARS models. The data points of the SVM, KNEA and MARS models were less dispersed than those of the M5Tree, MLP and GRNN models. Figs 5 and 6 show the scatter plots of daily ET₀ values estimated by these machine learning models against the corresponding FAO-56 PM values in the arid NWC (Urumqi and Dunhuang) and the humid SC (Guangzhou and Haikou) during the testing period, while the corresponding scatter plots of IM (Yinchuan and Erenhot), NEC (Harbin and Shenyang), NC (Beijing and Zhengzhou), QTP (Geermu and Lasa) and CC (Wuhan and Guilin) can be found in S1–S5 Figs. The M5Tree, MLP and GRNN models exhibited more scattered estimates than the other models. The daily ET₀ values predicted by the SVM, KNEA and MARS models were closer to the corresponding FAO-56 PM values and generally showed the same tendency, which further confirmed the superiority of the SVM, KNEA and MARS models for daily ET₀ estimation across China.

Fig 3. Scatter plots of the ET₀ values calculated by the FAO-56 PM equation for China’s capital city of Beijing and the values estimated by the eight machine learning models during five cross-validation stages under the input combination of T_max, T_min and R_a in the testing stage.

https://doi.org/10.1371/journal.pone.0217520.g003

Fig 4. Scatter plots of the ET₀ values calculated by the FAO-56 PM equation for China’s capital city of Beijing and the values estimated by the eight machine learning models during five cross-validation stages under the input combination of T_max, T_min, P_t and R_a in the testing stage.

https://doi.org/10.1371/journal.pone.0217520.g004

Fig 5. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Urumqi and Dunhuang in the arid desert of northwest China.

https://doi.org/10.1371/journal.pone.0217520.g005

Fig 6. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Guangzhou and Haikou in the humid tropical south China.

https://doi.org/10.1371/journal.pone.0217520.g006

Comparison of prediction accuracy of eight machine learning models under two input combinations

As seen from Tables 3–9, the predicted ET₀ values differed significantly under the two input combinations of climatic variables (C1: T_max, T_min and R_a; C2: T_max, T_min, P_t and R_a) during training and testing. Generally, machine learning models with the input combination of T_max, T_min, P_t and R_a (on average R² = 0.879, RMSE = 0.613 mm day⁻¹, NRMSE = 0.214 and MAE = 0.426 mm day⁻¹ during training; R² = 0.862, RMSE = 0.663 mm day⁻¹, NRMSE = 0.235 and MAE = 0.469 mm day⁻¹ during testing) obtained better prediction accuracy than those with the input combination of T_max, T_min and R_a (on average R² = 0.844, RMSE = 0.666 mm day⁻¹, NRMSE = 0.232 and MAE = 0.467 mm day⁻¹ during training; R² = 0.829, RMSE = 0.718 mm day⁻¹, NRMSE = 0.250 and MAE = 0.508 mm day⁻¹ during testing) across China. These statistical results were generally similar to those obtained by various machine learning models with only T_max and T_min data in previous studies [2, 13, 40]. The incorporation of P_t as an input to the machine learning models produced average decreases in RMSE of 7.7% and 7.6% during the training and testing periods, respectively. Specifically, average decreases in RMSE of 4.8%, 6.0%, 7.4%, 6.5% and 6.5% were obtained during the testing period by considering P_t in NWC, IM, QTP, NEC and NC, respectively. Much larger decreases in RMSE were obtained in CC (9.7%) and SC (12.4%) by machine learning models with the input combination of T_max, T_min, P_t and R_a compared with those with the input combination of T_max, T_min and R_a. These results indicated that incorporating precipitation information in machine learning models can improve the prediction accuracy of daily ET₀, particularly in the subtropical and tropical humid regions (Tables 3–9). Fan et al. [59] found that the prediction accuracy of empirical and machine learning models for estimating daily global solar radiation can be much improved by considering precipitation as an input, because precipitation is a manifestation of cloud cover and can correct the temperature-based models by accounting for its effect on radiation reduction. This can also explain why the performance of machine learning models for predicting daily ET₀ from daily minimum/maximum temperature was enhanced when precipitation information was further included.
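As a worked check of the figures above, the relative decrease in RMSE can be computed from the nationwide mean testing values quoted in this section. The helper name is hypothetical, and because the inputs are rounded averages, the result (roughly 7.7%) only approximates the reported 7.6%, which was presumably averaged over the individual models and stations.

```python
def pct_decrease(rmse_base, rmse_new):
    """Percentage decrease of rmse_new relative to rmse_base."""
    return 100.0 * (rmse_base - rmse_new) / rmse_base


# Nationwide mean testing RMSE (mm/day) quoted in the text.
rmse_c1_test = 0.718  # inputs: T_max, T_min, R_a
rmse_c2_test = 0.663  # inputs: T_max, T_min, P_t, R_a

decrease = pct_decrease(rmse_c1_test, rmse_c2_test)
print(f"Testing RMSE decrease when P_t is added: {decrease:.1f}%")
```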

Comparison of model stability of eight machine learning models

As illustrated earlier, the GRNN, XGBoost and M5Tree models outperformed the MLP, SVM, KNEA, ANFIS and MARS models for predicting daily ET₀ across China during the training period in terms of R², RMSE, NRMSE and MAE (Tables 3–9). However, the SVM, KNEA and MARS models produced better daily ET₀ estimates than the other machine learning models during the testing period. Model stability is therefore also an essential factor to consider for obtaining accurate and reliable daily ET₀ predictions. The percentage increases in RMSE during testing relative to RMSE during training for the eight machine learning models in NWC (Urumqi and Dunhuang), IM (Yinchuan and Erenhot), NEC (Harbin and Shenyang), NC (Beijing and Zhengzhou), QTP (Geermu and Lasa), CC (Wuhan and Guilin) and SC (Guangzhou and Haikou) under the two input combinations are also shown in Tables 3–9. These tables suggested that the SVM, MLP and MARS models were the most stable models, with consistently small percentage increases in RMSE during testing relative to training in all the climatic zones of China (on average 1.9%, 2.0% and 2.6%, respectively). The KNEA and ANFIS models also exhibited relatively small increases in testing RMSE (on average 6.4% and 7.8%, respectively). However, the GRNN, M5Tree and XGBoost models exhibited much larger increases in testing RMSE (on average 20.1%, 14.5% and 12.0%, respectively). These increases indicated the instability of the GRNN, M5Tree and XGBoost models, whose performance decreased markedly when applied to new data. Fan et al. [40] showed that the kernel-based SVM and ELM models were more stable than the tree-based RF, M5Tree and XGBoost models for the estimation of daily ET₀. Hassan et al. [34] also found that the RF and bagging models showed greater increases in RMSE during testing than the SVR and gradient models when predicting global solar radiation. These results suggest that the kernel-based machine learning models (e.g., SVM, ELM and KNEA) are generally more stable than the tree-based models (RF, M5Tree and XGBoost).
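The stability measure used in this comparison, the percentage increase in testing RMSE relative to training RMSE, can be illustrated as follows. This is a sketch with a hypothetical function name; the inputs are the all-model mean RMSE values for the T_max, T_min and R_a combination quoted in the previous subsection, not the per-model values behind the averages reported above.

```python
def rmse_increase_pct(rmse_train, rmse_test):
    """Percentage increase in testing RMSE relative to training RMSE."""
    if rmse_train <= 0:
        raise ValueError("training RMSE must be positive")
    return 100.0 * (rmse_test - rmse_train) / rmse_train


# All-model mean RMSE (mm/day) for inputs T_max, T_min and R_a.
increase = rmse_increase_pct(rmse_train=0.666, rmse_test=0.718)
print(f"RMSE increase from training to testing: {increase:.1f}%")
```

A smaller value indicates that a model generalizes from the training period to the testing period with less loss of accuracy.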

Comprehensive evaluation of eight machine learning models

The SVM, KNEA and MARS models outperformed the other machine learning models in daily ET₀ modeling in terms of prediction accuracy during the testing period. In terms of model stability, the SVM, MLP and MARS models exhibited very small percentage increases in RMSE during testing (< 3.0%), while the KNEA and ANFIS models showed relatively small increases in testing RMSE (< 8.0%). The SVM model exhibited the best combination of prediction accuracy and model stability among the eight machine learning models, while the KNEA and MARS models also provided a satisfactory combination of the two. Considering both prediction accuracy and model stability, the SVM, KNEA and MARS models are recommended for estimating daily ET₀ using only temperature and precipitation data across the various climatic regions of China, and possibly elsewhere in similar climates.

Conclusions

The performance of eight machine learning models in four categories, i.e., neuron-based (MLP, GRNN, ANFIS), kernel-based (SVM, KNEA), tree-based (M5Tree, XGBoost) and curve-based (MARS) models, for the estimation of daily ET₀ was compared based on only temperature and precipitation data during 2001–2015 obtained from 14 representative stations across various climatic zones of China. The results showed that the machine learning models using only temperature data attained satisfactory daily ET₀ estimates. The prediction accuracy was further improved across China when precipitation information was considered, especially in the (sub)tropical humid regions. This indicates that precipitation is a manifestation of relative humidity to some extent and can correct the temperature-based ET₀ models. The kernel-based SVM and KNEA models and the curve-based MARS model generally gave more accurate daily ET₀ estimates than the other models, with the best performance by KNEA in NWC and IM, by SVM in QTP, CC and SC, and similarly good performance by both in NEC and NC. The SVM, MLP, MARS and KNEA models showed relatively small percentage increases in RMSE during testing relative to training. Considering both prediction accuracy and model stability, SVM is highly recommended, while KNEA and MARS are alternative models for predicting daily ET₀ in various climatic regions of China. The satisfactory performance of these machine learning models with ambient temperatures and transformed precipitation indicates that near-future prediction of daily ET₀ is possible using public weather forecasts, which include daily maximum and minimum temperatures and whether there is precipitation or not. Nevertheless, more study is needed to explore the performance of these machine learning models at varying temporal scales and in other climatic regions.

Supporting information

S1 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Yinchuan and Erenhot in the semi-arid steppe of Inner Mongolia.

https://doi.org/10.1371/journal.pone.0217520.s001

(EPS)

S2 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Harbin and Shenyang in the (semi-)humid cold-temperate northeast China.

https://doi.org/10.1371/journal.pone.0217520.s002

(EPS)

S3 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Beijing and Zhengzhou in the semi-humid warm-temperate north China.

https://doi.org/10.1371/journal.pone.0217520.s003

(EPS)

S4 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Geermu and Lasa in the Qinghai-Tibetan Plateau.

https://doi.org/10.1371/journal.pone.0217520.s004

(EPS)

S5 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Wuhan and Guilin in the humid subtropical central China.

https://doi.org/10.1371/journal.pone.0217520.s005

(EPS)

S1 Data.

https://doi.org/10.1371/journal.pone.0217520.s006

(RAR)

Acknowledgments

We thank the National Meteorological Information Center of the China Meteorological Administration for providing the meteorological data.

References

1. Fan J, Oestergaard KT, Guyot A, Lockington DA. Estimating groundwater recharge and evapotranspiration from water table fluctuations under three vegetation covers in a coastal sandy aquifer of subtropical Australia. J Hydrol. 2014; 519: 1120–1129. https://doi.org/10.1016/j.jhydrol.2014.08.039
2. Feng Y, Cui NB, Zhao L, Hu XT, Gong DZ. Comparison of ELM, GANN, WNN and empirical models for estimating reference evapotranspiration in humid region of Southwest China. J Hydrol. 2016; 536: 376–383. https://doi.org/10.1016/j.jhydrol.2016.02.053
3. Karimi S, Kisi O, Kim S, Nazemi AH, Shiri J. Modelling daily reference evapotranspiration in humid locations of South Korea using local and cross-station data management scenarios. Int J Climatol. 2017; 37: 3238–3246. https://doi.org/10.1002/joc.4911
4. Wang LC, Kisi O, Hu B, Bilal M, Zounemat-Kermani M, Li H. Evaporation modelling using different machine learning techniques. Int J Climatol. 2017; 37: 1076–1092. https://doi.org/10.1002/joc.5064
5. Wang X, Fan J, Xing Y, Xu G, Wang H, Deng J, et al. The effects of mulch and nitrogen fertilizer on the soil environment of crop plants. Adv Agron. 2019; 153: 121–173. https://doi.org/10.1016/bs.agron.2018.08.003
6. Allen RG, Pereira LS, Raes D, Smith M. Crop evapotranspiration—Guidelines for computing crop water requirements—FAO Irrigation and drainage paper 56. 1998:1–15.
7. Kisi O, Zounemat-Kermani M. Comparison of two different adaptive neuro-fuzzy inference systems in modelling daily reference evapotranspiration. Water Resour Manag. 2014; 28(9): 2655–2675.
8. Shiri J, Nazemi AH, Sadraddini AA, Landeras G, Kisi O, Fard AF, et al. Comparison of heuristic and empirical approaches for estimating reference evapotranspiration from limited inputs in Iran. Comput Electron Agric. 2014; 108: 230–241. https://doi.org/10.1016/j.compag.2014.08.007
9. Fan JL, Wu LF, Zhang FC, Xiang YZ, Zheng J. Climate change effects on reference crop evapotranspiration across different climatic zones of China during 1956–2015. J Hydrol. 2016; 542: 923–937. https://doi.org/10.1016/j.jhydrol.2016.09.060
10. Keshtegar B, Kisi O, Arab HG, Zounemat-Kermani M. Subset modeling basis ANFIS for prediction of the reference evapotranspiration. Water Resour Manag. 2018; 32(3): 1101–1116. https://doi.org/10.1007/s11269-017-1857-5
11. Karbasi M. Forecasting of multi-step ahead reference evapotranspiration using Wavelet-Gaussian Process Regression model. Water Resour Manag. 2018; 32(3): 1035–1052. https://doi.org/10.1007/s11269-017-1853-9
12. Tabari H, Grismer ME, Trajkovic S. Comparative analysis of 31 reference evapotranspiration methods under humid conditions. Irrig Sci. 2013; 31: 107–117. https://doi.org/10.1007/s00271-011-0295-z
13. Wen X, Si J, He Z, Wu J, Shao H, Yu H. Support-vector-machine-based models for modeling daily reference evapotranspiration with limited climatic data in extreme arid regions. Water Resour Manag. 2015; 29: 3195–3209. https://doi.org/10.1007/s11269-015-0990-2
14. Traore S, Luo Y, Fipps G. Deployment of artificial neural network for short-term forecasting of evapotranspiration using public weather forecast restricted messages. Agric Water Manag. 2016; 163: 363–379. https://doi.org/10.1016/j.agwat.2015.10.009
15. Antonopoulos VZ, Antonopoulos AV. Daily reference evapotranspiration estimates by artificial neural networks technique and empirical equations using limited input climate variables. Comput Electron Agric. 2017; 132: 86–96. https://doi.org/10.1016/j.compag.2016.11.011
16. Hargreaves GH, Samani ZA. Reference crop evapotranspiration from temperature. Appl Eng Agric. 1985; 1: 96–99. https://doi.org/10.13031/2013.26773
17. Trabert W. Neue beobachtungen über verdampfungsgeschwindigkeiten. Meteorol Z. 1896; 13: 261–263.
18. Priestley CHB, Taylor RJ. On the assessment of surface heat flux and evaporation using large-scale parameters. Mon Weather Rev. 1972; 100: 81–92. https://doi.org/10.1175/1520-0493(1972)100<0081:OTAOSH>2.3.CO;2
19. Kisi O. Pan evaporation modeling using least square support vector machine, multivariate adaptive regression splines and M5 model tree. J Hydrol. 2015; 528: 312–320. https://doi.org/10.1016/j.jhydrol.2015.06.052
20. Wang L, Kisi O, Zounemat-Kermani M, Li H. Pan evaporation modeling using six different heuristic computing methods in different climates of China. J Hydrol. 2017; 544: 407–427. https://doi.org/10.1016/j.jhydrol.2016.11.059
21. Jovic S, Nedeljkovic B, Golubovic Z, Kostic N. Evolutionary algorithm for reference evapotranspiration analysis. Comput Electron Agric. 2018; 150: 1–4. https://doi.org/10.1016/j.compag.2018.04.003
22. Landeras G, Ortiz-Barredo A, López JJ. Comparison of artificial neural network models and empirical and semi-empirical equations for daily reference evapotranspiration estimation in the Basque Country (Northern Spain). Agric Water Manag. 2008; 95: 553–565. https://doi.org/10.1016/j.agwat.2007.12.011
23. Kisi O. The potential of different ANN techniques in evapotranspiration modelling. Hydrol Process. 2008; 22(14): 2449–2460. https://doi.org/10.1002/hyp.6837
24. Pour-Ali Baba A, Shiri J, Kisi O, Fard AF, Kim S, Amini R. Estimating daily reference evapotranspiration using available and estimated climatic data by adaptive neuro-fuzzy inference system (ANFIS) and artificial neural network (ANN). Hydrol Res. 2013; 44(1): 131–146. https://doi.org/10.2166/nh.2012.074
25. Sanikhani H, Kisi O, Maroufpoor E, Yaseen ZM. Temperature-based modeling of reference evapotranspiration using several artificial intelligence models: application of different modeling scenarios. Theor Appl Climatol. 2019; 135(1–2): 449–462. https://doi.org/10.1007/s00704-018-2390-z
26. Tabari H, Kisi O, Ezani A, Talaee PH. SVM, ANFIS, regression and climate based models for reference evapotranspiration modeling using limited climatic data in a semi-arid highland environment. J Hydrol. 2012; 444: 78–89. https://doi.org/10.1016/j.jhydrol.2012.04.007
27. Kisi O. Modeling reference evapotranspiration using three different heuristic regression approaches. Agric Water Manag. 2016; 169: 162–172. https://doi.org/10.1016/j.agwat.2016.02.026
28. Abdullah SS, Malek MA, Abdullah NS, Kisi O, Yap KS. Extreme learning machines: a new approach for prediction of reference evapotranspiration. J Hydrol. 2015; 527: 184–195. https://doi.org/10.1016/j.jhydrol.2015.04.073
29. Feng Y, Peng Y, Cui N, Gong D, Zhang K. Modeling reference evapotranspiration using extreme learning machine and generalized regression neural network only with temperature data. Comput Electron Agric. 2017; 136: 71–78. https://doi.org/10.1016/j.compag.2017.01.027
30. Gocić M, Motamedi S, Shamshirband S, Petković D, Ch S, Hashim R, et al. Soft computing approaches for forecasting reference evapotranspiration. Comput Electron Agric. 2015; 113: 164–173. https://doi.org/10.1016/j.compag.2015.02.010
31. Yin Z, Wen X, Feng Q, He Z, Zou S, Yang L. Integrating genetic algorithm and support vector machine for modeling daily reference evapotranspiration in a semi-arid mountain area. Hydrol Res. 2017; 48: 1177–1191. https://doi.org/10.2166/nh.2016.205
32. Ma X, Liu ZB. Predicting the oil production using the novel multivariate nonlinear model based on Arps decline model and kernel method. Neural Comput Appl. 2018; 29: 579–591. https://doi.org/10.1007/s00521-016-2721-x
33. Wu L, Huang G, Fan J, Zhang F, Wang X, Zeng W. Potential of kernel-based nonlinear extension of Arps decline model and gradient boosting with categorical features support for predicting daily global solar radiation in humid regions. Energy Convers Manag. 2019; 183: 280–295. https://doi.org/10.1016/j.enconman.2018.12.103
34. Hassan MA, Khalil A, Kaseb S, Kassem MA. Exploring the potential of tree-based ensemble methods in solar radiation modeling. Appl Energy. 2017; 203: 897–916. https://doi.org/10.1016/j.apenergy.2017.06.104
35. Pal M, Deswal S. M5 model tree based modelling of reference evapotranspiration. Hydrol Process. 2009; 23: 1437–1443. https://doi.org/10.1002/hyp.7266
36. Rahimikhoob A. Comparison between M5 model tree and neural networks for estimating reference evapotranspiration in an arid environment. Water Resour Manag. 2014; 28: 657–669. https://doi.org/10.1007/s11269-013-0506-x
37. Kisi O, Kilic Y. An investigation on generalization ability of artificial neural networks and M5 model tree in modeling reference evapotranspiration. Theor Appl Climatol. 2016; 126: 413–425. https://doi.org/10.1007/s00704-015-1582-z
38. Feng Y, Cui N, Gong D, Zhang Q, Zhao L. Evaluation of random forests and generalized regression neural networks for daily reference evapotranspiration modelling. Agric Water Manag. 2017; 193: 163–173. https://doi.org/10.1016/j.agwat.2017.08.003
39. Shiri J. Improving the performance of the mass transfer-based reference evapotranspiration estimation approaches through a coupled wavelet-random forest methodology. J Hydrol. 2018; 561: 737–750. https://doi.org/10.1016/j.jhydrol.2018.04.042
40. Fan J, Yue W, Wu L, Zhang F, Cai H, Wang X, et al. Evaluation of SVM, ELM and four tree-based ensemble models for predicting daily reference evapotranspiration using limited meteorological data in different climates of China. Agric For Meteorol. 2018; 263: 225–241. https://doi.org/10.1016/j.agrformet.2018.08.019
41. Mehdizadeh S, Behmanesh J, Khalili K. Using MARS, SVM, GEP and empirical equations for estimation of monthly mean reference evapotranspiration. Comput Electron Agric. 2017; 139: 103–114. https://doi.org/10.1016/j.compag.2017.05.002
42. Shiri J. Evaluation of FAO56-PM, empirical, semi-empirical and gene expression programming approaches for estimating daily reference evapotranspiration in hyper-arid regions of Iran. Agric Water Manag. 2017; 188: 101–114. https://doi.org/10.1016/j.agwat.2017.04.009
43. Zhao SQ. A new scheme for comprehensive physical regionalization in China. Acta Geogr Sin. 1983; 38: 1–10. https://doi.org/10.11821/xb198301001
44. Yao N, Li Y, Lei T, Peng L. Drought evolution, severity and trends in mainland China over 1961–2013. Sci Total Environ. 2018; 616: 73–89. pmid:29107781
45. Specht DF. A general regression neural network. IEEE Trans Neural Netw. 1991; 2(6): 568–576. pmid:18282872
46. Wang L, Kisi O, Zounemat-Kermani M, Salazar GA, Zhu Z, Gong W. Solar radiation prediction using different techniques: model evaluation and comparison. Renew Sustain Energy Rev. 2016; 61: 384–397. https://doi.org/10.1016/j.rser.2016.04.024
47. Jang JS. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans Syst Man Cybern. 1993; 23: 665–685. https://doi.org/10.1109/21.256541
48. Quinlan JR. Learning with continuous classes. In 5th Australian joint conference on artificial intelligence, 1992; 92: 343–348.
49. Wang Y, Witten IH. Induction of model trees for predicting continuous classes. In: Proceedings of the Poster Papers of the European Conference on Machine Learning, University of Economics, Faculty of Informatics and Statistics, Prague, 1997.
50. Chen T, He T, Benesty M, Khotilovich V, Tang Y. Xgboost: extreme gradient boosting. R Packag. version 0.4–2 1–4, 2015.
51. Vapnik V. The nature of statistical learning theory. Springer science & business media, 2003.
52. Ma X, Xie M, Wu W, Zeng B, Wang Y, Wu X. The novel fractional discrete multivariate grey system model and its applications. Appl Math Model. 2019; 70: 402–424. https://doi.org/10.1016/j.apm.2019.01.039
53. Ma X, Liu Z. The GMC (1, n) model with optimized parameters and its application. J Grey Syst. 2017; 29(4): 122–138.
54. Ma X. A brief introduction to the Grey Machine Learning; 2018. Preprint. Available from: arXiv:1805.01745.
55. Friedman JH. Multivariate adaptive regression splines. Ann Stat. 1991; 19(1): 1–67. https://doi.org/10.1002/9780470057339.vaa008
56. Fan J, Wu L, Zhang F, Cai H, Ma X, Bai H. Evaluation and development of empirical models for estimating daily and monthly mean daily diffuse horizontal solar radiation for different climatic regions of China. Renew Sustain Energy Rev. 2019; 105: 168–186. https://doi.org/10.1016/j.rser.2019.01.040
57. Fan J, Wu L, Zhang F, Cai H, Zeng W, Wang X, et al. Empirical and machine learning models for predicting daily global solar radiation from sunshine duration: A review and case study in China. Renew Sustain Energy Rev. 2019; 100: 186–212. https://doi.org/10.1016/j.rser.2018.10.018
58. Guo X, Sun X, Ma J. Prediction of daily crop reference evapotranspiration (ET0) values through a least-squares support vector machine model. Hydrol Res. 2011; 42: 268–274. https://doi.org/10.2166/nh.2011.072
59. Fan J, Wang X, Wu L, Zhou H, Zhang F, Yu X, et al. Comparison of Support Vector Machine and Extreme Gradient Boosting for predicting daily global solar radiation using temperature and precipitation in humid subtropical climates: A case study in China. Energy Convers Manag. 2018; 164: 102–111. https://doi.org/10.1016/j.enconman.2018.02.087

[ref1] 1. Fan J, Oestergaard KT, Guyot A, Lockington DA. Estimating groundwater recharge and evapotranspiration from water table fluctuations under three vegetation covers in a coastal sandy aquifer of subtropical Australia. J Hydrol. 2014; 519: 1120–1129. https://doi.org/10.1016/j.jhydrol.2014.08.039
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. Feng Y, Cui NB, Zhao L, Hu XT, Gong DZ. Comparison of ELM, GANN, WNN and empirical models for estimating reference evapotranspiration in humid region of Southwest China. J Hydrol. 2016; 536: 376–383. https://doi.org/10.1016/j.jhydrol.2016.02.053
View Article
Google Scholar

[5] View Article

[6] Google Scholar

[ref3] 3. Karimi S, Kisi O, Kim S, Nazemi AH, Shiri J. Modelling daily reference evapotranspiration in humid locations of South Korea using local and cross-station data management scenarios. Int J Climatol. 2017; 37: 3238–3246. https://doi.org/10.1002/joc.4911
View Article
Google Scholar

[8] View Article

[9] Google Scholar

[ref4] 4. Wang LC, Kisi O, Hu B, Bilal M, Zounemat-Kermani M, Li H. Evaporation modelling using different machine learning techniques. Int J Climatol. 2017; 37: 1076–1092. https://doi.org/10.1002/joc.5064
View Article
Google Scholar

[11] View Article

[12] Google Scholar

[ref5] 5. Wang X, Fan J, Xing Y, Xu G, Wang H, Deng J, et al. The effects of mulch and nitrogen fertilizer on the soil environment of crop plants. Adv Agron. 2019; 153: 121–173. https://doi.org/10.1016/bs.agron.2018.08.003
View Article
Google Scholar

[14] View Article

[15] Google Scholar

[ref6] 6. Allen RG, Pereira LS, Raes D, Smith M. Crop evapotranspiration—Guidelines for computing crop water requirements—FAO Irrigation and drainage paper 56. 1998:1–15.

[ref7] 7. Kisi O, Zounemat-Kermani M. Comparison of two different adaptive neuro-fuzzy inference systems in modelling daily reference evapotranspiration. Water Resour Manag. 2014; 28(9): 2655–2675.
View Article
Google Scholar

[18] View Article

[19] Google Scholar

[ref8] 8. Shiri J, Nazemi AH, Sadraddini AA, Landeras G, Kisi O, Fard AF, et al. Comparison of heuristic and empirical approaches for estimating reference evapotranspiration from limited inputs in Iran. Comput Electron Agric. 2014; 108: 230–241. https://doi.org/10.1016/j.compag.2014.08.007
View Article
Google Scholar

[21] View Article

[22] Google Scholar

[ref9] 9. Fan JL, Wu LF, Zhang FC, Xiang YZ, Zheng J. Climate change effects on reference crop evapotranspiration across different climatic zones of China during 1956–2015. J Hydrol. 2016; 542: 923–937. https://doi.org/10.1016/j.jhydrol.2016.09.060
View Article
Google Scholar

[24] View Article

[25] Google Scholar

[ref10] 10. Keshtegar B, Kisi O, Arab H G, Zounemat-Kermani M. Subset modeling basis ANFIS for prediction of the reference evapotranspiration. Water Resour Manag. 2018; 32(3): 1101–1116. https://doi.org/10.1007/s11269-017-1857-5
View Article
Google Scholar

[27] View Article

[28] Google Scholar

[ref11] 11. Karbasi M. Forecasting of multi-step ahead reference evapotranspiration using Wavelet-Gaussian Process Regression model. Water Resour Manag. 2018; 32(3): 1035–1052. https://doi.org/10.1007/s11269-017-1853-9
View Article
Google Scholar

[30] View Article

[31] Google Scholar

[ref12] 12. Tabari H, Grismer ME, Trajkovic S. Comparative analysis of 31 reference evapotranspiration methods under humid conditions. Irrig Sci. 2013; 31: 107–117. https://doi.org/10.1007/s00271-011-0295-z
View Article
Google Scholar

[33] View Article

[34] Google Scholar

[ref13] 13. Wen X, Si J, He Z, Wu J, Shao H, Yu H. Support-vector-machine-based models for modeling daily reference evapotranspiration with limited climatic data in extreme arid regions. Water Resour Manag. 2015; 29: 3195–3209. https://doi.org/10.1007/s11269-015-0990-2
View Article
Google Scholar

[36] View Article

[37] Google Scholar

[ref14] 14. Traore S, Luo Y, Fipps G. Deployment of artificial neural network for short-term forecasting of evapotranspiration using public weather forecast restricted messages. Agric Water Manag. 2016; 163: 363–379. https://doi.org/10.1016/j.agwat.2015.10.009
View Article
Google Scholar

[39] View Article

[40] Google Scholar

[ref15] 15. Antonopoulos VZ., Antonopoulos AV. Daily reference evapotranspiration estimates by artificial neural networks technique and empirical equations using limited input climate variables. Comput Electron Agric. 2017; 132: 86–96. https://doi.org/10.1016/j.compag.2016.11.011
View Article
Google Scholar

[42] View Article

[43] Google Scholar

[ref16] 16. Hargreaves GH, Samani ZA. Reference crop evapotranspiration from temperature. Appl Eng Agric. 1985; 1: 96–99. https://doi.org/10.13031/2013.26773
View Article
Google Scholar

[45] View Article

[46] Google Scholar

[ref17] 17. Trabert W. Neue beobachtungen über verdampfungsgeschwindigkeiten. Meteorol Z. 1896; 13: 261–263.
View Article
Google Scholar

[48] View Article

[49] Google Scholar

[ref18] 18. Priestley CHB, Taylor RJ. On the assessment of surface heat flux and evaporation using large-scale parameters. Mon Weather Rev. 1972; 100: 81–92. https://doi.org/10.1175/1520-0493(1972)100<0081:OTAOSH>2.3.CO;2
View Article
Google Scholar

[51] View Article

[52] Google Scholar

[ref19] 19. Kisi O. Pan evaporation modeling using least square support vector machine, multivariate adaptive regression splines and M5 model tree. J Hydrol. 2015; 528: 312–320. https://doi.org/10.1016/j.jhydrol.2015.06.052
View Article
Google Scholar

[54] View Article

[55] Google Scholar

[ref20] 20. Wang L, Kisi O, Zounemat-Kermani M, Li H. Pan evaporation modeling using six different heuristic computing methods in different climates of China. J Hydrol. 2017; 544: 407–427. https://doi.org/10.1016/j.jhydrol.2016.11.059
View Article
Google Scholar

[57] View Article

[58] Google Scholar

[ref21] 21. Jovic S, Nedeljkovic B, Golubovic Z, Kostic N. Evolutionary algorithm for reference evapotranspiration analysis. Comput Electron Agric. 2018; 150: 1–4. https://doi.org/10.1016/j.compag.2018.04.003
View Article
Google Scholar

[60] View Article

[61] Google Scholar

[ref22] 22. Landeras G, Ortiz-Barredo A, López JJ. Comparison of artificial neural network models and empirical and semi-empirical equations for daily reference evapotranspiration estimation in the Basque Country (Northern Spain). Agric Water Manag. 2008; 95: 553–565. https://doi.org/10.1016/j.agwat.2007.12.011
View Article
Google Scholar

[63] View Article

[64] Google Scholar

[ref23] 23. Kisi O. The potential of different ANN techniques in evapotranspiration modelling. Hydrol Process 2008; 22(14): 2449–2460. https://doi.org/10.1002/hyp.6837
View Article
Google Scholar

[66] View Article

[67] Google Scholar

[ref24] 24. Pour-Ali Baba A, Shiri J, Kisi O, Fard AF, Kim S, Amini R. Estimating daily reference evapotranspiration using available and estimated climatic data by adaptive neuro-fuzzy inference system (ANFIS) and artificial neural network (ANN). Hydrol Res. 2013; 44(1): 131–146. https://doi.org/10.2166/nh.2012.074

[ref25] 25. Sanikhani H, Kisi O, Maroufpoor E, Yaseen ZM. Temperature-based modeling of reference evapotranspiration using several artificial intelligence models: application of different modeling scenarios. Theor Appl Climatol. 2019; 135(1–2): 449–462. https://doi.org/10.1007/s00704-018-2390-z

[ref26] 26. Tabari H, Kisi O, Ezani A, Talaee PH. SVM, ANFIS, regression and climate based models for reference evapotranspiration modeling using limited climatic data in a semi-arid highland environment. J Hydrol. 2012; 444: 78–89. https://doi.org/10.1016/j.jhydrol.2012.04.007

[ref27] 27. Kisi O. Modeling reference evapotranspiration using three different heuristic regression approaches. Agric Water Manag. 2016; 169: 162–172. https://doi.org/10.1016/j.agwat.2016.02.026

[ref28] 28. Abdullah SS, Malek MA, Abdullah NS, Kisi O, Yap KS. Extreme learning machines: a new approach for prediction of reference evapotranspiration. J Hydrol. 2015; 527: 184–195. https://doi.org/10.1016/j.jhydrol.2015.04.073

[ref29] 29. Feng Y, Peng Y, Cui N, Gong D, Zhang K. Modeling reference evapotranspiration using extreme learning machine and generalized regression neural network only with temperature data. Comput Electron Agric. 2017; 136: 71–78. https://doi.org/10.1016/j.compag.2017.01.027

[ref30] 30. Gocić M, Motamedi S, Shamshirband S, Petković D, Ch S, Hashim R, et al. Soft computing approaches for forecasting reference evapotranspiration. Comput Electron Agric. 2015; 113: 164–173. https://doi.org/10.1016/j.compag.2015.02.010

[ref31] 31. Yin Z, Wen X, Feng Q, He Z, Zou S, Yang L. Integrating genetic algorithm and support vector machine for modeling daily reference evapotranspiration in a semi-arid mountain area. Hydrol Res. 2017; 48: 1177–1191. https://doi.org/10.2166/nh.2016.205

[ref32] 32. Ma X, Liu ZB. Predicting the oil production using the novel multivariate nonlinear model based on Arps decline model and kernel method. Neural Comput Appl. 2018; 29: 579–591. https://doi.org/10.1007/s00521-016-2721-x

[ref33] 33. Wu L, Huang G, Fan J, Zhang F, Wang X, Zeng W. Potential of kernel-based nonlinear extension of Arps decline model and gradient boosting with categorical features support for predicting daily global solar radiation in humid regions. Energy Convers Manag. 2019; 183: 280–295. https://doi.org/10.1016/j.enconman.2018.12.103

[ref34] 34. Hassan MA, Khalil A, Kaseb S, Kassem MA. Exploring the potential of tree-based ensemble methods in solar radiation modeling. Appl Energy. 2017; 203: 897–916. https://doi.org/10.1016/j.apenergy.2017.06.104

[ref35] 35. Pal M, Deswal S. M5 model tree based modelling of reference evapotranspiration. Hydrol Process. 2009; 23: 1437–1443. https://doi.org/10.1002/hyp.7266

[ref36] 36. Rahimikhoob A. Comparison between M5 model tree and neural networks for estimating reference evapotranspiration in an arid environment. Water Resour Manag. 2014; 28: 657–669. https://doi.org/10.1007/s11269-013-0506-x

[ref37] 37. Kisi O, Kilic Y. An investigation on generalization ability of artificial neural networks and M5 model tree in modeling reference evapotranspiration. Theor Appl Climatol. 2016; 126: 413–425. https://doi.org/10.1007/s00704-015-1582-z

[ref38] 38. Feng Y, Cui N, Gong D, Zhang Q, Zhao L. Evaluation of random forests and generalized regression neural networks for daily reference evapotranspiration modelling. Agric Water Manag. 2017; 193: 163–173. https://doi.org/10.1016/j.agwat.2017.08.003

[ref39] 39. Shiri J. Improving the performance of the mass transfer-based reference evapotranspiration estimation approaches through a coupled wavelet-random forest methodology. J Hydrol. 2018; 561: 737–750. https://doi.org/10.1016/j.jhydrol.2018.04.042

[ref40] 40. Fan J, Yue W, Wu L, Zhang F, Cai H, Wang X, et al. Evaluation of SVM, ELM and four tree-based ensemble models for predicting daily reference evapotranspiration using limited meteorological data in different climates of China. Agric For Meteorol. 2018; 263: 225–241. https://doi.org/10.1016/j.agrformet.2018.08.019

[ref41] 41. Mehdizadeh S, Behmanesh J, Khalili K. Using MARS, SVM, GEP and empirical equations for estimation of monthly mean reference evapotranspiration. Comput Electron Agric. 2017; 139: 103–114. https://doi.org/10.1016/j.compag.2017.05.002

[ref42] 42. Shiri J. Evaluation of FAO56-PM, empirical, semi-empirical and gene expression programming approaches for estimating daily reference evapotranspiration in hyper-arid regions of Iran. Agric Water Manag. 2017; 188: 101–114. https://doi.org/10.1016/j.agwat.2017.04.009

[ref43] 43. Zhao SQ. A new scheme for comprehensive physical regionalization in China. Acta Geogr Sin. 1983; 38: 1–10. https://doi.org/10.11821/xb198301001

[ref44] 44. Yao N, Li Y, Lei T, Peng L. Drought evolution, severity and trends in mainland China over 1961–2013. Sci Total Environ. 2018; 616: 73–89. pmid:29107781

[ref45] 45. Specht DF. A general regression neural network. IEEE Trans Neural Netw. 1991; 2(6): 568–576. pmid:18282872

[ref46] 46. Wang L, Kisi O, Zounemat-Kermani M, Salazar GA, Zhu Z, Gong W. Solar radiation prediction using different techniques: model evaluation and comparison. Renew Sustain Energy Rev. 2016; 61: 384–397. https://doi.org/10.1016/j.rser.2016.04.024

[ref47] 47. Jang JS. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans Syst Man Cybern. 1993; 23: 665–685. https://doi.org/10.1109/21.256541

[ref48] 48. Quinlan JR. Learning with continuous classes. In 5th Australian joint conference on artificial intelligence, 1992; 92: 343–348.

[ref49] 49. Wang Y, Witten IH. Induction of model trees for predicting continuous classes. In: Proceedings of the Poster Papers of the European Conference on Machine Learning, University of Economics, Faculty of Informatics and Statistics, Prague, 1997.

[ref50] 50. Chen T, He T, Benesty M, Khotilovich V, Tang Y. Xgboost: extreme gradient boosting. R package version 0.4-2. 2015; 1–4.

[ref51] 51. Vapnik V. The nature of statistical learning theory. Springer Science & Business Media; 2003.

[ref52] 52. Ma X, Xie M, Wu W, Zeng B, Wang Y, Wu X. The novel fractional discrete multivariate grey system model and its applications. Appl Math Model. 2019; 70: 402–424. https://doi.org/10.1016/j.apm.2019.01.039

[ref53] 53. Ma X, Liu Z. The GMC (1, n) model with optimized parameters and its application. J Grey Syst. 2017; 29(4): 122–138.

[ref54] 54. Ma X. A brief introduction to the Grey Machine Learning; 2018. Preprint. Available from: arXiv:1805.01745.

[ref55] 55. Friedman JH. Multivariate adaptive regression splines. Ann Stat. 1991; 19(1): 1–67. https://doi.org/10.1214/aos/1176347963

[ref56] 56. Fan J, Wu L, Zhang F, Cai H, Ma X, Bai H. Evaluation and development of empirical models for estimating daily and monthly mean daily diffuse horizontal solar radiation for different climatic regions of China. Renew Sustain Energy Rev. 2019; 105: 168–186. https://doi.org/10.1016/j.rser.2019.01.040

[ref57] 57. Fan J, Wu L, Zhang F, Cai H, Zeng W, Wang X, et al. Empirical and machine learning models for predicting daily global solar radiation from sunshine duration: A review and case study in China. Renew Sustain Energy Rev. 2019; 100: 186–212. https://doi.org/10.1016/j.rser.2018.10.018

[ref58] 58. Guo X, Sun X, Ma J. Prediction of daily crop reference evapotranspiration (ET0) values through a least-squares support vector machine model. Hydrol Res. 2011; 42: 268–274. https://doi.org/10.2166/nh.2011.072

[ref59] 59. Fan J, Wang X, Wu L, Zhou H, Zhang F, Yu X, et al. Comparison of Support Vector Machine and Extreme Gradient Boosting for predicting daily global solar radiation using temperature and precipitation in humid subtropical climates: A case study in China. Energy Convers Manag. 2018; 164: 102–111. https://doi.org/10.1016/j.enconman.2018.02.087

S1 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Yinchuan and Erenhot in the semi-arid steppe of Inner Mongolia.

S2 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Harbin and Shenyang in the (semi-)humid cold-temperate northeast China.

S3 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Beijing and Zhengzhou in the semi-humid warm-temperate north China.

S4 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Geermu and Lasa in the Qinghai-Tibetan Plateau.

S5 Fig. Scatter plots of predicted ET₀ values using the eight machine learning models against their corresponding FAO56-PM values during testing at Wuhan and Guilin in the humid subtropical central China.