Published in: Water Resources Management 1/2021

17-11-2020

Ensemble Boosting and Bagging Based Machine Learning Models for Groundwater Potential Prediction

Authors: Amirhosein Mosavi, Farzaneh Sajedi Hosseini, Bahram Choubin, Massoud Goodarzi, Adrienn A. Dineva, Elham Rafiei Sardooi

Abstract

Groundwater is one of the principal freshwater resources, and the rapidly increasing demand for it creates an urgent need for prediction systems that estimate groundwater potential more accurately and so support informed groundwater resource management. Ensemble machine learning methods are generally reported to produce more accurate results. However, proposing novel ensemble models is not enough on its own: comparative studies that evaluate their performance are equally essential for identifying the most suitable methods. The current study therefore assesses the performance of four ensemble models, namely the boosted generalized additive model (GamBoost), adaptive boosting classification trees (AdaBoost), bagged classification and regression trees (Bagged CART), and random forest (RF). The models were built from the locations of 339 groundwater resources and a set of spatial groundwater potential conditioning factors. The recursive feature elimination (RFE) method was then applied to identify the key features. RFE indicated that the best number of features for groundwater potential modeling was 12 of the 15 variables (with a mean accuracy of about 0.84). The modeling results indicated that the bagging models (i.e., RF and Bagged CART) performed better than the boosting models (i.e., AdaBoost and GamBoost). Overall, the RF model outperformed the other models (accuracy = 0.86, Kappa = 0.67, precision = 0.85, and recall = 0.91). The predictive variables topographic position index, valley depth, drainage density, elevation, and distance from stream contributed most to the modeling process. The groundwater potential maps predicted in this study can help water resources managers and policymakers in watershed and aquifer management to sustain optimal exploitation of this important freshwater resource.
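
The four scores reported above (accuracy, Kappa, precision, and recall) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code: the function name `classification_metrics` and the counts in the usage example are hypothetical and chosen only for illustration, since the abstract does not give the study's confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, Cohen's kappa, precision, and recall
    from the four cells of a binary confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives (illustrative inputs).
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement between predictions and labels.
    accuracy = (tp + tn) / n

    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)

    # Kappa rescales accuracy so that chance-level agreement scores 0.
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "precision": precision,
        "recall": recall,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    scores = classification_metrics(tp=40, fp=10, fn=5, tn=45)
    for name, value in scores.items():
        print(f"{name}: {value:.2f}")
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when the presence and absence classes are unbalanced.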

Metadata
Title
Ensemble Boosting and Bagging Based Machine Learning Models for Groundwater Potential Prediction
Authors
Amirhosein Mosavi
Farzaneh Sajedi Hosseini
Bahram Choubin
Massoud Goodarzi
Adrienn A. Dineva
Elham Rafiei Sardooi
Publication date
17-11-2020
Publisher
Springer Netherlands
Published in
Water Resources Management / Issue 1/2021
Print ISSN: 0920-4741
Electronic ISSN: 1573-1650
DOI
https://doi.org/10.1007/s11269-020-02704-3
