Published in: Earth Science Informatics 2/2024

Open Access 04.03.2024 | RESEARCH

Comparison of tree-based ensemble learning algorithms for landslide susceptibility mapping in Murgul (Artvin), Turkey

Written by: Ziya Usta, Halil Akıncı, Alper Tunga Akın


Abstract

Turkey’s Artvin province is prone to landslides due to its geological structure, rugged topography, and climatic characteristics with intense rainfall. In this study, landslide susceptibility maps (LSMs) of Murgul district in Artvin province were produced. The study employed tree-based ensemble learning algorithms, namely Random Forest (RF), Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), and eXtreme Gradient Boosting (XGBoost). Landslide susceptibility mapping was performed using 13 factors: altitude, aspect, distance to drainage, distance to faults, distance to roads, land cover, lithology, plan curvature, profile curvature, slope, slope length, topographic position index (TPI), and topographic wetness index (TWI). The study utilized a landslide inventory consisting of 54 landslide polygons. The landslide inventory dataset contained 92,446 pixels with a spatial resolution of 10 m. Consistent with the literature, the majority of landslide pixels (70% – 64,712 pixels) were used for model training, and the remaining portion (30% – 27,734 pixels) was used for model validation. Overall accuracy, precision, recall, F1-score, root mean square error (RMSE), and area under the receiver operating characteristic curve (AUC-ROC) were used as validation metrics. LightGBM and XGBoost outperformed the other algorithms on all validation metrics. Additionally, SHapley Additive exPlanations (SHAP) were utilized to explain and interpret the model outputs. According to the LightGBM algorithm, the most influential factors in the occurrence of landslides in the study area were altitude, lithology, distance to faults, and aspect, whereas TWI, plan curvature, and profile curvature were the least influential factors. Finally, it was concluded that the produced LSMs would contribute significantly to the efforts of decision makers to reduce the damage caused by landslides in the study area.
Notes
Communicated by: H. Babaie

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Introduction

Landslides are one of the most common natural disasters in areas with rugged topography. Landslides triggered by seismic activities, heavy rainfall, or human activities cause significant economic damage, environmental destruction, and loss of life, especially in areas with steep slopes. Therefore, reducing and preventing landslide-related damages continue to be important research topics in disaster management (Zhang et al. 2022a). To reduce landslide-related damages and losses, landslide susceptibility (LS) assessment should be conducted in areas prone to landslides, and landslide susceptibility maps (LSMs) for these areas should be generated (Das et al. 2023). It is possible to identify areas with a high probability of future landslides through LS mapping (Ye et al. 2022).
Various statistical methods have been used to generate LSMs. Logistic regression (LR) (Yilmaz 2009; Kavzoglu et al. 2014; Dağ et al. 2020), frequency ratio (FR) method (Akgun et al. 2008; Demir 2019; Akinci and Yavuz Ozalp 2021), and weights of evidence (Lee and Choi 2004; Sifa et al. 2020; Wang et al. 2020a) are the most commonly used statistical methods. However, Wei et al. (2022) mentioned that while statistical methods are relatively easy to use, they fall short in explaining the complex, inconsistent, and nonlinear relationships between landslide events and the conditioning factors that influence them. Therefore, in recent years, machine learning (ML) and deep learning-based models have gained significant attention in this field.
In the past decade, individual or stand-alone ML algorithms such as artificial neural networks (Pradhan and Lee 2010; Aditian et al. 2018; Akinci 2022), decision trees (Tien Bui et al. 2012; Pradhan 2013), naive Bayes (Chen et al. 2018; Pourghasemi et al. 2018), and support vector machines (SVM) (Goetz et al. 2015; Colkesen et al. 2016; Akinci and Zeybek 2021) have been widely used in LS mapping. On the other hand, several studies in the literature have reported that ensemble learning algorithms, such as adaptive boosting (AdaBoost), random forest (RF), gradient boosting machine (GBM), light gradient boosting machine (LightGBM), categorical boosting (CatBoost), and extreme gradient boosting (XGBoost), outperform individual ML algorithms in terms of prediction accuracy, stability, and robustness (Kavzoglu and Teke 2022; Liu et al. 2022; Wei et al. 2022). Ensemble learning algorithms generate multiple models using different samples of the same dataset and combine the predictions of these models to produce more accurate solutions and improve prediction performance. Sahin (2022) and Ye et al. (2022) emphasized that CatBoost, LightGBM, and XGBoost are rarely used and not extensively evaluated in LS mapping studies, mainly because they are relatively new compared to other ML models. Although tree-based ensemble learning algorithms demonstrate better performance than single models, it has not yet been determined which algorithm is the most suitable for LS mapping (Wei et al. 2022; Ye et al. 2022). Different studies highlight different algorithms, depending on the attributes of the research location, the conditioning factors used, and the values of the algorithms’ hyperparameters. Therefore, more studies comparing the performance of different algorithms are needed for a comprehensive evaluation.
In Turkey, due to its geological, geomorphological, topographic, and climatic characteristics, regions such as the Black Sea Region, Eastern Anatolia, and Central Anatolia are frequently affected by landslides. Approximately 20% of landslides in Turkey occur in the Eastern Black Sea Region. The provinces with the highest occurrence of landslides in this region are Trabzon, Rize, Giresun, and Artvin, respectively. According to the statistical data from the Disaster and Emergency Management Presidency (AFAD) covering the years 1950–2019, Trabzon experienced 1517 landslides, Rize had 1319, Giresun had 913, and Artvin had 765 landslide events (AFAD 2020). In the study by Dalkes and Korkmaz (2023), LSMs of Akçaabat and Düzköy districts of Trabzon province were produced using Analytic Hierarchy Process (AHP) and FR methods. A comparison of the LS maps produced by both methods showed that the FR method yielded more accurate results than the AHP method in identifying the locations of the observed landslides in the study area. Yavuz Ozalp et al. (2023) produced LSMs of Ardeşen and Fındıklı districts of Rize province using tree-based ensemble learning algorithms such as RF, GBM, CatBoost and XGBoost. Using the ROC curve and AUC metric, the researchers found that CatBoost performed slightly better than the other models. In the study by Kaya Topaçli et al. (2024), LR and RF models were used to produce LSMs of the Bolaman river basin of Ordu province, located in the Eastern Black Sea Region in Turkey. The study area is one of the basins where landslides occur most frequently in Turkey, as in many places in the Eastern Black Sea Region. The validation results of the study showed that RF is superior to LR in terms of performance. Landslides that damage structures and infrastructure and cause economic losses and loss of life can be observed in all districts of Artvin.
However, a review of the literature reveals that LS assessments have been conducted in the districts of Arhavi, Hopa, Kemalpaşa, Merkez, Ardanuç, and Şavşat in Artvin (Akinci et al. 2020, 2021; Akinci and Yavuz Ozalp 2021; Akinci and Zeybek 2021), while no studies have been conducted in the districts of Borçka, Murgul, and Yusufeli.
The aim of this study was to produce susceptibility maps showing areas prone to landslides in Murgul district of Artvin province and to identify the main factors contributing to landslide susceptibility. Prediction models based on the RF, LightGBM, CatBoost and XGBoost ML algorithms were used for susceptibility mapping. The predictive performance of the susceptibility models was evaluated using overall accuracy, precision, recall, F1-score, root mean square error (RMSE), and AUC-ROC. Furthermore, the SHapley Additive exPlanations (SHAP) approach, which aims to explain the local behavior of black box ML models, was used to improve the interpretability of model predictions. A review of the current literature shows that only a limited number of studies have used the SHAP approach in LS modeling (Pradhan et al. 2023; Sun et al. 2023a; Teke and Kavzoglu 2023; Vega et al. 2023; Youssef et al. 2023; Zhang et al. 2023). Therefore, the innovative aspect and contribution of this study to the literature is the use of SHAP in LS modeling.

Material and methods

The aim of this study is to produce LSMs of the study area using tree-based ensemble learning algorithms and to compare the performance of the ML models used. The study can be broadly divided into 6 steps: (1) prepare spatial data for landslide conditioning factors; (2) test the independence of conditioning factors using multicollinearity analysis; (3) collect landslide inventory data and create training and validation datasets; (4) produce LSMs using RF, LightGBM, CatBoost, and XGBoost models; (5) evaluate the performance of the models using validation metrics; (6) interpret the models using SHAP values and explain the formation mechanisms of landslides in the study area. This study was carried out using ArcGIS 10.5, SAGA GIS 7.9, Python, and the Scikit-learn library. A flowchart summarizing the methodology applied in this study, inspired by Yu et al. (2023), is shown in Fig. 1.

Study area and geological structure

The study area consists of a bounding box covering the Murgul district. This area is located between 41° 7′ 1.39″—41° 22′ 24.86″ north latitude and 41° 27′ 55.41″—41° 41′ 1.53″ east longitude, with a total area of 48,938.23 hectares. The average elevation in the study area is 1347 m, and the elevation ranges from 100 to 3370 m (Fig. 2). The study area has a highly rugged topography, and the slope values change from 0° to 76.33°. The mean slope value in the study area is 29.09°. In the study area, 4.09% of the area has a slope below 10°, 15.78% has a slope between 10° and 20°, and 80.13% has a slope above 20°.
The study area includes 13 villages along with the town center of Murgul. Two of these villages, Civan and Akpınar, are administratively affiliated with the Borçka district (Fig. 2). Based on the data from the Turkish Statistical Institute (TURKSTAT), the total population of Murgul district in 2021 is 6522 (TURKSTAT 2023). Out of this population, 5020 reside in the town center, while 1502 live in the villages. With the inclusion of the 2 villages affiliated with the Borçka district, the total population in the study area reaches 6896 people.
The study area is characterized by a Black Sea climate. Based on the meteorological measurements collected from the General Directorate of Meteorology for the years 2015–2022, the average temperature in Murgul district is 12.76 °C. The yearly mean rainfall in Murgul is 980.63 mm. The highest temperature recorded in the district was 31 °C in August 2022, while the lowest temperature was -3 °C in December 2016.
In this study, a 1/100,000 scale geological map obtained from the General Directorate of Mineral Research and Exploration (GDMRE) was used (Keskin 2013a; 2013b). Based on this map, there are 18 different lithological units in the study area (Fig. 3). The study area comprises lithological units ranging in age from Late Cretaceous (Turonian-Coniacian) to Quaternary (Table 1). Four lithological units account for 78.52% of the study area: Kızılkaya Formation (Kk) covers 24.28%, Kabaköy Formation (Tek) covers 23.66%, Çatak Formation (Kç) covers 20.6%, and Çağlayan Formation (Kça) covers 9.98%. The Santonian-aged Kızılkaya Formation (Kk) consists of rhyodacitic, dacitic lavas, and pyroclastics.
Table 1
Lithological units of the study area

| Pixel value | Symbol | Age / Formation name | The content of the formation |
|---|---|---|---|
| 1 | Qal | Quaternary / - | Alluvium |
| 2 | Qym | Quaternary / - | Slope debris |
| 3 | Tekçd | Middle-Upper Eocene / - | Diorite |
| 4 | Tekça | Middle-Upper Eocene / - | Dacite, rhyodacite |
| 5 | Tet | Middle Eocene / Taşpınar formation | Andesitic and dacitic volcanics, volcano-clastic turbiditic rocks |
| 6 | Tek | Middle Eocene / Kabaköy formation | Andesite, basaltic lava and pyroclastics, sandy limestone, sandstone, marl, tuff |
| 7 | Tee | Middle Eocene / Erenler formation | Mudstone, claystone, sandstone alternation |
| 8 | Tpeb | Paleocene-Lower Eocene / Bakırköy formation | Siltstone, claystone, sandstone, clayey limestone, marl |
| 9 | Kk1 | Upper Cretaceous-Paleocene / Kaçkar granitoid-I | Granite, granodiorite, quartz diorite, adamellite, gabbro, diabase |
| 10 | Kk1kd | Upper Cretaceous-Paleocene / - | Quartz diorite, diorite |
| 11 | KTc | Maastrichtian-Danian / Cankurtaran formation | Sandy limestone, micritic limestone, tuff, marl, volcanics sandstone, agglomerate |
| 12 | KTct3 | Maastrichtian-Danian / - | Tuff, marl, limestone, sandstone |
| 13 | KTck | Maastrichtian-Danian / - | Limestone (gray-red colored) |
| 14 | KTa | Maastrichtian-Danian / Ağıllar formation | Reefal limestone, sandy limestone |
| 15 | Kçb | Maastrichtian / Çayırbağ formation | Dacite, rhyolite, rhyodacitic lava and pyroclastics |
| 16 | Kça | Campanian–Maastrichtian / Çağlayan formation | Basaltic, andesitic lava and pyroclastics, mudstone, sandstone |
| 17 | Kk | Santonian / Kızılkaya formation | Rhyodacitic, dacitic lava and pyroclastics |
| 18 | Kç | Turonian-Coniacian / Çatak formation | Basalt, andesitic lava and pyroclastics, clayey limestone, marl, siltstone, claystone |

The Kabaköy Formation (Tek), which is of Middle Eocene age, starts with clastics and carbonates and includes andesitic and basaltic lavas and pyroclastics, as well as conglomerates, sandy limestone, sandstone, marl, and tuff. The Late Cretaceous (Turonian-Coniacian) Çatak Formation (Kç) comprises basaltic andesitic lavas and pyroclastics, along with argillaceous limestone, marl, siltstone, and shale. The Campanian–Maastrichtian-aged Çağlayan Formation (Kça) is composed of basaltic andesitic lavas and pyroclastics, as well as mudstone and sandstone (Keskin 2013a; 2013b). The rocks contained in the other lithological units in the study area are given in Table 1.

Landslide inventory

Landslide susceptibility (LS) mapping studies require landslide inventory data both for model training and validation stages. For models to generate reliable results, the inventory data needs to be up-to-date, accurate, and complete. The landslide inventory map (LIM) used in the study contains 54 landslide polygons (Fig. 2).
Of the landslide polygons, 11 were obtained from the 1/25,000 scale LIM generated by the GDMRE. In this inventory, landslide polygons are classified as Type 1, Type 2, and Type 4. Type 1 represents inactive landslides, Type 2 represents active landslides, and Type 4 represents active flows (Erener et al. 2016; Duman and Çan 2023). The remaining 43 landslide polygons used in the study were obtained from the landslide inventory map produced by the Artvin Provincial Disaster and Emergency Directorate. Landslide data for Turkey indicate that translational and rotational landslides predominate, along with complex landslides that combine multiple types (AFAD 2020). According to Varnes’ (1978) slope movement classification, of these 43 landslides, 1 is categorized as a rotational slide, 8 as translational slides, 24 as flows, 6 as debris flows, and 4 as complex landslides. The total area covered by the landslide polygons is 9,241,303.99 m², which corresponds to approximately 2% of the study area. The smallest and the largest landslide polygons in the study area have areas of 38.32 m² and 2,809,890 m², respectively.
The 54 landslide polygons in the study area were converted to raster format at a spatial resolution of 10 m. As a result of this conversion, the value “1” was assigned to 92,446 pixels, which are referred to as positive examples. To create negative examples, an equal number of non-landslide pixels was randomly selected during the Python implementation of the ML models, and these pixels were assigned the value “0”. Subsequently, the total of 184,892 pixels, consisting of landslide and non-landslide examples, was divided into two datasets in a 70/30 ratio, following the literature (Sahin 2022; Youssef and Pourghasemi 2021; Akinci 2022; Liu et al. 2022; Wei et al. 2022). These datasets were employed for the training and validation of the models.
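The sampling and splitting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `balanced_split`, the random seed, and the number of candidate non-landslide pixels are hypothetical, and the 70/30 split is assumed to be applied per class, which matches the 64,712 and 27,734 landslide pixels reported in the abstract.

```python
import random

def balanced_split(n_landslide, n_candidates, train_frac=0.7, seed=42):
    """Draw as many non-landslide pixels as landslide pixels, then split each class."""
    rng = random.Random(seed)
    # Positive examples (label 1): ids 0..n_landslide-1 within the landslide set.
    positives = list(range(n_landslide))
    rng.shuffle(positives)
    # Negative examples (label 0): ids sampled without replacement from the
    # candidate non-landslide pixels (a separate id space from the positives).
    negatives = rng.sample(range(n_candidates), n_landslide)
    n_train = round(train_frac * n_landslide)  # training size per class
    train = [(i, 1) for i in positives[:n_train]] + [(j, 0) for j in negatives[:n_train]]
    valid = [(i, 1) for i in positives[n_train:]] + [(j, 0) for j in negatives[n_train:]]
    return train, valid

# 92,446 landslide pixels as reported; n_candidates is a placeholder value.
train, valid = balanced_split(n_landslide=92_446, n_candidates=4_000_000)
print(len(train), len(valid))
```

Splitting each class separately keeps both datasets balanced between landslide and non-landslide pixels.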

Landslide conditioning factors

In this study, LSMs were generated using 13 factors: land cover, aspect, slope, lithology, elevation, plan curvature, profile curvature, distance to drainage networks, distance to faults, distance to roads, slope length, topographic position index (TPI), and topographic wetness index (TWI). These factors were determined based on the availability or producibility of spatial data, the geological and environmental characteristics of the study area, and the relevant literature. Raster-based factor maps were produced at a spatial resolution of 10 m using ArcGIS 10.5 and SAGA GIS 7.9 software (Fig. 4 and Fig. 5). Digital topographic maps of the study area were obtained from the General Directorate of Mapping. The digital elevation model (DEM) of the study area, with a resolution of 10 m, was produced from the contour lines in these 1/25,000 scale topographic maps. Aspect, slope, elevation, plan curvature, and profile curvature were derived from this DEM in ArcGIS 10.5. Slope length, TPI, and TWI were produced from the same DEM in SAGA GIS. As explained in Section "Study area and geological structure", the lithological units and fault lines were obtained from the 1/100,000 scale geological map provided by GDMRE. The distance-to-faults map was generated using the “Euclidean Distance” function in ArcGIS. The drainage network of the study area was extracted from the DEM in SAGA GIS, and the distance-to-drainage map was again produced with the “Euclidean Distance” function. A comprehensive digital road network dataset covering highways, village roads, and forest roads in the study area was obtained from the Artvin Regional Directorate of Forestry. The distance-to-roads map was produced with the same “Euclidean Distance” function.
The 10 m resolution land cover dataset of the study area was obtained from ESRI (https://livingatlas.arcgis.com/landcover/). The relationship between conditioning factors and landslide occurrences has been well explained in numerous studies (Gomez and Kavzoglu 2005; Kavzoglu et al. 2014; Pourghasemi and Rahmati 2018; Zhao et al. 2019; Dağ et al. 2020; Youssef and Pourghasemi 2021; Ye et al. 2022), hence detailed discussions on this topic are not provided in this article. Instead, basic statistical data related to conditioning factors are presented (Table 2).
Table 2
Statistical description of the conditioning factors and their sub-classes

| Factor | Min | Max | Sub-classes | Reference |
|---|---|---|---|---|
| Altitude (m) | 100 | 3370 | 1: 100–427, 2: 427–754, 3: 754–1081, 4: 1081–1408, 5: 1408–1735, 6: 1735–2062, 7: 2062–2389, 8: 2389–2716, 9: 2716–3043, 10: 3043–3370 | Chen et al. 2017; Kilicoglu 2021; Akinci 2022; Yavuz Ozalp et al. 2023 |
| Aspect | - | - | 1: Flat, 2: North, 3: Northeast, 4: East, 5: Southeast, 6: South, 7: Southwest, 8: West, 9: Northwest | Sun et al. 2022; He et al. 2023; Vega et al. 2023; Yavuz Ozalp et al. 2023 |
| Distance to drainage (m) | 0 | 928 | 1: 0–100, 2: 100–200, 3: 200–300, 4: 300–400, 5: 400–500, 6: 500–600, 7: 600–700, 8: 700–800, 9: 800–928 | Akinci 2022; Sun et al. 2022; Arabameri et al. 2020; Yavuz Ozalp et al. 2023 |
| Distance to faults (m) | 0 | 7723.34 | 1: 0–1000, 2: 1000–2000, 3: 2000–3000, 4: 3000–4000, 5: 4000–5000, 6: 5000–6000, 7: 6000–7000, 8: 7000–7723.34 | Feizizadeh et al. 2014; Akinci et al. 2020; Akinci 2022; Yavuz Ozalp et al. 2023 |
| Distance to roads (m) | 0 | 2197.84 | 1: 0–200, 2: 200–400, 3: 400–600, 4: 600–800, 5: 800–1000, 6: 1000–1200, 7: 1200–1400, 8: 1400–1600, 9: 1600–1800, 10: 1800–2197.84 | Akinci 2022; Zhang et al. 2022a; He et al. 2023; Yavuz Ozalp et al. 2023 |
| Lithology | - | - | Explained in Table 1 | |
| Land cover | - | - | 1: Water, 2: Trees, 3: Grass, 5: Crops, 6: Scrub/shrub, 7: Built Area, 8: Bare ground, 9: Snow/Ice, 10: Clouds | Lv et al. 2022; Roy et al. 2023; Yavuz Ozalp et al. 2023; Yu et al. 2023 |
| Plan curvature | -24.50 | 33.33 | 1: -24.50 – -0.001, 2: -0.001 – 0.001, 3: 0.001 – 33.33 | Kilicoglu 2021; Akinci 2022; Yavuz Ozalp et al. 2023 |
| Profile curvature | -40.17 | 36.42 | 1: -40.17 – -0.001, 2: -0.001 – 0.001, 3: 0.001 – 36.42 | Kilicoglu 2021; Akinci 2022; Yavuz Ozalp et al. 2023 |
| Slope (°) | 0 | 76.33 | 1: 0–5, 2: 5–10, 3: 10–15, 4: 15–20, 5: 20–25, 6: 25–30, 7: 30–35, 8: 35–40, 9: 40–45, 10: 45–76.33 | Kilicoglu 2021; Akinci 2022; Sun et al. 2022; Yavuz Ozalp et al. 2023 |
| Slope length | 0 | 3577.64 | 1: 0–56.12, 2: 56.12–140.30, 3: 140.30–252.54, 4: 252.54–392.84, 5: 392.84–561.20, 6: 561.20–785.68, 7: 785.68–1080.31, 8: 1080.31–1487.18, 9: 1487.18–2090.47, 10: 2090.47–3577.64 | Hong et al. 2015; Akinci 2022; Ghasemian et al. 2022; Yavuz Ozalp et al. 2023 |
| TPI | -70.88 | 79.58 | 1: -70.88 – -21.32, 2: -21.32 – -13.65, 3: -13.65 – -7.75, 4: -7.75 – -3.03, 5: -3.03 – 1.10, 6: 1.10 – 5.23, 7: 5.23 – 9.95, 8: 9.95 – 15.85, 9: 15.85 – 25.29, 10: 25.29 – 79.58 | Arabameri et al. 2020; Sahin 2020; Akinci 2022; Yavuz Ozalp et al. 2023 |
| TWI | 2.26 | 26.18 | 1: 2.26–4.79, 2: 4.79–5.92, 3: 5.92–6.86, 4: 6.86–7.80, 5: 7.80–8.92, 6: 8.92–10.33, 7: 10.33–12.11, 8: 12.11–14.46, 9: 14.46–18.12, 10: 18.12–26.18 | Sahin 2020; Kilicoglu 2021; Akinci 2022; Yavuz Ozalp et al. 2023 |
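
The altitude sub-classes in Table 2 are consistent with ten equal-width intervals of 327 m between 100 m and 3370 m. The sketch below reproduces those class boundaries and assigns a raw value to its sub-class. The function names are hypothetical. The rule for values that fall exactly on a boundary (assigned here to the upper class) is an assumption, because the table does not state it. Other factors in the table use unequal class widths, so this breaks helper does not apply to them.

```python
from bisect import bisect_right

def equal_interval_breaks(vmin, vmax, n_classes):
    """Interior class boundaries for n_classes equal-width intervals."""
    width = (vmax - vmin) / n_classes
    return [vmin + k * width for k in range(1, n_classes)]

def classify(value, breaks):
    """Return the 1-based class index; a boundary value goes to the upper class."""
    return bisect_right(breaks, value) + 1

altitude_breaks = equal_interval_breaks(100, 3370, 10)
print(altitude_breaks)                   # interior boundaries: 427, 754, ..., 3043
print(classify(1347, altitude_breaks))   # class of the mean elevation of the study area
```

For factors with unequal class widths, such as slope length or TWI, the same `classify` function can be used with the boundaries copied directly from the table.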

Machine learning algorithms used in the study

Random forest (RF)

The Random Forest (RF) algorithm, originally proposed by Breiman (2001), is an ensemble learning algorithm designed for nonparametric multivariate classification and is frequently used in both classification and regression tasks. It has been widely adopted in LS mapping studies and has been discussed in detail by Catani et al. (2013). In contrast to a single decision tree, which is susceptible to overfitting and may exhibit high variance or bias (Taalab et al. 2018; Park and Kim 2019), RF generates multiple decision trees and aggregates their predictions to arrive at a final classification (Youssef et al. 2016). This approach allows the algorithm to mitigate the weaknesses of individual trees and improve predictive performance. Each decision tree is built from a randomly selected subset of the training data, and the final prediction of RF is obtained by aggregating the predictions of all the decision trees (Akinci et al. 2020).
Two parameters need to be defined when creating an RF: the number of decision trees (ntree) and the number of variables or factors used at each node of the decision tree (mtry). Although there is no definitive rule for selecting the number of trees in RF, increasing the number of trees does not guarantee an improvement in the model’s accuracy (Taalab et al. 2018). The number of variables used at each node of the decision tree, in turn, is recommended to be equal to the square root of the total number of variables (Chen et al. 2020). The out-of-bag (OOB) error, which is the percentage of misclassifications over all out-of-bag samples, is used to estimate the generalization error and evaluate the importance of variables (Achour and Pourghasemi 2020; Cao et al. 2020). In this study, the RF method was implemented using the scikit-learn Random Forest regressor with the ntree and mtry parameters set to 100 and 13, respectively. A tenfold cross-validation approach was used to validate the consistency of the model’s results.
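To make the out-of-bag idea concrete, the following sketch draws one bootstrap sample per tree and measures how many training samples each tree never sees. It assumes standard bootstrap sampling with replacement, which the text above does not spell out, and all function names are illustrative. It also evaluates the square-root rule for 13 factors, for comparison with the value of 13 used in this study. In scikit-learn, ntree and mtry correspond to the `n_estimators` and `max_features` parameters of the random forest estimators.

```python
import math
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (one bootstrap sample per tree)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def oob_fraction(n_samples, n_trees, seed=0):
    """Average share of samples that are left out of a tree's bootstrap sample."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        in_bag = set(bootstrap_indices(n_samples, rng))
        total += (n_samples - len(in_bag)) / n_samples
    return total / n_trees

n_factors = 13
mtry_sqrt_rule = math.isqrt(n_factors)  # floor(sqrt(13)) = 3
frac = oob_fraction(n_samples=2000, n_trees=100)
print(mtry_sqrt_rule, round(frac, 3))
```

The out-of-bag share approaches 1 − 1/e ≈ 0.368 for large samples, so roughly a third of the training data is available to each tree as an internal validation set.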

Extreme gradient boosting (XGBoost)

XGBoost, introduced by Chen and Guestrin (2016), is a combination of the gradient boosting algorithm and decision tree models (Wei et al. 2022; Cao et al. 2020). XGBoost has gained popularity in LS mapping studies (Kavzoglu and Teke 2022). Its main advantage is its performance in terms of runtime speed and accuracy (Wei et al. 2022). The XGBoost algorithm utilizes a gradient boosting technique, which constructs a tree by splitting features and recursively adding trees (Zhang et al. 2020a, b). Each time a new tree is added, a new function is learned by fitting the residuals of the previous predictions. Each leaf node of a tree stores a score, and a sample’s predicted value is the sum of the scores of the leaves it falls into across all trees (Ye et al. 2022). The aim of the model is to minimize the difference between the predicted value and the true value by minimizing the loss function of the training data, as shown in Eq. 1 (Wang et al. 2020b).
$${\varphi }^{t}=\sum_{i=1}^{n}l\left({y}_{i},\left({\widehat{y}}_{i}^{\left(t-1\right)}+{f}_{t}\left({x}_{i}\right)\right)\right)+\Omega \left({f}_{t}\right)$$
(1)
where i stands for the index of a given sample (i = 1, 2, 3, ⋯, n); n stands for the total number of y values; t stands for the iteration number; l(yi, ŷi) stands for the loss function between the actual value yi and the predicted value ŷi; xi stands for the features of the i’th sample; ft(xi) stands for the base learner added at the t’th iteration; Ω(ft) stands for the regularization term; and, finally, φt stands for the objective function.
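The additive structure of Eq. 1 can be illustrated with a deliberately simplified sketch: plain gradient boosting with a squared-error loss, one-split regression stumps as base learners, and no regularization term. It is not XGBoost itself, which also uses second-order gradient information and the penalty Ω. The toy data, the learning rate `eta`, and all function names are assumptions made for the illustration.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump (squared error) on 1-D inputs."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # excluding the maximum keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean

def boost(xs, ys, n_rounds=20, eta=0.3):
    """Additive model: each round fits a new stump f_t to the current residuals."""
    preds = [0.0] * len(xs)
    trees = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # y_hat(t) = y_hat(t-1) + eta * f_t(x)
        preds = [p + eta * tree(x) for p, x in zip(preds, xs)]
    return trees, preds

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

xs = list(range(30))                         # toy 1-D feature
ys = [1.0 if x >= 15 else 0.0 for x in xs]   # toy 0/1 target
trees, preds = boost(xs, ys)
print(mse(ys, [0.0] * len(ys)), mse(ys, preds))
```

Each round reduces the training error because the new stump is fitted to what the previous rounds left unexplained.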
ML algorithms have some parameters that need to be tuned during the training phase. These parameters, called hyperparameters, significantly affect the accuracy, performance and processing speed of the model (Yavuz Ozalp et al. 2023). XGBoost includes several hyperparameters that can be tuned to improve the performance of the model. These include “n_estimators” (the maximum number of iterations or trees), “max_depth” (the maximum depth of the trees), “eta” (the learning rate), “gamma” (the minimum loss reduction required to make a further split, which acts as a regularization parameter), “colsample_bytree” (the fraction of features or variables supplied to a tree), “min_child_weight” (the minimum sum of instance weight needed in a child), and “subsample” (the fraction of samples or observations supplied to a tree).

Light gradient boosting machine (LightGBM)

Light gradient boosting machine (LightGBM), first proposed by Ke et al. (2017), is a gradient boosting approach. LightGBM is designed to overcome the performance issues encountered by gradient boosting decision trees in data-intensive applications (Zhang et al. 2022b). Thanks to its support for efficient concurrent training, faster training speed, reduced memory usage, and distributed computation capabilities, it can handle big data efficiently and can be considered an improvement over XGBoost (Dai et al. 2021).
The efficiency of LightGBM comes from two novel optimization techniques: Gradient-based One-Side Sampling (GOSS), which reduces the number of data instances, and Exclusive Feature Bundling (EFB), which reduces the number of features (Zhang et al. 2022b). GOSS down-samples the data instances by discarding a portion of the instances with small gradients and retaining the instances with large gradients to evaluate information gain. More detailed information can be found in Ke et al. (2017). EFB reduces the number of features by bundling mutually exclusive features, i.e., features that rarely take non-zero values at the same time. To bundle features without compromising accuracy, a conflict rate is used to determine whether a feature should be bundled or not (Fang et al. 2021).
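The GOSS step can be sketched as follows, based on the description in Ke et al. (2017): all instances with the largest absolute gradients are kept, a random subset of the remaining instances is drawn, and the sampled small-gradient instances are up-weighted by (1 − a)/b so that the information gain estimate stays approximately unbiased. The function name, the default rates, and the toy gradients are assumptions, and this is not LightGBM's internal code.

```python
import random

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (index, weight) pairs chosen by Gradient-based One-Side Sampling."""
    rng = random.Random(seed)
    n = len(gradients)
    # Sort instance indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = round(top_rate * n)
    n_other = round(other_rate * n)
    top = order[:n_top]                        # keep all large-gradient instances
    rest = rng.sample(order[n_top:], n_other)  # random subset of small-gradient instances
    amplify = (1.0 - top_rate) / other_rate    # compensates for the down-sampling
    return [(i, 1.0) for i in top] + [(i, amplify) for i in rest]

rng = random.Random(1)
grads = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # toy gradient values
selected = goss_sample(grads)
print(len(selected))
```

With the default rates used here, only 30% of the instances take part in the split search, while the sampled small-gradient instances carry a weight of (1 − 0.2)/0.1.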
When creating the LightGBM model, some hyperparameters need to be tuned. The basic hyperparameters are “boosting_type”, “num_leaves”, “max_depth”, “num_iterations”, and “learning_rate” (Zhou et al. 2022; Omotehinwa et al. 2023). The “boosting_type” parameter defines the gradient boosting method to be run. Valid values are “gbdt”, “rf”, “dart”, and “goss”, and the default is “gbdt”. The “num_leaves” parameter refers to the maximum number of leaves in a tree. The “max_depth” parameter controls the maximum depth of each tree. The “num_iterations” parameter determines the number of boosting iterations, or trees to build. The “learning_rate” parameter is the shrinkage rate that scales the contribution of each newly added tree to the ensemble.

Categorical boosting (CatBoost)

Categorical boosting (CatBoost), first introduced by Prokhorenkova et al. (2018), is another improved gradient boosting technique. CatBoost can process categorical data along with numerical data and needs less training data compared to other ML methods (Sahin 2022). Repeatedly using the same data to construct trees leads to overfitting in gradient boosting, and CatBoost uses an ordered boosting technique to combat this problem. Combining ordered boosting with its processing of categorical values prevents the prediction shift that stems from a special type of target leakage (Pham et al. 2022). More details about the algorithm can be found in Prokhorenkova et al. (2018).
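The way CatBoost avoids target leakage when encoding categorical values can be illustrated with a sketch of ordered target statistics, following the idea in Prokhorenkova et al. (2018): samples are visited in a random order, and each categorical value is encoded using only the targets of the samples that precede it. The function name, the prior, the smoothing weight `a`, and the toy lithology labels are assumptions, and the real library averages over several permutations.

```python
import random

def ordered_target_stats(categories, targets, prior=0.5, a=1.0, seed=0):
    """Encode each categorical value using only samples that precede it in a random order."""
    rng = random.Random(seed)
    order = list(range(len(categories)))
    rng.shuffle(order)                 # random permutation of the training samples
    sums, counts = {}, {}
    encoded = [0.0] * len(categories)
    for idx in order:
        cat = categories[idx]
        s, c = sums.get(cat, 0.0), counts.get(cat, 0)
        # Smoothed mean target of earlier samples in the same category;
        # the sample's own target is not used.
        encoded[idx] = (s + a * prior) / (c + a)
        sums[cat] = s + targets[idx]
        counts[cat] = c + 1
    return encoded

cats = ["Kk", "Tek", "Kk", "Kç", "Tek", "Kk"]   # toy lithology symbols
ys = [1, 0, 1, 0, 0, 1]                          # toy landslide labels
print(ordered_target_stats(cats, ys))
```

Because a sample's own label never enters its encoding, changing that label leaves the encoded value of the sample unchanged, which is the leakage-free property the ordering is meant to provide.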
CatBoost has seven commonly used hyperparameters: iterations, depth, learning_rate, l2_leaf_reg, random_strength, rsm, and border_count (Yavuz Ozalp et al. 2023). The “iterations” parameter specifies the maximum number of trees to be used during training. The “depth” parameter defines the depth of each decision tree. The “learning_rate” determines the step size at each iteration while moving toward a minimum of a loss function. The “l2_leaf_reg” is the coefficient for the L2 regularization term of the cost function. The parameter “random_strength”, which is used to prevent overfitting in the model, expresses the amount of randomness to be used to score splits when the tree structure is selected. The “rsm”, random subspace method, refers to the percentage of features to be used in each split selection when features are randomly re-selected. Lastly, the “border_count” refers to the number of splits for numerical features (AWS 2024).

Multicollinearity analysis

To enhance the accuracy of a landslide susceptibility analysis using an ML model, it is crucial to test the independence of the model’s input variables (Yu et al. 2023). Multicollinearity analysis is conducted to identify any linear correlation between the conditioning factors. A strong linear relationship between the variables can make prediction results inaccurate and decrease the model’s accuracy (Song et al. 2023). In LS mapping studies, the most commonly used indicators for multicollinearity are tolerance (TOL) and variance inflation factor (VIF), which can be calculated using Eqs. 2 and 3 (Kavzoglu et al. 2014; Bai et al. 2015; Arabameri et al. 2020; Wang et al. 2020a, b; Yi et al. 2020; Wei et al. 2022). The VIF is a measure of the increase in the variance of a regression coefficient due to multicollinearity. The TOL is the reciprocal of the VIF value and can also be used to test for multicollinearity between variables (Yu et al. 2023). In Eq. 2, R² is the coefficient of determination obtained when a conditioning factor is regressed on all the other factors, i.e., the proportion of its variance that they explain (Ye et al. 2022). If the VIF value is greater than 10 or the TOL value is less than 0.1, it indicates a multicollinearity problem, and the multicollinear variables should be removed from the susceptibility models.
$$TOL=1-{R}^{2}$$
(2)
$$VIF= \frac{1}{TOL}= \frac{1}{1-{R}^{2}}$$
(3)
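
As a small worked example of Eqs. 2 and 3, the sketch below converts R² values into TOL and VIF and applies the thresholds stated above. The R² values are hypothetical and are not results from this study.

```python
def tolerance(r_squared):
    """TOL = 1 - R^2 (Eq. 2)."""
    return 1.0 - r_squared

def vif(r_squared):
    """VIF = 1 / TOL (Eq. 3)."""
    return 1.0 / tolerance(r_squared)

def is_multicollinear(r_squared, vif_limit=10.0, tol_limit=0.1):
    """Flag a factor when VIF > 10 or TOL < 0.1."""
    return vif(r_squared) > vif_limit or tolerance(r_squared) < tol_limit

# Hypothetical R^2 values for three conditioning factors.
for r2 in (0.20, 0.50, 0.95):
    print(r2, round(tolerance(r2), 3), round(vif(r2), 3), is_multicollinear(r2))
```

With TOL defined as 1 − R², a VIF of 10 and a TOL of 0.1 describe the same boundary, R² = 0.9, so the two criteria flag the same factors.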

Performance assessment metrics

Akinci and Akinci (2023) emphasized that validation is an essential procedure to evaluate the performance of the models. An LSM that has not been validated holds no scientific value. Sahin (2022) stated that different accuracy metrics can be used to evaluate the performance of ML models. In this study, overall accuracy, precision, recall, F1-score, RMSE and area under the receiver operating characteristic curve (AUC-ROC) metrics have been used to validate the results and compare and evaluate the performance of four different ML models. Except for RMSE, the performance assessment metrics given in Table 3 are calculated using the components of the confusion matrix.
Table 3
Performance assessment metrics
| Metric | Equation | Description |
| --- | --- | --- |
| Overall accuracy (OA) | \(OA= \frac{TP+TN}{TP+TN+FP+FN}\) | The ratio of landslides and non-landslides that are correctly classified. This shows how well the landslide model works (Ghasemian et al. 2022) |
| Precision | \(Precision= \frac{TP}{TP+FP}\) | Precision, also called the positive predictive value, is the fraction of relevant instances (TP) amongst the retrieved instances (Azarafza et al. 2021) |
| Recall | \(Recall= \frac{TP}{TP+FN}\) | Recall (or sensitivity) is the measure at which pixels corresponding to landslides are correctly classified as a landslide having occurred. Its value is ideal as long as it is closer to 1 (Bravo-López et al. 2022) |
| F1-score | \(F1\text{-}score=2 \times \frac{Precision \times Recall}{Precision+Recall}\) | F1-score, one of the metrics commonly used in the evaluation of ML models, is actually a harmonic mean of precision and recall (Yu et al. 2023). The value of F1-score ranges from 0 to 1. A model is reliable if the value of F1-score is close to 1 (Ye et al. 2022) |
| AUC-ROC | \(True\;positive\;rate\;(TPR)= \frac{TP}{TP+FN}\); \(False\;positive\;rate\;(FPR)= \frac{FP}{FP+TN}\) | The ROC curve plots the FPR on the X axis and the TPR on the Y axis. It shows the trade‐off between the two rates (Pourghasemi et al. 2012). The area under the ROC curve (AUC) is an indicator to check the prediction performance of the model (Yilmaz 2009) |
| RMSE | \(RMSE= \sqrt{\frac{1}{N} \sum\limits_{i=1}^{N}{({y}_{obs}-{y}_{pred})}^{2}}\) | RMSE is used to measure the prediction errors of models (Nguyen et al. 2019). An RMSE value close to 0 indicates that the ML model has a good performance (Ado et al. 2022) |
In the equations in Table 3, TP (true positive) and TN (true negative) express the number of pixels correctly classified as landslide and non-landslide, respectively; FP (false positive) and FN (false negative) denote the number of pixels misclassified as landslide and non-landslide, respectively (Ye et al. 2022).
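The formulas in Table 3 can be sketched in a few lines of plain Python. This is only an illustrative example on a made-up set of ten pixel labels (1 = landslide, 0 = non-landslide); the function name `classification_metrics` and the label lists are assumptions, and the sketch does not guard against empty classes (division by zero).

```python
import math

def classification_metrics(y_true, y_pred):
    """Compute OA, precision, recall, F1-score and RMSE for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # landslide, predicted landslide
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-landslide, predicted non-landslide
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-landslide, predicted landslide
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # landslide, predicted non-landslide

    oa = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # RMSE: square root of the mean squared difference between
    # observed and predicted values.
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))
    return {"OA": oa, "precision": precision, "recall": recall,
            "F1": f1, "RMSE": rmse}

# Hypothetical labels: TP = 3, FN = 1, FP = 1, TN = 5.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For these labels, OA is 8/10 and precision, recall and F1-score are all 3/4, while RMSE equals the square root of 2/10 because two of the ten pixels are misclassified.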

SHapley additive exPlanation (SHAP)

In applications such as LS mapping, it is crucial to understand why and how a model makes a particular prediction, in addition to how accurate that prediction is. However, the increasing computational power and complexity of machine learning models make it difficult to understand their internal mechanisms, local behaviors and decision-making processes (Zhang et al. 2023). The new generation of AI models, called explainable or interpretable AI (XAI), aims to explain the local behavior of black box models (Youssef et al. 2023). Lundberg and Lee (2017) proposed SHAP (SHapley Additive exPlanation) to explain the predictions of various machine learning models, particularly opaque black box models. The SHAP method quantifies the impact of each feature on the model’s prediction. It does so by assigning a Shapley value to each input feature; the Shapley values of all features sum to the difference between the model’s prediction for a sample and its baseline (average) output. This improves comprehension of how a model generates predictions (Zhang et al. 2023; Teke and Kavzoglu 2023).
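The additive idea behind Shapley values can be illustrated with a brute-force computation on a toy model. The sketch below is not the algorithm used by the SHAP library for tree ensembles, which relies on much faster model-specific methods; the toy `model`, the `instance` and the all-zero `baseline` are hypothetical choices made only for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating all feature subsets.

    value_fn(S) must return the model output when only the features in the
    frozenset S are "present". The cost grows exponentially with
    n_features, so this is usable for small toy problems only.
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight of a coalition of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Toy model with one additive term and one interaction term (hypothetical).
def model(x):
    return 2.0 * x[0] + 3.0 * x[1] * x[2]

instance = [1.0, 1.0, 1.0]   # sample to be explained
baseline = [0.0, 0.0, 0.0]   # reference values for "absent" features

def value_fn(present):
    x = [instance[j] if j in present else baseline[j] for j in range(3)]
    return model(x)

phi = shapley_values(value_fn, 3)
print(phi)
```

Here the first feature receives its full additive effect, the interaction effect is split equally between the other two features, and the three values sum to the difference between `model(instance)` and `model(baseline)`.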

Results

Multicollinearity analysis

Multicollinearity, in its simplest definition, is the presence of high correlation or a linear relationship between independent variables in a regression model. Multicollinearity can cause the results obtained from the model to be inaccurate. Therefore, multicollinearity analysis is applied in LS studies to test whether there is a high correlation between conditioning factors. This analysis has two possible outcomes: i) there is no multicollinearity among the factors, or ii) there is multicollinearity among some factors, and the factors found to be correlated should be removed from the model. Both outcomes are reported in the literature. For example, in the studies conducted by He et al. (2023), Song et al. (2023), Vega et al. (2023) and Yu et al. (2023), it was found that there was no serious multicollinearity problem between conditioning factors.
Table 4 displays the outcomes of the multicollinearity analysis conducted on the conditioning factors employed in this study. The initial findings indicated a significant correlation between slope and TRI. Consequently, TRI was eliminated from the model, and a subsequent multicollinearity analysis was conducted on the remaining 13 factors. The highest VIF value was 3.918890 and the lowest TOL value was 0.255174 (Table 4). Since the highest VIF value was less than 10 and the lowest TOL value was higher than 0.1, there was no multicollinearity problem between the conditioning factors used in this study.
Table 4
Multicollinearity Analysis Results
| Landslide Conditioning Factors | VIF | TOL |
| --- | --- | --- |
| Slope | 1.694949 | 0.589988 |
| Aspect | 1.080109 | 0.925832 |
| TWI | 3.918890 | 0.255174 |
| TPI | 1.483154 | 0.674239 |
| Altitude | 1.410740 | 0.708848 |
| Plan Curvature | 1.301666 | 0.768246 |
| Profile Curvature | 1.140360 | 0.876916 |
| Distance to Drainage | 1.139817 | 0.877334 |
| Distance to Faults | 1.076383 | 0.929037 |
| Distance to Roads | 1.229125 | 0.813587 |
| Lithology | 1.294081 | 0.772749 |
| Land Cover | 1.031502 | 0.969460 |
| Slope Length | 3.059941 | 0.326804 |
In the LS study conducted by Yavuz Ozalp et al. (2023) in Ardeşen and Fındıklı districts of Rize (Turkey), multicollinearity analysis was performed for fifteen conditioning factors. As in this study, the researchers found that there was a high linear correlation between slope and TRI, and TRI was excluded from the susceptibility models. Wang et al. (2022) found collinearity between the factors of slope, elevation variation coefficient, surface cutting depth, relief amplitude, and surface roughness. Since slope is very effective in the occurrence of landslides, the other four factors were eliminated.

Landslide susceptibility mapping

In this study, landslide susceptibility index (LSI) maps of the study area were first produced using the RF, CatBoost, XGBoost and LightGBM algorithms. The algorithms were implemented in the Python programming language using the scikit-learn library. The XGBoost algorithm was implemented with default hyperparameter values (n_estimators = 100, max_depth = 6, eta = 0.3, colsample_bytree = 1, min_child_weight = 1, subsample = 1, gamma = 0). In order not to favour one algorithm over the others and to make an objective comparison, the number of trees was set to 100, the maximum tree depth to 6 and the learning rate to 0.3 for the other algorithms. The LSI shows the degree of susceptibility of each pixel in the study area to landslide occurrence. The higher the LSI value of a pixel, the higher the probability of landslide occurrence in that pixel; the lower the LSI, the lower the probability (Wubalem 2021). After the LSIs were created, they were transferred to ArcGIS software, where they were reclassified into five susceptibility classes: very low, low, medium, high and very high. Thus, LSMs of the study area were obtained (Fig. 6). This classification was achieved by utilizing the natural breaks algorithm (Jenks 1967).
Table 5 presents the areal distribution ratios of susceptibility classes for each ML model. According to the areal distribution of susceptibility classes in Table 5, most of the study area is not susceptible to landslides. The very low and low susceptibility classes together cover 85.89% of the study area in RF, 71.86% in CatBoost, 70.16% in XGBoost and 77.31% in LightGBM. Only 8.74% of the study area in RF, 12.85% in CatBoost, 11.37% in XGBoost and 10.43% in LightGBM is highly or very highly susceptible to landslides.
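The reclassification step can be sketched as follows, taking the class boundaries as given rather than deriving them with the natural breaks algorithm. The break values are the RF class limits listed in Table 5; the function name `reclassify`, the sample LSI values, and the choice to treat upper class limits as inclusive are assumptions of this example, not details stated in the study.

```python
import numpy as np

CLASS_NAMES = ["very low", "low", "moderate", "high", "very high"]

# Upper limits of the first four RF classes as listed in Table 5.
RF_BREAKS = [0.08, 0.267, 0.506, 0.773]

def reclassify(lsi, breaks):
    """Map continuous LSI values to susceptibility class names.

    With right=True a value equal to a break falls into the lower class
    (an assumption; the source does not say which side is inclusive).
    """
    idx = np.digitize(np.asarray(lsi, dtype=float), breaks, right=True)
    return [CLASS_NAMES[i] for i in idx]

# Hypothetical LSI values for five pixels.
sample_lsi = [0.05, 0.08, 0.30, 0.60, 0.90]
classes = reclassify(sample_lsi, RF_BREAKS)
print(classes)
```

In a GIS workflow the same mapping would be applied to every pixel of the LSI raster, with a separate set of breaks for each model.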
Table 5
Comparison of the results of the ML models
| Model | Susceptibility class | Landslide probability | PoA (%) | NLP | PLP (%) | FR |
| --- | --- | --- | --- | --- | --- | --- |
| RF | Very low | 0 – 0.08 | 76.50 | 753 | 0.815 | 0.0107 |
| RF | Low | 0.08 – 0.267 | 9.39 | 2303 | 2.491 | 0.2653 |
| RF | Moderate | 0.267 – 0.506 | 5.37 | 3566 | 3.857 | 0.7182 |
| RF | High | 0.506 – 0.773 | 4.21 | 10,829 | 11.714 | 2.7824 |
| RF | Very high | 0.773 – 1.000 | 4.53 | 74,995 | 81.123 | 17.9079 |
| CatBoost | Very low | -0.816 – -0.057 | 15.95 | 6 | 0.007 | 0.0004 |
| CatBoost | Low | -0.057 – 0.101 | 55.91 | 29 | 0.031 | 0.0006 |
| CatBoost | Moderate | 0.101 – 0.335 | 15.29 | 320 | 0.346 | 0.0226 |
| CatBoost | High | 0.335 – 0.677 | 6.90 | 4309 | 4.661 | 0.6755 |
| CatBoost | Very high | 0.677 – 1.311 | 5.95 | 87,782 | 94.955 | 15.9588 |
| XGBoost | Very low | -1.214 – -0.093 | 8.77 | 127 | 0.138 | 0.0157 |
| XGBoost | Low | -0.093 – 0.059 | 61.39 | 490 | 0.530 | 0.0086 |
| XGBoost | Moderate | 0.059 – 0.288 | 18.47 | 2165 | 2.342 | 0.1268 |
| XGBoost | High | 0.288 – 0.658 | 6.44 | 10,517 | 11.376 | 1.7665 |
| XGBoost | Very high | 0.658 – 1.561 | 4.93 | 79,147 | 85.614 | 17.3659 |
| LightGBM | Very low | -0.811 – -0.063 | 13.64 | 158 | 0.171 | 0.0125 |
| LightGBM | Low | -0.063 – 0.099 | 63.67 | 968 | 1.047 | 0.0164 |
| LightGBM | Moderate | 0.099 – 0.361 | 12.26 | 2060 | 2.228 | 0.1817 |
| LightGBM | High | 0.361 – 0.721 | 5.18 | 7966 | 8.617 | 1.6635 |
| LightGBM | Very high | 0.721 – 1.488 | 5.25 | 81,294 | 87.937 | 16.7499 |
PoA: Percentage of area, NLP: number of landslide pixels, PLP: Percentage of landslide pixels, FR: Frequency ratio
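The tabulated frequency ratios are consistent with the usual definition FR = PLP / PoA, that is, the share of landslide pixels in a class divided by the share of the study area occupied by that class. The short sketch below recomputes the RF rows from the PoA and PLP columns of Table 5; the variable names are assumptions, and small differences from the published FR values are expected because the tabulated percentages are rounded.

```python
# RF rows of Table 5: (class, PoA %, PLP %, published FR)
RF_ROWS = [
    ("Very low",  76.50,  0.815,  0.0107),
    ("Low",        9.39,  2.491,  0.2653),
    ("Moderate",   5.37,  3.857,  0.7182),
    ("High",       4.21, 11.714,  2.7824),
    ("Very high",  4.53, 81.123, 17.9079),
]

def frequency_ratio(plp, poa):
    """Frequency ratio: percentage of landslide pixels / percentage of area."""
    return plp / poa

recomputed = {name: frequency_ratio(plp, poa) for name, poa, plp, _ in RF_ROWS}

for name, poa, plp, published in RF_ROWS:
    print(f"{name:10s} recomputed FR = {recomputed[name]:.4f}  published FR = {published}")

# Share of inventory landslide pixels falling in the two highest classes.
high_share = sum(plp for name, _, plp, _ in RF_ROWS if name in ("High", "Very high"))
print(f"High + very high PLP for RF: {high_share:.3f}%")
```

An FR above 1 means that a class contains proportionally more landslide pixels than its areal share, which is why FR is expected to increase from the very low to the very high class.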

Performance assessment and comparison

In this study, various metrics including overall accuracy (OA), precision, recall, F1-score, RMSE and area under the receiver operating characteristic (ROC) curve (AUC) were employed to compare and evaluate the performance of the LS models. Model assessment metrics were applied for both training and validation stages. In the evaluation made in terms of OA, it was determined that the model with the lowest accuracy value for both training and validation stages was RF (Table 6). Although CatBoost, XGBoost and LightGBM have close accuracy values, LightGBM provided slightly better accuracy than other models in both training and validation stages. LightGBM also outperformed the other models in terms of precision, recall and F1-score. However, it is a known fact that ML models may perform differently depending on both the characteristics of the study area and the conditioning factors used.
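Table 6
Performance of the ML models in training and validation stage
| Stage | Metrics | RF | CatBoost | XGBoost | LightGBM |
| --- | --- | --- | --- | --- | --- |
| Training | Accuracy (%) | 82.556 | 91.948 | 92.885 | 92.992 |
| Training | Precision | 0.82656 | 0.92143 | 0.92917 | 0.93124 |
| Training | Recall | 0.82553 | 0.91942 | 0.92884 | 0.92996 |
| Training | F1-score | 0.82541 | 0.91941 | 0.92883 | 0.92988 |
| Training | RMSE | 0.3572 | 0.2598 | 0.2411 | 0.2437 |
| Validation | Accuracy (%) | 82.338 | 91.831 | 92.641 | 92.765 |
| Validation | Precision | 0.82433 | 0.92024 | 0.92667 | 0.92913 |
| Validation | Recall | 0.82344 | 0.91845 | 0.92643 | 0.92755 |
| Validation | F1-score | 0.82326 | 0.91828 | 0.92640 | 0.92759 |
| Validation | RMSE | 0.3576 | 0.2628 | 0.2469 | 0.2481 |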
One of the commonly used metrics to measure the accuracy of regression models is RMSE (Trinh et al. 2023). RMSE is used to measure the prediction error of the model, and the closer it is to 0, the better the performance of the model (Nguyen et al. 2019; Ado et al. 2022; Trinh et al. 2023). In contrast to the results for precision, recall, and F1-score, where LightGBM ranked first, XGBoost achieved the lowest RMSE values in both the training and validation stages (Table 6).
In this study, the AUC metric, which is the most widely used performance assessment metric in LS mapping studies, was also used. Figure 7 shows the ROC curves and AUC values of the models. When Fig. 7 is examined, it is seen that the XGBoost model has the highest AUC value (0.9773), followed by the LightGBM (0.9751), CatBoost (0.9708) and RF (0.8976) models, respectively. As with other evaluation metrics, RF lagged behind other models in terms of AUC. In the study by Sahin (2022), RF also lagged behind other ensemble learning algorithms such as CatBoost, XGBoost and LightGBM in terms of AUC value.
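How a ROC curve and its AUC are obtained from predicted scores can be sketched with NumPy as follows. The labels and scores are made up for illustration, the function names are assumptions, and the sketch assumes that all scores are distinct (tied scores would need to be grouped before the cumulative sums).

```python
import numpy as np

def roc_curve_points(y_true, scores):
    """Return (FPR, TPR) arrays obtained by lowering the threshold step by step."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # highest score first
    y_sorted = y_true[order]
    tp = np.cumsum(y_sorted == 1)  # true positives at each threshold
    fp = np.cumsum(y_sorted == 0)  # false positives at each threshold
    tpr = np.concatenate([[0.0], tp / tp[-1]])
    fpr = np.concatenate([[0.0], fp / fp[-1]])
    return fpr, tpr

def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Hypothetical validation pixels: 1 = landslide, 0 = non-landslide.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

fpr, tpr = roc_curve_points(y_true, scores)
auc_value = auc_trapezoid(fpr, tpr)
print(round(auc_value, 4))
```

In this toy case eight of the nine landslide/non-landslide pixel pairs are ranked correctly, so the AUC equals 8/9.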

Interpretation of models with SHAP

In LS mapping studies, it has been observed that some factors may have minimum importance in model predictions, while others may be more effective in this process. Yu et al. (2023) emphasized that determining the relative importance of conditioning factors can help to better understand the causes of landslides in a region. The importance values of conditioning factors for RF, CatBoost, XGBoost, and LightGBM are presented in Fig. 8. The importance values of the conditioning factors vary between the different models. However, lithology, altitude, distance to faults, and aspect consistently rank in the top 4, while slope length, TWI, plan curvature, and profile curvature rank in the bottom 4.
The effects of conditioning factors on model outputs are interpreted not only through Fig. 8 but also by using SHAP values. Kavzoglu and Teke (2022) emphasized that, unlike the feature importance function of ML algorithms, SHAP can determine whether the factors contribute positively or negatively to the model outputs. The SHAP summary plot, one of the graphs offered by the SHAP library, is used to visualize the effect of each factor of the model on the prediction. Figures 9 and 10 show the SHAP values of the thirteen conditioning factors for RF, CatBoost, XGBoost and LightGBM. As can be seen in the SHAP summary plots, although the effects of the conditioning factors on the model outputs vary between models, altitude, lithology, aspect, distance to faults and slope were found to be the most effective factors. On the other hand, SHAP values revealed that slope length, TWI, profile curvature and plan curvature were the least effective of the landslide conditioning factors. The results appear to be partially consistent with previous studies. For example, in the LS mapping study carried out by Akinci (2022) in Arhavi, Hopa and Kemalpaşa districts of Artvin, it was determined that TWI, slope length and curvature were the least important factors in the occurrence of landslides. In the study conducted by Akinci and Zeybek (2021), landslide susceptibility maps of Ardanuç district of Artvin province were produced using LR, SVM and RF models. In that study, the researchers determined that lithology, altitude and distance to roads were the most effective factors, while TWI and curvature were the least effective factors. In the study by Can et al. (2021), the predictions of the XGBoost-based landslide susceptibility model were interpreted using the SHAP summary plot. Similar to this study, the researchers determined that lithology and altitude have a greater impact on the prediction results, while profile curvature has the least impact on the prediction results.
In addition, we interpreted the local contribution of the conditioning factors to the model outputs using SHAP waterfall plots (refer to Fig. 11 and Fig. 12). In the waterfall plot, a red arrow indicates that the SHAP value of the factor is greater than zero, meaning that the factor pushes the prediction toward landslide occurrence, while a blue arrow indicates a negative contribution (Chang et al. 2023). Figure 11 shows that the distance to faults in the RF model and aspect in the CatBoost model had the largest negative impact on the occurrence of landslides in the study area. Lithology was also found to be a significant factor in both models. According to Fig. 12, altitude, aspect, distance to faults, distance to drainage and lithology in the XGBoost model (Fig. 12a) and distance to faults, altitude, land cover, aspect and slope in the LightGBM model (Fig. 12b) were determined as the factors with the largest positive contribution to the occurrence of landslides in the study area.

Discussion

Comparison of factors

ML algorithms have different decision-making mechanisms. Therefore, in ML models using the same factors, the contributions of the factors to the model predictions are also different. However, ML models can only indicate the relative importance of factors on prediction results through feature importance functions. This function is insufficient to interpret how the models work when making predictions, which is why ML models are often referred to as “black box” models. To overcome this problem and explain the decision-making mechanism of ML-based landslide susceptibility models, methods such as SHAP should be used (Lu et al. 2023; Pradhan et al. 2023). The interpretability of a model is directly proportional to the ease of understanding the reasoning behind a specific decision. For this reason, we see that the SHAP method is used in studies such as LS mapping (Pradhan et al. 2023; Vega et al. 2023; Zhang et al. 2023) and wildfire susceptibility mapping (Iban and Sekertekin 2022).
When the SHAP summary plots for the XGBoost and LightGBM models are analyzed, it is seen that altitude and lithology are the most effective factors in model outputs (Fig. 10). This suggests that areas with more anthropogenic activities and weak lithological units are prone to landslides. Agricultural areas are often concentrated at specific altitudes and aspects within settlements, leading to increased human activity in these regions. Uncontrolled irrigation and excavation works can trigger landslides in these areas.

Comparison of models

In this study, the prediction performance of the ensemble learning algorithms was evaluated using accuracy metrics (OA, F1-score, precision, recall, RMSE and AUC) commonly used in LS mapping studies. There are many recent studies in the LS mapping literature that use these metrics to assess the accuracy of susceptibility models (Kavzoglu and Teke 2022; He et al. 2023; Sun et al. 2023a, b; Vega et al. 2023; Yu et al. 2023; Zhang et al. 2023). Kavzoglu and Teke (2022) suggest that OA is a reliable metric for measuring model robustness in LS mapping studies. Ye et al. (2022) explain that the F1-score ranges from 0 to 1, and a value close to 1 indicates model reliability. As with OA, the model with the lowest F1-score was again RF (Table 6). The highest F1-score belonged to the LightGBM model, which had a high classification capacity for both the training and validation datasets. In summary, LightGBM was found to be slightly superior to the other models in all metrics except RMSE and AUC, for which XGBoost performed best. In the LS mapping study conducted by Sun et al. (2023b), OA, precision, recall and F1-score metrics were used to evaluate the accuracy of XGBoost and LightGBM models, and as in this study, LightGBM gave better results than XGBoost in all of these metrics. Tree-based ensemble learning algorithms are also widely used in flood and forest fire susceptibility mapping. For example, Saber et al. (2022) used RF, LightGBM and CatBoost models for flash flood susceptibility prediction and determined that LightGBM outperformed the other models in terms of evaluation metrics and processing time.
Another statistical criterion used in the study was AUC, which represents the area under the ROC curve. The AUC value of an ROC curve ranges from 0.5 to 1, and an AUC value close to 1 indicates that the model is excellent (Sahin 2022). According to Chen et al. (2017), Jiao et al. (2019), and Wu et al. (2020), the AUC value is generally classified into 5 categories: poor (0.5–0.6), average (0.6–0.7), good (0.7–0.8), very good (0.8–0.9), and excellent (0.9–1). According to this classification, the performance of RF was very good, while that of CatBoost, LightGBM and XGBoost was excellent. When evaluated in terms of AUC, this study is consistent with other studies in the literature, many of which state that ensemble learning algorithms are more successful than single ML algorithms (Kavzoglu and Teke 2022).
On the other hand, in order to assess the practical usability of an LSM, it is important to consider the accuracy of the ML model as well as the rationality of the generated LSM. To ensure rationality, an LSM needs to be able to accurately classify existing landslides in the study area. In this study, the LSMs generated were compared with the LIM, and the distribution of landslide pixels across the susceptibility classes in the LSMs was determined (Table 5). An LSM is considered rational if the existing landslides in the LIM fall in areas of high susceptibility as much as possible, and the areas of very high susceptibility in the LSM are as small as possible (Guo et al. 2021). In the RF, CatBoost, XGBoost and LightGBM models, the percentages of very highly susceptible areas were 4.53%, 5.95%, 4.93% and 5.25%, and the percentages of landslide pixels falling in these areas were 81.123%, 94.955%, 85.614% and 87.937%, respectively. The corresponding frequency ratios were 17.9079, 15.9588, 17.3659, and 16.7499 (Table 5). The LSMs showed a similar trend in general. As the susceptibility levels increased, the frequency ratios also tended to increase. In all models, the highest number of landslide pixels was observed in the very high susceptibility class. This comparison showed that the landslide occurrence rate gradually increased from very low to very high susceptibility areas and that the LSMs produced were reasonable. However, the LSM produced by the CatBoost model seems to be more rational and reasonable than those of the other models, because in this model approximately 99.6% of the existing landslides fall in high and very high susceptibility areas. In the LS mapping study by Yavuz Ozalp et al. (2023), RF, GBM, XGBoost and CatBoost algorithms were used. The researchers also evaluated the rationality of the LSMs produced by these algorithms. The evaluation results showed that the LSMs produced by CatBoost and XGBoost models are reasonable and rational for the study area.

Limitations and future suggestions

The ML models used in the study were able to predict the probability of landslide occurrence in the study area well. However, the study has an important limitation. ML models need accurate, up-to-date and complete data during the training phase. If the data are not complete or up-to-date, the model may perform poorly because it is not trained correctly. In this context, landslide inventory data are of vital importance in ML-based landslide susceptibility mapping studies. The most important constraint in this study is the currency of the landslide inventory data: the Artvin Provincial Disaster and Emergency Directorate last updated the landslide inventory map of the study area in November 2016. Since landslides that occurred in recent years were not included in the inventory map, the models may have been trained with missing data. This situation was also mentioned in the Artvin Provincial Disaster Risk Reduction Plan, and updating the landslide inventory maps was added to the action plan. Therefore, future studies in the study area should primarily focus on updating landslide inventory maps using field surveys and high-resolution satellite images. In addition, deep learning algorithms such as CNN and RNN can be used in future studies to produce highly accurate landslide susceptibility maps, and the performance of deep learning algorithms can be compared with that of tree-based ensemble learning algorithms.

Conclusion

Artvin is a province prone to landslides due to its highly rugged topography. Landslides constitute a large part of the natural disasters occurring throughout the province. In the provincial disaster risk reduction plan, it is stated that landslides are the priority disaster type in the region and various strategic actions are proposed for risk reduction. Landslides are observed in all districts of Artvin. Landslides in the region cause damage to transport networks, energy transmission and drinking water lines, degradation in agricultural and forest areas, damage to buildings and loss of life. Therefore, landslide susceptibility mapping is of crucial importance to reduce the damages caused by landslides in all districts of Artvin province. In this study, the landslide susceptibility of Murgul district of Artvin province was analyzed. Four different ML models, random forest, XGBoost, LightGBM and CatBoost, were employed to generate LSMs. The models considered multiple factors, including land cover, aspect, slope, lithology, elevation, plan curvature, profile curvature, distance to drainage networks, distance to faults, distance to roads, slope length, topographic position index (TPI), and topographic wetness index (TWI). The LIM provided by the GDMRE and Artvin Provincial Directorate of Disaster and Emergency, which documented 54 landslide polygons, was utilized in the analysis. The resulting LSMs classified the study area into five classes: very low, low, medium, high, and very high, based on the natural breaks (Jenks) classification. The performance of these models was evaluated using overall accuracy, precision, recall, F1-score, RMSE and AUC-ROC. The models that performed the best were XGBoost in terms of RMSE and AUC values, and LightGBM in terms of accuracy and F1-score. The LSMs produced through XGBoost and LightGBM are highly valuable for landslide risk assessment and risk mitigation studies in the study area. The maps produced can guide engineers for disaster and emergency planning, decision makers and urban planners for sustainable land use planning and insurance companies for natural disaster insurance.

Declarations

Conflict of interest

The authors declare no conflict of interest.

Competing interests

The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
Achour Y, Pourghasemi HR (2020) How do machine learning techniques help in increasing accuracy of landslide susceptibility maps? Geosci Front 11:871–883
Aditian A, Kubota T, Shinohara Y (2018) Comparison of GIS-based landslide susceptibility models using frequency ratio, logistic regression, and artificial neural network in a tertiary region of Ambon, Indonesia. Geomorphology 318:101–111
Akgun A, Dag S, Bulut F (2008) Landslide susceptibility mapping for a landslide-prone area (Findikli, NE of Turkey) by likelihood-frequency ratio and weighted linear combination models. Environ Geol 54:1127–1143
Akinci HA, Akinci H (2023) Machine learning based forest fire susceptibility assessment of Manavgat district (Antalya), Turkey. Earth Science Informatics 16(1):397–414
Akinci H, Yavuz Ozalp A (2021) Landslide susceptibility mapping and hazard assessment in Artvin (Turkey) using frequency ratio and modified information value model. Acta Geophys 69:725–745
Akinci H, Zeybek M, Dogan S (2021) Evaluation of landslide susceptibility of Şavşat District of Artvin Province (Turkey) using machine learning techniques. In: Landslides. IntechOpen
Bai SB, Lu P, Wang J (2015) Landslide susceptibility assessment of the Youfang catchment using logistic regression. J Mt Sci 12:816–827
Bravo-López E, Fernández Del Castillo T, Sellers C, Delgado-García J (2022) Landslide susceptibility mapping of landslides with artificial neural networks: Multi-approach analysis of backpropagation algorithm applying the neuralnet package in Cuenca, Ecuador. Remote Sensing 14(14):3495. https://doi.org/10.3390/rs14143495
Cao J, Zhang Z, Du J, Zhang L, Song Y, Sun G (2020) Multi-geohazards susceptibility mapping based on machine learning—A case study in Jiuzhaigou, China. Nat Hazards 102:851–871
Catani F, Lagomarsino D, Segoni S, Tofani V (2013) Landslide susceptibility estimation by random forests technique: sensitivity and scaling issues. Nat Hazards Earth Syst Sci 13(11):2815–2831
Chen W, Xie X, Wang J, Pradhan B, Hong H, Tien Bui D, Duan Z, Ma J (2017) A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. CATENA 151:147–160
Chen W, Zhang S, Li R, Shahabi H (2018) Performance evaluation of the GIS-based data mining techniques of best-first decision tree, random forest, and naïve Bayes tree for landslide susceptibility modeling. Sci Total Environ 644:1006–1018
Chen T, Zhu L, Niu RQ, Trinder CJ, Peng L, Lei T (2020) Mapping landslide susceptibility at the Three Gorges Reservoir, China, using gradient boosting decision tree, random forest and information value models. J Mt Sci 17:670–685
Chen T, Guestrin C (2016) Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 785–794
Dai L, Zhu M, He Z, He Y, Zheng Z, Zhou G (2021) Landslide risk classification based on ensemble machine learning. In: 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, pp 3924–3927
Dalkes M, Korkmaz MS (2023) Comparison of Analytic Hierarchy Process and Frequency Ratio Methods in Landslide Susceptibility Analysis: Example of Akçaabat and Düzköy districts of Trabzon province. Journal of Natural Hazards and Environment 9(1):16–38. https://doi.org/10.21324/dacd.1105000
Das S, Sarkar S, Kanungo DP (2023) A critical review on landslide susceptibility zonation: recent trends, techniques, and practices in Indian Himalaya. Nat Hazards 115:23–72
Duman TY, Çan T (2023) Characteristics of landslides and assessment of deep-seated landslide susceptibility in Northern Turkey. Mediterranean Geoscience Reviews 5:131–157. https://doi.org/10.1007/s42990-023-00105-3
Zurück zum Zitat Erener A, Mutlu A, Düzgün HS (2016) A comparative study for landslide susceptibility mapping using GIS-based multi-criteria decision analysis (MCDA), logistic regression (LR) and association rule mining (ARM). Eng Geol 203:45–55CrossRef Erener A, Mutlu A, Düzgün HS (2016) A comparative study for landslide susceptibility mapping using GIS-based multi-criteria decision analysis (MCDA), logistic regression (LR) and association rule mining (ARM). Eng Geol 203:45–55CrossRef
Zurück zum Zitat Jenks GF (1967) The data model concept in statistical mapping. Int Yearb Cartogr 7:186–190 Jenks GF (1967) The data model concept in statistical mapping. Int Yearb Cartogr 7:186–190
Zurück zum Zitat Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu T-Y (2017) LightGBM: A highly efficient gradient boosting decision tree. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), December 4–9, Long Beach, CA, USA, pp 3149–3157 Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu T-Y (2017) LightGBM: A highly efficient gradient boosting decision tree. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), December 4–9, Long Beach, CA, USA, pp 3149–3157
Zurück zum Zitat Keskin I (2013a) 1:100,000 Scale Geological Map of Turkey, No:178 Artvin-F46 Map Sheet. General Directorate of Mineral Research and Exploration, Geological Research Department, Ankara, Turkey. (in Turkish) Keskin I (2013a) 1:100,000 Scale Geological Map of Turkey, No:178 Artvin-F46 Map Sheet. General Directorate of Mineral Research and Exploration, Geological Research Department, Ankara, Turkey. (in Turkish)
Zurück zum Zitat Keskin I (2013b) 1:100,000 Scale Geological Map of Turkey, No:179 Artvin-E47 and F47 Map Sheet. General Directorate of Mineral Research and Exploration, Geological Research Department, Ankara, Turkey. (in Turkish) Keskin I (2013b) 1:100,000 Scale Geological Map of Turkey, No:179 Artvin-E47 and F47 Map Sheet. General Directorate of Mineral Research and Exploration, Geological Research Department, Ankara, Turkey. (in Turkish)
Zurück zum Zitat Lundberg SM, Lee S-I (2017) A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), December 4–9, Long Beach, CA, USA, pp 4768–4777 Lundberg SM, Lee S-I (2017) A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), December 4–9, Long Beach, CA, USA, pp 4768–4777
Zurück zum Zitat Pham K, Kim D, Le CV, Choi H (2022) Dual tree-boosting framework for estimating warning levels of rainfall-induced landslides. Landslides 19(9):2249–2262CrossRef Pham K, Kim D, Le CV, Choi H (2022) Dual tree-boosting framework for estimating warning levels of rainfall-induced landslides. Landslides 19(9):2249–2262CrossRef
Zurück zum Zitat Pourghasemi HR, Rahmati O (2018) Prediction of the landslide susceptibility: which algorithm, which precision? CATENA 162:177–192CrossRef Pourghasemi HR, Rahmati O (2018) Prediction of the landslide susceptibility: which algorithm, which precision? CATENA 162:177–192CrossRef
Zurück zum Zitat Pradhan B, Lee S (2010) Landslide susceptibility assessment and factor effect analysis: backpropagation artificial neural networks and their comparison with frequency ratio and bivariate logistic regression modelling. Environ Model Softw 25:747–759CrossRef Pradhan B, Lee S (2010) Landslide susceptibility assessment and factor effect analysis: backpropagation artificial neural networks and their comparison with frequency ratio and bivariate logistic regression modelling. Environ Model Softw 25:747–759CrossRef
Zurück zum Zitat Prokhorenkova L, Gusev G, Vorobev A, Dorogush AV, Gulin A (2018) CatBoost: Unbiased boosting with categorical features. In: Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), December 2–8, Montréal, Canada Prokhorenkova L, Gusev G, Vorobev A, Dorogush AV, Gulin A (2018) CatBoost: Unbiased boosting with categorical features. In: Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), December 2–8, Montréal, Canada
Zurück zum Zitat Saber M, Boulmaiz T, Guermoui M, Abdrabo KI, Kantoush SA, Sumi T et al (2022) Examining LightGBM and CatBoost models for wadi flash flood susceptibility prediction. Geocarto Int 37(25):7462–7487ADSCrossRef Saber M, Boulmaiz T, Guermoui M, Abdrabo KI, Kantoush SA, Sumi T et al (2022) Examining LightGBM and CatBoost models for wadi flash flood susceptibility prediction. Geocarto Int 37(25):7462–7487ADSCrossRef
Zurück zum Zitat Sahin EK (2022) Comparative analysis of gradient boosting algorithms for landslide susceptibility mapping. Geocarto Int 37(9):2441–2465ADSCrossRef Sahin EK (2022) Comparative analysis of gradient boosting algorithms for landslide susceptibility mapping. Geocarto Int 37(9):2441–2465ADSCrossRef
Zurück zum Zitat Sifa SF, Mahmud T, Abdullah Tarin M, Enamul Haque DM (2020) Event-based landslide susceptibility mapping using weights of evidence (WoE) and modified frequency ratio (MFR) model: a case study of Rangamati district in Bangladesh. Geology, Ecology, and Landscapes 4(3):222–235CrossRef Sifa SF, Mahmud T, Abdullah Tarin M, Enamul Haque DM (2020) Event-based landslide susceptibility mapping using weights of evidence (WoE) and modified frequency ratio (MFR) model: a case study of Rangamati district in Bangladesh. Geology, Ecology, and Landscapes 4(3):222–235CrossRef
Zurück zum Zitat Sun D, Ding Y, Zhang J, Wen H, Wang Y, Xu J, Zhou X (2022) Liu R (2022) Essential insights into decision mechanism of landslide susceptibility mapping based on different machine learning models. Geocarto Int 10(1080/10106049):2146763 Sun D, Ding Y, Zhang J, Wen H, Wang Y, Xu J, Zhou X (2022) Liu R (2022) Essential insights into decision mechanism of landslide susceptibility mapping based on different machine learning models. Geocarto Int 10(1080/10106049):2146763
Zurück zum Zitat Taalab K, Cheng T, Zhang Y (2018) Mapping landslide susceptibility and types using Random Forest. Big Earth Data 2:159–178CrossRef Taalab K, Cheng T, Zhang Y (2018) Mapping landslide susceptibility and types using Random Forest. Big Earth Data 2:159–178CrossRef
Zurück zum Zitat Teke A, Kavzoglu T (2023) Explainable artificial intelligence empowered landslide susceptibility mapping using Extreme Gradient Boosting (XGBoost). Advanced Engineering Days 6:74–76 Teke A, Kavzoglu T (2023) Explainable artificial intelligence empowered landslide susceptibility mapping using Extreme Gradient Boosting (XGBoost). Advanced Engineering Days 6:74–76
Zurück zum Zitat Varnes DJ (1978) Slope movement types and processes. In: Schuster RL, Krizek RJ (eds) Special Report 176: Landslides: Analysis and Control. TRB, National Research Council, Washington, DC, pp 11–33 Varnes DJ (1978) Slope movement types and processes. In: Schuster RL, Krizek RJ (eds) Special Report 176: Landslides: Analysis and Control. TRB, National Research Council, Washington, DC, pp 11–33
Zurück zum Zitat Ye P, Yu B, Chen W, Liu K, Ye L (2022) Rainfall-induced landslide susceptibility mapping using machine learning algorithms and comparison of their performance in Hilly area of Fujian Province. China Nat Hazards 113:965–995CrossRef Ye P, Yu B, Chen W, Liu K, Ye L (2022) Rainfall-induced landslide susceptibility mapping using machine learning algorithms and comparison of their performance in Hilly area of Fujian Province. China Nat Hazards 113:965–995CrossRef
Zurück zum Zitat Yilmaz I (2009) Landslide susceptibility mapping using frequency ratio, logistic regression, artificial neural networks and their comparison: A case study from Kat landslides (Tokat—Turkey). Comput Geosci 35(6):1125–1138ADSCrossRef Yilmaz I (2009) Landslide susceptibility mapping using frequency ratio, logistic regression, artificial neural networks and their comparison: A case study from Kat landslides (Tokat—Turkey). Comput Geosci 35(6):1125–1138ADSCrossRef
Zurück zum Zitat Youssef AM, Pourghasemi HR (2021) Landslide susceptibility mapping using machine learning algorithms and comparison of their performance at Abha Basin, Asir Region. Saudi Arabia Geosci Front 12(2):639–655CrossRef Youssef AM, Pourghasemi HR (2021) Landslide susceptibility mapping using machine learning algorithms and comparison of their performance at Abha Basin, Asir Region. Saudi Arabia Geosci Front 12(2):639–655CrossRef
Zurück zum Zitat Youssef AM, Pourghasemi HR, Pourtaghi ZS, Al-Katheeri MM (2016) Landslide susceptibility mapping using random forest, boosted regression tree, classification and regression tree, and general linear models and comparison of their performance at Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslides 13:839–856CrossRef Youssef AM, Pourghasemi HR, Pourtaghi ZS, Al-Katheeri MM (2016) Landslide susceptibility mapping using random forest, boosted regression tree, classification and regression tree, and general linear models and comparison of their performance at Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslides 13:839–856CrossRef
Zurück zum Zitat Zhang RH, Wu CZ, Goh ATC, Böhlke T, Zhang WG (2020a) Estimation of diaphragm wall deflections for deep braced excavation in anisotropic clays using ensemble learning. Geosci Front 12(1):365–373CrossRef Zhang RH, Wu CZ, Goh ATC, Böhlke T, Zhang WG (2020a) Estimation of diaphragm wall deflections for deep braced excavation in anisotropic clays using ensemble learning. Geosci Front 12(1):365–373CrossRef
Zurück zum Zitat Zhang T, Fu Q, Wang H, Liu F, Wang H, Han L (2022a) Bagging-based machine learning algorithms for landslide susceptibility modeling. Nat Hazards 110:823–846CrossRef Zhang T, Fu Q, Wang H, Liu F, Wang H, Han L (2022a) Bagging-based machine learning algorithms for landslide susceptibility modeling. Nat Hazards 110:823–846CrossRef
Zurück zum Zitat Zhang W, Wu C, Tang L, Gu X, Wang L (2022b) Efficient time-variant reliability analysis of Bazimen landslide in the Three Gorges Reservoir Area using XGBoost and LightGBM algorithms. Gondwana Res 123:41–53ADSCrossRef Zhang W, Wu C, Tang L, Gu X, Wang L (2022b) Efficient time-variant reliability analysis of Bazimen landslide in the Three Gorges Reservoir Area using XGBoost and LightGBM algorithms. Gondwana Res 123:41–53ADSCrossRef
Metadata
Title
Comparison of tree-based ensemble learning algorithms for landslide susceptibility mapping in Murgul (Artvin), Turkey
Authors
Ziya Usta
Halil Akıncı
Alper Tunga Akın
Publication date
4 March 2024
Publisher
Springer Berlin Heidelberg
Published in
Earth Science Informatics / Issue 2/2024
Print ISSN: 1865-0473
Electronic ISSN: 1865-0481
DOI
https://doi.org/10.1007/s12145-024-01259-w
