Published in: International Journal of Disaster Risk Science 6/2020

Open Access 19.11.2020 | Article

Naïve Bayes Classifier for Debris Flow Disaster Mitigation in Mount Merapi Volcanic Rivers, Indonesia, Using X-band Polarimetric Radar

Written by: Ratih Indri Hapsari, Bima Ahida Indaka Sugna, Dandung Novianto, Rosa Andrie Asmara, Satoru Oishi

Abstract

Debris flow triggered by rainfall that accompanies a volcanic eruption is a serious secondary impact of a volcanic disaster. The probability of debris flow events can be estimated based on the prior information of rainfall from historical and geomorphological data that are presumed to relate to debris flow occurrence. In this study, a debris flow disaster warning system was developed by applying the Naïve Bayes Classifier (NBC). The spatial likelihood of the hazard is evaluated at a small subbasin scale by including high-resolution rainfall measurements from X-band polarimetric weather radar, a topographic factor, and soil type as predictors. The study was conducted in the Gendol River Basin of Mount Merapi, one of the most active volcanoes in Indonesia. Rainfall and debris flow occurrence data were collected for the upper Gendol River from October 2016 to February 2018 and divided into calibration and validation datasets. The NBC was used to estimate the status of debris flow incidences displayed in the susceptibility map that is based on the posterior probability from the predictors. The system verification was performed by quantitative dichotomous quality indices along with a contingency table. Using the validation datasets, the advantage of the NBC for estimating debris flow occurrence is confirmed. This work contributes to existing knowledge on estimating debris flow susceptibility through the data mining approach. Despite the existence of predictive uncertainty, the presented system could contribute to the improvement of debris flow countermeasures in volcanic regions.

1 Introduction

Volcanic eruptions cause many direct and indirect hazards. Water flow caused by high precipitation on a volcano flank may be transformed into a debris flow. The rapid flow of volcanic material and water mixtures triggered by rainfall is a serious indirect hazard of volcanic eruptions. Debris flow that accompanied the 1991 Mount Pinatubo Philippine eruption killed 1500 people within 2 years. In Indonesia, the Mount Merapi eruption in 2010 caused 240 debris flow cases within 2 years. Debris flows occurred in almost all rivers, transported a large volume of material up to 15 km (Bélizal et al. 2013), and damaged 51 houses (Solikhin et al. 2015b).
Numerous studies have stated that high rainfall with a long duration is significantly associated with debris flow (Rodolfo and Arguden 1991; Lavigne and Thouret 2003; Bélizal et al. 2013). However, the characteristics of debris flow in natural rivers can be very complex, combining several relevant factors besides rainfall. Material mobilization may be caused by slope instability, which is related to soil structure (Wilford et al. 2004). In the case of fine-grained material, earth material can be mobilized in the subaerial flow that forms a debris flow (Varnes 1978). There is a risk of debris flow when intense rain occurs on a steep slope with a large amount of deposit material (Takahashi 2007). Particularly for debris flow in volcanic rivers, deposit material thickness and properties such as soil suction, pore-water pressure, and shear resistance create complex circumstances for debris flow occurrence (Berti et al. 2012; Mead and Magill 2017).
Debris flow vulnerability assessment is imperative to identify the susceptibility of the drainage basin and develop mitigation priorities. Some studies on debris behavior on Merapi Volcano were aimed at developing a mitigation system (Bélizal et al. 2013; Solikhin et al. 2015b). Modeling of debris transport and the integration of high-resolution rainfall observations with a lahar model were conducted by Syarifuddin et al. (2017). Fine spatial rainfall measurement from X-band polarimetric weather (X-MP) radar in volcanic rivers was considered a useful extension to the observation of small-scale rainfall in the typically narrow watersheds in which debris flow is triggered. Event prediction and warning play an important role in disaster mitigation by reducing damage. However, none of these studies examined the prediction of debris flow susceptibility by introducing past rainfall information.
In the aftermath of volcanic eruption, timely and accurate prediction is a challenge to reliable decision making. More accurate predictive information can be obtained when the probability of past events is taken into account in estimating the next event likelihood. A previous study by Hapsari et al. (2017) used the ensemble radar-rainfall short-term prediction model in Merapi Rivers to develop a debris flow hazard map but did not include past information. However, a major problem with this kind of application is a lack of debris movement records at Merapi after a serious debris flow disaster from 2010 to 2011, including its characteristics. A reasonable approach to tackling this issue could be to apply simple probabilistic forecasting that could perform well when data are scarce.
The Bayesian approach is a probabilistic, simple classification algorithm that predicts the likelihood of upcoming events based on prior knowledge. There have been studies involving the Bayesian method in, for example, vulnerability assessments of water quantity and quality issues (Arthur et al. 2007; Pagano et al. 2014). The studies to date have tended to focus on susceptibility mapping rather than on the prediction of an event. The Naïve Bayesian Classifier (NBC) algorithm offers an effective way of providing a probabilistic prediction system in many practical applications because this method is simple (Rish 2001), requires only small training datasets for calibration, and is more intuitive and thus does not require complex knowledge of the incident feature, compared to other similar techniques such as neural networks and support vector machines. Some studies have produced weather prediction systems using this algorithm (Liu et al. 2015; Barde and Patole 2016), but no study integrates various relevant attributes in predicting debris flow occurrence.
This study focuses on measuring the performance of the NBC algorithm in identifying and predicting the regions vulnerable to debris flow in the Merapi area, aiming to develop a debris flow disaster warning system with a small amount of past information. The river catchment profile and rainfall (derived from X-MP radar) are regarded as the independent variables that lead to debris transport, that is, working rainfall, hourly rainfall, soil type, and slope angle parameters. Section 2 outlines the NBC algorithm and the development of the prediction model, as well as the data used for building the model and for validation. Section 3 explains the testing of the model and its performance, and discusses the NBC model improvement and its application in a practical scheme.

2 Methodology

This section provides an outline of the research methodology used to address the research questions including a description of study site, NBC framework, and the performance evaluation technique. Secondary data collection, radar rainfall data pre-processing, and debris flow susceptibility factors are also explained in this section.

2.1 Study Site

The Merapi Volcano is an active stratovolcano located in the center of Java Island, Indonesia (7°32′24′′S, 110°27′00′′E), rising to 2930 m above sea level (Fig. 1). The volcano is geographically shared by Central Java Province and Yogyakarta Special Province. The Merapi Volcano is historically the most active volcano in Indonesia, with small eruptions every 2–3 years and a greater eruption every 10–15 years. The Merapi flank is a densely populated area, where 70,000 people live in the third danger zone, the most hazardous zone according to the Decree of Central Java Governor number 6/2018 concerning the Merapi contingency plan. Destructive eruptions occurred in 1872, 1953, 2006, and 2010. After the October 2010 eruption, 130 million m³ of material were ejected as lava and tephra material. The pyroclastic flow on the southern flanks caused high damage within a 22 km range. As many as 120 people living at 12 km distance from the summit were killed (Jenkins et al. 2013). The eruption caused mudflows in the subsequent rainy season. At the beginning of 2011, 60% of the material accumulation at the top was transported downstream by the rivers. The Gendol River downstream was highly impacted by debris flow on 1 May 2011. The paddy field and a village were destroyed by the flow. The debris flow buried 40,000 m² of crops and damaged 51 houses (Bélizal et al. 2013). On average, the annual rainfall amount on the Merapi slope is 2600 to 3000 mm. Debris flow is promoted by the steepness of 100–16% slope at the altitude range between 1000 m and 2930 m.

2.2 Debris Flow Cases

Between 2015 and 2018, at least nine debris flow events were documented. The affected rivers were the Pabelan, Krasak, Putih, Gendol, and Blongkeng Rivers. The Gendol River is located on the southern flank of the Merapi and was chosen as the focus area of study because, as stated by the Research and Development Center of Geological Disaster Technology (BPPTKG) Yogyakarta, this river has the most streams vulnerable to debris flow besides the Pabelan River. In October 2010, a debris flow disaster on the Gendol River extended 20 km downslope and buried 21 houses. The Gendol River is the main branch of the Opak River. Solikhin et al. (2015a) investigated this area because this river system was subject to numerous debris flow occurrences in 2011 due to great pyroclastic deposits in the Opak River.
Figure 1 shows the Gendol watershed delineation, where the water flows downstream from the Merapi summit through a network of tributaries and transports the debris into the Opak River. The upper catchment of the Gendol River is the area reached by the highly concentrated flow. The delineated basin encompasses a 5.1 km² drainage area with a 4.28 km length. As on a typical stratovolcano, the river basin is characterized by a narrow watershed and a steep-walled valley. This basin shape leads to flash floods that promote debris transport along the river. The debris flow events in the Gendol River that are available as observable events occurred on 17 February 2016 and 20 December 2017. In the 2016 case, the Gendol River experienced high rain of up to 23.0 mm/h and a high concentration of sediment flow. In the 2017 case, 18.7 mm/h rainfall triggered earth material flow in the river.

2.3 Rainfall Observation by X-MP Radar

Since 2015, the X-band polarimetric weather radar (X-MP radar) has been installed on Mount Merapi (7°37′12′′S, 110°25′12′′E). The X-MP radar is an appealing instrument for observing rainfall over a large area—in the Merapi area within a 30 km radius in fine resolutions—compared to the available rain gauges. The radar provides rainfall measurements with a 150 m spatial resolution and a 2 min temporal resolution, which is advantageous for observing small-scale events. The polarimetric feature offers various advantages, including its capability to distinguish scattered particles, to overcome the rain attenuation, and to minimize the effect of uncertain particle drop size distribution.
Rainfall intensity data from the X-MP radar are acquired in the constant altitude plan position indicator (CAPPI) format at the lowest elevation angle. The radar-rainfall algorithm applied by the Merapi X-MP radar is the composite algorithm proposed by Park et al. (2005), which uses horizontal reflectivity and specific differential phase as indicated below:
$$R\left( Z_{H} \right) = 7.07 \times 10^{-3} Z_{H}^{0.819} \quad {\text{for }} Z_{H} \le 30\,{\text{dBZ or }} K_{DP} \le 0.3^\circ\,{\text{km}}^{-1}$$
(1)
and
$$R\left( {K_{DP} } \right) = 19.63K_{DP}^{0.823} \;{\text{otherwise,}}$$
(2)
where R is rainfall intensity (mm/h), Z_H is horizontal reflectivity (mm⁶ m⁻³), and K_DP is the specific differential phase (° km⁻¹).
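As a rough illustration of how the composite relation in Eqs. 1 and 2 switches between the two estimators, a minimal Python sketch is given here. The function name `rain_rate` and the conversion of reflectivity from dBZ to linear units before applying Eq. 1 are assumptions made for this illustration; the operational processing of the Merapi radar may differ.

```python
def rain_rate(zh_dbz: float, kdp: float) -> float:
    """Rainfall intensity (mm/h) from the composite relation of Eqs. 1-2.

    zh_dbz: horizontal reflectivity in dBZ (assumed input unit)
    kdp:    specific differential phase in degrees per km
    """
    if zh_dbz <= 30.0 or kdp <= 0.3:
        # Eq. 1: reflectivity-based estimate; Z_H converted to mm^6 m^-3
        zh_linear = 10.0 ** (zh_dbz / 10.0)
        return 7.07e-3 * zh_linear ** 0.819
    # Eq. 2: differential-phase-based estimate, used otherwise
    return 19.63 * kdp ** 0.823


# Hypothetical inputs, chosen only to exercise both branches
weak_echo = rain_rate(25.0, 0.1)    # Eq. 1 branch
strong_echo = rain_rate(45.0, 1.5)  # Eq. 2 branch
print(weak_echo, strong_echo)
```

The switch follows the condition as stated with Eq. 1, so the reflectivity relation is used whenever either threshold is not exceeded.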

2.4 Debris Flow Susceptibility Factors

Earth material movement in natural rivers arises from a complex interaction between hydrological and basin physical factors. A data mining technique for the prediction of an event occurrence is applied to the datasets based on these risk factors. The risk factors are rainfall from historical data, topsoil characteristics, and slope angle.

2.4.1 Triggering Rainfall

Most studies on debris flow disasters have emphasized the use of a rainfall threshold to judge the possible occurrence of debris flow (Shieh et al. 2009; Brunetti et al. 2010). Hourly rainfall intensity and working rainfall parameters are broadly used as criteria to judge debris flow initiation. Working rainfall is rainfall preceding an event, calculated by accumulating rainfall in the 7 days prior to the hourly rainfall (Lavigne et al. 2000). This parameter plays a significant role in debris flow occurrence because the increase of pore-water pressure due to accumulated rainwater promotes soil instability. In this study, X-MP radar rainfall observations in the rainy season of 2016 and 2017 are included in the rainfall database.
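The working rainfall definition above can be sketched in a few lines of Python. This is an illustration only: it assumes an hourly rainfall series, and it assumes that the hour being evaluated is excluded from the 7-day sum, which the text does not state explicitly. The function and parameter names are hypothetical.

```python
from typing import Sequence


def working_rainfall(hourly_mm: Sequence[float], index: int,
                     window_hours: int = 7 * 24) -> float:
    """Accumulated rainfall (mm) in the window preceding hourly_mm[index].

    hourly_mm:    hourly rainfall depths in mm (assumed regular series)
    index:        position of the hour being evaluated
    window_hours: antecedent window, 7 days by default
    """
    start = max(0, index - window_hours)  # truncate at the series start
    return float(sum(hourly_mm[start:index]))


# Hypothetical series: a constant 1 mm every hour for 200 hours
series = [1.0] * 200
print(working_rainfall(series, 199))  # sums the 168 preceding hours
```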
Previous research by the authors investigated the rainfall threshold that is likely to trigger debris flow in the Gendol River. The threshold was developed by separating debris flow and non-debris flow events from hourly and 7-day working rainfall obtained through X-MP radar observations during 2015–2019 (Hapsari et al. 2019). There is a critical line that distinguishes a safe zone and an unsafe zone, and a vertical line that represents the standard rainfall to issue a warning. Figure 2 indicates the critical line for debris flow emergency judgment.

2.4.2 Terrain Characteristics

Slope steepness information is constructed from the Digital Elevation Model from the Shuttle Radar Topography Mission (SRTM) with a 30 m spatial resolution (Fig. 3a). In the upper Gendol River, the terrain slope angle ranges from 1.83° to 32.9°. The terrain data are resampled to be the same resolution as the base rainfall spatial data, at 0.15 km × 0.15 km. The level of slope susceptibility to failure for different slope angle classes is adapted from Niu et al. (2014), who assessed the susceptibility of slope failures. Rank 1 is assigned to the slope of 0°–3°; rank 2 is assigned to the slope of 3°–6°; rank 3 is assigned to the slope of 6°–10°; rank 4 is assigned to the slope of 10°–15°; and rank 5 is assigned to the slope of more than 15°.
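The ranking rule described above can be written as a small lookup. The following Python sketch is an illustration; the function name is hypothetical, and assigning a boundary value (for example exactly 3°) to the lower rank is an assumption, since the text does not say which class owns the class limits.

```python
def slope_rank(angle_deg: float) -> int:
    """Susceptibility rank (1-5) for a terrain slope angle in degrees."""
    if angle_deg < 0:
        raise ValueError("slope angle must be non-negative")
    upper_bounds = [3.0, 6.0, 10.0, 15.0]  # upper limits of ranks 1 to 4
    for rank, bound in enumerate(upper_bounds, start=1):
        if angle_deg <= bound:  # assumption: upper limit is inclusive
            return rank
    return 5  # slopes steeper than 15 degrees


# The extremes of the slope range reported for the upper Gendol River
print(slope_rank(1.83), slope_rank(32.9))
```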

2.4.3 Slope Stability

The structure of the deposited material has been identified as a contributing factor of debris flow mobilization. However, difficulties arise because of lack of data on the variability of deposited soil properties. It is believed that debris flow formation is closely related to the topsoil type. The soil data are drawn from the Harmonized World Soil Database from the UNESCO Digital Soil Map of the World with a 30 arc-second raster resolution. Figure 3b shows the soil map of the area. Andosols and Arenosols are the dominant soil groups in the Gendol watershed. The texture of the topsoil is classified as loam and loamy sand. The loam soil texture is composed of 42% sand, 39% silt, and 19% clay; the loamy sand soil texture is composed of 83% sand, 11% silt, and 6% clay. According to Zhao and Zhang (2014), saturated silt and fine sand are unstable in the undrained shearing condition. It can, therefore, be assumed that the Gendol River is vulnerable to debris movement.

2.5 The Naïve Bayes Classifier Model

Bayes’ theorem gives the probability of an event based on the prior information of a condition related to the event (Bayes 1763). The posterior probability of an event is obtained by updating prior knowledge with new data once the evidence has been considered. The following equation represents the basic statement of the Bayesian probability:
$$\text{Posterior} = \frac{\text{Likelihood} \cdot \text{Prior}}{\text{Evidence}}.$$
(3)
The Naïve Bayes approach is a simple probabilistic classification method that calculates the likelihood by summarizing the frequency and combination of values in the given dataset. It follows from Bayes’ theorem under the assumption that all attributes are conditionally independent of each other given the class. The basic formula of the NBC is:
$$P\left( {class |data} \right) = \frac{{P\left( {data|class} \right) \cdot P\left( {class} \right)}}{{P\left( {data} \right)}},$$
(4)
where P(class|data) is a posteriori probability or the probability of a class given an event after seeing the event; P(data|class) is a likelihood or probability of an event such that the event belongs to a particular class; P(class) is a priori probability or past event occurrence probability; and P(data) is the probability of that event in the whole dataset (typically omitted). In the following section, the development of the NBC and its adaptation for debris flow prediction based on the observable attributes is described.
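For a two-class problem such as debris flow versus no-debris flow, the reason that P(data) is typically omitted can be seen by writing the evidence as a sum over the classes. This is a standard identity added here for clarity; the labels $c_1$ and $c_2$ are shorthand introduced for this note only:

$$P\left( data \right) = P\left( data|c_{1} \right) \cdot P\left( c_{1} \right) + P\left( data|c_{2} \right) \cdot P\left( c_{2} \right).$$

Because this denominator is identical for both classes, comparing the two posteriors reduces to comparing the numerators $P(data|c_j) \cdot P(c_j)$, and dividing each numerator by their sum recovers the normalized posterior.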

2.5.1 Calibration Datasets

The datasets of debris flow occurrences and the attributes are divided for calibration and validation purposes. Rainfall intensity, working rainfall, topographical slope, and soil type are involved as mutually independent variables in the model calibration process. As the debris flow may occur in a small localized area, the predictive analytics are developed spatially on a grid basis. The study area consists of 167 grids of 0.15 km × 0.15 km. The rainfall data are divided into two categories—triggering and non-triggering rainfall. In this study, the time interval of radar rainfall observation used in the analysis is 30 min. For one debris flow case, three consecutive periods of triggering rainfall data are included. This is because the duration of the debris flows was about 1.5 h according to the historical record. Therefore, the number of data points for one debris flow case is 501 (167 × 3).
In the calibration dataset, there were one triggering and 237 non-triggering rainfall events during the rainy season of 2015–2016. For non-triggering rain, one single rain event is defined as the maximum hourly rainfall in a series of rain. Therefore, the number of data points is 39,579 (167 × 237). Table 1 presents the description of the calibration and validation datasets that involve temporally and spatially varying data. Validation is conducted with test datasets from the December 2017 debris flow case and the October 2016 no-debris flow cases consisting of 1002 data points. The number of data points for each slope and soil type classes is temporally constant.
Table 1
Data for calibration, validation, and categorization of debris flow in the upper Gendol River Basin, Indonesia

| Data | Calibration | Validation |
|---|---|---|
| Event | | |
| Debris flow occurrence | 40,080 | 1002 |
| Debris flow | 501 (Feb 2016 case) | 501 (Dec 2017 case) |
| No-debris flow | 39,579 | 501 (Oct 2016 case) |
| Predictors | | |
| Rainfall data | 40,080 | 1002 |
| Slope data | 167 | |
| Slope: average | 14.79° | |
| Slope: range | 2.46°–32.94° | |
| Slope: rank 5 (0°–3°) | 1 | |
| Slope: rank 4 (3°–6°) | 17 | |
| Slope: rank 3 (6°–10°) | 31 | |
| Slope: rank 2 (10°–15°) | 41 | |
| Slope: rank 1 (> 15°) | 77 | |
| Soil type data | 167 | |
| Soil type: Andosols | 19 | |
| Soil type: Arenosols | 148 | |

2.5.2 The Naïve Bayes Classifier Algorithm for Developing a Debris Flow Disaster Warning System

Prior and conditional probabilities of debris flow occurrences were estimated using the calibration data (Table 1) to obtain the posterior probabilities. Based on the assumption of independence of the variables, the likelihood of the event belonging to class_j is:
$$P\left( {data |class_{j} } \right) = P\left( {data_{1} |class_{j} } \right) \times P\left( {data_{2} |class_{j} } \right) \times \ldots \times P(data_{k} |class_{j} )$$
(5)
The frequency ratio dataset based on the data categorization is constructed by first calculating the number of occurrence and non-occurrence events for each predictor class. The probability of an event such that the event belongs to a class is then calculated. After this, the likelihood P(data|class) is obtained by multiplying the probabilities for the three predictors (Eq. 5). After obtaining the prior probability from the ratio of debris flow cases to all cases, the posterior probability can be determined by using Eq. 4.
In a classification algorithm, the next phase after model calibration is validation (also called the testing phase). In this phase, another unseen dataset (see validation data in Table 1) is introduced to the model. The NBC attempts to predict the label of an individual example based on the attribute-label relationship learned from the calibration datasets and the corresponding class. The category label of debris flow status is dichotomous, that is, debris flow either occurs or does not occur, as determined from the posterior probability. Following Eqs. 4 and 5, and omitting the evidence term, the posterior probability of class_j given a new event data′ satisfies:
$$P\left( {class_{j} |data^{\prime}} \right) \propto P\left( {class_{j} } \right) \times P\left( {X^{\prime}_{1} |class_{j} } \right) \times P\left( {X^{\prime}_{2} |class_{j} } \right) \times \ldots \times P\left( {X^{\prime}_{k} |class_{j} } \right).$$
(6)
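The calibration and prediction steps described above (frequency counts, class-conditional probabilities, and a posterior built from Eqs. 4 and 5) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the class labels, and the six toy samples are hypothetical, and no smoothing is applied to attribute values that were never seen for a class.

```python
from collections import Counter, defaultdict


def fit(samples, labels):
    """Estimate priors and per-attribute conditional probabilities by counting."""
    class_counts = Counter(labels)
    priors = {c: n / len(labels) for c, n in class_counts.items()}
    counts = defaultdict(lambda: defaultdict(Counter))
    for x, c in zip(samples, labels):
        for k, value in enumerate(x):
            counts[c][k][value] += 1
    # cond[c][k][v] = P(attribute k has value v | class c)
    cond = {
        c: {k: {v: n / class_counts[c] for v, n in ctr.items()}
            for k, ctr in attrs.items()}
        for c, attrs in counts.items()
    }
    return priors, cond


def posterior(x, priors, cond):
    """Normalized posterior for each class: prior times product of likelihoods."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for k, value in enumerate(x):
            score *= cond[c][k].get(value, 0.0)  # unseen value gives zero
        scores[c] = score
    evidence = sum(scores.values())
    if evidence == 0.0:
        return scores  # no class is supported by the calibration data
    return {c: s / evidence for c, s in scores.items()}


# Hypothetical toy data: (rainfall alert status, slope rank, soil type)
samples = [
    ("emergency", 5, "Arenosols"), ("emergency", 4, "Arenosols"),
    ("warning", 5, "Arenosols"), ("safe", 5, "Arenosols"),
    ("safe", 2, "Andosols"), ("warning", 3, "Arenosols"),
]
labels = ["flow", "flow", "no_flow", "no_flow", "no_flow", "no_flow"]

priors, cond = fit(samples, labels)
post = posterior(("emergency", 5, "Arenosols"), priors, cond)
predicted = max(post, key=post.get)  # class with the highest posterior
print(predicted, post)
```

With so few samples, a single unseen attribute value drives a class posterior to zero, which is one reason a larger and more balanced calibration set matters for this kind of model.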
In this procedure, the prediction outcome was obtained and presented in the form of a spatial distribution. The event predicted by the NBC is then compared with the actual occurrence/non-occurrence as recorded in the debris flow event inventory. The ability of the model to learn the data and make a prediction was assessed through accuracy measures. “Success” by the model means that the event was assigned to the correct category (occurrence or non-occurrence). In addition, three widely used quantitative indices were applied to compare the prediction and observation: the critical success index (CSI), the probability of detection (POD), and the false alarm rate (FAR), which are derived from a contingency table (Roebber 2009):
$$CSI = \frac{{N_{hit} }}{{N_{hit} + N_{miss} + N_{false} }},$$
(7)
$$POD = \frac{{N_{hit} }}{{N_{hit} + N_{miss} }}\;{\text{and}}$$
(8)
$$FAR = \frac{{N_{false} }}{{N_{hit} + N_{false} }},$$
(9)
where N_hit is the number of hit events (the model predicts debris flow, and debris flow is present); N_miss is the number of miss events (debris flow is present, but the model does not predict debris flow); and N_false is the number of false events (the model predicts debris flow, but there is no debris flow). For CSI and POD, 1 represents a perfect forecast, whereas for FAR, a perfect forecast is represented by 0. After satisfactory performance is obtained, the debris flow prediction model and the disaster warning system based on the NBC approach are developed. Figure 4 illustrates the procedure of the debris flow warning system development using the NBC approach.
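Equations 7 to 9 translate directly into code. The Python sketch below is illustrative: the function names are hypothetical, the counts in the example are invented and are not results from this study, and returning NaN for an empty denominator is a choice made here rather than something the text prescribes.

```python
import math


def _ratio(numerator: float, denominator: float) -> float:
    """Safe division: NaN when the index is undefined (empty denominator)."""
    return numerator / denominator if denominator else math.nan


def skill_scores(n_hit: int, n_miss: int, n_false: int):
    """Return (CSI, POD, FAR) from contingency-table counts (Eqs. 7-9)."""
    csi = _ratio(n_hit, n_hit + n_miss + n_false)
    pod = _ratio(n_hit, n_hit + n_miss)
    far = _ratio(n_false, n_hit + n_false)
    return csi, pod, far


# Hypothetical counts: 3 hits, 1 miss, 1 false alarm
csi, pod, far = skill_scores(3, 1, 1)
print(csi, pod, far)
```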

3 Results and Discussion

This section describes the analysis and evaluation of the NBC approach for debris flow warning. The model calibration and validation are presented first, followed by a discussion of the performance and implications of the model. To corroborate the model results, an image of the catchment is used to illustrate its potential to initiate debris flows.

3.1 Building the Model

To train the model in the calibration stage, the original datasets were converted into frequency ratio datasets based on the categorization. The results of the predictor classification are presented in Table 1. From the 167 grids, the number of grids with slopes (expressed as the tangent of the slope angle) of 0–0.0524, 0.0524–0.1051, 0.1051–0.1763, 0.1763–0.2679, and more than 0.2679 are 1, 17, 31, 41, and 77, respectively; these limits correspond to slope angles of 3°, 6°, 10°, and 15°. Figure 3c presents the classification of the slope angle according to the designated rule. Regarding the soil type, the area is composed of 19 grids of Andosols and 148 grids of Arenosols.
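The slope class limits quoted above can be checked against the angle classes of Section 2.4.2 by taking the tangent of each limit. A short Python check, with variable names chosen for this illustration:

```python
import math

# Class limits in degrees, as used for the slope ranks
bounds_deg = [3, 6, 10, 15]

# Tangent of each limit, rounded to four decimals as in the text
bounds_tan = [round(math.tan(math.radians(d)), 4) for d in bounds_deg]
print(bounds_tan)  # [0.0524, 0.1051, 0.1763, 0.2679]
```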
The rainfall for seven days before specific hourly rainfall is accumulated to classify the rainfall-based debris flow critical line (Fig. 2). The average and range of hourly rainfall from calibration datasets are 0.702 mm/h and 0.000–81.500 mm/h, respectively. The average and range of working rainfall are 97.160 mm and 0.005–205.920 mm, respectively. The number of data points with the alert status of safe, warning, and emergency in the calibration datasets are 20,045, 20,019, and 16, respectively (Table 2).
Table 2
Description of rainfall calibration data in the upper Gendol River Basin, Indonesia

| Description | Value |
|---|---|
| Average of hourly rainfall | 0.702 mm/h |
| Range of hourly rainfall | 0.000–81.500 mm/h |
| Average of working rainfall | 97.160 mm |
| Range of working rainfall | 0.005–205.920 mm |
| Number of safe data points | 20,045 |
| Number of warning data points | 20,019 |
| Number of emergency data points | 16 |
The next step of the NBC algorithm was to analyze the likelihood P(data|class) of occurrence or non-occurrence for each predictor class. With 501 debris flow cases out of 40,080 total cases, the prior probabilities P(class), that is, the ratios of each class to total events, are 0.0125 and 0.9875 for occurrence and non-occurrence, respectively. Through this procedure, a model that generalizes how the debris flow attributes relate to the disaster occurrence status has been obtained.
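The data point counts and prior probabilities quoted above follow from the grid and event numbers given in Section 2.5.1. A brief Python check of the arithmetic, with variable names chosen for this illustration:

```python
n_grids = 167            # grids of 0.15 km x 0.15 km in the study area
n_flow_periods = 3       # consecutive 30-min periods per debris flow case
n_no_flow_events = 237   # non-triggering rainfall events in calibration

n_flow = n_grids * n_flow_periods       # debris flow data points
n_no_flow = n_grids * n_no_flow_events  # no-debris flow data points
n_total = n_flow + n_no_flow

prior_flow = n_flow / n_total
prior_no_flow = n_no_flow / n_total
print(n_flow, n_no_flow, n_total, prior_flow, prior_no_flow)
```

The strong imbalance between the two priors is visible directly in these numbers.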

3.2 Model Testing for Validation

After the conditional probability of debris flow occurrence had been calculated considering the event attribute classification from the validation datasets, the prediction outcome was obtained as the class with the highest probability. In this section, some examples of validation results illustrated in spatial distribution are given. Figure 5 depicts the rainfall map presenting hourly rainfall on 16 February 2016, 14:00 local standard time (LST); 17 February 2016, 16:30 LST; 16 December 2017, 13:00 LST; and 20 December 2017, 16:30 LST. The working rainfall for these cases is shown in Fig. 6. The event on 16 February 2016 (Figs. 5a, 6a) was a case with strong hourly rainfall intensity but low working rainfall. On 17 February 2016 (Figs. 5b, 6b), the upper Gendol River Basin was subject to high rainfall intensity and high working rainfall.
The experimental results of testing the NBC using this case study are illustrated in debris flow hazard maps (Fig. 7). The Gendol River indicated by blue lines is overlaid with the maps to allow for a more detailed qualitative assessment. Figure 7a illustrates the results for the simulation on 16 February 2016, 14:00, which is identified as a no-debris flow case throughout the area. The result matches with the debris flow disaster inventory reports that debris flow was not observed in the study area. When the rainfall of 17 February 2016, 16:30 is introduced as the model input, the result as presented in Fig. 7b demonstrates debris flow that occurred in the downstream area. This finding conforms with the debris flow database report that there was a strong debris flow that caused two trucks to be trapped by debris in the Gendol River at 16:30.
Another example is the model testing using validation datasets on 16 December 2017, 13:00 and 20 December 2017, 16:30. On 16 December 2017 at 13:00, which was marked as a no-debris flow case, the hourly rainfall intensity was 6.24 mm/h, and the working rainfall was 120.33 mm. On 20 December 2017, 16:30, when a serious debris flow was recorded, the hourly rainfall intensity was 40.484 mm/h, and the working rainfall was 149.883 mm. Figure 7d shows that the hazard map from the NBC on 20 December 2017, 16:30 indicates debris flow occurrence throughout the basin. This result matches those observed in the upper Gendol River Basin, where the debris mobilization occurred at the Gendol River that caused heavy equipment to be buried by sediment at 17:00. Low rainfall indices on 16 December 2017, 13:00 (Figs. 5c, 6c) tend to provide similar results to those in Fig. 7a, where the NBC model simulates the non-occurrence status (Fig. 7c).
The formation of debris flows differs from landslides. Landslides occur on the slope due to slope instability. Debris flow occurs in the river channel, initiated by material mobilization due to water flow (Takahashi 2007). The issue in this study is that each grid is treated independently as a different sample in the calibration phase as applied in landslide vulnerability mapping (Berti et al. 2012; Bui et al. 2012). The Merapi debris flow events were mainly determined from media and resident reports because there was no sediment measurement on the rivers during the observation period. As a result, the timing and the subcatchment location where debris flows occurred are not well identified. Also, in the experimental setting, among the three predictors, rainfall is the only parameter that varies temporally as the slopes and the soil types are constant. Consequently, the model is very sensitive to the changes in the rainfall parameters. For example, the model estimated the debris flow occurrence in the downstream area in the 17 February 2016, 16:30 case (Fig. 7b) because this area had somewhat high working rainfall (Fig. 6b). In contrast, debris flow did not occur on 16 February 2016, 14:00 (Fig. 7a) because the working rainfall was low (Fig. 6a), although the rainfall intensity was high (Fig. 5a). In the 20 December 2017, 16:30 case, debris flow was estimated to occur throughout the basin (Fig. 7d) because the whole area had high working rainfall (Fig. 6d).
To illustrate the vulnerability of the catchment more clearly, high-resolution satellite imagery of the upper Gendol River recently retrieved from Pléiades imagery with a 0.5 m grid size is presented in Fig. 8. The image indicates that the mountain top is now almost entirely bare of vegetation. The topsoil of upper Mount Merapi is generally formed from pyroclastic material from the last eruption. Arenosols dominate the soil formation of the upper Gendol Basin, though Andosols are more common at the volcano peak. Some of the areas of the river channel with high material deposits are indicated by the white-framed areas. Almost all river reaches on the upper Gendol have high remaining volcanic mud. Generally, the entire basin is prone to debris flows given the soil and land cover characteristics and the material deposits, as illustrated in Fig. 8.

3.3 Performance of the Naïve Bayes Classifier and its Application to Debris Flow Risk Estimation

In the validation stage, comparing the hit and the null from the contingency table with total cases gives 82.28% model accuracy. Further quantitative skill assessment of the model indicates CSI, POD, and FAR of 0.56, 0.56, and 0.00, respectively. Overall, the NBC performs well in predicting the debris flow occurrence. The CSI and POD indicate somewhat low performance. A possible explanation is that the model predicts the vulnerable area unevenly based on variation of all three predictors, whereas the exact location and the extent of flooded areas are unknown because of limited information. A perfect skill is obtained in terms of FAR since all cases without debris flow are simulated as non-occurrence by the model. Altogether, the encouraging performance of the new NBC is further supported by the satisfying accuracy.
This experiment is based on a debris flow database since 2015 that contains only two debris flow events, which is insufficient for the model to learn the features. The evidence presented in this article suggests that a longer calibration dataset could improve the prediction capability of the NBC approach. In the NBC, a calibration dataset that is imbalanced between occurrence and non-occurrence can bias the construction of the model, as stated by Lane et al. (2012). Examples of NBC applications for water-related hazards are the studies by Bui et al. (2012), which included 118 occurrence cases, and Liu et al. (2015), which used 1000 sampling points. A further study with more debris flow cases from all basins in the Merapi area is therefore suggested.
Future studies on the current NBC application could also include the temporal and spatial dynamics of soil moisture from remote sensing data (Liu et al. 2015) because there is a strong relationship between soil water content and active debris flow (Chorowicz et al. 1997; Capra et al. 2010; Pham et al. 2016). By using the NBC, we expect to uncover hidden patterns in the existing geomorphological and hydrological data that lead to debris flow and that complicate its estimation.

4 Conclusion

This study set out to evaluate the predictability of a debris flow disaster warning system by applying the Naïve Bayes Classifier (NBC) in the upper Gendol River, located on the flank of the Merapi Volcano in Indonesia. The investigation indicated that the NBC is a simple data mining technique that provides acceptable results and accuracy in deriving a debris flow susceptibility map. The qualitative instant skill assessment through visual comparison found that on some occasions the “occurrence” grids in the hazard map did not exactly match the observed data. Overall, however, the NBC performed quite well in predicting debris flow occurrence, as indicated by CSI, POD, and FAR of 0.56, 0.56, and 0.00, respectively. The findings of this study are subject to at least three limitations: the lack of a representative sample; the lack of fine-resolution temporal and spatial debris flow observations; and the lack of prior information on soil moisture. A future investigation should use all basins in the Merapi area as calibration data and consider watershed morphometrics in order to develop a robust model. A system based on such risk factors could issue warnings and suggestions regarding the probable occurrence of debris flows and help to reduce the negative impact of a debris flow disaster.

Acknowledgements

This research was supported by the Science and Technology Research Partnership for Sustainable Development (SATREPS), Japan Science and Technology Agency (JST), and the Japan International Cooperation Agency (JICA). The authors thank the Hydraulic Laboratory of Universitas Gadjah Mada (UGM) for providing rainfall data for radar validation.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
Arthur, J.D., H.A.R. Wood, A.E. Baker, J.R. Cichon, and G.L. Raines. 2007. Development and implementation of a Bayesian-based aquifer vulnerability assessment in Florida. Natural Resources Research 16(2): 93–107.
Barde, N.C., and M. Patole. 2016. Classification and forecasting of weather using ANN, k-NN and Naïve Bayes Algorithms. International Journal of Science and Research 5(2): 1740–1742.
Bayes, T. 1763. An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society 53: 370–418.
Bélizal, E.D., F. Lavigne, D.S. Hadmoko, J.P. Degeai, G.A. Dipayana, B.W. Mutaqin, M.S. Marfai, M. Coquet, et al. 2013. Rain-triggered lahars following the 2010 eruption of Merapi volcano, Indonesia: A major risk. Journal of Volcanology and Geothermal Research 261: 330–347.
Berti, M., M.L.V. Martina, S. Franceschini, S. Pignone, A. Simoni, and M. Pizziolo. 2012. Probabilistic rainfall thresholds for landslide occurrence using a Bayesian approach. Journal of Geophysical Research 117: Article F04006.
Brunetti, M.T., S. Peruccacci, M. Rossi, S. Luciani, D. Valigi, and F. Guzzetti. 2010. Rainfall thresholds for the possible occurrence of landslides in Italy. Natural Hazards and Earth System Sciences 10: 447–458.
Bui, D.T., B. Pradhan, O. Lofman, and I. Revhaug. 2012. Landslide susceptibility assessment in Vietnam using support vector machines, decision tree, and Naïve Bayes models. Mathematical Problems in Engineering 2012: Article 974638.
Capra, L., L. Borselli, N. Varley, J.C.G. Ruiz, G. Norini, D. Sarocchi, L. Caballero, and A. Cortes. 2010. Rainfall-triggered debris at Volcán de Colima, Mexico: Surface hydro-repellency as initiation process. Journal of Volcanology and Geothermal Research 189: 105–117.
Chorowicz, J., E. Lopez, F. Garcia, J.-F. Parrot, J.-P. Rudant, and R. Vinluan. 1997. Keys to analyze active lahars from Pinatubo on SAR ERS imagery. Remote Sensing of Environment 62(1): 20–29.
Hapsari, R.I., S. Oishi, M. Syarifuddin, and R.A. Asmara. 2017. Uncertainty in debris flow hazard estimation using X-MP radar rainfall forecast. In Proceedings of the 37th IAHR World Congress, 13–18 August 2017, Kuala Lumpur, Malaysia, 1–10.
Hapsari, R.I., S. Oishi, M. Syarifuddin, R.A. Asmara, and D. Legono. 2019. X-MP Radar for developing a lahar rainfall threshold for the Merapi Volcano using a Bayesian approach. Journal of Disaster Research 14(5): 811–828.
Jenkins, S., J.-C. Komorowski, P.J. Baxter, R. Spence, A. Picquout, F. Lavigne, and Surono. 2013. The Merapi 2010 eruption: An interdisciplinary impact assessment methodology for studying pyroclastic density current dynamics. Journal of Volcanology and Geothermal Research 261: 316–329.
Lane, P.C.L., D. Clarke, and P. Hender. 2012. On developing robust models for favourability analysis: Model choice, feature sets and imbalanced data. In Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis, ACL-HLT 2011, 24 June 2011, Portland, Oregon, USA, 44–52.
Lavigne, F., and J.C. Thouret. 2003. Sediment transportation and deposition by rain-triggered lahars at Merapi Volcano, Central Java, Indonesia. Geomorphology 49: 45–69.
Lavigne, F., J.C. Thouret, B. Voight, H. Suwa, and A. Sumaryono. 2000. Lahars at Merapi Volcano, Central Java: An overview. Journal of Volcanology 100(1–4): 423–456.
Liu, R., Y. Chen, J. Wu, L. Gao, D. Barrett, T. Xu, L. Li, C. Huang, et al. 2015. Assessing spatial likelihood of flooding hazard using Naive Bayes and GIS: A case study in Bowen Basin, Australia. Stochastic Environmental Research and Risk Assessment 30: 1575–1590.
Mead, S.R., and C.R. Magill. 2017. Probabilistic hazard modelling of rain-triggered debris. Journal of Applied Volcanology 6(8): 1–7.
Niu, F., J. Luo, Z. Lin, M. Liu, and G. Yin. 2014. Thaw-induced slope failures and susceptibility mapping in permafrost regions of the Qinghai–Tibet Engineering Corridor, China. Bulletin of Volcanology 59: 460–480.
Pagano, A., R. Giordano, I. Portoghese, U. Fratino, and M. Vurro. 2014. A Bayesian vulnerability assessment tool for drinking water mains under extreme events. Natural Hazards 74(3): 2193–2227.
Park, S.G., M. Maki, K. Iwanami, V.N. Bringi, and V. Chandrasekar. 2005. Correction of radar reflectivity and differential reflectivity for rain attenuation at X Band Part II: Evaluation and application. Journal of Atmospheric and Oceanic Technology 22(11): 1633–1655.
Pham, B.T., I. Prakash, D.T. Bui, and M.B. Dholakia. 2016. Evaluation of predictive ability of support vector machines and naive Bayes trees methods for spatial prediction of landslides in Uttarakhand state (India) using GIS. Journal of Geomatics 10(1): 71–79.
Rish, I. 2001. An empirical study of the Naïve Bayes Classifier. In Proceedings of IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence 3: 41–46.
Rodolfo, K.S., and A.T. Arguden. 1991. Rain-lahar generation and sediment delivery-systems at Mayon Volcano Philippines. In Sedimentation in volcanic settings, ed. R.V. Fisher, and G.A. Smith, 71–87. Tulsa, OK: SEPM (Society for Sedimentary Geology) (SEPM Special Publication 45).
Roebber, P.J. 2009. Visualizing multiple measures of forecast quality. Weather and Forecasting 24(2): 601–608.
Shieh, C.L., Y.S. Chen, T.J. Tsai, and J.H. Wu. 2009. Variability in rainfall threshold for debris flow after the Chi-Chi earthquake in central Taiwan, China. International Journal of Sediment Research 24(2): 177–188.
Solikhin, A., J.-C. Thouret, A. Gupta, D.S. Sayudi, J.-F. Oehler, and S.C. Liew. 2015a. Effects and behavior of pyroclastic and debris deposits of the 2010 Merapi Eruption based on high-resolution optical imagery. Procedia Earth and Planetary Science 12: 1–10.
Solikhin, A., J.-C. Thouret, S.C. Liew, A. Gupta, D.S. Sayudi, J.-F. Oehler, and Z. Kassouk. 2015b. High-spatial-resolution imagery helps map deposits of the large (VEI 4) 2010 Merapi Volcano eruption and their impact. Bulletin of Volcanology 77(3): 1–42.
Syarifuddin, M., S. Oishi, D. Legono, R.I. Hapsari, and M. Iguchi. 2017. Integrating X-MP radar data to estimate rainfall induced debris flow in the Merapi volcanic area. Advances in Water Resources 110: 249–262.
Takahashi, T. 2007. Debris flow: Mechanics, prediction and countermeasures. London: Taylor and Francis.
Wilford, D.J., M.E. Sakals, J.L. Innes, R.C. Sidle, and W.A. Bergerud. 2004. Recognition of debris flow, debris flood and flood hazard through watershed morphometrics. Landslides 1(1): 61–66.
Varnes, D.J. 1978. Slope movement type and processes. In Landslides: Analysis and control, ed. R.L. Schuster, and R.J. Krizek, 11–33. Washington, DC: National Research Council.
Zhao, H.F., and L.M. Zhang. 2014. Instability of saturated and unsaturated coarse granular soils. Journal of Geotechnical and Geoenvironmental Engineering 140(1): 25–35.
Metadata
Title
Naïve Bayes Classifier for Debris Flow Disaster Mitigation in Mount Merapi Volcanic Rivers, Indonesia, Using X-band Polarimetric Radar
Authors
Ratih Indri Hapsari
Bima Ahida Indaka Sugna
Dandung Novianto
Rosa Andrie Asmara
Satoru Oishi
Publication date
19 November 2020
Publisher
Beijing Normal University Press
Published in
International Journal of Disaster Risk Science / Issue 6/2020
Print ISSN: 2095-0055
Electronic ISSN: 2192-6395
DOI
https://doi.org/10.1007/s13753-020-00321-7
