Published in: Neural Computing and Applications 9/2024

Open Access 14-12-2023 | Original Article

Handling uncertainty issue in software defect prediction utilizing a hybrid of ANFIS and turbulent flow of water optimization algorithm

Authors: M. A. Elsabagh, O. E. Emam, M. G. Gafar, T. Medhat



Abstract

During the development cycle of software projects, numerous defects and challenges are identified, leading to prolonged project durations and escalated costs. As a result, both product delivery and defect tracking have become increasingly complex, expensive, and time-consuming. Since identifying every software defect is infeasible, it is crucial to foresee potential consequences and strive to produce high-quality products. The goal of software defect prediction (SDP) is to identify problematic locations within software code. This study presents the first experimental investigation utilizing the turbulent flow of water optimization (TFWO) in conjunction with the adaptive neuro-fuzzy inference system (ANFIS) to enhance SDP. The TFWO_ANFIS model is designed to address the uncertainties present in software features and predict defects with feasible accuracy. At the start of modeling, the data are divided randomly into training and testing sets to avoid local optima and over-fitting. The TFWO approach adjusts the ANFIS parameters during the SDP process. The proposed model, TFWO_ANFIS, outperforms other optimization algorithms commonly used in SDP, such as particle swarm optimization (PSO), gray wolf optimization (GWO), differential evolution (DE), ant colony optimization (ACO), standard ANFIS, and the genetic algorithm (GA). This superiority is demonstrated through various evaluation metrics on four datasets, including standard deviation (SD) scores (0.3307, 0.2885, 0.3205, and 0.2929), mean square error (MSE) scores (0.1091, 0.0770, 0.1026, and 0.0850), root-mean-square error (RMSE) scores (0.3303, 0.2776, 0.3203, and 0.2926), mean bias error (MBE) scores (0.1281, 0.0860, 0.0931, and 0.2310), and accuracy scores (87.3%, 90.2%, 85.8%, and 89.2%), respectively, for the datasets KC2, PC3, KC1, and PC4. These datasets, with different numbers of instances and features, are obtained from an open platform called OPENML.
Additionally, multiple evaluation metrics such as precision, sensitivity, confusion matrices, and specificity are employed to assess the model’s performance.
Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Defects are among the most significant problems in software development, and forecasting them is a difficult process. The presence of a bug or defect increases the likelihood that a project will fail and may result in a drop in project quality as well as an increase in time and cost. Consequently, finding these problems early in the software development life cycle (SDLC) reduces both the time and financial costs of the project as a whole. Defect prediction therefore plays an essential role in the development and testing phases and contributes to the success of the entire project. Defects should be anticipated at the beginning of the SDLC. For this reason, a variety of SDP models have been developed to help professionals locate the modules that are initially identified as defective [1, 2]. To meet user goals in a constrained amount of time, software engineering requires excellent quality and stability. Quality assurance teams can efficiently allocate their limited resources using SDP models to inspect and test software products [3, 4].
Initially, software businesses relied on manual testing, which consumed 27% of the project’s time and could not address all software defects. Typically, these businesses lack the resources and time to resolve every issue before product release, resulting in harm to their reputation and product value. SDP models provide a solution, allowing businesses to prioritize critical issues and allocate resources efficiently to the most defect-prone code [5].
Machine learning (ML) is one of the promising methods that are having a big impact on prediction. ML is concerned with the creation of algorithms that can recognize patterns in known data to create models and then use those models to predict outcomes from unknown data. This is especially true when combined with data mining methods [6, 7]. As a result, deep learning (DL) and ML approaches have been widely used in SDP to enhance its performance.
Various methods, including support vector machine (SVM) [8], bagging [9], Naïve Bayes (NB) [10], boosting [11], C4.5 [12], random forest (RF) [13], artificial neural network (ANN) [14], and K-nearest neighbor (KNN) [15], have been used in SDP. Although these individual nonlinear machine learning algorithms outperform conventional models in SDP, they have issues with handling uncertainty accurately and with over-fitting and parameter optimization [7]. As a result, composite algorithms have been developed to improve prediction accuracy and address the shortcomings of single models [16–18]. Moreover, meta-heuristic algorithms have been used in SDP to enhance prediction accuracy because of their ability to reduce the complexity of real-life problems, find the best solution, and search globally [7]. Each individual in the population represents a candidate solution, and these algorithms are more popular than other traditional approaches currently in use because of their efficiency [19, 20]. According to the no free lunch (NFL) theorem [21], no single meta-heuristic method can solve all optimization problems; a specific meta-heuristic algorithm may produce good results in some situations but perform poorly in others.
In the context of SDP, addressing uncertainty is crucial. ANFIS, a form of soft computation, combines ANN capabilities with fuzzy inference processes. ANFIS offers strong adaptation abilities and a rapid, precise learning process [22, 23]. However, a significant challenge in real-world applications is training ANFIS parameters. Researchers prioritize adjusting these parameters for improved precision and accuracy. Various training techniques have emerged, typically categorized as probabilistic and deterministic methods.
Least square estimator (LSE) and gradient descent (GD) [24–26] are two examples of deterministic methods; they are slow and occasionally fail to converge. Moreover, because conventional ANFIS learning systems employ the GD algorithm, in which the chain rule is applied to compute the gradient at each step, training frequently becomes trapped in local optima. In contrast, this paper employs a novel optimization technique based on TFWO to optimize the ANFIS parameters. This technique is inspired by the random and natural behavior of vortices in oceans, rivers, and seas [27].
The contributions of the study include the enhanced handling of uncertainty with greater accuracy in SDP through the proposed TFWO_ANFIS model. This model leverages the advantages of TFWO for adapting the ANFIS model’s parameters: the ANFIS training process uses the TFWO technique as a method for parameter adaptation. The adaptive parameters are located in the fuzzification and defuzzification layers (the premise and consequent parameters). Four datasets were used with various evaluation criteria, such as RMSE, MSE, SD, and accuracy, to assess the effectiveness of the proposed TFWO algorithm for adapting ANFIS parameters. TFWO_ANFIS outperformed standard ANFIS and all other compared optimization techniques [28], such as GA, DE [29], ACO, PSO [30], and GWO [31–33].
Given the rapid utilization of ML and artificial intelligence (AI)-based software-intensive systems in semi-autonomous automobiles, recommendation systems, and various real-world applications, there are concerns about the outcomes of their use, especially when these systems have the potential to affect the environment or people, as in the case of self-driving cars or the medical field. In such situations, addressing these uncertainties is crucial [34]. The developed model is used to predict defects in software with higher accuracy under uncertainty. The outcomes show that the recommended model TFWO_ANFIS outperformed the alternative optimization techniques in terms of the ANFIS’s training and testing error rates.
This research highlights the presence of uncertainty in software features, which leads to adverse outcomes in SDP, including low product quality, increased defects during the SDLC, and extended delivery time and costs. To address this issue, the capabilities of an ANN are combined with a fuzzy inference system, known as ANFIS. The research proposes an enhanced variant of ANFIS trained with the turbulent flow of water optimization (TFWO) algorithm, which increases ANFIS’s overall optimization performance. The proposed upgrade focuses on training ANFIS parameters with a novel optimization technique, as opposed to LSE and GD, which are time-consuming, prone to a large number of local optima, and sometimes fail to converge. The TFWO_ANFIS model aims to better manage software metric uncertainty and predict defects with higher accuracy. Improving software performance, meeting customer needs in a short period of time, and assisting quality control teams in effectively allocating their limited assets during software system evaluation are the motivating factors behind handling uncertainty in SDP and obtaining higher accuracy with the suggested model.
The following are the benefits of treating uncertainty in SDP:
1.
Models become more dependable when uncertainty is considered during software development. Additionally, appropriate software model validation helps reduce uncertainty in later phases of development.
 
2.
Applying software uncertainty modeling can improve decision-making during the development process.
 
The major contribution of this research can be summarized as follows:
(1)
Four datasets from NASA named KC2, PC3, KC1, and PC4 are utilized with different instances and features. They are obtained from an open platform called OPENML.
 
(2)
Proposing a novel model for predicting defects in software with higher accuracy in uncertain environments.
 
(3)
Utilizing the TFWO algorithm for adapting ANFIS’s parameter optimization rather than traditional algorithms.
 
(4)
Comparing the suggested TFWO_ANFIS with conventional ANFIS, ACO_ANFIS, DE_ANFIS, PSO_ANFIS, GWO_ANFIS, and GA_ANFIS.
 
(5)
Evaluating the suggested TFWO_ANFIS using relevant evaluation metrics in SDP such as SD, MSE, RMSE, MBE, and accuracy.
 
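To make the evaluation criteria in item (5) concrete, the following Python sketch computes SD, MSE, RMSE, MBE, and accuracy from a list of actual labels and continuous model outputs. The toy values and the 0.5 decision threshold are illustrative assumptions, not the paper's data.

```python
import math

def evaluation_metrics(actual, predicted, threshold=0.5):
    """Compute the paper's error metrics: MSE, RMSE, MBE, SD of errors, accuracy."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mbe = sum(errors) / n                      # mean bias error keeps the sign
    sd = math.sqrt(sum((e - mbe) ** 2 for e in errors) / n)  # spread of the errors
    # threshold the continuous output to a 0/1 defect label for accuracy
    correct = sum((p >= threshold) == (a >= threshold)
                  for a, p in zip(actual, predicted))
    return {"MSE": mse, "RMSE": rmse, "MBE": mbe, "SD": sd,
            "accuracy": correct / n}

# illustrative labels and model outputs only
m = evaluation_metrics([0, 1, 1, 0], [0.1, 0.9, 0.4, 0.2])
```

Note that MBE is signed, so a model that over- and under-predicts symmetrically can have a small MBE despite a large RMSE; this is why the paper reports both.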
The rest of this paper is structured as follows: Sect. 2 presents the related works on software defect prediction, the optimization process of ANFIS, and uncertainty analysis. Section 3 shows methods and materials. Section 4 presents the results and discussion. Finally, Sect. 5 presents conclusions and future work.
2 Related work

The related literature is organized into three subsections to cover the essential topics in this research and present the latest findings in each field. First, software defect prediction, the process of identifying and rectifying flaws. In the realm of embedded software development, this task is particularly time-consuming and expensive due to complex infrastructure, large scale, time constraints, and cost considerations; measuring and achieving quality becomes a significant challenge, especially in automated systems. Second, the optimization process of ANFIS: the ANFIS model offers the advantage of integrating linguistic and numerical expertise and harnesses the data categorization and pattern recognition capabilities of artificial neural networks (ANN). The ANFIS architecture is less prone to memorization problems and is clearer to the user than the ANN. As a result, ANFIS has a number of benefits, such as adaptability, nonlinearity, and quick learning [35, 36]. Third, uncertainty analysis in SDP: uncertainty, especially in software features, is handled in this research by adapting the parameters of the ANFIS architecture. The related work of each of these subsections is therefore reviewed in detail separately.

2.1 Software defect prediction (SDP)

Software testing is a crucial phase in the software development life cycle, as it identifies defects in the system and ensures that the software passes input test cases. Testing is not only time-consuming but also costly. While some automated technologies can help reduce testing effort, their high maintenance costs often contribute to increased expenses. Early software defect prediction greatly decreases work and budget without compromising constraints. It highlights the modules that are more prone to defects and need more thorough testing. The difficulties of dimensionality reduction and class imbalance in SDP demand a realistic and efficient defect prediction technique. Recently, ML has become a potent method for making decisions in this area [37]. SDP primarily relies on prediction models to anticipate software defects. Although various strategies and algorithms have been employed to enhance performance, the fundamental processes of SDP are illustrated in Fig. 1 [38]:
(1) Accumulate clean and flawed code sample data from software systems; (2) collect characteristics to create a dataset; (3) rebalance the source data if it is imbalanced; (4) train an SDP model on a set of data; (5) forecast the flawed parts of a dataset obtained from new software; and (6) assess the accuracy of the SDP model. This process involves iterations.
The process begins with gathering samples of both clean and flawed codes, as shown in Fig. 1. There are numerous formats in which software data are available, including commit messages, source codes, defect files, and other software artifacts. Typically, these data are taken from repositories and archives.
The feature extraction (collecting characteristics) phase of SDP is the next stage. Software artifacts, source codes, messages, and commit logs, among others, are transformed into metrics at this phase and utilized as input data for training models. The feature extraction stage depends heavily on the type of input data, which can include McCabe metrics [39], Chidamber and Kemerer (CK) metrics [40], modification histories, assembly code, and source code. A number of DL algorithms today offer automatic feature extraction from more complicated, high-dimensional data in addition to metric-based data. Defect data from well-known open defect repositories, such as the NASA [41] and PROMISE [42] databases, have been used in studies in the literature.
The next stage is usually optional. Since defect datasets often include far fewer faulty parts than non-faulty ones, this phase entails balancing the data. This class imbalance issue affects the majority of SDP approaches, as it distorts the various metrics used to assess SDP performance [43]. The problem can be resolved, and SDP performance improved, by a number of methods, such as oversampling.
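As a minimal sketch of the oversampling idea just mentioned, the following hypothetical Python function randomly duplicates minority-class samples until both classes are the same size; SMOTE-style interpolation of synthetic samples is a common, more sophisticated refinement.

```python
import random

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until both classes are equally sized.

    Assumes binary 0/1 labels. The duplicates are drawn uniformly at random
    from the minority class, then the whole set is shuffled."""
    rng = random.Random(seed)
    pos = [(s, l) for s, l in zip(samples, labels) if l == 1]
    neg = [(s, l) for s, l in zip(samples, labels) if l == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = pos + neg + extra
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [l for _, l in balanced]

# 1 defective vs 4 clean modules -> balanced to 4 vs 4
X, y = random_oversample([[1], [2], [3], [4], [5]], [1, 0, 0, 0, 0])
```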
The fourth phase in the SDP process involves finding defective software components. Identifying suitable DL techniques, which can encompass various topologies such as convolutional neural networks and ML types, whether supervised or not, is a key consideration at this stage. Additionally, it is crucial to determine the granularity of the defective sections to be identified, which may range from file and module levels to function, class, or even phrase levels.
The following phase involves utilizing the trained model from the previous stage to forecast the flawed portions of new (test) data. The final phase of the SDP steps uses the prediction made here as its input.
The final stage of the SDP process involves evaluating the created model. Two commonly used metrics for assessing the SDP model are the area under the curve and F-measure. These metrics are employed when evaluating prediction models and making comparisons with other relevant studies.
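The six SDP stages described above can be sketched end to end. This minimal Python example uses a synthetic one-metric dataset and a simple threshold learner as a stand-in for a real model such as ANFIS; all names, seeds, and values are illustrative assumptions.

```python
import random

def make_dataset(n=200, seed=42):
    """Steps 1-2: collect module samples and derive one code metric each.
    A module is labeled defective with high probability when its (hypothetical
    McCabe-style) complexity metric is large."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        complexity = rng.uniform(0, 10)
        defective = 1 if complexity > 6 and rng.random() < 0.9 else 0
        data.append((complexity, defective))
    return data

def train(train_set):
    """Step 4: fit the model; here, pick the cut-off minimizing training error."""
    best_t, best_err = 0.0, float("inf")
    for t in [x * 0.5 for x in range(21)]:
        err = sum(1 for c, y in train_set if (c > t) != (y == 1))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

data = make_dataset()
random.Random(0).shuffle(data)                       # step 3: shuffle the data
split = int(0.7 * len(data))
train_set, test_set = data[:split], data[split:]     # train/test split
t = train(train_set)
# steps 5-6: predict defect-proneness of unseen modules and score accuracy
acc = sum(1 for c, y in test_set if (c > t) == (y == 1)) / len(test_set)
```

The threshold learner is deliberately trivial; the point is the shape of the pipeline, in which any classifier (including the ANFIS variants compared later) can occupy the training step.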
Tang et al. [44] applied a swarm intelligence optimization technique to supply the model’s ideal parameters in an effort to enhance SDP. This study suggested an adaptive variable sparrow search algorithm (AVSSA) based on different logarithmic spirals and variable hyper-parameters. AVSSA was evaluated on eight benchmark functions and obtained positive results.
Elsabagh et al. [5, 45] suggested an innovative classifier based on the spotted hyena optimizer algorithm (SHO) to anticipate defects in both single and cross-projects. SHO acts as a classifier by identifying the most suitable rules among populations. To find the optimal classification criteria, confidence and support are used as a multi-objective fitness function. These classification criteria are applied to other projects with incomplete data or new projects to forecast faults. Four software datasets from NASA were used for experiments.
Kakkar et al. [46] proposed a novel approach that relies on ANFIS optimized by PSO. For improved performance, the PSOANFIS method integrates the adaptability of the ANFIS model with PSO’s capability for optimization. A dataset from open-source Java projects of various sizes is used to test the presented model. The PSOANFIS-based SDP model provides software engineers with the number of defects as an output, which engineers can then use to allocate their limited resources, such as time and labor, more effectively. The PSOANFIS findings were excellent, and it can also be inferred that the size of the projects may have an impact on how well the PSOANFIS-based SDP model performs.
In response to the class imbalance issue, Somya Goyal [15] proposed the novel neighborhood under-sampling (N-US) approach. This work aims to demonstrate the effectiveness of the N-US approach in accurately predicting damaged modules. N-US samples the dataset to enhance the visibility of minority data points while minimizing the removal of majority data points to avoid information loss.
Nasser et al. [47] offered robust-tuned-KNN (RT-KNN), an ML method for SDP based on the K-nearest neighbors classifier. Their work was summarized as follows: (1) adjusting KNN and determining the ideal value for k in both the testing and training stages that may produce accurate prediction outcomes. (2) Rescaling the many independent inputs using the robust scalar.
Lei Qiao et al. [48] proposed a new strategy that makes use of DL methods to forecast the occurrence of defects. First, they refine an openly accessible dataset by performing data normalization and log transformation. Next, they undertake data modeling to build the input for the DL method. Third, they feed the generated data to a specially designed deep neural network-based algorithm that forecasts the number of faults. Table 1 presents a comparative study of SDP, illustrating the contributions of the most common literature and the future possibilities for improving the SDP field.

2.2 Optimization process of ANFIS

ANFIS offers all the advantages of fuzzy systems and neural networks. However, when ANFIS is used for real-world applications, one of the major issues is learning its parameters. The problem of ANFIS learning has been addressed in numerous prior studies using methods based on various algorithms, including PSO, GWO, and GA.
Hasanipanah et al. [54] proposed a contemporary method for predicting rock fragmentation using the PSO method for parameter optimization in conjunction with ANFIS learning. Their model has shown efficacy when compared to SVM and multiple regression (MR) techniques.
Lin et al. [55] developed a method for learning ANFIS parameters based on PSO. The system concentrated on applying quantum-behaved PSO (QPSO) to set the parameters of ANFIS. While the premise parameters were adjusted using the QPSO algorithm, the LSE was used to determine the consequent parameters.
Rahnama et al. [56] utilized ANFIS fuzzy c-means, ANFIS subtractive clustering, ANFIS grid partitioning, and radial basis function (RBF) to anticipate the sodium adsorption rate of different areas in Iran. Also, Asadollahfardi et al. [57] used the GA algorithm to detect the optimal combination for optimizing the tracking stations of water quality. Asadollahfardi et al. [58] applied three models: fuzzy regression analysis, ANFIS, and RBF to predict the reactor efficiency of eliminating acid red 14.
In areas with only rainfall gauges, Aghelpour et al. [59] developed an efficient ANFIS method for agricultural drought detection utilizing a minimal number of variables. They applied ANFIS in conjunction with bio-inspired optimization methods, including ANFIS-PSO, ANFIS-GA, and ANFIS-ACO. Among these, GA and ACO proved to be the most effective algorithms for ANFIS optimization.
On the other hand, considerable research has described how the GA can adjust ANFIS parameters. For predicting rainfall on a river, Panda et al. [60] presented and applied the MR and ANFIS methods; both have been used as learning models to predict the outcome. To obtain the hydrological parameter condition, the GA is then coupled with the MR training technique, and the optimal control factor value of the objective function is obtained via the GA. A novel modified GA was developed by Sarkheyli et al. [61] using various population structures to improve the parameters of the fuzzy membership functions and rules of ANFIS.
Raftari et al. [31] calculated the friction strength ratio using a technique that employed two parameter-optimization methods, GA and PSO. Dehghani et al. [62] created a method for forecasting and simulating the short- to long-term inflow rate. To anticipate the quick, short, and long flow rates, ANFIS and GWO were combined, with GWO optimizing and modifying each parameter of ANFIS.
Maroufpoor et al. [63] created a method that combined the ANFIS with the GWO. The method outperformed the SVM, neural network, and standard ANFIS methods in terms of performance. A strategy for compressive power forecasting of energy, expense, and timeframe was presented by Golafshani et al. [64]. They employed the GWO and ANFIS methodologies to modify the ANN’s initial weights and parameters. A method for whale optimization algorithm (WOA) that used 28 days for the assessment of compressive power of concrete was proposed by Bui et al. [65]. The WOA is used to optimize its computational parameters in conjunction with a neural network (NN).

2.3 Uncertainty analysis

In risk evaluation, currently available information is gathered and used to inform judgments about the risk connected to a specific stressor, such as a physical, biological, or chemical factor. Risk assessment decisions are generally not made with complete clarity, which leads to confusion and uncertainty. Risk assessment therefore includes an uncertainty analysis, which concentrates on the assessment’s uncertainties. Its crucial elements are the qualitative analysis that detects the uncertainties, the quantitative analysis that examines how the uncertainties affect the decision-making process, and the communication of the uncertainty. The problem at hand determines how the uncertainty is analyzed [66]. How scientists view uncertainty frequently differs by field: a risk manager, for example, often perceives uncertainty through a decision-making lens, weighing the costs and errors of actions, and regards it as a bothersome element that impairs decisions.
Kläs et al. [34] proposed three effective categories for identifying the primary sources of uncertainty in practice: model fit, data quality, and scope compliance. They emphasize the significance of these categories in the context of AI and ML model development and testing by establishing connections with specific tasks and methods for assessing and addressing these uncertainties.
One of the hardest issues in medical image analysis is accurate automated image classification and segmentation. DL techniques have recently achieved success in the classification and segmentation of medical images, emerging as state-of-the-art techniques. However, most of these techniques are frequently overconfident and unable to offer uncertainty quantification (UQ) for their results, which can have severe effects. To solve this problem, Bayesian DL (BDL) techniques can be employed to quantify the uncertainty of conventional DL techniques. Abdar et al. [67] use three strategies to address uncertainty in the classification of skin cancer images: ensemble Monte Carlo (EMC) dropout, deep ensemble (DE), and Monte Carlo (MC) dropout. They offer a novel hybrid dynamic BDL method that accounts for uncertainty and relies on three-way decision (TWD) theory to address the ambiguity or uncertainty that remains after using the MC, EMC, and DE approaches.
Walayat et al. [68] introduced a novel predictive model based on fuzzy time series, weighted averages (WA), and induced ordered weighted averages (IOWA).
A recent development in water engineering is fuzzy logic, a soft computing approach of AI. It is a fantastic mathematical tool for dealing with system uncertainty brought on by fuzziness or ambiguity. Bisht et al. [69] applied fuzzy logic modeling and ANFIS as soft computing methodologies. These systems start with some fundamental guidelines that define the procedure. To predict the elevation of the ground water table, two methods using fuzzy rules and two methods using ANFIS have been created. Out of all the generated methods, ANFIS produced the best results based on performance criteria [69].
Finally, based on the literature review, traditional techniques such as LSE and GD have been employed to adjust the parameters of ANFIS [24–26] in uncertain environments. However, these techniques are often slow and may fail to converge. Furthermore, using the chain rule in conventional ANFIS learning systems, which employ the GD algorithm, can result in many local optima. Consequently, optimizing ANFIS parameters becomes a significant issue in real-world applications to handle uncertainty and improve accuracy. Hence, there is a growing demand to learn ANFIS parameters in SDP and choose the appropriate optimization algorithm for their management. In this study, the TFWO algorithm is selected to fine-tune ANFIS parameters due to its stable architecture, enhanced convergence capability, and effectiveness in addressing the control parameter selection issue. TFWO is inspired by the random and natural behavior of vortices in oceans, rivers, and seas.

3 Methods and materials

In this research, the methods and materials used to handle uncertainty in SDP are organized into three subsections: (1) ANFIS, which represents human reasoning to address uncertainty problems; ANFIS uses fuzzy logic to turn information connections and fully integrated components of NN inputs into the desired output. (2) The TFWO algorithm, used to modify the parameters of the ANFIS during the SDP process because of its efficiency and reliability. (3) Adaptation of ANFIS utilizing TFWO, which demonstrates the configuration of ANFIS with TFWO: the ANFIS system is trained using the TFWO algorithm to optimize its parameters. This adaptation is illustrated through the flowchart of TFWO in Fig. 5, Algorithm 1, and the architecture of the TFWO_ANFIS model in Fig. 6.

3.1 ANFIS: adaptive neuro-fuzzy inference system

Jang [70] introduced ANFIS, an AI technique that emulates human thought processes to address imprecision. ANFIS utilizes fuzzy logic to process inputs from integrated neural network components and information links to produce appropriate outputs. This method is a straightforward approach to learning from data. ANFIS combines fuzzy logic and ANN, making it capable of handling complex nonlinear problems, imprecise data, and human cognitive uncertainty within a single structure [71]. ANFIS is a widely used approximator in which the relationship between the input and output dimensions of the problem is represented as a collection of if–then rules.
The Mamdani fuzzy technique and the Takagi–Sugeno (T–S) fuzzy technique are two popular fuzzy rule-based inference systems [71]. The Mamdani fuzzy technique has several benefits: it is intuitive, widely accepted, and compatible with human cognition [72–74].
The T–S system ensures output surface continuity and performs well with linear techniques [75, 76]. However, it faces challenges in handling multi-parameter synthetic assessment and weighing each input while applying fuzzy rules. On the other hand, the Mamdani system is known for its readability and understandability to a broad audience. In this work, we employ the Mamdani system, which proves beneficial in output expression.
It is necessary to designate a function for each of the following operators to fully describe the behavior of a Mamdani system:
1.
For the computation of the rule firing strength with AND’ed premises, use the AND operator (often T-norm).
 
2.
OR operation for estimating the firing strength of a rule with OR’ed premises (often T-conorm).
 
3.
An operator for computing suitable consequent membership functions (MFs) depending on the firing strength provided, often a T-norm.
 
4.
An aggregate operation, typically a T-conorm, for combining qualified consecutive MFs to produce an overall output MF.
 
5.
A defuzzification operation that converts an output MF into a single crisp output value.
 
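A minimal numeric sketch of how these operators combine, assuming the product for AND and implication, the sum for aggregation, and centroid defuzzification on a discretized output axis; the function name, grid, and membership values are illustrative, not from the paper.

```python
def mamdani_output(firing, consequents, zgrid):
    """Sum-product Mamdani inference on a discretized output axis.

    firing[i]      -- firing strength of rule i (product T-norm over premises)
    consequents[i] -- rule i's consequent MF sampled on zgrid
    Implication scales each consequent MF by its firing strength (product),
    aggregation sums the scaled MFs, and defuzzification takes the centroid."""
    agg = [sum(w * mf[j] for w, mf in zip(firing, consequents))
           for j in range(len(zgrid))]
    num = sum(z * m for z, m in zip(zgrid, agg))
    den = sum(agg)
    return num / den  # centroid of area (COA)

# two rules with equal firing strength, consequent MFs peaked at z=1 and z=3:
# by symmetry the crisp output lands at the midpoint z=2
z_out = mamdani_output([1.0, 1.0],
                       [[0, 1, 0, 0, 0], [0, 0, 0, 1, 0]],
                       [0, 1, 2, 3, 4])
```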
The following result is derived when the AND and implication operations are the product, the aggregation operation is the sum, and the defuzzification operation is the centroid of area (COA) [77]. Implementing such composite inference has the benefit of allowing the Mamdani ANFIS to learn, owing to differentiability during processing (Table 1).
Table 1
Comparative study of SDP

Gyani et al. [49]
Contribution: Suggests a class imbalance reduction technique to address the imbalance between faulty and non-faulty modules, assessed against SMOTE and K-means on the datasets
Future improvements: Optimization algorithms such as ACO, feature subset selection, and the projected work can be tested for cross-project SDP to increase accuracy even more

Ganesh et al. [50]
Contribution: Investigates how different authors have used varying binary variants of meta-heuristic techniques to solve the problem of choosing the optimal feature sets
Future improvements: Employ a range of classifiers in a classification challenge and contrast them with the most popular ones

Naik et al. [51]
Contribution: Uses the bagging approach to prevent over-fitting and lower variance
Future improvements: Use a variety of DL techniques for analyzing the feature set and finding each defect in the product

Goyal [52]
Contribution: Demonstrates how a stacked ensemble addresses dataset class imbalance to increase the accuracy of the SDP classifier
Future improvements: Work may be performed utilizing DL architectures for various datasets

Raheem et al. [53]
Contribution: Uses classification techniques combined with meta-heuristic-based correlation feature selection algorithms to assess and pick the best features
Future improvements: Explore different feature selections and novel classification techniques to enhance the accuracy of SDP models
The following theorem [78] holds under sum-product composition; see Eqs. 1 and 2. When utilizing centroid defuzzification, the final crisp result equals the weighted average of the centroids of the consequent MFs, where:
$$ \psi \left( {r_{i} } \right) = \omega \left( {r_{i} } \right) \times a $$
(1)
where \(\psi \left({r}_{i}\right)\) is the factor weight of \({r}_{i}\); \(i\)th is the rule; \(\omega \left({r}_{i}\right)\) is the strength of firing the rule \({r}_{i}\); and \(a\) is the area of MFs in the consequent part of the rule \({r}_{i}\).
$$ Z_{{{\text{COA}}}} = \frac{{\int_{Z} z\mu_{C^{\prime}} \left( z \right){\text{d}}z}}{{\int_{Z} \mu_{C^{\prime}} \left( z \right){\text{d}}z}} = \frac{{\omega_{1} a_{1} z_{1} + \omega_{2} a_{2} z_{2} }}{{\omega_{1} a_{1} + \omega_{2} a_{2} }} = \omega^{\prime}_{1} a_{1} \cdot z_{1} + \omega^{\prime}_{2} a_{2} \cdot z_{2} $$
(2)
where \({a}_{i}\) is the area, and \({z}_{i}\) is the center of the consequent MF \(\mu_{{C}_{i}}(z)\). Using Eqs. 1 and 2, the corresponding Mamdani ANFIS is obtained as in Fig. 2.
Rule (1): If \(x\) is \({A}_{1}\) and \(y\) is \({B}_{1}\) then \({f}_{1}={\omega {\prime}}_{1}{a}_{1}\cdot {z}_{1}\)
Rule (2): If \(x\) is \({A}_{2}\) and \(y\) is \({B}_{2}\) then \({f}_{2}={\omega {\prime}}_{2}{a}_{2}\cdot {z}_{2}\)
where \({A}_{1}\) and \({A}_{2}\) are fuzzy sets for input \(x\), and \({B}_{1}\) and \({B}_{2}\) are fuzzy sets for input \(y\).
The outcome of each layer in the five-layer Mamdani ANFIS design is as follows [64, 71, 79–81].
Layer (1) Create the membership degrees \({\mu }_{A},{\mu }_{B}\)
$${O}_{1,i}={\mu }_{{A}_{i}}\left(x\right),\quad i=\mathrm{1,2}$$
(3a)
$$ O_{{1,i}} = \mu _{{B_{{i - 2}} }} \left( y \right),\quad i = 3,4 $$
(3b)
The MF is the generalized Gaussian function which is described by two parameters (d,\(\sigma \)):
$${\mu }_{{A}_{i}}\left(x\right)= {e}^{-\frac{1}{2}{(\frac{x-d}{\sigma })}^{2}}$$
(4)
Although the center d and width \(\sigma \) govern the Gaussian MF, they are often referred to as the premise parameters.
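As a quick illustration (a Python sketch; the paper's own implementation is in MATLAB), the Gaussian MF of Eq. 4 can be evaluated as:

```python
import numpy as np

def gaussian_mf(x, d, sigma):
    """Generalized Gaussian membership function of Eq. 4:
    mu(x) = exp(-0.5 * ((x - d) / sigma)^2),
    where d is the center and sigma the width (the premise parameters)."""
    return np.exp(-0.5 * ((x - d) / sigma) ** 2)

# The degree of membership peaks at the center and decays with distance.
mu_center = gaussian_mf(5.0, d=5.0, sigma=2.0)   # exactly 1.0 at the center
mu_off = gaussian_mf(7.0, d=5.0, sigma=2.0)      # one width away: exp(-0.5)
```

Because only these two parameters exist per MF, tuning the premise part of the network reduces to adjusting a small vector of centers and widths.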
Layer (2)
$${O}_{2,i}={\omega }_{i}={\mu }_{{A}_{i}}\left(x\right)\times {\mu }_{{B}_{i}}\left(y\right),\quad i=\mathrm{1,2}$$
(5)
The product approach generates the firing strength \({\omega }_{i}\).
Layer (3)
$${O}_{3,i}={\omega {\prime}}_{i}=\frac{{\omega }_{i}}{{\omega }_{1}+{\omega }_{2}},\quad i=\mathrm{1,2}$$
(6)
Layer (4)
$$ O_{4,i} = f_{i} = \omega ^{\prime}_{i} a_{i} \cdot z_{i} ,\quad i = 1,2 $$
(7)
where the consequential parameters, \({ a}_{i}\) and \( {z}_{i}\), are, respectively, the area and center of the resulting MFs.
Layer (5)
$$ O_{5,i} = \sum f_{i} = \sum \omega ^{\prime}_{i} a_{i} \cdot z_{i} ,\quad i = 1,2 $$
(8)
As shown in Fig. 3, a general M-ANFIS system can be generated.
Rule (1): If \(x\) is \({A}_{1}\) and \(y\) is \({B}_{1}\) then \(Z={C}_{1}\)
Rule (2): If \(x\) is \({A}_{2}\) and \(y\) is \({B}_{2}\) then \(Z={C}_{2}\)
The outcome of each layer in the five layers of general M-ANFIS design is as follows.
Layer (1) Layer of fuzzification
$${O}_{1,i}={\mu }_{{A}_{i}}\left(x\right),\quad i=1, 2$$
(9a)
$${O}_{1,i}={\mu }_{{B}_{i-2}}\left(y\right),\quad i=3, 4$$
(9b)
The MF is the generalized Gaussian function which is described by two parameters (d,\(\sigma \)):
$${\mu }_{{A}_{i}}\left(x\right)= {e}^{-\frac{1}{2}{(\frac{x-d}{\sigma })}^{2}}$$
(10)
Layer (2) Layer of rules
$${O}_{2,i}={\omega }_{i}={\mu }_{{A}_{i}}\left(x\right)\times {\mu }_{{B}_{i}}\left(y\right),\quad i=\mathrm{1,2}$$
(11)
The product approach generates the firing strength \({\omega }_{i}\)
Layer (3)
$${O}_{3,i}={\omega }_{i}^\circ {C}_{i},\quad i=\mathrm{1,2}$$
(12)
Product is the implication operator.
Layer (4) Layer of aggregation
$${O}_{4}=\sum {\omega }_{i}^\circ {C}_{i},\quad i=\mathrm{1,2}$$
(13)
Sum is the aggregation operator, and \({C}_{i}\) contains the consequent parameters.
Layer (5) Layer of defuzzification
$${O}_{5}=f=D^\circ {O}_{4}$$
(14)
The defuzzification approach COA yields a crisp or sharp output.
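A corresponding sketch for the general M-ANFIS, with each consequent fuzzy set \(C_i\) sampled on a discrete output universe so that implication (Eq. 12), aggregation (Eq. 13), and COA defuzzification (Eq. 14) are explicit (Python; values are illustrative):

```python
import numpy as np

def gaussian_mf(z, d, sigma):
    return np.exp(-0.5 * ((z - d) / sigma) ** 2)

def m_anfis_forward(x, y, premise_A, premise_B, consequent, z_grid):
    """Layers 1-5 of the general M-ANFIS (Eqs. 9-14). Each consequent C_i is
    a Gaussian MF sampled on the discrete universe z_grid."""
    # Layers 1-2: fuzzification and product firing strengths (Eqs. 9-11)
    w = [gaussian_mf(x, *premise_A[i]) * gaussian_mf(y, *premise_B[i])
         for i in range(2)]
    # Layer 3: implication (product) of firing strength and consequent MF (Eq. 12)
    implied = [w[i] * gaussian_mf(z_grid, *consequent[i]) for i in range(2)]
    # Layer 4: aggregation by sum (Eq. 13)
    aggregated = implied[0] + implied[1]
    # Layer 5: centroid-of-area defuzzification (Eq. 14)
    return float(np.sum(z_grid * aggregated) / np.sum(aggregated))

z_grid = np.linspace(-5.0, 5.0, 1001)
premise_A = [(0.0, 1.0), (2.0, 1.0)]      # (center d, width sigma) of A1, A2
premise_B = [(0.0, 1.0), (2.0, 1.0)]
consequent = [(-1.0, 0.5), (3.0, 0.5)]    # C1, C2 as Gaussian output sets
out = m_anfis_forward(1.0, 1.0, premise_A, premise_B, consequent, z_grid)
```

With equal firing strengths, the aggregated output MF is symmetric about z = 1, so the centroid lands there (up to discretization error).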
The ANFIS training process employs both forward and backward passes to update its parameters. ANFIS tunes its parameters to reduce the error between predicted and target outcomes using a hybrid of GD (gradient descent) and an LSE (least squares error) estimator, as shown in Table 2.
In the forward pass of the learning method, node outputs progress from layer 1 to layer 4, and the consequent parameters are selected and updated using LSE. In the backward pass, GD updates the premise parameters as error signals propagate backward from the output to the input. The network thereby learns parameter values that best fit the training data.
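A minimal numerical sketch of one hybrid epoch, assuming the two-rule Mamdani network of Sect. 3.1. Treating \(p_i = a_i z_i\) as a single consequent parameter makes the forward-pass LSE step a plain linear least-squares solve; the analytically back-propagated GD signals are replaced here by finite differences, so this is a stand-in, not the authors' implementation:

```python
import numpy as np

def gaussian_mf(x, d, sigma):
    return np.exp(-0.5 * ((x - d) / sigma) ** 2)

def normalized_strengths(X, centers, widths):
    """Layers 1-3 for a two-rule, two-input network: K x 2 matrix of w'_i."""
    w = np.stack([gaussian_mf(X[:, 0], centers[i, 0], widths[i, 0]) *
                  gaussian_mf(X[:, 1], centers[i, 1], widths[i, 1])
                  for i in range(2)], axis=1)
    return w / w.sum(axis=1, keepdims=True)

def hybrid_epoch(X, t, centers, widths, lr=0.01, eps=1e-5):
    """One hybrid-learning epoch. Forward: with the premise fixed, the output
    f = sum_i w'_i * p_i is linear in p_i = a_i * z_i, so the consequent
    parameters come from least squares (LSE). Backward: the premise parameters
    move along a finite-difference gradient of the MSE (GD stand-in)."""
    W = normalized_strengths(X, centers, widths)
    p, *_ = np.linalg.lstsq(W, t, rcond=None)            # forward pass: LSE
    def loss(c, s):
        return np.mean((normalized_strengths(X, c, s) @ p - t) ** 2)
    base = loss(centers, widths)
    grad_c, grad_s = np.zeros_like(centers), np.zeros_like(widths)
    for idx in np.ndindex(centers.shape):                # backward pass: GD
        c2 = centers.copy(); c2[idx] += eps
        grad_c[idx] = (loss(c2, widths) - base) / eps
        s2 = widths.copy(); s2[idx] += eps
        grad_s[idx] = (loss(centers, s2) - base) / eps
    return centers - lr * grad_c, widths - lr * grad_s, p, base

X = np.array([[0., 0.], [0., 2.], [2., 0.], [2., 2.], [1., 1.]])
t = np.array([0., 1., 1., 2., 1.])                       # target: t = (x + y) / 2
centers = np.array([[0., 0.], [2., 2.]])
widths = np.ones((2, 2))
centers, widths, p, err = hybrid_epoch(X, t, centers, widths)
```

On this toy problem the LSE step alone fits the targets exactly, which is why the hybrid scheme converges much faster than pure gradient descent on the consequent part.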

3.2 TFWO: turbulent flow of water optimization

A novel and effective optimization technique, TFWO, is utilized in this paper. The random, natural behavior of vortices in oceans, rivers, and seas inspired this technique. TFWO is selected for its stable structure, which strengthens convergence and avoids the difficulty of tuning control parameters. TFWO is utilized to locate global solutions in various dimensions [27]. In addition, two real-world engineering optimization challenges have been addressed using TFWO: reliability–redundancy allocation optimization for the over-speed protection mechanism of a gas turbine and several kinds of nonlinear economic load dispatch problems in energy systems. The outcomes demonstrate the TFWO algorithm’s superiority and reliability in contrast with other meta-heuristic optimization techniques.

3.2.1 The whirlpool concept: an introduction to turbulent water flow

A whirlpool forms when water moves turbulently in a narrow, circular path, typically around a submerged obstacle like a rock. The gravitational force influences this circular motion, causing the water to follow a downward-spiraling pattern. As the water spirals, it accelerates, creating a small hole at its center, which further increases the flow speed. The formation of a whirlpool occurs as water is drawn into this central hole, causing a spinning motion [27].

3.2.2 TFWO algorithm

Whirlpools occur in seas, rivers, and oceans as a random act of nature. The middle of a whirlpool functions as a sucking hole, pulling the surrounding particles and objects toward its core and interior, i.e., applying centripetal force on them. In reality, a whirlpool is a body of moving water mostly caused by ocean tides. Whirlpools can emerge where a few small ridges sit next to one another on a streamlet’s surface. These ridges deflect the rushing water, which then circles back around itself, so the water progressively amalgamates around this circuit and forms a funnel as it passes in a restricted path around the ridges. Centrifugal force causes the water to flow in this way. Whirlpools near one another sometimes interact, in addition to affecting the particles and objects in their immediate surroundings, as shown in the next subsections [27].
3.2.2.1 The impacts of whirlpools on its set of objects and other whirlpools
The starting population (\({X}^{o}\), consisting of \({N}_{p}\) members) is distributed equally among \({N}_{wh}\) whirlpool sets; the strongest object of each set (the one with the best objective value \(f\)) is taken as the whirlpool that pulls the remaining \({N}_{p}-{N}_{wh}\) objects \((X)\).
Every whirlpool \((wh)\) functions as a sucking hole or well and, by exerting centripetal force on the objects in its set \((X)\), tends to bring their locations into alignment with its central position. Accordingly, the \(j\)th whirlpool tends to make the position of the \(i\)th object \(({X}_{i})\) equal to its own position, i.e., \({X}_{i}={wh}_{j}\). However, the other whirlpools impose certain deviations \((\Delta {X}_{i})\) that depend on their objective values \((f)\) and their distance \((wh-{wh}_{j})\). The updated position of the \(i\)th object is then \({X}_{i}^{new}={wh}_{j}-{\Delta X}_{i}\). The objects \((X)\) rotate around the center of their whirlpool at their own angle \((\theta )\), which is updated at every iteration of the algorithm, as shown in Fig. 4.
$${\theta }_{i}^{new}={\theta }_{i}+{rand}_{1}*{rand}_{2}*\pi $$
(15)
To compute \(\Delta {X}_{i}\), the farthest and closest whirlpools, that is, the whirlpools with the highest and lowest weighted distance from the object, are identified using Eq. (16). Then \(\Delta {X}_{i}\) is computed by Eq. (17), and the object’s position is updated by Eq. (18).
$${\Delta }_{t}={f(wh}_{t})*{{abs(wh}_{t}-sum({X}_{i}))}^{0.5}$$
(16)
$${\Delta {\text{X}}}_{i}=({\text{cos}}({\theta }_{i}^{new})*rand\left(1,D\right)*\left({wh}_{f}-{X}_{i}\right)-{\text{sin}}({\theta }_{i}^{new})*rand(1,D)*\left({wh}_{w}-{X}_{i}\right))*(1+abs({\text{cos}}({\theta }_{i}^{new})-{\text{sin}}({\theta }_{i}^{new})))$$
(17)
$${X}_{i}^{new}={wh}_{j}-{\Delta X}_{i}$$
(18)
where \({wh}_{w}\) and \({wh}_{f}\) are the whirlpools with the highest and lowest values of \({\Delta }_{t}\), respectively. \({\theta }_{i}\) is the \(i\)th particle’s angle.
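Eqs. 15-18 can be sketched in Python as follows (reading Eq. 16 as a scalar influence weight per whirlpool, obtained by summing positions over the dimensions, is our interpretation):

```python
import numpy as np

rng = np.random.default_rng(42)

def object_update(x, theta, whirlpools, f_vals, j):
    """Position update of one object held by the j-th whirlpool (Eqs. 15-18).
    whirlpools: (N_wh, D) array of positions; f_vals: their objective values."""
    D = x.size
    theta_new = theta + rng.random() * rng.random() * np.pi           # Eq. 15
    delta = f_vals * np.abs(whirlpools.sum(axis=1) - x.sum()) ** 0.5  # Eq. 16
    wh_f = whirlpools[np.argmin(delta)]       # lowest weighted distance
    wh_w = whirlpools[np.argmax(delta)]       # highest weighted distance
    dx = (np.cos(theta_new) * rng.random(D) * (wh_f - x)
          - np.sin(theta_new) * rng.random(D) * (wh_w - x)) \
        * (1 + abs(np.cos(theta_new) - np.sin(theta_new)))            # Eq. 17
    return whirlpools[j] - dx, theta_new                              # Eq. 18

whirlpools = np.array([[0., 0.], [1., 1.], [-1., 2.]])
f_vals = np.array([3.0, 1.0, 2.0])            # lower objective is better
x_new, theta_new = object_update(np.array([0.5, -0.5]), 0.1,
                                 whirlpools, f_vals, j=0)
```

The random angle keeps the rotation stochastic, so repeated updates explore around the whirlpool center rather than collapsing onto it.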
3.2.2.2 Centrifugal force \(({\mathbf{F}\mathbf{E}}_{\mathbf{i}})\)
While centripetal force attracts moving objects toward the whirlpool’s center, centrifugal force pushes them away from that center, as represented in Eq. (19). If this force exceeds a randomly generated number between 0 and 1, the centrifugal operation is performed on the randomly chosen dimension according to Eq. (20).
$${FE}_{i}={({({\text{cos}}({\theta }_{i}^{new}))}^{2}*{({\text{sin}}({\theta }_{i}^{new}))}^{2})}^{2}$$
(19)
$${X}_{i,p}={X}_{p}^{min}+rand*({X}_{p}^{max}-{X}_{p}^{min})$$
(20)
3.2.2.3 The whirlpools’ interactions
Whirlpools interact with and move around one another, in a manner similar to a whirlpool acting on the particles in its surroundings, as shown in Eqs. 21, 22, and 23.
$${\Delta }_{t}={f(wh}_{t})*abs({wh}_{t}-sum({wh}_{j}))$$
(21)
$${\Delta wh}_{j}=rand\left(1,D\right)*abs({\text{cos}}{(\theta }_{j}^{new})+{\text{sin}}{(\theta }_{j}^{new}))*({wh}_{f}-{wh}_{j})$$
(22)
$${wh}_{j}^{new}= {wh}_{f}-{\Delta wh}_{j}$$
(23)
where \({\theta }_{j}\) is the angle of the \(j\)th whirlpool opening.
Finally, if any new object in a whirlpool’s set is stronger than its whirlpool (i.e., its objective value is lower), it replaces that whirlpool in the following iteration. All the previous steps are summarized in Fig. 5.
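Eqs. 21-23 and the replacement rule can be sketched as (Python; the self-exclusion when choosing the pulling whirlpool is our assumption):

```python
import numpy as np

rng = np.random.default_rng(3)

def whirlpool_interaction(wh, f_vals, theta, j):
    """Update of the j-th whirlpool under the pull of the most influential
    other whirlpool (Eqs. 21-23). wh: (N_wh, D) positions; f_vals: objective
    values; theta: opening angles, updated as in Eq. 15."""
    theta[j] += rng.random() * rng.random() * np.pi
    delta = f_vals * np.abs(wh.sum(axis=1) - wh[j].sum())             # Eq. 21
    delta[j] = np.inf                   # a whirlpool does not act on itself
    wh_f = wh[np.argmin(delta)]         # the whirlpool with the lowest delta
    d_wh = (rng.random(wh.shape[1])
            * abs(np.cos(theta[j]) + np.sin(theta[j]))
            * (wh_f - wh[j]))                                         # Eq. 22
    return wh_f - d_wh                                                # Eq. 23

def maybe_replace(wh_j, f_wh, x_best, f_best):
    """If the strongest object of a set beats its whirlpool (lower objective
    value), it becomes the whirlpool for the next iteration."""
    return (x_best, f_best) if f_best < f_wh else (wh_j, f_wh)

wh = np.array([[0., 0.], [4., 4.]])
theta = np.array([0.0, 0.0])
new_pos = whirlpool_interaction(wh, np.array([2.0, 1.0]), theta, j=0)
pos, val = maybe_replace(wh[1], 5.0, np.array([2., 2.]), 3.0)
```

The replacement test is what lets the search center migrate: any set member that finds a better basin promotes itself to whirlpool status.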

3.3 Adaptation of ANFIS utilizing TFWO

In this study, both the consequent and the antecedent (premise) parameters of the ANFIS model are adjusted using the TFWO algorithm. Conventional ANFIS training employs the hybrid optimization technique GD_LSE, which combines GD and LSE: LSE modifies the consequent parameter values in the forward pass, while GD modifies the membership-function (premise) parameters in the backward pass, similar to back-propagation (as shown in Table 2). Table 2 is updated in accordance with the proposed model, as shown in Table 3.
Table 2
Training general ANFIS
Parameters
Forward
Backward
Premise
Settled
GD
Consequential
LSE
Settled
Table 3
Training ANFIS with TFWO
Parameters
Forward
Backward
Premise
Settled
TFWO
Consequential
TFWO
Settled
Traditional mathematical programming methods often fail to provide optimal solutions for real-world optimization problems due to the large number of parameters involved [27, 82]. GD and LSE are examples of deterministic methods that are slow and occasionally fail to converge, and a major critique of GD is that it tends to get stuck in local minima, which TFWO avoids. Compared with GD, TFWO learns the ANFIS parameters more quickly and flexibly since it is computationally less expensive. The total number of adjustable ANFIS parameters is a crucial element in the development of an ANFIS network because of the processing effort required for the adaptation process; therefore, the membership function type should be chosen with care. The Gaussian function is preferable to other membership functions because it requires only two parameters, the center and the width, as illustrated in Eq. 4.
The complete TFWO cycles with ANFIS are depicted in Fig. 6 and Algorithm 1. They outline the steps of the proposed TFWO_ANFIS as follows:
1.
Data are divided at the beginning of the model into training and testing sets, chosen at random to avoid local optima and over-fitting issues; 70% of each dataset is used for training. This maintains a proper level of population variety and increases the global search capability.
 
2.
Create an initial ANFIS model utilizing fuzzy C-means clustering (FCM) to find the degrees of membership. The ANFIS model contains a set of premise and consequent parameters that describe the membership functions in the two parts of each if–then rule; it is created utilizing the equations described in layers 1 to 5 in Sect. 3.1.
 
3.
Feed the ANFIS parameters (premise and consequent) to the TFWO algorithm together with the training data.
 
4.
Initialization: create an initial population randomly, assess the fitness (MSE) of each initialized member, and split the population into \({N}_{wh}\) sets of whirlpools.
 
5.
The TFWO algorithm iterates toward its best whirlpool, modifying the parameters of ANFIS based on the MSE fitness function of each whirlpool.
 
6.
After MaxDecades iterations, the best ANFIS model is returned, and its result is computed on the training data.
 
7.
Compute the result of the best ANFIS on the remaining testing data (30%).
 
8.
Performance is evaluated using SD, RMSE, MSE, MBE, and accuracy.
 
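The eight steps can be condensed into a runnable skeleton (heavily simplified: the ANFIS is reduced to a single Gaussian rule and the TFWO loop to one whirlpool set without the centrifugal and interaction operators, so every name here is illustrative; the paper's implementation used MATLAB R2016a):

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Step 1: random 70/30 train/test split ---------------------------------
def split(X, y, train_frac=0.7):
    idx = rng.permutation(len(X))
    cut = int(train_frac * len(X))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]

# --- Steps 2-3: an ANFIS stand-in whose parameters TFWO will tune ----------
def predict(params, X):
    d, sigma, p = params                 # one Gaussian rule: center, width, weight
    return np.exp(-0.5 * ((X - d) / sigma) ** 2) * p

def mse(params, X, y):                   # fitness function (Eq. 24)
    return np.mean((predict(params, X) - y) ** 2)

# --- Steps 4-6: a bare-bones TFWO-style search over the parameters ---------
def tfwo_fit(X, y, n_pop=30, max_decades=100):
    pop = rng.uniform(-10, 10, size=(n_pop, 3))         # bounds as in Sect. 4
    fit = np.array([mse(p, X, y) for p in pop])
    best = pop[np.argmin(fit)].copy()    # the whirlpool of this single set
    for _ in range(max_decades):
        for i in range(n_pop):
            theta = rng.random() * rng.random() * np.pi
            dx = np.cos(theta) * rng.random(3) * (best - pop[i])
            cand = best - dx             # pull the object toward the whirlpool
            if mse(cand, X, y) < fit[i]:
                pop[i], fit[i] = cand, mse(cand, X, y)
        if fit.min() < mse(best, X, y):  # a stronger object replaces the whirlpool
            best = pop[np.argmin(fit)].copy()
    return best

# --- Steps 7-8: evaluate the tuned model on the held-out 30% ---------------
X = rng.uniform(0, 4, 200)
y = np.exp(-0.5 * (X - 2.0) ** 2) * 1.5  # synthetic, recoverable target
Xtr, ytr, Xte, yte = split(X, y)
best = tfwo_fit(Xtr, ytr)
test_mse = mse(best, Xte, yte)
```

In the full model the parameter vector holds all premise and consequent parameters of the FCM-initialized rule base, and the evaluation in step 8 additionally reports SD, RMSE, MBE, and accuracy.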
When comparing the target to the actual output, the fitness function is measured as the mean square error (MSE), as shown in Eq. 24.
$$ {\text{MSE}} = \frac{{\mathop \sum \nolimits_{m = 1}^{K} \left( {out_{m} - out_{m}^{{\Lambda }} } \right)^{2} }}{K} $$
(24)
where \({out}_{m}\) is the target (desired outcome), \({out}_{m}^{\Lambda }\) is the predicted outcome, and \(K\) is the number of data samples.
As depicted in Fig. 6, the initial stage of the model randomly splits the data into training and testing sets to avoid issues such as local optima and over-fitting; this random selection maintains population diversity and enhances global search capability. The ANFIS system design utilizes fuzzy C-means clustering (FCM) to identify the degrees of membership, and the model includes sets of premise and consequent parameters that describe the membership functions in the two parts of each if–then rule. The ANFIS parameters are used as input for the TFWO algorithm, which creates an initial population randomly, assesses the fitness of the initialized population by MSE, and splits the population into \(N_{wh}\) sets of whirlpools. These processes are repeated until the maximum number of iterations is reached. Afterwards, the system evaluates the best ANFIS on the testing data, and performance is assessed using SD, RMSE, MSE, MBE, and accuracy.

4 Results and discussion

This section details the evaluation of TFWO_ANFIS efficiency. The experiment assesses the effectiveness and efficiency of the TFWO_ANFIS model in addressing uncertainty in the SDP field with higher accuracy and achieving the lowest error on four datasets obtained from OPENML [83]. This experiment marks the first use of TFWO with ANFIS to enhance SDP. The TFWO_ANFIS model is designed to better manage software metrics’ uncertainty and predict defects with higher accuracy. We compare TFWO_ANFIS with conventional ANFIS, ACO_ANFIS, DE_ANFIS, PSO_ANFIS, GWO_ANFIS, and GA_ANFIS. The evaluation of TFWO_ANFIS against recent relevant studies in SDP demonstrates its superior performance over all other techniques.

4.1 Evaluation performance

To evaluate the effectiveness of the recommended TFWO_ANFIS technique and the performance of the results, the following metrics are employed:
1.
MSE:
$$ {\text{MSE}} = \frac{{\mathop \sum \nolimits_{m = 1}^{K} \left( {out_{m} - out_{m}^{{\Lambda }} } \right)^{2} }}{K} $$
(25)
where \({out}_{m}\) is the target (desired output), \({out}_{m}^{\Lambda }\) is the predicted output, and \(K\) is the size of data.
 
2.
RMSE:
$$ {\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{m = 1}^{K} \left( {out_{m} - out_{m}^{{\Lambda }} } \right)^{2} }}{K}} $$
(26)
 
3.
SD:
$$ {\text{SD}} = \sqrt {\frac{{\mathop \sum \nolimits_{m = 1}^{K} \left( {X_{m} - \mu } \right)^{2} }}{K}} $$
(27)
 
4.
Mean bias error (MBE):
$$ {\text{MBE}} = \frac{1}{K}\mathop \sum \limits_{m = 1}^{K} \left( {out_{m}^{{\Lambda }} - out_{m} } \right) $$
(28)
where, in Eq. 27, \({X}_{m}\) is each value from the population, \(\mu \) is the mean, and \(K\) is the population size.
 
5.
Accuracy (ACC):
$$ {\text{ACC}} = \left( {{\text{TP}} + {\text{TN}}} \right)/\left( {{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}} \right) $$
(29)
 
6.
Specificity (SP):
$$ {\text{SP}} = {\text{TN}}/\left( {{\text{TN}} + {\text{FP}}} \right) $$
(30)
 
7.
Sensitivity (S):
$$ S = {\text{TP}}/ \left( {{\text{TP}} + {\text{FN}}} \right) $$
(31)
 
8.
Precision (P):
$$ P = {\text{TP}}/\left( {{\text{TP}} + {\text{FP}}} \right) $$
(32)
 
The model is considered suitable for training when MBE equals zero. A negative MBE suggests that the model underestimates, while a positive MBE indicates overestimation during the training phase [58].
where TP, TN, FP, and FN are defined in the confusion matrix (Table 4).
Table 4
Confusion matrix
Predicted label
Target label
Negative
Positive
Negative
True negative (TN)
False positive (FP)
Positive
False negative (FN)
True positive (TP)
A common way to display the efficiency of a classification technique is by using a confusion matrix [84]. This matrix includes both the predicted class value and its corresponding actual class. These values are employed to assess the classifier’s performance, as shown in Table 4.
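For reference, Eqs. 25-28 in code (a Python sketch; MBE here is the signed mean of predicted minus target, so that, per the note above, a negative value signals underestimation):

```python
import numpy as np

def mse(y, yhat):
    """Eq. 25: mean square error."""
    return float(np.mean((y - yhat) ** 2))

def rmse(y, yhat):
    """Eq. 26: root-mean-square error."""
    return float(np.sqrt(mse(y, yhat)))

def sd(x):
    """Eq. 27: population standard deviation."""
    return float(np.sqrt(np.mean((x - np.mean(x)) ** 2)))

def mbe(y, yhat):
    """Eq. 28: mean bias error; the sign shows whether the model tends to
    overestimate (positive) or underestimate (negative) the targets."""
    return float(np.mean(yhat - y))

y = np.array([1.0, 0.0, 1.0, 1.0])        # targets
yhat = np.array([0.8, 0.2, 0.6, 1.0])     # predictions
err_mse, err_rmse, bias = mse(y, yhat), rmse(y, yhat), mbe(y, yhat)
```

On this toy example the bias is negative, indicating net underestimation, while RMSE is simply the square root of MSE, as the two equations imply.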

4.2 TFWO_ANFIS evaluation

4.2.1 Tools and environment

This subsection uses four software defect datasets obtained from OPENML [83] to evaluate the effectiveness and efficiency of the proposed technique (TFWO_ANFIS) in addressing uncertainty issues in the field of software defect prediction (SDP). These datasets were selected for their variations in sample sizes, features, and numbers of defects, providing the diversity needed for the study. They include essential information for SDP and were made publicly available to support the development of reliable, repeatable, verifiable, and improvable defect prediction models. The datasets were generated with the McCabe and Halstead source-code feature extractors, which capture code aspects related to software quality, such as lines of code, cyclomatic complexity, volume, Halstead’s line count, and counts of unique operators and operands. Detailed characteristics of these datasets are presented in the following table (Table 5).
Table 5
Datasets descriptions
Dataset
Samples
Features
Defects
Defects (%)
KC2
522
22
107
20.5
PC3
1563
38
160
10.23
KC1
2109
22
326
15.5
PC4
1458
38
178
12.21
In this experiment, the TFWO_ANFIS model is tested against various meta-heuristic methods, including ACO, PSO, GWO, standard ANFIS, DE [85], and GA. The dataset is split into 70% for training and 30% for testing. Parameters for each algorithm are found in Table 6. The experiments were conducted on a system running Windows 10 Pro (64-bit) with an Intel(R) Core(TM) i5 CPU and 4 GB of RAM. MATLAB (R2016a) [86] was used for all implementations.
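The 70/30 split sizes implied by Table 5 can be checked with a small helper (`split_sizes` is a hypothetical name, and nearest-integer rounding is our assumed convention; for KC2 it yields 157 test samples, which matches the 123 + 16 + 4 + 14 = 157 instances in the KC2 confusion matrix of Table 10):

```python
# Hypothetical helper illustrating the 70/30 train/test split sizes.
def split_sizes(n_samples, train_frac=0.7):
    n_train = round(train_frac * n_samples)   # nearest-integer rounding assumed
    return n_train, n_samples - n_train

datasets = {"KC2": 522, "PC3": 1563, "KC1": 2109, "PC4": 1458}
sizes = {name: split_sizes(n) for name, n in datasets.items()}
# e.g. KC2 -> (365, 157): 365 training and 157 testing samples
```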
Table 6
Parameters configuration of different models
Parameters
Iters
Error_goal
Ini_step
Step_decrease
Step_increase
ANFIS
Values
100
0
0.01
0.9
1.1
Parameters
Iters
Crossover
Mutation
Selection_pressure
Mutation_rate
GA_ANFIS
Values
100
0.7
0.5
8
0.1
Parameters
Personal_learn
Global_learn
Inertia_weight
Inertia_damping
Velocity_limits
PSO_ANFIS
Values
1
2
1
0.99
− 10 to 2
Parameters
Min_scaling_factor
Max_scaling_factor
Crossover_probability
DE_ANFIS
Values
0.2
0.8
0.1
Parameters
Iters
Populations
a
GWO_ANFIS
Values
100
93
Decreased from 2:0
Parameters
Iters
Selection_pressure
Deviation-distance ratio
ACO_ANFIS
Values
100
0.4
1
All optimization techniques use the following parameters: maximum decades (iterations) = 100 and population size = 93, according to the TFWO equation \({N}_{pop}={Nw}_{h}+{Nw}_{h}\times {N}_{obw}\) with \({Nw}_{h}=3\) and \({N}_{obw}=30\); the upper and lower bounds are 10 and − 10, respectively (Fig. 7).

4.2.2 Output of experiment

The experiment used common metrics, such as accuracy, RMSE, precision, SD, specificity, sensitivity, and MSE, to evaluate the TFWO_ANFIS model’s performance in optimizing the ANFIS parameters. Average results over the ten experiments are presented in Tables 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17 and Figs. 8, 9, 10, and 11. These figures and tables demonstrate that TFWO_ANFIS outperforms the other algorithms across all four datasets. In particular, Tables 11, 13, 15, and 17 show that TFWO_ANFIS achieved the highest accuracy of all algorithms on every dataset used in this experiment, validating the effectiveness and efficiency of the recommended model for tuning the ANFIS parameters. Additionally, the convergence rate is a typical metric for optimization techniques [87]. As shown in Fig. 7, convergence describes how a solution progresses through the iterations to an acceptable point in less time. In this study, TFWO converges within about 3% of the total number of iterations, so the best-fit individual is reached more quickly.
Table 7
MSE in testing
Technique/datasets
KC2
PC3
KC1
PC4
ANFIS
0.1749
0.0909
0.6702
0.4440
GA_ANFIS
0.1111
0.0906
0.1236
0.0998
PSO_ANFIS
0.1292
0.0877
0.1099
0.0910
DE_ANFIS
0.1261
0.1134
0.1192
0.1053
GWO_ANFIS
0.2202
0.0868
0.1070
0.0900
ACO_ANFIS
0.1371
0.1057
0.1135
0.1109
TFWO_ANFIS
0.1091
0.0770
0.1026
0.0850
Table 8
RMSE testing
Technique/datasets
KC2
PC3
KC1
PC4
ANFIS
0.4182
0.3016
0.8187
0.6663
GA_ANFIS
0.3333
0.3011
0.3515
0.3160
PSO_ANFIS
0.3595
0.2961
0.3319
0.3015
DE_ANFIS
0.3551
0.3368
0.3453
0.3245
GWO_ANFIS
0.4693
0.2646
0.3271
0.2999
ACO_ANFIS
0.3702
0.3252
0.3369
0.3331
TFWO_ANFIS
0.3303
0.2776
0.3203
0.2926
Table 9
SD in testing
Technique/datasets
KC2
PC3
KC1
PC4
ANFIS
0.4165
0.3018
0.8174
0.6661
GA_ANFIS
0.3345
0.3013
0.3474
0.3158
PSO_ANFIS
0.3602
0.2964
0.3317
0.3013
DE_ANFIS
0.3527
0.3024
0.3436
0.3224
GWO_ANFIS
0.4705
0.2943
0.3272
0.2982
ACO_ANFIS
0.3706
0.3246
0.3341
0.3335
TFWO_ANFIS
0.3307
0.2885
0.3205
0.2929
Table 10
Confusion matrix for testing KC2
Predicted label
Target label
Negative
Positive
Negative
123
16
Positive
4
14
Table 11
Comparative between TFWO_ANFIS and others for KC2
Technique
P
SP
S
ACC
ANFIS
31.7
79.7
68.4
78.3
GA_ANFIS
25.0
83.6
72.7
82.8
PSO_ANFIS
15.2
80.8
45.5
78.3
DE_ANFIS
10.3
83.0
75.0
82.8
GWO_ANFIS
33.3
84.3
64.7
82.2
ACO_ANFIS
29.4
83.3
76.9
82.8
TFWO_ANFIS
46.7
88.5
77.8
87.3
Table 12
Confusion matrix for testing PC3
Predicted label
Target label
Negative
Positive
Negative
422
42
Positive
4
1
Table 13
Comparative between TFWO_ANFIS and others for PC3
Technique
P
SP
S
ACC
ANFIS
4.2
90.1
33.3
89.3
GA_ANFIS
3.8
89.3
66.7
89.1
PSO_ANFIS
4.0
89.5
20.0
88.1
DE_ANFIS
20.4
91.1
30.3
86.6
GWO_ANFIS
7.1
88.7
57.1
88.3
ACO_ANFIS
4.4
90.7
33.3
89.9
TFWO_ANFIS
2.3
90.9
20.0
90.2
Table 14
Confusion matrix for testing KC1
Predicted label
Target label
Negative
Positive
Negative
430
80
Positive
10
13
Table 15
Comparative between TFWO_ANFIS and others for KC1
Technique
P
SP
S
ACC
ANFIS
11.7
85.2
66.7
84.7
GA_ANFIS
30.9
88.2
35.4
81.4
PSO_ANFIS
8.9
85.1
64.3
84.7
DE_ANFIS
6.1
85.0
50.0
84.4
GWO_ANFIS
17.5
86.0
64.3
85.0
ACO_ANFIS
15.5
86.4
51.7
84.8
TFWO_ANFIS
14.0
86.9
56.5
85.8
Table 16
Confusion matrix for testing PC4
Predicted label
Target label
Negative
Positive
Negative
381
45
Positive
2
9
Table 7 and Fig. 9 show the MSE metric, calculated according to Eq. 25, comparing the proposed TFWO_ANFIS with common meta-heuristic optimization techniques from the literature, such as PSO, GA, GWO, DE, ACO, and standard ANFIS. The MSE scores of the proposed TFWO_ANFIS are 0.1091, 0.0770, 0.1026, and 0.0850 for the KC2, PC3, KC1, and PC4 datasets, respectively. Table 8 and Fig. 10 present the RMSE metric, computed according to Eq. 26; the proposed model (TFWO_ANFIS) achieves the lowest RMSE scores of 0.3303, 0.2776, 0.3203, and 0.2926 for the KC2, PC3, KC1, and PC4 datasets, respectively. The SD metric, presented in Table 9 and Fig. 11 and calculated as shown in Eq. 27, is also used to compare the proposed TFWO_ANFIS with the other techniques; the SD of the proposed model scores 0.3307, 0.2885, 0.3205, and 0.2929. From Tables 7, 8, and 9 and Figs. 9, 10, and 11, the MSE, RMSE, and SD of TFWO_ANFIS are the lowest, so the proposed model has the best performance.
Table 10 displays the confusion matrix results for the TFWO_ANFIS applied to the KC2 dataset. From this table, evaluation metrics such as P, SP, S, and ACC can be calculated using Eqs. 29, 30, 31, and 32. Accuracy is one of the most important metrics in this study, and the proposed TFWO_ANFIS achieves the highest accuracy of 87.3%, outperforming other techniques.
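These KC2 figures can be reproduced directly from the confusion matrix in Table 10 via Eqs. 29-32 (a short Python check):

```python
# Recomputing the KC2 test metrics (Eqs. 29-32) from the confusion matrix in
# Table 10: TN = 123, FP = 16, FN = 4, TP = 14 (157 test samples in total).
TN, FP, FN, TP = 123, 16, 4, 14

acc = (TP + TN) / (TP + TN + FP + FN)   # Eq. 29 -> 87.3%
sp = TN / (TN + FP)                     # Eq. 30 -> 88.5%
s = TP / (TP + FN)                      # Eq. 31 -> 77.8%
p = TP / (TP + FP)                      # Eq. 32 -> 46.7%
```

Rounded to one decimal, the four values match the TFWO_ANFIS row of Table 11.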
Tables 12 and 13 present the confusion matrix and the comparison between the proposed TFWO_ANFIS and the other meta-heuristic techniques on the PC3 dataset. TFWO_ANFIS achieves the best accuracy score of 90.2%.
Table 15, derived from Table 14, provides a comparison between TFWO_ANFIS and other techniques using the KC1 dataset. The confusion matrix for the tested KC1 dataset is presented in Table 14. When TFWO_ANFIS is applied to the test data, it achieves the highest accuracy among all techniques, scoring 85.8%.
Finally, TFWO_ANFIS is applied to the PC4 dataset. The results are shown in Tables 16 and 17. Table 16 represents the confusion matrix resulting from applying TFWO_ANFIS on the tested PC4 dataset, and Table 17 describes the comparative analysis between the proposed TFWO_ANFIS and the other techniques. Table 17 shows that TFWO_ANFIS has better accuracy than the others, with a score of 89.2%.
Table 17
Comparative between TFWO_ANFIS and others for PC4
Technique
P
SP
S
ACC
ANFIS
40.4
92.8
38.8
86.7
GA_ANFIS
5.6
88.2
50.0
87.6
PSO_ANFIS
12.1
88.1
77.8
87.9
DE_ANFIS
7.3
88.2
66.7
87.9
GWO_ANFIS
17.7
88.0
84.6
87.9
ACO_ANFIS
3.4
87.0
66.6
86.9
TFWO_ANFIS
16.7
89.4
81.8
89.2
Table 18 presents the most common metrics for evaluating the model: MSE, MBE, RMSE, and SD. It lists the datasets utilized in the proposed research, KC2, PC3, KC1, and PC4, with their numbers of samples and features, and confirms the outperformance of TFWO_ANFIS.
Table 18
Various metrics for estimating TFWO_ANFIS efficiency
Dataset
Samples
Features
MSE
MBE
RMSE
SD
KC2
522
22
0.1091
0.1281
0.3303
0.3307
PC3
1563
38
0.0770
0.0860
0.2776
0.2885
KC1
2109
22
0.1026
0.0931
0.3203
0.3205
PC4
1458
38
0.0850
0.2310
0.2926
0.2929

4.2.3 Result discussion

The research results offer several advantages in the field of software defect prediction (SDP) and related areas. Firstly, when compared to optimization algorithms such as PSO, GWO, DE, ACO, standard ANFIS, and GA, the TFWO_ANFIS model demonstrates superior accuracy in predicting software defects. This enhanced accuracy is valuable for software development teams and organizations as it enables them to identify and address potential issues early, thereby improving software quality and reliability. Secondly, thanks to the underlying TFWO algorithm, the TFWO_ANFIS model provides stability and convergence power, ensuring consistent performance across various datasets and instances. This stability makes it a reliable choice for real-world applications. Furthermore, the proposed TFWO_ANFIS model effectively handles uncertainty in software features, a common issue in real-world software engineering. The TFWO_ANFIS model solves this problem by offering a more accurate defect prediction, enabling quality assurance teams to allocate their resources and efforts. Also, the research findings’ practical usefulness is improved by the use of publicly available datasets from platforms such as OPENML. The model’s performance and accuracy may be verified and extended to other software development scenarios and contexts by using real-world datasets.
Our experiment uses the four datasets described in Table 5, namely KC2, PC3, KC1, and PC4, with different numbers of instances and features, to examine and assess the effectiveness and efficiency of the proposed TFWO_ANFIS in handling uncertainty in software features. In every tested dataset, TFWO_ANFIS produced good results.
Case KC2 TFWO_ANFIS results in 87.3%, 0.1091, 0.1281, 0.3303, and 0.3307 in terms of accuracy, MSE, MBE, RMSE, and SD, respectively.
Case PC3 TFWO_ANFIS achieves 90.2%, 0.0770, 0.0860, 0.2776, and 0.2885 in terms of accuracy, MSE, MBE, RMSE, and SD, respectively.
Case KC1 TFWO_ANFIS fulfills 85.8%, 0.1026, 0.0931, 0.3203, and 0.3205 in terms of accuracy, MSE, MBE, RMSE, and SD, respectively.
Case PC4 TFWO_ANFIS obtains 89.2%, 0.0850, 0.2310, 0.2926, and 0.2929 in terms of accuracy, MSE, MBE, RMSE, and SD, respectively.
These cases conclude that the TFWO_ANFIS outperformed the traditional ANFIS model and other meta-heuristic optimization techniques such as GA, PSO, GWO, ACO, and DE in terms of training and testing accuracy while also having the lowest error rate. The outcomes show that the suggested TFWO_ANFIS performed better than all of them in terms of accuracy, MSE, SD, and RMSE.
This study has significant theoretical and practical implications. Theoretical implications arise from addressing the limitations of conventional methods such as LSE and GD when optimizing ANFIS parameters in uncertain scenarios. The research enhances optimization strategies for handling uncertainty and improving software defect prediction (SDP) accuracy through the introduction of the TFWO algorithm.
In practical terms, the TFWO_ANFIS model offers valuable applications. Its improved convergence power and stable architecture allow for efficient adjustment of ANFIS parameters, resulting in enhanced SDP accuracy. The model proves its utility in practice by outperforming alternative optimization algorithms across various evaluation measures. The study also emphasizes the importance of effective algorithm selection and parameter optimization in SDP. However, it is crucial to be aware of practical considerations, such as the additional time required for configuration and the complexity of implementing the suggested algorithm. These insights provide valuable guidance for those considering the use of the TFWO_ANFIS model in software defect prediction and related fields. This research contributes to the fields of software engineering and optimization, highlighting both theoretical advancements and their practical applications. It holds value for both researchers and professionals in the industry.
To summarize, the parameters of the ANFIS used to predict software defects were the primary target of the proposed research. ANFIS combines the interpretability of fuzzy logic with the learning power of neural networks, and accurate predictions in conventional ANFIS learning systems depend on characteristics such as membership function shapes, the number of fuzzy rules, and the consequent parameters. Optimizing these parameters is difficult under uncertainty, as is the case in SDP. The proposed research therefore used TFWO to tune the ANFIS parameters in SDP, increasing accuracy and handling uncertainty.
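The idea of wrapping a metaheuristic around ANFIS training can be sketched as follows. This is an illustrative toy (a one-input, two-rule Sugeno model with a random-search stand-in for the optimizer's position updates), not the paper's TFWO implementation; the names `gauss_mf`, `predict`, and `fitness` are ours:

```python
import math
import random

def gauss_mf(x, c, sigma):
    """Gaussian membership function with centre c and width sigma."""
    return math.exp(-((x - c) ** 2) / (2 * sigma ** 2))

def predict(x, params):
    """Toy two-rule Sugeno model: params = [c1, s1, p1, c2, s2, p2]."""
    c1, s1, p1, c2, s2, p2 = params
    w1, w2 = gauss_mf(x, c1, s1), gauss_mf(x, c2, s2)
    return (w1 * p1 + w2 * p2) / (w1 + w2 + 1e-12)  # weighted rule consequents

def fitness(params, data):
    """Training MSE: the quantity a metaheuristic (TFWO, PSO, GA, ...) minimises."""
    return sum((predict(x, params) - y) ** 2 for x, y in data) / len(data)

# Stand-in for the metaheuristic: sample candidate parameter vectors,
# keep the one with the lowest training MSE.
data = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
best = min(([random.uniform(0, 1), random.uniform(0.1, 1), random.uniform(0, 1),
             random.uniform(0, 1), random.uniform(0.1, 1), random.uniform(0, 1)]
            for _ in range(500)),
           key=lambda p: fitness(p, data))
```

A real TFWO run replaces the random sampling with its whirlpool-based position updates, but the fitness function, the encoding of membership-function and consequent parameters into a vector, remains the same.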
The no free lunch theorem asserts that no single optimization algorithm can effectively address every optimization problem, so the TFWO algorithm may not suit all optimization issues. Additionally, the proposed model may require additional iterations during the training process, and while TFWO with ANFIS demonstrates efficiency, implementing the proposed algorithm can be quite complex and configuring it may take more time [76].

5 Conclusions and future work

This study introduces a model called TFWO_ANFIS to address uncertainty in SDP with improved accuracy. Unlike traditional methods such as GD and LSE, TFWO_ANFIS leverages the turbulent flow of water optimization (TFWO) to optimize parameters in the adaptive neuro-fuzzy inference system (ANFIS), including membership function shapes, fuzzy rule numbers, and consequent parameters.
The proposed TFWO_ANFIS outperformed standard ANFIS and other optimization algorithms from the recent SDP literature, including particle swarm optimization (PSO), gray wolf optimization (GWO), differential evolution (DE), ant colony optimization (ACO), and the genetic algorithm (GA), in terms of standard deviation (SD), mean square error (MSE), mean bias error (MBE), root-mean-square error (RMSE), and accuracy. Four datasets with different numbers of instances and features, obtained from OPENML, an open platform for publishing datasets, were utilized. TFWO_ANFIS achieved accuracies of 87.3%, 90.2%, 85.8%, and 89.2% for the datasets KC2, PC3, KC1, and PC4, respectively. Moreover, additional evaluation metrics were employed, including precision, sensitivity, specificity, and confusion matrices.
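The classification metrics mentioned alongside accuracy follow directly from confusion-matrix counts; a minimal helper using the standard definitions (illustrative only, not the paper's code):

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, sensitivity (recall), specificity, and accuracy
    from confusion-matrix counts for the defective (positive) class."""
    precision = tp / (tp + fp)       # how many flagged modules are truly defective
    sensitivity = tp / (tp + fn)     # true-positive rate: defects actually caught
    specificity = tn / (tn + fp)     # true-negative rate: clean modules kept clean
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, sensitivity, specificity, accuracy

# Hypothetical fold: 40 defects found, 5 false alarms, 50 clean modules, 5 missed
print(classification_metrics(40, 5, 50, 5))
```

In SDP, sensitivity and specificity are worth reporting alongside accuracy because defect datasets are typically imbalanced, so a high accuracy alone can mask many missed defective modules.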
The results indicate that TFWO_ANFIS outperformed the compared algorithms across all four datasets in accuracy and the other evaluation metrics. This experiment validates the effectiveness and efficiency of the proposed model, which can be used to improve ANFIS parameter tuning for handling uncertainty in SDP with higher accuracy.
Future research is expected to extend the described TFWO_ANFIS model to additional real-world domains and datasets. Addressing software feature uncertainty in SDP with alternative methods also remains a critical challenge.

Declarations

Conflict of interest

None.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Literature
1. Pavana MS, Pushpalatha MN, Parkavi A (2022) Software fault prediction using machine learning algorithms. In: Sengodan T, Murugappan M, Misra S (eds) Advances in electrical and computer technologies. ICAECT 2021. Lecture Notes in Electrical Engineering, vol 881. Springer, Singapore. https://doi.org/10.1007/978-981-19-1111-8_16
2. Wahono RS, Suryana N (2013) Combining particle swarm optimization based feature selection and bagging technique for software defect prediction. Int J Softw Eng Appl 7:153–166
3. Nam J (2014) Survey on software defect prediction. Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Tech Rep
4. Raukas H. Some approaches for software defect prediction
5. Elsabagh MA, Farhan MS, Gafar MG (2021) Meta-heuristic optimization algorithm for predicting software defects. Expert Syst 38:e12768
6. Kuang B, Tekin Y, Mouazen AM (2015) Comparison between artificial neural network and partial least squares for on-line visible and near infrared spectroscopy measurement of soil organic carbon, pH and clay content. Soil Tillage Res 146:243–252
7. El-Hasnony IM, Barakat SI, Mostafa RR (2020) Optimized ANFIS model using hybrid metaheuristic algorithms for Parkinson's disease prediction in IoT environment. IEEE Access 8:119252–119270
8. Goyal S (2022) Effective software defect prediction using support vector machines (SVMs). Int J Syst Assur Eng Manag 13:681–696
9. Kuncheva LI, Skurichina M, Duin RPW (2002) An experimental study on diversity for bagging and boosting with linear classifiers. Inf Fus 3:245–258
10. Okutan A, Yıldız OT (2014) Software defect prediction using Bayesian networks. Empir Softw Eng 19:154–181
11. Aljamaan HI, Elish MO (2009) An empirical study of bagging and boosting ensembles for identifying faulty classes in object-oriented software. In: 2009 IEEE symposium on computational intelligence and data mining. IEEE, pp 187–194
12. Li B, Shen B, Wang J, et al (2014) A scenario-based approach to predicting software defects using compressed C4.5 model. In: 2014 IEEE 38th annual computer software and applications conference. IEEE, pp 406–415
13. Alshammari FH (2022) Software defect prediction and analysis using enhanced random forest (extRF) technique: a business process management and improvement concept in IoT-based application processing environment. Mob Inf Syst
14. Khan MA, Elmitwally NS, Abbas S et al (2022) Software defect prediction using artificial neural networks: a systematic literature review. Sci Program
15. Goyal S (2022) Handling class-imbalance with KNN (neighbourhood) under-sampling for software defect prediction. Artif Intell Rev 55:2023–2064
16. Khosravi K, Daggupati P, Alami MT et al (2019) Meteorological data mining and hybrid data-intelligence models for reference evaporation simulation: a case study in Iraq. Comput Electron Agric 167:105041
17. Yaseen ZM, Mohtar WHMW, Ameen AMS et al (2019) Implementation of univariate paradigm for streamflow simulation using hybrid data-driven model: case study in tropical region. IEEE Access 7:74471–74481
18. Yaseen ZM, Ebtehaj I, Kim S et al (2019) Novel hybrid data-intelligence model for forecasting monthly rainfall with uncertainty analysis. Water 11:502
19. Dhiman G, Kumar V (2017) Spotted hyena optimizer: a novel bio-inspired based metaheuristic technique for engineering applications. Adv Eng Softw 114:48–70
20. Dhiman G, Kumar V (2019) Spotted hyena optimizer for solving complex and non-linear constrained engineering problems. In: Yadav N, Yadav A, Bansal JC et al (eds) Harmony search and nature inspired optimization algorithms. Springer, Singapore, pp 857–867
21. Wolpert DH, Macready WG (1997) No free lunch theorems for optimization. IEEE Trans Evol Comput 1:67–82
22. Allawi MF, Jaafar O, Mohamad Hamzah F et al (2018) Reservoir inflow forecasting with a modified coactive neuro-fuzzy inference system: a case study for a semi-arid region. Theor Appl Climatol 134:545–563
23. Sharafati A, Tafarojnoruz A, Shourian M, Yaseen ZM (2020) Simulation of the depth scouring downstream sluice gate: the validation of newly developed data-intelligent models. J Hydro-Environ Res 29:20–30
24. Enayatollahi H, Fussey P, Nguyen BK (2020) Modelling evaporator in organic Rankine cycle using hybrid GD-LSE ANFIS and PSO ANFIS techniques. Therm Sci Eng Prog 19:100570
25. Silarbi S, Tlemsani R, Bendahmane A (2021) Hybrid PSO-ANFIS for speaker recognition. Int J Cognit Inform Nat Intell 15:83–96
26. Qiao J, Sun Z, Meng X (2023) Interval type-2 fuzzy neural network based on active semi-supervised learning for non-stationary industrial processes. IEEE Trans Autom Sci Eng
27. Ghasemi M, Davoudkhani IF, Akbari E et al (2020) A novel and effective optimization algorithm for global optimization and its engineering applications: turbulent flow of water-based optimization (TFWO). Eng Appl Artif Intell 92:103666
28. Jing W, Yaseen ZM, Shahid S et al (2019) Implementation of evolutionary computing models for reference evapotranspiration modeling: short review, assessment and possible future research directions. Eng Appl Comput Fluid Mech 13:811–823
29. Rauf HT, Bangyal WHK, Lali MI (2021) An adaptive hybrid differential evolution algorithm for continuous optimization and classification problems. Neural Comput Appl 33:10841–10867
30. Pervaiz S, Ul-Qayyum Z, Bangyal WH, et al (2021) A systematic literature review on particle swarm optimization techniques for medical diseases detection. Comput Math Methods Med
31. Moayedi H, Raftari M, Sharifi A et al (2020) Optimization of ANFIS with GA and PSO estimating α ratio in driven piles. Eng Comput 36:227–238
32. Tien Bui D, Khosravi K, Li S et al (2018) New hybrids of ANFIS with several optimization algorithms for flood susceptibility modeling. Water 10:1210
33. Ahmadlou M, Karimi M, Alizadeh S et al (2019) Flood susceptibility assessment using integration of adaptive network-based fuzzy inference system (ANFIS) and biogeography-based optimization (BBO) and BAT algorithms (BA). Geocarto Int 34:1252–1272
34. Kläs M, Vollmer AM (2018) Uncertainty in machine learning applications: a practice-driven classification of uncertainty. In: International conference on computer safety, reliability, and security. Springer, pp 431–438
35. Srisaeng P, Baxter GS, Wild G (2015) An adaptive neuro-fuzzy inference system for forecasting Australia's domestic low cost carrier passenger demand. Aviation 19:150–163
36. Şahin M, Erol R (2017) A comparative study of neural networks and ANFIS for forecasting attendance rate of soccer games. Math Comput Appl 22:43
37. Anand K, Jena AK (2023) Software defect prediction: an ML approach-based comprehensive study. In: Communication, software and networks. Springer, Singapore, pp 497–512
38. Giray G, Bennin KE, Köksal Ö et al (2023) On the use of deep learning in software defect prediction. J Syst Softw 195:111537
41. Shepperd M, Song Q, Sun Z, Mair C (2013) Data quality: some comments on the NASA software defect datasets. IEEE Trans Softw Eng 39:1208–1215
42. Jureczko M, Madeyski L (2010) Towards identifying software project clusters with regard to defect prediction. In: Proceedings of the 6th international conference on predictive models in software engineering, pp 1–10
43. Bennin KE, Toda K, Kamei Y, et al (2016) Empirical evaluation of cross-release effort-aware defect prediction models. In: 2016 IEEE international conference on software quality, reliability and security (QRS). IEEE, pp 214–221
44. Tang Y, Dai Q, Yang M et al (2023) Software defect prediction ensemble learning algorithm based on adaptive variable sparrow search algorithm. Int J Mach Learn Cybern 14(6):1–21
46. Kakkar M, Jain S, Bansal A, Grover PS (2021) An optimized software defect prediction model based on PSO-ANFIS. Recent Adv Comput Sci Commun 14:2732–2741
47. Nasser AB, Ghanem W, Abdul-Qawy ASH, et al (2023) A robust tuned K-nearest neighbours classifier for software defect prediction. In: International conference on emerging technologies and intelligent systems. Springer, pp 181–193
48. Qiao L, Li X, Umer Q, Guo P (2020) Deep learning based software defect prediction. Neurocomputing 385:100–110
49. Bejjanki KK, Gyani J, Gugulothu N (2020) Class imbalance reduction (CIR): a novel approach to software defect prediction in the presence of class imbalance. Symmetry 12:407
50. Agrawal P, Abutarboush HF, Ganesh T, Mohamed AW (2021) Metaheuristic algorithms on feature selection: a survey of one decade of research (2009–2019). IEEE Access 9:26766–26791
51. Suresh Kumar P, Behera HS, Nayak J, Naik B (2021) Bootstrap aggregation ensemble learning-based reliable approach for software defect prediction by using characterized code feature. Innov Syst Softw Eng 17:355–379
52. Goyal S (2020) Heterogeneous stacked ensemble classifier for software defect prediction. In: 2020 sixth international conference on parallel, distributed and grid computing (PDGC). IEEE, pp 126–130
53. Oloduowo AA, Raheem MO, Ayinla FB, Ayeyemi BM (2020) Software defect prediction using metaheuristic-based feature selection and classification algorithms. Ilorin J Comput Sci Inf Technol 3:23–39
54. Hasanipanah M, Amnieh HB, Arab H, Zamzam MS (2018) Feasibility of PSO–ANFIS model to estimate rock fragmentation produced by mine blasting. Neural Comput Appl 30:1015–1024
55. Lin X, Sun J, Palade V, et al (2012) Training ANFIS parameters with a quantum-behaved particle swarm optimization algorithm. In: International conference in swarm intelligence. Springer, pp 148–155
56. Rahnama E, Bazrafshan O, Asadollahfardi G (2020) Application of data-driven methods to predict the sodium adsorption rate (SAR) in different climates in Iran. Arab J Geosci 13:1–19
57. Asadollahfardi G, Heidarzadeh N, Mosalli A, Sekhavati A (2018) Optimization of water quality monitoring stations using genetic algorithm, a case study, Sefid-rud River, Iran. Adv Environ Res 7:87–107
58. Asadollahfardi G, Afsharnasab M, Rasoulifard MH, Tayebi Jebeli M (2022) Predicting of acid red 14 removals from synthetic wastewater in the advanced oxidation process using artificial neural networks and fuzzy regression. Rend Lincei Scienze Fis e Nat 33:115–126
59. Aghelpour P, Bahrami-Pichaghchi H, Kisi O (2020) Comparison of three different bio-inspired algorithms to improve ability of neuro fuzzy approach in prediction of agricultural drought, based on three different indexes. Comput Electron Agric 170:105279
60. Ghose DK, Panda SS, Swain PC (2013) Prediction and optimization of runoff via ANFIS and GA. Alex Eng J 52:209–220
61. Sarkheyli A, Zain AM, Sharif S (2015) Robust optimization of ANFIS based on a new modified GA. Neurocomputing 166:357–366
62. Dehghani M, Seifi A, Riahi-Madvar H (2019) Novel forecasting models for immediate-short-term to long-term influent flow prediction by combining ANFIS and grey wolf optimization. J Hydrol 576:698–725
63. Maroufpoor S, Maroufpoor E, Bozorg-Haddad O et al (2019) Soil moisture simulation using hybrid artificial intelligent model: hybridization of adaptive neuro fuzzy inference system with grey wolf optimizer algorithm. J Hydrol 575:544–556
64. Golafshani EM, Behnood A, Arashpour M (2020) Predicting the compressive strength of normal and high-performance concretes using ANN and ANFIS hybridized with grey wolf optimizer. Constr Build Mater 232:117266
65. Tien Bui D, Abdullahi MM, Ghareh S et al (2021) Fine-tuning of neural computing using whale optimization algorithm for predicting compressive strength of concrete. Eng Comput 37:701–712
66. Smith E (2002) Uncertainty analysis. Encycl Environ 4:2283–2297
67. Abdar M, Samami M, Mahmoodabad SD et al (2021) Uncertainty quantification in skin cancer classification using three-way decision-based Bayesian deep learning. Comput Biol Med 135:104418
68. Hussain W, Merigo JM, Raza MR (2022) Predictive intelligence using ANFIS-induced OWAWA for complex stock market prediction. Int J Intell Syst 37:4586–4611
69. Bisht DCS, Raju M, Joshi M (2009) Simulation of water table elevation fluctuation using fuzzy-logic and ANFIS. Comput Model New Technol 13:16–23
70. Jang J-S (1993) ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans Syst Man Cybern 23:665–685
71. Chai Y, Jia L, Zhang Z (2009) Mamdani model based adaptive neural fuzzy inference system and its application. Int J Comput Inf Eng 3:663–670
72. Mamdani EH, Assilian S (1999) An experiment in linguistic synthesis with a fuzzy logic controller. Int J Hum Comput Stud 51:135–147
73. Mamdani EH, Gaines BR (1981) Fuzzy reasoning and its applications. Academic Press, Cambridge
74. Mamdani EH (1977) Application of fuzzy logic to approximate reasoning using linguistic synthesis. IEEE Trans Comput 26:1182–1191
75. Takagi T, Sugeno M (1983) Derivation of fuzzy control rules from human operator's control actions. IFAC Proc 16:55–60
77. Yager RR, Filev DP (1993) SLIDE: a simple adaptive defuzzification method. IEEE Trans Fuzzy Syst 1:69
78. Jang J-SR, Sun C-T, Mizutani E (1997) Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence [book review]. IEEE Trans Autom Control 42:1482–1484
79. Hassan N, Ghazali R, Hussain K (2017) Training ANFIS using catfish-particle swarm optimization for classification. In: Recent advances on soft computing and data mining: proceedings of the second international conference on soft computing and data mining (SCDM-2016), Bandung, Indonesia, August 18–20, 2016. Springer, pp 201–210
80. Negnevitsky M (2005) Artificial intelligence: a guide to intelligent systems. Pearson Education
81. Salih SQ, Allawi MF, Yousif AA et al (2019) Viability of the advanced adaptive neuro-fuzzy inference system model on reservoir evaporation process simulation: case study of Nasser Lake in Egypt. Eng Appl Comput Fluid Mech 13:878–891
82. Ghasemi M, Taghizadeh M, Ghavidel S, Abbasian A (2016) Colonial competitive differential evolution: an experimental study for optimal economic load dispatch. Appl Soft Comput 40:342–363
84. Antaki F, Coussa RG, Kahwati G et al (2023) Accuracy of automated machine learning in classifying retinal pathologies from ultra-widefield pseudocolour fundus images. Br J Ophthalmol 107:90–95
85. Sun J, Zhang Q, Tsang EPK (2005) DE/EDA: a new evolutionary algorithm for global optimization. Inf Sci (Ny) 169:249–262
87. Reiszadeh M, Narimani H, Fazel MS (2023) Improving convergence properties of autonomous demand side management algorithms. Int J Electr Power Energy Syst 146:108764
Metadata
Title
Handling uncertainty issue in software defect prediction utilizing a hybrid of ANFIS and turbulent flow of water optimization algorithm
Authors
M. A. Elsabagh
O. E. Emam
M. G. Gafar
T. Medhat
Publication date
14-12-2023
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 9/2024
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-023-09315-0
