In this study, we propose an ensemble hybrid model called seasonal trend decomposition based on Loess and long short-term memory (STL-LSTM) for forecasting non-stationary, nonlinear, and seasonal agricultural price series. The model integrates …
In this paper, we propose a joint model for claim counts and amounts in non-life insurance using a Gaussian copula joint model for analyzing longitudinal (k, l)-inflated Conway–Maxwell Poisson and Normal responses with random effects. We develop a …
In the study of statistical inference, researchers often face two types of errors: those arising from insufficient information, inaccurate information, or a combination of both. Inaccuracy measures serve as valuable tools for addressing these …
Model choice algorithms are usually compared based on their accuracy, i.e. ability to find true models. However, conservative algorithms (such as BIC minimisation) are accurate when no true effects exist, while more liberal algorithms (such as …
Hypothesis testing can be applied in a detective fashion in the field of data science. Different tests are developed to test the population parameters. This article provides a data depth-based test for testing the equality of location vectors of …
Common methods used for topic modeling have generally suffered problems of overfitting, leading to diminished predictive performance, as well as a weakness towards reconstructing sparse topic structures that involve only a few critical words to …
This study explores the application of data mining techniques to analyse factors influencing university choice and predict enrolment trends in Kazakhstan. For this purpose, methods of analysis (multiple correlation and regression analysis, factor …
Course recommendation (CD) is essential for success in a student’s educational journey. Due to variations in students’ knowledge systems, it can be difficult to select course content from online educational platforms. This problem is …
A novel probability distribution, the Generalized Alpha Power Inverted Weibull (GAPIW) distribution, is derived from the generalization of the $$\alpha$$-power family and compounded with the inverted Weibull distribution. The researchers looked …
The xgamma distribution was first introduced by Sen et al. [1] as an alternative distribution to the exponential model. The xgamma distribution exhibits a bathtub-shaped hazard rate function, so it is suitable for many lifetime phenomena. In this …
Recognizing and reducing risk is a major part of Supply Chain Management (SCM). Several companies are invested in Supply Chain Risk Management (SCRM) and they have the knowledge about the procurement occupancies within their companies and take …
One of the key objectives of statistics is to provide a model compatible with the data generated by an unknown random process. Often, it happens that the unknown process is intractable, and no prior data or information associated with the unknown …
We live in a world where everything is connected to online social media platforms, and people use social media networks like Facebook, Twitter, Instagram, WhatsApp, etc. In the present scenario, working women, celebrities, sports persons …
The advancement of technology has increased competitiveness, especially in the manufacturing industry. Alongside Statistical Process Control (SPC), capacity indices are tools used to measure the quality of processes and are useful for establishing …
Two known characteristics of the distribution of stock returns (price fluctuations) and, more recently, the distribution of financial asset volumes are power laws and scaling. These power laws can be viewed as the asymptotic behaviour of …
Data science often employs discrete probability distributions to model and analyze various phenomena. These distributions are particularly useful when dealing with data that can be categorized into distinct outcomes or events. This study presents …
The rise of mobile technology has significantly transformed numerous aspects of our everyday lives, especially within food delivery services. The investigation aims to explore the food delivery mobile apps (FDMA) satisfaction (SAT) and the …
This paper presents an empirical study of the impact of supply chain management on small-scale integrated commercial agriculture, focusing on the moderating role of impediments and obligations, to offer solutions for agricultural …
This paper introduces a Modified Lindley distribution using a convex combination of exponential and gamma distribution. The fundamental properties of the proposed distribution such as the shapes of the distribution, moments, mean, variance …
Nature-inspired algorithms (NIA) have proven to be potential tools for solving intricate optimization problems and aid in the development of better computational techniques. In recent years, these algorithms have raised considerable interest to …
Advancements in genome sequencing technologies have significantly increased the availability of genomic data. The use of machine learning models to predict the pathogenicity or clinical significance of genetic mutations is crucial. However …
Carbon emissions disclosure (CED) has become a pivotal aspect of corporate sustainability efforts, reflecting a company’s commitment to environmental responsibility and accountability. This study delves into the complex connection between CED and …
In this paper, a nonparametric kernel method is introduced to estimate the well-known overlapping coefficient, Matusita $$\rho (X,Y)$$, between two random variables $$X$$ and $$Y$$. Due to the complexity of finding the formula …
In this paper, we first define the class of Generalized Inflated Power Series Distributions (GIPSDs), which contain the inflated discrete distributions most often seen in practice as special cases. We describe the hitherto unknown exponential family …
With the widespread use of social networks, detecting the topics discussed on these platforms has become a significant challenge. Current approaches primarily rely on frequent pattern mining or semantic relations, often neglecting the structure of …
In this article, we propose the quadratic rank transmutation map approach on shifted Lindley distribution to improve the existing distribution further. An additional skewness parameter $$\lambda$$ is incorporated to transmute the distribution.
In this paper, we propose a new model by adding an additional parameter to the baseline distributions for modeling claim and risk data used in actuarial and financial studies. The new model is called alpha power transformed exponential Poisson …
Alcohol's dehydrating effects can cause vocal cords to dry out, potentially causing temporary voice changes and increasing the risk of vocal strain or damage. Short-term changes in pitch, volume, and alcohol consumption can cause voice clarity …
Metric learning consists of designing adaptive distance functions that are well-suited to a specific dataset. Such tailored distance functions aim to deliver superior results compared to standard distance measures while performing machine learning …
The Inverse Rayleigh distribution has many applications in the area of reliability studies. It is regarded as a model for a lifetime random variable. It is essential to develop an efficient goodness-of-fit test for this distribution. In this …
The Medical Imaging Query Response System is among the most challenging concepts in the medical field. It requires a significant amount of effort to organize and comprehend the various representations of the human body. Additionally, the system …
In this work, we propose a novel hybrid method for the estimation of regression models, which is based on a combination of LASSO-type methods and smooth transition (STR) random forests. Tree-based regression models are known for their flexibility …
Nowadays, with the growth of emerging technologies, increased attention has been paid to the classification of privacy-preserved medical data and development of various privacy-preserving models for the promotion of online medical pre-diagnosis …
Modernization in the healthcare industry is happening with the support of artificial intelligence and blockchain technologies. Collecting healthcare data is done through any Google survey from different governing bodies and data available on the …
Early detection of dementia is a great concern for physicians, who therefore make use of multimodal data to accomplish it. The baseline visit data of the patients are mainly utilized for this task. Modern …
Real estate significantly contributes to the broader stock market and garners substantial attention from individual households to the overall country’s economy. Predicting real estate trends holds great importance for investors, policymakers, and …
Generalized linear mixed effect models (GLMEMs) are widely applied for the analysis of correlated non-Gaussian data such as those found in longitudinal studies. On the other hand, the Cox (proportional hazards, PHs) and the accelerated failure …
A mathematical approach to developing new distributions is reviewed. The method, which is composed of integration and the concept of a normalizing constant, allows for primitive interjection of new parameter(s) in an existing distribution to form new …
In the past decade, deep learning has greatly increased the complexity of industrial production intelligence by virtue of its powerful learning capability. At the same time, it has also brought security challenges to the field of industrial …
Search and recommendation are two essential features of any e-commerce website for finding and purchasing a specific product. Visual Search is a promising and quick method in comparison to a textual-based search method. Hence, the objective of …
Safer sexual practice is essential for improving women’s reproductive and sexual health outcomes. The goal of this study is to identify the contributing factors influencing safer sexual negotiations (SSN) through the application of machine …
This article introduces a three-parameter extension of the Generalized Rayleigh distribution called the half-logistic Generalized Rayleigh distribution, which has the Generalized Rayleigh and Rayleigh distributions as submodels. The proposed model is …
Data clustering is one of the main issues in optimization. It is the process of partitioning a set of items into several groups, such that items within each group are as similar as possible to one another and as dissimilar as possible to items in other groups.
Agriculture, engineering, public health, sociology, psychology, and epidemiology are just a few of the numerous disciplines that find analysis and modeling of zero-truncated count data to be of paramount importance. Very recently, researchers have …
In this paper, we propose the exponential ratio-type estimator for the elevated estimation of population mean, implying one auxiliary variable in stratified random sampling using the conventional ratio and the Bahl and Tuteja exponential ratio-type …
In the era of big data, preserving data privacy has become paramount due to the sheer volume and sensitivity of the information being processed. This research is dedicated to safeguarding data privacy through a novel data sanitization approach …
Panel count data refers to the information collected in studies focusing on recurrent events, where subjects are observed only at specific time points. If these study subjects are exposed to recurrent events of several types, we obtain panel count …
In this research, we introduce an innovative automated resume screening approach that leverages advanced Natural Language Processing (NLP) technology, specifically the Bidirectional Encoder Representations from Transformers (BERT) language model …
Traditionally, in cognitive modeling for binary decision-making tasks, stochastic differential equations, particularly a family of diffusion decision models, are applied. These models suffer from difficulties in parameter estimation and …
This paper introduces a new family of distributions called the hyperbolic tangent (HT) family. The cumulative distribution function of this model is defined using the standard hyperbolic tangent function. The fundamental properties of the …
The main objective of this paper is to forecast the realized volatility (RV) of the Bitcoin futures (BTCF) market. To serve our purpose, we propose an augmented heterogeneous autoregressive (HAR) model to consider the information on time-varying jumps …
In this paper, we propose and investigate a novel approach for generating probability distributions. The novel method is known as the SMP transformation technique. By using the SMP transformation technique, we have developed a new model of the …