Annals of Data Science OnlineFirst articles

The Effect of Company Size, Profitability, Leverage, Media Exposure, and Liquidity on Carbon Emissions Disclosure

Carbon emissions disclosure (CED) has become a pivotal aspect of corporate sustainability efforts, reflecting a company’s commitment to environmental responsibility and accountability. This study delves into the complex connection between CED and …

Partial Label Learning with Noisy Labels

Partial label learning (PLL) is a particular problem setting within weakly supervised learning. In PLL, each sample corresponds to a candidate label set in which only one label is true. However, in some practical application scenarios, the …

Kernel Method for Estimating Matusita Overlapping Coefficient Using Numerical Approximations

In this paper, a nonparametric kernel method is introduced to estimate the well-known overlapping coefficient, Matusita $$\rho (X,Y)$$, between two random variables $$X$$ and $$Y$$. Due to the complexity of finding the formula …

Maximum Likelihood Estimation for Generalized Inflated Power Series Distributions

In this paper we first define the class of Generalized Inflated Power Series Distributions (GIPSDs) which contain the inflated discrete distributions most often seen in practice as special cases. We describe the hitherto unknown exponential family …

Farm-Level Smart Crop Recommendation Framework Using Machine Learning

Agriculture is the primary source of food, fuel, and raw materials and is vital to any country’s economy. Farmers, the backbone of agriculture, primarily rely on instinct to determine what crops to plant in any given season. They are comfortable …

A Human Word Association Based Model for Topic Detection in Social Networks

With the widespread use of social networks, detecting the topics discussed on these platforms has become a significant challenge. Current approaches primarily rely on frequent pattern mining or semantic relations, often neglecting the structure of …

Transmuted Shifted Lindley Distribution: Characterizations, Classical and Bayesian Estimation with Applications

In this article, we propose the quadratic rank transmutation map approach on shifted Lindley distribution to improve the existing distribution further. An additional skewness parameter $$\lambda$$ is incorporated to transmute the distribution.

Apple Leaf Disease Detection Using Transfer Learning

Automated detection of plant diseases is crucial as it simplifies the task of monitoring large farms and identifies diseases at their early stages to mitigate further plant degradation. Besides the decline in plant health, reduced production …

Representing a Model for the Anonymization of Big Data Stream Using In-Memory Processing

In light of the escalating privacy risks in the big data era, this paper introduces an innovative model for the anonymization of big data streams, leveraging in-memory processing within the Spark framework. The approach is founded on the principle …

A Review of Anonymization Algorithms and Methods in Big Data

In the era of big data, with the increase in volume and complexity of data, the main challenge is how to use big data while preserving the privacy of users. This study was conducted with the aim of finding a solution to this challenge. In this …

Analyzing Insurance Data with an Alpha Power Transformed Exponential Poisson Model

In this paper, we propose a new model by adding an additional parameter to the baseline distributions for modeling claim and risk data used in actuarial and financial studies. The new model is called alpha power transformed exponential Poisson …

Drinkers Voice Recognition Intelligent System: An Ensemble Stacking Machine Learning Approach

Alcohol's dehydrating effects can cause vocal cords to dry out, potentially causing temporary voice changes and increasing the risk of vocal strain or damage. Short-term changes in pitch, volume, and alcohol consumption can cause voice clarity …

A New Kernel Density Estimation-Based Entropic Isometric Feature Mapping for Unsupervised Metric Learning

Metric learning consists of designing adaptive distance functions that are well-suited to a specific dataset. Such tailored distance functions aim to deliver superior results compared to standard distance measures while performing machine learning …

Power Evaluation of Some Tests for Inverse Rayleigh Distribution

The Inverse Rayleigh distribution has many applications in the area of reliability studies. It is regarded as a model for a lifetime random variable. It is essential to develop an efficient goodness-of-fit test for this distribution. In this …

Visual Question Answer System for Skeletal Image Using Radiology Images in the Healthcare Domain Based on Visual and Textual Feature Extraction Techniques

The Medical Imaging Query Response System is among the most challenging concepts in the medical field. It requires a significant amount of effort to organize and comprehend the various representations of the human body. Additionally, the system …

Combining LASSO-type Methods with a Smooth Transition Random Forest

In this work, we propose a novel hybrid method for the estimation of regression models, which is based on a combination of LASSO-type methods and smooth transition (STR) random forests. Tree-based regression models are known for their flexibility …

A Comprehensive Survey of Image Generation Models Based on Deep Learning

In recent years, generative artificial intelligence has been developing rapidly. In the image domain, image generation models based on deep learning have made remarkable achievements. Early frameworks for image generation models were dominated by …

Classification of Privacy Preserved Medical Data with Fractional Tuna Sailfish Optimization Based Deep Residual Network in Cloud

Nowadays, with the growth of emerging technologies, increased attention has been paid to the classification of privacy-preserved medical data and development of various privacy-preserving models for the promotion of online medical pre-diagnosis …

A Two-Stage Analysis of Interaction Between Stock and Exchange Rate Markets: Evidence from Turkey

In this study, we use a novel approach to explore possible connections between foreign exchange and stock returns using Turkish financial data from 2005 to 2022. Our method involves a two-stage technique. The first stage begins by decomposing …

A Comprehensive Study and Research Perception towards Secured Data Sharing for Lung Cancer Detection with Blockchain Technology

Modernization in the healthcare industry is happening with the support of artificial intelligence and blockchain technologies. Collecting healthcare data is done through any Google survey from different governing bodies and data available on the …