Journal of Classification OnlineFirst articles

A Storage and Classification Algorithm for Concept Drift Data Streams Based on OS-ELM

  • Original Research

Many current methods can handle some special cases of data streams (e.g., concept drift); however, these methods need to store historical data and access them repeatedly, which is inconsistent with the single-channel characteristics of data …

Two-Group k-Adic Similarity Coefficients for Binary Classifiers

  • Original Research

When using two different sets of binary classification rules on the same items, we obtain two sets of binary vectors. We can, for example, consider the case of two groups of doctors with different experiences classifying patients as diseased or …

Old but Gold or New and Shiny? Comparing Tree Ensembles for Ordinal Prediction with a Classic Parametric Approach

  • Open Access
  • Original Research

Ordinal data are frequently encountered, e.g., in the life and social sciences. Predicting ordinal outcomes can inform important decisions, e.g., in medicine or education. Two methodological streams tackle prediction of ordinal outcomes: …

Vine Copula-Based Classifiers with Applications

  • Open Access
  • Original Research

The vine pair-copula construction can be used to fit flexible non-Gaussian multivariate distributions to a mix of continuous and discrete variables. With multiple classes, fitting univariate distributions and a vine to each class leads to posterior …

Mixed-Type Distance Shrinkage and Selection for Clustering via Kernel Metric Learning

  • Original Research

Distance-based clustering is widely used to group mixed numeric and categorical data (mixed-type data), where a predefined metric is used to quantify dissimilarity or distance between data points for clustering data. However, many existing metrics …

Studying Hierarchical Latent Structures in Heterogeneous Populations with Missing Information

  • Open Access
  • Original Research

An ultrametric Gaussian mixture model is a powerful tool for modeling hierarchical relationships among latent concepts, making it ideal for studying complex phenomena in diverse and potentially heterogeneous populations. However, in many cases …

How to Measure the Researcher Impact with the Aid of its Impactable Area: A Concrete Approach Using Distance Geometry

  • Open Access
  • Original Research

Assuming that the subject of each scientific publication can be identified by one or more classification entities, we address the problem of determining a similarity function (distance) between classification entities based on how often two …

Multi-task Support Vector Machine Classifier with Generalized Huber Loss

Compared to single-task learning (STL), multi-task learning (MTL) achieves a better generalization by exploiting domain-specific information implicit in the training signals of several related tasks. The adaptation of MTL to support vector …

Clustering-Based Oversampling Algorithm for Multi-class Imbalance Learning

Multi-class imbalanced data learning faces many challenges. Its complex structural characteristics cause severe intra-class imbalance or overgeneralization in most solution strategies. This negatively affects data learning. This paper proposes a …

Combining Semi-supervised Clustering and Classification Under a Generalized Framework

Most machine learning algorithms rely on having a sufficient amount of labeled data to train a reliable classifier. However, labeling data is often costly and time-consuming, while unlabeled data can be readily accessible. Therefore, learning from …

An Effective Crow Search Algorithm and Its Application in Data Clustering

  • Original Research

In today’s data-centric world, the significance of generated data has increased manifold. Clustering data into similar groups is one of the most active research areas among data practices. Several algorithms have been proposed for clustering.

Slope Stability Classification Model Based on Single-Valued Neutrosophic Matrix Energy and Its Application Under a Single-Valued Neutrosophic Matrix Scenario

  • Original Research

Although matrix energy (ME) captures the expressive merit of collective information, a classification method based on ME has not been investigated in the existing literature, which leaves a research gap in the matrix scenario. Therefore, the purpose …

Flexible Clustering with a Sparse Mixture of Generalized Hyperbolic Distributions

  • Original Research

Robust clustering of high-dimensional data is an important topic because clusters in real datasets are often heavy-tailed and/or asymmetric. Traditional approaches to model-based clustering often fail for high-dimensional data, e.g., due to the …

Clustering with Minimum Spanning Trees: How Good Can It Be?

  • Open Access
  • Original Research

Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they are meaningful in …

A New Matrix Feature Selection Strategy in Machine Learning Models for Certain Krylov Solver Prediction

  • Original Research

Numerical simulation processes in scientific and engineering applications require efficient solutions of large sparse linear systems, and variants of Krylov subspace solvers with various preconditioning techniques have been developed. However, it …

Cluster Validation Based on Fisher’s Linear Discriminant Analysis

  • Open Access
  • Original Research

Cluster analysis aims to find meaningful groups, called clusters, in data. The objects within a cluster should be similar to each other and dissimilar to objects from other clusters. The fundamental question arising is whether found clusters are …

A New Look at the Dirichlet Distribution: Robustness, Clustering, and Both Together

  • Open Access
  • Original Research

Compositional data have peculiar characteristics that pose significant challenges to traditional statistical methods and models. Within this framework, we use a convenient mode-parametrized Dirichlet distribution across multiple fields of …

Normalised Clustering Accuracy: An Asymmetric External Cluster Validity Measure

  • Open Access

There is no single best clustering algorithm, nor will there ever be one. Nevertheless, we would still like to be able to distinguish between methods that work well on certain task types and those that systematically underperform. Clustering …