International Journal of Computer Vision OnlineFirst articles

23-04-2024

VNAS: Variational Neural Architecture Search

Differentiable neural architecture search delivers point estimation to the optimal architecture, which yields arbitrarily high confidence to the learned architecture. This approach thus suffers in calibration and robustness, in contrast with the …

Authors: Benteng Ma, Jing Zhang, Yong Xia, Dacheng Tao

Open Access 23-04-2024

Augmenting the Softmax with Additional Confidence Scores for Improved Selective Classification with Out-of-Distribution Data

Detecting out-of-distribution (OOD) data is a task that is receiving an increasing amount of research attention in the domain of deep learning for computer vision. However, the performance of detection methods is generally evaluated on the task in …

Authors: Guoxuan Xia, Christos-Savvas Bouganis

Open Access 23-04-2024

Multimodal Machine Learning in Image-Based and Clinical Biomedicine: Survey and Prospects

Machine learning (ML) applications in medical artificial intelligence (AI) systems have shifted from traditional and statistical methods to increasing application of deep learning models. This survey navigates the current landscape of multimodal …

Authors: Elisa Warner, Joonsang Lee, William Hsu, Tanveer Syeda-Mahmood, Charles E. Kahn Jr., Olivier Gevaert, Arvind Rao

Open Access 18-04-2024

On Finite Difference Jacobian Computation in Deformable Image Registration

Producing spatial transformations that are diffeomorphic is a key goal in deformable image registration. As a diffeomorphic transformation should have positive Jacobian determinant $|J|$ everywhere, the number of pixels (2D) or …

Authors: Yihao Liu, Junyu Chen, Shuwen Wei, Aaron Carass, Jerry Prince

13-04-2024

Ensemble Quadratic Assignment Network for Graph Matching

Graph matching is a commonly used technique in computer vision and pattern recognition. Recent data-driven approaches have improved the graph matching accuracy remarkably, whereas some traditional algorithm-based methods are more robust to feature …

Authors: Haoru Tan, Chuang Wang, Sitong Wu, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu

13-04-2024

Learning with Noisy Correspondence

This paper studies a new learning paradigm for noisy labels, i.e., noisy correspondence (NC). Unlike the well-studied noisy labels that consider the errors in the category annotation of a sample, the NC refers to the errors in the alignment …

Authors: Zhenyu Huang, Peng Hu, Guocheng Niu, Xinyan Xiao, Jiancheng Lv, Xi Peng

08-04-2024

CRetinex: A Progressive Color-Shift Aware Retinex Model for Low-Light Image Enhancement

Low-light environments introduce various complex degradations into captured images. Retinex-based methods have demonstrated effective enhancement performance by decomposing an image into illumination and reflectance, allowing for selective …

Authors: Han Xu, Hao Zhang, Xunpeng Yi, Jiayi Ma

08-04-2024

Error-Aware Conversion from ANN to SNN via Post-training Parameter Calibration

Spiking Neural Network (SNN), originating from the neural behavior in biology, has been recognized as one of the next-generation neural networks. Conventionally, SNNs can be obtained by converting from pre-trained Artificial Neural Networks (ANNs) …

Authors: Yuhang Li, Shikuang Deng, Xin Dong, Shi Gu

04-04-2024

FSODv2: A Deep Calibrated Few-Shot Object Detection Network

Traditional methods for object detection typically necessitate a substantial amount of training data, and creating high-quality training data is time-consuming. We propose a novel Few-Shot Object Detection network (FSODv2) in this paper that aims …

Authors: Qi Fan, Wei Zhuo, Chi-Keung Tang, Yu-Wing Tai

02-04-2024

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm

Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical evolutionary algorithm (EA) and derives that both have consistent mathematical formulation. Then inspired by …

Authors: Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao

02-04-2024

MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis

Existing multimodal conditional image synthesis (MCIS) methods generate images conditioned on any combination of various modalities but require that all of them conform exactly, hindering the synthesis controllability and leaving the …

Authors: Jianbin Zheng, Daqing Liu, Chaoyue Wang, Minghui Hu, Zuopeng Yang, Changxing Ding, Dacheng Tao

26-03-2024

InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions

We have recently seen tremendous progress in diffusion advances for generating realistic human motions. Yet, they largely disregard the multi-human interactions. In this paper, we present InterGen, an effective diffusion-based approach that …

Authors: Han Liang, Wenqian Zhang, Wenxuan Li, Jingyi Yu, Lan Xu

Open Access 26-03-2024

Hyperbolic Deep Learning in Computer Vision: A Survey

Deep representation learning is a ubiquitous part of modern computer vision. While Euclidean space has been the de facto standard manifold for learning visual representations, hyperbolic space has recently gained rapid traction for learning in …

Authors: Pascal Mettes, Mina Ghadimi Atigh, Martin Keller-Ressel, Jeffrey Gu, Serena Yeung

Open Access 22-03-2024

Pictorial and Apictorial Polygonal Jigsaw Puzzles from Arbitrary Number of Crossing Cuts

Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unordered visual fragments, is fundamental to numerous applications, and yet most of the literature of the last two decades has focused thus far on …

Authors: Peleg Harel, Ofir Itzhak Shahar, Ohad Ben-Shahar

19-03-2024

UrbanEvolver: Function-Aware Urban Layout Regeneration

Urban regeneration is an important strategy for land redevelopment, to address the urban decay in cities. Among many tasks, urban layout is the foundation for urban regeneration. In this paper, we target a new task called function-aware urban …

Authors: Yiming Qin, Nanxuan Zhao, Jiale Yang, Siyuan Pan, Bin Sheng, Rynson W. H. Lau

18-03-2024

Vision-Language Alignment Learning Under Affinity and Divergence Principles for Few-Shot Out-of-Distribution Generalization

Recent advances in fine-tuning large-scale vision-language pre-trained models (VL-PTMs) have shown promising results in quick adaption to downstream tasks. However, prior research often lacks comprehensive investigation into out-of-distribution …

Authors: Lin Zhu, Weihan Yin, Yiyao Yang, Fan Wu, Zhaoyu Zeng, Qinying Gu, Xinbing Wang, Chenghu Zhou, Nanyang Ye

13-03-2024

Softmax-Free Linear Transformers

Vision transformers (ViTs) have pushed the state-of-the-art for visual perception tasks. The self-attention mechanism underpinning the strength of ViTs has a quadratic complexity in both computation and memory usage. This motivates the development …

Authors: Jiachen Lu, Junge Zhang, Xiatian Zhu, Jianfeng Feng, Tao Xiang, Li Zhang

Open Access 13-03-2024

One-Shot Neural Face Reenactment via Finding Directions in GAN’s Latent Space

In this paper, we present our framework for neural face/head reenactment whose goal is to transfer the 3D head orientation and expression of a target face to a source face. Previous methods focus on learning embedding networks for identity and …

Authors: Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

10-03-2024

PLP: Point-Line Minimal Problems under Partial Visibility in Three Views

We present a complete classification of minimal problems for generic arrangements of points and lines in space observed partially by three calibrated perspective cameras when each line is incident to at most one point. This is a large class of …

Authors: Timothy Duff, Kathlén Kohn, Anton Leykin, Tomas Pajdla

Open Access 08-03-2024

Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering

Data augmentation has contributed to the rapid advancement of unsupervised learning on 3D point clouds. However, we argue that data augmentation is not ideal, as it requires a careful application-dependent selection of the types of augmentations …

Authors: Guofeng Mei, Cristiano Saltori, Elisa Ricci, Nicu Sebe, Qiang Wu, Jian Zhang, Fabio Poiesi

Springer Professional