International Journal of Computer Vision OnlineFirst articles

08.04.2024

CRetinex: A Progressive Color-Shift Aware Retinex Model for Low-Light Image Enhancement

Low-light environments introduce various complex degradations into captured images. Retinex-based methods have demonstrated effective enhancement performance by decomposing an image into illumination and reflectance, allowing for selective …

Authors: Han Xu, Hao Zhang, Xunpeng Yi, Jiayi Ma

08.04.2024

Error-Aware Conversion from ANN to SNN via Post-training Parameter Calibration

Spiking Neural Network (SNN), originating from the neural behavior in biology, has been recognized as one of the next-generation neural networks. Conventionally, SNNs can be obtained by converting from pre-trained Artificial Neural Networks (ANNs) …

Authors: Yuhang Li, Shikuang Deng, Xin Dong, Shi Gu

04.04.2024

FSODv2: A Deep Calibrated Few-Shot Object Detection Network

Traditional methods for object detection typically necessitate a substantial amount of training data, and creating high-quality training data is time-consuming. We propose a novel Few-Shot Object Detection network (FSODv2) in this paper that aims …

Authors: Qi Fan, Wei Zhuo, Chi-Keung Tang, Yu-Wing Tai

02.04.2024

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm

Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical evolutionary algorithm (EA) and derives that both have consistent mathematical formulation. Then inspired by …

Authors: Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao

02.04.2024

MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis

Existing multimodal conditional image synthesis (MCIS) methods generate images conditioned on any combinations of various modalities that require all of them must be exactly conformed, hindering the synthesis controllability and leaving the …

Authors: Jianbin Zheng, Daqing Liu, Chaoyue Wang, Minghui Hu, Zuopeng Yang, Changxing Ding, Dacheng Tao

26.03.2024

InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions

We have recently seen tremendous progress in diffusion advances for generating realistic human motions. Yet, they largely disregard the multi-human interactions. In this paper, we present InterGen, an effective diffusion-based approach that …

Authors: Han Liang, Wenqian Zhang, Wenxuan Li, Jingyi Yu, Lan Xu

Open Access 26.03.2024

Hyperbolic Deep Learning in Computer Vision: A Survey

Deep representation learning is a ubiquitous part of modern computer vision. While Euclidean space has been the de facto standard manifold for learning visual representations, hyperbolic space has recently gained rapid traction for learning in …

Authors: Pascal Mettes, Mina Ghadimi Atigh, Martin Keller-Ressel, Jeffrey Gu, Serena Yeung

Open Access 22.03.2024

Pictorial and Apictorial Polygonal Jigsaw Puzzles from Arbitrary Number of Crossing Cuts

Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unordered visual fragments, is fundamental to numerous applications, and yet most of the literature of the last two decades has focused thus far on …

Authors: Peleg Harel, Ofir Itzhak Shahar, Ohad Ben-Shahar

19.03.2024

UrbanEvolver: Function-Aware Urban Layout Regeneration

Urban regeneration is an important strategy for land redevelopment, to address the urban decay in cities. Among many tasks, urban layout is the foundation for urban regeneration. In this paper, we target a new task called function-aware urban …

Authors: Yiming Qin, Nanxuan Zhao, Jiale Yang, Siyuan Pan, Bin Sheng, Rynson W. H. Lau

18.03.2024

Vision-Language Alignment Learning Under Affinity and Divergence Principles for Few-Shot Out-of-Distribution Generalization

Recent advances in fine-tuning large-scale vision-language pre-trained models (VL-PTMs) have shown promising results in quick adaption to downstream tasks. However, prior research often lacks comprehensive investigation into out-of-distribution …

Authors: Lin Zhu, Weihan Yin, Yiyao Yang, Fan Wu, Zhaoyu Zeng, Qinying Gu, Xinbing Wang, Chenghu Zhou, Nanyang Ye

13.03.2024

Softmax-Free Linear Transformers

Vision transformers (ViTs) have pushed the state-of-the-art for visual perception tasks. The self-attention mechanism underpinning the strength of ViTs has a quadratic complexity in both computation and memory usage. This motivates the development …

Authors: Jiachen Lu, Junge Zhang, Xiatian Zhu, Jianfeng Feng, Tao Xiang, Li Zhang

Open Access 13.03.2024

One-Shot Neural Face Reenactment via Finding Directions in GAN’s Latent Space

In this paper, we present our framework for neural face/head reenactment whose goal is to transfer the 3D head orientation and expression of a target face to a source face. Previous methods focus on learning embedding networks for identity and …

Authors: Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

10.03.2024

PLP: Point-Line Minimal Problems under Partial Visibility in Three Views

We present a complete classification of minimal problems for generic arrangements of points and lines in space observed partially by three calibrated perspective cameras when each line is incident to at most one point. This is a large class of …

Authors: Timothy Duff, Kathlén Kohn, Anton Leykin, Tomas Pajdla

Open Access 08.03.2024

Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering

Data augmentation has contributed to the rapid advancement of unsupervised learning on 3D point clouds. However, we argue that data augmentation is not ideal, as it requires a careful application-dependent selection of the types of augmentations …

Authors: Guofeng Mei, Cristiano Saltori, Elisa Ricci, Nicu Sebe, Qiang Wu, Jian Zhang, Fabio Poiesi

07.03.2024

Does Confusion Really Hurt Novel Class Discovery?

When sampling data of specific classes (i.e., known classes) for a scientific task, collectors may encounter unknown classes (i.e., novel classes). Since these novel classes might be valuable for future research, collectors will also sample them …

Authors: Haoang Chi, Wenjing Yang, Feng Liu, Long Lan, Tao Qin, Bo Han

07.03.2024

Open Set Recognition in Real World

Open set recognition (OSR) constitutes a critical endeavor within the domain of computer vision, frequently deployed in applications, such as autonomous driving and medical imaging recognition. Existing OSR methodologies predominantly center on …

Authors: Zhen Yang, Jun Yue, Pedram Ghamisi, Shiliang Zhang, Jiayi Ma, Leyuan Fang

07.03.2024

Adaptive Multi-Source Predictor for Zero-Shot Video Object Segmentation

Static and moving objects often occur in real-life videos. Most video object segmentation methods only focus on extracting and exploiting motion cues to perceive moving objects. Once faced with the frames of static objects, the moving object …

Authors: Xiaoqi Zhao, Shijie Chang, Youwei Pang, Jiaxing Yang, Lihe Zhang, Huchuan Lu

06.03.2024

A Survey on Global LiDAR Localization: Challenges, Advances and Open Problems

Knowledge about the own pose is key for all mobile robot applications. Thus pose estimation is part of the core functionalities of mobile robots. Over the last two decades, LiDAR scanners have become the standard sensor for robot localization and …

Authors: Huan Yin, Xuecheng Xu, Sha Lu, Xieyuanli Chen, Rong Xiong, Shaojie Shen, Cyrill Stachniss, Yue Wang

Open Access 06.03.2024

Domain Generalization with Small Data

In this work, we propose to tackle the problem of domain generalization in the context of insufficient samples. Instead of extracting latent feature embeddings based on deterministic models, we propose to learn a domain-invariant representation …

Authors: Kecheng Chen, Elena Gal, Hong Yan, Haoliang Li

Open Access 05.03.2024

Automated Detection of Cat Facial Landmarks

The field of animal affective computing is rapidly emerging, and analysis of facial expressions is a crucial aspect. One of the most significant challenges that researchers in the field currently face is the scarcity of high-quality, comprehensive …

Authors: George Martvel, Ilan Shimshoni, Anna Zamansky

Springer Professional