In recent years, many low-light enhancement approaches have been proposed. This section briefly reviews the related classic and contemporary technical solutions.
Deep Learning Methods
With the popularity of deep learning, many low-level vision tasks have been improved by the application of deep learning models, including rain removal [29], super-resolution [30,31], artifact removal [32], and impurity removal [33].
The task of low-light image enhancement is mainly divided into two categories: supervised and unsupervised.
In supervised learning, LLNet [34] employed a deep network built on a stacked sparse denoising autoencoder (SSDA) to enhance and denoise low-light noisy images. Wei et al. designed RetinexNet [35], which combined Retinex theory with a convolutional neural network (CNN) to estimate and adjust the illumination map for contrast enhancement, followed by BM3D [9] post-processing for denoising. MBLLEN [36] first distributed image feature extraction across different branches of a CNN, then enhanced the branches simultaneously with multiple subnets, and finally fused the multi-branch outputs into the final enhanced image. KinD [37] decomposed the image into two components: an illumination component responsible for light adjustment and a reflectance component responsible for degradation removal.
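For reference, these Retinex-based pipelines share the classical decomposition (the notation below is generic and not taken from any single cited paper):

$$ S = R \circ I, $$

where $S$ is the observed image, $R$ the reflectance, $I$ the illumination, and $\circ$ denotes element-wise multiplication; enhancement then reduces to estimating $R$ while adjusting $I$.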
Lim et al. [38] proposed a deep stacked Laplacian restorer (DSLR) that learned the feature mapping from a cell phone camera to a DSLR camera. KinD++ [39] used a two-stage approach to first decompose the image into reflectance and illumination components and then adjust the illumination component to brighten the image while preserving its naturalness.
Li et al. [40,43] proposed a progressive-recursive image enhancement network (PRIEN) for low-light images, in which recursive units composed of recurrent layers and residual blocks extract features from the input image. In [41], a noise-suppressing low-light image enhancement approach based on the extent of exposedness at each image pixel was proposed. Lv et al. [42] combined an attention mechanism with a multi-branch convolutional neural network for low-light enhancement on synthetic datasets.
Xu et al. [44] exploited the structure and texture features of low-light images to improve perceptual quality. Lv et al. [45] first employed a sharpening-smoothing image filter (SSIF) for multi-scale decomposition of the image and then applied contrast-limited adaptive histogram equalization (CLAHE) to the decomposed segments to enhance low-light images effectively.
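As a rough illustration of this decompose-then-equalize idea (a minimal sketch, not the authors' implementation: a Gaussian blur stands in for the SSIF, only two scales are used, and CLAHE is applied to the luminance channel only):

```python
import cv2
import numpy as np

def enhance_low_light(bgr, clip_limit=2.0, tile_grid=(8, 8)):
    # Work on the luminance channel so colors are preserved.
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)

    # Two-scale decomposition: smoothed base layer plus signed detail residual.
    base = cv2.GaussianBlur(l, (0, 0), sigmaX=3)
    detail = l.astype(np.int16) - base.astype(np.int16)

    # Contrast-limited adaptive histogram equalization on the base layer only,
    # which limits amplification of high-frequency noise.
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid)
    base_eq = clahe.apply(base)

    # Recombine the equalized base with the preserved detail layer.
    l_out = np.clip(base_eq.astype(np.int16) + detail, 0, 255).astype(np.uint8)
    return cv2.cvtColor(cv2.merge((l_out, a, b)), cv2.COLOR_LAB2BGR)
```

Equalizing only the base layer is the usual motivation for decomposing before equalizing; the blur scale and CLAHE parameters here are purely illustrative.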
The methods of Qian et al. [46] and Zhu et al. [47] likewise performed enhancement in a supervised manner.
In unsupervised learning, GLAD [48] calculated a global illumination estimate of the low-light input, adjusted the illumination under the guidance of this estimate, and finally supplemented details by cascading with the original input. Guo et al. [49] proposed ZeroDCE, a lightweight network for low-light image enhancement based on a new deep learning formulation, zero-reference deep curve estimation.
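Concretely, the curve-estimation idea maps each pixel through a learned quadratic curve applied iteratively,

$$ LE_n(x) = LE_{n-1}(x) + A_n(x)\, LE_{n-1}(x)\bigl(1 - LE_{n-1}(x)\bigr), \qquad LE_0(x) = I(x), $$

where $I$ is the input image and $A_n(x) \in [-1, 1]$ are pixel-wise curve parameter maps predicted by the network, trained with non-reference losses only; this summary follows the Zero-DCE formulation in [49].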
Jiang et al. [50] proposed a GAN-based network for low-light image enhancement. Liu et al. [51] proposed a new network, RUAS, which discovered low-light prior architectures from a compact search space by designing a collaborative reference-free learning strategy and used the discovered architectures for low-light enhancement. Zhou et al. [52] likewise enhanced images with an unsupervised method.
In addition, some methods differ from those listed above. MSR-net [53] combined multi-scale Retinex with a CNN. Ying et al. [54] first obtained a camera response model by analyzing the relationship between images captured at different exposures, then estimated an exposure ratio map from the luminance component of the image, and finally enhanced the low-light image using the camera model and the exposure ratio map.
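One widely used closed form for such a camera response model is the beta-gamma brightness transform (quoted here as background; whether [54] adopts exactly this parameterization is an assumption):

$$ g(P, k) = \beta P^{\gamma} = e^{\,b(1-k^{a})}\, P^{(k^{a})}, $$

where $P$ is the observed pixel value, $k$ the exposure ratio, and $a$, $b$ camera-dependent parameters; applying $g$ with a pixel-wise $k$ brightens under-exposed regions while leaving well-exposed ones nearly unchanged.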
Yu et al. [55] proposed a physical illumination model that describes the degradation of low-illumination images and used it for enhancement. CRENet [56] introduced a new re-enhancement module that allows the network to iteratively refine the enhanced image. MAXIM [57] adapted MLPs to image processing tasks by introducing convolutional layers and spatial fusion. The methods of [58] and [59] are similar.
However, existing models still cannot fully address the problem of region degradation and continue to suffer from several limitations.
In supervised models, some methods used synthetic low-light images for training; however, such synthetic images cannot accurately represent real-world illumination conditions, such as spatially varying lighting and noise levels. Other methods relied on pre-defined loss functions to simulate Retinex decomposition, which reduced the generalization of the trained decomposition network. Additionally, some methods idealized the relationship between the reflectance and illumination components: when reconstructing images, simply multiplying the two components according to the Retinex formula led to poor fusion of complex features.
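For instance, a typical Retinex-style reconstruction adjusts the estimated illumination with a gamma curve and multiplies it back into the reflectance (a generic form, not the formulation of any specific cited method):

$$ \hat{S} = R \circ I^{1/\gamma}, $$

which treats $R$ and $I$ as independent and ignores residual noise and color distortion in $R$; this is exactly where naive fusion breaks down.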
Although unsupervised approaches demonstrate competitive performance, they still face several limitations. First, achieving stable training, avoiding color bias, and establishing relationships across domain-specific information remain challenging for current methodologies. Second, the design of non-reference loss functions becomes complex when color preservation, artifact removal, and gradient backpropagation must all be considered. As a result, the enhanced images often fail to reach satisfactory visual quality.
This article addresses several challenges associated with supervised low-light enhancement methods. Specifically, it utilizes the LOL dataset to tackle the issue of training on non-realistic images. A data-driven decomposition approach is employed to enhance the transferability of the decomposition network. Furthermore, an adjustment network is implemented to improve the fusion of the illumination and reflectance components.