
Open Access 16.09.2024 | Research

A Joint Network for Low-Light Image Enhancement Based on Retinex

Authors: Yonglong Jiang, Jiahe Zhu, Liangliang Li, Hongbing Ma

Published in: Cognitive Computation



Abstract

Methods based on the physical Retinex model are effective in enhancing low-light images, adeptly handling the challenges posed by low signal-to-noise ratios and high noise in images captured under weak lighting conditions. However, traditional models based on manually designed Retinex priors do not adapt well to complex and varying degradation environments. DEANet (Jiang et al., Tsinghua Sci Technol. 2023;28(4):743–53) combines frequency and Retinex to address the interference of high-frequency noise in low-light image restoration. Nonetheless, low-frequency noise still significantly impacts the restoration of low-light images. To overcome this issue, this paper integrates the physical Retinex model with deep learning to propose a joint network model, DEANet++, for enhancing low-light images. The model is divided into three modules: decomposition, enhancement, and adjustment. The decomposition module employs a data-driven approach based on Retinex theory to split the image; the enhancement module restores degradation and adjusts brightness in the decomposed images; and the adjustment module restores details and adjusts complex features in the enhanced images. Trained on the publicly available LOL dataset, DEANet++ not only surpasses the control group in both visual and quantitative aspects but also achieves superior results compared to other Retinex-based enhancement methods. Ablation studies and additional experiments highlight the importance of each component in this method.

Introduction

In everyday scenarios, as photographic equipment becomes increasingly compact, individuals frequently capture and share images of interest. Due to the diminutive aperture sizes in portable photography devices, limited light transmission often results in images that are dark, lack detail, and offer poor visual quality. While professional cameras can mitigate these issues by employing higher sensitivity settings (ISO), longer exposure times, and flash usage, these adjustments are not without drawbacks. High ISO settings tend to amplify noise, particularly in dark and solid color areas of an image, significantly reducing the peak signal-to-noise ratio (PSNR) and degrading image quality. Extended exposure times allow more light to enter, enhancing brightness but introducing Johnson noise, which can disrupt image clarity. Although using a flash can improve local contrast, it often produces results that are visually displeasing. Traditional methods for enhancing image quality exhibit clear limitations. However, the advent of deep learning has spurred the development of innovative low-light image enhancement techniques in recent years. Despite these advancements, achieving enhanced images that maintain rich detail and high color accuracy remains a challenging endeavor.
Figure 1 illustrates this point with three natural images captured under challenging lighting conditions. The first image was taken indoors with low light during the day, the second captures a backlit scene at sunrise with ample ambient light, and the third was shot in dim outdoor conditions under cloudy skies. The second row displays the enhanced results achieved using the model proposed in this paper, demonstrating notable improvements in image clarity and detail.
In recent years, significant progress has been made in low-light image enhancement. Traditional methods such as histogram equalization [2, 3] and their developments [4, 5] performed only simple brightness adjustment, resulting in modest improvements in image quality. Retinex-based methods [6], such as SSR [7] and MSRCR [8], have been widely adopted. These approaches, based on the Retinex decomposition concept, assumed that the scene perceived by the human eye was a product of reflectance and illumination layers. While these methods used manually designed prior filters for image decomposition and enhancement, and the results were visually pleasing in terms of light distribution, they introduced noticeable noise, significantly reducing visual quality. Subsequently, BM3D [9] was employed to remove noise from the illumination and reflection components, but issues with residual noise or overly smoothed details persisted. With the advancement of deep learning, data-driven approaches have enhanced the effectiveness and applicability of low-light image enhancement. R2RNet [10] utilized spatial information of images to improve contrast while preserving details through frequency information. LR3M [11] infused a low-rank prior into the Retinex decomposition process, aiming to effectively enhance low-light images and suppress intense noise. SCI [12] established a cascaded illumination learning process with weight sharing to achieve low-light image enhancement. URetinex-Net [13] formulated the decomposition problem as an implicit prior regularization model, enhancing low-light images through learned modules. These Retinex-based low-light image enhancement schemes focus on decomposing images with minimal noise while maintaining more detail. However, they overlook the complexity of feature fusion after enhancement. During the image enhancement process, although the feature distribution is preliminarily restored, it still deviates from the true image feature distribution. In DEANet, not only do we combine frequency with Retinex to address the interference of high-frequency noise during the image restoration process, but we also incorporate an adjustment module to resolve the issues of feature fusion in enhanced images. Nevertheless, its performance in reducing the impact of low-frequency noise on low-light image restoration is not optimal.
Based on the above analysis, the challenges in low-light image enhancement are summarized as follows:
  • How can Retinex theory be leveraged to devise a decomposition network within a neural framework, ensuring the effective separation of lighting and reflective components in both normal and low-light images following neural processing?
  • How can the degradation in the reflective components of low-light images, which is typically exacerbated by noise interference, be effectively counteracted?
  • How can the loss of intricate image details in the output of neural networks be averted, thereby guaranteeing the production of high-quality images?
A new deep learning network is proposed in this paper which addresses the problem of low-light image enhancement. The contribution of this paper can be summarized as follows:
1.
The proposed DEANet++ network integrates the Retinex physical model with deep learning techniques to enhance low-light images under non-ideal conditions.
 
2.
The DEANet++ network comprises three modules: a decomposition module, an enhancement module, and an adjustment module. The decomposition and adjustment modules utilize a novel network structure that combines channel attention, U-Net, and residual network (ResNet) [14] to minimize information loss and enhance the representation of image features. In the enhancement module, a new structure employing channel attention, dense convolutional network (DenseNet) [15], and U-Net [16] is implemented to deeply extract and refine image features and details. The adjustment module is specifically designed to align and integrate complex feature images from the enhancement module, thus more effectively restoring image detail and structure.
 
3.
New loss functions have been introduced in the enhancement and adjustment modules of the network to more efficiently guide the image enhancement process.
 
In addition, the decomposition network, the enhancement network, and the adjustment network are trained separately. This approach not only reduces the model’s memory requirements but also allows for flexible adjustments during network training. Numerous experiments have been conducted to verify the effectiveness of the design and demonstrate its superiority over existing technical solutions.

Related Work

In recent years, many low-light enhancement approaches have been proposed. This section briefly introduces the related classic and contemporary technical solutions.
Traditional Methods
For images with overall low brightness, when the distribution of grayscale values is highly uniform, the amount of information contained in the image is substantial. Conversely, if there is only a single grayscale value, the information content is considerably limited. Techniques such as histogram equalization (HE) [2, 3, 17] and its further developments [4, 5] mapped the grayscale range of an image to [0,1]. This reassigned pixel values across different grayscale ranges, aiming to equalize the number of pixels in each range. Such redistribution balanced the output histogram, prevented excessive concentration of grayscale values, and thus enhanced image contrast. Another approach involved applying nonlinear gamma correction (GC) to each pixel, effectively improving brightness but overlooking the relationships among pixels. Traditional methods primarily focused on enhancing brightness, disregarding real illumination factors, which led to significant discrepancies between the enhanced image and the actual scene. Furthermore, some traditional algorithms started with image decomposition and reconstruction to achieve image restoration. Liang et al. [18] proposed a multi-scale tone mapping scheme based on a layer decomposition model, addressing issues of halo artifacts and over-enhancement. Shibata et al. [19] introduced a gradient-domain image reconstruction framework with intensity range and structural constraints. Similar methodologies were developed in [20–23].
Lighting-Based Methods
Unlike traditional methods, lighting-based methods were based on the Retinex theory [6]. A lighting-based approach separated normal-light images and low-light images into light components and object reflection components. At the same time, it also ensured that the object reflection components were as close as possible between the normal-light image and the low-light image. Early attempts that adopted this method included single-scale Retinex (SSR) [7], which obtained the illumination component through a single-scale Gaussian filter, and multi-scale Retinex (MSR), which enhanced SSR by performing a weighted average of the results obtained by multi-scale Gaussian filters. MSR maintained high image fidelity and dynamic compression. Based on MSR, Multi-Scale Retinex with Color Restoration (MSRCR) was proposed [8], which added a color restoration factor C to compensate for the color distortion caused by contrast enhancement in local areas of the image. BIMEF [24] used a multi-exposure fusion framework for image enhancement. LIME [25] estimated the lighting through presuppositions, obtained the estimated lighting through a weighted model, and then used block-matching and 3D filtering (BM3D) [9] for post-processing. Wang et al. proposed a method called NPE [26], which enhanced the contrast while maintaining the naturalness of the image lighting. Fu et al. [27] proposed a method that enhanced low-light images by fusing multiple images, but this method sacrificed the realism of some detailed areas of the image. In SRIE [28], it was shown that the reflection and illumination components could be better obtained by using a weighted variational model.
Deep Learning Methods
With the popularity of deep learning, many low-level visual tasks had been improved by the application of deep learning models. For example, different methods were used to perform tasks such as rain removal [29], super-resolution [30] and [31], artifact removal [32], and impurity removal [33].
The task of low-light image enhancement was mainly divided into two categories: supervised and unsupervised.
In supervised learning, LLNet [34] employed a deep learning network, using a stacked sparse denoising autoencoder (SSDA) for enhancing and denoising low-light noisy images. Wei et al. designed RetinexNet [35] that combined the Retinex theory with a convolutional neural network (CNN) to estimate and adjust the illumination map for image contrast enhancement, and then used BM3D [9] for post-processing to achieve denoising. MBLLEN [36] first assigned image feature extraction to different branches through the convolutional layers of a CNN. Second, multiple subnets were used for simultaneous enhancement. Finally, the multi-branch output results were fused into the final enhanced image. KinD [37] decomposed the image into two parts. One component (illumination) was responsible for light conditioning, while the other (reflectance) was responsible for degradation removal. Lim et al. [38] proposed a deep stacked Laplacian restorer (DSLR) that learned the feature mapping from a cell phone to a DSLR camera. KinD++ [39] used a two-stage approach to first decompose the image into reflectance and illumination components and then adjusted the illumination component to brighten the image while preserving its naturalness. Li et al. [40] proposed a progressive-recursive image enhancement network to enhance low-light images. In [41], a noise-suppressing low-light image enhancement approach based on the extent of exposedness at each image pixel was proposed. Lv et al. [42] combined the attention mechanism with a multi-branch convolutional neural network for low-light enhancement on synthetic datasets. Li et al. [43] proposed a neural network—a progressive-recursive image enhancement network (PRIEN)—to enhance low-light images. Recursive units consisting of recurrent layers and residual blocks were used to extract features from the input image. Xu et al. [44] exploited the structure and texture features of low-light images to improve perceptual quality. Lv et al. [45] first employed a sharpening-smoothing image filter (SSIF) for multi-scale decomposition of the image, subsequently applying contrast-limited adaptive histogram equalization (CLAHE) to the decomposed image segments to enhance low-light images effectively. Qian et al. [46] and Zhu et al. [47] likewise adopted supervised enhancement methods.
In unsupervised learning, GLAD [48] calculated the global illumination estimate of the low-light input, adjusted the illumination under the guidance of the illumination estimate, and finally supplemented the details by cascading with the original input. Guo et al. [49] proposed a lightweight network, ZeroDCE, for low-light image enhancement. This is a new deep learning method based on zero-reference deep curve estimation. Jiang et al. [50] proposed a GAN-based network for low-light image enhancement. Liu et al. [51] proposed a new network called RUAS. This discovered low-light prior architectures from a compact search space by designing a collaborative reference-free learning strategy and used the prior architecture for low-light enhancement. Zhou et al. [52] also enhanced images using an unsupervised method.
In addition, there are some methods that are different from those listed above. MSR-net [53] was a model of multi-scale Retinex combined with a CNN. Ying et al. [54] first obtained a camera response model by analyzing the relationship between images with different exposures. Then, the exposure ratio map of the image was obtained by using the estimation method of the luminance component of the image. Finally, the low-light images were enhanced using the corresponding camera model and exposure ratio map. Yu et al. [55] proposed a physical illumination model which described the degradation of low illumination images to enhance images. CRENet [56] introduced a new re-enhancement module which allows the network to iteratively refine the enhanced image. MAXIM [57] was a novel approach to optimizing MLPs for image processing tasks by introducing convolutional layers and spatial fusion. Similar methods are [58] and [59].
However, the existing models still cannot perfectly address the problem of region degradation; they still suffer from many problems.
In supervised models, some methods used synthetic low-light images for training. However, these synthetic images could not accurately represent real-world illumination conditions, such as spatially varying lighting and degree of noise. Some methods also used pre-defined loss functions to simulate Retinex decomposition, which led to reduced generalization performance of the trained decomposition network. Additionally, some methods idealized the relationship between reflection and illumination components. When generating images, simply multiplying the reflection and illumination components using the Retinex formula led to poor fusion results of complex features.
Although unsupervised approaches demonstrated competitive performance, they continued to face several limitations. First, achieving stable training, avoiding color bias, and establishing relationships across domain-specific information remained challenging for current methodologies. Second, the design of non-reference loss functions became complex when considerations such as color preservation, artifact removal, and gradient backpropagation were integrated. As a result, the visually enhanced images often failed to meet satisfactory quality standards.
This article addresses several challenges associated with supervised low-light enhancement methods. Specifically, it utilizes the LOL dataset to tackle the issue of training with non-realistic images. A data-driven decomposition approach is employed to enhance the transferability of decomposition networks. Furthermore, an adjustment network is implemented to improve the fusion between the illumination and reflection components.

Methodology

An excellent low-light image enhancement model should be able to restore the details of the image, adjust its brightness, and solve the problem of the degradation of reflection components hidden in the dark. A deep convolutional neural network architecture is proposed to achieve this goal. As shown in Fig. 2, the network can be functionally decomposed into three modules: the decomposition module, the enhancement module, and the adjustment module. The decomposition module is used to extract the lighting and reflection components of the image. It uses a data-driven decomposition method which addresses the problem of poor transferability that is caused when using preset constraints to decompose images. The enhancement module is used to reduce the degradation of object reflection and lighting components caused by noise. The adjustment module is used for complex feature fusion, ensuring the restoration of image detail and color. More information about this network is given below.

Further Thoughts and Hypothesis

Analysis of the Retinex Theory

The Retinex theory posits that the color of an object is primarily determined by its reflectivity, an intrinsic property that remains constant regardless of the light intensity or illumination inhomogeneities to which it is exposed. Thus, an object’s image results from the incident light reflecting off its surface. This reflectivity depends solely on the object’s own characteristics and is unaffected by the characteristics of the incident light. According to Retinex theory, an image I comprises a light component L and a reflection component R, mathematically represented as \( I = R \circ L \).
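For concreteness, the element-wise relationship can be illustrated with a minimal NumPy sketch; the values and variable names below are purely illustrative and are not from the paper:

```python
import numpy as np

# Illustrative Retinex composition: I = R ∘ L (element-wise product).
# R holds the intrinsic reflectance in [0, 1]; L is an illumination map,
# here single-channel and broadcast across the RGB channels.
H, W = 4, 4
R = np.random.rand(H, W, 3)           # reflectance: property of the scene itself
L = np.random.rand(H, W, 1) * 0.2     # dim illumination (a low-light condition)
I_low = R * L                         # observed low-light image

# Under the ideal model the same R, lit more strongly, gives the normal-light
# image; real captures deviate from this because of noise and degradation.
L_bright = np.clip(L * 5.0, 0.0, 1.0)
I_high = R * L_bright
```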
Although designed for image enhancement under ideal conditions, the traditional decomposition model \( I=R\circ L \) does not account for the influence of noise, which can significantly affect both the reflectance and illumination components after decomposition. This paper introduces the DEANet++ network, which integrates the Retinex model with deep learning techniques to effectively address noise interference. Building on the physical model, DEANet++ offers enhanced interpretability over other deep learning networks and improves the performance of the physical model under non-ideal conditions.

Improvements to DEANet

In our previous research, we introduced DEANet [1] to mitigate the impact of noise on the reflective components during low-light image enhancement. The model was initially based on the premise that image noise predominantly affects high-frequency information with minimal impact on low frequencies. However, subsequent empirical studies have shown that even slight noise within the low-frequency domain can adversely affect the reflective components, leading to decreased image quality and blurriness under dim lighting conditions. To address the interference of low-frequency noise and further improve the quality of restored images, DEANet++ has been redesigned. The enhanced module now integrates a channel attention mechanism to effectively extract features and suppress noise. In the final adjustment module, the inclusion of attention mechanisms ensures thorough integration of image features. Ultimately, this paper presents improvements in the network structure and loss functions, significantly enhancing the quality of image restoration under low-light conditions.

Image Decomposition

Traditional methods often decompose images into reflection and lighting components using well-designed, yet rigid constraints that may not be suitable for all applications. In contrast, a data-driven approach to image decomposition has been implemented, effectively overcoming these limitations. During the training phase, pairs of low-light and normal-light images are simultaneously fed into the decomposition network module. This module then processes and separates the images into distinct lighting and reflection components. Trained on a substantial dataset, the module demonstrates enhanced performance in component separation compared to traditional methods restricted by pre-defined rules.

Image Enhancement

In practical scenarios, the degradation observed in low-light images is typically more severe than in normal-light images. This issue extends to the reflection and lighting components derived post-decomposition of these images. Specifically, in the reflection component of objects, the intensity of the lighting component significantly influences the extent of degradation. To address this, an enhancement module is proposed. This module initially learns the mapping relationship between the reflection components of low-light and normal-light objects. Subsequently, it applies this mapping to restore the degraded reflected images.

Image Adjustment

Guided by the principle of \(I = R \circ L\), the adjustment module integrates feature fusion of the reflection and illumination components processed by the enhancement module. This integration facilitates precise tuning of image detail restoration and illumination enhancement. Considering the potential loss of image details in earlier stages within the decomposition and enhancement networks, the adjustment network overlays the enhanced reflection and illumination components onto the original input. This step ensures a comprehensive restoration of image details.

DEANet++

Building on the analysis and assumptions discussed in the previous section, we propose the deep neural network, DEANet++. This network not only ensures the restoration of image details but also enhances the brightness of low-light images. The architecture of DEANet++ is illustrated in Fig. 2.
DEANet++ comprises three subnets: a decomposition module, an enhancement module, and an adjustment module. These modules are responsible for decomposing low-light images, enhancing contrast, and adjusting details, respectively. Specifically, the decomposition module separates the low-light image into lighting and object reflectance components based on Retinex theory. The enhancement module focuses on the recovery of reflection and the enhancement of lighting. The processed components are then forwarded to the adjustment module, which aims to further enhance image contrast and detail reconstruction.
Subsequent sections will provide a detailed description of these modules, examining their functionality and contribution to the overall performance of the network.

Decomposition Module

Firstly, guided by Retinex theory, the image is decomposed into components R and L through the decomposition network, where R retains the color and details of the image, and L captures the light intensity. Due to the absence of actual lighting guidance, it is challenging to design a constraint a priori that effectively separates the lighting and reflection components from the image. Fortunately, the dataset cited in [35], which contains paired images [\(I_{low}\), \(I_{high}\)] configured at different exposures, facilitates this process. Here, \(I_{low}\) represents a low-light image, and \(I_{high}\), a normal-light image.
As depicted in Fig. 3, the decomposition network leverages the extensive data within the dataset to drive learning and refine an optimal decomposition method. This method enables the network to accurately decompose image pairs under various illumination conditions, resulting in paired object reflection components [\(R_{low}\), \(R_{high}\)] and lighting components [\(L_{low}\), \(L_{high}\)]. In Retinex theory, the object reflection component is expected to remain constant across different illumination intensities, suggesting that \(R_{low}\) and \(R_{high}\) should be identical. However, in practice, the object reflection component exhibits varying degrees of degradation under different illumination intensities, with degradation intensifying as illumination decreases. Consequently, adjustments to the object reflection components aim to align them as closely as possible.
To enhance the training of the network, several loss functions are employed to ensure accurate image separation. In LIME [25], the light change constraint is derived by weighting the initial light map, determined by the maximum pixel values in the RGB channels. However, this method of weighting the light map does not perfectly adapt to changes in image brightness. Therefore, inspired by LIME, the maximum pixel values across the RGB channels are used as the light map. This single-channel light map is then combined with the low-light image along the channel dimension and fed into the decomposition network. The loss functions specified in Eqs. 1 and 2 ensure the structural correctness of the decomposition for both low-light and normal-light images according to Retinex theory, where \(\left\| \cdot \right\| \) signifies the mean absolute loss (L1 loss). Additionally, the loss functions in Eqs. 3, 4, and 5 are designed to ensure consistency in the object reflections.
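To make the input construction concrete, the following PyTorch sketch builds the four-channel decomposition input from a batch of low-light RGB images; the function name and tensor layout are our own illustrative choices:

```python
import torch


def decomposition_input(image: torch.Tensor) -> torch.Tensor:
    """Build the decomposition-network input described above (sketch).

    `image` is an (N, 3, H, W) RGB tensor in [0, 1]. The illumination prior is
    the per-pixel maximum over the RGB channels (as in LIME), concatenated with
    the image along the channel dimension to give an (N, 4, H, W) tensor.
    """
    light_map, _ = torch.max(image, dim=1, keepdim=True)   # (N, 1, H, W)
    return torch.cat([image, light_map], dim=1)            # (N, 4, H, W)
```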
Total variation minimization (TVM) is incorporated to impose smoothness constraints on the decomposition network. TVM is commonly utilized in image restoration tasks to minimize the overall image gradient. However, if applied directly as a loss function, it fails in areas with significant gradient variations, due to its assumption of uniform image gradients. To enhance the smoothness of the image post-decomposition, the TVM is modulated by the gradient of the object reflection component, as delineated in Eq. 6. Here, \(\nabla\) represents the image gradient, with horizontal and vertical components denoted by \(\nabla_h\) and \(\nabla_v\), respectively. The smoothing constraint coefficient, \(\lambda_g\), and the exponential weight \(\exp\left(-\lambda_{g} \nabla R_{t}\right)\) adjust the sensitivity of the smooth loss function in regions of abrupt gradient changes. The total loss function of the decomposition module is thus formulated.
$$\mathcal{L}_{\text{recon-low}} = \left\| R_{\text{low}} \cdot L_{\text{low}} - I_{\text{low}} \right\|_{1}$$
(1)
$$\mathcal{L}_{\text{recon-high}} = \left\| R_{\text{high}} \cdot L_{\text{high}} - I_{\text{high}} \right\|_{1}$$
(2)
$$\mathcal{L}_{\text{r}} = \left\| R_{\text{low}} - R_{\text{high}} \right\|_{1}$$
(3)
$$\mathcal{L}_{\text{recon-low-mutual}} = \left\| R_{\text{high}} \cdot L_{\text{low}} - I_{\text{low}} \right\|_{1}$$
(4)
$$\mathcal{L}_{\text{recon-high-mutual}} = \left\| R_{\text{low}} \cdot L_{\text{high}} - I_{\text{high}} \right\|_{1}$$
(5)
$$\mathcal{L}_{\text{smooth-loss}} = \sum_{t=\text{low},\,\text{high}} \left\| \nabla L_{t} \circ \exp\left( -\lambda_{g} \nabla R_{t} \right) \right\|$$
(6)
$$\mathcal{L}_{\text{decom}} = \mathcal{L}_{\text{recon-low}} + \mathcal{L}_{\text{recon-high}} + 0.01 \cdot \mathcal{L}_{\text{r}} + 0.001 \cdot \mathcal{L}_{\text{recon-low-mutual}} + 0.001 \cdot \mathcal{L}_{\text{recon-high-mutual}} + \mathcal{L}_{\text{smooth-loss}}$$
(7)
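Assuming all components are (N, C, H, W) tensors in [0, 1], the decomposition loss of Eqs. 1–7 could be sketched in PyTorch as follows; the value of \(\lambda_g\) and the use of gradient magnitudes in the smoothness term are assumptions, since the paper does not state them explicitly:

```python
import torch
import torch.nn.functional as F


def gradients(img):
    """Finite-difference horizontal/vertical gradients of an (N, C, H, W) tensor."""
    dh = img[:, :, :, 1:] - img[:, :, :, :-1]
    dv = img[:, :, 1:, :] - img[:, :, :-1, :]
    return dh, dv


def smooth_term(L_map, R, lam_g=10.0):
    """One summand of Eq. 6: illumination TV weighted by exp(-lam_g * |grad R|)."""
    dLh, dLv = gradients(L_map)
    dRh, dRv = gradients(R.mean(dim=1, keepdim=True))  # collapse RGB for the weight
    return (dLh.abs() * torch.exp(-lam_g * dRh.abs())).mean() + \
           (dLv.abs() * torch.exp(-lam_g * dRv.abs())).mean()


def decomposition_loss(R_low, L_low, R_high, L_high, I_low, I_high):
    """Total decomposition loss of Eq. 7 with the weights given in the paper."""
    recon_low   = F.l1_loss(R_low * L_low,   I_low)    # Eq. 1
    recon_high  = F.l1_loss(R_high * L_high, I_high)   # Eq. 2
    recon_r     = F.l1_loss(R_low, R_high)             # Eq. 3
    mutual_low  = F.l1_loss(R_high * L_low,  I_low)    # Eq. 4
    mutual_high = F.l1_loss(R_low * L_high,  I_high)   # Eq. 5
    smooth = smooth_term(L_low, R_low) + smooth_term(L_high, R_high)  # Eq. 6
    return (recon_low + recon_high + 0.01 * recon_r
            + 0.001 * mutual_low + 0.001 * mutual_high + smooth)
```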
The decomposition module employs a novel hybrid network structure that integrates elements of ResNet and U-Net, as depicted in Fig. 3. This hybrid architecture leverages the skip connections from ResNet and the cross-layer connections from U-Net to minimize information loss during the convolution process. The network processes pairs of images, \(I_{low}\) and \(I_{high}\), as input. Initially, under the guidance of the loss function in Eq. 7, the network decomposes \(I_{low}\) to obtain \(R_{low}\) and \(L_{low}\). Subsequently, using weight sharing, it decomposes \(I_{high}\), yielding \(R_{high}\) and \(L_{high}\). Backpropagation is then performed by calculating the loss between the newly generated components (\(R_{low}\), \(L_{low}\), \(R_{high}\), \(L_{high}\)) and the original image pair (\(I_{low}\), \(I_{high}\)). To address feature loss during the image upsampling and subsampling processes, the decomposition network utilizes skip-layer connections within its architecture. Additionally, a channel attention mechanism is incorporated into the skip-layer connections between ResNet and U-Net, enhancing the network’s capacity to emphasize key features by weighting the channels of the convolutional outputs.
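One plausible form of the channel attention applied on the skip connections is a squeeze-and-excitation style block, sketched below; the reduction ratio and layer details are assumptions rather than the paper's exact design:

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention for skip connections.

    This is one plausible realization; the exact block used in DEANet++ may differ.
    """

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling per channel
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = self.fc(self.pool(x))  # per-channel weights in (0, 1)
        return x * weights               # re-weight the skip-connection features
```

In such a design, the block would re-weight the encoder features before they are concatenated with the corresponding decoder features, so less informative channels contribute less to the upsampling path.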

Enhancement Module

In the enhancement module, a DenseNet_R with excellent performance and a simple ResNet_L are employed to recover the degraded object reflection component and to enhance the lighting component, respectively.
In DenseNet_R, the degradation intensity of the object reflection component varies with different illumination intensities. Consequently, the lighting component \(L_{low}\) and the object reflection component \(R_{low}\) from the low-light image are concatenated along the channel dimension and fed into the enhancement network. Governed by the L1 loss function as specified in Eq. 8, the enhancement network effectively learns the mapping between the object reflection components of the low-light and normal-light images.
The reflection component, initially obtained in the decomposition module, is input into the enhancement module to mitigate degradation. The enhancement module generates a new reflection component, denoted as \(R_{new}\). This process not only restores the degraded reflection but also reduces noise compared to the original low-light reflection component. This mapping effectively eliminates the degradation caused by noise between the reflection component of the low-light object and that of the normal-light object. Furthermore, Eq. 9 ensures that the gradient changes in the reflected components of both objects are identical, implying that the boundary contours of the reflection components are aligned. Ultimately, the restoration network employs \(\mathcal{L}_{R\text{-}SSIM} = \text{SSIM}\left(R_{\text{new}}, R_{\text{high}}\right)\), where SSIM(\(\cdot\), \(\cdot\)) measures structural similarity. This metric confirms that the reflected components of both objects are structurally and perceptually similar. The loss associated with the recovery of the object reflection component is expressed in Eq. 10.
$$\mathcal{L}_{R\text{-res}} = \left\| R_{\text{new}} - R_{\text{high}} \right\|_{1}$$
(8)
$$\mathcal{L}_{R\text{-smooth}} = \left\| \nabla R_{\text{new}} - \nabla R_{\text{high}} \right\|_{2}^{2}$$
(9)
$$\mathcal{L}_{R} = \mathcal{L}_{R\text{-res}} + \mathcal{L}_{R\text{-smooth}} + \mathcal{L}_{R\text{-}SSIM}$$
(10)
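A hedged PyTorch sketch of Eqs. 8–10 is given below; it assumes the third-party pytorch_msssim package for the SSIM term and minimizes 1 − SSIM, a common differentiable surrogate, which may differ from the authors' implementation:

```python
import torch.nn.functional as F
from pytorch_msssim import ssim  # third-party SSIM; any differentiable SSIM works


def _grads(x):
    """Finite-difference horizontal/vertical gradients of an (N, C, H, W) tensor."""
    return x[:, :, :, 1:] - x[:, :, :, :-1], x[:, :, 1:, :] - x[:, :, :-1, :]


def reflectance_loss(R_new, R_high):
    """Reflectance-recovery loss of Eqs. 8-10 (sketch).

    The paper writes the SSIM term as SSIM(R_new, R_high); here the
    differentiable surrogate 1 - SSIM is minimized instead, a common choice.
    """
    res = F.l1_loss(R_new, R_high)                              # Eq. 8
    dnh, dnv = _grads(R_new)
    dhh, dhv = _grads(R_high)
    smooth = F.mse_loss(dnh, dhh) + F.mse_loss(dnv, dhv)        # Eq. 9
    ssim_term = 1.0 - ssim(R_new, R_high, data_range=1.0)       # SSIM term
    return res + smooth + ssim_term                             # Eq. 10
```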
The network for object reflection component recovery is based on the combination of DenseNet and U-Net, as shown in Fig. 4. This proposed network combines the powerful information extraction ability of DenseNet and the cross-layer connection of U-Net to capture image features for recovery. In the DenseNet_R network, DenseNet is combined with U-Net in the following way. The first layer is a ResNet_block that subsamples the input feature map. The second to fifth layers are Dense_blocks, which contain {2, 4, 12, 8} Conv_blocks, respectively. This network utilizes the powerful feature extraction capability of DenseNet and can obtain sufficient features during subsampling. At the same time, U-Net's skip-layer connections are used to minimize the loss of features in the convolution process. As in the decomposition module, the channel attention mechanism is also added to the U-Net skip-layer connections. This can efficiently transfer effective information to the upsampling layers to ensure that the final generated image has excellent detail information and better visual perception.
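The encoder layout just described, one ResNet_block followed by Dense_blocks containing {2, 4, 12, 8} Conv_blocks, could look roughly like the sketch below; the growth rate and the exact composition of each Conv_block are assumptions:

```python
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """One Conv_block with dense connectivity (3x3 conv + BN + ReLU assumed)."""

    def __init__(self, in_channels: int, growth: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, growth, kernel_size=3, padding=1),
            nn.BatchNorm2d(growth),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return torch.cat([x, self.body(x)], dim=1)  # concatenate new features


class DenseBlock(nn.Module):
    """A Dense_block made of a configurable number of Conv_blocks."""

    def __init__(self, in_channels: int, growth: int, n_convs: int):
        super().__init__()
        layers, channels = [], in_channels
        for _ in range(n_convs):
            layers.append(ConvBlock(channels, growth))
            channels += growth
        self.block = nn.Sequential(*layers)
        self.out_channels = channels

    def forward(self, x):
        return self.block(x)


# Encoder layout described in the text: Dense_blocks with {2, 4, 12, 8} Conv_blocks.
DENSE_BLOCK_CONFIG = (2, 4, 12, 8)
```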
As shown in Fig. 5, the ResNet_L network is a five-layer simple convolutional neural network. To effectively enhance the lighting component, the \(L_{low}\) obtained by the decomposition module is enhanced under the constraint of the L1 loss function in Eq. 11. In other words, the light enhancement network learns the mapping between the low-light components and the normal-light components. This mapping can adjust the brightness of the light and bring the low-light image to the normal-light level. \(L_{new}\) represents the newly generated lighting component. Finally, the loss function in Eq. 12 is also used in this network to ensure that the gradient change of \(L_{new}\) equals that of \(L_{high}\). The loss of the light enhancement network can be expressed as Eq. 13.
$$\mathcal{L}_{L\text{-res}} = \left\| L_{\text{new}} - L_{\text{high}} \right\|_{1}$$
(11)
$$\mathcal{L}_{L\text{-smooth}} = \left\| \nabla L_{\text{new}} - \nabla L_{\text{high}} \right\|_{2}^{2}$$
(12)
$$\mathcal{L}_{L} = \mathcal{L}_{L\text{-res}} + \mathcal{L}_{L\text{-smooth}}$$
(13)
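A minimal five-layer convolutional network with a residual connection, consistent with the description of ResNet_L, might look as follows; the channel width, activation choices, and residual wiring are assumptions:

```python
import torch
import torch.nn as nn


class ResNetL(nn.Module):
    """A plausible five-layer ResNet_L for illumination enhancement (sketch).

    The input is assumed to be the single-channel L_low map; channel width
    and the residual wiring are assumptions, not the paper's exact design.
    """

    def __init__(self, in_channels: int = 1, width: int = 32):
        super().__init__()
        self.head = nn.Conv2d(in_channels, width, kernel_size=3, padding=1)
        self.body = nn.Sequential(
            nn.Conv2d(width, width, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )
        self.tail = nn.Conv2d(width, 1, kernel_size=3, padding=1)

    def forward(self, l_low: torch.Tensor) -> torch.Tensor:
        x = self.head(l_low)
        x = x + self.body(x)                   # residual connection
        return torch.sigmoid(self.tail(x))     # L_new constrained to [0, 1]
```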
Table 1 Quantitative evaluation of low-light image enhancement methods on the LOL [35] dataset

| Methods | PSNR | SSIM | FSIM | MAE | GMSD | NIQE |
|---|---|---|---|---|---|---|
| MSRCR [8] | 13.964 | 0.514 | 0.827 | 0.046 | 0.151 | 8.793 |
| LIME [25] | 16.758 | 0.564 | 0.850 | 0.097 | 0.122 | 9.402 |
| BIMEF [24] | 13.875 | 0.577 | 0.907 | 0.103 | 0.085 | 8.678 |
| RetinexNet [35] | 16.774 | 0.559 | 0.759 | 0.153 | 0.148 | 6.335 |
| MBLLEN [36] | 18.897 | 0.755 | 0.926 | 0.123 | 0.124 | 2.370 |
| GLAD [48] | 19.718 | 0.703 | 0.923 | 0.129 | 0.112 | 7.709 |
| DSLR (2020) [38] | 14.978 | 0.668 | 0.856 | 0.253 | 0.170 | 2.913 |
| ZeroDCE (2020) [49] | 14.584 | 0.611 | 0.912 | 0.162 | 0.086 | 3.336 |
| EnlightenGAN (2021) [50] | 17.239 | 0.678 | 0.911 | 0.087 | 0.084 | 3.275 |
| KinD [37] | 20.726 | 0.810 | 0.923 | 0.123 | 0.103 | 3.335 |
| KinD++ (2021) [39] | 21.300 | 0.822 | 0.869 | 0.113 | 0.130 | 3.276 |
| STANet (2022) [44] | 21.350 | 0.815 | 0.945 | 0.069 | 0.088 | 3.181 |
| R2RNet (2023) [10] | 20.207 | 0.816 | 0.933 | 0.036 | 0.076 | 3.201 |
| DEANet (2023) [1] | 21.260 | 0.798 | 0.938 | 0.078 | 0.103 | 3.015 |
| SSI (2023) [22] | 10.224 | 0.397 | 0.832 | 0.057 | 0.151 | 8.624 |
| Ours | 22.542 | 0.850 | 0.964 | 0.070 | 0.071 | 3.580 |

Note that RetinexNet, KinD, KinD++, STANet, and DEANet are Retinex-based methods
Bold indicates the best results

Adjustment Module

In the adjustment module, a network structurally similar to the decomposition module is proposed, which is primarily designed for the complex fusion of image features and the fine-tuning of image details and illumination intensity. Relying solely on \(I=R\circ L\) to merge the newly generated object reflection component \(R_{new}\) and the new lighting component \(L_{new}\) from the enhancement network is insufficient for achieving intricate feature fusion. Consequently, this paper introduces an adjustment network to address this shortfall. Moreover, the original input \(I_{low}\) is reintegrated into the adjustment network to mitigate the feature loss encountered in the earlier decomposition and enhancement stages. This loss often results in diminished image details and color inaccuracies. Thus, reintegrating the original input into the final adjustment network helps refine details and preserve color fidelity. Additionally, a channel attention mechanism is incorporated into the U-Net’s skip connections within the adjustment network. This mechanism effectively reduces noise and mitigates low-light interference introduced by the original input, thereby enhancing the network’s capability to represent significant features.
To ensure that the adjustment network accurately restores image features, a content loss function is incorporated, utilizing the pre-trained Visual Geometry Group (VGG) 19 network [60]. This network is employed to simultaneously extract feature pairs [\(Feature_{new}\), \(Feature_{real}\)] from the generated image and the normal-light image. The L1 loss, as defined in Eq. 14, is applied to these feature pairs to assure feature similarity. Here, \(Feature_{new}\) denotes the features extracted from the image \(I_{new}\) by the adjustment network, while \(Feature_{real}\) refers to the features extracted from the normal-light image \(I_{high}\). Additionally, the adjustment network implements the L1 loss constraint outlined in Eq. 15 between the generated image and the normal-light image pair [\(I_{new}\), \(I_{high}\)]. This constraint is crucial for ensuring the consistency of color reproduction in the generated image. The overall loss function of the adjustment network is defined in Eq. 16.
$$\mathcal{L}_{A\text{-content}} = \left\| \text{Feature}_{\text{new}} - \text{Feature}_{\text{real}} \right\|_{1}$$
(14)
$$\mathcal{L}_{A\text{-res}} = \left\| I_{\text{new}} - I_{\text{high}} \right\|_{1}$$
(15)
$$\mathcal{L}_{A} = \mathcal{L}_{A\text{-content}} + \mathcal{L}_{A\text{-res}}$$
(16)
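The content and reconstruction losses of Eqs. 14–16 can be sketched with a frozen VGG19 feature extractor from torchvision; the chosen feature layer (relu4_4) and the omission of ImageNet input normalization are assumptions, not details stated in the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg19


class ContentLoss(nn.Module):
    """VGG19-based content loss of Eq. 14 (sketch).

    The feature layer (index 26, relu4_4) is an assumption; ImageNet input
    normalization is omitted for brevity. Requires a torchvision version that
    accepts the `weights` argument.
    """

    def __init__(self, layer_index: int = 26):
        super().__init__()
        features = vgg19(weights="DEFAULT").features[: layer_index + 1]
        for p in features.parameters():
            p.requires_grad_(False)            # frozen, pre-trained extractor
        self.features = features.eval()

    def forward(self, i_new: torch.Tensor, i_high: torch.Tensor) -> torch.Tensor:
        return F.l1_loss(self.features(i_new), self.features(i_high))  # Eq. 14


def adjustment_loss(content_loss: ContentLoss, i_new, i_high):
    """Total adjustment loss of Eq. 16: content loss (Eq. 14) plus pixel L1 (Eq. 15)."""
    return content_loss(i_new, i_high) + F.l1_loss(i_new, i_high)
```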
As depicted in Fig. 6, the adjustment network integrates a typical configuration of the five-layer U-Net and ResNet. This architecture incorporates a skip-layer connection strategy enhanced with a channel attention mechanism and employs both nearest-neighbor interpolation and convolution for upsampling. These techniques are strategically utilized to establish mappings that are critical for the restoration of image details and color accuracy.

Experiment

Dataset Details

The LOL dataset [35], meticulously curated through adjustments in camera exposure times and ISO settings, consists of 500 authentic pairs of low/normal illumination images. This dataset, which includes 485 pairs designated for training and 15 pairs earmarked for testing, is a quintessential resource for low-light image enhancement. Unlike datasets composed of artificially synthesized images, the LOL dataset offers a more faithful representation of real-world low-light conditions and serves as a pivotal resource for research in authentic low-light image processing scenarios.
In contrast, LIME, NPE, and MEF are often utilized as benchmark datasets to evaluate low-light image enhancement methods. These smaller-scale datasets lack reference images.

Experimental Details

DEANet++ is implemented using the PyTorch framework, and experiments are conducted with the LOL dataset. To enhance the model’s accuracy and convergence speed, an initial max-min normalization is applied to each image. Subsequently, each image undergoes random cropping and is resized to \(128\times 128\) pixels to bolster the model’s generalization capabilities. Additionally, the cropped images are subjected to random vertical and/or horizontal flips, as well as random rotations at right angles, to further augment the dataset variability. The entire model is trained on an NVIDIA RTX 3060 GPU. The Adam optimizer is utilized with an initial learning rate of 0.0002, and after 1000 epochs, the model reaches its optimal state. Further details will be available in the forthcoming code release.
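The augmentation pipeline described above could be sketched as follows; the sampling probabilities and the exact order of operations are assumptions:

```python
import numpy as np


def augment_pair(low: np.ndarray, high: np.ndarray, patch: int = 128):
    """Paired augmentation matching the description above (sketch).

    `low` and `high` are aligned HxWx3 arrays already max-min normalized to
    [0, 1]. The same random crop, flips, and right-angle rotation are applied
    to both images so the pair stays registered.
    """
    h, w = low.shape[:2]
    y = np.random.randint(0, h - patch + 1)
    x = np.random.randint(0, w - patch + 1)
    low, high = low[y:y + patch, x:x + patch], high[y:y + patch, x:x + patch]
    if np.random.rand() < 0.5:                       # random horizontal flip
        low, high = low[:, ::-1], high[:, ::-1]
    if np.random.rand() < 0.5:                       # random vertical flip
        low, high = low[::-1, :], high[::-1, :]
    k = np.random.randint(4)                         # rotation by 0/90/180/270 degrees
    low, high = np.rot90(low, k), np.rot90(high, k)
    return low.copy(), high.copy()


# Optimizer setup as reported in the text (model stands for any sub-network):
# optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
```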

Comparison with Existing Methods on Real Datasets

The model proposed in this study is benchmarked against a comprehensive control group, including MSRCR [8], BIMEF [24], LIME [25], RetinexNet [35], MBLLEN [36], KinD [37], DSLR [38], KinD++ [39], STANet [44], GLAD (unsupervised) [48], ZeroDCE (unsupervised) [49], EnlightenGAN (unsupervised) [50], R2RNet (unsupervised) [10], and DEANet [1], across four public datasets: LOL [35], LIME [25], NPE [26], and MEF [61].
To evaluate image quality comprehensively, five widely used metrics are employed: peak signal-to-noise ratio (PSNR), structural similarity index (SSIM) [62], feature similarity index (FSIM) [63], mean absolute error (MAE), and gradient magnitude similarity deviation (GMSD) [64]. PSNR assesses overall image quality, while SSIM evaluates image distortion and similarity between images. FSIM focuses on feature-based quality assessment, MAE quantifies the mean of absolute errors between predicted and observed values, and GMSD measures local image quality via local gradient magnitude similarity and calculates its standard deviation for a global quality assessment. The proposed model demonstrates superior performance, securing the highest ranks in four metrics and the second-highest in one. Although a leading evaluation index does not always correlate with better visual perception, the visual quality produced by our model significantly surpasses that of the control group. This superior visual effect is illustrated in Figs. 7 and 8. Specific evaluation index values are detailed in Table 1, and a set of control images is presented to delineate the experimental nuances. Notably, the blue-boxed section is magnified in Figs. 9 and 10, showcasing the enhanced detail and visual perception achieved by our results.
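For reference, two of the full-reference metrics, PSNR and MAE, can be computed directly as below; SSIM, FSIM, and GMSD typically rely on library implementations and are omitted here:

```python
import numpy as np


def psnr(pred: np.ndarray, target: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)


def mae(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error between an enhanced image and its reference."""
    return float(np.mean(np.abs(pred.astype(np.float64) - target.astype(np.float64))))
```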
Table 2 Quantitative comparison on LIME, NPE, and MEF datasets in terms of NIQE [65]

| Methods | LIME data [25] | NPE data [26] | MEF data [61] |
|---|---|---|---|
| MSRCR [8] | 5.742 | 4.890 | 6.823 |
| LIME [25] | 4.154 | 3.796 | 4.448 |
| BIMEF [24] | 3.816 | 4.196 | 3.423 |
| RetinexNet [35] | 4.361 | 3.943 | 4.408 |
| MBLLEN [36] | 4.073 | 5.001 | 3.656 |
| GLAD [48] | 4.128 | 3.278 | 3.468 |
| DSLR [38] | 3.993 | 4.776 | 3.568 |
| ZeroDCE [49] | 3.912 | 3.667 | 4.024 |
| EnlightenGAN [50] | 3.719 | 4.113 | 3.575 |
| KinD [37] | 3.763 | 3.329 | 3.647 |
| KinD++ [39] | 3.588 | 3.146 | 3.329 |
| STANet [44] | 3.990 | 3.454 | 3.646 |
| R2RNet [10] | 3.176 | 3.355 | 4.029 |
| DEANet [1] | 3.890 | 3.771 | 3.574 |
| Ours | 3.565 | 3.069 | 3.319 |

Bold indicates the best results
As shown in Table 1, the proposed model, DEANet++, significantly outperforms the control group methods in both peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) on the LOL dataset. DEANet++ achieves the best performance with scores of 22.542 dB in PSNR and 0.850 in SSIM. This represents an improvement over the second-best method, KinD++, by 1.242 dB in PSNR and 0.028 in SSIM. It surpasses the third-best method, STANet, by 1.192 dB in PSNR and 0.035 in SSIM, the fourth-best method, KinD, by 1.816 dB in PSNR and 0.040 in SSIM, the fifth-best method, DEANet, by 1.282 dB in PSNR and 0.052 in SSIM, and the sixth-best method, MBLLEN, by 3.645 dB in PSNR and 0.095 in SSIM. Notably, methods like MSRCR, LIME, RetinexNet, and ZeroDCE are significantly less effective. It is observed that some methods based on Retinex theory, such as LIME and RetinexNet, tend to blur details and/or amplify noise. The enhanced results produced by DEANet++ not only improve both local and global contrast but also exhibit sharper detail and higher color accuracy while effectively mitigating noise in images. These improvements are clearly reflected in the experimental outcomes.
Since LIME, NPE, and MEF consist solely of low-light images, conventional quantitative evaluations using PSNR and SSIM are not applicable. Therefore, the performance on these datasets is assessed using a no-reference image quality evaluation method, the Natural Image Quality Evaluator (NIQE) [65]. NIQE measures the deviation in the multivariate distribution of the test images. The results are tabulated in Table 2. Visual comparisons for other datasets are illustrated in Fig. 9.

Application Experiment

The experimental results demonstrate the application of enhanced low-light images to an advanced visual task: object detection. For robust validation, YOLOv3 [66], a widely utilized object detection algorithm known for its substantial recognition rate and broad applicability, is employed. Both the original low-light images and the enhanced low-light images serve as inputs for YOLOv3. As depicted in Fig. 11, it is evident that the object detector identifies more objects in the enhanced low-light images compared to the original low-light images.

Ablation Experiment

We conduct ablation studies on the composition of the network modules, network structure, and loss functions to demonstrate the rationality and effectiveness of the DEANet++ architecture.
Initially, ablation experiments are performed on the three network modules using the LOL dataset, with the results depicted in Fig. 12. To evaluate the significance of each module, both the enhancement and adjustment modules are removed sequentially to observe the impact on model performance. The results, as evidenced in Table 3, indicate that removing either the enhancement or adjustment modules significantly compromises the model’s effectiveness, underscoring their importance in the DEANet++ structure.
Second, ablation experiments are conducted on the network structure.
1.
In the decomposition, enhancement, and adjustment networks, the ResNet and DenseNet architectures are sequentially removed, compelling the model to rely solely on the U-Net structure for experiments. As U-Net serves as the foundational network, it is retained throughout. The results of these modifications are summarized in Table 4. The removal of the ResNet structure yields experimental results with a PSNR of 20.360 and an SSIM of 0.781. When the DenseNet structure is eliminated, the results show a PSNR of 18.412 and an SSIM of 0.776. These outcomes validate the efficacy and rational design of the newly proposed network structure.
 
2.
Furthermore, removing the attention mechanisms from individual modules did not result in significant changes; therefore, the channel attention mechanisms were subsequently removed from all three modules. The results post-removal are recorded as PSNR = 21.324 dB and SSIM = 0.799.
 
3.
Additionally, the process of concatenating with the original input was omitted in the adjustment module to assess its impact. The findings indicate that such concatenation indeed contributes to enhancing the quality of the final generated image, with experimental results documenting a PSNR of 20.720 dB and an SSIM of 0.801.
 
Table 3 Comparison of module ablation experiments

| Operation | SSIM | PSNR |
|---|---|---|
| W/o the adjustment module | 0.780 | 19.700 |
| W/o the enhancement module | 0.692 | 16.431 |
| Normal | 0.850 | 22.542 |

Note that the best results are highlighted in bold
Table 4 Comparison of network structure ablation experiments

| Operation | SSIM | PSNR |
|---|---|---|
| W/o the ResNet | 0.781 | 20.360 |
| W/o the DenseNet | 0.776 | 18.412 |
| W/o the channel attention | 0.799 | 21.324 |
| W/o the original input | 0.801 | 20.720 |
| Normal | 0.850 | 22.542 |

Note that the best results are highlighted in bold
The results are shown in Table 4.
Finally, ablation experiments are performed on the loss functions. The loss functions are divided into three categories: content loss, MAE loss, and smoothing loss. Content loss is \(\mathcal{L}_{A\text{-content}}\). MAE loss includes \(\mathcal{L}_{R\text{-res}}\), \(\mathcal{L}_{L\text{-res}}\), and \(\mathcal{L}_{A\text{-res}}\). Smoothing loss includes \(\mathcal{L}_{\text{smooth-loss}}\), \(\mathcal{L}_{R\text{-smooth}}\), and \(\mathcal{L}_{L\text{-smooth}}\). Several combinations of these loss functions are evaluated; the results are shown in Table 5. For each module, different loss functions were ablated to assess their individual contributions. Due to the numerous combinations, we selected the most significant ones for our ablation study. Moreover, for methods based on the Retinex model, the decomposition loss function is a prerequisite, and therefore, ablation on it was not feasible. The results indicate that the loss functions play a crucial role in guiding the model to learn image enhancement effectively.

Conclusions

This paper introduces an end-to-end low-light image enhancement network model grounded in Retinex theory. The network comprises three primary modules: a decomposition module, an enhancement module, and an adjustment module. Within the decomposition module, the network adeptly decomposes images under various lighting conditions. The enhancement module includes an object degradation recovery network that effectively restores the degradation induced by poor illumination in low-light images and a lighting adjustment network that significantly enhances image contrast. The adjustment module conducts complex feature alignment, not only perfectly restoring the details and color accuracy with the support of the original image but also generating high-quality images.
DEANet++ has demonstrated exceptional performance on the LOL dataset, achieving the highest PSNR and SSIM scores. This underscores the effectiveness of the proposed method, which utilizes illumination guidance to mitigate object reflection degradation and noise, integrates the recovered object degradation components with illumination enhancement, and subsequently applies these improvements to enhance low-light images. The versatility of the proposed model extends beyond its current application; it has shown impressive results in dehazing images, restoring clarity to seabed images, and eliminating image artifacts. Furthermore, when combined with algorithms for object detection and face recognition, DEANet++ significantly broadens its utility in low-light environments, thereby expanding its applicability in the field of computer vision.
Table 5 Comparison of loss function ablation experiments

| Operation | SSIM | PSNR |
|---|---|---|
| MAE loss | 0.792 | 23.771 |
| MAE loss (Enhancement Module) + content loss (Adjustment Module) | 0.819 | 21.242 |
| Smoothing loss (Enhancement Module) + MAE loss (Adjustment Module) | 0.823 | 21.962 |
| MAE loss + content loss + smoothing loss | 0.850 | 22.542 |

Note that the best results are highlighted in bold

Declarations

We hereby declare that there are no conflicts of interest in the submission of this manuscript and that the content is an original work created by the authors. All co-authors have taken responsibility for critically reviewing and revising its content.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Literatur
1.
Zurück zum Zitat Jiang Y, Li L, Zhu J, Xue Y, Ma H. DEANet: decomposition enhancement and adjustment network for low-light image enhancement. Tsinghua Sci Technol. 2023;28(4):743–53.CrossRef Jiang Y, Li L, Zhu J, Xue Y, Ma H. DEANet: decomposition enhancement and adjustment network for low-light image enhancement. Tsinghua Sci Technol. 2023;28(4):743–53.CrossRef
2.
Zurück zum Zitat Pisano ED, Zong S, Hemminger BM, DeLuca M, Johnston RE, Muller K, Braeuning MP, Pizer SM. Contrast limited adaptive histogram equalization image processing to improve the detection of simulated spiculations in dense mammograms. J Digit Imaging. 1998;11(4):193–200.CrossRef Pisano ED, Zong S, Hemminger BM, DeLuca M, Johnston RE, Muller K, Braeuning MP, Pizer SM. Contrast limited adaptive histogram equalization image processing to improve the detection of simulated spiculations in dense mammograms. J Digit Imaging. 1998;11(4):193–200.CrossRef
3.
Zurück zum Zitat Cheng H-D, Shi X. A simple and effective histogram equalization approach to image enhancement. Digit Signal Process. 2004;14(2):158–70.CrossRef Cheng H-D, Shi X. A simple and effective histogram equalization approach to image enhancement. Digit Signal Process. 2004;14(2):158–70.CrossRef
4.
Zurück zum Zitat Celik T, Tjahjadi T. Contextual and variational contrast enhancement. IEEE Trans Image Process. 2011;20(12):3431–41.MathSciNetCrossRef Celik T, Tjahjadi T. Contextual and variational contrast enhancement. IEEE Trans Image Process. 2011;20(12):3431–41.MathSciNetCrossRef
5.
Zurück zum Zitat Lee C, Lee C, Kim C-S. Contrast enhancement based on layered difference representation of 2D histograms. IEEE Trans Image Process. 2013;22(12):5372–84.CrossRef Lee C, Lee C, Kim C-S. Contrast enhancement based on layered difference representation of 2D histograms. IEEE Trans Image Process. 2013;22(12):5372–84.CrossRef
6.
Zurück zum Zitat Land EH. The Retinex theory of color vision. Sci Am. 1977;237(6):108–29.CrossRef Land EH. The Retinex theory of color vision. Sci Am. 1977;237(6):108–29.CrossRef
7.
Zurück zum Zitat Jobson DJ, Rahman Z-U, Woodell GA. Properties and performance of a center/surround Retinex. IEEE Trans Image Process. 1997;6(3):451–62.CrossRef Jobson DJ, Rahman Z-U, Woodell GA. Properties and performance of a center/surround Retinex. IEEE Trans Image Process. 1997;6(3):451–62.CrossRef
8.
Zurück zum Zitat Jobson DJ, Rahman Z-U, Woodell GA. A multiscale Retinex for bridging the gap between color images and the human observation of scenes. IEEE Trans Image Process. 1997;6(7):965–76.CrossRef Jobson DJ, Rahman Z-U, Woodell GA. A multiscale Retinex for bridging the gap between color images and the human observation of scenes. IEEE Trans Image Process. 1997;6(7):965–76.CrossRef
9.
Zurück zum Zitat Dabov K, Foi A, Katkovnik V, Egiazarian K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans Image Process. 2007;16(8):2080–95.MathSciNetCrossRef Dabov K, Foi A, Katkovnik V, Egiazarian K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans Image Process. 2007;16(8):2080–95.MathSciNetCrossRef
10.
Zurück zum Zitat Hai J, Xuan Z, Yang R, Hao Y, Zou F, Lin F, Han S. R2RNet: low-light image enhancement via real-low to real-normal network. J Vis Commun Image Represent. 2023;90:103712.CrossRef Hai J, Xuan Z, Yang R, Hao Y, Zou F, Lin F, Han S. R2RNet: low-light image enhancement via real-low to real-normal network. J Vis Commun Image Represent. 2023;90:103712.CrossRef
11.
Zurück zum Zitat Ren X, Yang W, Cheng W-H, Liu J. LR3M: robust low-light enhancement via low-rank regularized Retinex model. IEEE Trans Image Process. 2020;29:5862–76.MathSciNetCrossRef Ren X, Yang W, Cheng W-H, Liu J. LR3M: robust low-light enhancement via low-rank regularized Retinex model. IEEE Trans Image Process. 2020;29:5862–76.MathSciNetCrossRef
12.
Zurück zum Zitat Ma L, Ma T, Liu R, Fan X, Luo Z. Toward fast, flexible, and robust low-light image enhancement. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2022. p. 5637–46. Ma L, Ma T, Liu R, Fan X, Luo Z. Toward fast, flexible, and robust low-light image enhancement. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2022. p. 5637–46.
13.
Zurück zum Zitat Wu W, Weng J, Zhang P, Wang X, Yang W, Jiang J. URetinex-Net: Retinex-based deep unfolding network for low-light image enhancement. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2022. p. 5901–10. Wu W, Weng J, Zhang P, Wang X, Yang W, Jiang J. URetinex-Net: Retinex-based deep unfolding network for low-light image enhancement. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2022. p. 5901–10.
14.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition; 2016. p. 770–8.
15.
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks; 2017. p. 4700–8.
16.
Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. Springer; 2015. p. 234–41.
17.
Abdullah-Al-Wadud M, Kabir MH, Dewan MAA, Chae O. A dynamic histogram equalization for image contrast enhancement. IEEE Trans Consum Electron. 2007;53(2):593–600.
18.
Liang Z, Xu J, Zhang D, Cao Z, Zhang L. A hybrid l1-l0 layer decomposition model for tone mapping; 2018. p. 4758–66.
19.
Shibata T, Tanaka M, Okutomi M. Gradient-domain image reconstruction framework with intensity-range and base-structure constraints; 2016. p. 2745–53.
20.
Aydin TO, Stefanoski N, Croci S, Gross M, Smolic A. Temporally coherent local tone mapping of HDR video. ACM Trans Graph (TOG). 2014;33(6):1–13.
21.
Farbman Z, Fattal R, Lischinski D, Szeliski R. Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM Trans Graph (TOG). 2008;27(3):1–10.
22.
Demir Y, Kaplan NH. Low-light image enhancement based on sharpening-smoothing image filter. Digit Signal Process. 2023;138:104054.
23.
Lu C-M, Yang S-J, Fuh C-S. Edge-aware image processing with a Laplacian pyramid by using cascade piecewise linear processing.
24.
25.
Guo X, Li Y, Ling H. LIME: low-light image enhancement via illumination map estimation. IEEE Trans Image Process. 2016;26(2):982–93.
26.
Wang S, Zheng J, Hu H-M, Li B. Naturalness preserved enhancement algorithm for non-uniform illumination images. IEEE Trans Image Process. 2013;22(9):3538–48.
27.
Fu X, Zeng D, Huang Y, Liao Y, Ding X, Paisley J. A fusion-based enhancing method for weakly illuminated images. Signal Process. 2016;129:82–96.
28.
Fu X, Zeng D, Huang Y, Zhang X-P, Ding X. A weighted variational model for simultaneous reflectance and illumination estimation; 2016. p. 2782–90.
29.
Sang Y, Li T, Zhang S, Yang Y. RARNet fusing image enhancement for real-world image rain removal. Appl Intell. 2022;52(2):2037–50.
30.
Xu Q, Liu S, Liu J, Luo B. Cognitively-inspired multi-scale spectral-spatial transformer for hyperspectral image super-resolution. Cogn Comput. 2023;1–15.
31.
Chu Y, Qiao Y, Liu H, Han J. Dual attention with the self-attention alignment for efficient video super-resolution. Cogn Comput. 2022;14(3):1140–51.
32.
Huang X, Mao Y, Li J, Wu S, Chen X, Lu H. CRUN: a super lightweight and efficient network for single-image super resolution. Appl Intell. 2023;1–13.
33.
Cai B, Xu X, Jia K, Qing C, Tao D. DehazeNet: an end-to-end system for single image haze removal. IEEE Trans Image Process. 2016;25(11):5187–98.
34.
Lore KG, Akintayo A, Sarkar S. LLNet: a deep autoencoder approach to natural low-light image enhancement. Pattern Recognit. 2017;61:650–62.
36.
Lv F, Lu F, Wu J, Lim C. MBLLEN: low-light image/video enhancement using CNNs. In: BMVC; 2018. vol. 220, p. 4.
37.
Zhang Y, Zhang J, Guo X. Kindling the darkness: a practical low-light image enhancer. In: Proceedings of the 27th ACM international conference on multimedia; 2019. p. 1632–40.
38.
Lim S, Kim W. DSLR: deep stacked Laplacian restorer for low-light image enhancement. IEEE Trans Multimed. 2020;23:4272–84.
39.
Zhang Y, Guo X, Ma J, Liu W, Zhang J. Beyond brightening low-light images. Int J Comput Vis. 2021;129(4):1013–37.
40.
Li J, Feng X, Hua Z. Low-light image enhancement via progressive-recursive network. IEEE Trans Circ Syst Video Technol. 2021;31(11):4227–40.
41.
Dhara SK, Sen D. Exposedness-based noise-suppressing low-light image enhancement. IEEE Trans Circ Syst Video Technol. 2021;32(6):3438–51.
42.
Lv F, Li Y, Lu F. Attention guided low-light image enhancement with a large scale low-light simulation dataset. Int J Comput Vis. 2021;129(7):2175–93.
43.
Li J, Feng X, Hua Z. Low-light image enhancement via progressive-recursive network. IEEE Trans Circ Syst Video Technol. 2021;31(11):4227–40.
44.
Xu K, Chen H, Xu C, Jin Y, Zhu C. Structure-texture aware network for low-light image enhancement. IEEE Trans Circ Syst Video Technol. 2022.
45.
Cai Y, Bian H, Lin J, Wang H, Timofte R, Zhang Y. Retinexformer: one-stage Retinex-based transformer for low-light image enhancement. In: Proceedings of the IEEE/CVF international conference on computer vision; 2023. p. 12504–13.
46.
Qian Y, Jiang Z, He Y, Zhang S, Jiang S. Multi-scale error feedback network for low-light image enhancement. Neural Comput Appl. 2022;34(23):21301–17.
47.
Zhu H, Wang K, Zhang Z, Liu Y, Jiang W. Low-light image enhancement network with decomposition and adaptive information fusion. Neural Comput Appl. 2022;34(10):7733–48.
48.
Wang W, Wei C, Yang W, Liu J. GLADNet: low-light enhancement network with global awareness. In: 2018 13th IEEE international conference on automatic face & gesture recognition (FG 2018). IEEE; 2018. p. 751–5.
49.
Guo C, Li C, Guo J, Loy CC, Hou J, Kwong S, Cong R. Zero-reference deep curve estimation for low-light image enhancement. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2020. p. 1780–9.
50.
Jiang Y, Gong X, Liu D, Cheng Y, Fang C, Shen X, Yang J, Zhou P, Wang Z. EnlightenGAN: deep light enhancement without paired supervision. IEEE Trans Image Process. 2021;30:2340–9.
51.
Liu R, Ma L, Zhang J, Fan X, Luo Z. Retinex-inspired unrolling with cooperative prior architecture search for low-light image enhancement; 2021. p. 10561–70.
52.
Zhou Z, Feng Z, Liu J, Hao S. Single-image low-light enhancement via generating and fusing multiple sources. Neural Comput Appl. 2020;32(11):6455–65.
53.
Shen L, Yue Z, Feng F, Chen Q, Liu S, Ma J. MSR-net: low-light image enhancement using deep convolutional network. 2017. arXiv:1711.02488.
54.
Ying Z, Li G, Ren Y, Wang R, Wang W. A new low-light image enhancement algorithm using camera response model; 2017. p. 3015–22.
55.
Yu S-Y, Zhu H. Low-illumination image enhancement algorithm based on a physical lighting model. IEEE Trans Circ Syst Video Technol. 2017;29(1):28–37.
56.
Zhang Y, Di X, Zhang B, Ji R, Wang C. Better than reference in low-light image enhancement: conditional re-enhancement network. IEEE Trans Image Process. 2021;31:759–72.
57.
Tu Z, Talebi H, Zhang H, Yang F, Milanfar P, Bovik A, Li Y. MAXIM: multi-axis MLP for image processing; 2022. p. 5769–80.
58.
Cui Y, Sun Y, Jian M, Zhang X, Yao T, Gao X, Li Y, Zhang Y. A novel underwater image restoration method based on decomposition network and physical imaging model. Int J Intell Syst. 2022;37(9):5672–90.
59.
Deeba F, Dharejo FA, Zawish M, Memon FH, Dev K, Naqvi RA, Zhou Y, Du Y. A novel image dehazing framework for robust vision-based intelligent systems. Int J Intell Syst. 2021.
60.
61.
Lee C, Lee C, Lee Y-Y, Kim C-S. Power-constrained contrast enhancement for emissive displays based on histogram equalization. IEEE Trans Image Process. 2011;21(1):80–93.
62.
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process. 2004;13(4):600–12.
63.
Zhang L, Zhang L, Mou X, Zhang D. FSIM: a feature similarity index for image quality assessment. IEEE Trans Image Process. 2011;20(8):2378–86.
64.
Xue W, Zhang L, Mou X, Bovik AC. Gradient magnitude similarity deviation: a highly efficient perceptual image quality index. IEEE Trans Image Process. 2013;23(2):684–95.
65.
Mittal A, Soundararajan R, Bovik AC. Making a “completely blind” image quality analyzer. IEEE Signal Process Lett. 2012;20(3):209–12.
Metadata
Title: A Joint Network for Low-Light Image Enhancement Based on Retinex
Authors: Yonglong Jiang, Jiahe Zhu, Liangliang Li, Hongbing Ma
Publication date: 16.09.2024
Publisher: Springer US
Published in: Cognitive Computation
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI: https://doi.org/10.1007/s12559-024-10347-4