Published in: Journal of the Brazilian Computer Society 1/2018

Open Access 01.12.2018 | Research

Referenceless image quality assessment by saliency, color-texture energy, and gradient boosting machines

Authors: Pedro Garcia Freitas, Welington Y. L. Akamine, Mylène C. Q. Farias



Abstract

In most practical multimedia applications, processes are used to manipulate the image content. These processes include compression, transmission, or restoration techniques, which often create distortions that may be visible to human subjects. The design of algorithms that can estimate the visual similarity between a distorted image and its non-distorted version, as perceived by a human viewer, can lead to significant improvements in these processes. Therefore, over the last decades, researchers have been developing quality metrics (i.e., algorithms) that estimate the quality of images in multimedia applications. These metrics can make use of either the full pristine content (full-reference metrics) or only of the distorted image (referenceless metrics). This paper introduces a novel referenceless image quality assessment (RIQA) metric, which provides significant improvements when compared to other state-of-the-art methods. The proposed method combines statistics of the opposite color local variance pattern (OC-LVP) descriptor with statistics of the opposite color local salient pattern (OC-LSP) descriptor. Both OC-LVP and OC-LSP descriptors, which are proposed in this paper, are extensions of the opposite color local binary pattern (OC-LBP) operator. Statistics of these operators generate features that are mapped into subjective quality scores using a machine-learning approach. Specifically, to fit a predictive model, features are used as input to a gradient boosting machine (GBM). Results show that the proposed method is robust and accurate, outperforming other state-of-the-art RIQA methods.
Abbreviations

AGN: Additive Gaussian noise
ANCC: Additive noise in color components
BMP: Bitmap
BMS: Boolean Map Saliency
BRISQUE: Blind/referenceless image spatial quality evaluator
CA: Chromatic aberration
CC: Contrast change
CCS: Change of color saturation
CD: Contrast decrements
CN: Comfort noise
CNN: Convolutional neural networks
CORNIA: Codebook Representation for No-Reference Image Assessment
CQA: Curvelet-based quality assessment
CSIQ: Computational and subjective image quality
DCT: Discrete cosine transform
DS: Distortion-specific
FF: Fast fading
FR: Full-reference
FR-IQA: Full-reference image quality assessment
GB: Gaussian blur
GBM: Gradient boosting machine
GP: General-purpose
GP-IQA: General-purpose image quality assessment
GP-RIQA: General-purpose referenceless image quality assessment
HFN: High-frequency noise
HSV: Hue, saturation, and value
HVS: Human visual system
ICQ: Image color quantization
ID: Image denoising
IN: Impulse noise
IQA: Image quality assessment
IS: Intensity shift
JPEG: Joint Photographic Experts Group
JPEG+TE: JPEG transmission errors
JPEG2k: JPEG 2000
JPEG2k+TE: JPEG2k transmission errors
KRCC: Kendall rank order correlation coefficient
LBD: Local block-wise distortions
LBP: Local binary pattern
LC: Lossy compression
LCC: Linear correlation coefficient
LIVE: Laboratory for Image and Video Engineering
LTP: Local ternary patterns
LVP: Local variance pattern
MGN: Multiplicative Gaussian noise
ML: Machine learning
MN: Masked noise
MSD: Mean squared deviation
NDS: Non-distortion-specific
NDS-GP-RIQA: Non-distortion-specific general-purpose referenceless image quality assessment
NEPN: Non-eccentricity pattern noise
NFERM: No-reference free energy principle metric
NSS: Natural scene statistics
OC-LBP: Opponent color local binary patterns
OC-LSP: Opposite color local salient patterns
OC-LVP: Opposite color local variance patterns
PN: Pink noise
PNG: Portable Network Graphics
PSNR: Peak signal-to-noise ratio
QN: Quantization noise
RGB: Red, green, and blue
RIQA: Referenceless image quality assessment
RIQMC: Reduced-reference image quality metric for contrast change
RR: Reduced-reference
SCN: Spatially correlated noise
SROCC: Spearman rank order correlation coefficient
SSEQ: Spatial and spectral entropies quality assessment
SSIM: Structural similarity
SSR: Sparse sampling and reconstruction
SVR: Support vector regression
VSM: Visual saliency models
WN: White noise
YCbCr: Luma (Y), blue-difference chroma (Cb), and red-difference chroma (Cr) color space

Background

The rapid growth of the multimedia industry, and the consequent increase in content quality requirements, have prompted the interest in visual quality assessment methodologies [1]. Because most multimedia applications are designed for human observers, visual perception has to be considered when measuring visual quality [2]. Psychophysical experiments (or subjective quality assessment methods) performed with human subjects are considered the most accurate methods to assess visual quality [3]. However, these subjective methods are costly, time-consuming, and, for this reason, not adequate for real-time multimedia applications.
Objective quality assessment metrics predict visual quality employing mathematical methods instead of human subjects. For instance, mean squared deviation (MSD) and peak signal-to-noise ratio (PSNR) are mathematical methods that can be used to measure the similarity of visual signals. However, MSD and PSNR scores often do not correlate well with the image quality as perceived by human observers (i.e., subjective scores) [4]. It is worth mentioning that, for an objective metric to be used in multimedia applications, its estimates must be well correlated with the quality scores from publicly available quality databases, which use standardized experimental procedures to measure the quality of a comprehensive set of visual signals.
Metrics can be classified according to the amount of reference information (pristine content) required by the method. While full-reference (FR) metrics require the original content, reduced-reference (RR) metrics demand only part of the original information. Since the reference (or even partial reference information) is not available in many multimedia applications, there is a need for referenceless metrics that do not require any information about the reference image.
The development of referenceless image quality assessment (RIQA) methods remains a challenging problem [2, 5]. A popular approach consists of estimating image quality using distortion-specific (DS) methods that measure the intensity of the most relevant image distortions. Among the state-of-the-art DS methods, we can cite the papers of Fang et al. [6], Bahrami and Kot [7], Golestaneh and Chandler [8], and Li et al. [9–11]. These methods make assumptions about the type of distortion present in the signal and, as a consequence, have limited applications in more diverse multimedia scenarios.
Non-distortion-specific (NDS) methods, which do not demand prior knowledge about the types of distortions in the signal, are more suitable for diverse multimedia scenarios. In this case, instead of making assumptions about the main characteristics of specific distortions, the methods make assumptions about the image characteristics. For instance, to find the relationship between gradient information and image quality, Liu et al. [12] and Li et al. [13] make assumptions about the structure of reference images in the gradient domain. Some methods compare the statistics of impaired and non-impaired (natural) images using a “natural scene statistics” (NSS) approach [14, 15].
In addition to the aforementioned approaches, IQA methods can be classified as feature-based or human visual system (HVS)-based approaches. Feature-based approaches extract and analyze features from image signals to estimate quality. Usually, these approaches require three steps. In the first step, descriptive features are extracted. Then, the extracted features are pooled to produce a quality-aware feature vector. Finally, a model maps the pooled data into a numerical value that represents the quality score of the image under test. One example of a feature-based metric is the work of Mittal et al. [16], which is a spatial-domain method based on NSS. Saad et al. [14, 17] proposed another feature-based NSS method that operates in the discrete cosine transform (DCT) domain. Finally, Liu et al. [18] proposed a feature-based method that uses spatial and spectral image entropies. More recently, several works have proposed feature extraction based on texture information to estimate image quality [19–27].
Instead of extracting basic features from images, HVS-based approaches aim to mimic the HVS behavior. Hitherto, various HVS properties have been used in quality metrics, including structural information [28, 29] and error and brightness sensitivities [30, 31]. The acclaimed structural similarity index (SSIM) [32] is based on the assumption that the HVS is more sensitive to the structural information of the visual content and, therefore, that a structural similarity measure can provide a good estimate of the perceived image quality. The recent free energy theory posits that the HVS strives to comprehend the input visual signal by reducing its undetermined portions, which affects the perception of quality [33]. Zhang et al. [34] proposed a Riesz transform-based feature similarity index (RFSIM) that characterizes local structures of images and uses a Canny edge detector to generate a pooling mask. More recently, HVS-based methods employing convolutional neural networks (CNN) have been proposed [35–37]. These CNN-based methods are motivated by the correspondence between the hierarchy of the human visual areas and the layers of a CNN [38, 39].
In recent years, HVS-based image quality approaches that incorporate visual saliency models (VSM) have become a trend [40–43]. Image quality metrics and VSM are inherently correlated because both take into account how the HVS perceives the visual content (i.e., how humans perceive suprathreshold distortions) [42]. Since VSM provide a measure of the importance of each region, they can be used to weight distortions in image quality algorithms. Several researchers have studied how saliency information can be incorporated into visual quality metrics to enhance their performance [41, 44–47]. However, most VSM-based quality metrics are FR approaches. Among the existing VSM-based RIQA methods, most are DS methods that cannot be used as general-purpose RIQA (GP-RIQA) methods.
Additionally, most current GP-IQA methods do not have a good prediction accuracy for color- and contrast-distorted images. For instance, Ortiz-Jaramillo et al. [48] demonstrated that current color difference measures (i.e., FR-IQA methods that compute color differences between processed and reference images) present little correlation with subjective quality scores. Also, even though some DS-IQA methods are able to predict the quality of contrast-distorted images [49], most GP-IQA methods have a poor prediction performance for them. Because of this low performance, authors often omit the results for these types of image distortions [18, 20, 23, 50].
In this paper, we introduce an NDS-GP-RIQA method based on machine learning (ML) that tackles these limitations by taking into account how impairments affect salient color-texture and energy information. The introduced method is based on the statistics of two newly proposed descriptors: the opponent color local variance pattern (OC-LVP) and the opposite color local salient pattern (OC-LSP). These descriptors are extensions of the opponent color local binary pattern (OC-LBP) [51] that combine feature-based and HVS-based approaches. More specifically, the OC-LSP extends the OC-LBP by encoding spatial, color, and saliency information, using a VSM to weight the OC-LBP statistics. The OC-LVP descriptor uses concepts introduced by the local variance pattern (LVP) [52] to modify the OC-LBP and measure the color-texture energy. The method uses the statistics of OC-LVP and OC-LSP as input to a gradient boosting machine (GBM) [53, 54] that learns the predictive quality model via regression. When compared to our previous work [52], we use the OC-LSP and OC-LVP operators instead of the simpler LVP operator. The design of the metric was also modified to use a GBM instead of the random forest regression algorithm.
The rest of this paper is organized as follows. In the “A brief review of local binary patterns” section, the basics of texture analysis are reviewed. In the “Opponent color local binary pattern” section, the base color-texture descriptor is summarized. In the “Opponent color local salient pattern” and “Opposite color local variance pattern” sections, the proposed descriptors are detailed. In the “Feature extraction” and “Gradient boosting machine for regression” sections, we describe how to use the proposed descriptors to predict image quality without references. An extensive analysis of the results is presented in the “Results and discussion” section. Finally, the “Conclusions” section concludes this paper.

Methods

In this section, we review the basic texture operator, the local binary pattern (LBP), and its improved color-texture extension, the opponent color local binary pattern (OC-LBP). Then, we describe the proposed quality-aware descriptors, named the opponent color local salient pattern (OC-LSP) and the opposite color local variance pattern (OC-LVP). Finally, the section concludes with the proposed quality assessment method based on these operators.

A brief review of local binary patterns

The local binary pattern (LBP) is indubitably one of the most effective texture descriptors available for texture analysis of digital images. It was first proposed by Ojala et al. [55] as a specific case of the texture spectrum model [56]. Let \(I \in \mathbb {R}^{m \times n}\) be the image whose texture we want to describe. The ordinary LBP takes the form:
$$ {LBP}_{R, P}(I_{c}) = \sum\limits_{p=0}^{P-1} f \left(I_{p}, I_{c}, p \right), $$
(1)
where
$$ f \left(I_{p}, I_{c}, p\right) = S\left(I_{p} - I_{c}\right) \cdot 2^{p} $$
(2)
and
$$ S(t) =\left\{ \begin{array}{ll} 1, & \text{if}~t \geq 0, \\ 0, & \text{otherwise}. \end{array}\right. $$
(3)
In Eq. 1, \(I_{c}=I(x,y)\) is an arbitrary central pixel at position (x,y) and \(I_{p}=I(x_{p},y_{p})\) is a neighboring pixel surrounding \(I_{c}\), where:
$$ x_{p} = x + R \cos\left(2 \pi \cdot \frac{p}{P}\right), $$
and
$$ y_{p} = y - R \sin\left(2 \pi \cdot \frac{p}{P}\right). $$
In this case, P is the number of neighboring pixels sampled at a distance R from \(I_{c}\). Figure 1 illustrates examples of symmetric samplings for different numbers of neighboring points (P) and radius (R) values.
Figure 2 exemplifies the steps of applying the LBP operator to a single pixel (\(I_{c}=35\)), located in the center of a 3×3 image block, as shown in the bottom-left of this figure. The numbers in the yellow squares of the block represent the order in which the operator is computed (counter-clockwise direction starting from 0). In this figure, we use a unitary neighborhood radius (R=1) and eight neighboring pixels (P=8). After calculating S(t) (see Eq. 3) for each neighboring pixel \(I_{p}\), we obtain a binary output for each \(I_{p}\) (0≤p≤7), as illustrated in the block in the upper-left position of Fig. 2. In this block, black circles correspond to “0” and white circles to “1”. These binary outputs are stored in a binary number, according to their position (yellow squares). Then, the resulting binary number is converted to the decimal format. For a complete image, we use the LBP operator to obtain a decimal number for each pixel of the image, by making \(I_{c}\) equal to the current pixel.
When an image is rotated, the \(I_{p}\) values move along the perimeter of the circumference (around \(I_{c}\)), producing a circular shift in the generated binary number. As a consequence, a different decimal \({LBP}_{R,P}(I_{c})\) value is obtained. To remove this effect, we assign a unique identifier to each rotation, generating a rotation-invariant LBP:
$$ {LBP}_{R,P}^{ri}(I_{c}) = \min \left\{ ROTR\left(LBP_{R, P}(I_{c}), k\right) \right\}, $$
(4)
where k={0,1,2,⋯,P−1} and ROTR(x,k) is the circular bit-wise right shift operator that shifts the P-tuple x by k positions.
Due to the primitive quantization of the angular space [57, 58], LBPR,P and \({LBP}_{R,P}^{ri}\) operators do not always provide a good discrimination [58]. To improve the discriminability of the LBP operator, Ojala et al. [55] proposed an improved operator that captures fundamental pattern properties. These fundamental patterns are called “uniform” and computed as follows:
$$ {LBP}_{R,P}^{u}(I_{c}) =\left\{ \begin{array}{ll} \sum\limits_{p=0}^{P-1} f\left(I_{p}, I_{c}, p\right) & U\left({LBP}_{R,P}^{ri}\right) \leq 2, \\ P+1 & \text{otherwise}, \end{array}\right. $$
(5)
where U(LBPP,R) is the uniform pattern given by:
$$ U({LBP}_{P,R}) = \Delta\left(I_{P-1}, I_{0}\right) + \sum\limits_{p=1}^{P-1} \Delta\left(I_{p}, I_{p-1}\right), $$
(6)
and
$$ \Delta\left(I_{x}, I_{y}\right) = | S(I_{x} - I_{c}) - S(I_{y} - I_{c}) |. $$
(7)
In addition to a better discriminability, the uniform LBP operator (Eq. 5) has the advantage of generating fewer distinct LBP labels. While the “nonuniform” operator (Eq. 1) produces \(2^{P}\) different output values, the uniform operator produces only P+2 distinct output values, and the “rotation invariant” operator produces P(P−1)+2 distinct output values.
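To make the operator concrete, the snippet below sketches Eqs. 1–4 for a grayscale image in Python. It is only an illustration, not the implementation used in the paper: the function names are ours, the P circular neighbors are taken with nearest-neighbor rounding instead of interpolation, and border pixels are skipped.

```python
import numpy as np

def lbp_map(image, R=1, P=8):
    """Basic LBP of Eqs. 1-3: one decimal label per interior pixel."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    labels = np.zeros((h, w), dtype=np.int64)
    # Integer offsets of the P sampling points (x_p, y_p) around the center.
    offsets = [(int(round(R * np.cos(2 * np.pi * p / P))),
                int(round(-R * np.sin(2 * np.pi * p / P)))) for p in range(P)]
    r = int(np.ceil(R))
    for y in range(r, h - r):
        for x in range(r, w - r):
            code = 0
            for p, (dx, dy) in enumerate(offsets):
                if img[y + dy, x + dx] >= img[y, x]:   # S(I_p - I_c), Eq. 3
                    code += 1 << p                     # weighted by 2^p, Eq. 2
            labels[y, x] = code                        # Eq. 1
    return labels

def rotation_invariant(code, P=8):
    """Eq. 4: minimum over all circular bit-wise right shifts (ROTR)."""
    mask = (1 << P) - 1
    return min((((code >> k) | (code << (P - k))) & mask) for k in range(P))
    # e.g., rotation_invariant(0b00001110) == rotation_invariant(0b00000111)

# Toy usage: the basic operator yields up to 2^P = 256 distinct labels.
toy = np.random.default_rng(0).integers(0, 256, size=(64, 64))
print(np.unique(lbp_map(toy)).size)
```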

Opponent color local binary pattern

The LBP operator is designed to characterize the texture of grayscale images. Although this restriction may not affect many applications, it may be unfavorable for image quality assessment purposes because the LBP is not sensitive to some types of impairments, such as contrast distortions or chromatic aberrations. As pointed out by Maenpaa et al. [51], texture and color have interdependent roles. When luminance-based texture descriptors (e.g., LBP) achieve good results, color descriptors can also obtain good results. However, when color descriptors are unsuccessful, luminance texture descriptors can still present a good performance. For this reason, operators that integrate both color and texture information tend to be more successful at predicting the quality of images with a wider range of distortions.
In order to integrate color and texture into a single descriptor, Maenpaa et al. [51] introduced the opponent color local binary pattern (OC-LBP). The OC-LBP extends the LBP operator by incorporating color information while keeping the texture information. This color-texture descriptor is an extension of the operator proposed by Jain and Healey [59], replacing Gabor filtering with an LBP-inspired operator.
The OC-LBP descriptor operates on intra-channel and inter-channel color dimensions. In the intra-channel operation, the LBP operator is applied individually to each color channel, instead of being applied only to a single luminance channel. This approach is called “intra-channel” because the central pixel and the corresponding sampled neighboring points belong to the same color channel.
In the “inter-channel” operation, the central pixel belongs to a color channel and its corresponding neighboring points are necessarily sampled from another color channel. Therefore, for a three-channel color space, such as HSV, there are six possible combinations of channels: OC-LBP HS, OC-LBP SH, OC-LBP HV, OC-LBP VH, OC-LBP SV, and OC-LBP VS.
Figure 3 illustrates the sampling approach of OC-LBP when the central pixel is sampled in the R channel of a RGB image. From this figure, we can notice that two combinations are possible: OC-LBP RG (left) and OC-LBP RB (right). In OC-LBP RG, the gray circle in the red channel is the central point, while the green circles in the green channel correspond to “0” sampling points and the white circles correspond to “1” sampling points, respectively. Similarly, in OC-LBP RB, the blue circles correspond to “0” sampling points and the white circles correspond to “1” sampling points, respectively.
After computing the OC-LBP operator for all pixels of a given image, a total of six texture maps are generated. As depicted in Fig. 4, three intra-channel maps and three inter-channel maps are generated for each color space. Although all possible combinations of the opposite color channels allow six distinct inter-channel maps, we observed that the symmetric opposing pairs are very redundant (e.g., OC-LBP RG is equivalent to OC-LBP GR, OC-LBP HS is equivalent to OC-LBP SH, and so on). Due to this redundancy, only the three most descriptive inter-channel maps are used.
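A compact way to sketch the inter-channel sampling is shown below: the central pixel comes from one channel and the thresholded neighbors from another, so that passing the same channel twice recovers the ordinary intra-channel LBP. The function name and the wrap-around border handling (via np.roll) are our choices for illustration and are not prescribed by the paper.

```python
import numpy as np

def oc_lbp_map(center_ch, neigh_ch, R=1, P=8):
    """Inter-channel OC-LBP sketch: P circular neighbors are taken from
    neigh_ch and thresholded against the central pixel of center_ch."""
    c = np.asarray(center_ch, dtype=np.float64)
    n = np.asarray(neigh_ch, dtype=np.float64)
    labels = np.zeros_like(c, dtype=np.int64)
    for p in range(P):
        dx = int(round(R * np.cos(2 * np.pi * p / P)))
        dy = int(round(-R * np.sin(2 * np.pi * p / P)))
        # Shift the neighbor channel so each position holds its (dx, dy) neighbor.
        shifted = np.roll(np.roll(n, -dy, axis=0), -dx, axis=1)
        labels += (shifted >= c).astype(np.int64) << p
    return labels

# For an RGB image, the three inter-channel maps of Fig. 4 would be
# oc_lbp_map(R, G), oc_lbp_map(R, B), and oc_lbp_map(G, B).
```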

Opponent color local salient pattern

Although the OC-LBP increases the discriminability of the LBP by incorporating color-texture information, it does not necessarily mimic the human visual system (HVS) behavior. To generate a general-purpose descriptor that incorporates visual attention, we modify the OC-LBP by incorporating visual saliency information. The modified descriptor is named opponent color local salient pattern (OC-LSP). Basically, we compute the OC-LBP for all pixels of an image, obtaining the intra- and inter-channel maps of the image (see Fig. 4). More formally, let \(\mathcal {L} \in \{ \text {LBP}_{X}, \text {LBP}_{Y}, \text {LBP}_{Z}, \text {OC-LBP}_{XY}, \text {OC-LBP}_{XZ}, \text {OC-LBP}_{YZ} \}\), where XYZ represents any color space (i.e., HSV, CIE Lab, RGB, or YCbCr) normalized to the range [0,255]. Each label \(\mathcal {L}(x,y)\) corresponds to the local texture associated with the pixel I(x,y). We use a VSM to generate a saliency map \(\mathcal {W}\), where each pixel \(\mathcal {W}(x,y)\) corresponds to the saliency of pixel I(x,y). Figure 5a and h depicts an image and its corresponding saliency map, respectively.
The saliency map \(\mathcal {W}\) is used to weight each pixel of the map \(\mathcal {L}\). This weighting process is used to generate a feature vector based on the histogram of \(\mathcal {L}\) weighted by \(\mathcal {W}\). The histogram is given by the following expression:
$$ \mathcal{H} = \left\{ h_{0}, h_{1}, h_{2}, \cdots, h_{P+1} \right\}, $$
(8)
where hϕ is the count of the label \(\mathcal {L}(x,y)\) weighted by \(\mathcal {W}\), as given by:
$$ h_{\phi} = \sum_{x,y} \mathcal{W}(x,y) \cdot \delta(\mathcal{L}(x, y), \phi), $$
(9)
where
$$ \delta(v, u) =\left\{ \begin{array}{ll} 1, & \text{if}\ v=u, \\ 0, & \text{otherwise}. \\ \end{array}\right. $$
(10)
The number of bins of \(\mathcal {H}\) is the number of distinct labels of \(\mathcal {L}\). Therefore, we can remap each \(\mathcal {L}(x, y)\) to its weighted form, generating the map \(\mathcal {S}(x, y)\) that is the local salient pattern (LSP) map. Figure 5 depicts \(\mathcal {S}\).
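Under this description, the weighted histogram of Eqs. 8–10 amounts to accumulating, for each texture label, the saliency of the pixels that carry it. The sketch below follows that reading; the function name and the final normalization are ours.

```python
import numpy as np

def saliency_weighted_histogram(label_map, saliency_map, n_labels):
    """Eqs. 8-10: h_phi is the sum of W(x, y) over pixels with L(x, y) = phi.
    With a constant saliency map this reduces to an ordinary label histogram."""
    labels = np.asarray(label_map).ravel()
    weights = np.asarray(saliency_map, dtype=np.float64).ravel()
    hist = np.bincount(labels, weights=weights, minlength=n_labels)
    return hist / (hist.sum() + 1e-12)   # normalization is our addition

# Example: a uniform-LBP map with P = 8 has P + 2 = 10 distinct labels.
# feature = saliency_weighted_histogram(lbp_u_map, bms_saliency, n_labels=10)
```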

Opposite color local variance pattern

The use of the LBP operator (or of its variants) in IQA is based on the assumption that visual distortions affect image textures and their statistics. Particularly, images with similar distortions, at similar strengths, have textures that share analogous statistical properties. Recently, Freitas et al. [52] used a second assumption, which considers the changes in the spread of the local texture energy that are commonly observed in impaired images. For instance, a Gaussian blur impairment decreases the local texture energy, while a white noise impairment increases it. Therefore, we can use techniques that measure texture energy in RIQA algorithms.
To take into consideration the spread of the texture local energy, Freitas et al. proposed the local variance pattern (LVP) descriptor [52] for quality assessment tasks. The LVP descriptor computes the local texture-energy according to the following formula:
$$ {LVP}_{R,P}^{u}(I_{c}) = \left\lfloor \frac{P \cdot V_{R,P}(I_{c}) - \left[ {LBP}_{R,P}^{u}(I_{c}) \right]^{2}}{P^{2}} \right\rceil, $$
(11)
where:
$$ V_{R,P}(I_{c}) = \sum\limits_{p=0}^{P-1} \left[\, f\left(I_{p}, I_{c}, p\right) \right]^{2}, $$
(12)
and ⌊·⌉ represents the operation of rounding to the nearest integer.
Figure 2 depicts the steps to extract the texture-energy information using the LVP operator. Similar to the LBP operator, an LVP map is generated after computing the LVP descriptor for all pixels of a given image. A comparison between LVP and LBP maps is depicted in Fig. 6. In this figure, the first column corresponds to the reference (undistorted) image, while the three other columns correspond to images impaired with blur, white noise, and JPEG-2K distortions. The first row shows the colored images, while the second and third rows show the corresponding LBP and LVP maps, respectively. Notice that textures are affected differently by different impairments. For instance, the LBP maps (second row of Fig. 6) corresponding to the noisy, blurry, and JPEG-2K compressed images show clear differences among themselves. However, the LBP maps corresponding to the noisy and reference images are similar. This similarity hampers the discrimination between unimpaired and impaired images, degrading the quality prediction. On the other hand, the LVP maps (third row of Fig. 6) clearly show the differences between the impaired and reference images.
Although the LVP descriptor presents a higher discriminability (when compared with the LBP), it does not incorporate color information. To take advantage of the LVP properties and include color information, we combine the OC-LBP and LVP descriptors to produce a new descriptor: the opposite color local variance pattern (OC-LVP). The OC-LVP uses a sampling strategy that is similar to the one used by the OC-LBP descriptor (see Fig. 3), with the difference that it replaces Eq. 5 with Eq. 11. Similar to the OC-LBP, the OC-LVP generates six maps. As depicted in Fig. 7, three LVP intra-channel maps are generated by computing the LVP independently for each color channel. Likewise, three OC-LVP inter-channel maps are computed by sampling the central point in one channel and the neighboring points in another channel (across channels).
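A per-pixel sketch of Eqs. 11 and 12 is given below. It is only an illustration of the formulas: the function name is ours, neighbors are sampled with nearest-neighbor rounding, and the plain LBP label stands in for the uniform \({LBP}_{R,P}^{u}\) of Eq. 11. An OC-LVP inter-channel value would be obtained the same way, with the central pixel taken from one color channel and the neighbors from another, as in Fig. 3.

```python
import numpy as np

def lvp_at(img, y, x, R=1, P=8):
    """LVP at one pixel (Eqs. 11-12), built on the terms f(I_p, I_c, p)."""
    terms = []
    for p in range(P):
        dx = int(round(R * np.cos(2 * np.pi * p / P)))
        dy = int(round(-R * np.sin(2 * np.pi * p / P)))
        s = 1.0 if img[y + dy, x + dx] >= img[y, x] else 0.0   # S(I_p - I_c)
        terms.append(s * 2 ** p)                               # f(I_p, I_c, p)
    v = sum(t ** 2 for t in terms)      # V_{R,P}(I_c), Eq. 12
    lbp = sum(terms)                    # plain LBP label (stands in for LBP^u)
    return int(round((P * v - lbp ** 2) / P ** 2))   # Eq. 11, rounded to nearest
```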

Feature extraction

The proposed RIQA method uses a supervised ML approach. The set of features is extracted, as depicted in Fig. 8. The first step of the feature extraction process consists of splitting the color channels. Using the individual color channels, we compute the OC-LSP maps. In Fig. 5, we observe that, independent of the color space, the intra-channel maps are very similar. This similarity and the invariance between color spaces indicate that intra-channel statistics do not depend on the chosen color space.
The inter-channel maps, on the other hand, are not similar to each other. Moreover, they show considerable differences across the different color spaces. This indicates that different OC-LSP maps are able to extract different information, depending on the color space. Therefore, based on these observations, we use Eq. 8 to compute the histograms \(\mathcal {H}\) of the LSP H, OC-LSP HS, OC-LSP HV, OC-LSP SV, OC-LSP La, OC-LSP Lb, OC-LSP ab, OC-LSP RG, OC-LSP RB, OC-LSP GB, OC-LSP YCb, OC-LSP YCr, and OC-LSP CbCr maps. The concatenation of these histograms generates the OC-LSP feature set.
Finally, the OC-LVP feature set is generated by computing the mean, variance, skewness, kurtosis, and entropy of each map, as depicted in Fig. 8. The concatenation of OC-LVP and OC-LSP feature sets generates the feature vector \(\vec {x}\), which is used as input to a regression algorithm.
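Assembling the final feature vector can be sketched as below, assuming the OC-LSP histograms were already computed with Eq. 8. The helper names are hypothetical, and taking the entropy over a 64-bin histogram of each OC-LVP map is our assumption; the paper does not specify this detail.

```python
import numpy as np
from scipy.stats import skew, kurtosis, entropy

def lvp_map_statistics(lvp_map, n_bins=64):
    """Mean, variance, skewness, kurtosis, and entropy of one (OC-)LVP map."""
    v = np.asarray(lvp_map, dtype=np.float64).ravel()
    hist, _ = np.histogram(v, bins=n_bins, density=True)
    return np.array([v.mean(), v.var(), skew(v), kurtosis(v),
                     entropy(hist + 1e-12)])

def build_feature_vector(oc_lsp_histograms, oc_lvp_maps):
    """Concatenates the OC-LSP histograms with the per-map OC-LVP statistics,
    yielding the feature vector fed to the regressor (Fig. 8)."""
    lsp_part = np.concatenate([np.asarray(h) for h in oc_lsp_histograms])
    lvp_part = np.concatenate([lvp_map_statistics(m) for m in oc_lvp_maps])
    return np.concatenate([lsp_part, lvp_part])
```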

Gradient boosting machine for regression

After concatenating the OC-LVP and OC-LSP feature sets to generate the feature vector \(\vec {x}\), we use it to predict image quality. The prediction is computed using \(\vec {x}\) as input to a gradient boosting machine (GBM). GBMs are a group of powerful ML techniques that have shown substantial success in a wide range of practical applications [53, 54]. In our application, we use a GBM regression model to map \(\vec {x}\) to the database subjective scores.
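As a rough sketch of this regression stage, the snippet below fits an XGBoost regressor (the library mentioned in the “Experimental setup” section) on feature vectors and subjective scores. The synthetic data and the hyperparameter values are purely illustrative and are not those of the paper.

```python
import numpy as np
import xgboost as xgb

# Stand-in data: in practice, each row of X is the feature vector of one
# training image and y holds the corresponding subjective scores (MOS).
rng = np.random.default_rng(0)
X = rng.normal(size=(800, 120))
y = rng.uniform(0.0, 100.0, size=800)

# Gradient boosting regression; hyperparameters are illustrative only.
model = xgb.XGBRegressor(n_estimators=500, max_depth=4,
                         learning_rate=0.05, subsample=0.8)
model.fit(X[:640], y[:640])             # learn the feature -> quality mapping
predicted = model.predict(X[640:])      # quality estimates for unseen images
print(predicted[:5])
```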

Results and discussion

In this section, we analyze the proposed method by comparing it with some of the state-of-the-art methods. Specifically, this section describes the experimental setup and configurations used in the analysis of the impact of the color space on the performance of the proposed method and in the comparisons between the proposed method and available state-of-the-art methods.

Experimental setup

There are a number of existing benchmark image quality databases. In this work, we use the following databases:
  • Laboratory for Image and Video Engineering (LIVE) Image Database version 2 [60]: The database presents 982 test images, including 29 originals and 5 categories of distortions. These images are in uncompressed BMP format at several dimensions, including 480 × 720, 610 × 488, 618 × 453, 627 × 482, 632 × 505, 634 × 438, 634 × 505, 640 × 512, and 768 × 512. The distortions include JPEG, JPEG 2000 (JPEG2k), white noise (WN), Gaussian blur (GB), and fast fading (FF).
  • Computational and Subjective Image Quality (CSIQ) Database [28]: The database contains 30 reference images, obtained from public-domain sources, and 6 categories of distortions. These images are in 512 × 512 × 24 compressed bitmap (BMP) format (PNG image data). The distortions include JPEG, JPEG 2000 (JPEG2k), white noise (WN), Gaussian blur (GB), global contrast decrements (CD), and additive Gaussian pink noise (PN). In total, there are 866 distorted images.
  • Tampere Image Database 2013 (TID2013) [61]: The database has 25 reference images and 3,000 distorted images (25 reference images × 24 types of distortions × 5 levels of distortions). These images are in 512 × 384 × 24 uncompressed BMP format. The distortions include additive Gaussian noise (AGN), additive noise in color components (ANCC), spatially correlated noise (SCN), masked noise (MN), high frequency noise (HFN), impulse noise (IN), quantization noise (QN), Gaussian blur (GB), image denoising (ID), JPEG, JPEG2k, JPEG transmission errors (JPEG+TE), JPEG2k transmission errors (JPEG2k+TE), non-eccentricity pattern noise (NEPN), local block-wise distortions (LBD), intensity shift (IS), contrast change (CC), change of color saturation (CCS), multiplicative Gaussian noise (MGN), comfort noise (CN), lossy compression (LC), image color quantization with dither (ICQ), chromatic aberration (CA), and sparse sampling and reconstruction (SSR).
The Boolean Map Saliency (BMS) model is used as the VSM algorithm [62]. We compare the proposed method with a set of publicly available methods. The chosen state-of-the-art RIQA methods are the following: Codebook Representation for No-Reference Image Assessment (CORNIA) [23], Curvelet-based Quality Assessment (CQA) [50], Spatial and Spectral Entropies Quality Assessment (SSEQ) [18], Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) [16], local ternary patterns (LTP) [20], and the No-Reference Free Energy Principle Metric (NFERM) [33]. Additionally, we also compare the proposed algorithm with three well-established FR-IQA metrics, namely PSNR, structural similarity (SSIM) [32], and the reduced-reference image quality metric for contrast change (RIQMC) [49].
The training-based RIQA methods are evaluated using the same training-and-testing protocol. The protocol consists of splitting each database into two content-independent subsets (i.e., one subset for training and another for testing). To avoid overtraining and, therefore, failing to predict quality for other contents, scenes in the testing subset are not present in the training subset, and vice-versa. Considering this constraint, 20% of the images are randomly selected for testing and the remaining 80% are used for training. Each 80-20 split, training, and testing run constitutes one simulation. We performed 1000 simulations, and the mean correlation values are reported. To compare the predicted and subjective quality scores, three correlation metrics are used: the Spearman rank order correlation coefficient (SROCC), the Pearson linear correlation coefficient (LCC), and the Kendall rank order correlation coefficient (KRCC).
It is worth pointing out that each simulation uses all distortions in the training set. When the prediction performance “per distortion” is reported, the predictions for each distortion are generated by the model trained on all distortions. For the training-based methods that rely on the support vector regression (SVR) algorithm, the training and prediction steps are implemented using the Sklearn library [63]. The SVR metaparameters are found using the exhaustive grid search methods provided by Sklearn’s API. The proposed method, on the other hand, uses the GBM regression implemented in the XGBoost library [64].
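A minimal sketch of one simulation of this protocol is shown below, assuming each image is described by a scene identifier, a feature vector, and a subjective score. The helper names are ours, and the regressor can be any of the evaluated models.

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr, kendalltau

def content_independent_split(scene_ids, test_fraction=0.2, rng=None):
    """80-20 split by scene, so no content appears in both subsets."""
    rng = rng if rng is not None else np.random.default_rng()
    scenes = np.unique(scene_ids)
    test_scenes = rng.choice(scenes, size=max(1, int(len(scenes) * test_fraction)),
                             replace=False)
    test_mask = np.isin(scene_ids, test_scenes)
    return np.where(~test_mask)[0], np.where(test_mask)[0]

def correlations(predicted, subjective):
    """SROCC, LCC, and KRCC between predicted and subjective scores."""
    return (spearmanr(predicted, subjective)[0],
            pearsonr(predicted, subjective)[0],
            kendalltau(predicted, subjective)[0])

# One simulation: split by scene, fit the regressor on the training indices,
# predict on the test indices, and store the three coefficients. The numbers
# reported in the tables are means over 1000 such simulations.
```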

Impact of color space on prediction performance

To investigate the most suitable color space for the proposed method, we performed simulations on the LIVE, CSIQ, and TID2013 databases using the HSV, Lab, RGB, and YCbCr color spaces. For comparison purposes, we also tested the algorithm using the features obtained by combining all color spaces. Table 1 shows the average LCC, SROCC, and KRCC correlation scores (CS) over 1000 simulations.
Table 1
Average LCC, SROCC, and KRCC of 1,000 runs of simulations, using different color spaces on LIVE, CSIQ, and TID2013 databases
  
Database  Distortion  | HSV                  | LAB                  | RGB                  | YCbCr                | ALL
                      | SROCC  LCC    KRCC   | SROCC  LCC    KRCC   | SROCC  LCC    KRCC   | SROCC  LCC    KRCC   | SROCC  LCC    KRCC
LIVE      JPEG        | 0.9294 0.9564 0.7861 | 0.9069 0.9207 0.7507 | 0.9131 0.9291 0.7531 | 0.9335 0.9659 0.7943 | 0.9325 0.9532 0.7967
          JPEG2k      | 0.9324 0.9459 0.7841 | 0.9187 0.9166 0.7668 | 0.9126 0.9151 0.7551 | 0.9457 0.9555 0.8071 | 0.9497 0.9531 0.8176
          WN          | 0.9671 0.9717 0.8562 | 0.9448 0.9414 0.8145 | 0.9631 0.9565 0.8498 | 0.9706 0.9812 0.8691 | 0.9845 0.9817 0.9012
          GB          | 0.9418 0.9441 0.8081 | 0.9418 0.9217 0.8081 | 0.9484 0.9406 0.8209 | 0.9421 0.9494 0.8113 | 0.9641 0.9672 0.8498
          FF          | 0.8727 0.9067 0.7052 | 0.8431 0.8464 0.6827 | 0.8581 0.8641 0.6763 | 0.8868 0.9011 0.7181 | 0.8977 0.8979 0.7309
          ALL         | 0.9385 0.9444 0.7908 | 0.9152 0.9077 0.7586 | 0.9252 0.9175 0.7701 | 0.9435 0.9524 0.8019 | 0.9492 0.9479 0.8128
CSIQ      JPEG        | 0.9217 0.9492 0.7718 | 0.9172 0.9526 0.7594 | 0.8944 0.9466 0.7165 | 0.9203 0.9481 0.7718 | 0.9331 0.9565 0.7655
          JPEG2k      | 0.8661 0.8816 0.7057 | 0.8662 0.8775 0.7057 | 0.8843 0.9099 0.7195 | 0.8783 0.8848 0.7243 | 0.8871 0.9186 0.7272
          WN          | 0.8945 0.8633 0.7212 | 0.8821 0.8856 0.7011 | 0.8433 0.8523 0.6689 | 0.9541 0.9596 0.8344 | 0.9346 0.9279 0.7924
          GB          | 0.8987 0.9153 0.7318 | 0.9083 0.9253 0.7442 | 0.9151 0.9361 0.7641 | 0.9119 0.9198 0.7609 | 0.9197 0.9258 0.7655
          PN          | 0.8678 0.8626 0.6951 | 0.8824 0.8862 0.7195 | 0.8391 0.8341 0.6551 | 0.9551 0.9521 0.8331 | 0.9461 0.9399 0.8068
          CD          | 0.8008 0.7782 0.6344 | 0.7431 0.7457 0.5553 | 0.4928 0.4409 0.3722 | 0.6398 0.6774 0.4782 | 0.8097 0.8235 0.6257
          ALL         | 0.8799 0.8909 0.7103 | 0.8879 0.8993 0.7166 | 0.8421 0.8693 0.6587 | 0.8938 0.9057 0.7365 | 0.8949 0.9152 0.7269
TID2013   AGN         | 0.8088 0.8063 0.6333 | 0.7869 0.7827 0.5933 | 0.7281 0.6942 0.5267 | 0.8044 0.7911 0.6067 | 0.9217 0.9221 0.7733
          ANCC        | 0.7681 0.7656 0.5800 | 0.7645 0.7421 0.5733 | 0.5446 0.5453 0.3933 | 0.6831 0.6537 0.4950 | 0.8662 0.8565 0.6867
          CCS         | 0.5456 0.5157 0.4067 | 0.4969 0.4444 0.3686 | 0.5723 0.5523 0.4333 | 0.4635 0.4436 0.3600 | 0.5991 0.5990 0.4533
          CA          | 0.6621 0.9102 0.5133 | 0.7294 0.9404 0.5698 | 0.4194 0.8087 0.3024 | 0.5090 0.8492 0.3867 | 0.7583 0.9537 0.6000
          CN          | 0.5052 0.4054 0.3600 | 0.3410 0.3034 0.2467 | 0.5815 0.5307 0.4140 | 0.6172 0.5788 0.4574 | 0.5765 0.5319 0.4267
          CC          | 0.5308 0.5904 0.3867 | 0.7200 0.7302 0.5200 | 0.0812 0.0548 0.0568 | 0.3685 0.3628 0.2705 | 0.5100 0.4888 0.3706
          GB          | 0.8784 0.8695 0.7067 | 0.8492 0.8665 0.6667 | 0.8601 0.8816 0.6667 | 0.8596 0.8858 0.6756 | 0.8655 0.8685 0.6800
          HFN         | 0.9056 0.9217 0.7492 | 0.8657 0.8891 0.6800 | 0.8144 0.8453 0.6133 | 0.9083 0.9244 0.7333 | 0.9319 0.9424 0.7780
          ICQ         | 0.8592 0.8502 0.6867 | 0.7902 0.8059 0.6067 | 0.7835 0.7907 0.5933 | 0.7819 0.7911 0.5843 | 0.7877 0.8021 0.5933
          ID          | 0.8958 0.8925 0.7400 | 0.8892 0.8975 0.7333 | 0.7746 0.8610 0.5800 | 0.8462 0.8756 0.6667 | 0.8388 0.8927 0.6733
          IN          | 0.7317 0.7334 0.5667 | 0.7700 0.7535 0.5733 | 0.5041 0.4511 0.3667 | 0.6015 0.5865 0.4400 | 0.6699 0.6616 0.5267
          JPEG2k      | 0.8835 0.9261 0.7200 | 0.8808 0.9346 0.7133 | 0.8796 0.9257 0.7067 | 0.8923 0.9390 0.7267 | 0.8792 0.9295 0.7133
          JPEG2k+TE   | 0.5031 0.4763 0.3667 | 0.3088 0.2953 0.2267 | 0.5962 0.6047 0.4267 | 0.6165 0.6160 0.4400 | 0.6160 0.6069 0.4600
          JPEG        | 0.8305 0.8897 0.6244 | 0.8283 0.8934 0.6200 | 0.8173 0.8802 0.6133 | 0.7519 0.8701 0.5533 | 0.8531 0.9232 0.6600
          JPEG+TE     | 0.4650 0.5090 0.3400 | 0.3362 0.3417 0.2467 | 0.5109 0.5898 0.3867 | 0.6024 0.7145 0.4641 | 0.3819 0.3884 0.2733
          LBD         | 0.1952 0.1338 0.1533 | 0.3468 0.2292 0.2771 | 0.1319 0.1135 0.0933 | 0.1344 0.1118 0.1000 | 0.1754 0.1255 0.1302
          LC          | 0.7277 0.7359 0.5533 | 0.8027 0.8106 0.6200 | 0.5723 0.6058 0.4200 | 0.6623 0.6602 0.4967 | 0.9000 0.9066 0.7400
          MN          | 0.7582 0.7388 0.5576 | 0.6577 0.6889 0.4800 | 0.5114 0.5618 0.3533 | 0.5920 0.6151 0.4274 | 0.7738 0.7917 0.5776
          MS          | 0.0869 0.0915 0.0600 | 0.2115 0.1868 0.1467 | 0.1260 0.1092 0.0902 | 0.1138 0.0865 0.0835 | 0.1254 0.0775 0.0867
          MGN         | 0.7844 0.7715 0.6165 | 0.7426 0.7453 0.5442 | 0.6792 0.6853 0.4908 | 0.7662 0.7617 0.5667 | 0.8769 0.8836 0.7045
          NEPN        | 0.1929 0.1835 0.1353 | 0.1408 0.1683 0.0968 | 0.2058 0.2008 0.1436 | 0.1960 0.1764 0.1369 | 0.1985 0.2100 0.1369
          QN          | 0.8941 0.8958 0.7267 | 0.8562 0.8530 0.6800 | 0.8215 0.8093 0.6400 | 0.8750 0.8460 0.7045 | 0.8662 0.8700 0.6912
          SSR         | 0.8992 0.9237 0.7400 | 0.8946 0.9155 0.7267 | 0.8900 0.9266 0.7200 | 0.8858 0.9229 0.7200 | 0.9146 0.9415 0.7600
          SCN         | 0.7262 0.7269 0.5533 | 0.8708 0.8804 0.7000 | 0.7169 0.7323 0.5467 | 0.8323 0.8424 0.6400 | 0.9023 0.9098 0.7133
          ALL         | 0.7008 0.7553 0.5230 | 0.7097 0.7560 0.5214 | 0.6418 0.7222 0.4656 | 0.6888 0.7537 0.5058 | 0.7231 0.7746 0.5444
Average               | 0.7486 0.7586 0.5973 | 0.7407 0.7495 0.5845 | 0.6841 0.7051 0.5322 | 0.7323 0.7529 0.5837 | 0.7767 0.7885 0.6282
From these results, we notice that, among the individual color spaces, YCbCr provides a statistically superior performance in the largest number of cases (23 out of 114 CS, or 20.17%), followed by Lab (13 out of 114, or 11.41%), HSV (10 out of 114, or 8.77%), and RGB (3 out of 114, or 2.63%). However, the combination of all color spaces (“ALL” label) provides the best overall prediction performance (65 out of 114, or 57.02%).

Prediction performance using a single database

Table 2 depicts the results for the tested methods when part of each database is used for training and the remaining part for testing. In the published table, numbers in italics represent the best correlation values among RIQA and FR-IQA methods, while numbers in bold correspond to the best correlation values considering only the RIQA methods.
Table 2
Mean SROCC of the PSNR, SSIM, RIQMC, BRISQUE, CORNIA, CQA, SSEQ, LTP, NFERM, and the proposed metrics, obtained for 1000 simulation runs on the LIVE, CSIQ, and TID2013 databases
Database  Distortion  | PSNR     SSIM     RIQMC    BRISQUE  CORNIA   CQA      SSEQ     LTP      NFERM    Proposed
LIVE      JPEG        | 0.8515   0.9481   0.7794   0.8641   0.9002   0.8257   0.9122   0.9395   0.9645   0.9325
          JPEG2k      | 0.8822   0.9438   0.5383   0.8838   0.9246   0.8366   0.9388   0.9372   0.9411   0.9497
          WN          | 0.9851   0.9793   0.6628   0.9750   0.9500   0.9764   0.9544   0.9646   0.9838   0.9845
          GB          | 0.7818   0.8889   0.8711   0.9304   0.9465   0.8377   0.9157   0.9530   0.9219   0.9641
          FF          | 0.8869   0.9335   0.6802   0.8469   0.9132   0.8262   0.9038   0.8758   0.8627   0.8977
          ALL         | 0.8013   0.8902   0.6785   0.9098   0.9386   0.8606   0.9356   0.9316   0.9405   0.9492
CSIQ      JPEG        | 0.9009   0.9309   0.7242   0.8525   0.8319   0.6506   0.8066   0.9292   0.9036   0.9331
          JPEG2k      | 0.9309   0.9251   0.5795   0.8458   0.8405   0.8214   0.7302   0.8877   0.9223   0.8871
          WN          | 0.9345   0.8761   0.4678   0.6931   0.6187   0.7276   0.7876   0.6454   0.9214   0.9346
          GB          | 0.9358   0.9089   0.8007   0.8337   0.8526   0.7486   0.7766   0.9244   0.8962   0.9197
          PN          | 0.9315   0.8871   0.3653   0.7740   0.5340   0.5463   0.6661   0.7828   0.6334   0.9461
          CD          | 0.8862   0.8128   0.9565   0.4255   0.4458   0.5383   0.4172   0.2082   0.3774   0.8097
          ALL         | 0.8088   0.8116   0.5066   0.7597   0.6969   0.6369   0.7007   0.8280   0.9142   0.8949
TID2013   AGC         | 0.8568   0.7912   0.3555   0.4166   0.2605   0.3964   0.3949   0.5963   0.7077   0.9217
          AGN         | 0.9337   0.6421   0.6055   0.6416   0.5689   0.6051   0.6040   0.6631   0.8567   0.8662
          CA          | 0.7759   0.7158   0.5726   0.7310   0.6844   0.4380   0.4366   0.6749   0.6357   0.5991
          CC          | 0.4608   0.3477   0.8044   0.1849   0.1400   0.2043   0.2006   0.1886   0.2148   0.7583
          CCS         | 0.6892   0.7641   0.0581   0.2715   0.2642   0.2461   0.2547   0.2384   0.3106   0.5765
          CN          | 0.8838   0.6465   0.6262   0.2176   0.3553   0.1623   0.1642   0.3880   0.1385   0.5100
          GB          | 0.8905   0.8196   0.7687   0.8063   0.8341   0.7019   0.7058   0.7465   0.8502   0.8655
          HFN         | 0.9165   0.7962   0.4267   0.7103   0.7707   0.7104   0.7061   0.7626   0.8797   0.9319
          ICQ         | 0.9087   0.7271   0.8691   0.7663   0.7044   0.6829   0.6834   0.7603   0.4804   0.7877
          ID          | 0.9457   0.8327   0.8661   0.5243   0.7227   0.6711   0.6716   0.7063   0.6405   0.8388
          IN          | 0.9263   0.8055   0.1222   0.6848   0.5874   0.4231   0.4272   0.6484   0.1735   0.6699
          IS          | 0.7647   0.7411   0.5979   0.2224   0.2403   0.2011   0.2013   0.3291   0.0407   0.8792
          JPEG        | 0.9252   0.8275   0.7293   0.7252   0.7815   0.6317   0.6284   0.6631   0.8711   0.6160
          JPEG+TE     | 0.7874   0.6144   0.6009   0.3581   0.5679   0.2221   0.2195   0.2314   0.1281   0.8531
          JPEG2k      | 0.8934   0.7531   0.5967   0.7337   0.8089   0.7219   0.7205   0.7780   0.8068   0.3819
          JPEG2k+TE   | 0.8581   0.7067   0.7189   0.7277   0.6113   0.6529   0.6529   0.6594   0.1686   0.1754
          LBD         | 0.1301   0.6213   0.2471   0.2833   0.2157   0.2382   0.2290   0.3813   0.1995   0.9000
          LC          | 0.9386   0.8311   0.5346   0.5726   0.6682   0.4561   0.4460   0.6533   0.6516   0.7738
          MGN         | 0.9085   0.7863   0.3751   0.5548   0.4393   0.4969   0.4897   0.6209   0.7159   0.1254
          MN          | 0.8385   0.7388   0.0438   0.2650   0.2342   0.2506   0.2575   0.4243   0.2238   0.8769
          NEPN        | 0.6931   0.5326   0.1496   0.1821   0.2855   0.1308   0.1275   0.1256   0.0667   0.1985
          QN          | 0.8636   0.7428   0.8697   0.5383   0.4922   0.7242   0.7214   0.7361   0.7716   0.8662
          SCN         | 0.9152   0.7934   0.7811   0.7238   0.7043   0.7121   0.7064   0.7015   0.2181   0.9146
          SSR         | 0.9241   0.7774   0.6967   0.7101   0.8594   0.8115   0.8084   0.8457   0.7865   0.9023
          ALL         | 0.6869   0.5758   0.4439   0.5416   0.6006   0.4925   0.4900   0.6078   0.3971   0.7231
Average               | 0.8377   0.7807   0.5808   0.6234   0.6262   0.5741   0.5893   0.6563   0.6084   0.7767
From Table 2, we can see that, for most databases, the proposed method achieves the best performance among the RIQA methods. For the LIVE2 database, the proposed method outperforms even the FR-IQA methods for the JPEG2k, WN, GB, and “ALL” distortion sets. For the PN and CD distortions of the CSIQ database, the proposed method provides a significantly better performance than the other RIQA methods. The only exception is RIQMC, which obtained a mean SROCC of 0.9565 for CD; this is expected since it is a contrast-specific metric. The superior performance for PN distortions is probably due to the color-based features. The good performance for CD distortions is an important advantage of the proposed method, given that this distortion is a challenge for most RIQA methods.
For the TID2013 database, the proposed method outperforms the other RIQA methods for 18 out of the 25 distortions, followed by NFERM, BRISQUE, and CORNIA. For the AGC, HFN, IS, JPEG+TE, SSR, LBD, and MN distortions, the performance of the proposed method surpasses even the FR-IQA methods. The performance for AGC distortions is very good, similar to what was obtained for the PN distortions of the CSIQ database. Although it loses to RIQMC, which is a contrast-specific metric, the performance of the proposed method for the CC distortions of TID2013 and the CD distortions of the CSIQ database is also good. This shows that the proposed method can handle contrast distortions.
Figure 9 depicts the distributions of the SROCC values computed between the subjective scores (MOS) and the predicted scores obtained with the tested RIQA methods. The bean plots of this figure are generated using the distribution of SROCC values for the set containing all database distortions (corresponding to “ALL” in Table 2). From Fig. 9a, we notice that almost all methods (with the exception of CQA) present similar distributions of SROCC scores for the LIVE database. On the other hand, the SROCC values vary more for the CSIQ and TID2013 databases, as can be seen in Fig. 9b, c.

Statistical difference significance test

We also conducted tests to determine the statistical significance of the differences between the SROCC values reported in Table 2. We used Welch’s t test on the SROCC values obtained by each method, considering all distortions (“ALL” label), with a 95% confidence level. The cells in Table 3 indicate whether the method in the corresponding row is statistically superior, statistically inferior, or statistically equivalent to the method in the corresponding column. These results show that the proposed method has a statistically superior performance in all cases.
Table 3
Welch’s t test between the SROCC values of BRISQUE, CORNIA, CQA, SSEQ, LTP, and the proposed method on the TID2013 database. Each cell indicates whether the method in the corresponding row is statistically superior, inferior, or equivalent to the method in the corresponding column.
[Pairwise comparison matrix omitted.]

Performance for a cross-database validation

To investigate the generalization capability of the proposed method, we performed a cross-database validation. This validation consists of training the proposed RIQA method using all images of one database and testing it on the other databases. Table 4 depicts the SROCC values obtained using LIVE as the training database and TID2013 and CSIQ as the testing databases. To perform a straightforward cross-database comparison, only similar distortions were selected from each database. In other words, we select only the JPEG, JPEG2k, WN, and GB distortions of CSIQ, since these distortions are also present in the training database. The PN and CD distortions were removed from the test set and, therefore, are not listed in Table 4. Likewise, for TID2013, only the JPEG, JPEG2k, WN-like, and GB distortions were kept; in TID2013, the HFN distortion was chosen because it is the most similar to the WN distortion.
Table 4
SROCC cross-database validation, when models are trained on LIVE2 and tested on CSIQ and TID2013
 
Database  Distortion  | BRISQUE  CORNIA   CQA      SSEQ     LTP      Proposed
CSIQ      JPEG        | 0.8209   0.7062   0.7129   0.8141   0.8784   0.9876
          JPEG2k      | 0.8279   0.8459   0.6957   0.7862   0.8914   0.9881
          WN          | 0.6951   0.8627   0.6596   0.4613   0.7739   0.9962
          GB          | 0.8311   0.8815   0.7648   0.7758   0.8712   0.9934
          ALL         | 0.8022   0.7542   0.7114   0.7403   0.8628   0.9914
TID2013   JPEG        | 0.8058   0.7423   0.8071   0.7823   0.8472   0.8853
          JPEG2k      | 0.8224   0.8837   0.7724   0.8258   0.9046   0.9481
          WN          | 0.8621   0.7403   0.8692   0.6959   0.6881   0.9077
          GB          | 0.8245   0.8133   0.8214   0.8624   0.8693   0.8693
          ALL         | 0.7965   0.7599   0.8214   0.7955   0.8137   0.8923
From Table 4, we can notice that the proposed method outperforms the other RIQA methods in the cross-database validation test, achieving the best performance in all cases except one. For TID2013, the proposed method outperforms the other methods for four out of the five distortion sets (tying with LTP for GB), while for CSIQ, it outperforms the other methods for all five. Therefore, the cross-database validation test indicates that the proposed method has a better generalization capability when compared to the tested state-of-the-art RIQA methods.

Conclusions

In this paper, we proposed a novel NDS-GP-RIQA method based on the statistics of two new texture descriptors: the OC-LSP and the OC-LVP. The OC-LSP descriptor extends the capabilities of the (previous) OC-LBP operator by incorporating texture, color, and saliency information. Similarly, the OC-LVP fuses the OC-LBP and LVP operators to incorporate texture, color, and energy information. Quality is predicted after training a regression model using a gradient boosting machine. Experimental results showed that, when compared with state-of-the-art RIQA methods, the proposed method has the best performance. More specifically, when considering a wide range of distortions, the proposed method has a clear superiority. Since the proposed method is based on simple descriptors, it can be suitable for video quality assessment. Future work includes a parallel implementation of the OC-LSP and OC-LVP descriptors.

Funding

This work was supported in part by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), the Fundação de Apoio à Pesquisa do Distrito Federal (FAP-DF), and the University of Brasília (UnB).

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References

1. Seshadrinathan K, Bovik AC (2011) Automatic prediction of perceptual quality of multimedia signals—a survey. Multimed Tools Appl 51(1):163–186
3. Telecom I (2000) Recommendation 500-10: Methodology for the subjective assessment of the quality of television pictures. ITU-R Rec. BT.500
4. Wang Z, Bovik AC (2009) Mean squared error: love it or leave it? A new look at signal fidelity measures. IEEE Signal Proc Mag 26(1):98–117
6. Fang Y, Ma K, Wang Z, Lin W, Fang Z, Zhai G (2015) No-reference quality assessment of contrast-distorted images based on natural scene statistics. Signal Process Lett IEEE 22(7):838–842
7. Bahrami K, Kot AC (2014) A fast approach for no-reference image sharpness assessment based on maximum local variation. Signal Process Lett IEEE 21(6):751–755
8. Golestaneh SA, Chandler DM (2014) No-reference quality assessment of JPEG images via a quality relevance map. Signal Process Lett IEEE 21(2):155–158
9. Li L, Lin W, Zhu H (2014) Learning structural regularity for evaluating blocking artifacts in JPEG images. Signal Process Lett IEEE 21(8):918–922
10. Li L, Zhou Y, Lin W, Wu J, Zhang X, Chen B (2016) No-reference quality assessment of deblocked images. Neurocomputing 177:572–584
11. Li L, Zhu H, Yang G, Qian J (2014) Referenceless measure of blocking artifacts by Tchebichef kernel analysis. Signal Process Lett IEEE 21(1):122–125
12. Liu L, Hua Y, Zhao Q, Huang H, Bovik AC (2016) Blind image quality assessment by relative gradient statistics and adaboosting neural network. Signal Process Image Commun 40:1–15
14. Saad MA, Bovik AC, Charrier C (2012) Blind image quality assessment: a natural scene statistics approach in the DCT domain. Image Process IEEE Trans 21(8):3339–3352
15. Moorthy AK, Bovik AC (2011) Blind image quality assessment: from natural scene statistics to perceptual quality. Image Process IEEE Trans 20(12):3350–3364
16. Mittal A, Moorthy AK, Bovik AC (2012) No-reference image quality assessment in the spatial domain. Image Process IEEE Trans 21(12):4695–4708
17. Saad MA, Bovik AC, Charrier C (2010) A DCT statistics-based blind image quality index. IEEE Signal Process Lett 17(6):583–586
18. Liu L, Liu B, Huang H, Bovik AC (2014) No-reference image quality assessment based on spatial and spectral entropies. Signal Process Image Commun 29(8):856–863
19. Freitas PG, Akamine WY, Farias MC (2016) Blind image quality assessment using multiscale local binary patterns. J Imaging Sci Technol 60(6):60405-1
22. Ye P, Doermann D (2012) No-reference image quality assessment using visual codebooks. Image Process IEEE Trans 21(7):3129–3138
24. Zhang M, Muramatsu C, Zhou X, Hara T, Fujita H (2015) Blind image quality assessment using the joint statistics of generalized local binary pattern. Signal Process Lett IEEE 22(2):207–210
27. Wu J, Lin W, Shi G (2014) Image quality assessment with degradation on spatial structure. Signal Process Lett IEEE 21(4):437–440
28. Larson EC, Chandler DM (2010) Most apparent distortion: full-reference image quality assessment and the role of strategy. J Electron Imaging 19(1):011006
30. Sheikh HR, Bovik AC (2006) Image information and visual quality. IEEE Trans Image Process 15(2):430–444
31. Chandler DM, Hemami SS (2007) VSNR: a wavelet-based visual signal-to-noise ratio for natural images. IEEE Trans Image Process 16(9):2284–2298
32. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. Image Process IEEE Trans 13(4):600–612
33. Gu K, Zhai G, Yang X, Zhang W (2015) Using free energy principle for blind image quality assessment. IEEE Trans Multimed 17(1):50–63
36. Li J, Zou L, Yan J, Deng D, Qu T, Xie G (2016) No-reference image quality assessment using Prewitt magnitude based on convolutional neural networks. SIViP 10(4):609–616
39. Yamins DL, DiCarlo JJ (2016) Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci 19(3):356
40. Zhang L, Shen Y, Li H (2014) VSI: a visual saliency-induced index for perceptual image quality assessment. IEEE Trans Image Process 23(10):4270–4281
41. Farias MC, Akamine WY (2012) On performance of image quality metrics enhanced with visual attention computational models. Electron Lett 48(11):631–633
42. Engelke U, Kaprykowsky H, Zepernick HJ, Ndjiki-Nya P (2011) Visual attention in quality assessment. IEEE Signal Proc Mag 28(6):50–59
43. Gu K, Wang S, Yang H, Lin W, Zhai G, Yang X, Zhang W (2016) Saliency-guided quality assessment of screen content images. IEEE Trans Multimed 18(6):1098–1110
45. Le Meur O, Ninassi A, Le Callet P, Barba D (2010) Overt visual attention for free-viewing and quality assessment tasks: impact of the regions of interest on a video quality metric. Signal Process Image Commun 25(7):547–558
46. Le Meur O, Ninassi A, Le Callet P, Barba D (2010) Do video coding impairments disturb the visual attention deployment? Signal Process Image Commun 25(8):597–609
47. Akamine WY, Farias MC (2014) Video quality assessment using visual attention computational models. J Electron Imaging 23(6):061107
49. Gu K, Zhai G, Lin W, Liu M (2016) The analysis of image contrast: from quality assessment to automatic enhancement. IEEE Trans Cybern 46(1):284–297
50. Liu L, Dong H, Huang H, Bovik AC (2014) No-reference image quality assessment in curvelet domain. Signal Process Image Commun 29(4):494–505
54.
Zurück zum Zitat Natekin A, Knoll A (2013) Gradient boosting machines, a tutorial. Front Neurorobotics 7:21.CrossRef Natekin A, Knoll A (2013) Gradient boosting machines, a tutorial. Front Neurorobotics 7:21.CrossRef
55.
Zurück zum Zitat Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Pattern Anal Mach Intell IEEE Trans 24(7):971–987.CrossRefMATH Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Pattern Anal Mach Intell IEEE Trans 24(7):971–987.CrossRefMATH
56.
Zurück zum Zitat He DC, Wang L (1990) Texture unit, texture spectrum, and texture analysis. Geosci Remote Sens IEEE Trans 28(4):509–512.CrossRef He DC, Wang L (1990) Texture unit, texture spectrum, and texture analysis. Geosci Remote Sens IEEE Trans 28(4):509–512.CrossRef
57.
Zurück zum Zitat Ojala T, Pietikäinen M, Mäenpää T (2000) Gray scale and rotation invariant texture classification with local binary patterns In: Computer Vision-ECCV 2000, 404–420.. Springer, Berlin.CrossRef Ojala T, Pietikäinen M, Mäenpää T (2000) Gray scale and rotation invariant texture classification with local binary patterns In: Computer Vision-ECCV 2000, 404–420.. Springer, Berlin.CrossRef
58.
Zurück zum Zitat Pietikäinen M, Ojala T, Xu Z (2000) Rotation-invariant texture classification using feature distributions. Pattern Recog 33(1):43–52.CrossRef Pietikäinen M, Ojala T, Xu Z (2000) Rotation-invariant texture classification using feature distributions. Pattern Recog 33(1):43–52.CrossRef
59.
Zurück zum Zitat Jain A, Healey G (1998) A multiscale representation including opponent color features for texture recognition. IEEE Trans Image Process 7(1):124–128.CrossRef Jain A, Healey G (1998) A multiscale representation including opponent color features for texture recognition. IEEE Trans Image Process 7(1):124–128.CrossRef
60.
Zurück zum Zitat Sheikh HR, Sabir MF, Bovik AC (2006) A statistical evaluation of recent full reference image quality assessment algorithms. Image Process IEEE Trans 15(11):3440–3451.CrossRef Sheikh HR, Sabir MF, Bovik AC (2006) A statistical evaluation of recent full reference image quality assessment algorithms. Image Process IEEE Trans 15(11):3440–3451.CrossRef
61.
Zurück zum Zitat Ponomarenko N, Jin L, Ieremeiev O, Lukin V, Egiazarian K, Astola J, Vozel B, Chehdi K, Carli M, Battisti F, et al. (2015) Image database TID2013: peculiarities, results and perspectives. Signal Process Image Commun 30:57–77.CrossRef Ponomarenko N, Jin L, Ieremeiev O, Lukin V, Egiazarian K, Astola J, Vozel B, Chehdi K, Carli M, Battisti F, et al. (2015) Image database TID2013: peculiarities, results and perspectives. Signal Process Image Commun 30:57–77.CrossRef
62.
Zurück zum Zitat Zhang J, Sclaroff S (2016) Exploiting surroundedness for saliency detection: a boolean map approach. IEEE Trans Pattern Anal Mach Intell 38(5):889–902.CrossRef Zhang J, Sclaroff S (2016) Exploiting surroundedness for saliency detection: a boolean map approach. IEEE Trans Pattern Anal Mach Intell 38(5):889–902.CrossRef
63.
Zurück zum Zitat Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830.MathSciNetMATH Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830.MathSciNetMATH