Neural Processing Letters

Published: 12 June 2019

An End-to-End Perceptual Quality Assessment Method via Score Distribution Prediction

Authors: Jing Liu, Jingting Wang, Weizhi Nie, Yuting Su, Anan Liu

Published in: Neural Processing Letters | Issue 3/2020

Abstract

Image quality assessment (IQA) is a rapidly growing field because it automatically predicts perceptual quality, which is vital for consumer-centric services. However, most existing IQA algorithms predict only the mean opinion score and ignore the inevitable diversity of opinions. To address this shortcoming, we propose predicting the distribution of opinion scores with an end-to-end convolutional neural network. The network is based on a pre-trained 50-layer ResNet, and a novel Statistical Region-of-Interest (ROI) Pooling layer is introduced to lower model complexity, which enables effective training with little data. Instead of the traditional mean-square-error loss, the model is trained with a cross-entropy loss, which is better suited to learning probability distributions. Extensive experiments were carried out on the ESPL-LIVE HDR dataset, whose opinion scores are highly diverse. The results show that Statistical ROI Pooling is more efficient than traditional ROI Pooling layers and than classical dimensionality reduction by principal component analysis. The proposed algorithm also outperforms state-of-the-art label distribution learning methods on six representative evaluation metrics.
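To illustrate why a distribution-level loss can carry more information than a loss on the mean opinion score alone, the following is a minimal sketch and not the authors' implementation. The ratings, the logits, and all function names are hypothetical; hand-set logits stand in for the network output, and a 5-point rating scale is assumed purely for illustration.

```python
import math

def softmax(logits):
    """Turn raw outputs into a probability distribution over score bins."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_histogram(ratings, num_bins):
    """Empirical opinion-score distribution from integer ratings in 1..num_bins."""
    counts = [0] * num_bins
    for r in ratings:
        counts[r - 1] += 1
    return [c / len(ratings) for c in counts]

def cross_entropy(target, predicted, eps=1e-12):
    """H(target, predicted) = -sum_k target_k * log(predicted_k)."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

def mean_score(dist):
    """Mean opinion score implied by a distribution over bins 1..K."""
    return sum((k + 1) * p for k, p in enumerate(dist))

# Hypothetical, strongly divided ratings from ten viewers (illustrative only).
ratings = [1, 1, 1, 1, 1, 5, 5, 5, 5, 5]
target = score_histogram(ratings, num_bins=5)

# Two hypothetical predictions with the same mean but different shapes.
polarized = softmax([2.0, -2.0, -2.0, -2.0, 2.0])  # mass on bins 1 and 5
peaked = softmax([-2.0, -2.0, 2.0, -2.0, -2.0])    # mass on bin 3

for name, pred in [("polarized", polarized), ("peaked", peaked)]:
    print(name, "mean:", round(mean_score(pred), 3),
          "cross-entropy:", round(cross_entropy(target, pred), 3))
```

Both predictions imply the same mean score as the target, so a squared error on the mean cannot tell them apart, whereas the cross-entropy is lower for the prediction whose shape matches the divided opinions.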

Title: An End-to-End Perceptual Quality Assessment Method via Score Distribution Prediction
Authors: Jing Liu, Jingting Wang, Weizhi Nie, Yuting Su, Anan Liu
Publication date: 12 June 2019
Publisher: Springer US
Published in: Neural Processing Letters, Issue 3/2020
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-019-10057-1
