Deep bilateral learning for real-time image enhancement

Authors: Michaël Gharbi (MIT CSAIL), Jiawen Chen (Google Research), Jonathan T. Barron (Google Research), Samuel W. Hasinoff (Google Research), Frédo Durand (Université Côte d'Azur)

ACM Transactions on Graphics, Volume 36, Issue 4, Article No. 118, pp. 1–12. https://doi.org/10.1145/3072959.3073592

Published: 20 July 2017

Abstract

Performance is a critical challenge in mobile image processing. Given a reference imaging pipeline, or even human-adjusted pairs of images, we seek to reproduce the enhancements and enable real-time evaluation. For this, we introduce a new neural network architecture inspired by bilateral grid processing and local affine color transforms. Using pairs of input/output images, we train a convolutional neural network to predict the coefficients of a locally-affine model in bilateral space. Our architecture learns to make local, global, and content-dependent decisions to approximate the desired image transformation. At runtime, the neural network consumes a low-resolution version of the input image, produces a set of affine transformations in bilateral space, upsamples those transformations in an edge-preserving fashion using a new slicing node, and then applies those upsampled transformations to the full-resolution image. Our algorithm processes high-resolution images on a smartphone in milliseconds, provides a real-time viewfinder at 1080p resolution, and matches the quality of state-of-the-art approximation techniques on a large class of image operators. Unlike previous work, our model is trained off-line from data and therefore does not require access to the original operator at runtime. This allows our model to learn complex, scene-dependent transformations for which no reference implementation is available, such as the photographic edits of a human retoucher.
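
The runtime path described in the abstract (slice a low-resolution grid of affine transforms with an edge-aware guide, then apply the result per pixel at full resolution) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the `grid[y][x][z]` layout holding one 3x4 matrix per cell, the fixed luminance guide, and the hand-built grid that stands in for the network's prediction. It is not the authors' implementation.

```python
# Illustrative sketch only. The grid layout, helper names, and the fixed
# luminance guide are assumptions; in the paper the coefficients come from
# a trained CNN, which is replaced here by a hand-built grid.

def lerp_weights(pos, size):
    """Return (index, weight) pairs for linear interpolation along one axis."""
    if size == 1:
        return [(0, 1.0)]
    pos = min(max(pos, 0.0), size - 1.0)  # clamp to the valid cell range
    lo = min(int(pos), size - 2)
    t = pos - lo
    return [(lo, 1.0 - t), (lo + 1, t)]

def slice_grid(grid, x, y, guide, width, height):
    """Trilinearly interpolate a 3x4 affine matrix for pixel (x, y)."""
    gh, gw, gd = len(grid), len(grid[0]), len(grid[0][0])
    # Map the pixel center and the guide value (in [0, 1]) to grid coordinates.
    gy = (y + 0.5) / height * gh - 0.5
    gx = (x + 0.5) / width * gw - 0.5
    gz = guide * (gd - 1)
    coeffs = [[0.0] * 4 for _ in range(3)]
    for iy, wy in lerp_weights(gy, gh):
        for ix, wx in lerp_weights(gx, gw):
            for iz, wz in lerp_weights(gz, gd):
                w = wy * wx * wz
                cell = grid[iy][ix][iz]
                for r in range(3):
                    for c in range(4):
                        coeffs[r][c] += w * cell[r][c]
    return coeffs

def apply_affine(coeffs, rgb):
    """Apply a 3x4 affine color transform to one RGB pixel."""
    r, g, b = rgb
    return tuple(row[0] * r + row[1] * g + row[2] * b + row[3] for row in coeffs)

def enhance(image, grid):
    """Slice the grid at every full-resolution pixel and apply the transform."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            rgb = image[y][x]
            # Assumed guide: Rec. 601 luminance (the paper learns its guide).
            guide = 0.299 * rgb[0] + 0.587 * rgb[1] + 0.114 * rgb[2]
            coeffs = slice_grid(grid, x, y, guide, width, height)
            row.append(apply_affine(coeffs, rgb))
        out.append(row)
    return out

def scaled(k):
    """Hypothetical helper: a 3x4 affine matrix that multiplies RGB by k."""
    return [[k if r == c else 0.0 for c in range(4)] for r in range(3)]

# A 1 x 1 x 2 grid: gain 2 for dark pixels, gain 1 for bright pixels.
toy_grid = [[[scaled(2.0), scaled(1.0)]]]
toy_image = [[(0.1, 0.1, 0.1), (0.5, 0.5, 0.5), (1.0, 1.0, 1.0)]]
print(enhance(toy_image, toy_grid))
```

Because the interpolation weights depend on the guide value as well as on pixel position, neighboring pixels on opposite sides of an intensity edge can receive different transforms. This is the sense in which the upsampling is edge-preserving, even though the grid itself is coarse.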

Supplemental Material

papers-0027.mp4 (mp4, 217.8 MB)

Index Terms

Deep bilateral learning for real-time image enhancement
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Image and video acquisition
        Computational photography
  2. Computer graphics
    1. Image manipulation
      1. Image processing

Published in

ACM Transactions on Graphics Volume 36, Issue 4
August 2017
2155 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3072959

Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 20 July 2017
Author Tags
convolutional neural networks
data-driven methods
deep learning
real-time image processing
Qualifiers
- research-article
Article Metrics
- Total Citations: 543
- Total Downloads: 3,787
- Downloads (Last 12 months): 373
- Downloads (Last 6 weeks): 38