
Smart Scribbles for Image Matting

Published: 17 December 2020

Abstract

Image matting is an ill-posed problem that usually requires additional user input, such as trimaps or scribbles. Drawing a fine trimap requires a large amount of user effort, while scribbles rarely yield satisfactory alpha mattes for non-professional users. Some recent deep learning–based matting networks rely on large-scale composite datasets for training to improve performance, which occasionally produces obvious artifacts on natural images. In this article, we explore the intrinsic relationship between user input and alpha mattes and strike a balance between user effort and the quality of alpha mattes. In particular, we propose an interactive framework, referred to as smart scribbles, that guides users to draw a few scribbles on the input image to produce high-quality alpha mattes. It first infers the most informative regions of an image for drawing scribbles to indicate different categories (foreground, background, or unknown) and then spreads these scribbles (i.e., the category labels) to the rest of the image via our well-designed two-phase propagation. Both neighboring low-level affinities and high-level semantic features are considered during the propagation process. Our method can be optimized without large-scale matting datasets and generalizes better to real-world scenarios. Extensive experiments demonstrate that smart scribbles can produce more accurate alpha mattes with reduced additional input, compared to the state-of-the-art matting methods.
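To make the propagation idea concrete, below is a minimal sketch of how scribble labels can be spread over a low-level affinity graph. This is our own illustration under stated assumptions, not the authors' implementation: the function propagate_scribbles, its parameters, and the dense Gaussian color affinity are hypothetical, and the paper's actual two-phase scheme additionally incorporates high-level semantic features alongside such low-level affinities.

    # Hypothetical sketch: semi-supervised label propagation on an
    # affinity graph (standard technique; not the paper's exact method).
    # A few scribbled nodes carry hard labels; all other nodes receive
    # soft labels by iterating y <- alpha * W_norm @ y + (1 - alpha) * y0.
    import numpy as np

    def propagate_scribbles(features, seed_labels, alpha=0.9,
                            sigma=0.1, n_iter=100):
        """features: (n, d) per-pixel/superpixel descriptors (e.g., mean color).
        seed_labels: (n,) ints: -1 unlabeled, 0 background, 1 foreground, 2 unknown.
        Returns (n, 3) per-node class probabilities."""
        n = features.shape[0]
        # Dense Gaussian affinity between all node pairs (fine for small n;
        # a real system would use a sparse k-NN graph over superpixels).
        d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (2 * sigma ** 2))
        np.fill_diagonal(W, 0.0)
        W /= W.sum(axis=1, keepdims=True)        # row-normalize
        y0 = np.zeros((n, 3))
        labeled = seed_labels >= 0
        y0[labeled, seed_labels[labeled]] = 1.0  # one-hot scribble seeds
        y = y0.copy()
        for _ in range(n_iter):
            y = alpha * W @ y + (1 - alpha) * y0  # diffuse, keep seeds anchored
        return y / y.sum(axis=1, keepdims=True)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # Two synthetic color clusters standing in for foreground/background.
        feats = np.vstack([rng.normal(0.2, 0.05, (50, 3)),
                           rng.normal(0.8, 0.05, (50, 3))])
        seeds = np.full(100, -1)
        seeds[0], seeds[50] = 1, 0               # one FG and one BG scribble
        probs = propagate_scribbles(feats, seeds)
        print(probs.argmax(axis=1)[:5], probs.argmax(axis=1)[50:55])

In the paper's setting, a propagation of this flavor over color affinities would form only the low-level half of the picture; the two-phase design also draws on deep semantic features, which the sketch above omits.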



• Published in

  ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 16, Issue 4
  November 2020, 372 pages
  ISSN: 1551-6857
  EISSN: 1551-6865
  DOI: 10.1145/3444749

      Copyright © 2020 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 17 December 2020
      • Accepted: 1 May 2020
      • Revised: 1 April 2020
      • Received: 1 January 2020
Published in TOMM Volume 16, Issue 4


      Qualifiers

      • research-article
      • Research
      • Refereed
