skip to main content
research-article
Open Access

Automatically scheduling halide image processing pipelines

Published:11 July 2016Publication History
Skip Abstract Section

Abstract

The Halide image processing language has proven to be an effective system for authoring high-performance image processing code. Halide programmers need only provide a high-level strategy for mapping an image processing pipeline to a parallel machine (a schedule), and the Halide compiler carries out the mechanical task of generating platform-specific code that implements the schedule. Unfortunately, designing high-performance schedules for complex image processing pipelines requires substantial knowledge of modern hardware architecture and code-optimization techniques. In this paper we provide an algorithm for automatically generating high-performance schedules for Halide programs. Our solution extends the function bounds analysis already present in the Halide compiler to automatically perform locality and parallelism-enhancing global program transformations typical of those employed by expert Halide developers. The algorithm does not require costly (and often impractical) auto-tuning, and, in seconds, generates schedules for a broad set of image processing benchmarks that are performance-competitive with, and often better than, schedules manually authored by expert Halide developers on server and mobile CPUs, as well as GPUs.

Skip Supplemental Material Section

Supplemental Material

a83.mp4

mp4

140.5 MB

References

  1. Adams, A., Talvala, E.-V., Park, S. H., Jacobs, D. E., Ajdin, B., Gelfand, N., Dolson, J., Vaquero, D., Baek, J., Tico, M., Lensch, H. P. A., Matusik, W., Pulli, K., Horowitz, M., and Levoy, M. 2010. The frankencamera: An experimental platform for computational photography. ACM Transactions on Graphics 29, 4 (July), 29:1--29:12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Ansel, J., Kamil, S., Veeramachaneni, K., Ragan-Kelley, J., Bosboom, J., O'Reilly, U.-M., and Amarasinghe, S. 2014. OpenTuner: An extensible framework for program autotuning. In Proceedings of the 23rd International Conference on Parallel Architectures and Compilation, ACM, 303--316. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Chen, J., Paris, S., and Durand, F. 2007. Real-time edge-aware image processing with the bilateral grid. ACM Transactions on Graphics 26, 3 (July), 103:1--103:9. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Darbon, J., Cunha, A., Chan, T. F., Osher, S., and Jensen, G. J. 2008. Fast nonlocal filtering applied to electron cryomicroscopy. In Biomedical Imaging: From Nano to Macro, 2008. ISBI 2008. 5th IEEE International Symposium on, IEEE, 1331--1334.Google ScholarGoogle Scholar
  5. Farbman, Z., Fattal, R., and Lischinski, D. 2011. Convolution pyramids. ACM Transactions on Graphics 30, 6 (Dec.), 175:1--175:8. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Harris, C., and Stephens, M. 1988. A combined corner and edge detector. In In Proc. of Fourth Alvey Vision Conference, 147--151.Google ScholarGoogle Scholar
  7. Hegarty, J., Brunhaver, J., DeVito, Z., Ragan-Kelley, J., Cohen, N., Bell, S., Vasilyev, A., Horowitz, M., and Hanrahan, P. 2014. Darkroom: compiling high-level image processing code into hardware pipelines. ACM Transactions on Graphics 33, 4 (July), 144:1--144:11. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Hegarty, J., Daly, R., DeVito, Z., Ragan-Kelley, J., Horowitz, M., and Hanrahan, P. 2016. Rigel: Flexible multi-rate image processing hardware. ACM Transactions on Graphics 36, 4 (July). Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. 2014. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093.Google ScholarGoogle Scholar
  10. Krizhevsky, A., Sutskever, I., and Hinton, G. E. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, 1097--1105.Google ScholarGoogle Scholar
  11. Mullapudi, R. T., Vasista, V., and Bondhugula, U. 2015. PolyMage: Automatic optimization for image processing pipelines. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 429--443. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Paris, S., Hasinoff, S. W., and Kautz, J. 2011. Local Laplacian filters: edge-aware image processing with a Laplacian pyramid. ACM Transactions on Graphics 30, 4 (July), 68:1--68:12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Ragan-Kelley, J., Adams, A., Paris, S., Levoy, M., Amarasinghe, S., and Durand, F. 2012. Decoupling algorithms from schedules for easy optimization of image processing pipelines. ACM Transactions on Graphics 31, 4 (July), 32:1--32:12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Ragan-Kelley, J., Barnes, C., Adams, A., Paris, S., Durand, F., and Amarasinghe, S. 2013. Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, 519--530. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Ragan-Kelley, J., Adams, A., and Sharlet, D. 2015. An introduction to Halide. In ACM SIGGRAPH 2015 Courses, ACM, 3:1--3:160. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Rhemann, C., Hosni, A., Bleyer, M., Rother, C., and Gelautz, M. 2011. Fast cost-volume filtering for visual correspondence and beyond. In Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition, 3017--3024. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Simonyan, K., and Zisserman, A. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.Google ScholarGoogle Scholar

Index Terms

  1. Automatically scheduling halide image processing pipelines

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image ACM Transactions on Graphics
      ACM Transactions on Graphics  Volume 35, Issue 4
      July 2016
      1396 pages
      ISSN:0730-0301
      EISSN:1557-7368
      DOI:10.1145/2897824
      Issue’s Table of Contents

      Copyright © 2016 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 11 July 2016
      Published in tog Volume 35, Issue 4

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader