research-article
Open Access

Green AI

Published: 17 November 2020

Abstract

Creating efficiency in AI research will decrease its carbon footprint and increase its inclusivity, as deep learning study should not require the deepest pockets.


Published in

Communications of the ACM, Volume 63, Issue 12 (December 2020), 92 pages.
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/3437360

Copyright © 2020 Owner/Author. This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher: Association for Computing Machinery, New York, NY, United States

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article
          • Popular
          • Refereed

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader

        HTML Format

        View this article in HTML Format .

        View HTML Format