Published in: International Journal of Parallel Programming 3/2019

29.05.2019

Float-Fix: An Efficient and Hardware-Friendly Data Type for Deep Neural Network

Authors: Dong Han, Shengyuan Zhou, Tian Zhi, Yibo Wang, Shaoli Liu


Abstract

In recent years, as deep learning has risen in prominence, neural network accelerators have boomed. Existing research shows that both speed and energy efficiency can be improved by low-precision data representations. However, decreasing the precision of data may compromise the usefulness and accuracy of the underlying model, and existing formats cannot meet the requirements of all AI applications. In this paper, we propose a new data type called Float-Fix (FF). We introduce the structure of FF and compare it with other data types. In our evaluation, the accuracy loss of 8-bit FF is less than 0.12% on a set of well-known neural network models, 7\(\times\) better than fixed-point, DFX and floating-point on average. We implement the hardware architectures of the operators and a neural processing unit using the 8-bit FF data type with the TSMC 65 nm Gplus High VT library. The experiments show that the hardware cost of the converters between 16-bit fixed-point and FF is very small, and the 8-bit FF multiplier needs only 1188 \(\upmu\mathrm{m}^2\) of area, nearly the same as an 8-bit fixed-point multiplier. Compared with the neural processing unit of DianNao, FF reduces area by 34.3%.
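The abstract does not spell out the bit layout of FF, but the trade-off it targets can be illustrated by contrasting plain 8-bit fixed-point quantization with a hypothetical 8-bit hybrid format that spends a few bits on a per-value exponent. The sketch below is purely illustrative: the field widths (1 sign bit, 3 exponent bits, 4 mantissa bits) and the quantization routine are assumptions for demonstration, not the FF layout defined in the paper.

    import numpy as np

    def quantize_fixed(x, total_bits=8, frac_bits=6):
        """Signed fixed-point with a static binary point (illustrative)."""
        scale = 1 << frac_bits
        lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
        return np.clip(np.round(x * scale), lo, hi) / scale

    def quantize_hybrid(x, exp_bits=3, man_bits=4):
        """Hypothetical sign/exponent/mantissa hybrid in 8 bits.
        NOT the actual Float-Fix layout; field widths are assumptions."""
        sign, mag = np.sign(x), np.abs(x)
        e_min, e_max = -(1 << (exp_bits - 1)), (1 << (exp_bits - 1)) - 1
        # per-value exponent, limited to the small exponent range
        e = np.clip(np.floor(np.log2(np.maximum(mag, 2.0 ** e_min))), e_min, e_max)
        step = 2.0 ** (e - man_bits)  # resolution available at that exponent
        mant = np.minimum(np.round(mag / step), (1 << (man_bits + 1)) - 1)
        return sign * mant * step

    x = np.array([0.003, 0.07, 0.5, 3.8, 42.0])
    print(quantize_fixed(x))   # tiny values vanish, large values clip near 2
    print(quantize_hybrid(x))  # same 8 bits, much wider dynamic range

The only point of the comparison is that adding a small exponent at the same bit width trades a little resolution for a much wider dynamic range, which is the kind of trade-off the abstract attributes to FF relative to fixed-point and DFX.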

References
1. Albericio, J., Judd, P., Hetherington, T., Aamodt, T., Jerger, N.E., Moshovos, A.: Cnvlutin: ineffectual-neuron-free deep neural network computing. In: International Symposium on Computer Architecture, pp. 1–13 (2016)
2. Chen, T., Du, Z., Sun, N., Wang, J., Wu, C., Chen, Y., Temam, O.: Diannao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. In: Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 269–284. ACM (2014)
3. Chen, Y., Luo, T., Liu, S., Zhang, S., He, L., Wang, J., Li, L., Chen, T., Xu, Z., Sun, N.: Dadiannao: a machine-learning supercomputer. In: IEEE/ACM International Symposium on Microarchitecture, pp. 609–622 (2014)
4. Courbariaux, M., Hubara, I., Soudry, D., et al.: Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or −1 (2016). arXiv:1602.02830
5. Courbariaux, M., Bengio, Y., David, J.P.: Binaryconnect: training deep neural networks with binary weights during propagations, pp. 3123–3131 (2015)
6. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Li, F.F.: Imagenet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248–255 (2009)
7. Dettmers, T.: 8-bit approximations for parallelism in deep learning (2016)
8. Ding, C., Liao, S., Wang, Y., Li, Z., Liu, N., Zhuo, Y., Wang, C., Qian, X., Bai, Y., Yuan, G.: Circnn: accelerating and compressing deep neural networks using block-circulant weight matrices (2017)
9. Du, Z., Fasthuber, R., Chen, T., Ienne, P., Li, L., Luo, T., Feng, X., Chen, Y., Temam, O.: Shidiannao. ACM SIGARCH Comput. Architect. News 43(3), 92–104 (2015)
10. Du, Z., Lingamneni, A., Chen, Y., Palem, K., Temam, O., Wu, C.: Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators. In: Design Automation Conference, pp. 201–206 (2014)
11. Ewe, C.T., Cheung, P.Y.K., Constantinides, G.A.: Dual fixed-point: an efficient alternative to floating-point computation, pp. 200–208 (2004)
12. Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and huffman coding. Fiber 56(4), 3–7 (2015)
13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition, pp. 770–778 (2015)
14. Hubara, I., Courbariaux, M., Soudry, D., et al.: Quantized neural networks: training neural networks with low precision weights and activations. J. Mach. Learn. Res. 18(1), 6869–6898 (2017)
15. Jouppi, N.P., Young, C., Patil, N., Patterson, D., Agrawal, G., Bajwa, R., Bates, S., Bhatia, S., Boden, N., Borchers, A., et al.: In-datacenter performance analysis of a tensor processing unit. In: Proceedings of the 44th Annual International Symposium on Computer Architecture, pp. 1–12. ACM (2017)
16. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: International Conference on Neural Information Processing Systems, pp. 1097–1105 (2012)
17. Köster, U., Webb, T.J., Wang, X., Nassar, M., Bansal, A.K., Constable, W.H., Elibol, O.H., Gray, S., Hall, S., Hornof, L.: Flexpoint: an adaptive numerical format for efficient training of deep neural networks (2017)
18. Lin, D.D., Talathi, S.S., Annapureddy, V.S., et al.: Fixed point quantization of deep convolutional networks. In: International Conference on Machine Learning, pp. 2849–2858 (2016)
19. Liu, D., Chen, T., Liu, S., Zhou, J., Zhou, S., Teman, O., Feng, X., Zhou, X., Chen, Y.: Pudiannao: a polyvalent machine learning accelerator. In: Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 369–381 (2015)
20. Liu, S., Du, Z., Tao, J., Han, D., Luo, T., Xie, Y., Chen, Y., Chen, T.: Cambricon: an instruction set architecture for neural networks. In: Proceedings of the 43rd International Symposium on Computer Architecture, pp. 393–405. IEEE Press (2016)
21. Luo, T., Liu, S., He, L., Wang, J., Li, L., Chen, T., Xu, Z., Sun, N.: Dadiannao: a machine-learning supercomputer. In: IEEE/ACM International Symposium on Microarchitecture, pp. 609–622 (2015)
22. Mellempudi, N., Kundu, A., Das, D., Mudigere, D., Kaul, B.: Mixed low-precision deep learning inference using dynamic fixed point (2017)
23. Miyashita, D., Lee, E.H., Murmann, B.: Convolutional neural networks using logarithmic data representation (2016)
24. Parashar, A., Rhu, M., Mukkara, A., Puglielli, A., Venkatesan, R., Khailany, B., Emer, J., Keckler, S.W., Dally, W.J.: Scnn: an accelerator for compressed-sparse convolutional neural networks, pp. 27–40 (2017)
25. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Computer Science (2014)
26. Te Ewe, C., Cheung, P.Y.K., Constantinides, G.A.: Dual fixed-point: an efficient alternative to floating-point computation. In: International Conference on Field Programmable Logic and Applications, pp. 200–208 (2004)
27. Vanhoucke, V., Senior, A., Mao, M.Z.: Improving the speed of neural networks on CPUs. In: Deep Learning and Unsupervised Feature Learning Workshop, NIPS (2011)
28. Zhang, S., Du, Z., Zhang, L., Lan, H., Liu, S., Li, L., Guo, Q., Chen, T., Chen, Y.: Cambricon-x: an accelerator for sparse neural networks. In: IEEE/ACM International Symposium on Microarchitecture, pp. 1–12 (2016)
29. Zhou, A., Yao, A., Guo, Y., Xu, L., Chen, Y.: Incremental network quantization: towards lossless CNNs with low-precision weights (2017)
Metadata
Title: Float-Fix: An Efficient and Hardware-Friendly Data Type for Deep Neural Network
Authors: Dong Han, Shengyuan Zhou, Tian Zhi, Yibo Wang, Shaoli Liu
Publication date: 29.05.2019
Publisher: Springer US
Published in: International Journal of Parallel Programming, Issue 3/2019
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI: https://doi.org/10.1007/s10766-018-00626-7