ABSTRACT
Virtually all real-valued computations are carried out using floating-point data types and operations. The precision of these data types must be set with the goals of reducing the overall round-off error, but also emphasizing performance improvements. Often, a mixed-precision allocation achieves this optimum; unfortunately, there are no techniques available to compute such allocations and conservatively meet a given error target across all program inputs. In this work, we present a rigorous approach to precision allocation based on formal analysis via Symbolic Taylor Expansions, and error analysis based on interval functions. This approach is implemented in an automated tool called FPTuner that generates and solves a quadratically constrained quadratic program to obtain a precision-annotated version of the given expression. FPTuner automatically introduces all the requisite precision up and down casting operations. It also allows users to flexibly control precision allocation using constraints to cap the number of high precision operators as well as group operators to allocate the same precision to facilitate vectorization. We evaluate FPTuner by tuning several benchmarks and measuring the proportion of lower precision operators allocated as we increase the error threshold. We also measure the reduction in energy consumption resulting from executing mixed-precision tuned code on a real hardware platform. We observe significant energy savings in response to mixed-precision tuning, but also observe situations where unexpected compiler behaviors thwart intended optimizations.
- J.-M. Alliot, N. Durand, D. Gianazza, and J.-B. Gotteland. Finding and proving the optimum: Cooperative stochastic and deterministic search. In Proceedings of the 20th European Conference on Artificial Intelligence (ECAI), pages 55–60, 2012. Google ScholarDigital Library
- ARM NEON. ARM NEON general-purpose SIMD engine, 2016.Google Scholar
- D. Bailey and J. Borwein. High-precision arithmetic: Progress and challenges, 2013.Google Scholar
- S. Boldo. Deductive Formal Verification: How To Make Your Floating-Point Programs Behave. Thèse d’habilitation, Université Paris-Sud, 2014.Google Scholar
- S. Boldo, J.-H. Jourdan, X. Leroy, and G. Melquiond. Verified compilation of floating-point computations. Journal of Automated Reasoning, 54(2):135–163, 2015. Google ScholarDigital Library
- M. Brain, V. D’Silva, A. Griggio, L. Haller, and D. Kroening. Deciding floating-point logic with abstract conflict driven clause learning. Formal Methods in System Design (FMSD), 45(2):213–245, 2014. Google ScholarDigital Library
- M. Brain, C. Tinelli, P. Rümmer, and T. Wahl. An automatable formal semantics for IEEE-754 floating-point arithmetic. In Proceedings of the 22nd IEEE Symposium on Computer Arithmetic (ARITH), pages 160–167, 2015. Google ScholarDigital Library
- A. Brillout, D. Kroening, and T. Wahl. Mixed abstractions for floatingpoint arithmetic. In Proceedings of 9th International Conference on Formal Methods in Computer-Aided Design (FMCAD), pages 69–76, 2009.Google Scholar
- M. Burtscher, I. Zecena, and Z. Zong. Measuring GPU power with the K20 built-in sensor. In Proceedings of the 7th Workshop on General Purpose Processing Using GPUs (GPGPU), pages 28–36, 2014. Google ScholarCross Ref
- A. Buttari, J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, and S. Tomov. Exploiting mixed precision floating point hardware in scientific computations. In High Performance Computing and Grids in Action. IOS Press, 2008.Google Scholar
- A. Buttari, J. Dongarra, J. Kurzak, P. Luszczek, and S. Tomov. Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy. ACM Transactions on Mathematical Software (TOMS), 34(4):17:1–17:22, 2008. Google ScholarDigital Library
- C. Chan, J. Ansel, Y. L. Wong, S. Amarasinghe, and A. Edelman. Autotuning multigrid with PetaBricks. In Proceedings of the 21th International Conference on High Performance Computing, Networking, Storage and Analysis (SC), pages 1–12, 2009. Google ScholarDigital Library
- S. Chaudhuri, A. Farzan, and Z. Kincaid. Consistency analysis of decision-making programs. In Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 555–567, 2014. Google ScholarDigital Library
- W.-F. Chiang, G. Gopalakrishnan, Z. Rakamari´c, and A. Solovyev. Efficient search for inputs causing high floating-point errors. In Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 43–52, 2014. Google ScholarDigital Library
- W.-F. Chiang, G. Gopalakrishnan, and Z. Rakamari´c. Practical floating-point divergence detection. In Proceedings of the 29th International Workshop on Languages and Compilers for Parallel Computing (LCPC), pages 271–286, 2016. Google ScholarDigital Library
- Coq. The Coq proof assistant, 2016.Google Scholar
- E. Darulova and V. Kuncak. Trustworthy numerical computation in Scala. In Proceedings of the ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA), pages 325–344, 2011. Google ScholarDigital Library
- E. Darulova and V. Kuncak. Sound compilation of reals. In Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 235–248, 2014. Google ScholarDigital Library
- E. Darulova and V. Kuncak. Towards a compiler for reals. CoRR, abs/1410.0198, 2016.Google Scholar
- E. Darulova, V. Kuncak, R. Majumdar, and I. Saha. Synthesis of fixed-point programs. In Proceedings of the 11th ACM International Conference on Embedded Software (EMSOFT), pages 22:1–22:10, 2013. Google ScholarDigital Library
- F. De Dinechin, C. Q. Lauter, and G. Melquiond. Assisted verification of elementary functions using Gappa. In Proceedings of the 21st ACM Symposium on Applied Computing (SAC), pages 1318–1322, 2006. Google ScholarDigital Library
- S. Gao, S. Kong, and E. M. Clarke. dReal: An SMT solver for nonlinear theories over the reals. In Proceedings of the 24th International Conference on Automated Deduction (CADE), pages 208–214, 2013. Google ScholarDigital Library
- Gaol. Gaol, not just another interval library., 2016.Google Scholar
- D. Goldberg. What every computer scientist should know about floating-point arithmetic. ACM Computing Surveys, 23(1):5–48, 1991. Google ScholarDigital Library
- F. Goualard. How do you compute the midpoint of an interval? ACM Transactions on Mathematical Software (TOMS), 40(2):11:1–11:25, 2014. Google ScholarDigital Library
- E. Goubault and S. Putot. Static analysis of numerical algorithms. In Proceedings of the 13th International Static Analysis Symposium (SAS), pages 18–34, 2006. Google ScholarDigital Library
- S. Graillat, F. Jézéquel, R. Picot, F. Févotte, and B. Lathuiliére. PROMISE : floating-point precision tuning with stochastic arithmetic. In 17th International Symposium on Scientific Computing, Computer Arithmetics and Verified Numerics (SCAN), 2016.Google Scholar
- Gurobi. Gurobi optimizer, 2016.Google Scholar
- J. L. Gustafson. The End of Error: Unum Computing. Chapman and Hall/CRC, 2015.Google Scholar
- J. Harrison. Floating-point verification using theorem proving. In Proceedings of the 6th International Conference on Formal Methods for the Design of Computer, Communication, and Software Systems (SFM), pages 211–242, 2006. Google ScholarDigital Library
- HOL Light. The HOL Light theorem prover, 2016.Google Scholar
- IEEE 754. IEEE standard for floating-point arithmetic, 2008.Google Scholar
- Intel Intrinsics for Type Casting. Intel intrinsics guide, 2016.Google Scholar
- C. Jacobsen, A. Solovyev, and G. Gopalakrishnan. A parameterized floating-point formalizaton in HOL Light. In Proceedings of the 8th International Workshop on Numerical Software Verification (NSV), pages 101–107, 2015. Google ScholarDigital Library
- W. Kahan. How futile are mindless assessments of roundoff in floating-point computation?, 2006.Google Scholar
- M. Lagoudakis and M. Littman. Algorithm selection using reinforcement learning. In Proceedings of the 17th International Conference on Machine Learning (ICML), pages 511–518, 2000. Google ScholarDigital Library
- M. Lam, J. Hollingsworth, B. de Supinski, and M. Legendre. Automatically adapting programs for mixed-precision floating-point computation. In Proceedings of the 27th International Conference on Supercomputing (ICS), pages 369–378, 2013. Google ScholarDigital Library
- W. Lee, R. Sharma, and A. Aiken. Verifying bit-manipulations of floating-point. In Proceedings of the 37th annual ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI), pages 70–84, 2016. Google ScholarDigital Library
- M. Leeser, S. Mukherjee, J. Ramachandran, and T. Wahl. Make it real: Effective floating-point reasoning via exact arithmetic. In Proceedings of the Conference on Design, Automation & Test in Europe (DATE), pages 117:1–117:4, 2014. Google ScholarDigital Library
- lquadmath. The GCC quad-precision math library, 2016.Google Scholar
- V. Magron, G. A. Constantinides, and A. F. Donaldson. Certified roundoff error bounds using semidefinite programming. CoRR, abs/1507.03331, 2015.Google Scholar
- M. Martel. Program transformation for numerical precision. In ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM), pages 101–110, 2009. Google ScholarDigital Library
- T. P. Morgan. Nvidia tweaks Pascal GPU for deep learning push, 2015.Google Scholar
- S. Muralidharan, M. Shantharam, M. Hall, M. Garland, and B. Catanzaro. Nitro: A framework for adaptive code variant tuning. In Proceedings of the 28th IEEE International Parallel & Distributed Processing Symposium (IPDPS), pages 501–512, 2014. Google ScholarDigital Library
- P. Panchekha, A. Sanchez-Stern, J. R. Wilcox, and Z. Tatlock. Automatically improving accuracy for floating point expressions. In Proceedings of the 36th annual ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI), pages 1–11, 2015. Google ScholarDigital Library
- QCQP. Quadratically constrained quadratic program, 2016.Google Scholar
- C. Rubio-González, C. Nguyen, H. D. Nguyen, J. Demmel, W. Kahan, K. Sen, D. H. Bailey, C. Iancu, and D. Hough. Precimonious: Tuning assistant for floating-point precision. In Proceedings of the 25th International Conference on High Performance Computing, Networking, Storage and Analysis (SC), pages 1–12, 2013. Google ScholarDigital Library
- C. Rubio-González, C. Nguyen, B. Mehne, K. Sen, J. Demmel, W. Kahan, C. Iancu, W. Lavrijsen, D. H. Bailey, and D. Hough. Floatingpoint precision tuning using blame analysis. In Proceedings of the 38th International Conference on Software Engineering (ICSE), pages 1074–1085, 2016. Google ScholarDigital Library
- P. Rümmer and T. Wahl. An SMT-LIB theory of binary floating-point arithmetic. In Informal Proceedings of 8th International Workshop on Satisfiability Modulo Theories (SMT), 2010.Google Scholar
- E. Schkufza, R. Sharma, and A. Aiken. Stochastic optimization of floating-point programs with tunable precision. In Proceedings of the 35th annual ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI), pages 53–64, 2014. Google ScholarDigital Library
- J. Shewchuk. Adaptive precision floating-point arithmetic and fast robust geometric predicates. Discrete and Computational Geometry, 18(3):305–363, 1997.Google ScholarCross Ref
- A. Solovyev, C. Jacobsen, Z. Rakamari´c, and G. Gopalakrishnan. Rigorous estimation of floating-point round-off errors with Symbolic Taylor Expansions. In Proceedings of the 20th International Symposium on Formal Methods Formal (FM), pages 532–550, 2015.Google ScholarCross Ref
- E. Tang, E. Barr, X. Li, and Z. Su. Perturbing numerical calculations for statistical analysis of floating-point program (in)stability. In Proceedings of the 8th International Symposium on Software Testing and Analysis (ISSTA), pages 131–142, 2010. Google ScholarDigital Library
- R. Vuduc, J. Demmel, and J. Bilmes. Statistical models for empirical search-based performance tuning. International Journal of High Performance Computing Applications, 18(1):65–94, 2004. Google ScholarDigital Library
- D. Zou, R. Wang, Y. Xiong, L. Zhang, Z. Su, and H. Mei. A genetic algorithm for detecting significant floating-point inaccuracies. In Proceedings of the 37th International Conference on Software Engineering (ICSE), pages 529–539, 2015. Google ScholarDigital Library
Index Terms
- Rigorous floating-point mixed-precision tuning
Recommendations
Automatically adapting programs for mixed-precision floating-point computation
ICS '13: Proceedings of the 27th international ACM conference on International conference on supercomputingAs scientific computation continues to scale, efficient use of floating-point arithmetic processors is critical. Lower precision allows streaming architectures to perform more operations per second and can reduce memory bandwidth pressure on all ...
Rigorous floating-point mixed-precision tuning
POPL '17Virtually all real-valued computations are carried out using floating-point data types and operations. The precision of these data types must be set with the goals of reducing the overall round-off error, but also emphasizing performance improvements. ...
Comments