2018 | Original Paper | Book Chapter

4. Basic Properties and Algorithms

Authors: Jean-Michel Muller, Nicolas Brunie, Florent de Dinechin, Claude-Pierre Jeannerod, Mioara Joldes, Vincent Lefèvre, Guillaume Melquiond, Nathalie Revol, Serge Torres

Published in: Handbook of Floating-Point Arithmetic

Publisher: Springer International Publishing

Abstract

In this chapter, we present some short yet useful algorithms and some basic properties that can be derived from specifications of floating-point arithmetic systems, such as the ones given in the successive IEEE 754 standards. Thanks to these standards, we now have an accurate definition of floating-point formats and operations. The behavior of a sequence of operations becomes at least partially predictable. We can therefore build algorithms and proofs that refer to these specifications.

Footnotes
1
In some cases, intermediate calculations may be performed in a wider internal format. Some examples are given in Section 3.2.
 
2
Beware! We remind the reader that by “no underflow” we mean that the absolute value of the result (before or after rounding, this depends on the definition) is not less than the smallest normal number \(\beta ^{e_{\mathrm{min}}}\). When subnormal numbers are available, as requested by the IEEE 754 standards, it is possible to represent smaller nonzero numbers, but with a precision that does not always suffice to represent the product exactly.
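A minimal sketch of the last point, assuming binary64 arithmetic with round-to-nearest (Python's `float`). The helper `product_is_exact` and the chosen operands are illustrative and do not come from the chapter. The same significand is multiplied by two powers of 2: one keeps the product in the normal range, the other pushes it into the subnormal range, where the low-order bit is lost.

```python
import math
from fractions import Fraction

def product_is_exact(x: float, y: float) -> bool:
    """Compare the rounded product with the exact rational product."""
    return Fraction(x) * Fraction(y) == Fraction(x * y)

a = 1.0 + 2.0 ** -52            # significand uses all 53 bits
normal = math.ldexp(1.0, -10)   # product stays in the normal range
tiny = math.ldexp(1.0, -1030)   # product falls below 2**-1022 (subnormal range)

print(product_is_exact(a, normal))  # True: scaling by a power of 2 is exact here
print(product_is_exact(a, tiny))    # False: the bit of weight 2**-1082 cannot be kept
```

In the second case the result is still a nonzero subnormal number, but it has fewer than 53 significant bits available, so the exact product is not representable.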
 
3
When \(|\mu| \leq \beta^{p} - 1\), the difference \(s - a\) is representable with exponent \(e_{a}\), but not necessarily in normal form. This is why the availability of subnormal numbers is necessary.
 
4
Do not forget that | a | ≥ | b | implies that the exponent of a is larger than or equal to that of b. Hence, it suffices to compare the two variables.
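The following sketch assumes that this footnote concerns the precondition of an error-free addition such as Fast2Sum (the algorithm named in the title of [49]); the footnote itself does not name it. The function name `fast_two_sum` and the magnitude test used to order the operands are illustrative choices, and the arithmetic is binary64 with round-to-nearest.

```python
from fractions import Fraction

def fast_two_sum(a: float, b: float):
    """Return (s, t) with s = RN(a + b) and t the rounding error of that sum."""
    # Comparing magnitudes is enough to guarantee the exponent condition.
    if abs(a) < abs(b):
        a, b = b, a
    s = a + b
    z = s - a      # the part of b that was actually absorbed into s
    t = b - z      # what was lost when rounding a + b
    return s, t

s, t = fast_two_sum(2.0 ** -60, 1.0)
# s + t equals the exact sum, checked here with rational arithmetic.
print(Fraction(s) + Fraction(t) == Fraction(1) + Fraction(2) ** -60)
```

Comparing `abs(a)` and `abs(b)` costs one test and avoids extracting exponents explicitly.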
 
6
In radix 2, we will use the fact that a \((2g + 1)\)-bit number can be split into two \(g\)-bit numbers. This explains why Dekker’s algorithm works if the precision is even or if the radix is 2.
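As an illustration in binary64, where the precision is 53 = 2 · 26 + 1, the sketch below uses the classical Veltkamp splitting with the constant 2^27 + 1. That this is the splitting the footnote has in mind is an assumption; the names `veltkamp_split` and `significand_bits` are illustrative. The extra bit is carried by the sign of the low part.

```python
import math

def veltkamp_split(x: float, s: int = 27):
    """Split x into (x_hi, x_lo) with x_hi + x_lo == x (assuming no overflow)."""
    c = float(2 ** s + 1)
    gamma = c * x
    delta = x - gamma
    x_hi = gamma + delta
    x_lo = x - x_hi
    return x_hi, x_lo

def significand_bits(v: float) -> int:
    """Number of bits between the leading and trailing nonzero bits of |v|."""
    if v == 0.0:
        return 0
    num, _ = abs(v).as_integer_ratio()
    num //= num & -num          # strip trailing zero bits
    return num.bit_length()

for x in (math.pi, 0.1, float(2 ** 53 - 1)):
    hi, lo = veltkamp_split(x)
    print(hi + lo == x, significand_bits(hi), significand_bits(lo))
```

For each input, both parts should need at most 26 significand bits, even though 26 + 26 is one less than 53.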
 
7
These assumptions hold on any “reasonable” floating-point system.
 
8
For example, in the IEEE 754 binary64 format, with \(x = (2^{53} - 1) \cdot 2^{940}\) and \(y = 2^{31}\), we obtain \(x_{h} = 2^{993}\) and \(y_{h} = y\). The floating-point multiplication \(\mathop{\mathrm{RN}}\nolimits (x_{h}y_{h})\) overflows, whereas \(\mathop{\mathrm{RN}}\nolimits (xy) =\varOmega = (2^{53} - 1) \cdot 2^{971}\).
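The values in this footnote can be checked directly in binary64, for instance with Python's `float`. The sketch takes the high parts as stated in the footnote rather than recomputing them; the function name `footnote_example` is illustrative.

```python
import math
import sys

def footnote_example():
    """Return (RN(x_h * y_h), RN(x * y)) for the values given in the footnote."""
    x = float(2 ** 53 - 1) * 2.0 ** 940   # exact: scaling by a power of 2
    y = 2.0 ** 31
    x_h = 2.0 ** 993                      # high part of x, as stated in the footnote
    y_h = y                               # high part of y, as stated in the footnote
    return x_h * y_h, x * y

high_product, full_product = footnote_example()
print(math.isinf(high_product))               # True: 2**1024 is out of range
print(full_product == sys.float_info.max)     # True: the direct product is the largest finite number
```

The direct product is finite, while the product of the high parts alone already overflows.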
 
9
Caution: M is not necessarily the integral significand of x.
 
10
Or q is the largest finite floating-point number Ω, in the case where xy is between that number and the overflow threshold (the same thing applies on the negative side).
 
11
Part of what we are going to explain does not generalize to decimal arithmetic.
 
12
For instance, the frcpa instruction of the IA-64 instruction set returns approximations to reciprocals with relative error less than or equal to \(2^{-8.886}\). Such tables are easily implemented using the bipartite method; see [131].
 
13
When the radix is an odd number, values exactly halfway between two consecutive floating-point numbers are represented with infinitely many digits.
 
14
A very similar study can be done when it is a power of 2.
 
15
A necessary and sufficient condition for all numbers representable in radix β with a finite number of digits to be representable in radix γ with a finite number of digits is that β should divide an integer power of γ.
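The condition can be tested without factoring either radix. The sketch below is one possible check, not taken from the chapter: it repeatedly removes from β every factor it shares with γ, and β divides a power of γ exactly when nothing is left. The function name `finite_in_radix` is illustrative.

```python
from math import gcd

def finite_in_radix(beta: int, gamma: int) -> bool:
    """True if beta divides some integer power of gamma."""
    rest = beta
    g = gcd(rest, gamma)
    while g > 1:
        while rest % g == 0:
            rest //= g          # remove factors shared with gamma
        g = gcd(rest, gamma)
    return rest == 1            # no prime factor of beta is foreign to gamma

print(finite_in_radix(2, 10))   # True: every finite binary number is a finite decimal number
print(finite_in_radix(10, 2))   # False: 1/10 has no finite binary expansion
```

This asymmetry is why a binary-to-decimal conversion can always be exact while a decimal-to-binary conversion generally has to round.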
 
16
This formula is valid for all possible values of \(p_{2}\) and \(e_{\mathrm{min}}\) (provided \(e_{\mathrm{min}} \approx -e_{\mathrm{max}}\)). And yet, for all usual formats, it can be simplified: A simple continued fraction argument (see Section A.1) shows that for \(p_{2} \geq 16\) and \(e_{\mathrm{min}} \geq -28000\), it is equal to
$$\displaystyle{-e_{\mathrm{min}} + p_{2} + \left \lfloor (e_{\mathrm{min}} + 1)\log _{10}(2)\right \rfloor.}$$
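The simplified expression is easy to evaluate for concrete formats. The sketch below only evaluates the displayed formula; the function name `simplified_value` is illustrative, and the format parameters are the usual ones for IEEE 754 binary32 and binary64, both of which satisfy the stated conditions on \(p_{2}\) and \(e_{\mathrm{min}}\).

```python
import math

def simplified_value(p2: int, e_min: int) -> int:
    """Evaluate -e_min + p2 + floor((e_min + 1) * log10(2))."""
    return -e_min + p2 + math.floor((e_min + 1) * math.log10(2))

print(simplified_value(24, -126))    # binary32 -> 112
print(simplified_value(53, -1022))   # binary64 -> 767
```

Floating-point evaluation of the logarithm is adequate here because the arguments of the floor are not close to integers for these parameters.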
 
17
At the time of writing this book, it can be obtained at http://www.netlib.org/fp/ (file dtoa.c).
 
18
The algorithm works for other radices. See [86] for details.
 
19
In round-to-nearest modes, it required that the error introduced by the conversion should be at most 0.97 ulps. The major reason for this somewhat weak requirement is that the conversion algorithms presented here were not known at the time that standard was designed.
 
20
Another solution consists in using a precomputed table of powers of 10 in the binary format.
 
21
If a wider internal format is available, one can use it and possibly save one step.
 
22
A straightforward analysis of the error induced by the truncation of the digit chain D would give \(|\epsilon_{2}| \leq 10^{-\min\{n-1,\,j\}}\), but when \(j \geq n - 1\), \(D^{{\ast}} =\hat{ D}\) and there is no truncation error at all.
 
23
The IEEE 754-2008 standard allows the conversion of out-of-range numbers, infinity or NaN. In that case, either there should be a dedicated signaling mechanism or the invalid operation exception should be signaled.
 
24
Unless \(u_{2}\) is a power of 2, but this case is easily handled separately.
 
References
[15]
S. F. Anderson, J. G. Earle, R. E. Goldschmidt, and D. M. Powers. The IBM 360/370 model 91: floating-point execution unit. IBM Journal of Research and Development, 1967. Reprinted in [583].
[16]
M. Andrysco, R. Jhala, and S. Lerner. Printing floating-point numbers: A faster, always correct method. In 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 555–567, 2016.
[41]
G. Bohlender, W. Walter, P. Kornerup, and D. W. Matula. Semantics for exact floating point operations. In 10th IEEE Symposium on Computer Arithmetic (ARITH-10), pages 22–26, June 1991.
[42]
S. Boldo. Pitfalls of a full floating-point proof: example on the formal proof of the Veltkamp/Dekker algorithms. In 3rd International Joint Conference on Automated Reasoning (IJCAR), volume 4130 of Lecture Notes in Computer Science, pages 52–66, Seattle, WA, USA, 2006.
[46]
S. Boldo and M. Daumas. Representable correcting terms for possibly underflowing floating point operations. In 16th IEEE Symposium on Computer Arithmetic (ARITH-16), pages 79–86, Santiago de Compostela, Spain, 2003.
[49]
S. Boldo, S. Graillat, and J.-M. Muller. On the robustness of the 2Sum and Fast2Sum algorithms. ACM Transactions on Mathematical Software, 44(1):4:1–4:14, 2017.
[51]
S. Boldo, J.-H. Jourdan, X. Leroy, and G. Melquiond. Verified compilation of floating-point computations. Journal of Automated Reasoning, 54(2):135–163, 2015.
[52]
S. Boldo and G. Melquiond. Emulation of FMA and correctly rounded sums: proved algorithms using rounding to odd. IEEE Transactions on Computers, 57(4):462–471, 2008.
[54]
S. Boldo and G. Melquiond. Computer Arithmetic and Formal Proofs. ISTE Press – Elsevier, 2017.
[55]
S. Boldo and J.-M. Muller. Exact and approximated error of the FMA. IEEE Transactions on Computers, 60(2):157–164, 2011.
[57]
A. D. Booth. A signed binary multiplication technique. Quarterly Journal of Mechanics and Applied Mathematics, 4(2):236–240, 1951. Reprinted in [583].
[74]
N. Brisebarre and J.-M. Muller. Correctly rounded multiplication by arbitrary precision constants. IEEE Transactions on Computers, 57(2):165–174, 2008.
[86]
R. G. Burger and R. K. Dybvig. Printing floating-point numbers quickly and accurately. In SIGPLAN’96 Conference on Programming Languages Design and Implementation (PLDI), pages 108–116, June 1996.
[101]
W. D. Clinger. How to read floating-point numbers accurately. ACM SIGPLAN Notices, 25(6):92–101, 1990.
[102]
W. D. Clinger. Retrospective: how to read floating-point numbers accurately. ACM SIGPLAN Notices, 39(4):360–371, 2004.
[118]
M. Cornea, J. Harrison, and P. T. P. Tang. Scientific Computing on Itanium®-based Systems. Intel Press, Hillsboro, OR, 2002.
[120]
M. A. Cornea-Hasegan, R. A. Golliver, and P. Markstein. Correctness proofs outline for Newton–Raphson based floating-point divide and square root algorithms. In 14th IEEE Symposium on Computer Arithmetic (ARITH-14), pages 96–105, April 1999.
[128]
A. Dahan-Dalmedico and J. Pfeiffer. Histoire des Mathématiques. Editions du Seuil, Paris, 1986. In French.
[130]
D. Das Sarma and D. W. Matula. Measuring the accuracy of ROM reciprocal tables. IEEE Transactions on Computers, 43(8):932–940, 1994.
[131]
D. Das Sarma and D. W. Matula. Faithful bipartite ROM reciprocal tables. In 12th IEEE Symposium on Computer Arithmetic (ARITH-12), pages 17–28, June 1995.
[132]
D. Das Sarma and D. W. Matula. Faithful interpolation in reciprocal tables. In 13th IEEE Symposium on Computer Arithmetic (ARITH-13), pages 82–91, July 1997.
[142]
F. de Dinechin, A. V. Ershov, and N. Gast. Towards the post-ultimate libm. In 17th IEEE Symposium on Computer Arithmetic (ARITH-17), pages 288–295, 2005.
[156]
F. de Dinechin and A. Tisserand. Multipartite table methods. IEEE Transactions on Computers, 54(3):319–330, 2005.
[158]
T. J. Dekker. A floating-point technique for extending the available precision. Numerische Mathematik, 18(3):224–242, 1971.
[186]
M. D. Ercegovac and T. Lang. Division and Square Root: Digit-Recurrence Algorithms and Implementations. Kluwer Academic Publishers, Boston, MA, 1994.
[187]
M. D. Ercegovac and T. Lang. Digital Arithmetic. Morgan Kaufmann Publishers, San Francisco, CA, 2004.
[189]
M. Ercegovac, J.-M. Muller, and A. Tisserand. Simple seed architectures for reciprocal and square root reciprocal. In 39th Asilomar Conference on Signals, Systems, and Computers, November 2005.
[197]
G. Even, P.-M. Seidel, and W. E. Ferguson. A parametric error analysis of Goldschmidt’s division algorithm. Journal of Computer and System Sciences, 70(1):118–139, 2005.
[199]
W. E. Ferguson, Jr. Exact computation of a sum or difference with applications to argument reduction. In 12th IEEE Symposium on Computer Arithmetic (ARITH-12), pages 216–221, Bath, UK, July 1995.
[205]
D. Fowler and E. Robson. Square root approximations in old Babylonian mathematics: YBC 7289 in context. Historia Mathematica, 25:366–378, 1998.
[211]
D. M. Gay. Correctly-rounded binary-decimal and decimal-binary conversions. Technical Report Numerical Analysis Manuscript 90–10, AT&T Bell Laboratories (Murray Hill, NJ), November 1990.
[212]
W. M. Gentleman and S. B. Marovitch. More on algorithms that reveal properties of floating-point arithmetic units. Communications of the ACM, 17(5):276–277, 1974.
[215]
I. B. Goldberg. 27 bits are not enough for 8-digit accuracy. Commun. ACM, 10(2):105–106, 1967.
[217]
R. E. Goldschmidt. Applications of division by convergence. Master’s thesis, Dept. of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA, June 1964.
[249]
J. R. Hauser. Handling floating-point exceptions in numeric programs. ACM Transactions on Programming Languages and Systems, 18(2):139–174, 1996.
[313]
W. Kahan. Pracniques: further remarks on reducing truncation errors. Communications of the ACM, 8(1):40, 1965.
[329]
R. Karpinsky. PARANOIA: a floating-point benchmark. BYTE, 10(2), 1985.
[342]
D. E. Knuth. The Art of Computer Programming, volume 2. Addison-Wesley, Reading, MA, 3rd edition, 1998.
[347]
P. Kornerup, V. Lefèvre, N. Louvet, and J.-M. Muller. On the computation of correctly rounded sums. IEEE Transactions on Computers, 61(3):289–298, 2012.
[350]
P. Kornerup and J.-M. Muller. Choosing starting values for certain Newton–Raphson iterations. Theoretical Computer Science, 351(1):101–110, 2006.
[391]
S. Linnainmaa. Software for doubled-precision floating-point computations. ACM Transactions on Mathematical Software, 7(3):272–283, 1981.
[394]
F. Loitsch. Printing floating-point numbers quickly and accurately with integers. In 31st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’10), pages 233–243, 2010.
[403]
M. A. Malcolm. Algorithms to reveal properties of floating-point arithmetic. Communications of the ACM, 15(11):949–951, 1972.
[405]
P. Markstein. Computation of elementary functions on the IBM RISC System/6000 processor. IBM Journal of Research and Development, 34(1):111–119, 1990.
[406]
P. Markstein. IA-64 and Elementary Functions: Speed and Precision. Hewlett-Packard Professional Books. Prentice-Hall, Englewood Cliffs, NJ, 2000.
[440]
J.-M. Muller. Avoiding double roundings in scaled Newton-Raphson division. In 47th Asilomar Conference on Signals, Systems, and Computers, pages 396–399, November 2013.
[442]
J.-M. Muller. Elementary Functions, Algorithms and Implementation. Birkhäuser Boston, MA, 3rd edition, 2016.
[458]
I. Newton. Methodus Fluxionum et Serierum Infinitarum. 1664–1671.
[481]
A. Panhaleux. Génération d’itérations de type Newton-Raphson pour la division de deux flottants à l’aide d’un FMA. Master’s thesis, École Normale Supérieure de Lyon, Lyon, France, 2008. In French.
[482]
B. Parhami. On the complexity of table lookup for iterative division. IEEE Transactions on Computers, C-36(10):1233–1236, 1987.
[492]
J. A. Pineiro and J. D. Bruguera. High-speed double-precision computation of reciprocal, division, square root, and inverse square root. IEEE Transactions on Computers, 51(12):1377–1388, 2002.
[496]
D. M. Priest. On Properties of Floating-Point Arithmetics: Numerical Stability and the Cost of Accurate Computations. Ph.D. thesis, University of California at Berkeley, 1992.
[517]
S. M. Rump. Solving algebraic problems with high accuracy (Habilitationsschrift). In A New Approach to Scientific Computation, pages 51–120, 1983.
[531]
S. M. Rump, T. Ogita, and S. Oishi. Accurate floating-point summation part I: Faithful rounding. SIAM Journal on Scientific Computing, 31(1):189–224, 2008.
[555]
J. R. Shewchuk. Adaptive precision floating-point arithmetic and fast robust geometric predicates. Discrete Computational Geometry, 18:305–363, 1997.
[558]
T. Simpson. Essays on several curious and useful subjects in speculative and mix’d mathematicks, illustrated by a variety of examples. London, 1740.
[566]
G. L. Steele, Jr. and J. L. White. How to print floating-point numbers accurately. ACM SIGPLAN Notices, 25(6):112–126, 1990.
[567]
G. L. Steele, Jr. and J. L. White. Retrospective: how to print floating-point numbers accurately. ACM SIGPLAN Notices, 39(4):372–389, 2004.
[572]
P. H. Sterbenz. Floating-Point Computation. Prentice-Hall, Englewood Cliffs, NJ, 1974.
[574]
J. E. Stine and M. J. Schulte. The symmetric table addition method for accurate function approximation. Journal of VLSI Signal Processing, 21:167–177, 1999.
[616]
G. W. Veltkamp. ALGOL procedures voor het berekenen van een inwendig product in dubbele precisie. Technical Report 22, RC-Informatie, Technische Hogeschool Eindhoven, 1968.
[617]
G. W. Veltkamp. ALGOL procedures voor het rekenen in dubbele lengte. Technical Report 21, RC-Informatie, Technische Hogeschool Eindhoven, 1969.
[639]
Metadata
Title
Basic Properties and Algorithms
Authors
Jean-Michel Muller
Nicolas Brunie
Florent de Dinechin
Claude-Pierre Jeannerod
Mioara Joldes
Vincent Lefèvre
Guillaume Melquiond
Nathalie Revol
Serge Torres
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-76526-6_4