2015 | Original Paper | Book Chapter

Capturing the Expert: Generating Fast Matrix-Multiply Kernels with Spiral

Written by: Richard Veras, Franz Franchetti

Published in: High Performance Computing for Computational Science -- VECPAR 2014

Publisher: Springer International Publishing

Abstract

Matrix-matrix multiplication (MMM) is a fundamental operation in scientific computing. Achieving floating-point peak performance with this operation requires expert knowledge of linear algebra and computer architecture to craft an implementation tuned for a given microarchitecture. To do this, an expert follows a fixed sequence of steps: first, the expert derives an algorithm from models found in the literature; then, the expert applies optimizations that are well suited to the target architecture; lastly, the expert expresses the implementation in assembly code. In this paper, we argue that this process is mechanical and can be captured in a rule-based program generation system such as Spiral. We then show that, given this machinery, Spiral can produce code for large MMM implementations that is competitive with hand-tuned code.
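
The first two steps of that process (derive a layered algorithm, then apply blocking suited to the target) can be illustrated with a minimal sketch. This is not Spiral output and not the authors' code. It only shows the general structure commonly described in the high-performance MMM literature: blocking loops wrapped around a small micro-kernel that applies a sequence of rank-1 updates to one tile of the result. The function names and the block sizes `MR`, `NR`, and `KC` are hypothetical and chosen for readability, not performance.

```python
# Illustrative sketch only. All names and block sizes are assumptions,
# not identifiers or values taken from the paper or from Spiral.

def naive_mmm(A, B, m, n, k):
    """Reference triple loop computing C = A * B on lists of lists."""
    C = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += A[i][p] * B[p][j]
            C[i][j] = acc
    return C


def micro_kernel(A, B, C, i0, j0, p0, mr, nr, kc):
    """Update the mr x nr tile of C at (i0, j0) with kc rank-1 updates."""
    for p in range(p0, p0 + kc):
        for i in range(i0, i0 + mr):
            a = A[i][p]  # one element of the current column of A
            for j in range(j0, j0 + nr):
                C[i][j] += a * B[p][j]


def blocked_mmm(A, B, m, n, k, MR=2, NR=2, KC=4):
    """Blocked C = A * B: loops over tiles, with edge tiles clipped."""
    C = [[0.0] * n for _ in range(m)]
    for p0 in range(0, k, KC):            # block the shared dimension
        kc = min(KC, k - p0)
        for i0 in range(0, m, MR):        # block the rows of C
            mr = min(MR, m - i0)
            for j0 in range(0, n, NR):    # block the columns of C
                nr = min(NR, n - j0)
                micro_kernel(A, B, C, i0, j0, p0, mr, nr, kc)
    return C
```

In a tuned implementation, the micro-kernel is the part an expert would typically write in assembly or with vector intrinsics, keeping the tile of C in registers. The surrounding loops are the part that lends itself to being derived by rewrite rules.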

Metadata
Title
Capturing the Expert: Generating Fast Matrix-Multiply Kernels with Spiral
Written by
Richard Veras
Franz Franchetti
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-17353-5_20