2008 | OriginalPaper | Buchkapitel
On the Efficiency of Python for High-Performance Computing: A Case Study Involving Stencil Updates for Partial Differential Equations
verfasst von : Hans Petter Langtangen, Xing Cai
Erschienen in: Modeling, Simulation and Optimization of Complex Processes
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
The purpose of this paper is to assess the loss of computational efficiency that may occur when scientific codes are written in the Python programming language instead of Fortran or C. Our test problems concern the application of a seven-point finite stencil for a three-dimensional, variable coefficient, Laplace operator. This type of computation appears in lots of codes solving partial differential equations, and the variable coefficient is a key ingredient to capture the arithmetic complexity of stencils arising in advanced multi-physics problems in heterogeneous media.
Different implementations of the stencil operation are described: pure Python loops over Python arrays, Psyco-acceleration of pure Python loops, vectorized loops (via shifted slice expressions), inline C++ code (via Weave), and migration of stencil loops to Fortran 77 (via F2py) and C. The performance of these implementations are compared against codes written entirely in Fortran 77 and C. We observe that decent performance is obtained with vectorization or migration of loops to compiled code. Vectorized loops run between two and five times slower than the pure Fortran and C codes. Mixed-language implementations, Python-Fortran and Python-C, where only the loops are implemented in Fortran or C, run at the same speed as the pure Fortran and C codes.
At present, there are three alternative (and to some extent competing) implementations of Numerical Python: numpy, numarray, and Numeric. Our tests uncover significant performance differences between these three alternatives. Numeric is fastest on scalar operations with array indexing, while numpy is fastest on vectorized operations with array slices.
We also present parallel versions of the stencil operations, where the loops are migrated to C for efficiency, and where the message passing statements are written in Python, using the high-level pypar interface to MPI. For the current test problems, there are hardly any efficiency loss by doing the message passing in Python. Moreover, adopting the Python interface of MPI gives a more elegant parallel implementation, both due to a simpler syntax of MPI calls and due to the efficient array slicing functionality that comes with Numerical Python.