
Electric Power Systems Research

Volume 116, November 2014, Pages 87-93

GPU-based power flow analysis with Chebyshev preconditioner and conjugate gradient method

https://doi.org/10.1016/j.epsr.2014.05.005

Highlights

  • In this paper, a GPU-based Chebyshev preconditioner is developed and integrated with an iterative conjugate gradient (CG) solver.

  • This work targets the power flow equations in power systems, as well as any sparse linear system that is symmetric positive definite.

  • Our work considers the Chebyshev preconditioner and the conjugate gradient method together to choose a proper degree for the Chebyshev preconditioner.

  • The maximum speedup reaches 46× for the Chebyshev preconditioner and 4× for the CG solver across all sample systems, compared with the corresponding Matlab baseline.

  • The results suggest that the iterative solver should be considered to further improve the overall performance of solving linear equations.

Abstract

Traditionally, linear equations in power system applications are solved by direct methods based on LU decomposition. With the development of advanced power system controls, the industrial and research communities are increasingly interested in simulating larger, interconnected power grids. Iterative methods such as the conjugate gradient method have been applied to power system applications in the literature for their parallelism potential on larger systems. The preconditioner, which conditions the linear system for a better convergence rate in iterative computations, is an indispensable part of the iterative solution process. This work implements a Chebyshev polynomial preconditioner on the graphics processing unit (GPU) and integrates it with a GPU-based conjugate gradient solver. Results show that the GPU-based Chebyshev preconditioner can reach around 46× speedup for the largest test system, and the conjugate gradient solver can gain more than 4× speedup. This demonstrates great potential for GPU applications in power system simulation.

Introduction

Power flow is the fundamental component in power system analysis and simulation. It is usually modeled as a nonlinear system. The Newton–Raphson method converts this nonlinear system to a group of linear equations with the introduction of the Jacobian matrix, as Eq. (1) shows:

$$\begin{bmatrix}\Delta P\\ \Delta Q\end{bmatrix} = \begin{bmatrix}\dfrac{\partial \Delta P}{\partial \delta} & \dfrac{\partial \Delta P}{\partial V}\\[4pt] \dfrac{\partial \Delta Q}{\partial \delta} & \dfrac{\partial \Delta Q}{\partial V}\end{bmatrix} \begin{bmatrix}\Delta \delta\\ \Delta V\end{bmatrix} \tag{1}$$

Each iteration of the Newton–Raphson method requires solving a set of sparse linear equations. We measured the linear equation solving time and the total run time for large systems in MATPOWER. The results show that about 40–50% of the total time is spent on solving linear equations. Therefore, improving the efficiency of solving the linear system is of great importance for accelerating power flow analysis.
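
For context, a single Newton–Raphson step can be sketched as follows; this is the standard textbook form, with generic symbols $x$, $f$, and $J$ that are not taken from the paper itself:

```latex
% One Newton-Raphson step for the nonlinear power flow mismatch equations f(x) = 0,
% where x stacks the bus voltage angles and magnitudes and J is the Jacobian of f.
J(x_k)\,\Delta x_k = -f(x_k), \qquad x_{k+1} = x_k + \Delta x_k
```

The linear system solved at each step is exactly the sparse Jacobian system of Eq. (1), which is why the linear solver dominates the run time.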

LU factorization, the most commonly used direct method, is widely deployed in solving power flows. However, LU factorization is intrinsically a serial algorithm and is difficult to parallelize due to the tight data dependency during factorization. LU factorization is effective for smaller systems, while its performance is limited for larger systems. Leon and Semlyen have shown that the iterative solver can provide about 25% performance improvement over the LU direct solver for systems larger than a 3183-bus system [1].

On one hand, iterative methods have been adapted to power system computation in various aspects. Pai et al. [2] implemented the Generalized Minimal Residual (GMRES) method on a Cray machine for dynamic power system simulation. Pai and Dag [3] further applied several iterative solvers, including conjugate gradient and GMRES, to dynamic power flow simulation and state estimation. On the other hand, the graphics processing unit (GPU) has recently been widely adopted in high performance computing as a parallel hardware architecture. The GPU was originally designed for graphics display and processing, and it has massive parallel computing units on board to perform graphics computations. The Compute Unified Device Architecture (CUDA) [4] provides a C-like programming interface for users to utilize these computational resources. As a co-processor, the GPU helps a commodity server deliver more computational throughput.

There has been research on GPU implementations of iterative methods in different fields. Helfenstein and Koko implemented an SSOR preconditioner and conjugate gradient solver on the GPU to solve generalized Poisson equations [5]. Zhang and Zhang presented a sliced block ELLPACK format to implement a least-squares polynomial preconditioned conjugate gradient method for finite element problems [6]. Research based on high performance computing, especially GPU-related methods, has begun to emerge in power system applications as well. Garcia implemented a preconditioned biconjugate gradient method with GPU and CUDA [7]. Li et al. [8] discussed the limitation of using a direct solver for large systems, and introduced a GPU-based conjugate gradient normal residual method with a Jacobi preconditioner. A multifrontal method using the CUDA library was also adapted to solve AC power flow [9]. Gopal et al. [10] implemented a DC power flow based contingency analysis on the GPU. Kamiabad also presented a prototype implementation of a Chebyshev polynomial preconditioner and conjugate gradient method with CUBLAS [11]. However, the speedup of their Chebyshev preconditioner is limited to 8×.

In this work, a Chebyshev polynomial preconditioner is implemented on the graphics processing unit (GPU) and integrated with a GPU-based conjugate gradient solver for linearized DC power flows. Results show that our GPU-based Chebyshev preconditioner can reach around 46× speedup and the conjugate gradient solver can gain more than 4× speedup compared with the corresponding CPU implementation. This demonstrates great potential for GPU applications in power systems.

The rest of this paper is organized as follows. Section 2 takes a closer look at iterative solution methods and the Chebyshev polynomial preconditioner. Section 3 introduces GPU and CUDA technology in detail. Section 4 presents the algorithm used in this work and the corresponding GPU-based implementation. Computational experiments are shown in Section 5. A further discussion is given in Section 6. Section 7 concludes the paper.

Section snippets

Conjugate gradient

The linear system Ax = b can be solved either directly by using LU decomposition, or indirectly by finding the minimum of the quadratic form in Eq. (2):

$$f(x) = \frac{1}{2}x^{T}Ax - b^{T}x + c \tag{2}$$

In Eq. (2), A is a symmetric positive definite matrix and b is a vector. The gradient of Eq. (2) is $f'(x) = Ax - b$. Therefore, the x that minimizes Eq. (2) satisfies $Ax - b = 0$; thus, it is the solution of Ax = b as well.

An intuitive way to find the minimum of Eq. (2) is to use the steepest descent
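
For reference, the standard steepest-descent and conjugate gradient updates for Eq. (2) can be sketched as follows; this is the textbook form, not necessarily the exact presentation used in the paper:

```latex
% Steepest descent: move along the negative gradient, i.e. the residual r_k = b - A x_k
\alpha_k = \frac{r_k^{T} r_k}{r_k^{T} A r_k}, \qquad x_{k+1} = x_k + \alpha_k r_k
% Conjugate gradient: replace raw residuals with A-conjugate search directions p_k
\alpha_k = \frac{r_k^{T} r_k}{p_k^{T} A p_k}, \qquad
x_{k+1} = x_k + \alpha_k p_k, \qquad
r_{k+1} = r_k - \alpha_k A p_k
\beta_k = \frac{r_{k+1}^{T} r_{k+1}}{r_k^{T} r_k}, \qquad
p_{k+1} = r_{k+1} + \beta_k p_k
```

The only matrix operation per iteration is the sparse matrix–vector product $A p_k$, which is the main source of GPU parallelism.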

GPU and CUDA

GPU for general purpose computation has been widely deployed nowadays. The GPU was originally designed for graphics processing, which requires intensive floating point computation. Before the release of CUDA, only a few general purpose computations could run on the GPU, and they had to be performed through the graphics application programming interface (API). The high learning cost of the graphics API limited the development of general purpose computation on GPUs. CUDA introduces a C-like
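
As a minimal illustration of this C-like programming model (a generic example, not code from the paper), a double-precision AXPY kernel and its launch look like this:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// y = a*x + y, one thread per vector element
__global__ void daxpy(int n, double a, const double *x, double *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    double *x, *y;
    // Managed memory keeps the example short; cudaMalloc/cudaMemcpy also work
    cudaMallocManaged(&x, n * sizeof(double));
    cudaMallocManaged(&y, n * sizeof(double));
    for (int i = 0; i < n; ++i) { x[i] = 1.0; y[i] = 2.0; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    daxpy<<<blocks, threads>>>(n, 3.0, x, y);   // C-like kernel launch syntax
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);                // expect 5.0
    cudaFree(x); cudaFree(y);
    return 0;
}
```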

Chebyshev preconditioner algorithm

The Chebyshev preconditioner algorithm is presented in Fig. 1. β is the largest eigenvalue of matrix A, and α is the smallest eigenvalue of matrix A. ratio is used to estimate the value of α. Z transforms A's spectrum to [−1,1]. The decay rate c_k is related to α and β. r is the degree of the Chebyshev preconditioner. Dag and Semlyen [12] discussed how to choose ratio and r in detail. Matrix G is the approximation of A's inverse. The output of the Chebyshev preconditioner algorithm is matrix G. Bolded lines are
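
As a hedged sketch of the construction described above (the exact coefficient choices follow Dag and Semlyen [12]), the spectrum transform and the Chebyshev polynomial approximation of A's inverse take the form:

```latex
% Map the spectrum of A from [alpha, beta] onto [-1, 1]
Z = \frac{2A - (\beta + \alpha)I}{\beta - \alpha}
% Chebyshev polynomials of Z via the three-term recurrence
T_0(Z) = I, \qquad T_1(Z) = Z, \qquad T_{k+1}(Z) = 2Z\,T_k(Z) - T_{k-1}(Z)
% Degree-r polynomial preconditioner approximating the inverse of A;
% the coefficients c_k decay at a rate governed by alpha and beta (see [12])
G = \sum_{k=0}^{r} c_k\, T_k(Z) \approx A^{-1}
```

Because G is built entirely from matrix–matrix products with A, its construction maps naturally onto the GPU, in contrast to factorization-based preconditioners.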

Computational experiment

This section presents the computational experiment results, beginning with selecting the degree for the Chebyshev preconditioner, followed by the performance comparison between the Matlab implementation and our GPU implementation. Finally, further performance improvement is discussed. Since Matlab's default floating point precision is double precision, our GPU implementations are all based on double precision floating point numbers for a fair comparison.

Note, throughout the

Discussion

Our work discusses the GPU-based implementation of an iterative solver, the conjugate gradient solver, and a polynomial preconditioner, the Chebyshev preconditioner. Because of their potential in parallelism and scalability, iterative linear solvers have been adapted to power system applications [3], [12], [21]. The preconditioner plays an important role in the iterative solver. Previously, preconditioners like ILU were widely used to precondition the matrix. However, they suffer from a tight data dependency

Conclusion

Power system applications such as power system optimization, control and analysis require intensive computational capability [22]. Solving sparse linear systems is a critical computational element in these applications. Our work presents a GPU-based Chebyshev preconditioner and integrates it with the iterative conjugate gradient solver for a complete iterative solution chain. Our implementation uses native functions from the CUSPARSE and CUBLAS libraries, which are already optimized. Implementations based
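
To illustrate the kind of solver loop such library calls support, the following is a minimal sketch of an unpreconditioned conjugate gradient iteration in CUDA, using CUBLAS for the vector operations and a hand-written CSR sparse matrix–vector product. It is an illustrative example under our own assumptions (tiny hard-coded test matrix, no preconditioner), not the paper's actual implementation, which additionally applies the Chebyshev preconditioner and CUSPARSE routines.

```cuda
#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <cstdio>

// y = A*x for a CSR matrix, one thread per row
__global__ void spmv_csr(int n, const int *rowPtr, const int *colIdx,
                         const double *val, const double *x, double *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n) {
        double sum = 0.0;
        for (int j = rowPtr[row]; j < rowPtr[row + 1]; ++j)
            sum += val[j] * x[colIdx[j]];
        y[row] = sum;
    }
}

int main() {
    // Tiny SPD test matrix in CSR form: [[4,1,0],[1,3,1],[0,1,2]], right-hand side b
    const int n = 3;
    int hRow[] = {0, 2, 5, 7}, hCol[] = {0, 1, 0, 1, 2, 1, 2};
    double hVal[] = {4, 1, 1, 3, 1, 1, 2}, hB[] = {1, 2, 3};

    int *dRow, *dCol;  double *dVal, *dX, *dR, *dP, *dAp;
    cudaMalloc(&dRow, sizeof(hRow));  cudaMalloc(&dCol, sizeof(hCol));
    cudaMalloc(&dVal, sizeof(hVal));
    cudaMalloc(&dX, n * sizeof(double));  cudaMalloc(&dR, n * sizeof(double));
    cudaMalloc(&dP, n * sizeof(double));  cudaMalloc(&dAp, n * sizeof(double));
    cudaMemcpy(dRow, hRow, sizeof(hRow), cudaMemcpyHostToDevice);
    cudaMemcpy(dCol, hCol, sizeof(hCol), cudaMemcpyHostToDevice);
    cudaMemcpy(dVal, hVal, sizeof(hVal), cudaMemcpyHostToDevice);
    cudaMemset(dX, 0, n * sizeof(double));                           // x0 = 0
    cudaMemcpy(dR, hB, n * sizeof(double), cudaMemcpyHostToDevice);  // r0 = b
    cudaMemcpy(dP, hB, n * sizeof(double), cudaMemcpyHostToDevice);  // p0 = r0

    cublasHandle_t h;  cublasCreate(&h);
    double rr, rrNew, pAp;
    cublasDdot(h, n, dR, 1, dR, 1, &rr);

    for (int it = 0; it < 100 && rr > 1e-12; ++it) {
        spmv_csr<<<1, 256>>>(n, dRow, dCol, dVal, dP, dAp);          // Ap = A*p
        cublasDdot(h, n, dP, 1, dAp, 1, &pAp);
        double alpha = rr / pAp, nalpha = -alpha;
        cublasDaxpy(h, n, &alpha, dP, 1, dX, 1);                     // x += alpha*p
        cublasDaxpy(h, n, &nalpha, dAp, 1, dR, 1);                   // r -= alpha*Ap
        cublasDdot(h, n, dR, 1, dR, 1, &rrNew);
        double beta = rrNew / rr, one = 1.0;
        cublasDscal(h, n, &beta, dP, 1);                             // p = r + beta*p
        cublasDaxpy(h, n, &one, dR, 1, dP, 1);
        rr = rrNew;
    }

    double hX[3];
    cudaMemcpy(hX, dX, n * sizeof(double), cudaMemcpyDeviceToHost);
    printf("x = [%f, %f, %f]\n", hX[0], hX[1], hX[2]);

    cublasDestroy(h);
    cudaFree(dRow); cudaFree(dCol); cudaFree(dVal);
    cudaFree(dX); cudaFree(dR); cudaFree(dP); cudaFree(dAp);
    return 0;
}
```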

Acknowledgement

This work was supported in part by US NSF grant ECCS-1128381. This work also made use of the Shared Facilities and the Industry Partnership Program supported by CURENT, an Engineering Research Center (ERC) Program of NSF and DOE under NSF grant EEC-1041877.


References (22)

  • A.A. Kamiabad, Implementing a Preconditioned Iterative Linear Solver Using Massively Parallel Graphics Processing Units, 2011.

Xue Li received her BS degree from Northwestern Polytechnical University China in 2007 and MS degree in 2011 from University of Tennessee. She is presently pursuing her Ph.D. degree in the Department of EECS, The University of Tennessee, Knoxville, TN, 37922, USA. Her research interest is computational methods for power system analysis.

Fangxing Li, also known as Fran Li, received his Ph.D. degree from Virginia Tech in 2001. Presently, he is an Associate Professor at The University of Tennessee, Knoxville, TN, USA. Dr. Li is a registered Professional Engineer (P.E.) in North Carolina, and a Fellow of IET (formerly IEE).
