Skip to main content
Top

2020 | OriginalPaper | Chapter

16. Parallel Iterative Solution of Large Sparse Linear Equation Systems on the Intel MIC Architecture

Authors : Hana Alyahya, Rashid Mehmood, Iyad Katib

Published in: Smart Infrastructure and Applications

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Many important scientific, engineering, and smart city applications require solving large sparse linear equation systems. The numerical methods for solving linear equations can be categorised into direct methods and iterative methods. Jacobi method is one of the iterative solvers that has been widely used due to its simplicity and efficiency. Its performance is affected by factors including the storage format, the specific computational algorithm, and its implementation. While the performance of Jacobi has been studied extensively on conventional CPU architectures, research on its performance on emerging architectures, such as the Intel Many Integrated Core (MIC) architecture, is still in its infancy. In this chapter, we investigate the performance of parallel implementations of the Jacobi method on Knights Corner (KNC), the first generation of the Intel MIC architectures. We implement Jacobi with two storage formats, Compressed Sparse Row (CSR) and Modified Sparse Row (MSR), and measure their performance in terms of execution time, offloading time, and speedup. We report results of sparse matrices with over 28 million rows and 640 million non-zero elements acquired from 13 diverse application domains. The experimental results show that our Jacobi parallel implementation on MIC achieves speedups of up to 27.75× compared to the sequential implementation. It also delivers a speedup of up to 3.81× compared to a powerful node comprising 24 cores in two Intel Xeon E5-2695v2 processors.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Mehmood, R., Alturki, R., Zeadally, S.: Multimedia applications over metropolitan area networks (MANs). J. Netw. Comput. Appl. 34, 1518–1529 (2011)CrossRef Mehmood, R., Alturki, R., Zeadally, S.: Multimedia applications over metropolitan area networks (MANs). J. Netw. Comput. Appl. 34, 1518–1529 (2011)CrossRef
2.
go back to reference Mehmood, R., Meriton, R., Graham, G., Hennelly, P., Kumar, M.: Exploring the influence of big data on city transport operations: a Markovian approach. Int. J. Oper. Prod. Manag. 37, 75–104 (2017)CrossRef Mehmood, R., Meriton, R., Graham, G., Hennelly, P., Kumar, M.: Exploring the influence of big data on city transport operations: a Markovian approach. Int. J. Oper. Prod. Manag. 37, 75–104 (2017)CrossRef
3.
go back to reference Mehmood, R., Graham, G.: Big data logistics: a health-care transport capacity sharing model. Proc. Comput. Sci. 64, 1107–1114 (2015)CrossRef Mehmood, R., Graham, G.: Big data logistics: a health-care transport capacity sharing model. Proc. Comput. Sci. 64, 1107–1114 (2015)CrossRef
4.
go back to reference Mehmood, R., Lu, J.A.: Computational Markovian analysis of large systems. J. Manuf. Technol. Manag. 22, 804–817 (2011)CrossRef Mehmood, R., Lu, J.A.: Computational Markovian analysis of large systems. J. Manuf. Technol. Manag. 22, 804–817 (2011)CrossRef
5.
go back to reference Altowaijri, S., Mehmood, R., Williams, J.: A quantitative model of grid systems performance in healthcare organisations. In: 2010 Int. Conf. Intell. Syst. Model. Simul., pp. 431–436 (2010) Altowaijri, S., Mehmood, R., Williams, J.: A quantitative model of grid systems performance in healthcare organisations. In: 2010 Int. Conf. Intell. Syst. Model. Simul., pp. 431–436 (2010)
6.
go back to reference Saad, Y.: Iterative methods for sparse linear systems. Society for Industrial and Applied Mathematics (2003) Saad, Y.: Iterative methods for sparse linear systems. Society for Industrial and Applied Mathematics (2003)
7.
go back to reference Golub, G.H., Van Loan, C.F.: Matrix Computations (2013) Golub, G.H., Van Loan, C.F.: Matrix Computations (2013)
8.
go back to reference Ford, W.: Chapter 20: Basic iterative methods. In: Ford, W. (ed.) Numerical Linear Algebra with Applications, pp. 469–490. Academic, Boston (2015) Ford, W.: Chapter 20: Basic iterative methods. In: Ford, W. (ed.) Numerical Linear Algebra with Applications, pp. 469–490. Academic, Boston (2015)
9.
go back to reference Mehmood, R.: Disk-based techniques for efficient solution of large Markov chains. PhD Thesis, School of Computer Science, University of Birmingham (2004) Mehmood, R.: Disk-based techniques for efficient solution of large Markov chains. PhD Thesis, School of Computer Science, University of Birmingham (2004)
10.
go back to reference Kwiatkowska, M., Mehmood, R., Norman, G., Parker, D.: A symbolic out-of-core solution method for Markov models. Electron. Notes Theor. Comput. Sci. 68, 589–604 (2002)CrossRef Kwiatkowska, M., Mehmood, R., Norman, G., Parker, D.: A symbolic out-of-core solution method for Markov models. Electron. Notes Theor. Comput. Sci. 68, 589–604 (2002)CrossRef
11.
go back to reference Kwiatkowska, M., Parker, D., Yi Zhang, Y., Mehmood, R.: Dual-processor parallelisation of symbolic probabilistic model checking. In: The IEEE Computer Society’s 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, 2004. (MASCOTS 2004). Proceedings, pp. 123–130. IEEE (2004) Kwiatkowska, M., Parker, D., Yi Zhang, Y., Mehmood, R.: Dual-processor parallelisation of symbolic probabilistic model checking. In: The IEEE Computer Society’s 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, 2004. (MASCOTS 2004). Proceedings, pp. 123–130. IEEE (2004)
12.
go back to reference Mehmood, R., Crowcroft, J., Elmirghani, J.M.H.: A parallel implicit method for the steady-state solution of CTMCs. In: 14th IEEE International Symposium on Modeling, Analysis, and Simulation, pp. 293–302. IEEE (2006) Mehmood, R., Crowcroft, J., Elmirghani, J.M.H.: A parallel implicit method for the steady-state solution of CTMCs. In: 14th IEEE International Symposium on Modeling, Analysis, and Simulation, pp. 293–302. IEEE (2006)
13.
go back to reference Mehmood, R.: A survey of out-of-core analysis techniques in stochastic modelling. Technical Report CSR-03-7, School of Computer Science, University of Birmingham, Birningham (2003) Mehmood, R.: A survey of out-of-core analysis techniques in stochastic modelling. Technical Report CSR-03-7, School of Computer Science, University of Birmingham, Birningham (2003)
14.
go back to reference Mehmood, R., Parker, D., Kwiatkowska, M.: An efficient symbolic out-of-core solution method for markov models. Technical Report CSR-03-08, School of Computer Science, University of Birmingham, Birmingham (2003) Mehmood, R., Parker, D., Kwiatkowska, M.: An efficient symbolic out-of-core solution method for markov models. Technical Report CSR-03-08, School of Computer Science, University of Birmingham, Birmingham (2003)
15.
go back to reference Saad, Y., Van Der Vost, H.A.: Iterative solution of linear systems in the 20th century. J. Comput. Appl. Math. 123, 1–33 (2000)MathSciNetCrossRef Saad, Y., Van Der Vost, H.A.: Iterative solution of linear systems in the 20th century. J. Comput. Appl. Math. 123, 1–33 (2000)MathSciNetCrossRef
16.
go back to reference Banu, S., Vaideeswaran, D.: Performance Analysis on Parallel Sparse Matrix Vector Multiplication Micro-Benchmark Using Dynamic Instrumentation Pintool. Presented at the 2013, pp. 129–136 (2013) Banu, S., Vaideeswaran, D.: Performance Analysis on Parallel Sparse Matrix Vector Multiplication Micro-Benchmark Using Dynamic Instrumentation Pintool. Presented at the 2013, pp. 129–136 (2013)
17.
go back to reference Kwiatkowska, M., Mehmood, R.: Out-of-core solution of large linear systems of equations arising from Stochastic modelling. In: Hermanns, H., Segala, R. (eds.) Process Algebra and Probabilistic Methods: Performance Modeling and Verification. PAPM-PROBMIV, pp. 135–151. Springer, Berlin, Heidelberg (2002)CrossRef Kwiatkowska, M., Mehmood, R.: Out-of-core solution of large linear systems of equations arising from Stochastic modelling. In: Hermanns, H., Segala, R. (eds.) Process Algebra and Probabilistic Methods: Performance Modeling and Verification. PAPM-PROBMIV, pp. 135–151. Springer, Berlin, Heidelberg (2002)CrossRef
18.
go back to reference Mehmood, R.: Serial disk-based analysis of large stochastic models. In: Baier, C., Haverkort, B.R., Hermanns, H., Katoen, J.-P., Siegle, M. (eds.) Validation of Stochastic Systems: A Guide to Current Research, pp. 230–255. Springer, Berlin, Heidelberg (2004)CrossRef Mehmood, R.: Serial disk-based analysis of large stochastic models. In: Baier, C., Haverkort, B.R., Hermanns, H., Katoen, J.-P., Siegle, M. (eds.) Validation of Stochastic Systems: A Guide to Current Research, pp. 230–255. Springer, Berlin, Heidelberg (2004)CrossRef
19.
go back to reference Mehmood, R., Crowcroft, J.: Parallel iterative solution method for large sparse linear equation systems. Technical Report Number UCAM-CL-TR-650, Computer Laboratory, University of Cambridge, Cambridge (2005) Mehmood, R., Crowcroft, J.: Parallel iterative solution method for large sparse linear equation systems. Technical Report Number UCAM-CL-TR-650, Computer Laboratory, University of Cambridge, Cambridge (2005)
20.
go back to reference Giles, M.B., Reguly, I.: Trends in high-performance computing for engineering calculations. Philos. Trans. R. Soc. A. 372, 20130319 (2014)CrossRef Giles, M.B., Reguly, I.: Trends in high-performance computing for engineering calculations. Philos. Trans. R. Soc. A. 372, 20130319 (2014)CrossRef
21.
go back to reference Sodani, A., Gramunt, R., Corbal, J., Kim, H.S., Vinod, K., Chinthamani, S., Hutsell, S., Agarwal, R., Liu, Y.C.: Knights landing: second-generation Intel Xeon Phi Product. IEEE Micro. 36, 34–46 (2016)CrossRef Sodani, A., Gramunt, R., Corbal, J., Kim, H.S., Vinod, K., Chinthamani, S., Hutsell, S., Agarwal, R., Liu, Y.C.: Knights landing: second-generation Intel Xeon Phi Product. IEEE Micro. 36, 34–46 (2016)CrossRef
22.
go back to reference Eleliemy, A., Fayze, M., Mehmood, R., Katib, I., Aljohani, N.: Loadbalancing on parallel heterogeneous architectures: spin-image algorithm on CPU and MIC. In: EUROSIM 2016, The 9th Eurosim Congress on Modelling and Simulation. p. 6. Oulu (2016) Eleliemy, A., Fayze, M., Mehmood, R., Katib, I., Aljohani, N.: Loadbalancing on parallel heterogeneous architectures: spin-image algorithm on CPU and MIC. In: EUROSIM 2016, The 9th Eurosim Congress on Modelling and Simulation. p. 6. Oulu (2016)
23.
go back to reference Alyahya, H., Mehmood, R., Katib, I.: Parallel sparse matrix vector multiplication on intel MIC: performance analysis. In: Smart Societies, Infrastructure, Technologies and Applications, SCITA 2017. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, vol. 224, pp. 306–322. Springer, Cham (2018) Alyahya, H., Mehmood, R., Katib, I.: Parallel sparse matrix vector multiplication on intel MIC: performance analysis. In: Smart Societies, Infrastructure, Technologies and Applications, SCITA 2017. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, vol. 224, pp. 306–322. Springer, Cham (2018)
24.
go back to reference Björck, Å.: Numerical Methods in Matrix Computations. Springer International Publishing, Cham (2015)CrossRef Björck, Å.: Numerical Methods in Matrix Computations. Springer International Publishing, Cham (2015)CrossRef
25.
go back to reference Gander, W., Gander, M.J., Kwok, F.: Scientific Computing - An Introduction Using Maple and MATLAB. Springer Publishing Company, Incorporated (2014)CrossRef Gander, W., Gander, M.J., Kwok, F.: Scientific Computing - An Introduction Using Maple and MATLAB. Springer Publishing Company, Incorporated (2014)CrossRef
26.
go back to reference Koza, Z., Matyka, M., Mirosław, Ł., Poła, J.: Sparse matrix-vector product. In: Kindratenko, V. (ed.) Numerical Computations with GPUs, pp. 103–121. Springer International Publishing, Cham (2014) Koza, Z., Matyka, M., Mirosław, Ł., Poła, J.: Sparse matrix-vector product. In: Kindratenko, V. (ed.) Numerical Computations with GPUs, pp. 103–121. Springer International Publishing, Cham (2014)
27.
go back to reference Akhunov, R.R., Kuksenko, S.P., Salov, V.K., Gazizov, T.R.: Sparse matrix storage formats and acceleration of iterative solution of linear algebraic systems with dense matrices. J. Math. Sci. (United States). 191, 10–18 (2013)MATH Akhunov, R.R., Kuksenko, S.P., Salov, V.K., Gazizov, T.R.: Sparse matrix storage formats and acceleration of iterative solution of linear algebraic systems with dense matrices. J. Math. Sci. (United States). 191, 10–18 (2013)MATH
28.
go back to reference Cramer, T., Schmidl, D., Klemm, M., Mey, D.: OpenMP Programming on Intel Xeon Phi Coprocessors: An Early Performance Comparison, pp. 38–44. Marc@Rwth (2012) Cramer, T., Schmidl, D., Klemm, M., Mey, D.: OpenMP Programming on Intel Xeon Phi Coprocessors: An Early Performance Comparison, pp. 38–44. Marc@Rwth (2012)
29.
go back to reference Wang, E., Zhang, Q., Shen, B., Zhang, G., Lu, X., Wu, Q., Wang, Y.: High-Performance Computing on the Intel® Xeon Phi™. Springer International Publishing, Cham (2014)CrossRef Wang, E., Zhang, Q., Shen, B., Zhang, G., Lu, X., Wu, Q., Wang, Y.: High-Performance Computing on the Intel® Xeon Phi™. Springer International Publishing, Cham (2014)CrossRef
30.
go back to reference Maeda, H., Takahashi, D.: Performance evaluation of sparse matrix-vector multiplication using GPU/MIC cluster. In: 2015 Third International Symposium on Computing and Networking. pp. 396–399 (2015) Maeda, H., Takahashi, D.: Performance evaluation of sparse matrix-vector multiplication using GPU/MIC cluster. In: 2015 Third International Symposium on Computing and Networking. pp. 396–399 (2015)
31.
go back to reference Mehmood, R., Parker, D., Kwiatkowska, M.: An efficient BDD-based implementation of Gauss-Seidel for CTMC analysis. Technical Report CSR-03-13, School of Computer Science, University of Birmingham, Birmingham (2013) Mehmood, R., Parker, D., Kwiatkowska, M.: An efficient BDD-based implementation of Gauss-Seidel for CTMC analysis. Technical Report CSR-03-13, School of Computer Science, University of Birmingham, Birmingham (2013)
32.
go back to reference Tang, Z., Huang, H., Jiang, H., Li, B.: MIC-based preconditioned conjugate gradient method for solving large sparse linear equations. In: Hung, J., Yen, N., Li, K.C. (eds.) Frontier Computing. Lecture Notes in Electrical Engineering, vol. 375. Springer, Singapore (2016) Tang, Z., Huang, H., Jiang, H., Li, B.: MIC-based preconditioned conjugate gradient method for solving large sparse linear equations. In: Hung, J., Yen, N., Li, K.C. (eds.) Frontier Computing. Lecture Notes in Electrical Engineering, vol. 375. Springer, Singapore (2016)
33.
go back to reference Li, Z., Donde, V.D., Tournier, J.-C., Yang, F.: On limitations of traditional multi-core and potential of many-core processing architectures for sparse linear solvers used in large-scale power system applications. In: 2011 IEEE Power and Energy Society General Meeting, pp. 1–8. IEEE (2011) Li, Z., Donde, V.D., Tournier, J.-C., Yang, F.: On limitations of traditional multi-core and potential of many-core processing architectures for sparse linear solvers used in large-scale power system applications. In: 2011 IEEE Power and Energy Society General Meeting, pp. 1–8. IEEE (2011)
34.
go back to reference Yan, D., Cao, H., Dong, X., Zhang, B., Zhang, X.: Optimizing algorithm of sparse linear systems on GPU. In: 2011 Sixth Annu. Chinagrid Conf. pp. 174–179 (2011) Yan, D., Cao, H., Dong, X., Zhang, B., Zhang, X.: Optimizing algorithm of sparse linear systems on GPU. In: 2011 Sixth Annu. Chinagrid Conf. pp. 174–179 (2011)
35.
go back to reference Ye, F., Calvin, C., Petiton, S.G.: A study of SpMV implementation using MPI and OpenMP on Intel many-Core architecture. In: High Performance Computing for Computational Science—VECPAR 2014: 11th International Conference, Eugene, OR, USA, June 30–July 3, 2014, Revised Selected Papers, pp. 43–56. Springer International Publishing, Cham (2015) Ye, F., Calvin, C., Petiton, S.G.: A study of SpMV implementation using MPI and OpenMP on Intel many-Core architecture. In: High Performance Computing for Computational Science—VECPAR 2014: 11th International Conference, Eugene, OR, USA, June 30–July 3, 2014, Revised Selected Papers, pp. 43–56. Springer International Publishing, Cham (2015)
36.
go back to reference Saule, E., Kaya, K., Catalyurek, U.V.: Performance evaluation of sparse matrix multiplication kernels on Intel Xeon Phi, ArXiv, Tech. Rep. arXiv:1302.1078, Feb (2013) Saule, E., Kaya, K., Catalyurek, U.V.: Performance evaluation of sparse matrix multiplication kernels on Intel Xeon Phi, ArXiv, Tech. Rep. arXiv:1302.1078, Feb (2013)
37.
go back to reference Maeda, H., Takahashi, D.: Parallel sparse matrix-vector multiplication using accelerators. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Torre, C.M., Taniar, D., Apduhan, B.O., Stankova, E., Wang, S. (eds.) Computational Science and Its Applications—ICCSA 2016: 16th International Conference, Beijing, China, July 4–7, 2016, Proceedings, Part II, pp. 3–18. Springer International Publishing, Cham (2016)CrossRef Maeda, H., Takahashi, D.: Parallel sparse matrix-vector multiplication using accelerators. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Torre, C.M., Taniar, D., Apduhan, B.O., Stankova, E., Wang, S. (eds.) Computational Science and Its Applications—ICCSA 2016: 16th International Conference, Beijing, China, July 4–7, 2016, Proceedings, Part II, pp. 3–18. Springer International Publishing, Cham (2016)CrossRef
38.
go back to reference Ahamed, A.-K.C., Magoules, F.: Iterative methods for sparse linear systems on graphics processing unit. In: 2012 IEEE 14th Int. Conf. High Perform. Comput. Commun. 2012 IEEE 9th Int. Conf. Embed. Softw. Syst. pp. 836–842 (2012) Ahamed, A.-K.C., Magoules, F.: Iterative methods for sparse linear systems on graphics processing unit. In: 2012 IEEE 14th Int. Conf. High Perform. Comput. Commun. 2012 IEEE 9th Int. Conf. Embed. Softw. Syst. pp. 836–842 (2012)
39.
go back to reference Estebanez, A., Llanos, D.R., Gonzalez-Escribano, A.: Using the Xeon Phi platform to run speculatively-parallelized codes. Int. J. Parallel Prog. 45, 225–241 (2017)CrossRef Estebanez, A., Llanos, D.R., Gonzalez-Escribano, A.: Using the Xeon Phi platform to run speculatively-parallelized codes. Int. J. Parallel Prog. 45, 225–241 (2017)CrossRef
40.
go back to reference Dongarra, J., Gates, M., Haidar, A., Jia, Y., Kabir, K., Luszczek, P., Tomov, S.: HPC programming on Intel many-integrated-Core hardware with MAGMA port to Xeon phi. Sci. Program. 2015, 1–11 (2015) Dongarra, J., Gates, M., Haidar, A., Jia, Y., Kabir, K., Luszczek, P., Tomov, S.: HPC programming on Intel many-integrated-Core hardware with MAGMA port to Xeon phi. Sci. Program. 2015, 1–11 (2015)
41.
go back to reference Chen, C., Yang, C., Tang, T., Wu, Q., Zhang, P.: OpenACC to intel offload: automatic translation and optimization. In: Communications in Computer and Information Science. pp. 111–120 (2013) Chen, C., Yang, C., Tang, T., Wu, Q., Zhang, P.: OpenACC to intel offload: automatic translation and optimization. In: Communications in Computer and Information Science. pp. 111–120 (2013)
42.
go back to reference Davis, T.A., Hu, Y.: The University of Florida sparse matrix collection. ACM Trans. Math. Softw. 38(1), 1–1:25 (2011)MathSciNetMATH Davis, T.A., Hu, Y.: The University of Florida sparse matrix collection. ACM Trans. Math. Softw. 38(1), 1–1:25 (2011)MathSciNetMATH
Metadata
Title
Parallel Iterative Solution of Large Sparse Linear Equation Systems on the Intel MIC Architecture
Authors
Hana Alyahya
Rashid Mehmood
Iyad Katib
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-13705-2_16