2015 | OriginalPaper | Book chapter

Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM

Authors: Jian Lin, Khaled Hamidouche, Jie Zhang, Xiaoyi Lu, Abhinav Vishnu, Dhabaleswar Panda

Published in: OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies

Publisher: Springer International Publishing

Abstract

Machine learning algorithms benefit from the continuous improvement of programming models, including MPI, MapReduce and PGAS. The k-Nearest Neighbors (k-NN) algorithm is a widely used machine learning algorithm, applied to supervised learning tasks such as classification. Several parallel implementations of k-NN have been proposed in the literature and in practice. However, on high-performance computing systems with high-speed interconnects, it is important to further accelerate existing k-NN designs by taking advantage of scalable programming models. To improve the performance of k-NN in large-scale environments with InfiniBand networks, this paper proposes several alternative hybrid MPI+OpenSHMEM designs and performs a systematic evaluation and analysis on typical workloads. The hybrid designs leverage one-sided memory access to overlap communication with computation better than the existing pure MPI design, and they propose better schemes for efficient buffer management. The implementation, based on the k-NN program from the MaTEx toolkit with MVAPICH2-X (Unified MPI+PGAS Communication Runtime over InfiniBand), shows up to 9.0% time reduction for training the KDD Cup 2010 workload over 512 cores, and 27.6% time reduction for a small workload with balanced communication and computation. Experiments with varying numbers of cores show that our design maintains good scalability.
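For readers unfamiliar with how k-NN classification is split across processes, the following is a minimal single-process sketch, not the chapter's MPI+OpenSHMEM implementation. It assumes the training data is divided into partitions (standing in for the data held by each process), that each partition produces its own k nearest candidates, and that the candidates are then merged and put to a majority vote. The names `local_top_k`, `knn_classify` and the toy `partitions` data are hypothetical and chosen only for illustration.

```python
import heapq
import math
from collections import Counter


def local_top_k(partition, query, k):
    """Return the k nearest (distance, label) pairs within one partition."""
    scored = ((math.dist(point, query), label) for point, label in partition)
    return heapq.nsmallest(k, scored)


def knn_classify(partitions, query, k):
    """Merge per-partition candidates and return the majority label."""
    candidates = []
    for partition in partitions:
        # In a distributed setting this step would run on each process;
        # here the partitions are simply visited one after another.
        candidates.extend(local_top_k(partition, query, k))
    nearest = heapq.nsmallest(k, candidates)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (assumed for illustration): two partitions of labelled 2-D points.
partitions = [
    [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((5.0, 5.0), "b")],
    [((0.2, 0.1), "a"), ((5.1, 4.9), "b"), ((4.8, 5.2), "b")],
]

if __name__ == "__main__":
    print(knn_classify(partitions, (0.05, 0.05), k=3))
    print(knn_classify(partitions, (5.0, 5.1), k=3))
```

Merging the per-partition candidates yields the same k neighbours as a search over the full data set, because any globally nearest point is also among the k nearest within its own partition. In a parallel design, the exchange of those candidates is the communication step that can be overlapped with distance computation.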

Title: Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM
Authors: Jian Lin, Khaled Hamidouche, Jie Zhang, Xiaoyi Lu, Abhinav Vishnu, Dhabaleswar Panda
Publisher: Springer International Publishing
Book: OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies
Print ISBN: 978-3-319-26427-1
Electronic ISBN: 978-3-319-26428-8
Copyright year: 2015
DOI: https://doi.org/10.1007/978-3-319-26428-8_11
