DOI: 10.1145/3456669.3456692

Performance-Portable Distributed k-Nearest Neighbors using Locality-Sensitive Hashing and SYCL

Published: 27 April 2021

ABSTRACT

In the age of data collection, machine learning algorithms have to process vast data sets efficiently. This requires scalable algorithms and efficient implementations that can exploit heterogeneous hardware. We propose a new, performance-portable implementation of a well-known, robust, and versatile multi-class classification method that supports multiple Graphics Processing Units (GPUs) from different vendors. It is based on a performance-portable implementation of the approximate k-nearest neighbors (k-NN) algorithm in SYCL. The k-NN algorithm assigns a class to a data point based on a majority vote of its neighborhood. The naive approach compares a data point x to all other data points in the training data to identify the k nearest ones. However, this exhaustive search has quadratic runtime and is infeasible for large data sets. Therefore, approximate variants have been developed. One such variant is the Locality-Sensitive Hashing (LSH) algorithm, which uses hash tables together with locality-sensitive hash functions to reduce the number of data points that have to be examined to compute the k-NN.
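As a concrete illustration of the LSH idea sketched above, the following minimal C++17 example builds a single hash table from a random-projection hash of the form floor((a * x + b) / w), one common family of locality-sensitive hash functions. It is a generic sketch under our own assumptions, not code from the paper's library; the struct name RandomProjectionHash, the bucket width w, and the offset b are illustrative choices.

// Illustrative C++17 sketch of random-projection LSH (not the paper's code):
// nearby points hash to the same bucket with high probability, so a query
// only needs to be compared against the candidates in its own bucket.
#include <cmath>
#include <cstddef>
#include <random>
#include <unordered_map>
#include <vector>

struct RandomProjectionHash {
    std::vector<float> a;  // random projection direction (one entry per dimension)
    float b;               // random offset in [0, w)
    float w;               // bucket width, a tuning parameter

    long operator()(const std::vector<float>& x) const {
        float dot = 0.0f;
        for (std::size_t d = 0; d < x.size(); ++d) dot += a[d] * x[d];
        return static_cast<long>(std::floor((dot + b) / w));
    }
};

int main() {
    const std::size_t dim = 8;
    std::mt19937 gen(42);
    std::normal_distribution<float> normal(0.0f, 1.0f);

    RandomProjectionHash h;
    h.w = 4.0f;
    h.b = std::uniform_real_distribution<float>(0.0f, h.w)(gen);
    for (std::size_t d = 0; d < dim; ++d) h.a.push_back(normal(gen));

    // One hash table: bucket id -> indices of the training points in that bucket.
    std::vector<std::vector<float>> training_data;  // filled with dim-dimensional points
    std::unordered_map<long, std::vector<std::size_t>> table;
    for (std::size_t i = 0; i < training_data.size(); ++i) {
        table[h(training_data[i])].push_back(i);
    }
    // For a query point q, the exact k-NN search is then restricted to the
    // candidates in table[h(q)] instead of the whole training set.
}

In practice, several such hash functions and hash tables are combined so that the union of the matching buckets contains the true nearest neighbors with high probability.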

To the best of our knowledge, no distributed LSH version supporting multiple GPUs from different vendors has been available so far, despite the fact that k-NN classifiers are frequently employed. Therefore, we have developed a library that provides the first hardware-independent, yet efficient and distributed implementation of the LSH algorithm suited for modern supercomputers. The implementation uses C++17 together with SYCL 1.2.1, an abstraction layer for OpenCL that allows targeting different hardware with a single implementation. To support large data sets, we distribute the work across multiple GPUs using the Message Passing Interface (MPI), enabling the use of both shared and distributed memory systems.
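To illustrate how MPI and SYCL 1.2.1 can be combined in the way described above, the following sketch starts one MPI rank per GPU and offloads a placeholder computation through a SYCL queue; partial k-NN results would then be exchanged between ranks with MPI collectives. This is a generic example under our own assumptions (the rank-to-device mapping, the kernel name squared_values, and the dummy computation are ours), not the library's actual API.

// Generic MPI + SYCL 1.2.1 sketch: one MPI rank per GPU, SYCL for device code,
// MPI for exchanging partial results between ranks (illustrative only).
#include <CL/sycl.hpp>
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Select a GPU for this rank; a real setup would map each rank to a local device.
    cl::sycl::queue queue{cl::sycl::gpu_selector{}};

    std::vector<float> points(1024, static_cast<float>(rank));
    std::vector<float> result(points.size(), 0.0f);
    {
        cl::sycl::buffer<float, 1> p_buf(points.data(), cl::sycl::range<1>(points.size()));
        cl::sycl::buffer<float, 1> r_buf(result.data(), cl::sycl::range<1>(result.size()));
        queue.submit([&](cl::sycl::handler& cgh) {
            auto p = p_buf.get_access<cl::sycl::access::mode::read>(cgh);
            auto r = r_buf.get_access<cl::sycl::access::mode::write>(cgh);
            cgh.parallel_for<class squared_values>(
                cl::sycl::range<1>(points.size()),
                // placeholder for the actual distance computation
                [=](cl::sycl::id<1> i) { r[i] = p[i] * p[i]; });
        });
    }  // buffer destruction copies the results back to the host

    // At this point, the per-rank k-NN candidates would be merged across ranks,
    // e.g. with MPI_Allgather or a ring exchange of data chunks.
    MPI_Finalize();
    return 0;
}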

We have tested and compared different parameter combinations for two locality-sensitive hash function implementations. Our results show that our library can easily scale on multiple GPUs using both hash function types, achieving a nearly optimal parallel speedup of up to 7.6 on 8 GPUs. Furthermore, we demonstrate that the library supports different SYCL implementations—ComputeCpp, hipSYCL, and DPC++—to target different hardware architectures without significant performance differences.


Published in

IWOCL '21: Proceedings of the 9th International Workshop on OpenCL
April 2021, 112 pages
ISBN: 9781450390330
DOI: 10.1145/3456669

Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 27 April 2021


Qualifiers

• Research article
• Refereed limited

Acceptance Rates

Overall acceptance rate: 84 of 152 submissions, 55%
