Skip to main content

2016 | OriginalPaper | Buchkapitel

Similarity Search of Sparse Histograms on GPU Architecture

verfasst von : Hasmik Osipyan, Jakub Lokoč, Stéphane Marchand-Maillet

Erschienen in: Similarity Search and Applications

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Searching for similar objects within large-scale database is a hard problem due to the exponential increase of multimedia data. The time required to find the nearest objects to the specific query in a high-dimensional space has become a serious constraint of the searching algorithms. One of the possible solution for this problem is utilization of massively parallel platforms such as GPU architectures. This solution becomes very sensitive for the applications working with sparse dataset. The performance of the algorithm can be totally changed depending on the different sparsity settings of the input data. In this paper, we study four different approaches on the GPU architecture for finding the similar histograms to the given queries. The performance and efficiency of observed methods were studied on sparse dataset of half a million histograms. We summarize our empirical results and point out the optimal GPU strategy for sparse histograms with different sparsity settings.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Amato, G., Savino, P.: Approximate similarity search in metric spaces using inverted files. In: Proceedings of the 3rd International Conference on Scalable Information Systems, pp. 28:1–28:10 (2008) Amato, G., Savino, P.: Approximate similarity search in metric spaces using inverted files. In: Proceedings of the 3rd International Conference on Scalable Information Systems, pp. 28:1–28:10 (2008)
2.
Zurück zum Zitat Ashari, A., Sedaghati, N., Eisenlohr, J., Sadayappan, P.: An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs. In: ICS 2014, Muenchen, Germany, 10–13 June 2014, pp. 273–282 (2014) Ashari, A., Sedaghati, N., Eisenlohr, J., Sadayappan, P.: An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs. In: ICS 2014, Muenchen, Germany, 10–13 June 2014, pp. 273–282 (2014)
3.
Zurück zum Zitat Chang, D., Jones, N.A., Li, D., Ouyang, M., Ragade, R.K.: Compute pairwise Euclidean distances of data points with GPUs. In: Proceedings of the IASTED International Symposium on CBB, Florida, USA, 16–18 November 2008, pp. 278–283 (2008) Chang, D., Jones, N.A., Li, D., Ouyang, M., Ragade, R.K.: Compute pairwise Euclidean distances of data points with GPUs. In: Proceedings of the IASTED International Symposium on CBB, Florida, USA, 16–18 November 2008, pp. 278–283 (2008)
4.
Zurück zum Zitat Cui, B., Zhao, J., Cong, G.: ISIS: a new approach for efficient similarity search in sparse databases. In: Kitagawa, H., Ishikawa, Y., Li, Q., Watanabe, C. (eds.) DASFAA 2010. LNCS, vol. 5982, pp. 231–245. Springer, Heidelberg (2010)CrossRef Cui, B., Zhao, J., Cong, G.: ISIS: a new approach for efficient similarity search in sparse databases. In: Kitagawa, H., Ishikawa, Y., Li, Q., Watanabe, C. (eds.) DASFAA 2010. LNCS, vol. 5982, pp. 231–245. Springer, Heidelberg (2010)CrossRef
5.
Zurück zum Zitat Dotsenko, Y., Govindaraju, N.K., Sloan, P.J., Boyd, C., Manferdelli, J.: Fast scan algorithms on graphics processors. In: Proceedings of the 22nd Annual ICS, Island of Kos, Greece, 7–12 June 2008, pp. 205–213 (2008) Dotsenko, Y., Govindaraju, N.K., Sloan, P.J., Boyd, C., Manferdelli, J.: Fast scan algorithms on graphics processors. In: Proceedings of the 22nd Annual ICS, Island of Kos, Greece, 7–12 June 2008, pp. 205–213 (2008)
6.
Zurück zum Zitat Fang, J., Varbanescu, A.L., Sips, H.J.: A comprehensive performance comparison of CUDA and OpenCL. In: ICPP, Taipei, Taiwan, September 2011, pp. 216–225 (2011) Fang, J., Varbanescu, A.L., Sips, H.J.: A comprehensive performance comparison of CUDA and OpenCL. In: ICPP, Taipei, Taiwan, September 2011, pp. 216–225 (2011)
7.
Zurück zum Zitat Garcia, V., Debreuve, E., Barlaud, M.: Fast k nearest neighbor search using GPU. In: IEEE Conference on CVPR, Anchorage, USA, 23–28 June 2008, pp. 1–6 (2008) Garcia, V., Debreuve, E., Barlaud, M.: Fast k nearest neighbor search using GPU. In: IEEE Conference on CVPR, Anchorage, USA, 23–28 June 2008, pp. 1–6 (2008)
8.
Zurück zum Zitat Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. In: Proceedings of the 25th International Conference on VLDB 1999, pp. 518–529. Morgan Kaufmann Publishers Inc., San Francisco (1999) Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. In: Proceedings of the 25th International Conference on VLDB 1999, pp. 518–529. Morgan Kaufmann Publishers Inc., San Francisco (1999)
9.
Zurück zum Zitat Goumas, G.I., Kourtis, K., Anastopoulos, N., Karakasis, V., Koziris, N.: Understanding the performance of sparse matrix-vector multiplication. In: 16th Euromicro International Conference on PDP, pp. 283–292 (2008) Goumas, G.I., Kourtis, K., Anastopoulos, N., Karakasis, V., Koziris, N.: Understanding the performance of sparse matrix-vector multiplication. In: 16th Euromicro International Conference on PDP, pp. 283–292 (2008)
10.
Zurück zum Zitat Karimi, K., Dickson, N.G., Hamze, F.: A performance comparison of CUDA and OpenCL. CoRR abs/1005.2581 (2010) Karimi, K., Dickson, N.G., Hamze, F.: A performance comparison of CUDA and OpenCL. CoRR abs/1005.2581 (2010)
11.
Zurück zum Zitat Khronos OpenCL Working Group: The OpenCL Specification, version 1.0.29, 8 December 2008 Khronos OpenCL Working Group: The OpenCL Specification, version 1.0.29, 8 December 2008
12.
Zurück zum Zitat Kohout, J., Pevny, T.: Unsupervised detection of malware in persistent web traffic. In: IEEE International Conference on ICASSP (2015) Kohout, J., Pevny, T.: Unsupervised detection of malware in persistent web traffic. In: IEEE International Conference on ICASSP (2015)
13.
Zurück zum Zitat Kruliš, M., Osipyan, H., Marchand-Maillet, S.: Optimizing Sorting and top-k selection steps in permutation based indexing on GPUs. In: Morzy, T., Valduriez, P., Bellatreche, L. (eds.) ADBIS 2015. CCIS, vol. 539, pp. 305–317. Springer, Heidelberg (2015)CrossRef Kruliš, M., Osipyan, H., Marchand-Maillet, S.: Optimizing Sorting and top-k selection steps in permutation based indexing on GPUs. In: Morzy, T., Valduriez, P., Bellatreche, L. (eds.) ADBIS 2015. CCIS, vol. 539, pp. 305–317. Springer, Heidelberg (2015)CrossRef
14.
Zurück zum Zitat Krulis, M., Osipyan, H., Marchand-Maillet, S.: Permutation based indexing for high dimensional data on GPU architectures. In: 13th International Workshop on CBMI, Prague, Czech Republic, 10–12 June 2015, pp. 1–6 (2015) Krulis, M., Osipyan, H., Marchand-Maillet, S.: Permutation based indexing for high dimensional data on GPU architectures. In: 13th International Workshop on CBMI, Prague, Czech Republic, 10–12 June 2015, pp. 1–6 (2015)
15.
Zurück zum Zitat Li, Q., Kecman, V., Salman, R.: A chunking method for Euclidean distance matrix calculation on large dataset using multi-GPU. In: The Ninth ICMLA, Washington, DC, USA, 12–14 December 2010, pp. 208–213 (2010) Li, Q., Kecman, V., Salman, R.: A chunking method for Euclidean distance matrix calculation on large dataset using multi-GPU. In: The Ninth ICMLA, Washington, DC, USA, 12–14 December 2010, pp. 208–213 (2010)
16.
Zurück zum Zitat Li, S., Amenta, N.: Brute-force k-nearest neighbors search on the GPU. In: Amato, G., Connor, R., Falchi, F., Gennaro, C. (eds.) SISAP 2015. LNCS, vol. 9371, pp. 259–270. Springer, Heidelberg (2015). doi:10.1007/978-3-319-25087-8_25 CrossRef Li, S., Amenta, N.: Brute-force k-nearest neighbors search on the GPU. In: Amato, G., Connor, R., Falchi, F., Gennaro, C. (eds.) SISAP 2015. LNCS, vol. 9371, pp. 259–270. Springer, Heidelberg (2015). doi:10.​1007/​978-3-319-25087-8_​25 CrossRef
17.
Zurück zum Zitat Liang, S., Liu, Y., Wang, C., Jian, L.: A cuda-based parallel implementation of k-nearest neighbor algorithm. In: Cyber-Enable Distributed Computing and Knowledge Discovery, pp. 291–296 (2010) Liang, S., Liu, Y., Wang, C., Jian, L.: A cuda-based parallel implementation of k-nearest neighbor algorithm. In: Cyber-Enable Distributed Computing and Knowledge Discovery, pp. 291–296 (2010)
18.
Zurück zum Zitat Liu, K., Bellet, A., Sha, F.: Similarity learning for high-dimensional sparse data. In: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, AISTATS, San Diego, California, USA, 9–12 May 2015 (2015) Liu, K., Bellet, A., Sha, F.: Similarity learning for high-dimensional sparse data. In: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, AISTATS, San Diego, California, USA, 9–12 May 2015 (2015)
19.
Zurück zum Zitat Liu, W., Vinter, B.: CSR5: an efficient storage format for cross-platform sparse matrix-vector multiplication. In: Proceedings of the 29th ACM on ICS 2015, Newport Beach/Irvine, CA, USA, 8–11 June 2015, pp. 339–350 (2015) Liu, W., Vinter, B.: CSR5: an efficient storage format for cross-platform sparse matrix-vector multiplication. In: Proceedings of the 29th ACM on ICS 2015, Newport Beach/Irvine, CA, USA, 8–11 June 2015, pp. 339–350 (2015)
20.
Zurück zum Zitat Matsumoto, T., Yiu, M.L.: Accelerating exact similarity search on CPU-GPU systems. In: ICDM, Atlantic City, NJ, USA, 14–17 November 2015, pp. 320–329 (2015) Matsumoto, T., Yiu, M.L.: Accelerating exact similarity search on CPU-GPU systems. In: ICDM, Atlantic City, NJ, USA, 14–17 November 2015, pp. 320–329 (2015)
21.
Zurück zum Zitat Mohamed, H., Osipyan, H., Marchand-Maillet, S.: Multi-core (CPU and GPU) for permutation-based indexing. In: Traina, A.J.M., Traina Jr., C., Cordeiro, R.L.F. (eds.) SISAP 2014. LNCS, vol. 8821, pp. 277–288. Springer, Heidelberg (2014) Mohamed, H., Osipyan, H., Marchand-Maillet, S.: Multi-core (CPU and GPU) for permutation-based indexing. In: Traina, A.J.M., Traina Jr., C., Cordeiro, R.L.F. (eds.) SISAP 2014. LNCS, vol. 8821, pp. 277–288. Springer, Heidelberg (2014)
22.
Zurück zum Zitat Neelima, B., Raghavendra, P.S.: CSPR: column only sparse matrix representation for performance improvement on GPU architecture. In: Advances in Parallel Distributed Computing, Tirunelveli, India, 23–25 September 2011, pp. 581–595 (2011) Neelima, B., Raghavendra, P.S.: CSPR: column only sparse matrix representation for performance improvement on GPU architecture. In: Advances in Parallel Distributed Computing, Tirunelveli, India, 23–25 September 2011, pp. 581–595 (2011)
23.
Zurück zum Zitat Neelima, B., Reddy, G.R.M., Raghavendra, P.S.: A GPU framework for sparse matrix vector multiplication. In: IEEE 13th International Symposium on Parallel and Distributed Computing, ISPDC, Marseille, France, June 2014, pp. 51–58 (2014) Neelima, B., Reddy, G.R.M., Raghavendra, P.S.: A GPU framework for sparse matrix vector multiplication. In: IEEE 13th International Symposium on Parallel and Distributed Computing, ISPDC, Marseille, France, June 2014, pp. 51–58 (2014)
25.
Zurück zum Zitat NVIDIA Corporation: NVIDIA CUDA C programming guide, version 3.2 (2010) NVIDIA Corporation: NVIDIA CUDA C programming guide, version 3.2 (2010)
26.
Zurück zum Zitat Pan, J., Manocha, D.: Fast GPU-based locality sensitive hashing for k-nearest neighbor computation. In: 19th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, Chicago, IL, USA, pp. 211–220 (2011) Pan, J., Manocha, D.: Fast GPU-based locality sensitive hashing for k-nearest neighbor computation. In: 19th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, Chicago, IL, USA, pp. 211–220 (2011)
27.
Zurück zum Zitat Samet, H.: Foundations of Multidimensional and Metric Data Structures. The Morgan Kaufmann Series in Computer Graphics and Geometric Modeling. Morgan Kaufmann Publishers Inc., San Francisco (2005)MATH Samet, H.: Foundations of Multidimensional and Metric Data Structures. The Morgan Kaufmann Series in Computer Graphics and Geometric Modeling. Morgan Kaufmann Publishers Inc., San Francisco (2005)MATH
28.
Zurück zum Zitat Teodoro, G., Valle, E., Mariano, N., da Silva Torres, R., M Jr., W., Saltz, J.H.: Approximate similarity search for online multimedia services on distributed CPU-GPU platforms. VLDB J. 23(3), 427–448 (2014)CrossRef Teodoro, G., Valle, E., Mariano, N., da Silva Torres, R., M Jr., W., Saltz, J.H.: Approximate similarity search for online multimedia services on distributed CPU-GPU platforms. VLDB J. 23(3), 427–448 (2014)CrossRef
29.
Zurück zum Zitat Wang, C., Wang, X.S.: Indexing very high-dimensional sparse and quasi-sparse vectors for similarity searches. VLDB J. 9(4), 344–361 (2001)MATH Wang, C., Wang, X.S.: Indexing very high-dimensional sparse and quasi-sparse vectors for similarity searches. VLDB J. 9(4), 344–361 (2001)MATH
30.
Zurück zum Zitat Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search: The Metric Space Approach, 1st edn. Springer, New York (2010)MATH Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search: The Metric Space Approach, 1st edn. Springer, New York (2010)MATH
Metadaten
Titel
Similarity Search of Sparse Histograms on GPU Architecture
verfasst von
Hasmik Osipyan
Jakub Lokoč
Stéphane Marchand-Maillet
Copyright-Jahr
2016
DOI
https://doi.org/10.1007/978-3-319-46759-7_25

Neuer Inhalt