Skip to main content

2019 | OriginalPaper | Buchkapitel

HGraph: A Connected-Partition Approach to Proximity Graphs for Similarity Search

verfasst von : Larissa Capobianco Shimomura, Daniel S. Kaster

Erschienen in: Database and Expert Systems Applications

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Similarity search is a common approach to support new applications that deal with complex data (e.g., images, videos, georeferenced data, etc.). As a consequence, appropriate indexing structures to support this task have been proposed in the literature. Recently, graph-based methods have shown to be very efficient for approximate similarity search. However, some of the main types of graphs used still suffer from two main drawbacks: (i) slow construction, and (ii) inaccurate retrieval. To reduce these drawbacks, in this paper, we propose the HGraph method. HGraph is a divide-and-conquer method for building graphs for similarity search that recursively partitions the input dataset and connect vertices across partitions at different levels. The method can be used with different types of graphs proposed in the literature to speed up the graph construction time as well as to increase the approximate search results quality through long-range edges connecting pivots of different partitions. We present experimental results using real datasets that show that HGraph applied to the k-NNG graph was able to decrease the construction time while increasing the approximate search recall when compared to the k-NNG. Regarding the application of HGraph to the NSW graph, the query recall also increased, however with a higher computational cost. An analysis of different combinations of the tested methods demonstrated HGraph query times given a recall rate were always among the top results regarding different setups.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Amato, G., Esuli, A., Falchi, F.: A comparison of pivot selection techniques for permutation-based indexing. Inf. Syst. 52(C), 176–188 (2015)CrossRef Amato, G., Esuli, A., Falchi, F.: A comparison of pivot selection techniques for permutation-based indexing. Inf. Syst. 52(C), 176–188 (2015)CrossRef
2.
Zurück zum Zitat Barioni, M.C.N., Kaster, D.D.S., Razente, H.L., Traina, A.J., Júnior, C.T.: Advanced Database Query Systems. IGI Global (2011) Barioni, M.C.N., Kaster, D.D.S., Razente, H.L., Traina, A.J., Júnior, C.T.: Advanced Database Query Systems. IGI Global (2011)
4.
Zurück zum Zitat Bustos, B., Navarro, G., Chavez, E.: Pivot selection techniques for proximity searching in metric spaces. In: SCCC, pp. 33–40, November 2001 Bustos, B., Navarro, G., Chavez, E.: Pivot selection techniques for proximity searching in metric spaces. In: SCCC, pp. 33–40, November 2001
5.
Zurück zum Zitat Chávez, E., Navarro, G., Baeza-Yates, R., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001)CrossRef Chávez, E., Navarro, G., Baeza-Yates, R., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001)CrossRef
6.
Zurück zum Zitat Chen, J., Fangand, H.R., Saad, Y.: Fast approximate KNN graph construction for high dimensional data via recursive Lanczos bisection. J. Mach. Learn. Res. 10, 1989–2012 (2009)MathSciNetMATH Chen, J., Fangand, H.R., Saad, Y.: Fast approximate KNN graph construction for high dimensional data via recursive Lanczos bisection. J. Mach. Learn. Res. 10, 1989–2012 (2009)MathSciNetMATH
7.
Zurück zum Zitat Hajebi, K., Abbasi-Yadkori, Y., Shahbazi, H., Zhang, H.: Fast approximate nearest-neighbor search with k-nearest neighbor graph. In: IJCAI, pp. 1312–1317 (2011) Hajebi, K., Abbasi-Yadkori, Y., Shahbazi, H., Zhang, H.: Fast approximate nearest-neighbor search with k-nearest neighbor graph. In: IJCAI, pp. 1312–1317 (2011)
8.
Zurück zum Zitat Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)CrossRef Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)CrossRef
9.
Zurück zum Zitat Malkov, Y., et al.: Approximate nearest neighbor algorithm based on navigable small world graphs. Inf. Syst. 45, 61–68 (2014)CrossRef Malkov, Y., et al.: Approximate nearest neighbor algorithm based on navigable small world graphs. Inf. Syst. 45, 61–68 (2014)CrossRef
10.
Zurück zum Zitat Ocsa, A., Bedregal, C., Cuadros-Vargas, E.: A new approach for similarity queries using neighborhood graphs. In: Brazilian Symposium on Databases, pp. 131–142 (2007) Ocsa, A., Bedregal, C., Cuadros-Vargas, E.: A new approach for similarity queries using neighborhood graphs. In: Brazilian Symposium on Databases, pp. 131–142 (2007)
11.
Zurück zum Zitat Ortega, M., Rui, Y., Chakrabarti, K., Porkaew, K., Mehrotra, S., Huang, T.S.: Supporting ranked boolean similarity queries in MARS. TKDE 10(6), 905–925 (1998) Ortega, M., Rui, Y., Chakrabarti, K., Porkaew, K., Mehrotra, S., Huang, T.S.: Supporting ranked boolean similarity queries in MARS. TKDE 10(6), 905–925 (1998)
13.
Zurück zum Zitat Park, H.S., Jun, C.H.: A simple and fast algorithm for k-medoids clustering. Expert Syst. Appl. 36(2, Part 2), 3336–3341 (2009)CrossRef Park, H.S., Jun, C.H.: A simple and fast algorithm for k-medoids clustering. Expert Syst. Appl. 36(2, Part 2), 3336–3341 (2009)CrossRef
14.
15.
Zurück zum Zitat Traina Jr., C., Filho, R.F., Traina, A.J., Vieira, M.R., Faloutsos, C.: The Omni-family of all-purpose access methods: a simple and effective way to make similarity search more efficient. VLDB J. 16(4), 483–505 (2007)CrossRef Traina Jr., C., Filho, R.F., Traina, A.J., Vieira, M.R., Faloutsos, C.: The Omni-family of all-purpose access methods: a simple and effective way to make similarity search more efficient. VLDB J. 16(4), 483–505 (2007)CrossRef
16.
Zurück zum Zitat Uhlmann, J.K.: Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett. 40(4), 175–179 (1991)CrossRef Uhlmann, J.K.: Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett. 40(4), 175–179 (1991)CrossRef
Metadaten
Titel
HGraph: A Connected-Partition Approach to Proximity Graphs for Similarity Search
verfasst von
Larissa Capobianco Shimomura
Daniel S. Kaster
Copyright-Jahr
2019
DOI
https://doi.org/10.1007/978-3-030-27615-7_8

Premium Partner