Skip to main content
Erschienen in: World Wide Web 2/2017

13.05.2016

Multiple k nearest neighbor search

verfasst von: Yu-Chi Chung, I-Fang Su, Chiang Lee, Pei-Chi Liu

Erschienen in: World Wide Web | Ausgabe 2/2017

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The problem of kNN (k Nearest Neighbor) queries has received considerable attention in the database and information retrieval communities. Given a dataset D and a kNN query q, the k nearest neighbor algorithm finds the closest k data points to q. The applications of kNN queries are board, not only in spatio-temporal databases but also in many areas. For example, they can be used in multimedia databases, data mining, scientific databases and video retrieval. The past studies of kNN query processing did not consider the case that the server may receive multiple kNN queries at one time. Their algorithms process queries independently. Thus, the server will be busy with continuously reaccessing the database to obtain the data that have already been acquired. This results in wasting I/O costs and degrading the performance of the whole system. In this paper, we focus on this problem and propose an algorithm named COrrelated kNN query Evaluation (COKE). The main idea of COKE is an “information sharing” strategy whereby the server reuses the query results of previously executed queries for efficiently processing subsequent queries. We conduct a comprehensive set of experiments to analyze the performance of COKE and compare it with the Best-First Search (BFS) algorithm. Empirical studies indicate that COKE outperforms BFS, and achieves lower I/O costs and less running time.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
5.
Zurück zum Zitat Bohm, C., Ooi, B. C., Plant, C., Yan, Y.: Efficiently Processing Continuous K-Nn Queries on Data Streams. In: Proceedings of the International Conference on Data Engineering (ICDE), pp. 156–165 (2007) Bohm, C., Ooi, B. C., Plant, C., Yan, Y.: Efficiently Processing Continuous K-Nn Queries on Data Streams. In: Proceedings of the International Conference on Data Engineering (ICDE), pp. 156–165 (2007)
6.
Zurück zum Zitat Braunmuller, B., Ester, M., Kriegel, H.P., Sander, J.: Multiple similarity queries: a basic dbms operation for mining in metric databases. IEEE Trans. Knowl. Data Eng. 13(1), 79–95 (2001)CrossRef Braunmuller, B., Ester, M., Kriegel, H.P., Sander, J.: Multiple similarity queries: a basic dbms operation for mining in metric databases. IEEE Trans. Knowl. Data Eng. 13(1), 79–95 (2001)CrossRef
7.
Zurück zum Zitat Chávez, E., Navarro, G., Baeza-Yates, R., Marroguin, J.L.: Searching in metric spaces. J. ACM Comput. Surv. (CSUR) 33(3), 273–321 (2001)CrossRef Chávez, E., Navarro, G., Baeza-Yates, R., Marroguin, J.L.: Searching in metric spaces. J. ACM Comput. Surv. (CSUR) 33(3), 273–321 (2001)CrossRef
8.
Zurück zum Zitat Cui, B., Ooi, B.C., Su, J., Tan, K.L.: Indexing high-dimensional data for efficient in-memory similarity search. IEEE Trans. Knowl. Data Eng. 17(3), 339–353 (2005)CrossRef Cui, B., Ooi, B.C., Su, J., Tan, K.L.: Indexing high-dimensional data for efficient in-memory similarity search. IEEE Trans. Knowl. Data Eng. 17(3), 339–353 (2005)CrossRef
9.
Zurück zum Zitat Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R.: Advances in knowledge discovery and data mining (1996) Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R.: Advances in knowledge discovery and data mining (1996)
10.
Zurück zum Zitat Hjaltason, G.R., Samet, H.: Distance browsing in spatial databases. ACM Trans. Database Syst. (TODS) 24(2), 265–318 (2010)CrossRef Hjaltason, G.R., Samet, H.: Distance browsing in spatial databases. ACM Trans. Database Syst. (TODS) 24(2), 265–318 (2010)CrossRef
11.
Zurück zum Zitat Jagadish, H.V., Ooi, B.C., lee Tan, K., Yu, C., Zhang, R.: Idistance: An adaptive b+-tree based indexing method for nearest neighbor search. ACM Trans. Database Syst. (TODS) 30(2), 364–397 (2005)CrossRef Jagadish, H.V., Ooi, B.C., lee Tan, K., Yu, C., Zhang, R.: Idistance: An adaptive b+-tree based indexing method for nearest neighbor search. ACM Trans. Database Syst. (TODS) 30(2), 364–397 (2005)CrossRef
12.
Zurück zum Zitat Jiazhu, D., Zhilong, L.: A Location Authentication Scheme Based on Proximity Test of Location Tags. In: Proceedings of 2013 International Conference on Information and Network Security (ICINS 2013), pp. 1–6 (2013) Jiazhu, D., Zhilong, L.: A Location Authentication Scheme Based on Proximity Test of Location Tags. In: Proceedings of 2013 International Conference on Information and Network Security (ICINS 2013), pp. 1–6 (2013)
13.
Zurück zum Zitat Kolahdouzan, M., Shahabi, C.: Voronoi-Based K Nearest Neighbor Search for Spatial Network Databases. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases, pp. 840–851 (2004) Kolahdouzan, M., Shahabi, C.: Voronoi-Based K Nearest Neighbor Search for Spatial Network Databases. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases, pp. 840–851 (2004)
14.
Zurück zum Zitat Korn, F., Muthukrishnan, S.: Influence Sets Based on Reverse Nearest Neighbor Queries. In: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pp. 201–212 (2000) Korn, F., Muthukrishnan, S.: Influence Sets Based on Reverse Nearest Neighbor Queries. In: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pp. 201–212 (2000)
15.
Zurück zum Zitat Korn, F., Sidiropoulos, N., Faloutsos, C., Siegel, E., Protopapas, Z.: Fast Nearest Neighbor Search in Medical Image Databases (1996). CA, USA Korn, F., Sidiropoulos, N., Faloutsos, C., Siegel, E., Protopapas, Z.: Fast Nearest Neighbor Search in Medical Image Databases (1996). CA, USA
16.
Zurück zum Zitat Lu, H., Ooi, B.C., Shen, H.T., Xue, X.: Hierarchical indexing structure for efficient similarity search in video retrieval. IEEE Trans. Knowl. Data Eng. 18(11), 1544–1559 (2006)CrossRef Lu, H., Ooi, B.C., Shen, H.T., Xue, X.: Hierarchical indexing structure for efficient similarity search in video retrieval. IEEE Trans. Knowl. Data Eng. 18(11), 1544–1559 (2006)CrossRef
17.
Zurück zum Zitat Lv, Q., Josephson, W., Wang, Z., Charikar, M., Li, K.: Multi-Probe Lsh: Efficient Indexing for High-Dimensional Similarity Search. In: Proceedings of the 33Rd International Conference on Very Data Bases, pp. 950–961 (2007) Lv, Q., Josephson, W., Wang, Z., Charikar, M., Li, K.: Multi-Probe Lsh: Efficient Indexing for High-Dimensional Similarity Search. In: Proceedings of the 33Rd International Conference on Very Data Bases, pp. 950–961 (2007)
18.
Zurück zum Zitat Mokbel, M.F., Xiong, X., Aref, W.G.: Sina: Scalable Incremental Processing of Continuous Queries in Spatio-Temporal Databases. In: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, pp. 623–634 (2004) Mokbel, M.F., Xiong, X., Aref, W.G.: Sina: Scalable Incremental Processing of Continuous Queries in Spatio-Temporal Databases. In: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, pp. 623–634 (2004)
19.
Zurück zum Zitat Mouratidis, K., Papadias, D.: Continuous nearest neighbor queries over sliding windows. IEEE Trans. Knowl. Data Eng. 19(6), 789–803 (2007)CrossRef Mouratidis, K., Papadias, D.: Continuous nearest neighbor queries over sliding windows. IEEE Trans. Knowl. Data Eng. 19(6), 789–803 (2007)CrossRef
20.
Zurück zum Zitat Papadias, D., Tao, Y., Mouratidis, K., Hui, C.K.: Aggregate nearest neighbor queries in spatial databases. ACM Trans. Database Syst. 30(2), 529–576 (2005)CrossRef Papadias, D., Tao, Y., Mouratidis, K., Hui, C.K.: Aggregate nearest neighbor queries in spatial databases. ACM Trans. Database Syst. 30(2), 529–576 (2005)CrossRef
21.
Zurück zum Zitat Rao, J., Ross, K.A.: Making B+-Trees Cache Conscious in Main Memory. In: ACM SIGMOD Record, vol. 29, pp. 475–486 (2000) Rao, J., Ross, K.A.: Making B+-Trees Cache Conscious in Main Memory. In: ACM SIGMOD Record, vol. 29, pp. 475–486 (2000)
22.
Zurück zum Zitat Roussopoulos, N., Kelly, S., Vncent, F.: Nearest Neighbor Queries. In: ACM SIGMOD International Conference on Management of Data. New Your, USA, pp. 71–79 (1995) Roussopoulos, N., Kelly, S., Vncent, F.: Nearest Neighbor Queries. In: ACM SIGMOD International Conference on Management of Data. New Your, USA, pp. 71–79 (1995)
23.
Zurück zum Zitat Saraiva, P.C., de Moura, E.S., Ziviani, N., andRodrigo Fonseca, W.M., Riberio-Neto, B.: Rank-Preserving Two-Level Caching for Scalable Search Engines. In: Proceedings of the 24Th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 51–58 (2001) Saraiva, P.C., de Moura, E.S., Ziviani, N., andRodrigo Fonseca, W.M., Riberio-Neto, B.: Rank-Preserving Two-Level Caching for Scalable Search Engines. In: Proceedings of the 24Th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 51–58 (2001)
24.
Zurück zum Zitat Seidl, T., Kriegel, H.P.: Efficient User-Adaptable Similarity Search in Large Multimedia Database. In: Proceedings of the 23Rd International Conference on Very Large Data Bases, pp. 506–515 (1997) Seidl, T., Kriegel, H.P.: Efficient User-Adaptable Similarity Search in Large Multimedia Database. In: Proceedings of the 23Rd International Conference on Very Large Data Bases, pp. 506–515 (1997)
25.
Zurück zum Zitat Sharifzadeh, M., Shahabi, C.: Vor-Tree: R-trees with Voronoi Diagrams for Efficient Processing of Spatial Nearest Neighbor Queries. In: Proceedings of the VLDB Endowment (2010) Sharifzadeh, M., Shahabi, C.: Vor-Tree: R-trees with Voronoi Diagrams for Efficient Processing of Spatial Nearest Neighbor Queries. In: Proceedings of the VLDB Endowment (2010)
26.
Zurück zum Zitat Tao, Y., Papadias, D., Lian, X.: Reverse Knn Search in Arbitrary Dimensionality. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases - Volume 30, pp. 744–755 (2004) Tao, Y., Papadias, D., Lian, X.: Reverse Knn Search in Arbitrary Dimensionality. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases - Volume 30, pp. 744–755 (2004)
27.
Zurück zum Zitat Tao, Y., Yi, K., Sheng, C., Kalnis, P.: Quality and Efficiency in High Dimensional Nearest Neighbor Search. In: ACM SIGMOD (2009) Tao, Y., Yi, K., Sheng, C., Kalnis, P.: Quality and Efficiency in High Dimensional Nearest Neighbor Search. In: ACM SIGMOD (2009)
28.
Zurück zum Zitat Teevan, J., Adar, E., Jones, R., Potts, M.: History Repeats Itself: Repeat Queries in Yahoo’s Query Logs. In: Proceedings of the 29Th Annual ACM Conference on Research and Development in Information Retrieval (SIGIR ’06), pp. 703–704 (2005) Teevan, J., Adar, E., Jones, R., Potts, M.: History Repeats Itself: Repeat Queries in Yahoo’s Query Logs. In: Proceedings of the 29Th Annual ACM Conference on Research and Development in Information Retrieval (SIGIR ’06), pp. 703–704 (2005)
29.
Zurück zum Zitat Weber, R., Schek, H., Blott, S.: A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces. In: VLDB, pp. 194–205 (1998) Weber, R., Schek, H., Blott, S.: A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces. In: VLDB, pp. 194–205 (1998)
30.
Zurück zum Zitat Willhalm, T., Popovici, N., Boshmaf, Y., Plattner, H., Zeier, A., Schaffner, J.: Simd-scan: ultra fast in-memory table scan using on-chip vector processing units. Proc. VLDB Endowment 2(1), 385–394 (2009)CrossRef Willhalm, T., Popovici, N., Boshmaf, Y., Plattner, H., Zeier, A., Schaffner, J.: Simd-scan: ultra fast in-memory table scan using on-chip vector processing units. Proc. VLDB Endowment 2(1), 385–394 (2009)CrossRef
31.
Zurück zum Zitat Xie, Y., O’Hallaron, D.: Locality in Search Engine Queries and Its Implications for Caching. In: Proceedings of 21 Annual Joint Conference of the IEEE Computer and Communications Societies (Infocom), pp. 1238–1247 (2002) Xie, Y., O’Hallaron, D.: Locality in Search Engine Queries and Its Implications for Caching. In: Proceedings of 21 Annual Joint Conference of the IEEE Computer and Communications Societies (Infocom), pp. 1238–1247 (2002)
32.
Zurück zum Zitat Xiong, X., Mokbel, M.F., Aref, W.G.: Sea-Cnn: Scalable Processing of Continuous K-Nearest Neighbor Queries in Spatio-Temporal Databases. In: Proceedings of the 21St International Conference on Data Engineering, pp. 643–654 (2005) Xiong, X., Mokbel, M.F., Aref, W.G.: Sea-Cnn: Scalable Processing of Continuous K-Nearest Neighbor Queries in Spatio-Temporal Databases. In: Proceedings of the 21St International Conference on Data Engineering, pp. 643–654 (2005)
33.
Zurück zum Zitat Yang, C., Lin, K.I.: An Index Structure for Efficient Reverse Nearest Neighbor Queries. In: Proceedings of the 2001 International Conference on Data Engineering, pp. 485–492 (2001) Yang, C., Lin, K.I.: An Index Structure for Efficient Reverse Nearest Neighbor Queries. In: Proceedings of the 2001 International Conference on Data Engineering, pp. 485–492 (2001)
34.
Zurück zum Zitat Zhang, J., Zhu, M., Papadias, D., Tao, Y., Lee, D.L.: Location-Based Spatial Queries. In: ACM SIGMOD Int. Conf. Manag. Data. NY, USA, pp. 443–454 (2003) Zhang, J., Zhu, M., Papadias, D., Tao, Y., Lee, D.L.: Location-Based Spatial Queries. In: ACM SIGMOD Int. Conf. Manag. Data. NY, USA, pp. 443–454 (2003)
35.
Zurück zum Zitat Zhang, R., Stradling, M.: The hv-tree: a memory hierarchy aware version index. Proc. VLDB Endowment 3(1-2), 397–408 (2010)CrossRef Zhang, R., Stradling, M.: The hv-tree: a memory hierarchy aware version index. Proc. VLDB Endowment 3(1-2), 397–408 (2010)CrossRef
36.
Zurück zum Zitat Zhuang, Y., Li, Q., Chen, L.: Multi-Query Optimization for Distributed Similarity Query Processing. In: International Conference on Distributed Computing Systems (ICDCS) (2008) Zhuang, Y., Li, Q., Chen, L.: Multi-Query Optimization for Distributed Similarity Query Processing. In: International Conference on Distributed Computing Systems (ICDCS) (2008)
Metadaten
Titel
Multiple k nearest neighbor search
verfasst von
Yu-Chi Chung
I-Fang Su
Chiang Lee
Pei-Chi Liu
Publikationsdatum
13.05.2016
Verlag
Springer US
Erschienen in
World Wide Web / Ausgabe 2/2017
Print ISSN: 1386-145X
Elektronische ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-016-0392-2

Weitere Artikel der Ausgabe 2/2017

World Wide Web 2/2017 Zur Ausgabe