skip to main content
article

Aggregate nearest neighbor queries in spatial databases

Authors Info & Claims
Published:01 June 2005Publication History
Skip Abstract Section

Abstract

Given two spatial datasets P (e.g., facilities) and Q (queries), an aggregate nearest neighbor (ANN) query retrieves the point(s) of P with the smallest aggregate distance(s) to points in Q. Assuming, for example, n users at locations q1,…qn, an ANN query outputs the facility pP that minimizes the sum of distances |pqi| for 1 ≤ in that the users have to travel in order to meet there. Similarly, another ANN query may report the point pP that minimizes the maximum distance that any user has to travel, or the minimum distance from some user to his/her closest facility. If Q fits in memory and P is indexed by an R-tree, we develop algorithms for aggregate nearest neighbors that capture several versions of the problem, including weighted queries and incremental reporting of results. Then, we analyze their performance and propose cost models for query optimization. Finally, we extend our techniques for disk-resident queries and approximate ANN retrieval. The efficiency of the algorithms and the accuracy of the cost models are evaluated through extensive experiments with real and synthetic datasets.

References

  1. Acharya, S., Poosala, V., and Ramaswamy, S. 1999. Selectivity estimation in spatial databases. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Philadelphia, Pa., June), ACM, New York, 13--24.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Aggrawal, C. and Yu, P. 2001. Outlier detection for high dimensional data. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Santa Barbara, Calif., May). ACM, New York, 37--47.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Arya, S., Mount, D., Netanyahu, N., Silverman, R., and Wu, A. 1998. An optimal algorithm for approximate nearest neighbor searching. J. ACM 45, 6, 891--923.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Beckmann, N., Kriegel, H., Schneider, R., and Seeger, B. 1990. The R*-tree: An efficient and robust access method for points and rectangles. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Atlantic City, N.J., May). ACM, New York, 322--331.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Benetis, R., Jensen, C., Karciauskas, G., and Saltenis, S. 2002. Nearest neighbor and reverse nearest neighbor queries for moving objects. In Proceedings of International Database Engineering and Applications Symposium (IDEAS) (Edmonton, Ont., Canada, July). 44--53.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Berchtold, S., Bohm, C., Keim, D.A., and Kriegel, H. 1997. A cost model for nearest neighbor saearch in high-dimensional data space. In Proceedings of ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS) (Tucson, Az. May). ACM, New York, 78--86.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Beyer, K., Goldstein, J., Ramakrishnan, R., and Shaft, U. 1999. When is nearest neighbor meaningful? In Proceedings of International Conference on Database Theory (ICDT) (Jerusalem, Israel, Jan.). 217--235.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Bohm, C. 2000. A cost model for query processing in high dimensional data spaces. ACM Trans. Datab. Syst. 25, 2, 129--178.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Bruno, N., Chaudhuri, S., and Gravano, L. 2002. Top-k selection queries over relational databases: Mapping strategies and performance evaluation. ACM Trans. Datab. Syst. 27, 2, 153--187.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Chang, Y., Bergman, L., Castelli, V., Li, C., Lo, M., and Smith, J. 2000. The onion technique: Indexing for linear optimization queries. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Dallas, Tex., May). ACM, New York, 391--402.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Cheung, K. and Fu, A. 1998. Enhanced nearest neighbour search on the R-tree. SIGMOD Record 27, 3, 16--21.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Corral, A., Manolopoulos, Y., Theodoridis, Y., and Vassilakopoulos, M. 2000. Closest pair queries in spatial databases. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Dallas, Tex., May). ACM, New York, 189--200.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Fagin, R. 2002. Combining fuzzy information: An overview. SIGMOD Rec. 31, 2, 109--118.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Fagin, R., Lotem, A., and Naor, M. 2001. Optimal aggregation algorithms for middleware. In Proceedings of ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS) (Santa Barbara, Calif., May). ACM, New York, 102--113.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Guttman, A. 1984. R-Trees: A dynamic index structure for spatial searching. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Boston, Mass., June). ACM, New York, 47--57.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Jain, A., Murthy, M., and Flynn, P. 1999. Data clustering: A review. ACM Comput. Surv. 31, 3, 264--323.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Henrich, A. 1994. A distance scan algorithm for spatial access structures. In Proceedings of ACM Workshop on Geographic Information Systems (ACM GIS) (Gaithersburg, Md., Dec.). ACM, New York, 136--143.]]Google ScholarGoogle Scholar
  18. Hjaltason, G. and Samet, H. 1998. Incremental distance join algorithms for spatial databases. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Seattle, Wash., June). ACM, New York, 237--248.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Hjaltason, G. and Samet, H. 1999. Distance browsing in spatial databases. ACM Trans. Datab. Syst. 24, 2, 265--318.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Hochreiter, S., Younger, A. S., and Conwell, P. 2001. Learning to learn using gradient descent. In Proceedings of International Conference on Artificial Neural Networks (ICANN) (Vienna, Austria, Aug.). 87--94.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Hristidis, V. and Papakonstantinou, Y. 2004. Algorithms and applications for answering ranked queries using ranked views. VLDB J. 13, 1, 49--70.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Kolahdouzan, M. and Shahabi, C. 2004. Voronoi-based K nearest neighbor search for spatial network databases. In Proceedings of Very Large Data Bases Conference (VLDB) (Toronto, Ont., Canada, Aug.). 840--851.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Kollios, G., Gunopulos, D., and Tsotras, V. 1999. Nearest neighbor queries in mobile environment. In International Workshop on Spatio-Temporal Database Management (STDBM) (Edinburgh, Scotland, Sept.). 119--134.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Korn, F., Pagel, B., and Faloutsos, C. 2001. On the “Dimensionality Curse” and the “Self-Similarity Blessing.” IEEE Trans. Knowl. Data Eng. 13, 1, 96--111.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Liu, X. and Ferhatosmanoglu, H. 2003. Efficient k-NN search on streaming data series. In Proceedings of International Symposium on Advances in Spatial and Temporal Databases (SSTD) (Santorini Island, Greece, July). 83--101.]]Google ScholarGoogle Scholar
  26. Nakano, K. and Olariu, S. 1997. An optimal algorithm for the angle-restricted all nearest neighbor problem on the reconfigurable mesh, with applications. IEEE Trans. Paral. Distrib. Syst. 8, 9, 983--990.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Ng, R. and Han, J. 1994. Efficient and effective clustering methods for spatial data mining. In Proceedings of Very Large Data Bases Conference (VLDB) (San Francisco, Calif., Sept.). 144--155.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Papadias, D., Shen, Q., Tao, Y., and Mouratidis, K. 2004. Group nearest neighbor queries. In Proceedings of IEEE International Conference on Data Engineering (ICDE) (Boston, Mass., Mar.). IEEE Computer Society Press, Los Alamitos, Calif., 301--312.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Papadias, D., Zhang, J., Mamoulis, N., and Tao, Y. 2003. Query processing in spatial network databases. In Proceedings of Very Large Data Bases Conference (VLDB) (Berlin, Germany, Sept.). 802--813.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Papadopoulos, A. and Manolopoulos, Y. 1997. Performance of nearest neighbor queries in R-trees. In Proceedings of International Conference on Database Theory (ICDT) (Delphi, Greece, Jan.). 394--408.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Press, W., Flannery, B., Teukolsky, S., and Vetterling, W. 2002. Numerical recipes in C++ (second edition). Cambridge University Press, ISBN 0-521-75034-2.]]Google ScholarGoogle Scholar
  32. Roussopoulos, N., Kelly, S., and Vincent, F. 1995. Nearest neighbor queries. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (San Jose, Calif., May). ACM, New York, 71--79.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Sakurai, Y., Yoshikawa, M., Uemura, S., and Kojima, H. 2000. The A-tree: An index structure for high-dimensional spaces using relative approximation. In Proceedings of Very Large Data Bases Conference (VLDB) (Cairo, Egypt, Sept.). 516--526.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Shan, J., Zhang, D., and Salzberg, B. 2003. On spatial-range closest-pair query. In Proceedings of International Symposium on Advances in Spatial and Temporal Databases (SSTD) (Santorini Island, Greece, July). 252--269.]]Google ScholarGoogle Scholar
  35. Song, Z. and Roussopoulos, N. 2001. K-nearest neighbor search for moving query point. In Proceedings of International Symposium on Advances in Spatial and Temporal Databases (SSTD) (Redondo Beach, Calif., July). 79--96.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Sproull, R. 1991. Refinements to nearest neighbor searching in K-dimensional trees. Algorithmica 6, 4, 579--589.]]Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Tao, Y. and Papadias, D. 2003. Spatial queries in dynamic environments. ACM Trans. Syst. 28, 2, 101--139.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Tao, Y., Zhang, J., Papadias, D., and Mamoulis, N. 2004. An efficient cost model for optimization of nearest neighbor search in low and medium dimensional spaces. IEEE Trans. Knowl. Data Eng. 16, 10, 1169--1184.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Theodoridis, Y., Stefanakis, E., and Sellis, T. 2000. Efficient cost models for spatial queries using R-trees. IEEE Trans. Knowl. Data Eng. 12, 1, 19--32.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Weber, R., Schek, H. J., and Blott, S. 1998. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In Proceedings of Very Large Data Bases Conference (VLDB) (New York, N.Y., Aug.). 194--205.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Welzl, E. 1991. Smallest enclosing disks (balls and ellipsoids). In New Results and New Trends in Computer Science. Springer-Verlag, Lecture Notes in Computer Science, vol. 555. New York, 359--370.]]Google ScholarGoogle Scholar
  42. Wesolowsky, G. 1993. The Weber problem: History and perspectives. Loc. Sci. 1, 1, 5--23.]]Google ScholarGoogle Scholar
  43. Xiong, X., Mokbel, M., and Aref, W. 2005. SEA-CNN: Scalable processing of continuous K-nearest neighbor queries in spatio-temporal databases. In Proceedings of IEEE International Conference on Data Engineering (ICDE) (Tokyo, Japan, Apr.). IEEE Computer Society Press, Los Alamitos, Calif., 643--654.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Yu, C., Ooi, B, Tan, K., and Jagadish, H. 2001. Indexing the distance: An efficient method to KNN processing. In Proceedings of Very Large Data Bases Conference (VLDB) (Rome, Italy, Sept.). 421--430.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Yu, X., Pu, K., and Koudas, N. 2005. Monitoring K-nearest neighbor queries over moving objects. In Proceedings of IEEE International Conference on Data Engineering (ICDE) (Tokyo, Japan, Apr.). IEEE Computer Society Press, Los Alamitos, Calif., 631--642.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Aggregate nearest neighbor queries in spatial databases

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Database Systems
        ACM Transactions on Database Systems  Volume 30, Issue 2
        June 2005
        328 pages
        ISSN:0362-5915
        EISSN:1557-4644
        DOI:10.1145/1071610
        Issue’s Table of Contents

        Copyright © 2005 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 June 2005
        Published in tods Volume 30, Issue 2

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader