Abstract
Given two spatial datasets P (e.g., facilities) and Q (queries), an aggregate nearest neighbor (ANN) query retrieves the point(s) of P with the smallest aggregate distance(s) to points in Q. Assuming, for example, n users at locations q1,…qn, an ANN query outputs the facility p ∈ P that minimizes the sum of distances |pqi| for 1 ≤ i ≤ n that the users have to travel in order to meet there. Similarly, another ANN query may report the point p ∈ P that minimizes the maximum distance that any user has to travel, or the minimum distance from some user to his/her closest facility. If Q fits in memory and P is indexed by an R-tree, we develop algorithms for aggregate nearest neighbors that capture several versions of the problem, including weighted queries and incremental reporting of results. Then, we analyze their performance and propose cost models for query optimization. Finally, we extend our techniques for disk-resident queries and approximate ANN retrieval. The efficiency of the algorithms and the accuracy of the cost models are evaluated through extensive experiments with real and synthetic datasets.
- Acharya, S., Poosala, V., and Ramaswamy, S. 1999. Selectivity estimation in spatial databases. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Philadelphia, Pa., June), ACM, New York, 13--24.]] Google ScholarDigital Library
- Aggrawal, C. and Yu, P. 2001. Outlier detection for high dimensional data. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Santa Barbara, Calif., May). ACM, New York, 37--47.]] Google ScholarDigital Library
- Arya, S., Mount, D., Netanyahu, N., Silverman, R., and Wu, A. 1998. An optimal algorithm for approximate nearest neighbor searching. J. ACM 45, 6, 891--923.]] Google ScholarDigital Library
- Beckmann, N., Kriegel, H., Schneider, R., and Seeger, B. 1990. The R*-tree: An efficient and robust access method for points and rectangles. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Atlantic City, N.J., May). ACM, New York, 322--331.]] Google ScholarDigital Library
- Benetis, R., Jensen, C., Karciauskas, G., and Saltenis, S. 2002. Nearest neighbor and reverse nearest neighbor queries for moving objects. In Proceedings of International Database Engineering and Applications Symposium (IDEAS) (Edmonton, Ont., Canada, July). 44--53.]] Google ScholarDigital Library
- Berchtold, S., Bohm, C., Keim, D.A., and Kriegel, H. 1997. A cost model for nearest neighbor saearch in high-dimensional data space. In Proceedings of ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS) (Tucson, Az. May). ACM, New York, 78--86.]] Google ScholarDigital Library
- Beyer, K., Goldstein, J., Ramakrishnan, R., and Shaft, U. 1999. When is nearest neighbor meaningful? In Proceedings of International Conference on Database Theory (ICDT) (Jerusalem, Israel, Jan.). 217--235.]] Google ScholarDigital Library
- Bohm, C. 2000. A cost model for query processing in high dimensional data spaces. ACM Trans. Datab. Syst. 25, 2, 129--178.]] Google ScholarDigital Library
- Bruno, N., Chaudhuri, S., and Gravano, L. 2002. Top-k selection queries over relational databases: Mapping strategies and performance evaluation. ACM Trans. Datab. Syst. 27, 2, 153--187.]] Google ScholarDigital Library
- Chang, Y., Bergman, L., Castelli, V., Li, C., Lo, M., and Smith, J. 2000. The onion technique: Indexing for linear optimization queries. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Dallas, Tex., May). ACM, New York, 391--402.]] Google ScholarDigital Library
- Cheung, K. and Fu, A. 1998. Enhanced nearest neighbour search on the R-tree. SIGMOD Record 27, 3, 16--21.]] Google ScholarDigital Library
- Corral, A., Manolopoulos, Y., Theodoridis, Y., and Vassilakopoulos, M. 2000. Closest pair queries in spatial databases. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Dallas, Tex., May). ACM, New York, 189--200.]] Google ScholarDigital Library
- Fagin, R. 2002. Combining fuzzy information: An overview. SIGMOD Rec. 31, 2, 109--118.]] Google ScholarDigital Library
- Fagin, R., Lotem, A., and Naor, M. 2001. Optimal aggregation algorithms for middleware. In Proceedings of ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS) (Santa Barbara, Calif., May). ACM, New York, 102--113.]] Google ScholarDigital Library
- Guttman, A. 1984. R-Trees: A dynamic index structure for spatial searching. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Boston, Mass., June). ACM, New York, 47--57.]] Google ScholarDigital Library
- Jain, A., Murthy, M., and Flynn, P. 1999. Data clustering: A review. ACM Comput. Surv. 31, 3, 264--323.]] Google ScholarDigital Library
- Henrich, A. 1994. A distance scan algorithm for spatial access structures. In Proceedings of ACM Workshop on Geographic Information Systems (ACM GIS) (Gaithersburg, Md., Dec.). ACM, New York, 136--143.]]Google Scholar
- Hjaltason, G. and Samet, H. 1998. Incremental distance join algorithms for spatial databases. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (Seattle, Wash., June). ACM, New York, 237--248.]] Google ScholarDigital Library
- Hjaltason, G. and Samet, H. 1999. Distance browsing in spatial databases. ACM Trans. Datab. Syst. 24, 2, 265--318.]] Google ScholarDigital Library
- Hochreiter, S., Younger, A. S., and Conwell, P. 2001. Learning to learn using gradient descent. In Proceedings of International Conference on Artificial Neural Networks (ICANN) (Vienna, Austria, Aug.). 87--94.]] Google ScholarDigital Library
- Hristidis, V. and Papakonstantinou, Y. 2004. Algorithms and applications for answering ranked queries using ranked views. VLDB J. 13, 1, 49--70.]] Google ScholarDigital Library
- Kolahdouzan, M. and Shahabi, C. 2004. Voronoi-based K nearest neighbor search for spatial network databases. In Proceedings of Very Large Data Bases Conference (VLDB) (Toronto, Ont., Canada, Aug.). 840--851.]] Google ScholarDigital Library
- Kollios, G., Gunopulos, D., and Tsotras, V. 1999. Nearest neighbor queries in mobile environment. In International Workshop on Spatio-Temporal Database Management (STDBM) (Edinburgh, Scotland, Sept.). 119--134.]] Google ScholarDigital Library
- Korn, F., Pagel, B., and Faloutsos, C. 2001. On the “Dimensionality Curse” and the “Self-Similarity Blessing.” IEEE Trans. Knowl. Data Eng. 13, 1, 96--111.]] Google ScholarDigital Library
- Liu, X. and Ferhatosmanoglu, H. 2003. Efficient k-NN search on streaming data series. In Proceedings of International Symposium on Advances in Spatial and Temporal Databases (SSTD) (Santorini Island, Greece, July). 83--101.]]Google Scholar
- Nakano, K. and Olariu, S. 1997. An optimal algorithm for the angle-restricted all nearest neighbor problem on the reconfigurable mesh, with applications. IEEE Trans. Paral. Distrib. Syst. 8, 9, 983--990.]] Google ScholarDigital Library
- Ng, R. and Han, J. 1994. Efficient and effective clustering methods for spatial data mining. In Proceedings of Very Large Data Bases Conference (VLDB) (San Francisco, Calif., Sept.). 144--155.]] Google ScholarDigital Library
- Papadias, D., Shen, Q., Tao, Y., and Mouratidis, K. 2004. Group nearest neighbor queries. In Proceedings of IEEE International Conference on Data Engineering (ICDE) (Boston, Mass., Mar.). IEEE Computer Society Press, Los Alamitos, Calif., 301--312.]] Google ScholarDigital Library
- Papadias, D., Zhang, J., Mamoulis, N., and Tao, Y. 2003. Query processing in spatial network databases. In Proceedings of Very Large Data Bases Conference (VLDB) (Berlin, Germany, Sept.). 802--813.]] Google ScholarDigital Library
- Papadopoulos, A. and Manolopoulos, Y. 1997. Performance of nearest neighbor queries in R-trees. In Proceedings of International Conference on Database Theory (ICDT) (Delphi, Greece, Jan.). 394--408.]] Google ScholarDigital Library
- Press, W., Flannery, B., Teukolsky, S., and Vetterling, W. 2002. Numerical recipes in C++ (second edition). Cambridge University Press, ISBN 0-521-75034-2.]]Google Scholar
- Roussopoulos, N., Kelly, S., and Vincent, F. 1995. Nearest neighbor queries. In Proceedings of ACM Conference on the Management of Data (SIGMOD) (San Jose, Calif., May). ACM, New York, 71--79.]] Google ScholarDigital Library
- Sakurai, Y., Yoshikawa, M., Uemura, S., and Kojima, H. 2000. The A-tree: An index structure for high-dimensional spaces using relative approximation. In Proceedings of Very Large Data Bases Conference (VLDB) (Cairo, Egypt, Sept.). 516--526.]] Google ScholarDigital Library
- Shan, J., Zhang, D., and Salzberg, B. 2003. On spatial-range closest-pair query. In Proceedings of International Symposium on Advances in Spatial and Temporal Databases (SSTD) (Santorini Island, Greece, July). 252--269.]]Google Scholar
- Song, Z. and Roussopoulos, N. 2001. K-nearest neighbor search for moving query point. In Proceedings of International Symposium on Advances in Spatial and Temporal Databases (SSTD) (Redondo Beach, Calif., July). 79--96.]] Google ScholarDigital Library
- Sproull, R. 1991. Refinements to nearest neighbor searching in K-dimensional trees. Algorithmica 6, 4, 579--589.]]Google ScholarDigital Library
- Tao, Y. and Papadias, D. 2003. Spatial queries in dynamic environments. ACM Trans. Syst. 28, 2, 101--139.]] Google ScholarDigital Library
- Tao, Y., Zhang, J., Papadias, D., and Mamoulis, N. 2004. An efficient cost model for optimization of nearest neighbor search in low and medium dimensional spaces. IEEE Trans. Knowl. Data Eng. 16, 10, 1169--1184.]] Google ScholarDigital Library
- Theodoridis, Y., Stefanakis, E., and Sellis, T. 2000. Efficient cost models for spatial queries using R-trees. IEEE Trans. Knowl. Data Eng. 12, 1, 19--32.]] Google ScholarDigital Library
- Weber, R., Schek, H. J., and Blott, S. 1998. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In Proceedings of Very Large Data Bases Conference (VLDB) (New York, N.Y., Aug.). 194--205.]] Google ScholarDigital Library
- Welzl, E. 1991. Smallest enclosing disks (balls and ellipsoids). In New Results and New Trends in Computer Science. Springer-Verlag, Lecture Notes in Computer Science, vol. 555. New York, 359--370.]]Google Scholar
- Wesolowsky, G. 1993. The Weber problem: History and perspectives. Loc. Sci. 1, 1, 5--23.]]Google Scholar
- Xiong, X., Mokbel, M., and Aref, W. 2005. SEA-CNN: Scalable processing of continuous K-nearest neighbor queries in spatio-temporal databases. In Proceedings of IEEE International Conference on Data Engineering (ICDE) (Tokyo, Japan, Apr.). IEEE Computer Society Press, Los Alamitos, Calif., 643--654.]] Google ScholarDigital Library
- Yu, C., Ooi, B, Tan, K., and Jagadish, H. 2001. Indexing the distance: An efficient method to KNN processing. In Proceedings of Very Large Data Bases Conference (VLDB) (Rome, Italy, Sept.). 421--430.]] Google ScholarDigital Library
- Yu, X., Pu, K., and Koudas, N. 2005. Monitoring K-nearest neighbor queries over moving objects. In Proceedings of IEEE International Conference on Data Engineering (ICDE) (Tokyo, Japan, Apr.). IEEE Computer Society Press, Los Alamitos, Calif., 631--642.]] Google ScholarDigital Library
Index Terms
- Aggregate nearest neighbor queries in spatial databases
Recommendations
Continuous obstructed nearest neighbor queries in spatial databases
SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of dataIn this paper, we study a novel form of continuous nearest neighbor queries in the presence of obstacles, namely continuous obstructed nearest neighbor (CONN) search. It considers the impact of obstacles on the distance between objects, which is ignored ...
Group visible nearest neighbor queries in spatial databases
WAIM'10: Proceedings of the 11th international conference on Web-age information managementTraditional nearest neighbor queries and its variants, such as Group Nearest Neighbor Query (GNN), have been widely studied by many researchers. Recently obstacles are involved in spatial queries. The existence of obstacles may affect the query results ...
Approximate Direct and Reverse Nearest Neighbor Queries, and the k-nearest Neighbor Graph
SISAP '09: Proceedings of the 2009 Second International Workshop on Similarity Search and ApplicationsRetrieving the \emph{k-nearest neighbors} of a query object is a basic primitive in similarity searching. A related, far less explored primitive is to obtain the dataset elements which would have the query object within their own \emph{k}-nearest ...
Comments