Abstract
Top-k spatial preference queries return a ranked set of the k best data objects based on the scores of feature objects in their spatial neighborhood. Despite the wide range of location-based applications that rely on spatial preference queries, existing algorithms incur non-negligible processing cost resulting in high response time. The reason is that computing the score of a data object requires examining its spatial neighborhood to find the feature object with highest score. In this paper, we propose a novel technique to speed up the performance of top-k spatial preference queries. To this end, we propose a mapping of pairs of data and feature objects to a distance-score space, which in turn allows us to identify and materialize the minimal subset of pairs that is sufficient to answer any spatial preference query. Furthermore, we present a novel algorithm that improves query processing performance by avoiding examining the spatial neighborhood of the data objects during query execution. In addition, we propose an efficient algorithm for materialization and we describe useful properties that reduce the cost of maintenance. We show through extensive experiments that our approach significantly reduces the number of I/Os and execution time compared to the state-of-the-art algorithms for different setups.
- C. Böhm, B. C. Ooi, C. Plant, and Y. Yan. Efficiently processing continuous k-nn queries on data streams. In Proc. of Int. Conf. on Data Engineering (ICDE), pages 156--165, 2007.Google Scholar
- S. Börzsönyi, D. Kossmann, and K. Stocker. The skyline operator. In Proc. of Int. Conf. on Data Engineering (ICDE), page 421430, 2001. Google Scholar
- S. Chaudhuri, N. N. Dalvi, and R. Kaushik. Robust cardinality and cost estimation for skyline operator. In Proc. of Int. Conf. on Data Engineering (ICDE), page 64, 2006. Google Scholar
- Y. Du, D. Zhang, and T. Xia. The Optimal-Location query. In Proc. of the Int. Symposium on Spatial and Temporal Databases (SSTD), pages 163--180, 2005. Google Scholar
- R. Fagin, A. Lotem, and M. Naor. Optimal aggregation algorithms for middleware. Journal of Computer and System Sciences, 66(4):614--656, 2003. Google Scholar
- A. Guttman. R-trees: a dynamic index structure for spatial searching. In Proc. of the Int. Conf. on Management of Data (SIGMOD), pages 47--57, Boston, Massachusetts, 1984. Google Scholar
- J. Han, M. Kamber, and A. K. H. Tung. Spatial clustering methods in data mining: A survey. Geographic Data Mining and Knowledge Discovery, pages 1--29, 2001.Google Scholar
- F. Korn and S. Muthukrishnan. Influence sets based on reverse nearest neighbor queries. In Proc. of the Int. Conf. on Management of Data (SIGMOD), pages 201--212, 2000. Google Scholar
- N. Mamoulis, M. L. Yiu, K. H. Cheng, and D. W. Cheung. Efficient top-k aggregation of ranked inputs. ACM Transactions on Database Systems (TODS), 32(3): 19, 2007. Google Scholar
- K. Mouratidis, S. Bakiras, and D. Papadias. Continuous monitoring of top-k queries over sliding windows. In Proc. of the Int. Conf. on Management of Data (SIGMOD), pages 635--646, 2006. Google Scholar
- D. Papadias, P. Kalnis, J. Zhang, and Y Tao. Efficient OLAP operations in spatial data warehouses. In Proc. of the Int. Symposium on Advances in Spatial and Temporal Databases (SSTD), pages 443--459. Springer-Verlag, 2001. Google Scholar
- D. Papadias, Y. Tao, G. Fu, and B. Seeger. Progressive skyline computation in database systems. ACM Transactions on Database Systems (TODS), 30(1):41--82, 2005. Google Scholar
- E. Pekalska and R. P. W. Duin. Classifiers for dissimilarity-based pattern recognition. In Proc. of Int. Conf. on Pattern Recognition (ICPR), pages 2012--2016, 2000.Google Scholar
- H. Samet. The quadtree and related hierarchical data structures. ACM Comput. Surv., 16(2): 187--260, 1984. Google Scholar
- T. Xia, D. Zhang, E. Kanoulas, and Y. Du. On computing top-t most influential spatial sites. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), pages 946--957, 2005. Google Scholar
- M. L. Yiu, X. Dai, N. Mamoulis, and M. Vaitis. Top-k spatial preference queries. In Proc. of Int. Conf. on Data Engineering (ICDE), pages 1076--1085, 2007.Google Scholar
- M. L. Yiu, H. Lu, N. Mamoulis, and M. Vaitis. Ranking spatial data by quality preferences. Transactions on Knowledge and Data Engineering (TKDE), to appear. Google Scholar
- Z. Zhang, Y. Yang, R. Cai, D. Papadias, and A. K. H. Tung. Kernel-based skyline cardinality estimation. In Proc. of the Int. Conf. on Management of Data (SIGMOD), pages 509--522, 2009. Google Scholar
Index Terms
- Efficient processing of top-k spatial preference queries
Recommendations
Processing a large number of continuous preference top-k queries
SIGMOD '12: Proceedings of the 2012 ACM SIGMOD International Conference on Management of DataGiven a set of objects, each with multiple numeric attributes, a (preference) top-k query retrieves the k objects with the highest scores according to a user preference, defined as a linear combination of attribute values. We consider the problem of ...
Scalable and efficient processing of top-k multiple-type integrated queries
AbstractIn this paper, we define a new class of queries, the top-k multiple-type integrated query (simply, top-k MULTI query). It deals with multiple data types and finds the information in the order of relevance between the query and the object. Various ...
Efficient top-k retrieval for user preference queries
SAC '11: Proceedings of the 2011 ACM Symposium on Applied ComputingEfficient retrieval of the most relevant (i.e. top-k) tuples is an important requirement in information systems which access large amounts of data. In general answering a top-k query request means to retrieve the k-objects which score best for an ...
Comments