1 Introduction
-
To the best of our knowledge, we are the first to propose a pre-computation-based approach that does not depend on the set of users. Our pre-computation significantly reduces the number of facilities to be accessed for SLICE and improves the query processing cost.
-
We come up with two RkNN variants which are useful in practice and we proposed effective algorithms.
-
Our comprehensive experimental study on real and synthetic datasets demonstrates that our algorithm for RkNN significantly improves SLICE in query processing and both the algorithm for RkNN and variants algorithms outperform existing ones.
2 Related Works
3 Techniques
3.1 Computing Guardian Set of a Rectangular Region
-
is an intersection between other guardian facility’s combined curve and \(C_{fgi}\) or between \(C_{fgi}\) and universe boundary;
-
lies at the same side with f w.r.t. L;
-
is pruned exactly k-1 times and has the farthest distance to L or L’s extension if \(v_g\) is out of L’s range; if \(dist(L,f)>2dist(v_{g},L)\), f has no influence on constructing R’s guardian set and can be pruned.
-
intersection between \(C_f\) and combined curves of other guardian facilities;
-
intersection between \(C_f\) and universe boundaries;
-
intersection that is joint point between different sub-curves of \(C_f\).
3.2 Partition Universe
4 Query Processing
5 Reverse k Nearest Neighbour Variants
5.1 Fixed Value Reverse k Nearest Neighbour Queries
-
for each user \(u_i\) locating above k-th upper arc, there are at least k other facilities than q that closer to \(u_i\) and q cannot be \(u_i\)’s k-th nearest neighbour, \(u_i\) is pruned. e.g. \(u_2\) in Fig 5b.
-
for each user \(u_i\) that locates under the (k-1)-th lower arc, there are at most (k-2) other facilities than q that closer to \(u_i\) and q is at least \(k-1\) closest facility to \(u_i\) and cannot be \(u_i\)’s k-th nearest neighbour, \(u_i\) is pruned. e.g. \(u_1\) in Fig 5b.
5.2 Weighted Reverse k Nearest Neighbour Queries
-
If \(q_{Cost_{textual}}\,\ge\,f_{i(Cost_{textual})}\), all users locating on the right side of the left branch of H must have a smaller or equal aggregated cost to \(f_i\) than to q.
-
If \(q_{Cost_{textual}}\,<\,f_{i(Cost_{textual})}\), all users locating on the right side of the right branch of H must have a smaller aggregated cost to \(f_i\) than to q.
-
When \(q_{Cost_{textual}} \ge f_{i(Cost_{textual})}\), we consider the left branch \(H_l\) of H and the extreme case is that \(u_i\) has same cost to q and \(f_i\), we obtain \(q_{Cost_{textual}}\) + \(\alpha \times dist(q_l,u_i)\) = \(f_{i(Cost_{textual})}\) + \(\alpha \times dist({f_i}_l,u_i)\). By transferring, we have \(dist({f_i}_l,u_i) - dist(q_l,u_i)\) = \(\frac{q_{Cost_{textual}} - f_{i(Cost_{textual})}}{\alpha }\). As shown \(u_1\) in Fig. 6a. When \(u_i\) move right to the right hand side of \(H_l\)(like \(u_2\)), according to the definition of H that \( b - h = \frac{q_{Cost_{textual}} - f_{i(Cost_{textual})}}{\alpha }\) and the triangle inequality that \(e - g < b\), we have \(e - (g+h) < \frac{q_{Cost_{textual}} - f_{i(Cost_{textual})}}{\alpha }\), i.e. \(dist({f_i}_l,u_i) - dist(q_l,u_i) < \frac{q_{Cost_{textual}} - f_{i(Cost_{textual})}}{\alpha } \), we have \(Cost_{total}(q,u_i) > Cost_{total}(f_i,u_i)\). When \(u_i\) is inside the left area of \(H_l\), it is easy to prove \(Cost_{total}(q,u_i) < Cost_{total}(f_i,u_i)\) and we omit it.