Skip to main content
Top
Published in: The VLDB Journal 1/2022

28-08-2021 | Regular Paper

Answering reachability and K-reach queries on large graphs with label constraints

Authors: You Peng, Xuemin Lin, Ying Zhang, Wenjie Zhang, Lu Qin

Published in: The VLDB Journal | Issue 1/2022

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The purpose of this paper is to examine the problem of label-constrained reachability (LCR) and K-reach (LCKR) queries, which are fundamental in a wide variety of applications using directed edge-labeled graphs. While reachability and K-reach queries have been extensively researched, LCR and LCKR queries are much more challenging due to the fact that the number of potential label-constraint sets is exponential to the size of the labels. We note that existing techniques for LCR queries only build a partial index and that their worse-case query time could be comparable to that of an online breadth-first search (BFS). This paper proposes a new label-constrained 2-hop indexing method with innovative pruning rules and order strategies. Our work demonstrates that the worst query time could be bounded by the number of in-out index entries. Extensive experiments demonstrate that the proposed methods substantially outperform the state-of-the-art approach in terms of the query response time (up to 5 orders of magnitude speedup), index size, and the index construction time. More precisely, the method we present can response LCR queries across billion-scale networks within microseconds on a single machine. We formally define the problem of LCKR queries and discuss critical applications for addressing it. To tackle the difficulties presented by label and hop constraints, an efficient upper and lower bound is suggested based on a search method. Using all of these techniques, extensive experiments on synthetic and real-world networks demonstrate that our algorithm outperforms the baseline by about three to four orders of magnitude while maintaining competitive indexing time and size.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Footnotes
1
Otherwise, the system will be overwhelmed by false alarms raised by the above path analysis.
 
2
To avoid possible ambiguity of “labeling index” and “label-constrained”, we use “2-hop index technique” to present the “2-hop labeling index technique” used in the literature.
 
3
Our index construction process for each vertex will return a minimal label path tree.
 
4
This is the default setting for LI+ in [41].
 
5
Find the minimum distance between \(v_k\) and u in the index \(L_{k-1}\) with label constraints P[u].labels.
 
6
w is set due to the initial experiments.
 
7
P2H+ uses degree order which is the same as LI+ for the sake of comparison fairness. The effect of vertex order will be compared in Table 8
 
Literature
1.
go back to reference Abraham, I., Delling, D., Goldberg, A.V., Werneck, R.F.: Hierarchical hub labelings for shortest paths. In European Symposium on Algorithms, pages 24–35. Springer, (2012) Abraham, I., Delling, D., Goldberg, A.V., Werneck, R.F.: Hierarchical hub labelings for shortest paths. In European Symposium on Algorithms, pages 24–35. Springer, (2012)
2.
go back to reference Akiba, T., Iwata, Y., Kawarabayashi, K.-i., Kawata, Y.: Fast shortest-path distance queries on road networks by pruned highway labeling. In 2014 Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 147–154. SIAM, (2014) Akiba, T., Iwata, Y., Kawarabayashi, K.-i., Kawata, Y.: Fast shortest-path distance queries on road networks by pruned highway labeling. In 2014 Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 147–154. SIAM, (2014)
3.
go back to reference Akiba, T., Iwata, Y., Yoshida, Y.: Fast exact shortest-path distance queries on large networks by pruned landmark labeling. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pages 349–360. ACM, (2013) Akiba, T., Iwata, Y., Yoshida, Y.: Fast exact shortest-path distance queries on large networks by pruned landmark labeling. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pages 349–360. ACM, (2013)
4.
go back to reference Barrett, C., Jacob, R., Marathe, M.: Formal-language-constrained path problems. SIAM J. Comput. 30(3), 809–837 (2000)MathSciNetCrossRef Barrett, C., Jacob, R., Marathe, M.: Formal-language-constrained path problems. SIAM J. Comput. 30(3), 809–837 (2000)MathSciNetCrossRef
5.
go back to reference Bonchi, F., Gionis, A., Gullo, F., Ukkonen, A.: Distance oracles in edge-labeled graphs. In EDBT, pages 547–558, (2014) Bonchi, F., Gionis, A., Gullo, F., Ukkonen, A.: Distance oracles in edge-labeled graphs. In EDBT, pages 547–558, (2014)
6.
go back to reference Chen, M., Gu, Y., Bao, Y., Yu, G.: Label and distance-constraint reachability queries in uncertain graphs. In International Conference on Database Systems for Advanced Applications, pages 188–202. Springer, (2014) Chen, M., Gu, Y., Bao, Y., Yu, G.: Label and distance-constraint reachability queries in uncertain graphs. In International Conference on Database Systems for Advanced Applications, pages 188–202. Springer, (2014)
7.
go back to reference Chen, X., Lai, L., Qin, L., Lin, X., Liu, B.: A framework to quantify approximate simulation on graph data. In 2021 IEEE 37th International Conference on Data Engineering (ICDE), pages 1308–1319. IEEE, (2021) Chen, X., Lai, L., Qin, L., Lin, X., Liu, B.: A framework to quantify approximate simulation on graph data. In 2021 IEEE 37th International Conference on Data Engineering (ICDE), pages 1308–1319. IEEE, (2021)
8.
go back to reference Cheng, J., Huang, S., Wu, H., Fu, A.W.-C.: Tf-label: a topological-folding labeling scheme for reachability querying in a large graph. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pages 193–204. ACM, (2013) Cheng, J., Huang, S., Wu, H., Fu, A.W.-C.: Tf-label: a topological-folding labeling scheme for reachability querying in a large graph. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pages 193–204. ACM, (2013)
9.
go back to reference Cheng, J., Shang, Z., Cheng, H., Wang, H., K-reach, J.XYu.: who is in your small world. Proceedings of the VLDB Endowment 5(11), 1292–1303 (2012) Cheng, J., Shang, Z., Cheng, H., Wang, H., K-reach, J.XYu.: who is in your small world. Proceedings of the VLDB Endowment 5(11), 1292–1303 (2012)
10.
go back to reference Cheng, J., Yu, J.X.: On-line exact shortest distance query processing. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, pages 481–492. ACM, (2009) Cheng, J., Yu, J.X.: On-line exact shortest distance query processing. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, pages 481–492. ACM, (2009)
11.
go back to reference Cheng, J., Yu, J.X., Lin, X., Wang, H., Philip, S.Y.: Fast computation of reachability labeling for large graphs. In International Conference on Extending Database Technology, pages 961–979. Springer, (2006) Cheng, J., Yu, J.X., Lin, X., Wang, H., Philip, S.Y.: Fast computation of reachability labeling for large graphs. In International Conference on Extending Database Technology, pages 961–979. Springer, (2006)
12.
go back to reference Cohen, E., Halperin, E., Kaplan, H., Zwick, U.: Reachability and distance queries via 2-hop labels. SIAM J. Comput. 32(5), 1338–1355 (2003)MathSciNetCrossRef Cohen, E., Halperin, E., Kaplan, H., Zwick, U.: Reachability and distance queries via 2-hop labels. SIAM J. Comput. 32(5), 1338–1355 (2003)MathSciNetCrossRef
13.
go back to reference Fang, Y., Cheng, R., Li, X., Luo, S., Hu, J.: Effective community search over large spatial graphs. PVLDB 10(6), 709–720 (2017) Fang, Y., Cheng, R., Li, X., Luo, S., Hu, J.: Effective community search over large spatial graphs. PVLDB 10(6), 709–720 (2017)
14.
go back to reference Fang, Y., Cheng, R., Luo, S., Hu, J.: Effective community search for large attributed graphs. PVLDB 9(12), 1233–1244 (2016) Fang, Y., Cheng, R., Luo, S., Hu, J.: Effective community search for large attributed graphs. PVLDB 9(12), 1233–1244 (2016)
15.
go back to reference Fang, Y., Cheng, R., Luo, S., Hu, J., Huang, K.: C-explorer: browsing communities in large graphs. PVLDB 10(12), 1885–1888 (2017) Fang, Y., Cheng, R., Luo, S., Hu, J., Huang, K.: C-explorer: browsing communities in large graphs. PVLDB 10(12), 1885–1888 (2017)
16.
go back to reference Fang, Y., Huang, X., Qin, L., Zhang, Y., Zhang, W., Cheng, R., Lin, X.: A survey of community search over big graphs. VLDB J. 29(1), 353–392 (2020)CrossRef Fang, Y., Huang, X., Qin, L., Zhang, Y., Zhang, W., Cheng, R., Lin, X.: A survey of community search over big graphs. VLDB J. 29(1), 353–392 (2020)CrossRef
17.
go back to reference Fang, Y., Yang, Y., Zhang, W., Lin, X., Cao, X.: Effective and efficient community search over large heterogeneous information networks. PVLDB 13(6), 854–857 (2020) Fang, Y., Yang, Y., Zhang, W., Lin, X., Cao, X.: Effective and efficient community search over large heterogeneous information networks. PVLDB 13(6), 854–857 (2020)
18.
go back to reference Fang, Y., Yu, K., Cheng, R., Lakshmanan, L.V., Lin, X.: Efficient algorithms for densest subgraph discovery. PVLDB 12(11), 1719–1732 (2019) Fang, Y., Yu, K., Cheng, R., Lakshmanan, L.V., Lin, X.: Efficient algorithms for densest subgraph discovery. PVLDB 12(11), 1719–1732 (2019)
19.
go back to reference Hassan, M.S., Aref, W.G., Aly, A.M.: Graph indexing for shortest-path finding over dynamic sub-graphs. In Proceedings of the 2016 International Conference on Management of Data, pages 1183–1197. ACM, (2016) Hassan, M.S., Aref, W.G., Aly, A.M.: Graph indexing for shortest-path finding over dynamic sub-graphs. In Proceedings of the 2016 International Conference on Management of Data, pages 1183–1197. ACM, (2016)
20.
go back to reference Hu, J., Cheng, R., Chang, K.C.-C., Sankar, A., Fang, Y., Lam, B.Y.: Discovering maximal motif cliques in large heterogeneous information networks. In International Conference on Data Engineering (ICDE), pages 746–757. IEEE, (2019) Hu, J., Cheng, R., Chang, K.C.-C., Sankar, A., Fang, Y., Lam, B.Y.: Discovering maximal motif cliques in large heterogeneous information networks. In International Conference on Data Engineering (ICDE), pages 746–757. IEEE, (2019)
21.
go back to reference Jin, R., Hong, H., Wang, H., Ruan, N., Xiang, Y.: Computing label-constraint reachability in graph databases. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pages 123–134. ACM, (2010) Jin, R., Hong, H., Wang, H., Ruan, N., Xiang, Y.: Computing label-constraint reachability in graph databases. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pages 123–134. ACM, (2010)
22.
go back to reference Jin, R., Wang, G.: Simple, fast, and scalable reachability oracle. Proceedings of the VLDB Endowment 6(14), 1978–1989 (2013) Jin, R., Wang, G.: Simple, fast, and scalable reachability oracle. Proceedings of the VLDB Endowment 6(14), 1978–1989 (2013)
23.
go back to reference Jin, X., Yang, Z., Lin, X., Yang, S., Qin, L., Peng, Y.: Fast: Fpga-based subgraph matching on massive graphs. arXiv preprint arXiv:2102.10768, (2021) Jin, X., Yang, Z., Lin, X., Yang, S., Qin, L., Peng, Y.: Fast: Fpga-based subgraph matching on massive graphs. arXiv preprint arXiv:​2102.​10768, (2021)
24.
go back to reference Klodt, P., Weikum, G., Bedathur, S., Seufert, S.: Indexing strategies for constrained shortest paths over large social networks. Universitat des Saarlandes, (2011) Klodt, P., Weikum, G., Bedathur, S., Seufert, S.: Indexing strategies for constrained shortest paths over large social networks. Universitat des Saarlandes, (2011)
25.
go back to reference Koschmieder, A., Leser, U.: Regular path queries on large graphs. In International Conference on Scientific and Statistical Database Management, pages 177–194. Springer, (2012) Koschmieder, A., Leser, U.: Regular path queries on large graphs. In International Conference on Scientific and Statistical Database Management, pages 177–194. Springer, (2012)
26.
go back to reference Kunegis, J.: Konect: the koblenz network collection. In Proceedings of the 22nd International Conference on World Wide Web, pages 1343–1350. ACM, (2013) Kunegis, J.: Konect: the koblenz network collection. In Proceedings of the 22nd International Conference on World Wide Web, pages 1343–1350. ACM, (2013)
27.
go back to reference Lai, Z., Peng, Y., Yang, S., Lin, X., Zhang, W.: Pefp: Efficient k-hop constrained s-t simple path enumeration on fpga. In ICDE, IEEE (2021) Lai, Z., Peng, Y., Yang, S., Lin, X., Zhang, W.: Pefp: Efficient k-hop constrained s-t simple path enumeration on fpga. In ICDE, IEEE (2021)
28.
go back to reference Leskovec, J.: Snap: Stanford large network dataset collection, (2016) Leskovec, J.: Snap: Stanford large network dataset collection, (2016)
29.
go back to reference Leskovec, J., Sosic, R.: Snap: A general purpose network analysis and graph mining library in c++, (2014) Leskovec, J., Sosic, R.: Snap: A general purpose network analysis and graph mining library in c++, (2014)
30.
go back to reference Li, Y., Yiu, M.L., Kou, N.M., et al.: An experimental study on hub labeling based shortest path algorithms. Proceedings of the VLDB Endowment 11(4), 445–457 (2017) Li, Y., Yiu, M.L., Kou, N.M., et al.: An experimental study on hub labeling based shortest path algorithms. Proceedings of the VLDB Endowment 11(4), 445–457 (2017)
31.
go back to reference Li, Z., Fang, Y., Qin, L., Cheng, J., Cheng, R., Lui, J.C.: Walking in the cloud: parallel simrank at scale. PVLDB 9(1), 24–35 (2015) Li, Z., Fang, Y., Qin, L., Cheng, J., Cheng, R., Lui, J.C.: Walking in the cloud: parallel simrank at scale. PVLDB 9(1), 24–35 (2015)
32.
go back to reference Ma, C., Fang, Y., Cheng, R., Lakshmanan, L.V., Zhang, W., Lin, X.: Efficient algorithms for densest subgraph discovery on large directed graphs. In ACM SIGMOD, pages 1051–1066, (2020) Ma, C., Fang, Y., Cheng, R., Lakshmanan, L.V., Zhang, W., Lin, X.: Efficient algorithms for densest subgraph discovery on large directed graphs. In ACM SIGMOD, pages 1051–1066, (2020)
33.
go back to reference Peng, Y., Lin, X., Zhang, Y., Zhang, W., Qin, L., Zhou, J.: Efficient hop-constrained s-t simple path enumeration. The VLDB Journal, pages 1–24, (2021) Peng, Y., Lin, X., Zhang, Y., Zhang, W., Qin, L., Zhou, J.: Efficient hop-constrained s-t simple path enumeration. The VLDB Journal, pages 1–24, (2021)
34.
go back to reference Peng, Y., Zhang, Y., Lin, X., Qin, L., Zhang, W.: Answering billion-scale label-constrained reachability queries within microsecond. Proceedings of the VLDB Endowment 13(6), 812–825 (2020) Peng, Y., Zhang, Y., Lin, X., Qin, L., Zhang, W.: Answering billion-scale label-constrained reachability queries within microsecond. Proceedings of the VLDB Endowment 13(6), 812–825 (2020)
35.
go back to reference Peng, Y., Zhang, Y., Lin, X., Zhang, W., Qin, L., Zhou, J.: Hop-constrained s-t simple path enumeration: Towards bridging theory and practice. Proceedings of the VLDB Endowment 13(4), 463–476 (2019)CrossRef Peng, Y., Zhang, Y., Lin, X., Zhang, W., Qin, L., Zhou, J.: Hop-constrained s-t simple path enumeration: Towards bridging theory and practice. Proceedings of the VLDB Endowment 13(4), 463–476 (2019)CrossRef
36.
go back to reference Peng, Y., Zhang, Y., Lin, X., Zhang, W., Qin, L., Zhou, J.: Towards bridging theory and practice: hop-constrained st simple path enumeration. Proc. VLDB Endow. 13(4), 463–476 (2019)CrossRef Peng, Y., Zhang, Y., Lin, X., Zhang, W., Qin, L., Zhou, J.: Towards bridging theory and practice: hop-constrained st simple path enumeration. Proc. VLDB Endow. 13(4), 463–476 (2019)CrossRef
37.
go back to reference Peng, Y., Zhang, Y., Zhang, W., Lin, X., Qin, L.: Efficient probabilistic k-core computation on uncertain graphs. In 2018 IEEE 34th International Conference on Data Engineering (ICDE), pages 1192–1203. IEEE, (2018) Peng, Y., Zhang, Y., Zhang, W., Lin, X., Qin, L.: Efficient probabilistic k-core computation on uncertain graphs. In 2018 IEEE 34th International Conference on Data Engineering (ICDE), pages 1192–1203. IEEE, (2018)
38.
go back to reference Peng, Y., Zhao, W., Zhang, W., Lin, X., Zhang, Y.: Dlq: A system for label-constrained reachability queries on dynamic graphs. In Proceedings of the 230th ACM International Conference on Information & Knowledge Management, (2021) Peng, Y., Zhao, W., Zhang, W., Lin, X., Zhang, Y.: Dlq: A system for label-constrained reachability queries on dynamic graphs. In Proceedings of the 230th ACM International Conference on Information & Knowledge Management, (2021)
39.
go back to reference Qiu, X., Cen, W., Qian, Z., Peng, Y., Zhang, Y., Lin, X., Zhou, J.: Real-time constrained cycle detection in large dynamic graphs. Proc. VLDB Endow. 11(12), 1876–1888 (2018)CrossRef Qiu, X., Cen, W., Qian, Z., Peng, Y., Zhang, Y., Lin, X., Zhou, J.: Real-time constrained cycle detection in large dynamic graphs. Proc. VLDB Endow. 11(12), 1876–1888 (2018)CrossRef
40.
go back to reference Qiu, X., Cen, W., Qian, Z., Peng, Y., Zhang, Y., Lin, X., Zhou, J.: Real-time constrained cycle detection in large dynamic graphs. PVLDB 11(12), 1876–1888 (2018) Qiu, X., Cen, W., Qian, Z., Peng, Y., Zhang, Y., Lin, X., Zhou, J.: Real-time constrained cycle detection in large dynamic graphs. PVLDB 11(12), 1876–1888 (2018)
41.
go back to reference Valstar, L.D., Fletcher, G.H., Yoshida, Y.: Landmark indexing for evaluation of label-constrained reachability queries. In Proceedings of the 2017 ACM International Conference on Management of Data, pages 345–358. ACM, (2017) Valstar, L.D., Fletcher, G.H., Yoshida, Y.: Landmark indexing for evaluation of label-constrained reachability queries. In Proceedings of the 2017 ACM International Conference on Management of Data, pages 345–358. ACM, (2017)
42.
go back to reference van Rest, O., Hong, S., Kim, J., Meng, X., Chafi, H.: Pgql: a property graph query language. In Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems, page 7. ACM, (2016) van Rest, O., Hong, S., Kim, J., Meng, X., Chafi, H.: Pgql: a property graph query language. In Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems, page 7. ACM, (2016)
43.
go back to reference Wadhwa, S., Prasad, A., Ranu, S., Bagchi, A., Bedathur, S.: Efficiently answering regular simple path queries on large labeled networks. Age 3, v7 (2019) Wadhwa, S., Prasad, A., Ranu, S., Bagchi, A., Bedathur, S.: Efficiently answering regular simple path queries on large labeled networks. Age 3, v7 (2019)
44.
go back to reference Wood, P.T.: Query languages for graph databases. ACM SIGMOD Rec. 41(1), 50–60 (2012)CrossRef Wood, P.T.: Query languages for graph databases. ACM SIGMOD Rec. 41(1), 50–60 (2012)CrossRef
45.
go back to reference Yuan, Y., Lian, X., Wang, G., Ma, Y., Wang, Y.: Constrained shortest path query in a large time-dependent graph. Proc. VLDB Endow. 12(10), 1058–1070 (2019)CrossRef Yuan, Y., Lian, X., Wang, G., Ma, Y., Wang, Y.: Constrained shortest path query in a large time-dependent graph. Proc. VLDB Endow. 12(10), 1058–1070 (2019)CrossRef
46.
go back to reference Yue, D., Wu, X., Wang, Y., Li, Y., Chu, C.-H.: A review of data mining-based financial fraud detection research. In 2007 International Conference on Wireless Communications, Networking and Mobile Computing, pages 5519–5522. Ieee, (2007) Yue, D., Wu, X., Wang, Y., Li, Y., Chu, C.-H.: A review of data mining-based financial fraud detection research. In 2007 International Conference on Wireless Communications, Networking and Mobile Computing, pages 5519–5522. Ieee, (2007)
47.
go back to reference Zhang, X., Özsu, M.T.: Correlation constraint shortest path over large multi-relation graphs. Proc. VLDB Endow. 12(5), 488–501 (2019)CrossRef Zhang, X., Özsu, M.T.: Correlation constraint shortest path over large multi-relation graphs. Proc. VLDB Endow. 12(5), 488–501 (2019)CrossRef
48.
go back to reference Zou, L., Xu, K., Yu, J.X., Chen, L., Xiao, Y., Zhao, D.: Efficient processing of label-constraint reachability queries in large graphs. Inf. Syst. 40, 47–66 (2014)CrossRef Zou, L., Xu, K., Yu, J.X., Chen, L., Xiao, Y., Zhao, D.: Efficient processing of label-constraint reachability queries in large graphs. Inf. Syst. 40, 47–66 (2014)CrossRef
Metadata
Title
Answering reachability and K-reach queries on large graphs with label constraints
Authors
You Peng
Xuemin Lin
Ying Zhang
Wenjie Zhang
Lu Qin
Publication date
28-08-2021
Publisher
Springer Berlin Heidelberg
Published in
The VLDB Journal / Issue 1/2022
Print ISSN: 1066-8888
Electronic ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-021-00695-0

Other articles of this Issue 1/2022

The VLDB Journal 1/2022 Go to the issue

Premium Partner