Skip to main content
Erschienen in: The VLDB Journal 1/2018

31.05.2017 | Regular Paper

Reachability querying: an independent permutation labeling approach

verfasst von: Hao Wei, Jeffrey Xu Yu, Can Lu, Ruoming Jin

Erschienen in: The VLDB Journal | Ausgabe 1/2018

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Reachability query is a fundamental graph operation which answers whether a vertex can reach another vertex over a large directed graph G with n vertices and m edges and has been extensively studied. In the literature, all the approaches compute a label for every vertex in a graph G by index construction offline. The query time for answering reachability queries online is affected by the quality of the labels computed in index construction. The three main costs are the index construction time, the index size, and the query time. Some of the up-to-date approaches can answer reachability queries efficiently, but spend nonlinear time to construct an index. Some of the up-to-date approaches construct an index in linear time and space, but may need to depth-first search G at run-time in \(O(n + m)\). In this paper, we discuss a new randomized labeling approach, named IP label, to answer reachability queries with probability guarantee, and the randomness is by independent permutation. Two additional labels are also proposed to further enhance the query processing. In addition, to deal with dynamic graphs, we discuss the label maintenance over dynamic graphs and give efficient algorithms for the labels proposed. We conduct extensive experimental studies to compare with the up-to-date approaches using 19 large real datasets used in the existing work and synthetic datasets. We confirm the efficiency and scalability of our approach in static graphs testing, and our maintenance algorithms are about one order of magnitude faster than the existing ones in dynamic graphs testing.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
citeseerx.ist.psu.edu.
 
3
snap.stanford.edu.
 
4
govwild.hpi-web.de.
 
6
twitter.com.
 
7
dbpedia.org.
 
8
data.webarchive.org.uk.
 
Literatur
1.
Zurück zum Zitat Abboud, A., Williams, V.V.: Popular conjectures imply strong lower bounds for dynamic problems. In: FOCS (2014) Abboud, A., Williams, V.V.: Popular conjectures imply strong lower bounds for dynamic problems. In: FOCS (2014)
2.
Zurück zum Zitat Agrawal, R., Borgida, A., Jagadish, H.V.: Efficient management of transitive relationships in large data and knowledge bases. In: Proceedings of SIGMOD’89 (1989) Agrawal, R., Borgida, A., Jagadish, H.V.: Efficient management of transitive relationships in large data and knowledge bases. In: Proceedings of SIGMOD’89 (1989)
3.
Zurück zum Zitat Boldi, P., Santini, M., Vigna, S.: A large time-aware web graph. SIGIR Forum 42(2), 33–38 (2008)CrossRef Boldi, P., Santini, M., Vigna, S.: A large time-aware web graph. SIGIR Forum 42(2), 33–38 (2008)CrossRef
4.
Zurück zum Zitat Bramandia, R., Choi, B., Ng, W.K.: Incremental maintenance of 2-hop labeling of large graphs. IEEE Trans. Knowl. Data Eng. 22(5), 682–698 (2010)CrossRef Bramandia, R., Choi, B., Ng, W.K.: Incremental maintenance of 2-hop labeling of large graphs. IEEE Trans. Knowl. Data Eng. 22(5), 682–698 (2010)CrossRef
5.
Zurück zum Zitat Broder, A.: On the resemblance and containment of documents. In: Proceedings of SEQUENCES’97 (1997) Broder, A.: On the resemblance and containment of documents. In: Proceedings of SEQUENCES’97 (1997)
6.
Zurück zum Zitat Broder, A.Z., Charikar, M., Frieze, A.M., Mitzenmacher, M.: Min-wise independent permutations. In: Proceedings of STOC’98 (1998) Broder, A.Z., Charikar, M., Frieze, A.M., Mitzenmacher, M.: Min-wise independent permutations. In: Proceedings of STOC’98 (1998)
7.
Zurück zum Zitat Cai, J., Poon, C.K.: Path-hop: efficiently indexing large graphs for reachability queries. In: Proceedings of CIKM’10 (2010) Cai, J., Poon, C.K.: Path-hop: efficiently indexing large graphs for reachability queries. In: Proceedings of CIKM’10 (2010)
8.
Zurück zum Zitat Cha, M., Haddadi, H., Benevenuto, F., Gummadi, P.K.: Measuring user influence in twitter: the million follower fallacy. In: Proceedings of ICWSM’10 (2010) Cha, M., Haddadi, H., Benevenuto, F., Gummadi, P.K.: Measuring user influence in twitter: the million follower fallacy. In: Proceedings of ICWSM’10 (2010)
9.
Zurück zum Zitat Chen, L., Gupta, A., Kurul, M.E.: Stack-based algorithms for pattern matching on dags. In: Proceedings of VLDB’05 (2005) Chen, L., Gupta, A., Kurul, M.E.: Stack-based algorithms for pattern matching on dags. In: Proceedings of VLDB’05 (2005)
10.
Zurück zum Zitat Chen, Y., Chen, Y.: An efficient algorithm for answering graph reachability queries. In: Proceedings of ICDE’08 (2008) Chen, Y., Chen, Y.: An efficient algorithm for answering graph reachability queries. In: Proceedings of ICDE’08 (2008)
11.
Zurück zum Zitat Chen, Y., Chen, Y.: Decomposing dags into spanning trees: a new way to compress transitive closures. In: Proceedings of ICDE’11 (2011) Chen, Y., Chen, Y.: Decomposing dags into spanning trees: a new way to compress transitive closures. In: Proceedings of ICDE’11 (2011)
12.
Zurück zum Zitat Cheng, J., Huang, S., Wu, H., Fu, A.W.-C.: Tf-label: a topological-folding labeling scheme for reachability querying in a large graph. In: Proceedings of SIGMOD’13 (2013) Cheng, J., Huang, S., Wu, H., Fu, A.W.-C.: Tf-label: a topological-folding labeling scheme for reachability querying in a large graph. In: Proceedings of SIGMOD’13 (2013)
13.
Zurück zum Zitat Cheng, J., Shang, Z., Cheng, H., Wang, K., Yu, J.X.: K-reach: who is in your small world. PVLDB 5(11), 1292–1303 (2012) Cheng, J., Shang, Z., Cheng, H., Wang, K., Yu, J.X.: K-reach: who is in your small world. PVLDB 5(11), 1292–1303 (2012)
14.
Zurück zum Zitat Cheng, J., Yu, J.X., Lin, X., Wang, H., Yu, P.S.: Fast computation of reachability labeling for large graphs. In: Proceedings of EDBT’06 (2006) Cheng, J., Yu, J.X., Lin, X., Wang, H., Yu, P.S.: Fast computation of reachability labeling for large graphs. In: Proceedings of EDBT’06 (2006)
15.
Zurück zum Zitat Cheng, J., Yu, J.X., Lin, X., Wang, H., Yu, P.S.: Fast computing reachability labelings for large graphs with high compression rate. In: Proceedings of EDBT’08 (2008) Cheng, J., Yu, J.X., Lin, X., Wang, H., Yu, P.S.: Fast computing reachability labelings for large graphs with high compression rate. In: Proceedings of EDBT’08 (2008)
16.
Zurück zum Zitat Cohen, E.: Size-estimation framework with applications to transitive closure and reachability. J. Comput. Syst. Sci. 55(3), 441–453 (1997)MathSciNetCrossRefMATH Cohen, E.: Size-estimation framework with applications to transitive closure and reachability. J. Comput. Syst. Sci. 55(3), 441–453 (1997)MathSciNetCrossRefMATH
17.
Zurück zum Zitat Cohen, E., Halperin, E., Kaplan, H., Zwick, U.: Reachability and distance queries via 2-hop labels. In: Proceedings of SODA’02 (2002) Cohen, E., Halperin, E., Kaplan, H., Zwick, U.: Reachability and distance queries via 2-hop labels. In: Proceedings of SODA’02 (2002)
18.
Zurück zum Zitat Cohen, E., Kaplan, H.: Summarizing data using bottom-k sketches. In: PODC (2007) Cohen, E., Kaplan, H.: Summarizing data using bottom-k sketches. In: PODC (2007)
19.
Zurück zum Zitat Cohen, E., Kaplan, H.: Tighter estimation using bottom k sketches. PVLDB 1(1), 213–224 (2008) Cohen, E., Kaplan, H.: Tighter estimation using bottom k sketches. PVLDB 1(1), 213–224 (2008)
20.
Zurück zum Zitat Faloutsos, M., Faloutsos, P., Faloutsos, C.: On power-law relationships of the internet topology. ACM SIGCOMM Comput. Commun. Rev. 29(4), 251–262 (1999)CrossRefMATH Faloutsos, M., Faloutsos, P., Faloutsos, C.: On power-law relationships of the internet topology. ACM SIGCOMM Comput. Commun. Rev. 29(4), 251–262 (1999)CrossRefMATH
21.
Zurück zum Zitat Fisher, R.A., Yates, F., et al.: Statistical Tables for Biological, Agricultural and Medical Research, 3rd edn. Oliver and Boyd, Edinburgh (1949)MATH Fisher, R.A., Yates, F., et al.: Statistical Tables for Biological, Agricultural and Medical Research, 3rd edn. Oliver and Boyd, Edinburgh (1949)MATH
22.
Zurück zum Zitat Jagadish, H.V.: A compression technique to materialize transitive closure. ACM Trans. Database Syst. 15(4), 558–598 (1990)MathSciNetCrossRef Jagadish, H.V.: A compression technique to materialize transitive closure. ACM Trans. Database Syst. 15(4), 558–598 (1990)MathSciNetCrossRef
23.
Zurück zum Zitat Jin, R., Ruan, N., Dey, S., Yu, J. X.: Scarab: scaling reachability computation on large graphs. In: Proceedings of SIGMOD’12 (2012) Jin, R., Ruan, N., Dey, S., Yu, J. X.: Scarab: scaling reachability computation on large graphs. In: Proceedings of SIGMOD’12 (2012)
24.
Zurück zum Zitat Jin, R., Ruan, N., Xiang, Y., Wang, H.: Path-tree: an efficient reachability indexing scheme for large directed graphs. ACM Trans. Database Syst. 36(1), 7:1–7:44 (2011). doi:10.1145/1929934.1929941 Jin, R., Ruan, N., Xiang, Y., Wang, H.: Path-tree: an efficient reachability indexing scheme for large directed graphs. ACM Trans. Database Syst. 36(1), 7:1–7:44 (2011). doi:10.​1145/​1929934.​1929941
25.
Zurück zum Zitat Jin, R., Wang, G.: Simple, fast, and scalable reachability oracle. PVLDB 6(14), 1978–1989 (2013) Jin, R., Wang, G.: Simple, fast, and scalable reachability oracle. PVLDB 6(14), 1978–1989 (2013)
26.
Zurück zum Zitat Jin, R., Xiang, Y., Ruan, N., Fuhry, D.: 3-HOP: A high-compression indexing scheme for reachability query. In: Proceedings of SIGMOD’09 (2009) Jin, R., Xiang, Y., Ruan, N., Fuhry, D.: 3-HOP: A high-compression indexing scheme for reachability query. In: Proceedings of SIGMOD’09 (2009)
27.
Zurück zum Zitat Jin, R., Xiang, Y., Ruan, N., Wang, H.: Efficiently answering reachability queries on very large directed graphs. In: Proceedings of SIGMOD’08 (2008) Jin, R., Xiang, Y., Ruan, N., Wang, H.: Efficiently answering reachability queries on very large directed graphs. In: Proceedings of SIGMOD’08 (2008)
28.
Zurück zum Zitat Knuth, D.E.: The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Addison-Wesley, Boston (1981)MATH Knuth, D.E.: The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Addison-Wesley, Boston (1981)MATH
29.
Zurück zum Zitat Lacki, J.: Improved deterministic algorithms for decremental reachability and strongly connected components. ACM Trans. Algorithms 9(3), 27:1–27:15 (2013). doi:10.1145/2483699.2483707 Lacki, J.: Improved deterministic algorithms for decremental reachability and strongly connected components. ACM Trans. Algorithms 9(3), 27:1–27:15 (2013). doi:10.​1145/​2483699.​2483707
30.
Zurück zum Zitat Roditty, L.: Decremental maintenance of strongly connected components. In: Proceedings of SODA (2013) Roditty, L.: Decremental maintenance of strongly connected components. In: Proceedings of SODA (2013)
31.
Zurück zum Zitat Schenkel, R., Theobald, A., Weikum, G.: Hopi: An efficient connection index for complex XML document collections. In: Proceedings of EDBT’04 (2004) Schenkel, R., Theobald, A., Weikum, G.: Hopi: An efficient connection index for complex XML document collections. In: Proceedings of EDBT’04 (2004)
32.
Zurück zum Zitat Seufert, S., Anand, A., Bedathur, S. J., Weikum, G.: Ferrari: Flexible and efficient reachability range assignment for graph indexing. In: Proceedings of ICDE’13 (2013) Seufert, S., Anand, A., Bedathur, S. J., Weikum, G.: Ferrari: Flexible and efficient reachability range assignment for graph indexing. In: Proceedings of ICDE’13 (2013)
33.
34.
Zurück zum Zitat TrißI, S., Leser, U.: Fast and practical indexing and querying of very large graphs. In: Proceedings of SIGMOD’07 (2007) TrißI, S., Leser, U.: Fast and practical indexing and querying of very large graphs. In: Proceedings of SIGMOD’07 (2007)
35.
Zurück zum Zitat van Schaik, S.J., de Moor, O.: A memory efficient reachability data structure through bit vector compression. In: Proceedings of SIGMOD’11 (2011) van Schaik, S.J., de Moor, O.: A memory efficient reachability data structure through bit vector compression. In: Proceedings of SIGMOD’11 (2011)
36.
Zurück zum Zitat Veloso, R.R., Cerf, L., W, M. Jr.: Zaki, M.J.: Reachability queries in very large graphs: A fast refined online search approach. In: Proceedings of EDBT (2014) Veloso, R.R., Cerf, L., W, M. Jr.: Zaki, M.J.: Reachability queries in very large graphs: A fast refined online search approach. In: Proceedings of EDBT (2014)
37.
Zurück zum Zitat Wang, H., He, H., Yang, J., Yu, P. S., Yu, J. X.: Dual labeling: answering graph reachability queries in constant time. In: Proceedings of ICDE’06 (2006) Wang, H., He, H., Yang, J., Yu, P. S., Yu, J. X.: Dual labeling: answering graph reachability queries in constant time. In: Proceedings of ICDE’06 (2006)
38.
Zurück zum Zitat Wei, H., Yu, J.X., Lu, C., Jin, R.: Reachability querying: an independent permutation labeling approach. PVLDB 7(12), 1191–1202 (2014) Wei, H., Yu, J.X., Lu, C., Jin, R.: Reachability querying: an independent permutation labeling approach. PVLDB 7(12), 1191–1202 (2014)
39.
Zurück zum Zitat Yildirim, H., Chaoji, V., Zaki, M.J.: Grail: scalable reachability index for large graphs. PVLDB 3(1), 276–284 (2010) Yildirim, H., Chaoji, V., Zaki, M.J.: Grail: scalable reachability index for large graphs. PVLDB 3(1), 276–284 (2010)
40.
Zurück zum Zitat Yildirim, H., Chaoji, V., Zaki, M.J.: Grail: a scalable index for reachability queries in very large graphs. VLDB J. 21(4), 509–534 (2012)CrossRef Yildirim, H., Chaoji, V., Zaki, M.J.: Grail: a scalable index for reachability queries in very large graphs. VLDB J. 21(4), 509–534 (2012)CrossRef
41.
Zurück zum Zitat Yildirim, H., Chaoji, V., Zaki, M. J.: Dagger: a scalable index for reachability queries in large dynamic graphs. arXiv preprint arXiv:1301.0977 (2013) Yildirim, H., Chaoji, V., Zaki, M. J.: Dagger: a scalable index for reachability queries in large dynamic graphs. arXiv preprint arXiv:​1301.​0977 (2013)
42.
Zurück zum Zitat Yu, J. X., Cheng, J.: Graph reachability queries: A survey. In: Aggarwal, C.C., Wang, H. (eds.) Managing and Mining Graph Data, pp. 181–215. Springer (2010) Yu, J. X., Cheng, J.: Graph reachability queries: A survey. In: Aggarwal, C.C., Wang, H. (eds.) Managing and Mining Graph Data, pp. 181–215. Springer (2010)
43.
Zurück zum Zitat Zhang, Z., Yu, J. X., Qin, L., Zhu, Q., Zhou, X.: I/o cost minimization: reachability queries processing over massive graphs. In: Proceedings of EDBT’12 (2012) Zhang, Z., Yu, J. X., Qin, L., Zhu, Q., Zhou, X.: I/o cost minimization: reachability queries processing over massive graphs. In: Proceedings of EDBT’12 (2012)
44.
Zurück zum Zitat Zhu, A. D., Lin, W., Wang, S., Xiao, X.: Reachability queries on large dynamic graphs: a total order approach. In: Proceedings of the 2014 ACM SIGMOD (2014) Zhu, A. D., Lin, W., Wang, S., Xiao, X.: Reachability queries on large dynamic graphs: a total order approach. In: Proceedings of the 2014 ACM SIGMOD (2014)
45.
Zurück zum Zitat Zhu, L., Choi, B., He, B., Yu, J.X., Ng, W.K.: A uniform framework for ad-hoc indexes to answer reachability queries on large graphs. In: DASFAA (2009) Zhu, L., Choi, B., He, B., Yu, J.X., Ng, W.K.: A uniform framework for ad-hoc indexes to answer reachability queries on large graphs. In: DASFAA (2009)
Metadaten
Titel
Reachability querying: an independent permutation labeling approach
verfasst von
Hao Wei
Jeffrey Xu Yu
Can Lu
Ruoming Jin
Publikationsdatum
31.05.2017
Verlag
Springer Berlin Heidelberg
Erschienen in
The VLDB Journal / Ausgabe 1/2018
Print ISSN: 1066-8888
Elektronische ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-017-0468-3

Weitere Artikel der Ausgabe 1/2018

The VLDB Journal 1/2018 Zur Ausgabe