Skip to main content

2018 | OriginalPaper | Buchkapitel

Min-Forest: Fast Reachability Indexing Approach for Large-Scale Graphs on Spark Platform

verfasst von : Liu Yang, Tongyong Liu, Zhigang Hu, Zhifang Liao, Jun Long

Erschienen in: Web Services – ICWS 2018

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Reachability query is an important graph operation in graph database which answers whether a vertex can reach another vertex through a path over the graph, and it is also fundamental to real applications involved with graph-shaped data. However, the increasingly large amount of data in real graph database makes it more challenging for query efficiency and scalability. In this paper, we propose Min-Forest approach to handle with reachability problem in large graphs. We present Min-Forest structure to transfer and label the original DAG, and introduce a 4-tuple labeling scheme to construct index for each vertices, which integrate interval labels for trees and non-tree labels. We design efficient reachability query algorithms for Min-Forest approach on the Cloud Platform of Spark. The experiment results show that query time of Min-Forest approach is also on average about 10−4 ms for large dense graphs, and query time and index construction time of our approach are linear for both sparse graphs and dense graphs. It can answer reachability queries much faster than the state-of-art approaches on real graphs database, especially on large and dense ones.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Cheng, B., Zhai, Z., Zhao, S., Chen, J.: LSMP: a lightweight services mashup platform for ordinary-users. IEEE Commun. Mag. 55(4), 116–122 (2017)CrossRef Cheng, B., Zhai, Z., Zhao, S., Chen, J.: LSMP: a lightweight services mashup platform for ordinary-users. IEEE Commun. Mag. 55(4), 116–122 (2017)CrossRef
2.
Zurück zum Zitat Cheng, B., Li, C., Chen, J.: A web services discovery approach based on interface underlying semantics mining. IEEE Trans. Knowl. Data Eng. 29(5), 950–962 (2017) CrossRef Cheng, B., Li, C., Chen, J.: A web services discovery approach based on interface underlying semantics mining. IEEE Trans. Knowl. Data Eng. 29(5), 950–962 (2017) CrossRef
5.
Zurück zum Zitat Jagadish, H.V.: A compression technique to materialize transitive closure. ACM Trans. Database Syst. 15(4), 558–598 (1990)MathSciNetCrossRef Jagadish, H.V.: A compression technique to materialize transitive closure. ACM Trans. Database Syst. 15(4), 558–598 (1990)MathSciNetCrossRef
6.
Zurück zum Zitat Chen, Y., Chen, Y.: An efficient algorithm for answering graph reachability queries. In: 24th ICDE Proceedings, pp. 893–902. IEEE, New York (2008) Chen, Y., Chen, Y.: An efficient algorithm for answering graph reachability queries. In: 24th ICDE Proceedings, pp. 893–902. IEEE, New York (2008)
7.
Zurück zum Zitat Agrawal, R., Borgida, A., Jagadish, H.V.: Efficient management of transitive relationships in large data and knowledge bases. In: ACM SIGMOD Proceedings, pp. 253–262. ACM, New York (1989)CrossRef Agrawal, R., Borgida, A., Jagadish, H.V.: Efficient management of transitive relationships in large data and knowledge bases. In: ACM SIGMOD Proceedings, pp. 253–262. ACM, New York (1989)CrossRef
8.
Zurück zum Zitat Wang, H., He, H., Yang, J., Yu, P.S., Yu, J.X.: Dual labeling: answering graph reachability queries in constant time. In: 22th ICDE Proceedings, pp. 1–12. IEEE, New York (2006) Wang, H., He, H., Yang, J., Yu, P.S., Yu, J.X.: Dual labeling: answering graph reachability queries in constant time. In: 22th ICDE Proceedings, pp. 1–12. IEEE, New York (2006)
9.
Zurück zum Zitat Jin, R., Xiang, Y., Ruan, N., Wang, H.: Efficiently answering reachability queries on very large directed graphs. In: 8th SIGMOD Proceedings, pp. 595–608. ACM, New York (2008) Jin, R., Xiang, Y., Ruan, N., Wang, H.: Efficiently answering reachability queries on very large directed graphs. In: 8th SIGMOD Proceedings, pp. 595–608. ACM, New York (2008)
10.
Zurück zum Zitat Jin, R., Ruan, N., Xiang, Y., Wang, H.: Path-tree: an efficient reachability indexing scheme for large directed graphs. ACM Trans. Database Syst. 36(1), 73–84 (2011)CrossRef Jin, R., Ruan, N., Xiang, Y., Wang, H.: Path-tree: an efficient reachability indexing scheme for large directed graphs. ACM Trans. Database Syst. 36(1), 73–84 (2011)CrossRef
11.
Zurück zum Zitat Cohen, E., Halperin, E., Kaplan, H., Zwick, U.: Reachability and distance queries via 2-hop labels. SIAM J. Comput. 32(5), 1338–1355 (2003)MathSciNetCrossRef Cohen, E., Halperin, E., Kaplan, H., Zwick, U.: Reachability and distance queries via 2-hop labels. SIAM J. Comput. 32(5), 1338–1355 (2003)MathSciNetCrossRef
12.
Zurück zum Zitat Jin, R., Xiang, Y., Ruan, N., Fuhry, D.: 3-hop: a high-compression indexing scheme for reachability query. In: 9th ACM SIGMOD Proceedings, pp. 813–826. ACM, New York (2009) Jin, R., Xiang, Y., Ruan, N., Fuhry, D.: 3-hop: a high-compression indexing scheme for reachability query. In: 9th ACM SIGMOD Proceedings, pp. 813–826. ACM, New York (2009)
13.
Zurück zum Zitat Cai, J., Poon, C.K.: Path-hop: efficiently indexing large graphs for reachability queries. In: 10th CIKM Proceedings, pp. 119–128. ACM, New York (2010) Cai, J., Poon, C.K.: Path-hop: efficiently indexing large graphs for reachability queries. In: 10th CIKM Proceedings, pp. 119–128. ACM, New York (2010)
14.
Zurück zum Zitat Yildirim, H., Chaoji, V., Zaki, M.J.: GRAIL: a scalable index for reachability queries in very large graphs. VLDB J. 21(4), 509–534 (2012)CrossRef Yildirim, H., Chaoji, V., Zaki, M.J.: GRAIL: a scalable index for reachability queries in very large graphs. VLDB J. 21(4), 509–534 (2012)CrossRef
15.
Zurück zum Zitat Seufert, S., Anand, A., Bedathur, S., Weikum, G.: FERRARI: flexible and efficient reachability range assignment for graph indexing. In: 13th ICDE Proceedings, pp. 1009–1020. IEEE, New York (2013) Seufert, S., Anand, A., Bedathur, S., Weikum, G.: FERRARI: flexible and efficient reachability range assignment for graph indexing. In: 13th ICDE Proceedings, pp. 1009–1020. IEEE, New York (2013)
16.
Zurück zum Zitat Jin, R., Ruan, N., Dey, S., Xu, J.Y.: SCARAB: scaling reachability computation on large graphs. In: 12th SIGMOD Proceedings, pp. 169–180. ACM, New York (2012) Jin, R., Ruan, N., Dey, S., Xu, J.Y.: SCARAB: scaling reachability computation on large graphs. In: 12th SIGMOD Proceedings, pp. 169–180. ACM, New York (2012)
17.
Zurück zum Zitat Su, J., Zhu, Q., Wei, H., Yu, J.X.: Reachability querying: can it be even faster? IEEE Trans. Knowl. Data Eng. 29(3), 683–697 (2017)CrossRef Su, J., Zhu, Q., Wei, H., Yu, J.X.: Reachability querying: can it be even faster? IEEE Trans. Knowl. Data Eng. 29(3), 683–697 (2017)CrossRef
Metadaten
Titel
Min-Forest: Fast Reachability Indexing Approach for Large-Scale Graphs on Spark Platform
verfasst von
Liu Yang
Tongyong Liu
Zhigang Hu
Zhifang Liao
Jun Long
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-319-94289-6_28

Premium Partner