Abstract
Multi-relation graphs intuitively capture the heterogeneous correlations among real-world entities by allowing multiple types of relationships to be represented as entity-connecting edges, i.e., two entities could be correlated with more than one type of relationship. This is important in various applications such as social network analysis, ecology, and bio-informatics. Existing studies on these graphs usually consider an edge label constraint perspective, where each edge contains only one label and each edge is considered independently. For example, there are lines of research focusing on reachability between two vertices under a set of edge label constraints, or finding paths whose consecutive edge labels satisfy a user-specified logical expression. This is too restricted in real graphs, and in this work, we define a generic correlation constraint on multi-relation graphs from the perspective of vertex correlations, where a correlation can be defined recursively. Specifically, we formalize and investigate the shortest path problem over large multi-relation graphs in the presence of both necessity and denial constraints, which have various real applications. We show that it is nontrivial to apply conventional graph traversal algorithms (e.g., BFS or DFS) to address the challenge. To effectively reduce the search space, we propose a Hybrid Relation Encoding method, a.k.a. HyRE, to encode both topological and relation information in a compact way. We conduct extensive experiments over large real-world graphs to validate the effectiveness and efficiency of the proposed solution.
- I. Abraham, D. Delling, A. V. Goldberg, and R. F. F. Werneck. Hierarchical hub labelings for shortest paths. In Proc. 20th Annual European Symp. on Algorithms, pages 24--35, 2012. Google ScholarDigital Library
- T. Akiba, Y. Iwata, and Y. Yoshida. Fast exact shortest-path distance queries on large networks by pruned landmark labeling. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 349--360. ACM, 2013. Google ScholarDigital Library
- H. V. N. Y. Angela Bonifati, George Fletcher. Querying graphs. In Synthesis Lectures on Data Management, Morgan Claypool, 2018. Google ScholarDigital Library
- R. Angles. The property graph database model. In Proceedings of the 12th Alberto Mendelzon International Workshop on Foundations of Data Management, 2018.Google Scholar
- C. Avery. Giraph: Large-scale graph processing infrastructure on hadoop. Proceedings of the Hadoop Summit. Santa Clara, 2011.Google Scholar
- M. Baitaluk, X. Qian, S. Godbole, A. Raval, A. Ray, and A. Gupta. Pathsys: integrating molecular interaction graphs for systems biology. BMC Bioinformatics, 7(1):55, Feb 2006.Google ScholarCross Ref
- P. Barceló, C. A. Hurtado, L. Libkin, and P. T. Wood. Expressive languages for path queries over graph-structured data. In Proc. ACM SIGACT-SIGMOD Symp. on Principles of Database Systems, pages 3--14. ACM, 2010. Google ScholarDigital Library
- C. L. Barrett, K. R. Bisset, R. Jacob, G. Konjevod, and M. V. Marathe. Classical and contemporary shortest path problems in road networks: Implementation and experimental analysis of the TRANSIMS router. In Proc. 10th Annual European Symp. on Algorithms, pages 126--138, 2002. Google ScholarDigital Library
- H. Bast, S. Funke, D. Matijevic, P. Sanders, and D. Schultes. Proc. 9th workshop on algorithm eng. and experiments. In Proceedings of the Nine Workshop on Algorithm Engineering and Experiments, 2007.Google Scholar
- S. Beamer, K. Asanovic, and D. A. Patterson. Direction-optimizing breadth-first search. Scientific Programming, 21(3--4):137--148, 2013. Google ScholarDigital Library
- F. Bonchi, A. Gionis, F. Gullo, and A. Ukkonen. Distance oracles in edge-labeled graphs. In Proc. 17th Int. Conf. on Extending Database Technology, pages 547--558, 2014.Google Scholar
- D. Calvanese, G. D. Giacomo, M. Lenzerini, and M. Y. Vardi. Containment of conjunctive regular path queries with inverse. In KR, pages 176--185. Morgan Kaufmann, 2000. Google ScholarDigital Library
- A. Cardillo, J. Gómez-Gardeñes, M. Zanin, M. Romance, D. Papo, F. del Pozo, and S. Boccaletti. Emergence of network features from multiplexity. CoRR, abs/1212.2153, 2012.Google Scholar
- M. Coscia, G. Rossetti, D. Pennacchioli, D. Ceccarelli, and F. Giannotti. You know because i know: a multidimensional network approach to human resources problem. In Advances in Social Networks Analysis and Mining, pages 434--441, 2013. Google ScholarDigital Library
- J. Dibbelt, T. Pajor, and D. Wagner. User-constrained multimodal route planning. ACM J. Experimental Algorithmics, 19(1), 2014. Google ScholarDigital Library
- E. W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1:269--271, 1959. Google ScholarDigital Library
- W. Fan, J. Li, S. Ma, N. Tang, and Y. Wu. Adding regular expressions to graph reachability and pattern queries. Front. Comput. Sci., 6(3):313--338, 2012.Google ScholarCross Ref
- D. Florescu, A. Y. Levy, and D. Suciu. Query containment for conjunctive queries with regular expressions. In Proc. ACM SIGACT-SIGMOD Symp. on Principles of Database Systems, pages 139--148. ACM Press, 1998. Google ScholarDigital Library
- R. Geisberger, M. N. Rice, P. Sanders, and V. J. Tsotras. Route planning with flexible edge restrictions. ACM J. Experimental Algorithmics, 17(1), 2012. Google ScholarDigital Library
- R. Geisberger, P. Sanders, D. Schultes, and D. Delling. Contraction hierarchies: Faster and simpler hierarchical routing in road networks. In Proc. Int. Workshop on Experimental and Efficient Algorithms, pages 319--333, 2008. Google ScholarDigital Library
- A. Gubichev, S. Bedathur, S. Seufert, and G. Weikum. Fast and accurate estimation of shortest paths in large graphs. In Proc. 19th ACM Int. Conf. on Information and Knowledge Management, pages 499--508. ACM, 2010. Google ScholarDigital Library
- P. E. Hart, N. J. Nilsson, and B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Systems Science and Cybernetics, 4(2): 100--107, 1968.Google ScholarCross Ref
- M. S. Hassan, W. G. Aref, and A. M. Aly. Graph indexing for shortest-path finding over dynamic sub-graphs. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 1183--1197, 2016. Google ScholarDigital Library
- J. Håstad. Clique is hard to approximate within n<sup>1-epsilon</sup>. In FOCS, pages 627--636. IEEE Computer Society, 1996. Google ScholarDigital Library
- Janusgraph: Distributed graph database. http://janusgraph.org/, 2018.Google Scholar
- A. Kanterakis, D. Kafetzopoulos, V. Moustakis, and G. Potamias. Mining gene expression profiles and gene regulatory networks: Identification of phenotype-specific molecular mechanisms. In Proc. Artificial Intelligence: Theories, Models and Applications, 5th Hellenic Conf. on AI, pages 97--109, 2008. Google ScholarDigital Library
- D. Lasalle and G. Karypis. Multi-threaded graph partitioning. In Proc. 27th IEEE Int. Parallel & Distributed Processing Symp., pages 225--236, 2013. Google ScholarDigital Library
- Y. Liang and P. Zhao. Similarity search in graph databases: A multi-layered indexing approach. In Proc. 33st Int. Conf. on Data Engineering, pages 783--794, 2017.Google ScholarCross Ref
- S. Ma, K. Feng, J. Li, H. Wang, G. Cong, and J. Huai. Proxies for shortest path and distance queries. IEEE Trans. Knowl. and Data Eng., 28(7):1835--1850, 2016.Google ScholarCross Ref
- G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 135--146. ACM, 2010. Google ScholarDigital Library
- M. Mao, J. Lu, G. Zhang, and J. Zhang. Multirelational social recommendations via multigraph ranking. IEEE Trans. Cybernetics, 47(12):4049--4061, 2017.Google ScholarCross Ref
- J. J. McAuley and J. Leskovec. Image labeling on a network: Using social-network metadata for image classification. In ECCV (4), volume 7575 of Lecture Notes in Computer Science, pages 828--841. Springer, 2012. Google ScholarDigital Library
- A. O. Mendelzon and P. T. Wood. Finding regular simple paths in graph databases. SIAM J. on Comput., 24(6):1235--1258, 1995. Google ScholarDigital Library
- Microsoft academic graph. https://www.openacademic.ai/oag/, 2015.Google Scholar
- V. Rastogi, A. Machanavajjhala, L. Chitnis, and A. D. Sarma. Finding connected components in map-reduce in logarithmic rounds. In Proc. 29th Int. Conf. on Data Engineering, pages 50--61, 2013. Google ScholarDigital Library
- M. N. Rice and V. J. Tsotras. Graph indexing of road networks for shortest path queries with label restrictions. PVLDB, 4(2):69--80, 2010. Google ScholarDigital Library
- H. Samet, J. Sankaranarayanan, and H. Alborzi. Scalable network distance browsing in spatial databases. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 43--54, 2008. Google ScholarDigital Library
- A. Sealfon. Shortest paths and distances with differential privacy. In Proc. 35nd ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, pages 29--41, 2016. Google ScholarDigital Library
- P. Selmer, A. Poulovassilis, and P. T. Wood. Implementing flexible operators for regular path queries. In Proc. 18th Int. Conf. on Extending Database Technology, pages 149--156, 2015.Google Scholar
- T. Shulong, B. Jiajun, C. Chun, and H. Xiaofei. Using rich social media information for music recommendation via hypergraph model. In ACM Trans. Multimedia Comp., Comm., and Appl., volume 7S, pages 213--237. 2011. Google ScholarDigital Library
- M. Stella, C. S. Andreazzi, S. Selakovic, A. Goudarzi, and A. Antonioni. Parasite spreading in spatial ecological multiplex networks. J. Complex Networks, 5(3):486--511, 2017.Google Scholar
- D. Szklarczyk, A. Franceschini, S. Wyder, K. Forslund, D. Heller, J. Huerta-Cepas, M. Simonovic, A. Roth, A. Santos, K. P. Tsafou, M. Kuhn, P. Bork, L. J. Jensen, and C. von Mering. String v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Research, 43(D1):D447, 2014.Google Scholar
- D. Szklarczyk, J. H. Morris, H. Cook, M. Kuhn, S. Wyder, M. Simonovic, A. Santos, N. T. Doncheva, A. Roth, P. Bork, L. J. Jensen, and C. von Mering. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Research, 45(Database-Issue):D362--D368, 2017.Google Scholar
- L. Tang, X. Wang, and H. Liu. Community detection via heterogeneous interaction analysis. Data Min. Knowl. Discov., 25(1):1--33, 2012. Google ScholarDigital Library
- The neo4j graph platform. https://neo4j.com/, 2018.Google Scholar
- L. D. J. Valstar, G. H. L. Fletcher, and Y. Yoshida. Landmark indexing for evaluation of label-constrained reachability queries. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 345--358, 2017. Google ScholarDigital Library
- S. Wang, Y. Tsai, T. Hong, and H. Kao. k-anonymization of multiple shortest paths. J. Soft Comput., 21(15):4215--4226, 2017. Google ScholarDigital Library
- S. Wang, Z. Tsai, T. Hong, and I. Ting. Anonymizing shortest paths on social network graphs. In Proc. 3rd Int. Conf. on Intell. Information and Database Syst., pages 129--136, 2011. Google ScholarDigital Library
- F. Wei. Tedi: Efficient shortest path query answering on graphs. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 99--110, 2010. Google ScholarDigital Library
- F. Wei. TEDI: efficient shortest path query answering on graphs. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 99--110, 2010. Google ScholarDigital Library
- D. Yan, J. Cheng, Y. Lu, and W. Ng. Blogel: A block-centric framework for distributed computation on real-world graphs. PVLDB, 7(14):1981--1992, 2014. Google ScholarDigital Library
Recommendations
Path-Bicolorable Graphs
Graph Theory, Computational Intelligence and ThoughtIn this paper, we introduce the notion of path-bicolorability that generalizes bipartite graphs in a natural way: For <em>k</em> *** 2, a graph <em>G</em> = (<em>V</em> ,<em>E</em> ) is <em>P</em> <em>k</em> <em>-bicolorable</em> if its vertex set <em>...
Large Induced Forests in Graphs
In this article, we prove three theorems. The first is that every connected graph of order n and size m has an induced forest of order at least 8n-2m-2/9 with equality if and only if such a graph is obtained from a tree by expanding every vertex to a ...
Shortest paths in Sierpiński graphs
In 23], Klav ar and Milutinović (1997) proved that there exist at most two different shortest paths between any two vertices in Sierpiński graphs S k n , and showed that the number of shortest paths between any fixed pair of vertices of S k n can be ...
Comments