skip to main content
10.1145/3106426.3109447acmconferencesArticle/Chapter ViewAbstractPublication PageswiConference Proceedingsconference-collections
research-article

Partial sums-based P-Rank computation in information networks

Published:23 August 2017Publication History

ABSTRACT

P-Rank is a simple and captivating link-based similarity measure that extends SimRank by exploiting both in- and out-links for similarity computation. However, the existing work of P-Rank computation is expensive in terms of time and space cost and cannot efficiently support similarity computation in large information networks. For tackling this problem, in this paper, we propose an optimization technique for fast P-Rank computation in information networks by adopting the spiritual of partial sums. We write P-Rank equation based on partial sums and further approximate this equation by setting a threshold for ignoring the small similarity scores during iterative similarity computation. An optimized similarity computation algorithm is developed, which reduces the computation cost by skipping the similarity scores smaller than the give threshold during accumulation operations. And the accuracy loss estimation under the threshold is given through extensive mathematical analysis. Extensive experiments demonstrate the effectiveness and efficiency of our proposed approach through comparing with the straightforward P-Rank computation algorithm.

References

  1. R. Amsler. 1972. Application of citation-based automatic classification. Technical report, The University of Texas at Austin Linguistics Research Center.Google ScholarGoogle Scholar
  2. Ioannis Antonellis, Hector Garcia-Molina, and Chi-Chao Chang. 2008. Simrank++: query rewriting through link analysis of the click graph. PVLDB 1, 1 (2008), 408--421. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Dániel Fogaras and Balázs Rácz. 2005. Scaling link-based similarity search. In WWW. 641--650. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Kalervo Järvelin and Jaana Kekäläinen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20, 4 (2002). Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Glen Jeh and Jennifer Widom. 2002. SimRank: a measure of structural-context similarity. In KDD. 538--543. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. M. M. Kessler. 1963. Bibliographic coupling between scientific papers. American Documentation 14 (1963), 10--25.Google ScholarGoogle ScholarCross RefCross Ref
  7. Cuiping Li, Jiawei Han, Guoming He, Xin Jin, Yizhou Sun, Yintao Yu, and Tianyi Wu. 2010. Fast computation of SimRank for static and dynamic information networks. In EDBT. 465--476. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Lina Li, Cuiping Li, Hong Chen, and Xiaoyong Du. 2013. MapReduce-Based SimRank Computation and Its Application in Social Recommender System. In IEEE International Congress on Big Data, BigData Congress 2013, June 27 2013--July 2, 2013. 133--140. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Ruiqi Li, Xiang Zhao, Haichuan Shang, Yifan Chen, and Weidong Xiao. 2017. Fast top-k similarity join for SimRank. Inf. Sci. 381 (2017), 1--19. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Zhenjiang Lin, Irwin King, and Michael R. Lyu. 2006. PageSim: A Novel Link-Based Similarity Measure for the World Wide Web. In WI. 687--693. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Dmitry Lizorkin, Pavel Velikhov, Maxim N. Grinev, and Denis Turdakov. 2008. Accuracy estimate and optimization techniques for SimRank computation. PVLDB 1, 1 (2008), 422--433. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Jia-Yu Pan, Hyung-Jeong Yang, Christos Faloutsos, and Pinar Duygulu. 2004. Automatic multimedia cross-modal correlation discovery. In KDD. 653--658. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Henry G. Small. 1973. Co-citation in the scientific literature: A new measure of the relationship betweenf two documents. Journal of the American Society for Information Scienc 24, 4 (1973), 265--269.Google ScholarGoogle ScholarCross RefCross Ref
  14. Wensi Xi, Edward A. Fox, Weiguo Fan, Benyu Zhang, Zheng Chen, Jun Yan, and Dong Zhuang. 2005. SimFusion: measuring similarity using unified relationship matrix. In SIGIR. 130--137. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Xiaoxin Yin, Jiawei Han, and Philip S. Yu. 2006. LinkClus: Efficient Clustering via Heterogeneous Semantic Links. In VLDB. 427--438. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Mingxi Zhang, Hao Hu, Zhenying He, Liping Gao, and Liujie Sun. 2015. Efficient link-based similarity search in web networks. Expert Syst. Appl. 42, 22 (2015), 8868--8880. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Mingxi Zhang, Hao Hu, Zhenying He, and Wei Wang. 2015. Top-k similarity search in heterogeneous information networks with x-star network schema. Expert Syst. Appl. 42, 2 (2015), 699--712. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Peixiang Zhao, Jiawei Han, and Yizhou Sun. 2009. P-Rank: a comprehensive structural similarity measure over information networks. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, CIKM 2009, Hong Kong, China, November 2--6, 2009. 553--562. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in
  • Published in

    cover image ACM Conferences
    WI '17: Proceedings of the International Conference on Web Intelligence
    August 2017
    1284 pages
    ISBN:9781450349512
    DOI:10.1145/3106426

    Copyright © 2017 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 23 August 2017

    Permissions

    Request permissions about this article.

    Request Permissions

    Check for updates

    Qualifiers

    • research-article

    Acceptance Rates

    WI '17 Paper Acceptance Rate118of178submissions,66%Overall Acceptance Rate118of178submissions,66%
  • Article Metrics

    • Downloads (Last 12 months)3
    • Downloads (Last 6 weeks)0

    Other Metrics

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader