ABSTRACT
Algorithms defining similarities between objects of an information network are important of many IR tasks. SimRank algorithm and its variations are popularly used in many applications. Many fast algorithms are also developed. In this note, we first reformulate them as random walks on the network and express them using forward and backward transition probably in a matrix form. Second, we show that P-Rank (SimRank is only the special case of P-Rank) has a unique solution of eeT when decay factor c is equal to 1. We also show that SimFusion algorithm is a special case of P-Rank algorithm and prove that the similarity matrix of SimFusion is the product of PageRank vector. Our experiments on the web datasets show that for P-Rank the decay factor c doesn't seriously affect the similarity accuracy and accuracy of P-Rank is also higher than SimFusion and SimRank.
- Glen, J. and Widom, J., 2002, SimRank: a measure of structural-context similarity., SIGKDD, 538--543. Google ScholarDigital Library
- Zhao, P.X., Han, J.W. and Sun, Y.Z., 2009, P-Rank: a comprehensive structural similarity measure over information networks., CIKM, 553--562. Google ScholarDigital Library
- Xi, W.S., Fox, E.A. and Fan, W.G., Zhang, B.Y., Chen, Z., Yan, J., Zhuang, D., 2005, SimFusion: measuring similarity using unified relationship matrix, SIGIR, 130--137. Google ScholarDigital Library
- CMU Four University Dataset, http://www.cs.cmu.edu/afs/cs/project/theo-20/www/data/Google Scholar
- Jvelin K. and Kenen J., 2002, Cumulated gain-based evaluation of IR techniques, ACM Transactions on Information Systems, 422 -- 446. Google ScholarDigital Library
Index Terms
- Closed form solution of similarity algorithms
Recommendations
An Adaptive Method for the Efficient Similarity Calculation
DASFAA '09: Proceedings of the 14th International Conference on Database Systems for Advanced Applications<em>SimRank</em> is a well-known algorithm for similarity calculation based on object-to-object relationship. However, it suffers from high computation cost. In this paper, we find that the convergence behavior of different object pairs is different ...
Efficient Algorithm for Computing Link-Based Similarity in Real World Networks
ICDM '09: Proceedings of the 2009 Ninth IEEE International Conference on Data MiningSimilarity calculation has many applications, such as information retrieval, and collaborative filtering, among many others. It has been shown that link-based similarity measure, such as SimRank, is very effective in characterizing the object ...
Similarity Measures for Closed General Type-2 Fuzzy Sets: Overview, Comparisons, and a Geometric Approach
The similarity between two fuzzy sets (FSs) is an important concept in fuzzy logic. As the research interest on general type-2 (GT2) FSs has increased recently, many similarity measures for them have also been proposed. This paper gives a comprehensive ...
Comments